digitalmars.D - LLVM Coding Standards

reply spir <denis.spir gmail.com> writes:
[slightly OT]

Hello,

I'm reading (just for interest) the LLVM Coding Standards at 
http://llvm.org/docs/CodingStandards.html. I find them very interesting because 
their purposes are clearly explained. A sample is below.

Denis

=== sample ===================================
Use Early Exits and continue to Simplify Code

When reading code, keep in mind how much state and how many previous decisions 
have to be remembered by the reader to understand a block of code. Aim to 
reduce indentation where possible when it doesn't make it more difficult to 
understand the code. One great way to do this is by making use of early exits 
and the continue keyword in long loops. As an example of using an early exit 
from a function, consider this "bad" code:

Value *DoSomething(Instruction *I) {
   if (!isa<TerminatorInst>(I) &&
       I->hasOneUse() && SomeOtherThing(I)) {
     ... some long code ....
   }

   return 0;
}

This code has several problems if the body of the 'if' is large. When you're 
looking at the top of the function, it isn't immediately clear that this only 
does interesting things with non-terminator instructions, and only applies to 
things with the other predicates. Second, it is relatively difficult to 
describe (in comments) why these predicates are important because the if 
statement makes it difficult to lay out the comments. Third, when you're deep 
within the body of the code, it is indented an extra level. Finally, when 
reading the top of the function, it isn't clear what the result is if the 
predicate isn't true; you have to read to the end of the function to know that 
it returns null.

It is much preferred to format the code like this:

Value *DoSomething(Instruction *I) {
   // Terminators never need 'something' done to them because ...
   if (isa<TerminatorInst>(I))
     return 0;

   // We conservatively avoid transforming instructions with multiple uses
   // because goats like cheese.
   if (!I->hasOneUse())
     return 0;

   // This is really just here for example.
   if (!SomeOtherThing(I))
     return 0;

   ... some long code ....
}

This fixes these problems. A similar problem frequently happens in for loops. A 
silly example is something like this:

   for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
     if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {
       Value *LHS = BO->getOperand(0);
       Value *RHS = BO->getOperand(1);
       if (LHS != RHS) {
         ...
       }
     }
   }

When you have very, very small loops, this sort of structure is fine. But if it 
exceeds 10-15 lines, it becomes difficult for people to read and 
understand at a glance. The problem with this sort of code is that it gets very 
nested very quickly, meaning that the reader of the code has to keep a lot of 
context in their brain to remember what is going on in the loop at that point, 
because they don't know if/when the if conditions will have elses etc. It is 
strongly preferred to structure the loop like this:

   for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
     BinaryOperator *BO = dyn_cast<BinaryOperator>(II);
     if (!BO) continue;

     Value *LHS = BO->getOperand(0);
     Value *RHS = BO->getOperand(1);
     if (LHS == RHS) continue;

     ...
   }

This has all the benefits of using early exits for functions: it reduces 
nesting of the loop, it makes it easier to describe why the conditions are 
true, and it makes it obvious to the reader that there is no else coming up 
that they have to push context into their brain for. If a loop is large, this 
can be a big understandability win.
========================================
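
For anyone who wants to try the loop advice without an LLVM checkout, here is a minimal sketch in plain C++. The `Op` struct and both function names are hypothetical stand-ins invented for this illustration, not LLVM types; the point is only that the nested and the flat loop compute the same thing, while the flat one states each skip condition on its own line.

```cpp
#include <vector>

// Hypothetical stand-in for an instruction; NOT an LLVM type.
struct Op {
    bool isBinary;
    int lhs;
    int rhs;
};

// Nested version: every condition adds one indentation level.
int countInterestingNested(const std::vector<Op>& ops) {
    int count = 0;
    for (const Op& op : ops) {
        if (op.isBinary) {
            if (op.lhs != op.rhs) {
                ++count;
            }
        }
    }
    return count;
}

// Flat version: uninteresting cases are skipped up front with continue.
int countInterestingFlat(const std::vector<Op>& ops) {
    int count = 0;
    for (const Op& op : ops) {
        if (!op.isBinary) continue;      // only binary operations matter
        if (op.lhs == op.rhs) continue;  // identical operands are skipped
        ++count;
    }
    return count;
}
```

Both functions can be called on the same `std::vector<Op>` to check that they agree; with a real loop body in place of `++count`, the flat version keeps that body at a single indentation level.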
-- 
_________________
vita es estrany
spir.wikidot.com
Apr 11 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/11/11 2:58 PM, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis

 === sample ===================================
 Use Early Exits and continue to Simplify Code

Heh heh heh. This is bound to annoy many a dinosaur. And they didn't even need to pull the exceptions argument!

Andrei
Apr 11 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Heh heh heh. This is bound to annoy many a dinosaur. And they even 
 didn't need to pull the exceptions argument!

I find the LLVM C++ source code readable and sometimes even elegant for C++ (it's very far from the C-style hairy code of DMD, even though DMD is sometimes a little more efficient during compilation). So I think they are doing a good job. If the source code of an open source project isn't readable, new people will have less desire to work on the code to add features, fix bugs, etc. CPython developers keep the C source code very simple and very readable, even if this sometimes forces them to use slightly less efficient code.

Bye,
bearophile
Apr 11 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/11/2011 2:03 PM, bearophile wrote:
 it's very far from the C-style hairy code of DMD

I'm overcompensating for being bald.
Apr 11 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:io08sr$ouq$1 digitalmars.com...
 On 4/11/2011 2:03 PM, bearophile wrote:
 it's very far from the C-style hairy code of DMD

I'm overcompensating for being bald.

Heh. Now that's a classic line if I've ever heard one :)
Apr 11 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Walter:

 I'm overcompensating for being bald.

*hands a Code Brush [TM]* :-) Bye, bearophile
Apr 11 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:invnce$2mfr$4 digitalmars.com...
 On 4/11/11 2:58 PM, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis

 === sample ===================================
 Use Early Exits and continue to Simplify Code

Heh heh heh. This is bound to annoy many a dinosaur. And they even didn't need to pull the exceptions argument!

That's something I've come full circle on. I started out on the old interactive BASICs (the ones with line numbers and gratuitous GOTO). So I didn't originally have a problem with early exits and continue even after I had moved to C/C++ (and started agreeing, from direct experience, with "goto is evil" - which I still agree with). Then in the "OO-is-your-God" age of the late 90's and early 2000's I found some reasonable sounding arguments for avoiding early exits and continue and got very much into the habit of avoiding them.

The last few years though, I've been finding that I *never* have any trouble grokking code due to early exits or continue (unless the code is already convoluted anyway). And I've also realized I find code that makes intelligent use of it to be much *easier* to understand and reason about since it takes the otherwise-distracting special cases and just shoves them out of the way so the rest of the code can focus on the real core of the task. Plus, Andrei is right: Reducing the number of nested scopes makes a big improvement in readability.

I've often been considered a dinosaur, and probably for good reason. But consider this to be one early-exit-and-continue-loving dino. :)
Apr 11 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 04/12/2011 02:34 AM, Nick Sabalausky wrote:
 The last few years though, I've been finding that I *never* have any trouble
 grokking code due to early exits or continue (unless the code is already
 convoluted anyway). And I've also realized I find code that makes
 intelligent use of it to be much *easier* to understand and reason about
 since it takes the otherwise-distracting special cases and just shoves them
 out of the way so the rest of the code can focus on the real core of the
 task.

This is imo a very good explanation.

Denis
Apr 12 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 04/12/2011 03:08 AM, Walter Bright wrote:
 On 4/11/2011 2:03 PM, bearophile wrote:
 it's very far from the C-style hairy code of DMD

I'm overcompensating for being bald.

Unfortunately, being bald does not seem to always help. While I'm nearly bald as well, I found the few pieces of dmd and core stdlib I've tried to understand rather difficult. But that was less because of coding style than because I wasn't able to really grok the *design*. Maybe a few more comments oriented toward the overall conception would help, provided they're not overly hairy themselves; dunno.

Denis
Apr 12 2011
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
== Quote from spir (denis.spir gmail.com)'s article
 [slightly OT]
 Hello,
 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting because
 their purposes are clearly explained. Below sample.
 Denis
 === sample ===================================
 Use Early Exits and continue to Simplify Code
 When reading code, keep in mind how much state and how many previous decisions
 have to be remembered by the reader to understand a block of code. Aim to
 reduce indentation where possible when it doesn't make it more difficult to
 understand the code. One great way to do this is by making use of early exits
 and the continue keyword in long loops. As an example of using an early exit
 from a function, consider this "bad" code:
 Value *DoSomething(Instruction *I) {
    if (!isa<TerminatorInst>(I) &&
        I->hasOneUse() && SomeOtherThing(I)) {
      ... some long code ....
    }
    return 0;
 }
 This code has several problems if the body of the 'if' is large. When you're
 looking at the top of the function, it isn't immediately clear that this only
 does interesting things with non-terminator instructions, and only applies to
 things with the other predicates. Second, it is relatively difficult to
 describe (in comments) why these predicates are important because the if
 statement makes it difficult to lay out the comments. Third, when you're deep
 within the body of the code, it is indented an extra level. Finally, when
 reading the top of the function, it isn't clear what the result is if the
 predicate isn't true; you have to read to the end of the function to know that
 it returns null.
 It is much preferred to format the code like this:
 Value *DoSomething(Instruction *I) {
    // Terminators never need 'something' done to them because ...
    if (isa<TerminatorInst>(I))
      return 0;
    // We conservatively avoid transforming instructions with multiple uses
    // because goats like cheese.
    if (!I->hasOneUse())
      return 0;
    // This is really just here for example.
    if (!SomeOtherThing(I))
      return 0;
    ... some long code ....
 }

I've been doing similar such shuffles in the GDC code recently. One example that gets my dander right up:

   if (SomeOtherThing(I)) {
     ... some long code ....
   } else
     I->SomeValue = t;
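
The post doesn't say how such a case gets reshuffled, so the following is only one plausible way to do it, sketched in self-contained C++: invert the test, handle the one-line case first, and return. `Item`, `processNested` and `processFlat` are hypothetical names made up for this illustration and have nothing to do with the actual GDC sources.

```cpp
#include <string>

// Hypothetical stand-in for the object being processed.
struct Item {
    bool special = false;
    int someValue = 0;
    std::string log;
};

// Original shape: the one-line else is stranded below the long branch.
void processNested(Item& item, int t) {
    if (item.special) {
        item.log += "step1;";
        item.log += "step2;";  // ... imagine many more lines here ...
    } else
        item.someValue = t;
}

// Reshuffled: the short case is handled first and exits early,
// so the long code no longer sits inside an if/else.
void processFlat(Item& item, int t) {
    if (!item.special) {
        item.someValue = t;
        return;
    }

    item.log += "step1;";
    item.log += "step2;";  // ... imagine many more lines here ...
}
```

The behaviour is unchanged; what moves is the short branch, which now sits right next to the condition that selects it.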
Apr 11 2011
prev sibling next sibling parent reply Spacen Jasset <spacenjasset yahoo.co.uk> writes:
On 11/04/2011 20:58, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis

That seems all fairly sensible. It also reminds me of open source projects written in C, where GOTO is used, like so:

   HANDLE handle1 = open(...);

   ...
   if (out_of_memory)
     goto cleanup;

   if (invalid_format)
     goto cleanup;
   ...

   cleanup:
     if (handle1)
       close(handle1);
     if (handle2)
       close(handle2);

This code uses the dreaded goto statement, but I believe you can see that the author is trying to make the code more readable, or at least get rid of the nested indents/multiple cleanup problem you inevitably come across at some points in C code. It does tend to be more readable than the alternative, too.

I think that people like to follow rules, that is as soon as they have internalised them and made them their own. What this means is that they often then follow them to a fault, and you get deeply nested, but "structured" code, where instead you would be better off with more logically linear code as in the case of the early exit. Coding standards should probably just say: try and write readable code. Everyone knows what readable code looks like. It's just not always quick or easy to make it that way.

While I am on the subject, I've *always* thought major languages have poor loop constructs:

(A)

   for (;;)
   {
     std::getline(is, line);
     if (line.size() == 0)
       break;
     ...some things...
   }

You have to call getline always at least once, then you need to test if the line is empty to terminate the loop. So how do you do it another way?

(B)

   std::getline(is, line);
   while (line.size() != 0)
   {
     ...some things...
     std::getline(is, line);
   }

Isn't there something a bit wrong here? N.B. a do .. while doesn't help here either. In (A) there is no duplication. In essence, what I am saying is that there should be a loop whereby you can put the exit condition where you need it, but *also* the compiler should check you have an exit condition to prevent mistakes.

This whole WHILE vs DO vs FOR loops thing is strange to me. Instead you could just have:

   loop
   {
     ...
     if (condition) exit;
     ...
   }

instead of WHILE and DO. Whereby you *must* have an exit condition.

But I suppose you need a FOR loop because the following may be error prone.

   int x=0;
   loop
   {
     if x > 9 exit;
     ...
     x++;
   }

So you would then end up with a LOOP a FOREVER (perhaps which is for(;;) by convention anyway) and a FOR loop.

I'll put the coffee down now...
Apr 11 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Spacen Jasset" <spacenjasset yahoo.co.uk> wrote in message 
news:invvmu$7ha$1 digitalmars.com...
 That seems all fairly sensible. It also reminds me of open source projects 
 written in C, where GOTO is used, like so:


 HANDLE handle1 = open(...);

 ...
 if (out_of_memory)
 goto cleanup;

 if (invalid_format)
 goto cleanup;
 ...

 cleanup:
    if (handle1)
       close(handle1);
    if (handle2)
     close(handle2);


 This code uses the dreaded goto statement, but I believe you can see that 
 the author is trying to make the code more readable, or at least get rid 
 of the nested indents/multiple cleanup problem you inevitably come across 
 at some points in C code. It does tend to be more readable than the 
 alternative, too.

I'm a big anti-goto guy, but I think what bugs me about that code more than the goto is that its particular style of error handling gives me flashbacks of VB6. Bring on the try/catch/finally and scope guards! Of course, I do agree, without try/catch/finally or scope guards that code really isn't too bad of an alternative.
 I think that people like to follow rules, that is as soon as they have 
 internalised them and made them their own. What this means is that they 
 often then follow them to a fault, and you get deeply nested, but 
 "structured" code, where instead you would be better off with more 
 logically linear code as in the case of the early exit. Coding standards 
 should probably just say: try and write readable code. Everyone knows what 
 readable code looks like. It's just not always quick or easy to make it that 
 way.

I think that's a good analysis. And particularly true of programmers. We're trained to think in terms of defining structure and following it. We're more like Hermes from Futurama than most of us might think. Minus the dreadlocks, of course (which incidentally look terrible in real life, unlike on TV. In real life they just look like someone forgot to wash their hair for twenty years.)
 While I am on the subject, I've *always* thought major languages have poor 
 loop constructs:

 Instead you could just have:

 loop
 {
  ...
  if (condition) exit;
  ...
 }

 instead of WHILE and DO. Whereby you *must* have an exit condition.

Yea, once in a while I do come across a need to have the exit condition in the middle of the loop body. It doesn't happen to me often, but it is kind of annoying when it does crop up. I feel like I have to contort the task to fit the tool. Fortunately, though, like I said, I don't come across such cases often. YMMV, of course.
 I'll put the coffee down now...

[OT]: My local World Market has a "Vanilla Macadamia Kona Coffee". A little pricey as far as coffees go, but I think it's becoming my new god. :) If only I could find it in whole-bean decaf form... (I don't like the way caffeine affects me, and pre-ground doesn't store well unless you have an air-tight container on hand.) It's kind of weird if you think about it how much our society has gone nuts over what's essentially "bean juice" - doesn't sound so appetizing when you describe it that way :) Um, ok, yea I guess I *really* need decaf, don't I? ;)
Apr 11 2011
prev sibling next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 12.04.2011 00:31, schrieb Spacen Jasset:
 On 11/04/2011 20:58, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis

 That seems all fairly sensible. It also reminds me of open source projects
 written in C, where GOTO is used, like so:

 HANDLE handle1 = open(...);

 ...
 if (out_of_memory)
   goto cleanup;

 if (invalid_format)
   goto cleanup;
 ...

 cleanup:
   if (handle1)
     close(handle1);
   if (handle2)
     close(handle2);

 This code uses the dreaded goto statement, but I believe you can see that
 the author is trying to make the code more readable, or at least get rid
 of the nested indents/multiple cleanup problem you inevitably come across
 at some points in C code. It does tend to be more readable than the
 alternative, too.

I agree. I've seen and used this in C code as well. IMHO it is mostly a workaround for not having exceptions in C - the cleanup-stuff would belong in catch{...} or finally{...} (or even better, when scope-guards are available, in scope(exit){...} or scope(failure){...}). And I agree that this is far more readable than using a plethora of bool flags and ifs.
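
To make the scope-guard idea concrete, here is a rough C++ approximation of scope(exit): a small class whose destructor runs a stored action on every way out of the scope. `ScopeExit` and `runTask` are hypothetical names written for this sketch (they are not a standard library facility), and the "handle" is simulated by recording events in a vector.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Minimal scope guard: the stored action runs when the guard goes out of
// scope, whether by early return, normal fall-through, or an exception.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> action) : action_(std::move(action)) {}
    ~ScopeExit() { action_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    std::function<void()> action_;
};

// Hypothetical task with the two failure conditions from the goto example.
// 'events' records what happened so the cleanup can be observed.
std::string runTask(bool outOfMemory, bool invalidFormat,
                    std::vector<std::string>& events) {
    events.push_back("open");
    ScopeExit closeHandle([&events] { events.push_back("close"); });

    if (outOfMemory)
        return "out of memory";   // cleanup still runs
    if (invalidFormat)
        return "invalid format";  // cleanup still runs

    events.push_back("work");
    return "ok";
}
```

The early returns play the role of the `goto cleanup` jumps, and the cleanup is written once, next to the acquisition, instead of at a label at the bottom of the function.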
 I think that people like to follow rules, that is as soon as they have
 internalised them and made them their own. What this means is that they
 often then follow them to a fault, and you get deeply nested, but
 "structured" code, where instead you would be better of with more
 logically linear code as in the case of the early exit. Coding standards
 should probably just say: try and write readable code. Everyone knows
 what readable code looks like. It just not always quick or easy to make
 it that way.
 
 
 While I am on the subject, I've *always* thought major languages have
 poor loop constructs:
 
 
 (A)
 
 for (;;)
 {
     std::getline(is, line);
     if (line.size() == 0)
         break;
     ...some things...
 }
 

 
 Instead you could just have:
 
 loop
 {
  ...
  if (condition) exit;
  ...
 }
 
 instead of WHILE and DO. Whereby you *must* have an exit condition.
 
 
 But I suppose you need a FOR loop because the following may be error prone.
 
 int x=0;
 loop
 {
 if x > 9 exit;
 ...
 x++;
 }

Yeah. And I guess while-loops also have their uses. I think just loop like you're suggesting is not available because for(;;) and while(1) achieve the same thing without too much additional typing.

Cheers,
- Daniel
Apr 11 2011
prev sibling next sibling parent Don <nospam nospam.com> writes:
Spacen Jasset wrote:
 While I am on the subject, I've *always* thought major languages have 
 poor loop constructs:
 
 
 (A)
 
 for (;;)
 {
     std::getline(is, line);
     if (line.size() == 0)
         break;
     ...some things...
 }
 
 
 You have to call getline always at least once, then you need to test if 
 the line is empty to terminate the loop. So how do you do it another way?

FORTH had BEGIN ... WHILE ... REPEAT. I really miss it. C family languages just have do ... while, which seems to be pretty much useless.
Apr 11 2011
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 12.04.2011 2:31, Spacen Jasset wrote:
 On 11/04/2011 20:58, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis


 Instead you could just have:

 loop
 {
  ...
  if (condition) exit;
  ...
 }

 instead of WHILE and DO. Whereby you *must* have an exit condition.


 But I suppose you need a FOR loop because the following may be error 
 prone.

 int x=0;
 loop
 {
 if x > 9 exit;
 ....
 x++;
 }

Looks a lot like PL/SQL;)
 So you would then end up with a LOOP a FOREVER (perhaps which is 
 for(;;) by convention anyway) and a FOR loop.

 I'll put the coffee down now...

-- Dmitry Olshansky
Apr 12 2011
prev sibling next sibling parent reply Mafi <mafi example.org> writes:
Am 12.04.2011 00:31, schrieb Spacen Jasset:
 std::getline(is, line);
 while (line.size() != 0)
 {
          ...some things...
      std::getline(is, line);
 }

What's wrong with

   while( std::getline(is, line), (line.size() != 0) ) {
     //... some things
   }

I mean, that's what the comma operator is for.
Apr 12 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Mafi" <mafi example.org> wrote in message 
news:io19ar$47h$1 digitalmars.com...
 Am 12.04.2011 00:31, schrieb Spacen Jasset:
 std::getline(is, line);
 while (line.size() != 0)
 {
          ...some things...
      std::getline(is, line);
 }

 What's wrong with

 while( std::getline(is, line), (line.size() != 0) ) {
 //... some things
 }

 I mean, that's what the comma operator is for.

I'm deathly afraid of the comma operator.
Apr 12 2011
prev sibling parent Spacen Jasset <spacenjasset yahoo.co.uk> writes:
On 12/04/2011 11:21, Mafi wrote:
 Am 12.04.2011 00:31, schrieb Spacen Jasset:
 std::getline(is, line);
 while (line.size() != 0)
 {
 ...some things...
 std::getline(is, line);
 }

 What's wrong with

 while( std::getline(is, line), (line.size() != 0) ) {
 //... some things
 }

 I mean, that's what the comma operator is for.

Well...okay. But sometimes it's not just one statement that you always need to do before your test condition. Also, I would say that sort of code is a little bit more difficult to read. I don't mind typing a little bit more to be able to make it look a bit better.

As other posters have pointed out, it seems to me, at least, that having a way to express your model/idea or view of a problem directly is the most useful thing a language can give you. In other words, less code structure design caused by safety and/or other issues, and more problem-oriented, higher-level design visible in the code.

As an example, perhaps not a great one: the RAII pattern. It's a code-level design/solution used to manage resources, and specifically to prevent resource leaks, rather than anything in particular to do with the problem that is being solved, e.g. that of reading and processing a file.

So we end up with

(a) the solution to some problem
(b) the solution to the method of expressing the solution to the problem

as you have put above:

   while( std::getline(is, line), (line.size() != 0) ) {

There is a strong component of (b), rather than just (a), which ideally, in utopia, we don't want to spend time thinking about.
Apr 15 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 04/12/2011 03:15 AM, Daniel Gibson wrote:
 While I am on the subject, I've *always* thought major languages have
  poor loop constructs:


  (A)

  for (;;)
  {
       std::getline(is, line);
       if (line.size() == 0)
           break;
       ...some things...
  }


  Instead you could just have:

  loop
  {
    ...
    if (condition) exit;
    ...
  }

  instead of WHILE and DO. Whereby you *must* have an exit condition.


  But I suppose you need a FOR loop because the following may be error prone.

  int x=0;
  loop
  {
  if x>  9 exit;
  ...
  x++;
  }


I think just loop like you're suggesting is not available because for(;;) and while(1) achieve the same thing without too much additional typing.

I've been thinking of a loop construct allowing either while or until, each either at start or end, or even both:

   loop [(while|until) condition] {
     body
   } [(while|until) condition]

Side-Note: I favor until over while because I "feel" (for what reason?) booleans should be initially false, or false by default, thus the end-condition should express a change and be positive.

   loop until end1 { body } until end2     ends are initially false
   loop while end1 { body } while end2     ends are initially true

Denis
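
The proposed construct has no direct C++ equivalent, but its "until ... until" form can be approximated with for(;;) and two breaks. The helper below is only a sketch of that reading: `loopUntil` and its parameter names are hypothetical, and passing the conditions as callables is a choice made for this illustration rather than anything from the proposal.

```cpp
#include <functional>

// Approximates:  loop until startCond { body } until endCond
// Either condition becoming true ends the loop.
void loopUntil(const std::function<bool()>& startCond,
               const std::function<void()>& body,
               const std::function<bool()>& endCond) {
    for (;;) {
        if (startCond()) break;  // leading 'until' test, checked before the body
        body();
        if (endCond()) break;    // trailing 'until' test, checked after the body
    }
}
```

Passing a condition that always returns false for one of the two slots gives the one-sided forms: only a leading test (like while) or only a trailing test (like do .. while).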
Apr 12 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 04/15/2011 01:10 PM, Spacen Jasset wrote:
 As other posters have pointed out, it seems to me, at least, that having a way
 to express your model/idea or view of a problem directly is the most useful
 thing a language can give you.

This is my definition of a good language :-)

Denis
Apr 15 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-04-11 21:58, spir wrote:
 [slightly OT]

 Hello,

 I'm reading (just for interest) the LLVM Coding Standards at
 http://llvm.org/docs/CodingStandards.html. Find them very interesting
 because their purposes are clearly explained. Below sample.

 Denis

 === sample ===================================
 Use Early Exits and continue to Simplify Code

I usually try to use early exits as much as possible. I think it simplifies the code.

-- 
/Jacob Carlborg
Apr 12 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 04/12/2011 12:31 AM, Spacen Jasset wrote:
 I think that people like to follow rules, that is as soon as they have
 internalised them and made them their own. What this means is that they often
 then follow them to a fault, and you get deeply nested, but "structured" code,
 where instead you would be better off with more logically linear code as in the
 case of the early exit. Coding standards should probably just say: try and
 write readable code.

Yes!
 Everyone knows what readable code looks like.

No! It's a cultural issue. In C and C-derived languages, culture favors cleverness, complication, & terseness over readability. (Actually, even when readable code is terser.) The key issue is that readable code looks *easier*, thus it's not that rewarding. Just like what a dancer truly masters (read: with grace) *looks* easy to do.

Readability, like simplicity, is *very* difficult to achieve. So is designing a language that favors readability (and/or simplicity); among other points, it requires allowing users to express their models by direct analogy (I mean the code should somehow mirror the model, its contents & structure, like a homomorphism or a metaphor).

Denis
Apr 12 2011