digitalmars.D - Redundancies often reveal bugs

bearophile <bearophileHUGS lycos.com> writes:

Here (pdf alert) I have found a very simple but interesting paper that has
confirmed an hypothesis of mine.

This is a page that contains a pdf that shows a short introduction to the paper:
http://www.ganssle.com/tem/tem80.htm

This is the paper, "Using Redundancies to Find Errors", by Yichen Xie and
Dawson Engler, 2002:
www.stanford.edu/~engler/p401-xie.pdf


A trimmed down quote from the tem80 page:

Researchers at Stanford have just released a paper detailing their use of
automated tools to look for redundant code in 1.6 million lines of Linux.
"Redundant" is defined as:
- Idempotent operations (like assigning a variable to itself)
- Values assigned to variables that are not subsequently used
- Dead code
- Redundant conditionals

They found that redundancies, even when harmless, strongly correlate with bugs.
Even when the extra code causes no problems, odds are high that other, real,
errors will be found within a few lines of the redundant operations.

Block-copied code is often suspect, as the developer neglects to change things
needed for the code's new use. Another common problem area: error handlers,
which are tough to test, and are, in data I've gathered, a huge source of
problems in deployed systems.

The authors note that their use of lint has long produced warnings about unused
variables and return codes, which they've always treated as harmless stylistic
issues. Now it's clear that lint is indeed signalling something that may be
critically important.

The study makes me wonder if compilers that optimize out dead code to reduce
memory needs aren't in fact doing us a disservice. Perhaps they should error
and exit instead.


This study confirms that situations like:
x = x;
often hide bugs, that unused variables are often enough a sign of possible
real bugs (as I have suspected, despite what Walter said about it), and that
variables that are assigned but never used afterwards may hide bugs too.

This paper has confirmed that some of my enhancement requests need more
attention:

http://d.puremagic.com/issues/show_bug.cgi?id=3878
http://d.puremagic.com/issues/show_bug.cgi?id=3960
http://d.puremagic.com/issues/show_bug.cgi?id=4407


situations like x=x; reveal true bugs like:

class Foo {
    int x, y;
    this(int x_, int y_) {
        this.x = x;
        y = y;
        
    }
}
void main() {}
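
For comparison, a minimal sketch of what the constructor above presumably
intended, with each field assigned from its underscore-suffixed parameter.
The argument values 3 and 4 and the assert in main are only illustrative
assumptions:

```d
class Foo {
    int x, y;

    this(int x_, int y_) {
        this.x = x_; // field taken from the parameter, not from itself
        this.y = y_;
    }
}

void main() {
    auto f = new Foo(3, 4);
    // Both fields now hold the constructor arguments.
    assert(f.x == 3 && f.y == 4);
}
```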


Now I think that such redundancies and similar things often enough hide true
bugs. But what to do? Turn x=x; into a true error? In a comment to bug 3878
Don gives a situation where DMD may raise a true compile-time error. But
in other cases a true error looks like too much to me.

Bye,
bearophile

Sep 30 2010

Kagamin <spam here.lot> writes:

bearophile Wrote:

 errors will be found
 often hide bugs

 situations like x=x; reveal true bugs like:
 
 class Foo {
     int x, y;
     this(int x_, int y_) {
         this.x = x;
         y = y;
         
     }
 }
 void main() {}

Yes, fields and locals in camelCase is a bug.

Sep 30 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 30 September 2010 23:33:26 Kagamin wrote:
 bearophile Wrote:
 errors will be found
 often hide bugs
 
 situations like x=x; reveal true bugs like:
 
 class Foo {
 
     int x, y;
     this(int x_, int y_) {
     
         this.x = x;
         y = y;
     
     }
 
 }
 void main() {}

 
 Yes, fields and locals in camelCase is a bug.

??? Why on earth would it be a bug to have variable names in camelcase? 
Camelcase is purely a stylistic issue - and one which most people adhere to.

- Jonathan M Davis

Oct 01 2010

"JimBob" <jim bob.com> writes:

"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:i83cil$2o02$1 digitalmars.com...
 situations like x=x; reveal true bugs like:

 class Foo {
    int x, y;
    this(int x_, int y_) {
        this.x = x;
        y = y;

    }
 }

I get hit much more often by something like this....

class Foo {
   int m_x, m_y;
   this(int x, int y)
   {
       int m_x = x;
       int m_y = y;
   }
}

I don't know if it is, but IMO it really should be an error to declare local
variables that hide member variables.
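
A small sketch of the effect, assuming the constructor above is called with
the illustrative values 3 and 4: the two declarations create locals that hide
the members, so the members keep their default value of 0.

```d
class Foo {
    int m_x, m_y;

    this(int x, int y) {
        int m_x = x; // declares a new local that hides this.m_x
        int m_y = y; // same for this.m_y
    }
}

void main() {
    auto f = new Foo(3, 4);
    // The members were never assigned, so they keep int.init (0).
    assert(f.m_x == 0 && f.m_y == 0);
}
```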

Oct 01 2010

Peter Alexander <peter.alexander.au gmail.com> writes:

 I dont know if it is, but IMO it really should be an error to declare local
 variables that hide member variables.

I disagree. I always do that in constructors:

int x, y;
this(int x, int y)
{
  this.x = x;
  this.y = y;
}

I think you would annoy a lot of people if it was forbidden.

Oct 01 2010

Daniel Gibson <metalcaedes gmail.com> writes:

On Fri, Oct 1, 2010 at 9:50 AM, Peter Alexander
<peter.alexander.au gmail.com> wrote:
 I dont know if it is, but IMO it really should be an error to declare local
 variables that hide member variables.

 I disagree. I always do that in constructors:

 int x, y;
 this(int x, int y)
 {
   this.x = x;
   this.y = y;
 }

 I think you would annoy a lot of people if it was forbidden.

I do the same, but got a nasty bug that took me hours to find because
in one condition later down the constructor I forgot the "this."
prefix. This kind of bug is hard to spot by just reading the code.
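
A hypothetical reconstruction of that kind of bug (the class name Buffer, the
field size and the limit of 100 are made up for illustration): once a
parameter shares the field's name, an assignment that forgets "this."
silently changes the parameter instead.

```d
class Buffer {
    int size;

    this(int size) {
        this.size = size;
        if (size > 100)
            size = 100; // meant this.size; this only changes the parameter
    }
}

void main() {
    auto b = new Buffer(500);
    // The clamp never reached the field.
    assert(b.size == 500);
}
```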

IMHO it's quite tedious to do all these assignments in a constructor
anyway - it'd be cool to have some possibility to say "this
constructor argument should be assigned to the class's field of the
same name", like

int x, y, z;

this(class int x, class int y, int a) {
  // this.x and this.y are set implicitly
  this.z = (x+y)/a;
}

or something like that. Dunno if "class" is an appropriate keyword for
that (probably not), but it should suffice to illustrate the idea.
Well, maybe "this(int this.x, int this.y, int a)" would be better.
And maybe this wouldn't need addition to the language at all but could
be done with some template/string-mixin magic.
I haven't really thought this through, but *some* possibility to do
this (assign constructor- or even function-arguments to class field of
same name) would be cool :-)
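
One way the string-mixin route might look, as a rough sketch only. The helper
name assignFields and the class Point are assumptions made up for this
example; the helper just builds the "this.name = name;" statements at compile
time, so the names still have to be listed once by hand.

```d
// Hypothetical helper: builds "this.NAME = NAME;" for each given name.
string assignFields(string[] names...) {
    string code;
    foreach (name; names)
        code ~= "this." ~ name ~ " = " ~ name ~ ";\n";
    return code;
}

class Point {
    int x, y, z;

    this(int x, int y, int a) {
        // Expands to: this.x = x; this.y = y;
        mixin(assignFields("x", "y"));
        this.z = (x + y) / a;
    }
}

void main() {
    auto p = new Point(1, 2, 3);
    assert(p.x == 1 && p.y == 2 && p.z == 1);
}
```

This does not remove the shadowing itself, it only generates the repetitive
assignments.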

Oct 01 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Daniel Gibson <metalcaedes gmail.com> wrote:

 this(int this.x, int this.y, int a)

Me likes.

-- 
Simen

Oct 01 2010

retard <re tard.com.invalid> writes:

Fri, 01 Oct 2010 12:38:26 +0200, Simen kjaeraas wrote:

 Daniel Gibson <metalcaedes gmail.com> wrote:
 
 this(int this.x, int this.y, int a)

 
 Me likes.

Looks almost like Scala:

class MyClass(var x: Int, var y: Int, a: Int) {
...
}

Oct 02 2010

"JimBob" <jim bob.com> writes:

"Peter Alexander" <peter.alexander.au gmail.com> wrote in message 
news:i843rl$1gr9$1 digitalmars.com...
 I dont know if it is, but IMO it really should be an error to declare 
 local
 variables that hide member variables.

 I disagree. I always do that in constructors:

 int x, y;
 this(int x, int y)
 {
  this.x = x;
  this.y = y;
 }

 I think you would annoy a lot of people if it was forbidden.

I'm sure it would. But i think the benefit would outweigh the cost. I mean 
the cost is coding style, personal preference, the benefit is fewer bugs.

And people would get used to it.

Oct 01 2010

Justin Johansson <no spam.com> writes:

On 1/10/2010 11:12 AM, bearophile wrote:
 Here (pdf alert) I have found a very simple but interesting paper that has
 confirmed an hypothesis of mine.

So far most respondents have gone completely off-subject here.

In hardware systems redundancy is critical for safety.  In software
systems redundancy is bad because, as you and the paper suggest,
redundancy makes for bugs.  The principle for software is both
normalization, DRY (do not repeat yourself) and ZIP (zero intolerance
for plagiarism).

As always, I enjoy your interesting posts.

Regards
Justin Johansson

Oct 01 2010

Justin Johansson <no spam.com> writes:

On 2/10/2010 1:52 AM, Justin Johansson wrote:
Whoops, bug in my reply.
ZIP as "zero intolerance for plagiarism" is obviously
what I did not mean.  I meant "zero tolerance"
rather than "zero intolerance" but then the acronym
ZTP does not sound so good, :-(

Oct 01 2010

bearophile <bearophileHUGS lycos.com> writes:

Thank you for all the answers.

Daniel Gibson:	

 Well, maybe "this(int this.x, int this.y, int a)" would be better.

This reduces useless code in the constructor and keeps the code more DRY, and
it looks able to avoid part of the problems I was talking about (but not all
of them). So this struct:



struct Something {
    int x, y, aa;
    this(int x_, int y_, int a_) {
        this.x = x_;
        this.y = y_;
        this.aa = a_ * a_ + x_;
    }
    void update(int x_, int b) {
        this.x = x_;
        this.aa += b;
    }
}


May be written (it's just syntax sugar):


struct Something {
    int x, y, aa;
    this(this.x, this.y, int a_) {
        this.aa = a_ * a_ + x;
    }
    void update(this.x) {
        this.aa += b;
    }
}


In some situations you need constructor arguments to be of a type different
from the instance attributes. In such situations you may use the normal old
syntax. Or the argument types may be optional, so this code:



class Foo {}
class Bar : Foo {}

class Something {
    Foo c;
    this(Bar c_) {
        this.c = c_;
    }
}
void main() {
    auto s = new Something(new Bar);
}


May be written:


class Foo {}
class Bar : Foo {}

class Something {
    Foo c;
    this(Bar this.c) {}
}
void main() {
    auto s = new Something(new Bar);
}



That syntax idea is a nice way to avoid some code duplication, and I'd like to
have it if it has no bad side effects (besides making the language a bit more
complex), but it can't avoid bugs like the following inc(), so I think it's not
enough to solve the problems I was talking about:



class Foo {
    int x;
    void inc(int x) { x += x; }
}
void main() {}
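
A short sketch of why inc() is a bug, with a hypothetical incFixed() added
only for contrast: inside inc() both occurrences of x name the parameter, so
the field is never touched.

```d
class Foo {
    int x;

    // As in the example above: both x's are the parameter.
    void inc(int x) { x += x; }

    // With the prefix, the field is updated.
    void incFixed(int x) { this.x += x; }
}

void main() {
    auto f = new Foo;
    f.inc(5);
    assert(f.x == 0); // field unchanged
    f.incFixed(5);
    assert(f.x == 5);
}
```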



Although Python is seen by some people as a scripting language unfit for larger
programs, it contains many design decisions able to avoid several kinds of bugs
(that are often enough present in D programs too). Regarding the bugs discussed
in this post, Python is able to avoid some of them because inside methods all
instance attributes must be prefixed by a name, typically "self." (and
class attributes, which are similar to static class attributes in D,
must be prefixed by the class name).

So some of the troubles in D code I am talking about may be avoided by requiring
the "this." prefix where the code may be ambiguous for the programmer (I am not
talking about code ambiguous for the compiler). This can't avoid troubles like

forbid the method arguments that have the same name as class/struct/union
attributes (this is what bug 3878 is about).

For the problems we are talking about in this thread, probably more than one
solution at the same time is needed. The method "this" arguments seem a nice
idea to improve the DRY-ness of the code and avoid some bugs, the obligatory
usage of the "this." prefix when the code is ambiguous for the programmer helps
avoid other bugs, and maybe a warning for x=x; lines of code is useful, and
warnings for unused variables and unused last assigned values are useful too,
to avoid other bugs.

Bye,
bearophile

Oct 01 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

bearophile <bearophileHUGS lycos.com> wrote:

 but it can't avoid bugs like the following inc(), so I think it's not  
 enough to solve the problems I was talking about:



 class Foo {
     int x;
     void inc(int x) { x += x; }
 }
 void main() {}

Oh, but it can (sort of). By allowing this syntax, there is *very* little
reason to allow for shadowing of members by parameters or local variables,
and those may thus more readily be disallowed.

-- 
Simen

Oct 01 2010

bearophile <bearophileHUGS lycos.com> writes:


 struct Something {
     int x, y, aa;
     this(this.x, this.y, int a_) {
         this.aa = a_ * a_ + x;
     }
     void update(this.x) {
         this.aa += b;
     }
 }

Sorry, that's wrong. The correct part:

    void update(this.x, int b) {
        this.aa += b;
    }

Bye,
bearophile

Oct 01 2010

retard <re tard.com.invalid> writes:

Thu, 30 Sep 2010 21:12:53 -0400, bearophile wrote:

 Here (pdf alert) I have found a very simple but interesting paper that
 has confirmed an hypothesis of mine.
 
 This is a page that contains a pdf that shows a short introduction to
 the paper: http://www.ganssle.com/tem/tem80.htm
 
 This is the paper, "Using Redundancies to Find Errors", by Yichen Xie
 and Dawson Engler, 2002: www.stanford.edu/~engler/p401-xie.pdf
 
 
 A trimmed down quote from the tem80 page:
 
 Researchers at Stanford have just released a paper detailing their use
 of automated tools to look for redundant code in 1.6 million lines of
 Linux. "Redundant" is defined as:
 - Idempotent operations (like assigning a variable to itself)
 - Values assigned to variables that are not subsequently used
 - Dead code
 - Redundant conditionals
 
 They found that redundancies, even when harmless, strongly correlate
 with bugs. Even when the extra code causes no problems, odds are high
 that other, real, errors will be found within a few lines of the
 redundant operations.
 
 Block-copied code is often suspect, as the developer neglects to change
 things needed for the code's new use. Another common problem area: error
 handlers, which are tough to test, and are, in data I've gathered, a
 huge source of problems in deployed systems. The authors note that their
 use of lint has long produced warnings about unused variables and return
 codes, which they've always treated as harmless stylistic issues. Now
 it's clear that lint is indeed signalling something that may be
 critically important. The study makes me wonder if compilers that
 optimize out dead code to reduce memory needs aren't in fact doing us a
 disservice. Perhaps they should error and exit instead.

If you've ever compiled open source code, you probably have noticed that 
some developers take software quality seriously. Their programs show no 
warnings/errors at compile time. That's not very impressive when the 
code is below 5000 LOC, but if you apply the same principle when the 
codebase grows to 500000 LOC, it's a big win.

OTOH, there are lots of projects with lazy bastards developing them. 
Something ALWAYS breaks. A minor update from gcc ?.?.0 to ?.?.1 seems to 
be enough to break something. The developers were too lazy to study even 
the basic functionality of C and seem rather surprised when the compiler 
prevents data corruption or segfaults or other nondeterministic states. I 
always treat code with lots of these bugs as something completely rotten. 
In distros like Gentoo these bugs prevent people from actually installing 
and using the program.

 class Foo {
     int x, y;
     this(int x_, int y_) {
         this.x = x;
         y = y;
         
     }
 }
 void main() {}

Some languages prevent this bug by making the parameters immutable in 
some sense (at least shallow immutability). It's even possible in Java, 
and in one place I worked previously "final params by default" was one of 
the rules in code review and style guides.

Oct 02 2010

Walter Bright <newshound2 digitalmars.com> writes:

retard wrote:
 Some languages prevent this bug by making the parameters immutable in 
 some sense (at least shallow immutability). It's even possible in Java, 
 and in one place I worked previously "final params by default" was one of 
 the rules in code review and style guides.

this(const int x, const int y) { ... }
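
A minimal sketch of how that plays out, assuming the same Foo as in the
earlier example: with const parameters of the same names as the fields, the
mistaken "y = y;" would have the const parameter on its left-hand side, so it
is rejected at compile time. The static assert below only checks that claim.

```d
class Foo {
    int x, y;

    this(const int x, const int y) {
        this.x = x;
        this.y = y;
        // "y = y;" would try to modify the const parameter y,
        // so it does not compile.
        static assert(!__traits(compiles, y = y));
    }
}

void main() {
    auto f = new Foo(3, 4);
    assert(f.x == 3 && f.y == 4);
}
```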

Oct 03 2010

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 10/2/10, retard <re tard.com.invalid> wrote:
 Thu, 30 Sep 2010 21:12:53 -0400, bearophile wrote:

 Here (pdf alert) I have found a very simple but interesting paper that
 has confirmed an hypothesis of mine.

 This is a page that contains a pdf that shows a short introduction to
 the paper: http://www.ganssle.com/tem/tem80.htm

 This is the paper, "Using Redundancies to Find Errors", by Yichen Xie
 and Dawson Engler, 2002: www.stanford.edu/~engler/p401-xie.pdf


 A trimmed down quote from the tem80 page:

 Researchers at Stanford have just released a paper detailing their use
 of automated tools to look for redundant code in 1.6 million lines of
 Linux. "Redundant" is defined as:
 - Idempotent operations (like assigning a variable to itself)
 - Values assigned to variables that are not subsequently used
 - Dead code
 - Redundant conditionals

 They found that redundancies, even when harmless, strongly correlate
 with bugs. Even when the extra code causes no problems, odds are high
 that other, real, errors will be found within a few lines of the
 redundant operations.

 Block-copied code is often suspect, as the developer neglects to change
 things needed for the code's new use. Another common problem area: error
 handlers, which are tough to test, and are, in data I've gathered, a
 huge source of problems in deployed systems. The authors note that their
 use of lint has long produced warnings about unused variables and return
 codes, which they've always treated as harmless stylistic issues. Now
 it's clear that lint is indeed signalling something that may be
 critically important. The study makes me wonder if compilers that
 optimize out dead code to reduce memory needs aren't in fact doing us a
 disservice. Perhaps they should error and exit instead.

 If you've ever compiled open source code, you probably have noticed that
 some developers take software quality seriously. Their programs show no
 warnings/errors at compile time. That's not very impressive when the
 code is below 5000 LOC, but if you apply the same principle when the
 codebase grows to 500000 LOC, it's a big win.

 OTOH, there are lots of projects with lazy bastards developing them.
 Something ALWAYS breaks. A minor update from gcc ?.?.0 to ?.?.1 seems to
 be enough to break something. The developers were too lazy to study even
 the basic functionality of C and seem rather surprised when the compiler
 prevents data corruption or segfaults or other nondeterministic states. I
 always treat code with lots of these bugs as something completely rotten.
 In distros like Gentoo these bugs prevent people from actually installing
 and using the program.

Don't forget pragma abuse! I don't have the exact source, but I've
seen code like this in several medium-big sized projects:

// Shut up stupid compiler warnings
#pragma (DISABLE, 5596)
#pragma (DISABLE, 5597)
#pragma (DISABLE, 5598)

So not only do people neglect warnings, they get annoyed with them but
then decide the best solution is to silence the compiler.

OTOH in some cases the warnings are caused by 3rd party libraries and
the warnings are re-enabled for user-code again (I've seen this latter
case used in Scintilla or Scite).

Oct 14 2010

retard <re tard.com.invalid> writes:

Thu, 14 Oct 2010 17:21:39 +0200, Andrej Mitrovic wrote:

 Don't forget pragma abuse! I don't have the exact source, but I've seen
 code like this in several medium-big sized projects:
 
 // Shut up stupid compiler warnings
 #pragma (DISABLE, 5596)
 #pragma (DISABLE, 5597)
 #pragma (DISABLE, 5598)
 
 So not only do people neglect warnings, they get annoyed with them but
 then decide the best solution is to silence the compiler.
 
 OTOH in some cases the warnings are caused by 3rd party libraries and
 the warnings are re-enabled for user-code again (I've seen this latter
 case used in Scintilla or Scite).

Ah, true. It makes one wonder: if C/C++, as systems programming languages, do 
not limit the programmer the way impractical high-level languages do, why 
do you need to hack around the simple warning/error system?

Oct 15 2010

Stewart Gordon <smjg_1998 yahoo.com> writes:

On 01/10/2010 02:12, bearophile wrote:
<snip>
 Researchers at Stanford have just released a paper detailing their use of
 automated tools to look for redundant code in 1.6 million lines of Linux.
 "Redundant" is defined as:
 - Idempotent operations (like assigning a variable to itself)

<snip>

Idempotent operations are not necessarily redundant.

For example,

     x = y;

is idempotent, but not redundant.  But performing the same idempotent 
operation multiple times in succession is an example of redundancy.

Really, section 2 of that paper isn't about idempotence at all.

For those who aren't sure what idempotent means, put simply it means 
that performing the operation multiple times in succession has the same 
effect as performing it only once.

But assigning a variable to itself is indeed redundant, because it has 
no effect.
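
To make the distinction concrete, a small sketch in which an operation is
modelled as a function from the old value of x to the new one. The helper
names assign and addTo, and the values 1 and 5, are assumptions chosen only
for illustration.

```d
// Hypothetical helpers: "old x in, new x out".
int assign(int x, int y) { return y; }     // models  x = y;
int addTo(int x, int y)  { return x + y; } // models  x += y;

void main() {
    int x = 1;
    int y = 5;

    // Idempotent: doing it twice gives the same result as doing it once...
    assert(assign(assign(x, y), y) == assign(x, y));
    // ...but not redundant, because doing it once does change x.
    assert(assign(x, y) != x);

    // Not idempotent: the second application changes the result again.
    assert(addTo(addTo(x, y), y) != addTo(x, y));
}
```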

Stewart.

Oct 02 2010
