digitalmars.D - Redundancies often reveal bugs

bearophile <bearophileHUGS lycos.com> writes:
Here (pdf alert) I have found a very simple but interesting paper that has
confirmed an hypothesis of mine.

This is a page that contains a pdf that shows a short introduction to the paper:
http://www.ganssle.com/tem/tem80.htm

This is the paper, "Using Redundancies to Find Errors", by Yichen Xie and
Dawson Engler, 2002:
www.stanford.edu/~engler/p401-xie.pdf


A trimmed down quote from the tem80 page:

Researchers at Stanford have just released a paper detailing their use of
automated tools

- Idempotent operations (like assigning a variable to itself)
- Values assigned to variables that are not subsequently used
- Dead code
- Redundant conditionals

They found that redundancies, even when harmless, strongly correlate with
bugs. Even when the extra code causes no problems, odds are high that other,
real, errors will be found within a few lines of the redundant operations.
Block-copied code is often suspect, as the developer neglects to change
things needed for the code's new use. Another common problem area: error
handlers, which are tough to test, and are, in data I've gathered, a huge
source of problems in deployed systems.

The authors note that their use of lint has long produced warnings about
unused variables and return codes, which they've always treated as harmless
stylistic issues. Now it's clear that lint is indeed signalling something
that may be critically important. The study makes me wonder if compilers
that optimize out dead code to reduce memory needs aren't in fact doing us a
disservice. Perhaps they should error and exit instead.

This study confirms that situations like:

x = x;

often hide bugs, unused variables are often enough (as I have suspected,
despite what Walter said about it) a sign for possible real bugs, and
assigned but later unused variables too may hide bugs.

This paper has confirmed that some of my enhancement requests need more
attention:
http://d.puremagic.com/issues/show_bug.cgi?id=3878
http://d.puremagic.com/issues/show_bug.cgi?id=3960
http://d.puremagic.com/issues/show_bug.cgi?id=4407

Situations like x=x; reveal true bugs like:

class Foo {
    int x, y;
    this(int x_, int y_) {
        this.x = x;
        y = y;
    }
}
void main() {}

Now I think that such redundancies and similar things often enough hide true
bugs. But what to do? To turn x=x; into a true error? In a comment to bug
3878 Don gives a situation where DMD may raise a true compile-time error.
But in other cases a true error looks too much to me.

Bye,
bearophile
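
A minimal sketch of the constructor that was presumably intended in the Foo example, assuming the parameters x_ and y_ were meant to initialize the fields. In the version above, both "this.x = x;" and "y = y;" read and write the fields themselves, so the parameters are never used and the fields keep their default value of 0. The asserts in main are added here purely for illustration.

```d
class Foo {
    int x, y;

    this(int x_, int y_) {
        // Assign from the parameters, not from the fields themselves.
        this.x = x_;
        this.y = y_;
    }
}

void main() {
    auto f = new Foo(3, 4);
    assert(f.x == 3 && f.y == 4);
}
```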
Sep 30 2010
Kagamin <spam here.lot> writes:
bearophile Wrote:

 errors will be found
 often hide bugs

 situations like x=x; reveal true bugs like:
 
 class Foo {
     int x, y;
     this(int x_, int y_) {
         this.x = x;
         y = y;
         
     }
 }
 void main() {}

Yes, fields and locals in camelCase is a bug.
Sep 30 2010
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 30 September 2010 23:33:26 Kagamin wrote:
 bearophile Wrote:
 errors will be found
 often hide bugs
 
 situations like x=x; reveal true bugs like:
 
 class Foo {
 
     int x, y;
     this(int x_, int y_) {
     
         this.x = x;
         y = y;
     
     }
 
 }
 void main() {}

Yes, fields and locals in camelCase is a bug.

??? Why on earth would it be a bug to have variable names in camelcase?
Camelcase is purely a stylistic issue - and one which most people adhere to.

- Jonathan M Davis
Oct 01 2010
"JimBob" <jim bob.com> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:i83cil$2o02$1 digitalmars.com...
 situations like x=x; reveal true bugs like:

 class Foo {
    int x, y;
    this(int x_, int y_) {
        this.x = x;
        y = y;

    }
 }

I get hit much more often by something like this....

class Foo {
    int m_x, m_y;
    this(int x, int y) {
        int m_x = x;
        int m_y = y;
    }
}

I don't know if it is, but IMO it really should be an error to declare local
variables that hide member variables.
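
A small sketch of why this variant is easy to miss, assuming the behaviour described in this post (a local declared inside a constructor may hide a member of the same name). The two declarations create new locals, so the fields are never written and keep int.init, which is 0. The class name Shadowed and the asserts are made up for this illustration.

```d
class Shadowed {
    int m_x, m_y;

    this(int x, int y) {
        int m_x = x; // declares a new local that hides the field m_x
        int m_y = y; // likewise; the field m_y is never assigned
    }
}

void main() {
    auto s = new Shadowed(3, 4);
    // Both fields still hold their default value.
    assert(s.m_x == 0 && s.m_y == 0);
}
```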
Oct 01 2010
Peter Alexander <peter.alexander.au gmail.com> writes:
 I dont know if it is, but IMO it really should be an error to declare local
 variables that hide member variables.

I disagree. I always do that in constructors:

int x, y;
this(int x, int y) {
    this.x = x;
    this.y = y;
}

I think you would annoy a lot of people if it was forbidden.
Oct 01 2010
"JimBob" <jim bob.com> writes:
"Peter Alexander" <peter.alexander.au gmail.com> wrote in message 
news:i843rl$1gr9$1 digitalmars.com...
 I dont know if it is, but IMO it really should be an error to declare 
 local
 variables that hide member variables.

I disagree. I always do that in constructors:

int x, y;
this(int x, int y) {
    this.x = x;
    this.y = y;
}

I think you would annoy a lot of people if it was forbidden.

I'm sure it would. But I think the benefit would outweigh the cost. I mean
the cost is coding style, personal preference; the benefit is fewer bugs.
And people would get used to it.
Oct 01 2010
Daniel Gibson <metalcaedes gmail.com> writes:
On Fri, Oct 1, 2010 at 9:50 AM, Peter Alexander
<peter.alexander.au gmail.com> wrote:
 I dont know if it is, but IMO it really should be an error to declare local
 variables that hide member variables.

I disagree. I always do that in constructors:

int x, y;
this(int x, int y) {
    this.x = x;
    this.y = y;
}

I think you would annoy a lot of people if it was forbidden.

I do the same, but got a nasty bug that took me hours to find because in
one condition later down the constructor I forgot the "this." prefix. This
kind of bug is hard to spot by just reading the code.

IMHO it's quite tedious to do all these assignments in a constructor
anyway - it'd be cool to have some possibility to say "this constructor
argument should be assigned to the class's field of the same name", like

int x, y, z;
this(class int x, class int y, int a) {
    // this.x and this.y are set implicitly
    this.z = (x+y)/a;
}

or something like that. Dunno if "class" is an appropriate keyword for that
(probably not), but it should suffice to illustrate the idea. Well, maybe
"this(int this.x, int this.y, int a)" would be better.

And maybe this wouldn't need an addition to the language at all but could be
done with some template/string-mixin magic. I haven't really thought this
through, but *some* possibility to do this (assign constructor- or even
function-arguments to the class field of the same name) would be cool :-)
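
The template/string-mixin idea could look roughly like the sketch below. The helper assignFields is hypothetical (it is not part of the language or of Phobos): it builds the "this.name = name;" statements as a string at compile time, and the constructor pastes them in with mixin. It only removes the repetition; it does nothing about a forgotten "this." elsewhere in the constructor.

```d
// Hypothetical helper: returns D source text such as "this.x = x;\n".
string assignFields(string[] names) {
    string code;
    foreach (name; names)
        code ~= "this." ~ name ~ " = " ~ name ~ ";\n";
    return code;
}

class Foo {
    int x, y, z;

    this(int x, int y, int a) {
        // Expands to: this.x = x; this.y = y;
        mixin(assignFields(["x", "y"]));
        this.z = (x + y) / a;
    }
}

void main() {
    auto f = new Foo(4, 6, 2);
    assert(f.x == 4 && f.y == 6 && f.z == 5);
}
```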
Oct 01 2010
"Simen kjaeraas" <simen.kjaras gmail.com> writes:
Daniel Gibson <metalcaedes gmail.com> wrote:

 this(int this.x, int this.y, int a)

Me likes.

-- 
Simen
Oct 01 2010
retard <re tard.com.invalid> writes:
Fri, 01 Oct 2010 12:38:26 +0200, Simen kjaeraas wrote:

 Daniel Gibson <metalcaedes gmail.com> wrote:
 
 this(int this.x, int this.y, int a)

Me likes.

Looks almost like Scala:

class MyClass(var x: Int, var y: Int, a: Int) { ... }
Oct 02 2010
Justin Johansson <no spam.com> writes:
On 1/10/2010 11:12 AM, bearophile wrote:
 Here (pdf alert) I have found a very simple but interesting paper that has
confirmed an hypothesis of mine.

So far most respondents have gone completely off-subject here.

In hardware systems redundancy is critical for safety. In software systems
redundancy is bad because, as you and the paper suggest, redundancy makes
for bugs.

The principle for software is both normalization, DRY (do not repeat
yourself) and ZIP (zero intolerance for plagiarism).

As always, I enjoy your interesting posts.

Regards
Justin Johansson
Oct 01 2010
Justin Johansson <no spam.com> writes:
On 2/10/2010 1:52 AM, Justin Johansson wrote:
Whoops, bug in my reply.
ZIP as "zero intolerance for plagiarism" is obviously
what I did not mean.  I meant "zero tolerance"
rather than "zero intolerance" but then the acronym
ZTP does not sound so good, :-(
Oct 01 2010
retard <re tard.com.invalid> writes:
Thu, 30 Sep 2010 21:12:53 -0400, bearophile wrote:

 Here (pdf alert) I have found a very simple but interesting paper that
 has confirmed an hypothesis of mine.
 
 This is a page that contains a pdf that shows a short introduction to
 the paper: http://www.ganssle.com/tem/tem80.htm
 
 This is the paper, "Using Redundancies to Find Errors", by Yichen Xie
 and Dawson Engler, 2002: www.stanford.edu/~engler/p401-xie.pdf
 
 
 A trimmed down quote from the tem80 page:
 
Researchers at Stanford have just released a paper detailing their use
of automated tools

defined as:
- Idempotent operations (like assigning a variable to itself)
- Values assigned to variables that are not subsequently used
- Dead code
- Redundant conditionals

They found that redundancies, even when harmless, strongly correlate with
bugs. Even when the extra code causes no problems, odds are high that other,
real, errors will be found within a few lines of the redundant operations.
Block-copied code is often suspect, as the developer neglects to change
things needed for the code’s new use. Another common problem area: error

 handlers, which are tough to test, and are, in data I’ve gathered, a
 huge source of problems in deployed systems. The authors note that their
 use of lint has long produced warnings about unused variables and return
 codes, which they've always treated as harmless stylistic issues. Now
 it's clear that lint is indeed signalling something that may be
 critically important. The study makes me wonder if compilers that
 optimize out dead code to reduce memory needs aren't in fact doing us a
 disservice. Perhaps they should error and exit instead.

If you've ever compiled open source code, you have probably noticed that
some developers take software quality seriously. Their programs show no
warnings/errors at compile time. That's not very impressive when the code
is below 5000 LOC, but if you apply the same principle when the codebase
grows to 500000 LOC, it's a big win.

OTOH, there are lots of projects with lazy bastards developing them.
Something ALWAYS breaks. A minor update from gcc ?.?.0 to ?.?.1 seems to be
enough to break something. The developers were too lazy to study even the
basic functionality of C and seem rather surprised when the compiler
prevents data corruption or segfaults or other nondeterministic states. I
always treat code with lots of these bugs as something completely rotten.
In distros like Gentoo these bugs prevent people from actually installing
and using the program.
 class Foo {
     int x, y;
     this(int x_, int y_) {
         this.x = x;
         y = y;
         
     }
 }
 void main() {}

Some languages prevent this bug by making the parameters immutable in some sense (at least shallow immutability). It's even possible in Java, and in one place I worked previously "final params by default" was one of the rules in code review and style guides.
Oct 02 2010
Walter Bright <newshound2 digitalmars.com> writes:
retard wrote:
 Some languages prevent this bug by making the parameters immutable in 
 some sense (at least shallow immutability). It's even possible in Java, 
 and in one place I worked previously "final params by default" was one of 
 the rules in code review and style guides.

this(const int x, const int y) { ... }
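
A sketch of how const parameters catch the slip, assuming the parameters share the fields' names (unlike the x_/y_ version earlier in the thread, where const would not help because those parameters are never assigned to). With const parameters, an accidental "y = y;" that targets the parameter is rejected at compile time; the static assert is there only to demonstrate that.

```d
class Foo {
    int x, y;

    this(const int x, const int y) {
        this.x = x;
        this.y = y;
        // Forgetting "this." would assign to the const parameter,
        // which the compiler rejects:
        static assert(!__traits(compiles, y = y));
    }
}

void main() {
    auto f = new Foo(3, 4);
    assert(f.x == 3 && f.y == 4);
}
```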
Oct 03 2010
retard <re tard.com.invalid> writes:
Thu, 14 Oct 2010 17:21:39 +0200, Andrej Mitrovic wrote:

 Don't forget pragma abuse! I don't have the exact source, but I've seen
 code like this in several medium-big sized projects:
 
 // Shut up stupid compiler warnings
 #pragma (DISABLE, 5596)
 #pragma (DISABLE, 5597)
 #pragma (DISABLE, 5598)
 
 So not only do people neglect warnings, they get annoyed with them but
 then decide the best solution is to silence the compiler.
 
 OTOH in some cases the warnings are caused by 3rd party libraries and
 the warnings are re-enabled for user-code again (I've seen this latter
 case used in Scintilla or Scite).

Ah, true. Makes one wonder, if C/C++ as systems programming languages are not limiting the programmer unlike impractical high level languages, why do you need to hack the simple warning/error system..
Oct 15 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
On 01/10/2010 02:12, bearophile wrote:
<snip>
 Researchers at Stanford have just released a paper detailing their use of
automated tools

- Idempotent operations (like assigning a variable to itself)

Idempotent operations are not necessarily redundant. For example,

x = y;

is idempotent, but not redundant. But performing the same idempotent
operation multiple times in succession is an example of redundancy.

Really, section 2 of that paper isn't about idempotence at all. For those
who aren't sure what idempotent means, put simply it means that performing
the operation multiple times in succession has the same effect as
performing it only once. But assigning a variable to itself is indeed
redundant, because it has no effect.

Stewart.
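
The distinction drawn here can be shown with a few asserts. This is only an illustrative sketch; the function names assignRepeatedly and addRepeatedly are made up for the example.

```d
// Applies the idempotent assignment x = y the given number of times.
int assignRepeatedly(int x, int y, int times) {
    foreach (i; 0 .. times)
        x = y;
    return x;
}

// Applies the non-idempotent operation x += y the given number of times.
int addRepeatedly(int x, int y, int times) {
    foreach (i; 0 .. times)
        x += y;
    return x;
}

void main() {
    // x = y has an effect (1 becomes 5), so it is not redundant in itself...
    assert(assignRepeatedly(1, 5, 1) == 5);
    // ...but repeating it changes nothing further, so the repeats are redundant.
    assert(assignRepeatedly(1, 5, 3) == 5);
    // For contrast, every repeat of x += y has a new effect.
    assert(addRepeatedly(1, 5, 1) == 6);
    assert(addRepeatedly(1, 5, 3) == 16);
}
```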
Oct 02 2010
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 10/2/10, retard <re tard.com.invalid> wrote:
 Thu, 30 Sep 2010 21:12:53 -0400, bearophile wrote:

 Here (pdf alert) I have found a very simple but interesting paper that
 has confirmed an hypothesis of mine.

 This is a page that contains a pdf that shows a short introduction to
 the paper: http://www.ganssle.com/tem/tem80.htm

 This is the paper, "Using Redundancies to Find Errors", by Yichen Xie
 and Dawson Engler, 2002: www.stanford.edu/~engler/p401-xie.pdf


 A trimmed down quote from the tem80 page:

Researchers at Stanford have just released a paper detailing their use
of automated tools

defined as:
- Idempotent operations (like assigning a variable to itself)
- Values assigned to variables that are not subsequently used
- Dead code
- Redundant conditionals

They found that redundancies, even when harmless, strongly correlate with
bugs. Even when the extra code causes no problems, odds are high that other,
real, errors will be found within a few lines of the redundant operations.
Block-copied code is often suspect, as the developer neglects to change
things needed for the code's new use. Another common problem area: error

 handlers, which are tough to test, and are, in data I've gathered, a
 huge source of problems in deployed systems. The authors note that their
 use of lint has long produced warnings about unused variables and return
 codes, which they've always treated as harmless stylistic issues. Now
 it's clear that lint is indeed signalling something that may be
 critically important. The study makes me wonder if compilers that
 optimize out dead code to reduce memory needs aren't in fact doing us a
 disservice. Perhaps they should error and exit instead.

If you've ever compiled open source code, you have probably noticed that
some developers take software quality seriously. Their programs show no
warnings/errors at compile time. That's not very impressive when the code
is below 5000 LOC, but if you apply the same principle when the codebase
grows to 500000 LOC, it's a big win.

OTOH, there are lots of projects with lazy bastards developing them.
Something ALWAYS breaks. A minor update from gcc ?.?.0 to ?.?.1 seems to be
enough to break something. The developers were too lazy to study even the
basic functionality of C and seem rather surprised when the compiler
prevents data corruption or segfaults or other nondeterministic states. I
always treat code with lots of these bugs as something completely rotten.
In distros like Gentoo these bugs prevent people from actually installing
and using the program.

Don't forget pragma abuse! I don't have the exact source, but I've seen
code like this in several medium-big sized projects:

// Shut up stupid compiler warnings
#pragma (DISABLE, 5596)
#pragma (DISABLE, 5597)
#pragma (DISABLE, 5598)

So not only do people neglect warnings, they get annoyed with them but
then decide the best solution is to silence the compiler.

OTOH in some cases the warnings are caused by 3rd party libraries and
the warnings are re-enabled for user-code again (I've seen this latter
case used in Scintilla or Scite).
Oct 14 2010