www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Length comparison

reply Bill Baxter <wbaxter gmail.com> writes:
This bit me just now:

     if (-1 < somearray.length) {
     }

Ouch.  Took me a while to figure out that that bit of code was the problem.

Is there some way to make these kind of bugs less likely or easier to find?

--bb
Dec 05 2006
parent reply xs0 <xs0 xs0.com> writes:
Bill Baxter wrote:
 This bit me just now:
 
     if (-1 < somearray.length) {
     }
 
 Ouch.  Took me a while to figure out that that bit of code was the problem.
 
 Is there some way to make these kind of bugs less likely or easier to find?
It should be made illegal :) In particular: - comparing a negative constant with an unsigned variable - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int) - setting an unsigned var to a negative constant (uint a = -1) For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly: int a; uint b; a < b should be a < 0 || cast(uint)a < b a > b should be a > 0 && cast(uint)a > b etc.etc. I'm quite sure someone will say that's much slower so shouldn't be done, but: - you can do it properly and only compare variables of the same type in the first place - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference Thoughts? :) xs0
Dec 05 2006
next sibling parent Lionello Lunesu <lio lunesu.remove.com> writes:
xs0 wrote:
 Bill Baxter wrote:
 This bit me just now:

     if (-1 < somearray.length) {
     }

 Ouch.  Took me a while to figure out that that bit of code was the 
 problem.

 Is there some way to make these kind of bugs less likely or easier to 
 find?
It should be made illegal :) In particular: - comparing a negative constant with an unsigned variable - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int) - setting an unsigned var to a negative constant (uint a = -1) For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly: int a; uint b; a < b should be a < 0 || cast(uint)a < b a > b should be a > 0 && cast(uint)a > b etc.etc.
http://d.puremagic.com/issues/show_bug.cgi?id=259 Could you please add your remarks to that bugzilla issue as well? Especially the proper implementation for mixed comparison. I agree that the result should at least be correct, no matter* how slow. L. * of course, within limits :)
Dec 05 2006
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
xs0 wrote:
 Bill Baxter wrote:
 This bit me just now:

     if (-1 < somearray.length) {
     }

 Ouch.  Took me a while to figure out that that bit of code was the 
 problem.

 Is there some way to make these kind of bugs less likely or easier to 
 find?
It should be made illegal :) In particular: - comparing a negative constant with an unsigned variable - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int) - setting an unsigned var to a negative constant (uint a = -1) For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly: int a; uint b; a < b should be a < 0 || cast(uint)a < b a > b should be a > 0 && cast(uint)a > b etc.etc. I'm quite sure someone will say that's much slower so shouldn't be done, but: - you can do it properly and only compare variables of the same type in the first place - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b !!! You've introduced a bug.
 - I doubt signed/unsigned comparisons are common enough to even make any 
 noticeable speed difference
I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error. So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use. And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
Dec 05 2006
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don Clugston wrote:
 xs0 wrote:
<snip>
 For other cases, comparing signed with unsigned should produce a 
 warning and, more importantly, work properly:

 int a;
 uint b;

 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.

 I'm quite sure someone will say that's much slower so shouldn't be 
 done, but:
 - you can do it properly and only compare variables of the same type 
 in the first place
 - you can trivially make it fast again by casting, with the positive 
 side effect of explicitly stating whether you want to use signed or 
 unsigned comparison
But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b !!! You've introduced a bug.
I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)". Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)". So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.
 - I doubt signed/unsigned comparisons are common enough to even make 
 any noticeable speed difference
I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.
If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign). But AFAIK as it is, in the current language you're right that this can't be done.
 So you have to declare such variables as either 'int' (implying that 
 they can in theory be <0) or as 'uint' (implying that that they can in 
 theory be >int.max). Neither choice is good. There's no way to "do it 
 properly". It's a personal tradeoff which one you use.
 
 And for this reason, comparing signed variables with unsigned variables 
 is almost never an error (when people use 'uint', they almost always 
 mean 'positive int'). Detecting mismatch comparisons with literals would 
 be a good thing, though.
Yes it would be a good thing if this was detected. IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
Dec 05 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Frits van Bommel wrote:
 Don Clugston wrote:
 xs0 wrote:
<snip>
 For other cases, comparing signed with unsigned should produce a 
 warning and, more importantly, work properly:

 int a;
 uint b;

 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.

 I'm quite sure someone will say that's much slower so shouldn't be 
 done, but:
 - you can do it properly and only compare variables of the same type 
 in the first place
 - you can trivially make it fast again by casting, with the positive 
 side effect of explicitly stating whether you want to use signed or 
 unsigned comparison
But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b !!! You've introduced a bug.
I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)". Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)". So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.
 - I doubt signed/unsigned comparisons are common enough to even make 
 any noticeable speed difference
I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.
If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign). But AFAIK as it is, in the current language you're right that this can't be done.
 So you have to declare such variables as either 'int' (implying that 
 they can in theory be <0) or as 'uint' (implying that that they can in 
 theory be >int.max). Neither choice is good. There's no way to "do it 
 properly". It's a personal tradeoff which one you use.

 And for this reason, comparing signed variables with unsigned 
 variables is almost never an error (when people use 'uint', they 
 almost always mean 'positive int'). Detecting mismatch comparisons 
 with literals would be a good thing, though.
Yes it would be a good thing if this was detected. IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?
Dec 05 2006
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don Clugston wrote:
 Frits van Bommel wrote:
 Don Clugston wrote:
<snip>
 And for this reason, comparing signed variables with unsigned 
 variables is almost never an error (when people use 'uint', they 
 almost always mean 'positive int'). Detecting mismatch comparisons 
 with literals would be a good thing, though.
Yes it would be a good thing if this was detected. IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?
Depends on what you mean by fast and what you mean by bugs ;). Hmm... If "a < 0 || cast(uint)a < b" isn't fast enough for your liking, then probably the only way to make it faster would be to eliminate the "a < 0" part, which would require writing out "cast(uint)a < b" in full. That might introduce bugs if the type of a was changed to something that can have values bigger than an uint can hold. So yeah, that might be a problem. The reality of it is of course that this change probably won't happen anyway, because of Walter's whole "if it looks like C, it behaves like C" philosophy. If he holds to that the best we can probably hope for is some way to express this that doesn't look like C. Maybe some new operators for these "safe comparison" operations... Unless... What does the C standard say about comparing negative signed numbers to unsigned numbers? If the result is Undefined(TM) (or implementation-defined) then defining it in D would not be limited by this :). Forgive me if this is all a bit too much stream-of-consciousness[1], but writing about things is a good way to think about them :). [1]: Is that the right term? Not sure. Not a term I use a lot ;) (especially in English)
Dec 05 2006