digitalmars.D - Length comparison

Bill Baxter (6/6) Dec 05 2006 This bit me just now:

xs0 (24/32) Dec 05 2006 It should be made illegal :) In particular:

Lionello Lunesu (7/34) Dec 05 2006 http://d.puremagic.com/issues/show_bug.cgi?id=259
Don Clugston (17/54) Dec 05 2006 But then you have to explicitly specify the type. In your code above,

Frits van Bommel (19/58) Dec 05 2006 I think his point was that the compiler should perform this

Don Clugston (3/66) Dec 05 2006 But if a mismatch was detected, would there be any way to request a fast...

Frits van Bommel (23/36) Dec 05 2006 Depends on what you mean by fast and what you mean by bugs ;).

Bill Baxter <wbaxter gmail.com> writes:

This bit me just now:

     if (-1 < somearray.length) {
     }

Ouch.  Took me a while to figure out that that bit of code was the problem.

Is there some way to make these kind of bugs less likely or easier to find?

--bb

Dec 05 2006

xs0 <xs0 xs0.com> writes:

Bill Baxter wrote:
 This bit me just now:
 
     if (-1 < somearray.length) {
     }
 
 Ouch.  Took me a while to figure out that that bit of code was the problem.
 
 Is there some way to make these kind of bugs less likely or easier to find?

It should be made illegal :) In particular:
- comparing a negative constant with an unsigned variable
- comparing a too large constant with a signed variable (like 
3_000_000_000 and something of type int)
- setting an unsigned var to a negative constant (uint a = -1)

For other cases, comparing signed with unsigned should produce a warning 
and, more importantly, work properly:

int a;
uint b;

a < b  should be  a < 0 || cast(uint)a < b
a > b  should be  a > 0 && cast(uint)a > b
etc.etc.

I'm quite sure someone will say that's much slower so shouldn't be done, 
but:
- you can do it properly and only compare variables of the same type in 
the first place
- you can trivially make it fast again by casting, with the positive 
side effect of explicitly stating whether you want to use signed or 
unsigned comparison
- I doubt signed/unsigned comparisons are common enough to even make any 
noticeable speed difference

Thoughts? :)


xs0

Dec 05 2006

Lionello Lunesu <lio lunesu.remove.com> writes:

xs0 wrote:
 Bill Baxter wrote:
 This bit me just now:

     if (-1 < somearray.length) {
     }

 Ouch.  Took me a while to figure out that that bit of code was the 
 problem.

 Is there some way to make these kind of bugs less likely or easier to 
 find?

 
 It should be made illegal :) In particular:
 - comparing a negative constant with an unsigned variable
 - comparing a too large constant with a signed variable (like 
 3_000_000_000 and something of type int)
 - setting an unsigned var to a negative constant (uint a = -1)
 
 For other cases, comparing signed with unsigned should produce a warning 
 and, more importantly, work properly:
 
 int a;
 uint b;
 
 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.

http://d.puremagic.com/issues/show_bug.cgi?id=259

Could you please add your remarks to that bugzilla issue as well? 
Especially the proper implementation for mixed comparison. I agree that 
the result should at least be correct, no matter* how slow.

L.

* of course, within limits :)

Dec 05 2006

Don Clugston <dac nospam.com.au> writes:

xs0 wrote:
 Bill Baxter wrote:
 This bit me just now:

     if (-1 < somearray.length) {
     }

 Ouch.  Took me a while to figure out that that bit of code was the 
 problem.

 Is there some way to make these kind of bugs less likely or easier to 
 find?

 
 It should be made illegal :) In particular:
 - comparing a negative constant with an unsigned variable
 - comparing a too large constant with a signed variable (like 
 3_000_000_000 and something of type int)
 - setting an unsigned var to a negative constant (uint a = -1)
 
 For other cases, comparing signed with unsigned should produce a warning 
 and, more importantly, work properly:
 
 int a;
 uint b;
 
 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.
 
 I'm quite sure someone will say that's much slower so shouldn't be done, 
 but:
 - you can do it properly and only compare variables of the same type in 
 the first place
 - you can trivially make it fast again by casting, with the positive 
 side effect of explicitly stating whether you want to use signed or 
 unsigned comparison

But then you have to explicitly specify the type. In your code above, 
suppose that later on, someone changes 'a' to 'long', and sets a = 
(1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < 
b  !!!
You've introduced a bug.

 - I doubt signed/unsigned comparisons are common enough to even make any 
 noticeable speed difference

I think the underlying problem is, there's no way to specify a 
'positiveint' type, ie, a number in the range 0..int.max, which can 
therefore be cast to both int and uint without error.

So you have to declare such variables as either 'int' (implying that 
they can in theory be <0) or as 'uint' (implying that that they can in 
theory be >int.max). Neither choice is good. There's no way to "do it 
properly". It's a personal tradeoff which one you use.

And for this reason, comparing signed variables with unsigned variables 
is almost never an error (when people use 'uint', they almost always 
mean 'positive int'). Detecting mismatch comparisons with literals would 
be a good thing, though.

Dec 05 2006

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Don Clugston wrote:
 xs0 wrote:

<snip>
 For other cases, comparing signed with unsigned should produce a 
 warning and, more importantly, work properly:

 int a;
 uint b;

 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.

 I'm quite sure someone will say that's much slower so shouldn't be 
 done, but:
 - you can do it properly and only compare variables of the same type 
 in the first place
 - you can trivially make it fast again by casting, with the positive 
 side effect of explicitly stating whether you want to use signed or 
 unsigned comparison

 
 But then you have to explicitly specify the type. In your code above, 
 suppose that later on, someone changes 'a' to 'long', and sets a = 
 (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < 
 b  !!!
 You've introduced a bug.

I think his point was that the compiler should perform this 
transformation automatically, i.e. compile "a < b" as if it said "(a < 0 
|| cast(uint)a < b)".
Extrapolating, if 'a' was changed to a long, the compiler should then 
compile it as if transformed as above, except with the cast changed to 
"cast(ulong)".
So yes, if done manually this introduces a bug, but if that just becomes 
the new meaning of comparing a signed and unsigned integer then it 
should work without a problem. I think this is a good idea.

 - I doubt signed/unsigned comparisons are common enough to even make 
 any noticeable speed difference

 
 I think the underlying problem is, there's no way to specify a 
 'positiveint' type, ie, a number in the range 0..int.max, which can 
 therefore be cast to both int and uint without error.

If opCast could be overloaded this might be possible with a struct, but 
it wouldn't be pretty (at least, not without also needing implicit casts 
and overloaded opAssign).
But AFAIK as it is, in the current language you're right that this can't 
be done.

 So you have to declare such variables as either 'int' (implying that 
 they can in theory be <0) or as 'uint' (implying that that they can in 
 theory be >int.max). Neither choice is good. There's no way to "do it 
 properly". It's a personal tradeoff which one you use.
 
 And for this reason, comparing signed variables with unsigned variables 
 is almost never an error (when people use 'uint', they almost always 
 mean 'positive int'). Detecting mismatch comparisons with literals would 
 be a good thing, though.

Yes it would be a good thing if this was detected.
IMHO it would be an even better thing if the automatic transformation 
above would then be performed when such a mismatch is detected :).

Dec 05 2006

Don Clugston <dac nospam.com.au> writes:

Frits van Bommel wrote:
 Don Clugston wrote:
 xs0 wrote:

 <snip>
 For other cases, comparing signed with unsigned should produce a 
 warning and, more importantly, work properly:

 int a;
 uint b;

 a < b  should be  a < 0 || cast(uint)a < b
 a > b  should be  a > 0 && cast(uint)a > b
 etc.etc.

 I'm quite sure someone will say that's much slower so shouldn't be 
 done, but:
 - you can do it properly and only compare variables of the same type 
 in the first place
 - you can trivially make it fast again by casting, with the positive 
 side effect of explicitly stating whether you want to use signed or 
 unsigned comparison

 But then you have to explicitly specify the type. In your code above, 
 suppose that later on, someone changes 'a' to 'long', and sets a = 
 (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a 
 < b  !!!
 You've introduced a bug.

 
 I think his point was that the compiler should perform this 
 transformation automatically, i.e. compile "a < b" as if it said "(a < 0 
 || cast(uint)a < b)".
 Extrapolating, if 'a' was changed to a long, the compiler should then 
 compile it as if transformed as above, except with the cast changed to 
 "cast(ulong)".
 So yes, if done manually this introduces a bug, but if that just becomes 
 the new meaning of comparing a signed and unsigned integer then it 
 should work without a problem. I think this is a good idea.
 
 - I doubt signed/unsigned comparisons are common enough to even make 
 any noticeable speed difference

 I think the underlying problem is, there's no way to specify a 
 'positiveint' type, ie, a number in the range 0..int.max, which can 
 therefore be cast to both int and uint without error.

 
 If opCast could be overloaded this might be possible with a struct, but 
 it wouldn't be pretty (at least, not without also needing implicit casts 
 and overloaded opAssign).
 But AFAIK as it is, in the current language you're right that this can't 
 be done.
 
 So you have to declare such variables as either 'int' (implying that 
 they can in theory be <0) or as 'uint' (implying that that they can in 
 theory be >int.max). Neither choice is good. There's no way to "do it 
 properly". It's a personal tradeoff which one you use.

 And for this reason, comparing signed variables with unsigned 
 variables is almost never an error (when people use 'uint', they 
 almost always mean 'positive int'). Detecting mismatch comparisons 
 with literals would be a good thing, though.

 
 Yes it would be a good thing if this was detected.
 IMHO it would be an even better thing if the automatic transformation 
 above would then be performed when such a mismatch is detected :).

But if a mismatch was detected, would there be any way to request a fast 
comparison without introducing bugs?

Dec 05 2006

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Don Clugston wrote:
 Frits van Bommel wrote:
 Don Clugston wrote:


<snip>
 And for this reason, comparing signed variables with unsigned 
 variables is almost never an error (when people use 'uint', they 
 almost always mean 'positive int'). Detecting mismatch comparisons 
 with literals would be a good thing, though.

 Yes it would be a good thing if this was detected.
 IMHO it would be an even better thing if the automatic transformation 
 above would then be performed when such a mismatch is detected :).

 
 But if a mismatch was detected, would there be any way to request a fast 
 comparison without introducing bugs?

Depends on what you mean by fast and what you mean by bugs ;).

Hmm...
If "a < 0 || cast(uint)a < b" isn't fast enough for your liking, then 
probably the only way to make it faster would be to eliminate the "a < 
0" part, which would require writing out "cast(uint)a < b" in full. That 
  might introduce bugs if the type of a was changed to something that 
can have values bigger than an uint can hold.
So yeah, that might be a problem.

The reality of it is of course that this change probably won't happen 
anyway, because of Walter's whole "if it looks like C, it behaves like 
C" philosophy. If he holds to that the best we can probably hope for is 
some way to express this that doesn't look like C. Maybe some new 
operators for these "safe comparison" operations...
Unless... What does the C standard say about comparing negative signed 
numbers to unsigned numbers? If the result is Undefined(TM) (or 
implementation-defined) then defining it in D would not be limited by 
this :).

Forgive me if this is all a bit too much stream-of-consciousness[1], but 
writing about things is a good way to think about them :).


[1]: Is that the right term? Not sure. Not a term I use a lot ;) 
(especially in English)

Dec 05 2006

D Programming

C/C++ Programming

Other

digitalmars.D - Length comparison