digitalmars.D.learn - Mixing operations with signed and unsigned types

reply Michal Minich <michal.minich gmail.com> writes:
I was surprised by the behavior of division. The resulting type of 
division in the example below is uint, and the value is incorrect. I would 
expect that when one of the operands is signed, the result is of a signed 
type. 

int  a = -6;
uint b = 2;
auto c = a / b;          // c is type of uint, and has value 2147483645
int  d = a / b;          // int,  2147483645
auto e = a / cast(int)b; // e, -3 (ok)

I have had problems with mixing int and uint for a long time, so I tested 
some expressions, and here are the results. 

auto f = a - b;          // uint, 4294967288
auto g = a + b;          // uint, 4294967292
auto h = a < b;          // bool, false
auto i = a > b;          // bool, true

Recently, while hunting a bug in templated code, I created a 
templated function for operator < which requires both arguments to be 
either signed or unsigned. Fortunately, in D such a function was quite easy 
to write; if it weren't possible, I don't know if I would ever have found 
out where the ints and uints come from...

import std.traits; // provides isSigned and isUnsigned

bool sameSign (A, B) () {
    return (isUnsigned!(A) && isUnsigned!(B)) || (isSigned!(A) && isSigned!(B));
}

bool lt (A, B) (A a, B b) {
    static assert (sameSign!(A, B) ());
    return a < b;
}

Could somebody please tell me why this behavior, when mixing signed 
and unsigned, is preferred over one that computes the correct result? If 
this cannot be changed, is it possible to just make the compiler emit an 
error/warning when such an incorrect calculation could occur? If it is 
possible in D code to require same-signed types for a function, it is 
definitely possible for the compiler to require an explicit cast in such 
cases.
Jun 29 2010
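
The results reported in the post above can be reproduced with a short program. This is only a minimal sketch, assuming a current D2 compiler; `signedQuotient` is a hypothetical helper name that does not come from the thread. It checks the result types with `static assert` and shows widening to `long` as one possible way to get the mathematically expected quotient.

```d
import std.stdio : writeln;

/// Hypothetical helper: widen to long before dividing so that no
/// signed-to-unsigned conversion takes place.
long signedQuotient(int a, uint b)
{
    return cast(long) a / b; // the cast applies to a; b is then promoted to long
}

void main()
{
    int  a = -6;
    uint b = 2;

    // With int and uint operands, the int operand is converted to uint first.
    static assert(is(typeof(a / b) == uint));
    static assert(is(typeof(a - b) == uint));
    static assert(is(typeof(a + b) == uint));

    assert(cast(uint) a == 4_294_967_290u); // -6 modulo 2^32
    assert(a / b == 2_147_483_645u);
    assert(a - b == 4_294_967_288u);
    assert(a + b == 4_294_967_292u);

    // Avoiding the conversion gives the expected result.
    assert(a / cast(int) b == -3);
    assert(signedQuotient(a, b) == -3);

    writeln("all checks passed");
}
```

Widening to `long` keeps every `int` and `uint` value representable, whereas `cast(int) b` would itself wrap for values of `b` above `int.max`.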
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Michal Minich wrote:
 I was surprised by the behavior of division. The resulting type of 
 division in the example below is uint, and the value is incorrect. I would 
 expect that when one of the operands is signed, the result is of a signed 
 type. 

Going by the spec http://www.digitalmars.com/d/1.0/type.html "Usual Arithmetic Conversions" the compiler is behaving correctly. But see below.... <snip>
 auto f = a - b;          // uint, 4294967288
 auto g = a + b;          // uint, 4294967292
 auto h = a < b;          // bool, false
 auto i = a > b;          // bool, true
 
 Recently, while hunting a bug in templated code, I created a 
 templated function for operator < which requires both arguments to be 
 either signed or unsigned.

It is in fact a bug that DMD accepts it. http://www.digitalmars.com/d/1.0/expression.html#RelExpression http://d.puremagic.com/issues/show_bug.cgi?id=259
 Fortunately, in D such a function was quite easy to write; if it weren't 
 possible, I don't know if I would ever have found out where 
 the ints and uints come from...
 
 bool sameSign (A, B) () {
     return (isUnsigned!(A) && isUnsigned!(B)) || (isSigned!(A) && isSigned!(B));
 }
 
 bool lt (A, B) (A a, B b) {
     static assert (sameSign!(A, B) ());
     return a < b;
 }
 
 Could somebody please tell me why this behavior, when mixing signed 
 and unsigned, is preferred over one that computes the correct result?

It would appear to be Walter's idea of C compatibility taking control again.
 If this cannot be changed, is it possible to just make the compiler emit 
 an error/warning when such an incorrect calculation could occur? If it is 
 possible in D code to require same-signed types for a function, it is 
 definitely possible for the compiler to require an explicit cast in such 
 cases.

I agree. Either behave sensibly or generate an error. Stewart.
Jun 29 2010
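
One way to "behave sensibly" at the library level is a comparison that looks at the sign before converting anything. The following is a rough sketch under that assumption; `safeLess` is a hypothetical name introduced here for illustration, and the code is not what the compiler or the posters in this thread use.

```d
import std.traits : isIntegral, isSigned, isUnsigned;

/// Hypothetical helper (not from the thread): compares two integers by
/// mathematical value, whatever their signedness.
bool safeLess(A, B)(A a, B b)
if (isIntegral!A && isIntegral!B)
{
    static if (isSigned!A && isUnsigned!B)
        // A negative value is below every unsigned value.
        return a < 0 || cast(ulong) a < cast(ulong) b;
    else static if (isUnsigned!A && isSigned!B)
        // An unsigned value is never below a negative one.
        return b >= 0 && cast(ulong) a < cast(ulong) b;
    else
        return a < b; // same signedness: the built-in comparison is fine
}

void main()
{
    int  a = -6;
    uint b = 2;

    assert(safeLess(a, b));   // -6 is below 2
    assert(!safeLess(b, a));  // 2 is not below -6
    assert(safeLess(-6, -3));
    assert(safeLess(2u, 3u));
    assert(!safeLess(uint.max, int.max));
}
```

The casts to `ulong` only run after the sign check, so they never see a negative value. The alternative discussed in the thread, rejecting mixed-sign arguments at compile time with a `static assert`, trades this convenience for an explicit cast at every call site.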
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259

I have added my vote there a long time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile
Jun 29 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Michal Minich:

 Why on the earth should array indexes and lengths be signed !!!

I have explained why at length elsewhere. Short answer: signed fixnum integers are a bad approximation of natural numbers, because they are limited in range, they don't even tell you when you try to step out of their limits, and their limits aren't even symmetrical (so you can't perform abs(int.min)). But unsigned numbers are an even worse approximation: C signed-unsigned conversion rules turn signed values into unsigned in silly situations, and a lot of programmers are bad at using them (this means they sometimes write buggy code when they use unsigned values, yet the language forces every kind of programmer to use unsigned integers often, even in normal simple programs, because indexes and array lengths are everywhere). Unsigned values are unsafe; they are good if you need an array of bits to implement a bit set, or if you want to perform bitwise operations, otherwise I think they are often the wrong choice in D (I don't want to remove them as in Java, because in some situations they are very useful, especially in a near-system language such as D).
 I voted for the bug, but IMO it should be fixed by other means

One other partial solution is to introduce optional runtime integral overflow checks in D (probably two independent switches are needed, one for signed and one for unsigned integral overflows).
 and would probably affect a lot of code.

Yes, but often for the better ;-) Bye, bearophile
Jun 29 2010
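
Both points in the post above, the asymmetric range of `int` and the idea of overflow detection, can be illustrated in present-day D. This sketch assumes a druntime that ships `core.checkedint` (a module that postdates this 2010 thread); `subtractionWraps` is a hypothetical helper name. It shows opt-in checks at individual call sites, not the compiler switches proposed in the post.

```d
import core.checkedint : adds, subu;

/// Hypothetical helper: true if x - y cannot be represented as a uint.
bool subtractionWraps(uint x, uint y)
{
    bool overflow = false;
    const wrapped = subu(x, y, overflow); // wrapped result is ignored here
    return overflow;
}

void main()
{
    // The int range is asymmetric: negating int.min wraps back to int.min,
    // so the absolute value of int.min is not representable.
    int m = int.min;
    assert(-m == int.min);

    // Signed addition: the flag is set when the sum leaves the int range.
    bool overflow = false;
    const sum = adds(int.max, 1, overflow);
    assert(overflow);

    // Unsigned subtraction: 2 - 6 wraps around, and the flag reports it.
    overflow = false;
    const diff = subu(2u, 6u, overflow);
    assert(overflow);
    assert(diff == 4_294_967_292u);

    assert(subtractionWraps(2, 6));
    assert(!subtractionWraps(6, 2));
}
```

The flag passed to these functions is only ever set, never cleared, so one flag can be threaded through a whole expression and checked once at the end.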
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259

I have added my vote there a long time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable.

That's probably because many people neglect the unsigned types, instead using the signed types for array indices and the like. Array indices are actually of type size_t. Effectively, what you seem to be suggesting is that size_t be the same as ptrdiff_t. There is, however, another problem: signed types convert implicitly to unsigned types, though they do generate a warning if compiled with -w (except peculiarly for int/uint). Removing this implicit conversion would break certain existing code that uses signed types where it should be using unsigned. If we also change array indices to be signed, it would break that code that sensibly uses unsigned types, which is probably worse. Stewart.
Jun 29 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 what you seem to be suggesting is that size_t be the same as ptrdiff_t.

Yes, but an unsigned word type needs to be kept in the language.
 There is, however, another problem: signed types convert implicitly to 
 unsigned types, though they do generate a warning if compiled with -w 
 (except peculiarly for int/uint).  Removing this implicit conversion 
 would break certain existing code that uses signed types where it should 
 be using unsigned.

 If we also change array indices to be signed, it 
 would break that code that sensibly uses unsigned types, which is 
 probably worse.

Yes, of course that code needs to be fixed after the change I have suggested. A "breaking change" means that some of the old code needs to be fixed. Bye, bearophile
Jun 30 2010
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
 Stewart Gordon:

 If we also change array indices to be signed, it would break that
 code that sensibly uses unsigned types, which is probably worse.

Yes, of course that code needs to be fixed after the change I have suggested. A "breaking change" means that some of the old code needs to be fixed.

That code needs to be "fixed"? My point was that being forced to use signed types for values that cannot possibly be negative doesn't to me constitute fixing anything. Stewart.
Jun 30 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 That code needs to be "fixed"?  My point was that being forced to use 
 signed types for values that cannot possibly be negative doesn't to me 
 constitute fixing anything.

Yes, in my opinion it needs to be fixed. Using unsigned integers in D is a hazard, so if you use them where they are not necessary (and representing positive-only values is often not one of such cases) then you are doing something wrong, or doing premature optimization. Using an unsigned value to represent a positive-only value is not going to increase your program safety as it does, for example, in Delphi; in D it decreases your program resilience.

Using size_t and uint in your code where you can use an int is something that needs to be fixed, in my opinion. Normal D programmers writing very mundane code must not be forced to face unsigned values every few lines of code. Unsigned values in D are quite bug-prone, so the language has to avoid putting them on your plate every time you want to write some code. You need to be free to use them when you want, but it's better for you to use them only when necessary.

Unsigned values have some purposes, like representing bit fields, representing very large integers (over signed values range) when you are optimizing your code and with your profiler you have found a hot spot and you want to reduce space used or increase performance, to work with bitwise operators, to work with bit fields, and few more. But letting all programmers, even D newbies, mess with unsigned values every time they want to use an array length is something that will cause a very large number of bugs and wasted programming time in future D programs. You will need hard evidence to convince me this is false.

If you want to make your D code a bit safer you have to write code like:

    cast(int)somearray_.length - degree

because if you write more normal expressions like:

    somearray_.length - degree

you can quickly put some bugs in your code :-) I have written something like 300_000 lines of D code so far, and I have found a good number of unsigned-derived bugs in my code. Good luck with your code.
And by the way, in C# you have ways to use unsigned values, but I think array lengths and indexes are signed. Maybe they know better than Walter and you about this design detail. Bye, bearophile
Jul 01 2010
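
The array-length hazard described in the post above is easy to reproduce. This is a small sketch with made-up values; `remaining` is a hypothetical helper that mirrors the `cast(int)` expression from the post.

```d
/// Hypothetical helper mirroring the cast from the post above:
/// the length is converted to int before the subtraction.
int remaining(const int[] values, int degree)
{
    return cast(int) values.length - degree;
}

void main()
{
    int[] somearray = [10, 20, 30];
    int degree = 5;

    // length is a size_t, so degree is converted to unsigned and
    // 3 - 5 wraps around to a huge positive value.
    auto wrapped = somearray.length - degree;
    static assert(is(typeof(wrapped) == size_t));
    assert(wrapped > somearray.length);

    // With the cast, the subtraction is done on signed values.
    assert(remaining(somearray, degree) == -2);
}
```

Note that `cast(int)` would itself truncate a length above `int.max` on a 64-bit target; casting to `ptrdiff_t` instead is one way to avoid that, at the cost of a wider result type.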
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
<snip>
 Yes, in my opinion it needs to be fixed.  Using unsigned integers 
 in D is a hazard, so if you use them where they are not necessary 
 (and representing positive-only values is often not one of such 
 cases) then you are doing something wrong,

If it's logical and the program works, it isn't objectively wrong. Some of us prefer to use unsigned types where the value is semantically unsigned, and know what we're doing. So any measures to strong-arm programmers out of using them are going to be a nuisance. I can also imagine promoting your mindset leading to edit wars between developers declaring an int and then putting assert (qwert >= 0); in the class invariant, and those who see this and think it's brain-damaged. <snip>
 Using size_t and uint in your code where you can use an int is 
 something that needs to be fixed, in my opinion.  Normal D 
 programmers writing very mundane code must not be forced to face 
 unsigned values every few lines of code.

True, but that doesn't mean that we should force programmers to use signed values for nearly everything. But it is all the more reason to fix unsigned op signed to be signed, if it is to be allowed at all. The way it is at the moment, a single unsigned value in a formula can force the whole result to be unsigned, thereby leading to unexpected results.
 Unsigned values in D are quite bug-prone, so the language has to
 avoid putting them on your plate every time you want to write some
 code.  You need to be free to use them when you want, but it's better
 for you to use them only when necessary.

You could make a similar argument about integer types generally. People coming from BASIC backgrounds, or new to programming generally, are sooner or later going to have some work to do when they find that 1/4 != 0.25. Add to that the surprise that is silent overflow....
 Unsigned values have some purposes, like representing bit fields, 
 representing very large integers (over signed values range) when 
 you are optimizing your code and with your profiler you have found 
 a hot spot and you want to reduce space used or increase 
 performance, to work with bitwise operators, to work with bit 
 fields, and few more.

Interfacing file formats. Simplifying certain conditional expressions. Making code self-documenting. Maybe others.... Stewart.
Jul 01 2010
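
The remark above that a single unsigned value can force a whole formula to be unsigned can be shown with a three-variable expression. The names and numbers below are invented for illustration, and `signedTotal` is a hypothetical helper showing one possible workaround (widening to `long`).

```d
/// Hypothetical helper: widen before multiplying so that the whole
/// formula is evaluated in a signed type.
long signedTotal(int balance, int cost, uint quantity)
{
    return balance - cast(long) cost * quantity;
}

void main()
{
    int  balance  = 10;
    int  cost     = 4;
    uint quantity = 3;

    // cost * quantity is uint, so the subtraction is done in uint too:
    // 10 - 12 wraps around instead of giving -2.
    auto total = balance - cost * quantity;
    static assert(is(typeof(total) == uint));
    assert(total == 4_294_967_294u);

    assert(signedTotal(balance, cost, quantity) == -2);

    // The integer-division surprise mentioned in the same post.
    static assert(1 / 4 == 0);
    static assert(1.0 / 4 == 0.25);
}
```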
prev sibling next sibling parent Michal Minich <michal.minich gmail.com> writes:
On Tue, 29 Jun 2010 19:42:45 -0400, bearophile wrote:

 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259

I have added my vote there a long time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile

Why on the earth should array indexes and lengths be signed !!! My brain just explodes when I think of something like that.
Jun 29 2010
prev sibling next sibling parent Michal Minich <michal.minich gmail.com> writes:
On Tue, 29 Jun 2010 19:42:45 -0400, bearophile wrote:

 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259

I have added my vote there a long time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile

I voted for the bug, but IMO it should be fixed by other means than making array indexes and lengths signed. It makes no sense to me, and would probably affect a lot of code.
Jun 29 2010
prev sibling parent Michal Minich <michal.minich gmail.com> writes:
On Wed, 30 Jun 2010 00:30:19 +0100, Stewart Gordon wrote:

 Michal Minich wrote:
 I was surprised by the behavior of division. The resulting type of
 division in the example below is uint, and the value is incorrect. I would
 expect that when one of the operands is signed, the result is of a signed
 type.

Going by the spec http://www.digitalmars.com/d/1.0/type.html "Usual Arithmetic Conversions" the compiler is behaving correctly.

Point 4.4 in the docs: "The signed type is converted to the unsigned type." This is just not good for the most common binary operators; it might be useful for &, | and maybe shifts, but those are much less common....
Jun 29 2010
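
The distinction drawn above between bitwise and arithmetic operators can be checked directly. In this sketch (values chosen for illustration, `lowByte` being a hypothetical helper), the signed-to-unsigned conversion leaves the bit pattern untouched, so `&` gives the same bits either way, while `/` gives a different number.

```d
/// Hypothetical helper: masks the low 8 bits; the int operand is
/// converted to uint, which does not change its bit pattern.
uint lowByte(int value)
{
    return value & 0xFFu;
}

void main()
{
    int  a    = -6;   // bit pattern 0xFFFF_FFFA
    uint mask = 0xFF;

    // Bitwise AND: same bits whether done as uint or as int.
    static assert(is(typeof(a & mask) == uint));
    assert((a & mask) == 0xFAu);
    assert((a & cast(int) mask) == 0xFA);
    assert(lowByte(a) == 250);

    // Division: the conversion changes the numeric result.
    assert(a / mask == 16_843_008u);    // 4294967290 / 255
    assert(a / cast(int) mask == 0);    // -6 / 255, truncated toward zero
}
```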
prev sibling parent Michal Minich <michal.minich gmail.com> writes:
There is a very long discussion on the digitalmars.D newsgroup, "Is there 
ANY chance we can fix the bitwise operator precedence rules?", which I 
should probably read first... but was there some conclusion?
Jun 29 2010