digitalmars.D.learn - Mixing operations with signed and unsigned types

Michal Minich <michal.minich gmail.com> writes:
I was surprised by the behavior of division. The resulting type of the 
division in the example below is uint, and the value is incorrect. I would 
expect that when one of the operands is signed, the result is of a signed 
type. 

int  a = -6;
uint b = 2;
auto c = a / b;          // c is of type uint, and has value 2147483645
int  d = a / b;          // int,  2147483645
auto e = a / cast(int)b; // e, -3 (ok)

I have had problems with mixing int and uint for a long time, so I tested 
some expressions; here are the results. 

auto f = a - b;          // uint, 4294967288
auto g = a + b;          // uint, 4294967292
auto h = a < b;          // bool, false
auto i = a > b;          // bool, true
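
The results above can be reproduced with a short program. This is only a minimal sketch: the helper names `mixedDivide` and `widenedDivide` are hypothetical and not part of the original post, and widening to long is just one possible way to keep the mathematical value.

```d
import std.stdio : writeln;

// Hypothetical helper: int / uint, exactly as in the post above.
// The int operand is converted to uint before the division.
uint mixedDivide(int a, uint b)
{
    return a / b;
}

// Hypothetical helper: widen to long first, so the mathematical
// value of the signed operand is preserved.
long widenedDivide(int a, uint b)
{
    return cast(long) a / b;
}

void main()
{
    int  a = -6;
    uint b = 2;

    // The usual arithmetic conversions make int op uint a uint.
    static assert(is(typeof(a / b) == uint));
    static assert(is(typeof(a - b) == uint));

    writeln(mixedDivide(a, b));   // 2147483645
    writeln(widenedDivide(a, b)); // -3
    writeln(a < b);               // false: a is compared as 4294967290
}
```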

Recently, while I was hunting a bug in templated code, I created a 
templated function for operator <, which requires both arguments to be 
either signed or unsigned. Fortunately, in D such a function was quite easy 
to write; if it hadn't been possible, I don't know if I would ever have 
found where the ints and uints come from...

import std.traits : isSigned, isUnsigned;

bool sameSign (A, B) () {
    return (isUnsigned!(A) && isUnsigned!(B)) || (isSigned!(A) && isSigned!(B));
}

bool lt (A, B) (A a, B b) {
    static assert (sameSign!(A, B) ());
    return a < b;
}
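
Rejecting mixed-sign arguments is one option; another is to compare by mathematical value. The following is a sketch of that alternative, not something proposed in the post: `safeLess` is a hypothetical name, and it relies on `isIntegral`, `isSigned`, `isUnsigned` and `Unsigned` from std.traits.

```d
import std.traits : isIntegral, isSigned, isUnsigned, Unsigned;

// Hypothetical helper: "less than" by mathematical value, even when
// one operand is signed and the other is unsigned.
bool safeLess(A, B)(A a, B b)
    if (isIntegral!A && isIntegral!B)
{
    static if (isSigned!A && isUnsigned!B)
    {
        // A negative value is below every unsigned value.
        return a < 0 || cast(Unsigned!A) a < b;
    }
    else static if (isUnsigned!A && isSigned!B)
    {
        // No unsigned value is below a negative value.
        return b >= 0 && a < cast(Unsigned!B) b;
    }
    else
    {
        return a < b; // same signedness: the built-in comparison is fine
    }
}

void main()
{
    int  a = -6;
    uint b = 2;
    assert(safeLess(a, b));   // -6 < 2
    assert(!safeLess(b, a));  // 2 < -6 is false
    assert(!(a < b));         // the built-in comparison disagrees
}
```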

Could somebody please tell me why this behavior, when mixing signed 
and unsigned, is preferred over one that computes the correct result? If this 
cannot be changed, is it possible to just make the compiler issue an error or 
warning when such an incorrect calculation could occur? If it is possible in D 
code to require same-signed types for a function, it is definitely possible for 
the compiler to require an explicit cast in such cases.
Jun 29 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
Michal Minich wrote:
 I was surprised by the behavior of division. The resulting type of 
 division in example below is uint and the value is incorrect. I would 
 expect that when one of operands is signed, then the result is signed 
 type. 
Going by the spec http://www.digitalmars.com/d/1.0/type.html "Usual Arithmetic Conversions" the compiler is behaving correctly. But see below.... <snip>
 auto f = a - b           // uint, 4294967288
 auto g = a + b           // uint, 4294967292
 auto h = a < b           // bool, false
 auto i = a > b           // bool, true
 
 Recently while I was hunting some bug in templated code, I created a 
 templated function for operator <, which requires both arguments to be 
 either signed or unsigned.
It is in fact a bug that DMD accepts it. http://www.digitalmars.com/d/1.0/expression.html#RelExpression http://d.puremagic.com/issues/show_bug.cgi?id=259
 Fortunately D such function was quite easy to 
 do, if it wasn't possible I don't know if I would ever find form where 
 the ints and uints come from...
 
 bool sameSign (A, B) () {
     return (isUnsigned!(A) && isUnsigned!(B)) || (isSigned!(A) && isSigned!(B));
 }
 
 bool lt (A, B) (A a, B b) {
     static assert (sameSign!(A, B) ());
     return a < b;
 }
 
 Could somebody please tell me why is this behavior, when mixing signed 
 and unsigned, preferred over one that computes correct result.
It would appear to be Walter's idea of C compatibility taking control again.
 If this cannot be changed, is it possible to just make compiler 
 error/warning when such incorrect calculation could occur. If it is 
 possible in D code to require same-signed types for function, it is 
 definitely possible for compiler to require explicit cast in such 
 cases.
I agree. Either behave sensibly or generate an error. Stewart.
Jun 29 2010
bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259
I added my vote there a long time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile
Jun 29 2010
Michal Minich <michal.minich gmail.com> writes:
On Tue, 29 Jun 2010 19:42:45 -0400, bearophile wrote:

 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259
I have added my vote there a lot of time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile
Why on the earth should array indexes and lengths be signed !!! My brain just explodes when I think of something like that.
Jun 29 2010
bearophile <bearophileHUGS lycos.com> writes:
Michal Minich:

Why on the earth should array indexes and lengths be signed !!!
I have explained why at length elsewhere. Short answer: signed fixnum integers are a bad approximation of natural numbers, because they are limited in range, they don't even tell you when you try to step out of their limits, and their limits aren't even symmetrical (so you can't perform abs(int.min)). But unsigned numbers are an even worse approximation: C signed-unsigned conversion rules turn signed values into unsigned in silly situations, and a lot of programmers are bad at using them (this means they sometimes write buggy code when they use unsigned values). Yet the language forces every kind of programmer to use unsigned integers often, even in normal simple programs, because indexes and array lengths are everywhere. Unsigned values are unsafe; they are good if you need an array of bits to implement a bit set, or if you want to perform bitwise operations, otherwise I think they are often the wrong choice in D (I don't want to remove them as in Java, because in some situations they are very useful, especially in a near-system language such as D).
 I voted for the bug, but IMO it should be fixed by other means
One other partial solution is to introduce optional runtime integral overflow checks in D (probably two independent switches are needed, one for signed and one for unsigned integral overflows).
 and would probably affect lot of code.
Yes, but often for the better ;-) Bye, bearophile
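
The compiler switches discussed here are a proposal, not an existing feature. As a rough illustration of what a checked operation could report, the sketch below uses the `adds` and `subu` functions from `core.checkedint`, a druntime module that was added after this discussion took place. The wrapper names `trySub` and `tryAdd` are hypothetical.

```d
import core.checkedint : adds, subu;

// Hypothetical wrapper: unsigned subtraction that reports wrap-around
// instead of silently returning a huge value.
bool trySub(uint x, uint y, out uint result)
{
    bool overflow = false;
    result = subu(x, y, overflow);
    return !overflow;
}

// Hypothetical wrapper: the signed counterpart for addition.
bool tryAdd(int x, int y, out int result)
{
    bool overflow = false;
    result = adds(x, y, overflow);
    return !overflow;
}

void main()
{
    uint u;
    assert(trySub(6, 2, u) && u == 4);
    assert(!trySub(2, 6, u));       // 2 - 6 does not fit in a uint

    int i;
    assert(tryAdd(1, 2, i) && i == 3);
    assert(!tryAdd(int.max, 1, i)); // int.max + 1 does not fit in an int
}
```

Unlike a compiler switch, this only checks the operations that are routed through the wrappers, so it shows the idea rather than the proposed language change.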
Jun 29 2010
Michal Minich <michal.minich gmail.com> writes:
On Tue, 29 Jun 2010 19:42:45 -0400, bearophile wrote:

 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259
I have added my vote there a lot of time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable. Bye, bearophile
I voted for the bug, but IMO it should be fixed by means other than making array indexes and lengths signed. That makes no sense to me, and would probably affect a lot of code.
Jun 29 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
 Stewart Gordon:
 http://d.puremagic.com/issues/show_bug.cgi?id=259
I have added my vote there a lot of time ago. I think Andrei says that fixing this is unworkable, but I don't know why. If you make this an error and at the same time turn array indexes/lengths into signed values, you don't have that many unsigned values in normal D programs, so you need very few casts and it becomes workable.
That's probably because many people neglect the unsigned types, instead using the signed types for array indices and the like. Array indices are actually of type size_t. Effectively, what you seem to be suggesting is that size_t be the same as ptrdiff_t. There is, however, another problem: signed types convert implicitly to unsigned types, though they do generate a warning if compiled with -w (except peculiarly for int/uint). Removing this implicit conversion would break certain existing code that uses signed types where it should be using unsigned. If we also change array indices to be signed, it would break that code that sensibly uses unsigned types, which is probably worse. Stewart.
Jun 29 2010
bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 what you seem to be suggesting is that size_t be the same as ptrdiff_t.
Yes, but an unsigned word type needs to be kept in the language.
 There is, however, another problem: signed types convert implicitly to 
 unsigned types, though they do generate a warning if compiled with -w 
 (except peculiarly for int/uint).  Removing this implicit conversion 
 would break certain existing code that uses signed types where it should 
 be using unsigned.
 If we also change array indices to be signed, it 
 would break that code that sensibly uses unsigned types, which is 
 probably worse.
Yes, of course that code needs to be fixed after the change I have suggested. A "breaking change" means that some of the old code needs to be fixed. Bye, bearophile
Jun 30 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
 Stewart Gordon:
<snip>
 If we also change array indices to be signed, it would break that
 code that sensibly uses unsigned types, which is probably worse.
Yes, of course that code needs to be fixed after the change I have suggested. A "breaking change" means that some of the old code needs to be fixed.
That code needs to be "fixed"? My point was that being forced to use signed types for values that cannot possibly be negative doesn't to me constitute fixing anything. Stewart.
Jun 30 2010
bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:
 That code needs to be "fixed"?  My point was that being forced to use 
 signed types for values that cannot possibly be negative doesn't to me 
 constitute fixing anything.
Yes, in my opinion it needs to be fixed. Using unsigned integers in D is a hazard, so if you use them where they are not necessary (and representing positive-only values is often not one of such cases) then you are doing something wrong, or doing premature optimization. Using an unsigned value to represent a positive-only value is not going to increase your program safety as it does, for example, in Delphi; in D it decreases your program's resilience.

Using size_t and uint in your code where you can use an int is something that needs to be fixed, in my opinion. Normal D programmers writing very mundane code must not be forced to face unsigned values every few lines of code. Unsigned values in D are quite bug-prone, so the language has to avoid putting them on your plate every time you want to write some code. You need to be free to use them when you want, but it's better for you to use them only when necessary.

Unsigned values have some purposes, like representing bit fields, representing very large integers (over the signed range) when you are optimizing your code and your profiler has found a hot spot where you want to reduce the space used or increase performance, working with bitwise operators, and a few more. But letting all programmers, even D newbies, mess with unsigned values every time they want to use an array length is something that will cause a very large number of bugs and wasted programming time in future D programs. You will need hard evidence to convince me this is false.

If you want to make your D code a bit more safe you have to write code like:

cast(int)somearray_.length - degree

because if you write more normal expressions like:

somearray_.length - degree

you can quickly put some bugs in your code :-) I have written something like 300_000 lines of D code so far, and I have found a good number of unsigned-derived bugs in my code. Good luck with your code.

And by the way, in C# you have ways to use unsigned values, but I think array lengths and indexes are signed. Maybe they know better than Walter and you about this design detail. Bye, bearophile
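
The length-minus-degree pitfall mentioned above can be made concrete. This is a small sketch with made-up data; the function name `surplus` is hypothetical, and it casts to long rather than int (an assumption made here so that the cast cannot truncate a 64-bit length).

```d
// Hypothetical helper: how far the array length exceeds `degree`,
// computed in a signed type so that a shortfall comes out negative.
long surplus(const int[] arr, int degree)
{
    return cast(long) arr.length - degree;
}

void main()
{
    int[] coeffs = [1, 2, 3];
    int degree = 5;

    // size_t - int is computed as size_t, so 3 - 5 wraps around
    // to a huge positive number.
    auto wrapped = coeffs.length - degree;
    static assert(is(typeof(wrapped) == size_t));
    assert(wrapped > coeffs.length);

    // With the cast, the result is the expected -2.
    assert(surplus(coeffs, degree) == -2);
}
```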
Jul 01 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
<snip>
 Yes, in my opinion it needs to be fixed.  Using unsigned integers 
 in D is a hazard, so if you use them where they are not necessary 
 (and representing positive-only values is often not one of such 
 cases) then you are doing something wrong,
If it's logical and the program works, it isn't objectively wrong. Some of us prefer to use unsigned types where the value is semantically unsigned, and know what we're doing. So any measures to stronghold programmers against using them are going to be a nuisance. I can also imagine promoting your mindset leading to edit wars between developers declaring an int and then putting assert (qwert >= 0); in the class invariant, and those who see this and think it's brain-damaged. <snip>
 Using size_t and uint in your code where you can use an int is 
 something that needs to be fixed, in my opinion.  Normal D 
 programmers writing very mundane code must not be forced to face 
 unsigned values every few lines of code.
True, but that doesn't mean that we should force programmers to use signed values for nearly everything. But it is all the more reason to fix unsigned op signed to be signed, if it is to be allowed at all. The way it is at the moment, a single unsigned value in a formula can force the whole result to be unsigned, thereby leading to unexpected results.
 Unsigned values in D are quite bug-prone, so the language has to
 avoid putting them on your plate every time you want to write some
 code.  You need to be free to use them when you want, but it's better
 for you to use them only when necessary.
You could make a similar argument about integer types generally. People coming from BASIC backgrounds, or new to programming generally, are sooner or later going to have some work to do when they find that 1/4 != 0.25. Add to that the surprise that is silent overflow....
 Unsigned values have some purposes, like representing bit fields, 
 representing very large integers (over signed values range) when 
 you are optimizing your code and with your profiler you have found 
 a hot spot and you want to reduce space used or increase 
 performance, to work with bitwise operators, to work with bit 
 fields, and few more.
<snip> Interfacing file formats. Simplifying certain conditional expressions. Making code self-documenting. Maybe others.... Stewart.
Jul 01 2010
Michal Minich <michal.minich gmail.com> writes:
On Wed, 30 Jun 2010 00:30:19 +0100, Stewart Gordon wrote:

 Michal Minich wrote:
 I was surprised by the behavior of division. The resulting type of
 division in example below is uint and the value is incorrect. I would
 expect that when one of operands is signed, then the result is signed
 type.
Going by the spec http://www.digitalmars.com/d/1.0/type.html "Usual Arithmetic Conversions" the compiler is behaving correctly.
Point 4.4 in the docs - "The signed type is converted to the unsigned type." This is just not good for the most common binary operators; it might be useful for &, | and maybe shift, but they are much less common....
Jun 29 2010
Michal Minich <michal.minich gmail.com> writes:
There is a very long discussion on the digitalmars.D newsgroup, "Is there ANY 
chance we can fix the bitwise operator precedence rules?", which I should 
probably read first... but was there some conclusion?
Jun 29 2010
bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:

Sorry for the late reply, I was quite busy. Thank you for your comments, even
if I don't agree with some of them :-)


 If it's logical and the program works, it isn't objectively wrong.
Right. But bug-prone means that often enough people write code that doesn't work.
 Some of us prefer to use unsigned types where the value is semantically
 unsigned, and know what we're doing.  So any measures to stronghold programmers
 against using them are going to be a nuisance.
I have not asked to remove the unsigned types, so you can relax. And replacing lengths/indexes with signed values isn't a way to forbid you to use unsigned values in your programs, it's quite the opposite: it's a way to not force me (and many other programmers that want to write simple D non-system programs) to use unsigned values in my code. D (like all other languages besides ASM) tries to push programmers toward safer ways to write code; even types can be seen as restrictions, but a wise programmer knows they are there to help the creation of less buggy programs, etc.
 I can also imagine promoting your mindset leading to edit wars between
 developers declaring an int and then putting
      assert (qwert >= 0);
 in the class invariant, and those who see this and think it's brain-damaged.
This is quite interesting. You think that using an unsigned type in D today is the same thing as using a signed value plus an assert that it is not negative? In the beginning, when I was used to Delphi programming, I did the same, but I soon found out that it was unsafe. Today D unsigned values don't give you a nice overflow error (as I have asked Walter for many times) when you try to assign them a number outside their range; they happily wrap around, and this causes bugs in programs. So using an unsigned number to denote a value that can't be negative is dangerous and it can be stupid too. In D you need to take a signed value from outside and then assign it to an unsigned value only after you have tested it to be nonnegative.
 True, but that doesn't mean that we should force programmers to use
 signed values for nearly everything.
D wants to be a system language, and I presume system programmers are able to use unsigned values. But D can also be used as application language (as C#) and I presume most usages of D will be of this kind. And in my experience there is a good number of 'application programmers' that have problems with unsigned numbers. Length and array indexes are not something that is used by system programmers only (as the opBinary operator overloading) they are things used often in any kind of programs, even small ones, so making them unsigned will be a trap for many programmers. I don't care if you use unsigned values in your programs, and I don't want to force you to use signed values in your programs, but I want to be able to avoid unsigned values when I write small non-system D programs, because they introduce complexities and bugs that I can live without.
 But it is all the more reason to fix unsigned op signed to be signed, if
 it is to be allowed at all.  The way it is at the moment, a single
 unsigned value in a formula can force the whole result to be unsigned,
 thereby leading to unexpected results.
I think Walter will not change this, because then code that is syntactically valid in both D and C would behave differently in D than in C (there are a few exceptions to this D rule, like fixed-size arrays being passed by value in D and by pointer in C). So given that this will not change, other solutions need to be found. I have suggested two solutions, which can be used at the same time:
- Introducing run-time integral overflow checks (as in Delphi and C#, but I think in D two separate switches can be useful, one for signed overflows and one for unsigned overflows);
- and removing a very common source of unsigned values in simple D programs (length/indexes).
 You could make a similar argument the same about integer types
 generally.  People coming from BASIC backgrounds, or new to programming
 generally, are sooner or later going to have some work to do when they
 find that 1/4 != 0.25.  
Some languages are indeed able to represent fractions natively, like Scheme. A "good" high-level language, designed for humans and not for CPUs deserves to act more correctly. So I agree that's a possible source of problems for newbies. But having just one possible source of "problems" is better than having two possible sources of problems :-) And in my experience, while somewhat more experienced programmers are quickly able to cope with the lack of native fractions in a language (and I prefer to have two operators to perform divisions, like / and div in Delphi and / and // in Python3, to denote float or integer divisions), they keep having bugs caused by unsigned values combined with C conversion rules. So I think unsigned values cause worse troubles.
 Add to that the surprise that is silent overflow....
Adding optional runtime integral overflow checks to D is something that I really want. My experience with Delphi has shown me many times that they are able to catch some of my bugs. Walter is Just Wrong [TM] about not appreciating them. C# developers are right on this.
 Interfacing file formats.  Simplifying certain conditional expressions. Making
 code self-documenting.  Maybe others....
Simplifying certain conditional expressions with unsigned values is cool, but you want to do it only in performance-critical spots of your programs, because they can be tricky and in every other part of your program they are bug-prone premature optimization :-)

Regarding the self-documenting of unsigned values, I have explained that this is true in a language that actually enforces their unsigned nature, but in D they are just traps :-) In a language like Ada you can actually do what you mean, and denote their non-negative nature. This is an example: http://ideone.com/ViiOB

with Ada.Integer_Text_Io, Ada.Text_Io;
use Ada.Integer_Text_Io, Ada.Text_Io;

procedure Test is
    subtype Small is Integer range 0..99;
    Input : Small;
begin
    loop
        Get(Input);
        if Input = 42 then
            exit;
        else
            Put (Input);
            new_line;
        end if;
    end loop;
end;

The Small type is user-defined and it can't be negative (or more than 99, in Ada ranges are closed on the right), so if you try to assign 100 or a negative value (as in that example), you receive a run-time error like:

raised CONSTRAINT_ERROR : prog.adb:9 range check failed

This is the right way to enforce a nonnegative number. In D I will try to create a ranged integer (with run-time overflow errors), and if you don't like to use similar ranged values, then it's better to add things like that assert(qwert >= 0); to your class invariant. Bye, bearophile
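
A ranged integer along these lines might look roughly like the following in D. This is only a sketch under assumptions, not the type bearophile had in mind: the names `Ranged`, `Small` and `accepts` are hypothetical, the range is closed on both ends as in the Ada example, and a failed check throws an ordinary Exception via `enforce` from std.exception.

```d
import std.exception : enforce;

// Hypothetical type: an int restricted to the closed range [lo, hi],
// checked at run time on construction and on assignment.
struct Ranged(int lo, int hi)
{
    static assert(lo <= hi);

    private int value = lo;

    this(int v) { opAssign(v); }

    void opAssign(int v)
    {
        enforce(lo <= v && v <= hi, "range check failed");
        value = v;
    }

    int get() const { return value; }
    alias get this; // reads behave like a plain int
}

alias Small = Ranged!(0, 99);

// Hypothetical helper: tries an assignment and reports whether it passed.
bool accepts(ref Small s, int v)
{
    try
    {
        s = v;
        return true;
    }
    catch (Exception)
    {
        return false;
    }
}

void main()
{
    Small input = 42;
    assert(input == 42);
    assert(accepts(input, 99));   // the upper bound is inclusive
    assert(!accepts(input, 100)); // above the range
    assert(!accepts(input, -1));  // negative
    assert(input == 99);          // rejected assignments leave the value unchanged
}
```

Arithmetic on a `Small` goes through `alias this` and yields a plain int, so only stores back into the ranged variable are checked in this sketch.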
Jul 04 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
bearophile wrote:
 Stewart Gordon:
<snip>
 Some of us prefer to use unsigned types where the value is 
 semantically unsigned, and know what we're doing. So any measures 
 to stronghold programmers against using them are going to be a 
 nuisance.
I have not asked to remove the unsigned types, so you can relax. And replacing lengths/indexes with signed values isn't a way to forbid you to use unsigned values in your programs, it's right the opposite: it's a way to not force me (and many other programmers that want to write simple D non-system programs) to use unsigned values in my code.
I didn't think you were asking for unsigned types to be removed. My point was that having language features and APIs relying on signed types for semantically unsigned values is not just a way of not forcing you to use unsigned types - it's also potentially a way of forcing you not to use them, or to pepper your code with casts if you do. I guess you just can't please everybody. <snip>
 I can also imagine promoting your mindset leading to edit wars 
 between developers declaring an int and then putting
      assert (qwert >= 0);
 in the class invariant, and those who see this and think it's 
 brain-damaged.
This is quite interesting. You think that using an unsigned type in D is today the same thing than using a signed value + an assert of it not being negative?
Not quite - an unsigned type has twice the range. It's true that this extra range isn't always used, but in some apps/APIs there may be bits that use the extra range and bits that don't, and it is often simpler to use unsigned everywhere it's logical than to expect the user to remember which is which.
 In the beginning, when I was used to Delphi programming I have done 
 the same, but I have soon found out that was unsafe. Today D unsigned 
 values don't give you a nice overflow error (as I have asked Walter 
 many times) when you try to assign them a number outside their range, 
 they happily wrap around, this causes bugs in programs.
Trouble is that it would add a lot of code to every integer arithmetic operation. Of course, it could be omitted in release builds, but arithmetic is so frequent an activity that the extent it would slow down and bloat development builds would be annoying.
 So using an unsigned number to denote a value that can't be negative 
 is dangerous and it can be stupid too. In D you need to take a signed 
 value from outside and then assign it to a unsigned value only after 
 you have tested it to be nonnegative.
Why can't I read a 32-bit unsigned integer from a binary file, or use such functions as std.conv.toUint (whereby '-' is just another illegal character)?
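
As an illustration of the conversion case, here is a sketch using `std.conv.to` from D2's Phobos (an assumption made here; the post refers to the older `toUint`). Parsing text as uint rejects a leading '-', and converting a negative int to uint throws instead of wrapping. The helper names are hypothetical.

```d
import std.conv : ConvException, to;

// Hypothetical helper: true if `text` parses as a uint.
bool parsesAsUint(string text)
{
    try
    {
        to!uint(text);
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

// Hypothetical helper: true if the int converts to uint without loss.
bool fitsInUint(int v)
{
    try
    {
        to!uint(v);
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

void main()
{
    assert(parsesAsUint("4000000000")); // above int.max, still a valid uint
    assert(!parsesAsUint("-5"));        // '-' is rejected
    assert(fitsInUint(7));
    assert(!fitsInUint(-1));            // checked, unlike cast(uint) -1
}
```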
 But it is all the more reason to fix unsigned op signed to be 
 signed, if it is to be allowed at all.  The way it is at the 
 moment, a single unsigned value in a formula can force the whole 
 result to be unsigned, thereby leading to unexpected results.
I think Walter will not change this, because this way D syntax equal to C syntax does things different from C
But D isn't designed to be fully source-compatible with C, hence the suggestion of making it illegal.
 (there are few exceptions to this D rule, like fixed-sized arrays are 
 passed by value in D and by pointer in C).
<snip> Indeed, the "looks like C, acts like C" principle isn't consistently applied. For instance, in switch, we have: - a case (no pun intended) of it being applied, even though there's no real reason for D to allow the code (fall through) - a case of it being breached (SwitchDefault error). Stewart.
Jul 05 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 07/05/2010 07:59 AM, Stewart Gordon wrote:
 bearophile wrote:
 Stewart Gordon:
 I can also imagine promoting your mindset leading to edit wars
 between developers declaring an int and then putting
 assert (qwert >= 0);
 in the class invariant, and those who see this and think it's
 brain-damaged.
As opposed to doing what?
 This is quite interesting. You think that using an unsigned type in D
 is today the same thing than using a signed value + an assert of it
 not being negative?
Not quite - an unsigned type has twice the range. It's true that this extra range isn't always used, but in some apps/APIs there may be bits that use the extra range and bits that don't, and it is often simpler to use unsigned everywhere it's logical than to expect the user to remember which is which.
Another important difference is the point of non-'continuity'.

With a signed integer, that point is *.max/min. Assuming typical usage of integers centers around zero, this point doesn't get hit frequently.

With an unsigned integer, that point is 0. Assuming the same, this point gets hit much more frequently, which has important implications for subtraction and comparison.
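
A short sketch of that discontinuity, using a counter that steps below zero and a backwards loop over an array. The function name `sumBackwards` is hypothetical; the loop shape shown is one common way to keep an unsigned index from ever going below 0.

```d
// Hypothetical helper: sums the elements from last to first.
int sumBackwards(const int[] arr)
{
    int total = 0;
    // Counting i from length down to 1 and indexing with i - 1 means
    // the unsigned index never has to go below 0.
    for (size_t i = arr.length; i > 0; --i)
        total += arr[i - 1];
    return total;
}

void main()
{
    // An unsigned counter stepping below 0 wraps to the largest value.
    size_t u = 0;
    --u;
    assert(u == size_t.max);

    // A signed counter is continuous around 0.
    int s = 0;
    --s;
    assert(s == -1);

    int[] empty;
    assert(sumBackwards([1, 2, 3]) == 6);
    assert(sumBackwards(empty) == 0);
}
```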
Jul 05 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
Ellery Newcomer wrote:
 On 07/05/2010 07:59 AM, Stewart Gordon wrote:
 bearophile wrote:
 Stewart Gordon:
 I can also imagine promoting your mindset leading to edit wars
 between developers declaring an int and then putting
 assert (qwert >= 0);
 in the class invariant, and those who see this and think it's
 brain-damaged.
As opposed to doing what?
Just using uint, of course! <snip>
 Another important difference is the point of non-'continuity'
 
 with a signed integer, that point is *.max/min. Assuming typical usage 
 of integers centers around zero, this point doesn't get hit frequently.
 
 with an unsigned integer, that point is 0. Assuming the same, this point 
 gets hit much more frequently, which has important implications for 
 subtraction and comparison.
Subtraction - yes, obviously. Comparison - how do you mean? Stewart.
Jul 06 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 07/06/2010 07:05 PM, Stewart Gordon wrote:
 Ellery Newcomer wrote:
 On 07/05/2010 07:59 AM, Stewart Gordon wrote:
 bearophile wrote:
 Stewart Gordon:
 I can also imagine promoting your mindset leading to edit wars
 between developers declaring an int and then putting
 assert (qwert >= 0);
 in the class invariant, and those who see this and think it's
 brain-damaged.
As opposed to doing what?
Just using uint, of course!
For enforcing a non-negative constraint, that is brain damaged. Semantically, the two are very different.

int i;
assert(i >= 0);

says i can cross the 0 boundary, but it's an error if it does, i.e. the programmer doesn't need to be perfect because it *does get caught* (extreme instances notwithstanding).

uint i;

says i cannot cross the 0 boundary, but it isn't an error if it does. The programmer needs to be perfect and the error doesn't get caught (unless what you're using it for can do appropriate bounds checking).
 Comparison - how do you mean?

 Stewart.
Mmmph. Just signed/unsigned, I guess (I was thinking foggily that comparison intrinsically involves subtraction or something like that)
Jul 06 2010
Stewart Gordon <smjg_1998 yahoo.com> writes:
Ellery Newcomer wrote:
 On 07/06/2010 07:05 PM, Stewart Gordon wrote:
<snip>
 Just using uint, of course!
For enforcing a non-negative constraint, that is brain damaged. Semantically, the two are very different.
So effectively, the edit wars would be between people thinking at cross purposes. I guess it would be interesting to see how many libraries are using unsigned types wherever the value is semantically unsigned, and how many are using signed types for such values (maybe with a few exceptions when there's a specific reason).
 int i;
 assert(i >= 0);
 
 says i can cross the 0 boundary, but it's an error if it does, i.e. 
 programmer doesn't need to be perfect because it *does get caught* 
 (extreme instances notwithstanding).
Or equivalently,

uint i;
assert (i <= cast(uint) int.max);
 uint i;
 
 says i cannot cross the 0 boundary, but it isn't an error if it does. 
 programmer needs to be perfect and error doesn't get caught (unless what 
 you're using it for can do appropriate bounds checking).
Or the wrapping round is an intended feature of what you're using it for.
 Comparison - how do you mean?

 Stewart.
Mmmph. Just signed/unsigned, I guess (I was thinking foggily that comparison intrinsically involves subtraction or something like that)
But whether subtraction for comparison works doesn't depend on whether the legal ranges of the source values are signed or unsigned, at least as long as they're both the same. What it does depend on is whether the subtraction is performed in more bits than the number required to represent the legal range. Stewart.
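
The point about bit width can be sketched as follows. Both comparator names are hypothetical; the first subtracts in the same 32 bits as its operands, the second widens to 64 bits before subtracting.

```d
// Hypothetical comparator: subtracts in 32 bits, the width of the operands.
int cmpNarrow(int a, int b)
{
    return a - b; // wraps when the true difference does not fit in an int
}

// Hypothetical comparator: subtracts in 64 bits, which always has room
// for the difference of two 32-bit values.
int cmpWide(int a, int b)
{
    long diff = cast(long) a - b;
    return diff < 0 ? -1 : (diff > 0 ? 1 : 0);
}

void main()
{
    assert(cmpNarrow(3, 5) < 0);       // small values: fine
    assert(cmpNarrow(int.min, 1) > 0); // wrong sign: int.min - 1 wraps to int.max
    assert(cmpWide(int.min, 1) < 0);   // correct

    // The same failure with unsigned operands of the same width.
    uint x = 1, y = 3_000_000_000u;
    assert(x < y);
    assert(cast(int)(x - y) > 0);      // difference has the wrong sign
}
```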
Jul 08 2010
"Jérôme M. Berger" <jeberger free.fr> writes:
Stewart Gordon wrote:
 Ellery Newcomer wrote:
 On 07/06/2010 07:05 PM, Stewart Gordon wrote:
<snip>
 Just using uint, of course!
For enforcing a non-negative constraint, that is brain damaged. Semantically, the two are very different.
 So effectively, the edit wars would be between people thinking at cross
 purposes.
 
 I guess it would be interesting to see how many libraries are using
 unsigned types wherever the value is semantically unsigned, and how many
 are using signed types for such values (maybe with a few exceptions when
 there's a specific reason).
I used to use unsigned types wherever the value is semantically unsigned, but I am in the process of changing to signed everywhere possible because of the brain dead way mixed operations are handled (in C, but D would be the same). Jerome -- mailto:jeberger free.fr http://jeberger.free.fr Jabber: jeberger jabber.fr
Jul 08 2010
bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:

 having language features and APIs relying on signed types
 for semantically unsigned values is not just a way of not forcing you to
 use unsigned types - it's also potentially a way of forcing you not to
 use them, or to pepper your code with casts if you do.  I guess you just
 can't please everybody.
It's better to limit the usage of unsigned values in APIs too. While you can't change the API of existing C libs (that can use unsigned values too) we can create an ecosystem of D modules that use unsigned values only when they are necessary (this means only in uncommon cases). This allows you to use few signed->unsigned casts in your code (C# libs are designed like this).
 Not quite - an unsigned type has twice the range.  It's true that this
 extra range isn't always used,
This happens and it's one of the usage examples of unsigned values, but this usage case requires some conditions:
- The max value of the signed value for example 127, 32767, 2147483647 or 9223372036854775807, is not big enough, this happens.
- The range of the unsigned is surely big enough in your code. This happens, but sometimes it also happens that what overflows a signed value also overflows the unsigned one, it's just one bit more.
- You can't use a bigger value (and cent/ucent are not available yet in D). This is less common. Some APIs give you a value of a certain size/type and you can't change it, but in many other situations you can use for example a long where a int isn't enough.

There are other situations where this is bad (for example you have a large array of such values in a performance-critical spot of your program, so using long instead of uint doubles the array size and increases the cache pressure), or you really have a compute-bound spot in your program, where doing lot of operations on unsigned values give you better performance compared to using longs, but this is not a so common situation. There are many situations where using an uint instead of a long is premature optimization :-)
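A small sketch of the "it's just one bit more" condition, using two hypothetical helpers (`fitsInt`, `fitsUint`) and two made-up sample values. It only illustrates that a value too large for int may or may not fit in uint, while long has room for both.

```d
import std.stdio : writeln;

// Hypothetical helpers: does a 64-bit value fit in the 32-bit type?
bool fitsInt(long v)
{
    return v >= int.min && v <= int.max;
}

bool fitsUint(long v)
{
    return v >= 0 && v <= uint.max;
}

void main()
{
    long small = 3_000_000_000L; // above int.max, below uint.max
    long large = 5_000_000_000L; // above uint.max as well

    writeln(fitsInt(small));  // false: int is not enough
    writeln(fitsUint(small)); // true: the extra bit of uint is enough here
    writeln(fitsUint(large)); // false: one extra bit was not enough
}
```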
 but in some apps/APIs there may be bits
 that use the extra range and bits that don't, and it is often simpler to
 use unsigned everywhere it's logical than to expect the user to remember
 which is which.
The D language has to give you the tools to use messy APIs too, but it's better to teach D programmers to create less messy D APIs, allowing usage of only or mostly signed values, etc. The array lengths and indexes are a good spot to start improving the APIs of the language.
 Trouble is that it would add a lot of code to every integer arithmetic
 operation.
This is a quantitative discussion, in theory this feature can be implemented and then we can measure how many bytes are added to the binaries of a certain number of interesting benchmark programs. My experience with Delphi and C# shows that for me this cost is tolerable (I have tried it in small and medium size programs, for years), both in compile time, run time and binary size (compile time is about the same. Run-time performance is worsened usually less than the array bound tests done by D!).

I have filed some related bugs for LLVM:
http://llvm.org/bugs/show_bug.cgi?id=4916
http://llvm.org/bugs/show_bug.cgi?id=4917
http://llvm.org/bugs/show_bug.cgi?id=4918

Those are enhancement proposals that ask to the LLVM backend to produce optimal asm when it is fed with C code that tests for specific overflows. They show that such overflow tests can add several instructions for each overflow test if they are done through normal C/D code. But that overhead can be reduced to about 3 instructions (one of them is a jump that usually is not taken, so the code execution goes forward straight, so on modern CPUs this jump costs very little) if the compiler implements them more directly, from some little 'templates' written by a human (and in several cases the compiler can omit such tests, for example for loop variables, simple operations with enums, etc, reducing both code size and performance loss).

To test overflows of bytes/shorts/ubytes/ushorts it's needed a little more code, because the CPU flags don't help you much on this. And in my programs often operations are done among floating point values, that have no overflow tests, so they incur in no speed loss or code size increase.
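For reference, this is roughly what an overflow test looks like when it is done "through normal C/D code". It is only a sketch: `checkedAdd` is a hypothetical helper, and it uses the widen-and-compare approach rather than the CPU overflow flag that a compiler-generated test could use.

```d
import std.stdio : writeln;

// Hypothetical helper: adds two ints and reports whether the true sum
// falls outside the int range.
int checkedAdd(int a, int b, out bool overflow)
{
    long wide = cast(long) a + cast(long) b; // cannot overflow in 64 bits
    overflow = wide < int.min || wide > int.max;
    return cast(int) wide; // truncated (wrapped) value when overflow is set
}

void main()
{
    bool o;

    int r1 = checkedAdd(1, 2, o);
    writeln(r1, " ", o); // 3 false

    int r2 = checkedAdd(int.max, 1, o);
    writeln(r2, " ", o); // -2147483648 true
}
```

Written this way each addition costs a widening, two comparisons and a branch, which is the kind of overhead the proposals above try to shrink.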
 Of course, it could be omitted in release builds, but
 arithmetic is so frequent an activity that the extent it would slow down
 and bloat development builds would be annoying.
The overflow tests I am talking about are optional, if you don't want them you can disable them even in development builds. If you don't want them you don't have to pay for them and you can ignore them. They don't even slow down compilation (unless by design during compilation they are always switched on to watch for overflows among the values known compile-time).
 But D isn't designed to be fully source-compatible with C, hence the
 suggestion of making it illegal.
From what Andrei and Walter have said this will not happen (maybe because the language and Phobos force to use too many unsigned values, and we are back to my original idea), so different solutions are needed.

-----------------

Some numbers, using GCC 4.5 and FreePascal 2.4 (fp), on Windows Vista.

Key:
  C    = C code stripped and max optimized.
  fp   = FreePascal code max optimized.
  fp+r = FreePascal code max optimized + range tests + overflow tests.

Benchmarks:

nbody:
  binary size, bytes:
    C:    11_776
    fp:   127_336
    fp+r: 127_464
  runtime, N=1_000_000, seconds:
    C:    0.56
    fp:   0.65
    fp+r: 0.66

old fannkuch:
  binary size, bytes:
    C:    12_288
    fp:   66_011
    fp+r: 66_651
  runtime, N=11, seconds:
    C:    4.97
    fp:   5.14
    fp+r: 9.54

old mandelbrot:
  binary size, bytes:
    C:    11_264
    fp:   66_661
    fp+r: 66_723
  runtime, size=3000, seconds:
    C:    1.88
    fp:   10.35
    fp+r: 11.89

fasta:
  binary size, bytes:
    C:    13_312
    fp:   73_748
    fp+r: 74_658
  runtime, N=5_000_000, seconds:
    C:    2.09
    fp:   2.06
    fp+r: 2.06

recursive:
  binary size, bytes:
    C:    18_944
    fp:   72_980
    fp+r: 73_042
  runtime, N=5_000_000, seconds:
    C:    4.04
    fp:   11.88
    fp+r: 11.90

nbody is heavy FP.
fannkuch is mostly about small array of integers with some integer operations.
mandelbrot is FP-heavy but contains bit-twiddling too.
fasta contains arrays and integer operations.
recursive contains both FP and integer-based operations.

Those are small programs, both the size and performance doesn't change a lot. But better benchmarks are needed.

Bye,
bearophile
Jul 05 2010