
digitalmars.D - 'int' is enough for 'length' to migrate code from x86 to x64

reply "FrankLike" <1150015857 qq.com> writes:
If you migrate your project from x86 to x64, you will find that
'length' gives an error, and you must modify code such as:
   int i = (something).length
to
   size_t i = (something).length
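
(A minimal sketch of the error this refers to, assuming a 64-bit dmd build and a hypothetical array 'arr':)

   void main()
   {
       int[] arr = [1, 2, 3];
       // arr.length is size_t, i.e. ulong on 64 bit, so the next
       // line fails there with (roughly):
       //   Error: cannot implicitly convert expression of type ulong to int
       //int i = arr.length;
       size_t i = arr.length; // fine on both 32 and 64 bit
   }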

But 'int' is enough for use: not huge and not small, only
enough.
'int' is easy to write, and most people are used to it.
Most importantly, code is easier to migrate if 'length''s return
value type is 'int'.

Thank you all.
Nov 18 2014
next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 18 Nov 2014 12:33:51 +0000
FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 If you migrate your project from x86 to x64, you will find that
 'length' gives an error, and you must modify code such as:
    int i = (something).length
 to
    size_t i = (something).length

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.

 Thank you all.
drop your C.

  auto len = smth.length;

works ok for both x86 and x86_64 (don't know what x64 is).
Nov 18 2014
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 18 November 2014 at 13:35:46 UTC, ketmar via 
Digitalmars-d wrote:
 On Tue, 18 Nov 2014 12:33:51 +0000
 FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 If you migrate your project from x86 to x64, you will find that
 'length' gives an error, and you must modify code such as:
    int i = (something).length
 to
    size_t i = (something).length

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.

 Thank you all.
drop your C.

  auto len = smth.length;

works ok for both x86 and x86_64 (don't know what x64 is).
x64 is commonly used by Windows programmers to refer to x86_64 with a 64-bit OS.
Nov 18 2014
parent "FrankLike" <1150015857 qq.com> writes:
 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
What do you think of the idea?
Nov 18 2014
prev sibling parent reply "FrankLike" <1150015857 qq.com> writes:
 drop your C.

   auto len = smth.length;

 works ok for both x86 and x86_64 (don't know what x64 is).
Many excellent D projects such as dfl, dgui and tango declare many 'length' values as 'int' or 'uint', and many people like them, but they need to be migrated to 64 bit. If 'length''s type were 'int', they would work on 64 bit as-is; as it stands, they must be modified to match 'length''s type.
Nov 18 2014
next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 18 Nov 2014 14:24:16 +0000
FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 drop your C.

   auto len = smth.length;

 works ok for both x86 and x86_64 (don't know what x64 is).
 Many excellent D projects such as dfl, dgui and tango declare many 'length' values as 'int' or 'uint', and many people like them, but they need to be migrated to 64 bit. If 'length''s type were 'int', they would work on 64 bit as-is; as it stands, they must be modified to match 'length''s type.
broken code must be fixed by the authors of the broken code. that code is broken. authors must fix it.
Nov 18 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
 Many excellent D projects such as dfl, dgui and tango declare
 many 'length' values as 'int' or 'uint', and many people like
 them, but they need to be migrated to 64 bit. If 'length''s type
 were 'int', they would work on 64 bit as-is; as it stands, they
 must be modified to match 'length''s type.
broken code must be fixed by the authors of the broken code. that code is broken. authors must fix it.
Now dfl's author, Miller, has stopped updating the dfl project, and dgui's author has too. But they are excellent, and we like to use them.
Nov 18 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 18 Nov 2014 14:41:18 +0000
FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Many excellent D projects such as dfl, dgui and tango declare
 many 'length' values as 'int' or 'uint', and many people like
 them, but they need to be migrated to 64 bit. If 'length''s type
 were 'int', they would work on 64 bit as-is; as it stands, they
 must be modified to match 'length''s type.
broken code must be fixed by the authors of the broken code. that code is broken. authors must fix it.
Now dfl's author, Miller, has stopped updating the dfl project, and dgui's author has too. But they are excellent, and we like to use them.
fork 'em and fix 'em. if nobody wants to fix the project, the project is dead and should not be used, obviously.
Nov 18 2014
parent reply "Frank Like" <1150015857 qq.com> writes:
 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
What do you think of the idea?
Nov 18 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Tue, 18 Nov 2014 15:01:25 +0000,
"Frank Like" <1150015857 qq.com> wrote:

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.

 What do you think of the idea?
I get the idea of a broken record right now...

Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.

Aside from the size factor, I personally prefer unsigned types for countable stuff like array lengths. Mixed arithmetic decays to unsigned anyway and you don't need checks like `assert(idx >= 0)`. It is a matter of taste though and others prefer languages with no unsigned types at all.

-- 
Marco
Nov 18 2014
next sibling parent reply "Frank Like" <1150015857 qq.com> writes:
 Clearly size_t (which I tend to alias with ℕ in my code for
 brevity and coolness) can express more than 2^31-1 items, which
 is appropriate to reflect the increase in usable memory per
 application on 64-bit platforms. Yes, the 64-bit version of a
 program or library can handle larger data sets. Just like it
I know it. But if you compile the dfl library for 64 bit, you will find this error:

  core.sys.windows.windows.WaitForMultipleObjects(uint nCount, void** lpHandles, ...)
  is not callable using argument types (ulong, void**, ...)

The 'WaitForMultipleObjects' function is in dmd2/src/druntime/src/core/sys/windows/windows.d. Its first argument is a value from dfl that comes from a 'length', whose type is size_t, which is 'ulong' on 64 bit.

So druntime must stay consistent with Phobos regarding size_t.
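
(A minimal sketch of the usual caller-side fix, with a hypothetical 'waitAll' helper; the narrowing cast is safe because a handle count can never exceed uint.max:)

  import core.sys.windows.windows;

  void waitAll(HANDLE[] handles)
  {
      // handles.length is size_t ('ulong' on 64 bit), but the API
      // takes a DWORD (uint), so narrow it explicitly.
      assert(handles.length <= uint.max);
      WaitForMultipleObjects(cast(uint) handles.length, handles.ptr,
                             TRUE, INFINITE);
  }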
Nov 19 2014
parent reply "Dominikus Dittes Scherkl" writes:
On Wednesday, 19 November 2014 at 09:06:16 UTC, Marco Leise wrote:
 Clearly size_t (which I tend to alias with ℕ in my code for
 brevity and coolness)
No, this is far from the implied infinite set. A much better candidate for ℕ is BigUInt (and ℤ for BigInt)
Nov 19 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 19 Nov 2014 10:22:49 +0000,
"Dominikus Dittes Scherkl"
<Dominikus.Scherkl continental-corporation.com> wrote:

 On Wednesday, 19 November 2014 at 09:06:16 UTC, Marco Leise wrote:
 Clearly size_t (which I tend to alias with ℕ in my code for
 brevity and coolness)
No, this is far from the implied infinite set.
A much better candidate for ℕ is BigUInt (and ℤ for BigInt)
How far exactly is it from infinity? And how much closer is BigInt? I wanted a fast ℕ within the constraints of the machine. ;)

-- 
Marco
Nov 21 2014
prev sibling next sibling parent reply "Don" <x nospam.com> writes:
On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 On Tue, 18 Nov 2014 15:01:25 +0000,
 "Frank Like" <1150015857 qq.com> wrote:

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
 What do you think of the idea?
I get the idea of a broken record right now...

Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.
This is difficult. Having arr.length return an unsigned type is a dreadful language mistake.
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetic
 decays to unsigned anyway and you don't need checks like
 `assert(idx >= 0)`. It is a matter of taste though and others
 prefer languages with no unsigned types at all.
No! No! No! This is completely wrong. Unsigned does not mean "positive". It means "no sign", and therefore "wrapping semantics", e.g. length - 4 > 0, if length is 2.

Weird consequence: using subtraction with an unsigned type is nearly always a bug.

I wish D hadn't called unsigned integers 'uint'. They should have been called '__uint' or something. They should look ugly. You need a very, very good reason to use an unsigned type.

We have a builtin type that is deadly but seductive.
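
(To make the trap concrete -- a minimal, self-contained example, not from any of the projects discussed:)

  void main()
  {
      size_t length = 2;
      // 2 - 4 wraps around to size_t.max - 1, so the comparison
      // holds even though it reads like "negative > 0".
      assert(length - 4 > 0);
  }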
Nov 19 2014
next sibling parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Wednesday, 19 November 2014 at 10:03:35 UTC, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 On Tue, 18 Nov 2014 15:01:25 +0000,
 "Frank Like" <1150015857 qq.com> wrote:

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
 What do you think of the idea?
I get the idea of a broken record right now...

Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.
This is difficult. Having arr.length return an unsigned type is a dreadful language mistake.
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetic
 decays to unsigned anyway and you don't need checks like
 `assert(idx >= 0)`. It is a matter of taste though and others
 prefer languages with no unsigned types at all.
No! No! No! This is completely wrong. Unsigned does not mean "positive". It means "no sign", and therefore "wrapping semantics", e.g. length - 4 > 0, if length is 2.

Weird consequence: using subtraction with an unsigned type is nearly always a bug.

I wish D hadn't called unsigned integers 'uint'. They should have been called '__uint' or something. They should look ugly. You need a very, very good reason to use an unsigned type.

We have a builtin type that is deadly but seductive.
int has the same wrapping semantics too; it just wraps to negative numbers instead of to zero. If you insist on a non-wrapping length, it should return double or long double.
Nov 19 2014
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 19 November 2014 at 11:04:05 UTC, Matthias Bentrup 
wrote:
 On Wednesday, 19 November 2014 at 10:03:35 UTC, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise 
 wrote:
 On Tue, 18 Nov 2014 15:01:25 +0000,
 "Frank Like" <1150015857 qq.com> wrote:

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
 What do you think of the idea?
I get the idea of a broken record right now...

Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.
This is difficult. Having arr.length return an unsigned type is a dreadful language mistake.
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetic
 decays to unsigned anyway and you don't need checks like
 `assert(idx >= 0)`. It is a matter of taste though and others
 prefer languages with no unsigned types at all.
No! No! No! This is completely wrong. Unsigned does not mean "positive". It means "no sign", and therefore "wrapping semantics", e.g. length - 4 > 0, if length is 2.

Weird consequence: using subtraction with an unsigned type is nearly always a bug.

I wish D hadn't called unsigned integers 'uint'. They should have been called '__uint' or something. They should look ugly. You need a very, very good reason to use an unsigned type.

We have a builtin type that is deadly but seductive.
int has the same wrapping semantics too; it just wraps to negative numbers instead of to zero. If you insist on a non-wrapping length, it should return double or long double.
Which would be totally wrong for different reasons. Short of BigInts or overflow-checking, there is no perfect option. An overflow-checked type that could be reasonably well optimised would be nice, as mentioned by bearophile many times.
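
(A rough sketch of what such a type could look like -- a hypothetical 'Checked' struct; it assumes druntime's core.checkedint helpers and is an illustration, not bearophile's design:)

  import core.checkedint : adds, subs;

  struct Checked
  {
      long value;

      Checked opBinary(string op)(Checked rhs) const
          if (op == "+" || op == "-")
      {
          bool overflow = false;
          static if (op == "+")
              auto r = adds(value, rhs.value, overflow);
          else
              auto r = subs(value, rhs.value, overflow);
          assert(!overflow, "integer overflow");
          return Checked(r);
      }
  }

  unittest
  {
      assert((Checked(2) - Checked(1)).value == 1);
      // Checked(long.max) + Checked(1) would trip the assert
      // instead of silently wrapping.
  }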
Nov 19 2014
parent reply "Don" <x nospam.com> writes:
On Wednesday, 19 November 2014 at 11:43:38 UTC, John Colvin wrote:
 On Wednesday, 19 November 2014 at 11:04:05 UTC, Matthias 
 Bentrup wrote:
 On Wednesday, 19 November 2014 at 10:03:35 UTC, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise 
 wrote:
 On Tue, 18 Nov 2014 15:01:25 +0000,
 "Frank Like" <1150015857 qq.com> wrote:

 But 'int' is enough for use: not huge and not small, only
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, code is easier to migrate if 'length''s return
 value type is 'int'.
 What do you think of the idea?
I get the idea of a broken record right now...

Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.
This is difficult. Having arr.length return an unsigned type is a dreadful language mistake.
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetic
 decays to unsigned anyway and you don't need checks like
 `assert(idx >= 0)`. It is a matter of taste though and others
 prefer languages with no unsigned types at all.
No! No! No! This is completely wrong. Unsigned does not mean "positive". It means "no sign", and therefore "wrapping semantics", e.g. length - 4 > 0, if length is 2.

Weird consequence: using subtraction with an unsigned type is nearly always a bug.

I wish D hadn't called unsigned integers 'uint'. They should have been called '__uint' or something. They should look ugly. You need a very, very good reason to use an unsigned type.

We have a builtin type that is deadly but seductive.
int has the same wrapping semantics too; it just wraps to negative numbers instead of to zero.
No. Signed types do not *wrap*. They *overflow* if their range is exceeded. This is not the same thing. Overflow is always an error. And the compiler could insert checks to detect this.

That's not possible for unsigned types. With an unsigned type, wrapping is part of the semantics.

Moreover, hitting an overflow with a signed type is an exceptional situation. Wrapping with an unsigned type is entirely normal, and happens with things like 1u - 2u.
If you insist on
 non-wrapping length, it should return double or long double.
Which would be totally wrong for different reasons. Short of BigInts or overflow-checking, there is no perfect option. An overflow-checked type that could be reasonably well optimised would be nice, as mentioned by bearophile many times.
I don't think we need to worry about the pathological cases. The problem with unsigned size_t is that it introduces inappropriate semantics everywhere for the sake of the pathological cases.

IMHO the correct solution is to say that the length of a slice cannot exceed half of the memory space, otherwise a runtime error will occur. And then make size_t a positive integer.

Then let typeof(size_t - size_t) == int, instead of uint. All other operations stay as size_t.

Perhaps we can get most of the way by improving range propagation.
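
(A rough library-level sketch of the proposed subtraction semantics -- a hypothetical 'Length' wrapper, illustration only; the actual proposal is a language change:)

  struct Length
  {
      size_t value;

      // With lengths capped at half the address space, the
      // difference always fits in a signed integer.
      long opBinary(string op : "-")(Length rhs) const
      {
          return cast(long) value - cast(long) rhs.value;
      }
  }

  unittest
  {
      assert(Length(2) - Length(4) == -2); // no wrapping
  }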
Nov 19 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 19 Nov 2014 13:33:21 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No. Signed types do not *wrap*. They *overflow* if their range is
 exceeded.
same for unsigned ints.
 This is not the same thing. Overflow is always an error.
 And the compiler could insert checks to detect this.
and for unsigned ints. i want compilers to have special code for this, something like `checkedInt(...)`. and this must be built-in, 'cause checking the carry flag is cheap, but can be done only on the "machine" level.
 That's not possible for unsigned types. With an unsigned type,
 wrapping is part of the semantics.
see above.
 Moreover, hitting an overflow with a signed type is an
 exceptional situation. Wrapping with an unsigned type is entirely
 normal, and happens with things like 1u - 2u.
having results of unsigned int wrapping defined doesn't mean that it's "normal". it's just *defined*, so you can check for it without triggering UB.
 IMHO the correct solution is to say that the length of a slice
 cannot exceed half of the memory space, otherwise a runtime error
 will occur. And then make size_t a positive integer.
but why? maybe 1/3 of address space fits better? or 256 bytes, to really avoid "overflows" and "wrapping"?
 Then let typeof(size_t - size_t) == int, instead of uint. All
 other operations stay as size_t.
check and cast. you can check length and then safely cast it to int, no probs.
Nov 19 2014
parent reply "Don" <x nospam.com> writes:
On Wednesday, 19 November 2014 at 13:47:31 UTC, ketmar via 
Digitalmars-d wrote:
 On Wed, 19 Nov 2014 13:33:21 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No. Signed types do not *wrap*. They *overflow* if their range 
 is exceeded.
same for unsigned ints.
 This is not the same thing. Overflow is always an error.
 And the compiler could insert checks to detect this.
and for unsigned ints. i want compilers to have special code for this, something like `checkedInt(...)`. and this must be built-in, 'cause checking the carry flag is cheap, but can be done only on the "machine" level.
I don't know what you mean. For unsigned ints, carry is not an error. That's the whole point of unsigned!
 That's not possible for unsigned types. With an unsigned type,
 wrapping is part of the semantics.
see above.
 Moreover, hitting an overflow with a signed type is an 
 exceptional situation. Wrapping with an unsigned type is 
 entirely normal, and happens with things like 1u - 2u.
having results of unsigned int wrapping defined doesn't mean that it's "normal". it's just *defined*, so you can check for it without triggering UB.
 IMHO the correct solution is to say that the length of a slice 
 cannot exceed half of the memory space, otherwise a runtime 
 error will occur. And then make size_t a positive integer.
but why? maybe 1/3 of address space fits better? or 256 bytes, to really avoid "overflows" and "wrapping"?
No. The point is to get correct semantics. Unsigned types do not have the correct semantics. Signed types do.
 Then let typeof(size_t - size_t) == int, instead of uint. All 
 other operations stay as size_t.
check and cast. you can check length and then safely cast it to int, no probs.
This is the job of the compiler, not the programmer. The compiler should do this at all possible places where a slice could exceed int.max / long.max. That's cheap because there are hardly any places it could happen (for example, for array slices it can only happen with 1-byte types).

---
Almost everybody seems to think that unsigned means positive. It does not.
---
Nov 19 2014
next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 19 Nov 2014 14:04:15 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 I don't know what you mean. For unsigned ints, carry is not an
 error. That's the whole point of unsigned!
this *may* not be an error. it depends on what the programmer wants.
 This is the job of the compiler, not the programmer. The compiler
 should do this at all possible places where a slice could exceed
 int.max / long.max. That's cheap because there are hardly any
 places it could happen (for example, for array slices it can only
 happen with 1-byte types).
i, for myself, don't want the compiler to do arcane type conversions behind my back.
 ---
 Almost everybody seems to think that unsigned means positive. It
 does not.
 ---
sure, it includes zero, which is neither positive nor negative. ;-)
Nov 19 2014
prev sibling next sibling parent "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Wednesday, 19 November 2014 at 14:04:16 UTC, Don wrote:
 No. The point is to get correct semantics. Unsigned types do 
 not have the correct semantics. Signed types do.
In D both signed and unsigned integers have defined wrapping semantics. In C++ signed integers are allowed to format your hard disk on overflow.
Nov 19 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 6:04 AM, Don wrote:
 Almost everybody seems to think that unsigned means positive. It does not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Nov 19 2014
parent reply "Don" <x nospam.com> writes:
On Wednesday, 19 November 2014 at 17:55:26 UTC, Andrei 
Alexandrescu wrote:
 On 11/19/14 6:04 AM, Don wrote:
 Almost everybody seems to think that unsigned means positive. 
 It does not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Even the responses in this thread indicate that about half of the people here don't understand unsigned.

"unsigned" means "I want to use modulo 2^^n arithmetic". It does not mean "this is an integer which cannot be negative".

Using modulo 2^^n arithmetic is *weird*. If you are using uint/ulong to represent a non-negative integer, you are using the incorrect type.
 "With only a bit of care one can use D's unsigned types for 
 positive numbers."
I do not believe that statement to be true. I believe that bugs caused by unsigned calculations are subtle and require an extraordinary level of diligence. I showed an example at DConf that I had found in production code.

It's particularly challenging in D because of the widespread use of 'auto':

  auto x = foo();
  auto y = bar();
  auto z = baz();

  if (x - y > z) { ... }

This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch.

I believe the correct statement is "With only a bit of care one can use D's unsigned types for positive numbers and believe that one's code is correct, even though it contains subtle bugs."
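
(Filling in the example with concrete, hypothetical functions to show the failure:)

  size_t foo() { return 10; }  // e.g. some length
  size_t bar() { return 20; }
  size_t baz() { return  5; }

  void main()
  {
      auto x = foo();
      auto y = bar();
      auto z = baz();

      // Mathematically x - y is -10, which is not greater than 5,
      // but the subtraction happens in size_t and wraps to a huge
      // value, so the test passes.
      assert(x - y > z);
  }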
Nov 20 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/20/14 12:18 AM, Don wrote:
 On Wednesday, 19 November 2014 at 17:55:26 UTC, Andrei Alexandrescu wrote:
 On 11/19/14 6:04 AM, Don wrote:
 Almost everybody seems to think that unsigned means positive. It does
 not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Even the responses in this thread indicate that about half of the people here don't understand unsigned.

"unsigned" means "I want to use modulo 2^^n arithmetic". It does not mean "this is an integer which cannot be negative".

Using modulo 2^^n arithmetic is *weird*. If you are using uint/ulong to represent a non-negative integer, you are using the incorrect type.
 "With only a bit of care one can use D's unsigned types for positive
 numbers."
I do not believe that statement to be true. I believe that bugs caused by unsigned calculations are subtle and require an extraordinary level of diligence. I showed an example at DConf that I had found in production code.

It's particularly challenging in D because of the widespread use of 'auto':

  auto x = foo();
  auto y = bar();
  auto z = baz();

  if (x - y > z) { ... }

This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch.

I believe the correct statement is "With only a bit of care one can use D's unsigned types for positive numbers and believe that one's code is correct, even though it contains subtle bugs."
Well I'm sorry but I quite disagree. -- Andrei
Nov 20 2014
next sibling parent reply "FrankLike" <1150015857 qq.com> writes:
 auto x = foo();
 auto y = bar();
 auto z = baz();

 if (x - y > z) { ... }


 This might be a bug, if one of these functions returns an 
 unsigned
 type.  Good luck finding that. Note that if all functions 
 return
 unsigned, there isn't even any signed-unsigned mismatch.

 I believe the correct statement, is "With only a bit of care 
 one can use
 D's unsigned types for positive numbers and believe that one's 
 code is
 correct, even though it contains subtle bugs."
Well I'm sorry but I quite disagree. -- Andrei
This might be a bug. 'Length' always needs to compare sizes, and 'Width' and 'Height' are like it.

************************ dfl/drawing.d line:185-218 **************************

///
Size opAdd(Size sz)
{
    Size result;
    result.width = width + sz.width;
    result.height = height + sz.height;
    return result;
}

///
Size opSub(Size sz)
{
    Size result;
    result.width = width - sz.width;
    result.height = height - sz.height;
    return result;
}

///
void opAddAssign(Size sz)
{
    width += sz.width;
    height += sz.height;
}

///
void opSubAssign(Size sz)
{
    width -= sz.width;
    height -= sz.height;
}

***********************end*************************

If the type of width and height were size_t, their values would be wrong. Small test:

-----------------------
import std.stdio;

void main()
{
    size_t width = 10;
    size_t height = 20;
    writeln("before width is ", width, " ,height is ", height);
    height -= 1;
    width -= height;
    writeln("after width is ", width, " ,height is ", height);
}
----------

The "after width is" value is wrong.
Nov 20 2014
parent "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Thursday, 20 November 2014 at 13:26:23 UTC, FrankLike wrote:
 auto x = foo();
 auto y = bar();
 auto z = baz();

 if (x - y > z) { ... }


 This might be a bug, if one of these functions returns an 
 unsigned
 type.  Good luck finding that. Note that if all functions 
 return
 unsigned, there isn't even any signed-unsigned mismatch.

 I believe the correct statement, is "With only a bit of care 
 one can use
 D's unsigned types for positive numbers and believe that 
 one's code is
 correct, even though it contains subtle bugs."
Well I'm sorry but I quite disagree. -- Andrei
This might be a bug. 'Length' always needs to compare sizes, and 'Width' and 'Height' are like it.

************************ dfl/drawing.d line:185-218 **************************

///
Size opAdd(Size sz)
{
    Size result;
    result.width = width + sz.width;
    result.height = height + sz.height;
    return result;
}

///
Size opSub(Size sz)
{
    Size result;
    result.width = width - sz.width;
    result.height = height - sz.height;
    return result;
}

///
void opAddAssign(Size sz)
{
    width += sz.width;
    height += sz.height;
}

///
void opSubAssign(Size sz)
{
    width -= sz.width;
    height -= sz.height;
}

***********************end*************************

If the type of width and height were size_t, their values would be wrong. Small test:

-----------------------
import std.stdio;

void main()
{
    size_t width = 10;
    size_t height = 20;
    writeln("before width is ", width, " ,height is ", height);
    height -= 1;
    width -= height;
    writeln("after width is ", width, " ,height is ", height);
}
----------

The "after width is" value is wrong.
I get "after width is 18446744073709551607 ,height is 19", which looks mathematically correct to me.
Nov 20 2014
prev sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/20/14, 6:47 AM, Andrei Alexandrescu wrote:
 On 11/20/14 12:18 AM, Don wrote:
 On Wednesday, 19 November 2014 at 17:55:26 UTC, Andrei Alexandrescu
 wrote:
 On 11/19/14 6:04 AM, Don wrote:
 Almost everybody seems to think that unsigned means positive. It does
 not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Even the responses in this thread indicate that about half of the people here don't understand unsigned.

"unsigned" means "I want to use modulo 2^^n arithmetic". It does not mean "this is an integer which cannot be negative".

Using modulo 2^^n arithmetic is *weird*. If you are using uint/ulong to represent a non-negative integer, you are using the incorrect type.
 "With only a bit of care one can use D's unsigned types for positive
 numbers."
I do not believe that statement to be true. I believe that bugs caused by unsigned calculations are subtle and require an extraordinary level of diligence. I showed an example at DConf that I had found in production code.

It's particularly challenging in D because of the widespread use of 'auto':

  auto x = foo();
  auto y = bar();
  auto z = baz();

  if (x - y > z) { ... }

This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch.

I believe the correct statement is "With only a bit of care one can use D's unsigned types for positive numbers and believe that one's code is correct, even though it contains subtle bugs."
Well I'm sorry but I quite disagree. -- Andrei
I don't think disagreeing without a reason (like the one Don gave above) is good. You could show us the benefits of unsigned types over signed types (possibly considering that not every program in the world needs an array with 2^64 elements).
Nov 20 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/20/14 6:20 AM, Ary Borenszweig wrote:
 On 11/20/14, 6:47 AM, Andrei Alexandrescu wrote:
 On 11/20/14 12:18 AM, Don wrote:
 On Wednesday, 19 November 2014 at 17:55:26 UTC, Andrei Alexandrescu
 wrote:
 On 11/19/14 6:04 AM, Don wrote:
 Almost everybody seems to think that unsigned means positive. It does
 not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Even the responses in this thread indicate that about half of the people here don't understand unsigned.

"unsigned" means "I want to use modulo 2^^n arithmetic". It does not mean "this is an integer which cannot be negative".

Using modulo 2^^n arithmetic is *weird*. If you are using uint/ulong to represent a non-negative integer, you are using the incorrect type.
 "With only a bit of care one can use D's unsigned types for positive
 numbers."
I do not believe that statement to be true. I believe that bugs caused by unsigned calculations are subtle and require an extraordinary level of diligence. I showed an example at DConf that I had found in production code.

It's particularly challenging in D because of the widespread use of 'auto':

  auto x = foo();
  auto y = bar();
  auto z = baz();

  if (x - y > z) { ... }

This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch.

I believe the correct statement is "With only a bit of care one can use D's unsigned types for positive numbers and believe that one's code is correct, even though it contains subtle bugs."
Well I'm sorry but I quite disagree. -- Andrei
I don't think disagreeing without a reason (like the one Don gave above) is good.
Most of the statements I disagreed with were opinions.
 "unsigned" means "I want to use modulo 2^^n arithmetic". It does
 not mean, "this is an integer which cannot be negative".
Opinion.
 Using modulo 2^^n arithmetic is *weird*.
Opinion.
 If you are using
 uint/ulong to represent a non-negative integer, you are using the
 incorrect type.
Opinion.
 I believe that
 bugs caused by unsigned calculations are subtle and require an
 extraordinary level of diligence.
Opinion (correctly qualified as belief). Andrei
Nov 20 2014
parent reply "Araq" <rumpf_a web.de> writes:
 Most of the statements I disagreed with were opinions.

 "unsigned" means "I want to use modulo 2^^n arithmetic". It 
 does
 not mean, "this is an integer which cannot be negative".
Opinion.
 Using modulo 2^^n arithmetic is *weird*.
Opinion.
 If you are using
 uint/ulong to represent a non-negative integer, you are 
 using the
 incorrect type.
Opinion.
 I believe that
 bugs caused by unsigned calculations are subtle and require 
 an
 extraordinary level of diligence.
Opinion (correctly qualified as belief).
It's not only his "opinion", it's his *experience*. And if we want to play the "argument by authority" game: he most likely wrote more production-quality code in D than you did.

Here are some more "opinions":
http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
Nov 20 2014
next sibling parent reply "flamencofantasy" <flamencofantasy gmail.com> writes:
On Thursday, 20 November 2014 at 15:40:40 UTC, Araq wrote:
 Most of the statements I disagreed with were opinions.

 "unsigned" means "I want to use modulo 2^^n arithmetic". It 
 does
 not mean, "this is an integer which cannot be negative".
Opinion.
 Using modulo 2^^n arithmetic is *weird*.
Opinion.
 If you are using
 uint/ulong to represent a non-negative integer, you are 
 using the
 incorrect type.
Opinion.
 I believe that
 bugs caused by unsigned calculations are subtle and require 
 an
 extraordinary level of diligence.
Opinion (correctly qualified as belief).
It's not only his "opinion", it's his *experience* and if we want to play the "argument by authority" game: he most likely wrote more production quality code in D than you did. Here are some more "opinions": http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
My experience is totally the opposite of his. I have been using unsigned for lengths, widths, heights for the past 15 years in [...] I don't pretend to be any kind of authority though.

The article you point to is totally flawed and kinda wasteful in terms of having to read it; the very first code snippet is obviously buggy (its loop can never terminate, because an unsigned i is always >= 0). You can't purposefully write buggy code and then comment on the dangers of this or that!

  size_t i;
  for (i = size - 1; i >= 0; --i) {

If that's subtle to you then yes, use signed!
Nov 20 2014
parent reply "Kagamin" <spam here.lot> writes:
On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy 
wrote:
 My experience is totally the opposite of his. I have been using
 unsigned for lengths, widths, heights for the past 15 years in
 [...] I don't pretend to be any kind of authority though.
[...] they are not CLS-compliant. You're going against established practices there. And signed types for numbers work wonders in .net without any notable problem and make reasoning about code easier, as you don't have to manually check for unsigned conversion bugs everywhere.
 The article you point to is totally flawed and kinda wasteful 
 in terms of having to read it; the very first code snippet is 
 obviously buggy.
That's the whole point: mixing signed with unsigned is bug-prone. Worse, it's inevitable if you force unsigned types everywhere.
Nov 21 2014
next sibling parent reply "FrankLike" <1150015857 qq.com> writes:
On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote:
 On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy
 wrote:

 [...] they are not CLS-compliant. You're going against established
 practices there. And signed types for numbers work wonders in
 .net without any notable problem and make reasoning about code
 easier, as you don't have to manually check for unsigned
 conversion bugs everywhere.
 That's the whole point: mixing signed with unsigned is 
 bug-prone. Worse, it's inevitable if you force unsigned types 
 everywhere.
Right. Druntime should have a checksize_t.d....

Frank
Nov 21 2014
parent "FrankLike" <1150015857 qq.com> writes:
  Druntime's checkedint.d should be modified:

  // return the absolute difference instead of wrapping
  uint subu(uint x, uint y, ref bool overflow)
  {
      if (x < y)
          return y - x;
      else
          return x - y;
  }

  ulong subu(ulong x, ulong y, ref bool overflow)
  {
      if (x < y)
          return y - x;
      else
          return x - y;
  }

Frank
Nov 21 2014
prev sibling parent "flamencofantasy" <flamencofantasy gmail.com> writes:
On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote:


 [...] they are not CLS-compliant. You're going against established
 practices there. And signed types for numbers work wonders in
 .net without any notable problem and make reasoning about code
 easier, as you don't have to manually check for unsigned
 conversion bugs everywhere.
I don't want to be CLS compliant! I make too heavy use of unsafe code, stackalloc and interop to worry about CLS compliance. Actually, one of the major reasons I am looking at D for production code is so that I don't have to mix and match language/runtime :).

Anyways, I believe the discussion is about using unsigned for array lengths, not unsigned in general. At this point most people seem to express an opinion - including me - and I certainly hope D stays as it is when it comes to the length of an array. I am not convinced in the slightest that signed is the way to go.
Nov 21 2014
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 20 November 2014 at 15:40:40 UTC, Araq wrote:
 Most of the statements I disagreed with were opinions.

 "unsigned" means "I want to use modulo 2^^n arithmetic". It 
 does
 not mean, "this is an integer which cannot be negative".
Opinion.
 Using modulo 2^^n arithmetic is *weird*.
Opinion.
 If you are using
 uint/ulong to represent a non-negative integer, you are 
 using the
 incorrect type.
Opinion.
 I believe that
 bugs caused by unsigned calculations are subtle and require 
 an
 extraordinary level of diligence.
Opinion (correctly qualified as belief).
It's not only his "opinion", it's his *experience*. And if we want to play the "argument by authority" game: he most likely wrote more production-quality code in D than you did.
Urrmmm, really? Andrei has written a hell of a lot of production-quality code. I use it every day, in production, as do many others.
Nov 20 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/20/14 7:40 AM, Araq wrote:
 It's not only his "opinion", it's his *experience*
Yes, there's some good anecdotal evidence too. IMHO not enough to trigger a change to other solutions that have their own issues. -- Andrei
Nov 20 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 20 Nov 2014 15:40:39 +0000
Araq via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Here are some more "opinions":
 http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
trying to illustrate something with obviously wrong code is very funny. the whole article then reduces to "hey, i'm writing bad code, and i can teach you to do the same!" won't buy it.
Nov 21 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Nov 20, 2014 at 08:18:23AM +0000, Don via Digitalmars-d wrote:
 On Wednesday, 19 November 2014 at 17:55:26 UTC, Andrei Alexandrescu wrote:
On 11/19/14 6:04 AM, Don wrote:
Almost everybody seems to think that unsigned means positive. It
does not.
That's an exaggeration. With only a bit of care one can use D's unsigned types for positive numbers. Please let's not reduce the matter to black and white.

Andrei
Even the responses in this thread indicate that about half of the people here don't understand unsigned.

"unsigned" means "I want to use modulo 2^^n arithmetic". It does not mean "this is an integer which cannot be negative".

Using modulo 2^^n arithmetic is *weird*. If you are using uint/ulong to represent a non-negative integer, you are using the incorrect type.
[...]

By that logic, using an int to represent an integer is also using the incorrect type, because a signed type is *also* subject to modulo 2^^n arithmetic -- just a different form of it where the most negative value wraps around to the most positive values. Fixed-width integers in computing are NOT the same thing as unrestricted integers in mathematics. No matter how you try to rationalize it, as long as you use hardware fixed-width "integers", you're dealing with modulo arithmetic in one form or another. Pretending you're not is the real source of said subtle bugs.

T

-- 
Why waste time learning, when ignorance is instantaneous? -- Hobbes, from Calvin & Hobbes
Nov 20 2014
next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"H. S. Teoh via Digitalmars-d"  wrote in message 
news:mailman.2156.1416499421.9932.digitalmars-d puremagic.com...

 By that logic, using an int to represent an integer is also using the
 incorrect type, because a signed type is *also* subject to module 2^^n
 arithmetic -- just a different form of it where the most negative value
 wraps around to the most positive values.  Fixed-width integers in
 computing are NOT the same thing as unrestricted integers in
 mathematics. No matter how you try to rationalize it, as long as you use
 hardware fix-width "integers", you're dealing with modulo arithmetic in
 one form or another. Pretending you're not, is the real source of said
 subtle bugs.
While what you've said is true, the typical range of values stored in an integral type is much more likely to cause unsigned wrapping than signed overflow.

So to get the desired 'integer-like' behaviour from D's integral types, you need to care about magnitude for signed types, or both magnitude and ordering for unsigned types. E.g. 'a < b' becoming 'a - b < 0' is valid for integers and small ints, but not valid for small uints unless a >= b.

You will always have to care about the imperfect representation of mathematical integers, but with unsigned types you have an extra rule that is much more likely to affect typical code.
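
(Checking that rewrite rule directly -- a tiny hypothetical example:)

  void main()
  {
      int  a = 1, b = 2;
      uint c = 1, d = 2;
      assert((a < b) == (a - b < 0)); // valid rewrite for ints
      auto diff = c - d;              // uint: wraps to uint.max
      assert(c < d && diff > 0);      // same rewrite gives the wrong answer
  }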
Nov 21 2014
prev sibling parent "Kagamin" <spam here.lot> writes:
On Thursday, 20 November 2014 at 16:03:41 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 By that logic, using an int to represent an integer is also
 using the incorrect type, because a signed type is *also*
 subject to modulo 2^^n arithmetic -- just a different form of it
 where the most negative value wraps around to the most positive
 values.
The type is chosen at design time so that it's unlikely to overflow for the particular scenario. Why would you want the count of objects to reset at some point when counting objects?

Wrapping of unsigned integers has valid usage for e.g. hash functions, but there they are used as bit arrays, not proper numbers, and arithmetic operators are used for bit shuffling, not for computing some numbers.
Nov 21 2014
prev sibling next sibling parent "Steve Sobel" <s.sobellian gmail.com> writes:
On Thursday, 20 November 2014 at 08:18:24 UTC, Don wrote:
 ...

 It's particularly challenging in D because of the widespread 
 use of 'auto':

 auto x = foo();
 auto y = bar();
 auto z = baz();

 if (x - y > z) { ... }


 This might be a bug, if one of these functions returns an 
 unsigned type.  Good luck finding that. Note that if all 
 functions return unsigned, there isn't even any signed-unsigned 
 mismatch.

 ...
I personally think this code is bad style. If the function requires a signed integer type, then `auto` with no qualifications at all is clearly too loose; if the programmer had specified what he needed to begin with, the error would have been caught at compile time. You can replace `auto` with an explicit signed integer type like `long`.

If foo and bar are template parameters and you don't know the precise return type, then a static assert that x and y are signed will do the trick. If it is known that x > y and the function does not require a signed integer type, then an assert should be used.

Frankly, that snippet just illustrates the sort of constraints that should be put on generic code.
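
(A sketch of that constraint on generic code -- hypothetical names, illustration only:)

  import std.traits : isSigned;

  auto difference(alias foo, alias bar)()
  {
      auto x = foo();
      auto y = bar();
      // Refuse to compile when the subtraction below would
      // silently become unsigned and wrap.
      static assert(isSigned!(typeof(x)), "foo must return a signed type");
      static assert(isSigned!(typeof(y)), "bar must return a signed type");
      return x - y;
  }

  long ten() { return 10; }
  long twenty() { return 20; }

  void main()
  {
      assert(difference!(ten, twenty)() == -10); // ok: both signed
      // difference!(() => 1u, twenty)();        // rejected at compile time
  }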
Nov 20 2014
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 20 Nov 2014 08:18:23 +0000,
"Don" <x nospam.com> wrote:

 It's particularly challenging in D because of the widespread use
 of 'auto':

  auto x = foo();
  auto y = bar();
  auto z = baz();

  if (x - y > z) { ... }

 This might be a bug, if one of these functions returns an
 unsigned type.  Good luck finding that. Note that if all
 functions return unsigned, there isn't even any signed-unsigned
 mismatch.
With those function names I cannot write code.

  ℕ x = length();
  ℕ y = index();
  ℕ z = requiredRange();

  if (x - y > z) { ... }

Ah, now we're getting somewhere. Yes, the code is obviously correct. You need to be aware of the value ranges of your variables and write subtractions in a way that the result can only be >= 0. If you realize that you cannot guarantee that for some case, you just found a logic bug. An invalid program state that you need to assert/if-else/throw.

I don't get why so many APIs return ints. Must be to support Java or something where proper unsigned types aren't available.

-- 
Marco
Nov 21 2014
parent reply "Don" <x nospam.com> writes:
On Friday, 21 November 2014 at 17:23:51 UTC, Marco Leise wrote:
 On Thu, 20 Nov 2014 08:18:23 +0000,
 "Don" <x nospam.com> wrote:

 It's particularly challenging in D because of the widespread 
 use of 'auto':
 
 auto x = foo();
 auto y = bar();
 auto z = baz();
 
 if (x - y > z) { ... }
 
 
 This might be a bug, if one of these functions returns an 
 unsigned type.  Good luck finding that. Note that if all 
 functions return unsigned, there isn't even any 
 signed-unsigned mismatch.
With those function names I cannot write code.

  ℕ x = length();
  ℕ y = index();
  ℕ z = requiredRange();

  if (x - y > z) { ... }

Ah, now we're getting somewhere. Yes, the code is obviously correct. You need to be aware of the value ranges of your variables and write subtractions in a way that the result can only be >= 0. If you realize that you cannot guarantee that for some case, you just found a logic bug. An invalid program state that you need to assert/if-else/throw.
Yup. And that is not captured in the type system.
 I don't get why so many APIs return ints. Must be to support
 Java or something where proper unsigned types aren't available.
???? D and C do not have suitable types either. unsigned != ℕ.

In D, 1u - 2u > 0u. This is defined behaviour, not an overflow.
Nov 24 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/24/14 4:54 AM, Don wrote:
 In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.
I think I get what you mean, but overflow is also defined behavior (in D at least). -- Andrei
Nov 24 2014
parent reply "Don" <x nospam.com> writes:
On Monday, 24 November 2014 at 15:56:44 UTC, Andrei Alexandrescu 
wrote:
 On 11/24/14 4:54 AM, Don wrote:
 In D,  1u - 2u > 0u. This is defined behaviour, not an 
 overflow.
I think I get what you mean, but overflow is also defined behavior (in D at least). -- Andrei
Aargh! You're right. That's new, and dreadful. It didn't use to be.

The offending commit is

  alexrp  2012-05-15 15:37:24

which only provides an unsigned example.

Why are we defining behaviour that is always a bug? Java makes it defined, but it has to because it doesn't have unsigned types. I think the intention probably was to improve on the C situation, where there is undefined behaviour that really should be defined.

But do we really want to preclude ever having overflow checking for integers?
Nov 25 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Don:

 Aargh! You're right. That's new, and dreadful. It didn't use
 to be.
 The offending commit is

 alexrp              2012-05-15 15:37:24

 which only provides an unsigned example.

 Why are we defining behaviour that is always a bug? Java makes it
 defined, but it has to because it doesn't have unsigned types.
 I think the intention probably was to improve on the C 
 situation, where there is undefined behaviour that really 
 should be defined.

 But do we really want to preclude ever having overflow checking 
 for integers?
+1

Bye,
bearophile
Nov 25 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 25 November 2014 at 11:43:01 UTC, Don wrote:
 Why are we defining behaviour that is always a bug? Java makes it
 defined, but it has to because it doesn't have unsigned types.
 I think the intention probably was to improve on the C 
 situation, where there is undefined behaviour that really 
 should be defined.
Mostly to prevent optimizations based on the no-overflow assumption.
 But do we really want to preclude ever having overflow checking 
 for integers?
Overflow checking doesn't contradict overflow being defined. The latter simply reflects how hardware works, nothing else. And hardware works that way because that's a fast implementation of arithmetic for the general case.
Nov 25 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 25 November 2014 at 13:52:32 UTC, Kagamin wrote:
 Overflow checking doesn't contradict to overflow being defined. 
 The latter simply reflects how hardware works, nothing else. 
 And hardware works that way, because that's a fast 
 implementation of arithmetic for general case.
So you are basically saying that D does not provide modular arithmetic, but allows you to continue with the incorrect result of an overflow as a modulo representation? Because you have to choose: you cannot have both modular arithmetic and overflow at the same time for the same operator. Overflow happens because you have monotonic semantics for addition, not modular semantics.

Btw, http://dlang.org/expression needs a clean-up; the term "underflow" is not used correctly.
Nov 25 2014
parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 25 November 2014 at 14:30:36 UTC, Ola Fosheim Grøstad 
wrote:
 So you are basically saying that D does not provide modular 
 arithmetic, but allows you to continue with the incorrect 
 result of an overflow as a modulo representation?
Correctness is an emergent property - when behavior matches expectation, so overflow has variable correctness in various parts of the code.
Nov 25 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 25 November 2014 at 15:42:13 UTC, Kagamin wrote:
 Correctness is an emergent property - when behavior matches 
 expectation, so overflow has variable correctness in various 
 parts of the code.
I assume you are basically restating Walter's view that matching C++ is more important than getting it right, because some people might expect C++ behaviour. Yet Ada chose a different path and is considered a better language with respect to correctness.

I think it is important to get the definitions consistent and sound so they are easy to reason about, both for users and implementors. So one should choose whether the type is primarily monotonic, with incorrect values "truncated into" modulo N, or if the type is primarily modular.

If addition is defined to be primarily monotonic, it means you can optimize "if(x < x+1)…" into "if (true)…". If it is defined to be primarily modular, then you cannot.
Nov 25 2014
parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 25 November 2014 at 15:52:22 UTC, Ola Fosheim Grøstad 
wrote:
 I assume you are basically saying that Walter's view that 
 matching C++ is more important than getting it right, because 
 some people might expect C++ behaviour. Yet Ada chose a 
 different path and is considered a better language with respect 
 to correctness.
C++ legacy is huge, especially in culture. That said, the true issue is in beliefs (which probably stem from the 16-bit era). I can't judge Ada, as I have no experience with it, though the examples of Java and .net show how marginal the importance of unsigned types is.
 I think it is important to get the definitions consistent and 
 sound so they are easy to reason about, both for users and 
 implementors. So one should choose whether the type is 
 primarily monotonic, with incorrect values "truncated into" 
 modulo N, or if the type is primarily modular.
In this light the examples by Marco Leise become interesting: he tries to avoid wrapping even for unsigned types, so, yes, types are primarily monotonic and optimized for small values.
 If addition is defined to be primarily monotonic it means you 
 can optimize "if(x < x+1)…" into "if (true)…". If it is defined 
 to be primarily modular, then you cannot.
Such optimizations have a bad reputation. If they were more conservative and didn't propagate back in the code flow, the situation would probably be better. Also, isn't (x < x+1) a suspicious expression? Is it a good idea to mess with it?
Nov 25 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 25 November 2014 at 18:24:29 UTC, Kagamin wrote:
 C++ legacy is huge especially in culture. That said, the true 
 issue is in beliefs (which probably stem from 16-bit era). 
 Can't judge Ada, have no experience with it, though examples of 
 Java and .net show how marginal is importance of unsigned types.
Unsigned bytes are important, and I personally tend to make just about everything unsigned when dealing with C-like languages, because that makes me aware of the pitfalls and I avoid the signedness issue. The downside is that it takes extra work to get the evaluation order right, and you have to take extra care to make sure loops terminate correctly by being very conscious about +-1 issues when terminating around zero.

But I don't really think C++ legacy is a good reason to keep implicit coercion no matter what programming style one has. Coercion is generally something I try to avoid, even explicitly, so why would I want the compiler to do it with no warning?
 Such optimizations have a bad reputation. If they were more 
 conservative and didn't propagate back in code flow, the 
 situation would be probably better. Also isn't (x < x+1) a 
 suspicious expression, is it a good idea to mess with it?
It is just an example; it could be the result of substituting aliased values.

Anyway, I think it is important to not only define what happens if you add 1 to 0xffffffff, but also define whether that result is considered in correspondence with the type. If it isn't a correct value for the type, then the programmer will have to make no assumptions that optimizations will heed the resulting incorrect value. The only acceptable alternative is to have the language specification explicitly define the type as modular and overflow-free. If not, you end up with weak typing…?

I personally would take the monotonic optimizations and rather have a separate bit-fiddling type that provides a clean builtin swiss-army-knife toolset that gives close to direct access to the whole arsenal that the CPU instruction set provides (carry, ROL/ROR, bitcounts etc).
Nov 25 2014
next sibling parent "Frank Like" <1150015857 qq.com> writes:
When I migrated the dfl code from x86 to 64 bit and modified drawing.d, I found that 'offset', 'index', point(x,y), rect(x,y,...) all had to stay consistent with 'length''s type. So I didn't change them to size_t; I only cast(int) the length, and then it was easy to migrate the dfl code to 64 bit.
OK, dfl can now work on 64 bit.
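In other words, something like this sketch (someArray stands in for the actual dfl fields; note that the cast silently truncates any length above int.max on 64 bit):

void main()
{
    int[] someArray = new int[10];
    // fine for GUI-sized data; silently wrong past int.max
    int n = cast(int)someArray.length;
}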
Nov 26 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 25 November 2014 at 22:56:50 UTC, Ola Fosheim Grøstad 
wrote:
 I personally would take the monotonic optimizations and rather 
 have a separate bit-fiddling type that provides a clean builtin 
 swiss-army-knife toolset that gives close to direct access to 
 the whole arsenal that the CPU instruction set provides (carry, 
 ROL/ROR, bitcounts etc).
I don't think there's such a clear separation that can be expressed in a type; it's more a matter of coding practice than of type. You can't change coding practice by introducing a new type.
Nov 27 2014
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 27 November 2014 at 08:31:24 UTC, Kagamin wrote:
 I don't think there's such a clear separation that can be 
 expressed in a type; it's more a matter of coding practice 
 than of type. You can't change coding practice by introducing 
 a new type.
You need to separate and define the old types as well as introduce a clean way to do low level manipulation. How to do the latter is not as clear, but regular types should be constrained to convey the intent of the programmer. The intent is conveyed to the compiler and to readers of the source code. So the type definition should be strict on whether the intent is to convey monotonic qualities or circular/modular qualities. The C practice of casting from void* to char* to float to uint to int in order to do bit manipulation leads to badly structured code. Intrinsics also lead to less readable code. There's got to be a better solution to keep "bit hacks" separate from regular code. Maybe a register type that maps onto SIMD registers…
Nov 27 2014
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Kagamin:

 You can't change coding practice by introducing a new type.
We can try to change coding practice by introducing new types :-) Bye, bearophile
Nov 27 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 24 Nov 2014 12:54:58 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.
this *is* overflow. D just has the overflow result defined.
Nov 24 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
Digitalmars-d wrote:
 this *is* overflow. D just has the overflow result defined.
So it basically is and isn't modular arithmetic at the same time?

I think Ada got this right by providing the ability to specify the modulo value, so you can define:

type Weekday is mod 7;
type Byte is mod 256;

A solid solution is to provide the «As if Infinitely Ranged» integer model, where the compiler figures out how large integers are needed for computation and then does overflow detection when you truncate for storage:

http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
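Something similar can be sketched in D with a wrapper struct; Mod and Weekday below are illustrative names, not an existing library type:

struct Mod(uint N)
{
    uint value;
    Mod opBinary(string op : "+")(Mod rhs) const
    {
        // wrapping at N is part of the type's definition, as in Ada
        return Mod((value + rhs.value) % N);
    }
}

alias Weekday = Mod!7;

void main()
{
    auto d = Weekday(6) + Weekday(3);
    assert(d.value == 2); // 9 mod 7
}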
Nov 24 2014
parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Monday, 24 November 2014 at 16:45:35 UTC, Ola Fosheim Grøstad
wrote:
 On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
 Digitalmars-d wrote:
 this *is* overflow. D just has the overflow result defined.
So it basically is and isn't modular arithmetic at the same time?
Overflow is part of modular arithmetic. However, there is no signed and unsigned modular arithmetic, or, more precisely, they are the same. Computer words just aren't a good representation of integers. You can either use modular arithmetic, which follows the common arithmetic laws for addition and multiplication (commutativity, associativity, etc., even most non-zero numbers have a multiplicative inverse), but break the common ordering laws (a >= 0 && b >= 0 implies a+b >= 0). Or you can use some other order preserving arithmetic (e.g. saturating to min/max values), but that breaks the arithmetic laws.
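A small D illustration of that trade-off, with the values placed at the edges on purpose:

void main()
{
    int a = int.max;
    int b = 1;              // both are >= 0 ...
    assert(a + b < 0);      // ... yet the sum wraps negative: the ordering law breaks
    uint c = uint.max;
    uint d = 1;
    assert(c + d == 0);     // modular arithmetic stays lawful, but unordered
}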
 I think Ada got this right by providing the ability to specify 
 the modulo value, so you can define:

 type Weekday is mod 7;
 type Byte is mod 256;

 A solid solution is to provide the «As if Infinitely 
 Ranged Integer Model» where the compiler figures out how large 
 integers are needed for computation and then does overflow 
 detection when you truncate for storage:

 http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
You could just as well use a library like GMP.
Nov 24 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup 
wrote:
 Overflow is part of modular arithmetic. However, there is no
 signed and unsigned modular arithmetic, or, more precisely, they
 are the same.
Would you say that a phase that goes from 0…2pi overflows? Do polar coordinates overflow once every turn? I'd say overflow/underflow means that the result is wrong. (Carry is not overflow per se.)
 Or you can use some other order preserving arithmetic (e.g.
 saturating to min/max values), but that breaks the arithmetic
 laws.
I don't think it breaks them, but I think a system language would be better off by having explicit operators for alternative edge-case handling on a bit-fiddling type. E.g.:

a + b    as regular addition
a (+) b  as modulo arithmetic addition
a [+] b  as clamped (saturating) addition

The bad behaviour of C-like languages is the implicit coercion to/from a bit-fiddling type. The bit-fiddling should be contained in expressions where the programmer, by choosing the type, says "I am gonna do tricky bit hacks here". Just casting to uint does not convey that message in a clear manner.
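As a rough sketch of what [+] could do for unsigned addition, building on druntime's core.checkedint (linked elsewhere in this thread; satAdd is an illustrative name):

import core.checkedint : addu;

uint satAdd(uint a, uint b)
{
    bool overflow = false;
    uint sum = addu(a, b, overflow);   // modular add that also reports the wrap
    return overflow ? uint.max : sum;  // clamp to the top on overflow
}

void main()
{
    assert(satAdd(2, 3) == 5);
    assert(satAdd(uint.max, 1) == uint.max);
}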
 A solid solution is to provide the «As if Infinitely 
 Ranged Integer Model» where the compiler figures out how large 
 integers are needed for computation and then does overflow 
 detection when you truncate for storage:

 http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
You could just as well use a library like GMP.
I think the point of having compiler support is to retain most optimizations. The compiler selects the most efficient representation based on the needed headroom and makes sure that overflow is recorded so that you can eventually respond to it. If you couple AIR with constrained integer types, which Pascal and Ada have, then it can be very efficient in many cases.
Nov 24 2014
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad 
wrote:
 I think the point of having compiler support is to retain 
 most optimizations. The compiler selects the most efficient 
 representation based on the needed headroom and makes sure that 
 overflow is recorded so that you can eventually respond to it.
It is also worth noting that Intel CPUs have 3 new instructions for working with large integers: MULX and ADCX/ADOX. http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/ia-large-integer-arithmetic-paper.html So there is no reason to not go for it IMO.
Nov 24 2014
prev sibling parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad
wrote:
 On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup 
 wrote:
 Overflow is part of modular arithmetic. However, there is no
 signed and unsigned modular arithmetic, or, more precisely, 
 they
 are the same.
 Would you say that a phase that goes from 0…2pi overflows? Do polar coordinates overflow once every turn?
No, sin and cos are periodic functions, but that doesn't mean their arguments are modular. sin 4pi is well defined by e.g. the Taylor expansion of sin, without any modular arithmetic at all.
 I'd say overflow/underflow means that the result is wrong. 
 (Carry is not overflow per se).
There is no right or wrong in mathematics, only true and false. The result of modular addition with overflow is not wrong, it is just different from the result of integer addition.
 Or you can use some other order preserving arithmetic (e.g.
 saturating to min/max values), but that breaks the arithmetic
 laws.
I don't think it breaks them, but I think a system language would be better off by having explicit operators for alternative edge-case handling on a bit-fiddling type. E.g.:

a + b    as regular addition
a (+) b  as modulo arithmetic addition
a [+] b  as clamped (saturating) addition

The bad behaviour of C-like languages is the implicit coercion to/from a bit-fiddling type. The bit-fiddling should be contained in expressions where the programmer, by choosing the type, says "I am gonna do tricky bit hacks here". Just casting to uint does not convey that message in a clear manner.
Agreed, though I don't like the explosion of new operators. I'd rather write something like saturate(expression).
 A solid solution solution is to provide «As if Infinitely 
 Ranged Integer Model» where the compiler figures out how 
 large integers are needed for computation and then does 
 overflow detection when you truncate for storage:

 http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
You could just as well use a library like GMP.
I think the point of having compiler support is to retain most optimizations. The compiler selects the most efficient representation based on the needed headroom and makes sure that overflow is recorded so that you can eventually respond to it. If you couple AIR with constrained integer types, which Pascal and Ada have, then it can be very efficient in many cases.
And can fail spectacularly in others. The compiler always has to prepare for the worst case, i.e. the largest integer size possible, while in practice you may need that only for a few extreme cases.
Nov 24 2014
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup 
wrote:
 There is no right or wrong in mathematics, only true and false.
 The result of modular addition with overflow is not wrong, it is
 just different from the result of integer addition.
I think we are talking past each other. In my view the term "overflow" has nothing to do with mathematics; overflow is a signal from the ALU that the computation is incorrect, i.e. not in accordance with the intended type.
 Agreed, though I don't like the explosion of new operators. I'd
 rather write something like saturate(expression).
Yep, that is another way to do it. What is preferable probably varies from case to case.
 And can fail spectacularly in others. The compiler always has to
 prepare for the worst case, i.e. the largest integer size
 possible, while in practice you may need that only for a few
 extreme cases.
In some loops it probably can get tricky to get it right without help from the programmer. I believe some languages allow you to annotate loops with an upper boundary to help the semantic analysis, but you could also add more frequent overflow checks on request?
Nov 24 2014
prev sibling parent "FrankLike" <1150015857 qq.com> writes:
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup 
wrote:
 Agreed, though I don't like the explosion of new operators. I'd
 rather write something like saturate(expression).
You may like this:

------------------- small test 1 --------------------------

import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == uint) && is(T2 == uint))
        {
            if (x < y)
                return cast(int)(x - y);
            else
                return x - y;
        }
        else if (is(T1 == uint) && is(T2 == int))
        {
            writeln("enter here1");
            if (x < y)
            {
                writeln("enter here2");
                return cast(int)(x - y);
            }
            else
            {
                writeln("enter here3");
                return x - y;
            }
        }
        else if (is(T1 == int) && is(T2 == uint))
        {
            if (x < y)
                return cast(int)(x - y);
            else
                return x - y;
        }
        else if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
    }
}

unittest
{
    bool overflow;
    assert(subuint(3, 2, overflow) == 1);
    assert(!overflow);
    assert(subuint(3, 4, overflow) == -1);
    assert(!overflow);
    assert(subuint(uint.max, 1, overflow) == uint.max - 1);
    writeln("typeid = ", typeid(subuint(uint.max, 1, overflow)));
    assert(!overflow);
    assert(subuint(1, 1, overflow) == uint.min);
    assert(!overflow);
    assert(subuint(0, 1, overflow) == -1);
    assert(!overflow);
    assert(subuint(uint.max - 1, uint.max, overflow) == -1);
    assert(!overflow);
    assert(subuint(0, 0, overflow) == 0);
    assert(!overflow);
    assert(subuint(3, -2, overflow) == 5);
    assert(!overflow);
    assert(subuint(uint.max, -1, overflow) == uint.max + 1);
    assert(!overflow);
    assert(subuint(1, -1, overflow) == 2);
    assert(!overflow);
    assert(subuint(0, -1, overflow) == 1);
    assert(!overflow);
    assert(subuint(uint.max - 1, int.max, overflow) == int.max);
    assert(!overflow);
    assert(subuint(0, 0, overflow) == 0);
    assert(!overflow);
    assert(subuint(-2, 1, overflow) == -3);
    assert(!overflow);
}

void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln("----------------");
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)), ", a-b=", subuint(a, b, overflow));
    writeln("ok");
}

--------------- here is a simpler version, but it's an error --------------------------

import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
        else if ((is(T1 == uint) && is(T2 == int))
               | (is(T1 == uint) && is(T2 == uint))
               | (is(T1 == int) && is(T2 == uint)))
        {
            if (x < y)
                return cast(int)(x - y);
            else
                return x - y;
        }
    }
}

void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln("----------------");
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)), ", a-b=", subuint(a, b, overflow));
    writeln("ok");
}
Nov 24 2014
prev sibling parent reply "Don" <x nospam.com> writes:
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
Digitalmars-d wrote:
 On Mon, 24 Nov 2014 12:54:58 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 In D,  1u - 2u > 0u. This is defined behaviour, not an 
 overflow.
 this *is* overflow. D just has the overflow result defined.
No, that is not overflow. That is a carry. Overflow is when the sign bit changes.
Nov 24 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 25 November 2014 at 07:39:44 UTC, Don wrote:
 No, that is not overflow. That is a carry. Overflow is when the 
 sign bit changes.
I think this discussion will be less confusing if we clear up the terminology.

An overflow condition happens when the representation cannot hold the magnitude of the intended type. In floating point that is +Inf and -Inf.

An underflow condition happens when the representation cannot represent the precision of small numbers. In floating point that is +0, -0 and denormal numbers, detected or undetected.

Carry is an extra bit that can be considered part of the computation for a concrete machine code instruction that provides carry, e.g. 32 bits + 32 bits => (32+1) bits.

If the intended type is true reals and the representation is integer, then we get:

0u - 1u => overflow
1u / 2u => underflow

Carry can be taken as an overflow condition, but it is not proper overflow if you interpret it as a part of the result that depends on the machine language instruction and the use of it. For a regular ADD/SUB instruction with carry, the ALU covers two intended types (signed/unsigned) and uses the control register flags in a way which lets the programmer make the interpretation. Some SIMD instructions do not provide control register flags and are therefore true modular arithmetic that does not overflow by definition, but if you use them for representing a non-modular intended type then you get undetected overflow…

Overflow is in relation to an interpretation: the intended type versus the internal representation and the concrete machine language instruction.
Nov 25 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 24 Nov 2014 12:54:58 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.
p.s. sorry, of course this is not an overflow. this is underflow.
Nov 24 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 19 Nov 2014 10:03:34 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No! No! No!  This is completely wrong. Unsigned does not mean
 "positive". It means "no sign", and therefore "wrapping
 semantics".
 E.g. length - 4 > 0, if length is 2.

 Weird consequence: using subtraction with an unsigned type is
 nearly always a bug.
negative length is a bug too. and it doesn't matter what kind of integer is used if you didn't fail your sanity checks.
 I wish D hadn't called unsigned integers 'uint'. They should have
 been called '__uint' or something. They should look ugly. You
 need a very, very good reason to use an unsigned type.

 We have a builtin type that is deadly but seductive.
you just named all of built-in types. ah, and rename all keywords too, they're far too dangerous.
Nov 19 2014
parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/19/14, 10:21 AM, ketmar via Digitalmars-d wrote:
 On Wed, 19 Nov 2014 10:03:34 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No! No! No!  This is completely wrong. Unsigned does not mean
 "positive". It means "no sign", and therefore "wrapping
 semantics".
 E.g. length - 4 > 0, if length is 2.

 Weird consequence: using subtraction with an unsigned type is
 nearly always a bug.
negative length is a bug too.
How is that a bug? Can you provide some code that exhibits this?
Nov 19 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
 How is that a bug? Can you provide some code that exhibits this?
If you compile the dfl library for 64 bit, you will find this error:

core.sys.windows.windows.WaitForMultipleObjects(uint nCount, void** lpHandles, ....) is not callable using argument types (ulong, void**, ...)

The 'WaitForMultipleObjects' function is in dmd2/src/druntime/src/core/sys/windows/windows.d.

Its first argument is a dfl value; it comes from a 'length', whose type is size_t, which is now 'ulong' on 64 bit.

So should druntime stay consistent with Phobos about size_t? Or stay consistent with the Windows API and change the size_t to int?
Nov 19 2014
next sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/19/14, 9:54 PM, FrankLike wrote:
 How is that a bug? Can you provide some code that exhibits this?
If you compile the dfl library for 64 bit, you will find this error: core.sys.windows.windows.WaitForMultipleObjects(uint nCount, void** lpHandles, ....) is not callable using argument types (ulong, void**, ...). The 'WaitForMultipleObjects' function is in dmd2/src/druntime/src/core/sys/windows/windows.d. Its first argument is a dfl value; it comes from a 'length', whose type is size_t, which is now 'ulong' on 64 bit. So should druntime stay consistent with Phobos about size_t? Or stay consistent with the Windows API and change the size_t to int?
Sorry, maybe I wasn't clear. I asked "how a negative length can be a bug". (Because you can't set a negative length, it can't really happen.)
Nov 19 2014
prev sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"FrankLike"  wrote in message news:musbvhmuuhhvetovxqwb forum.dlang.org...

 If you compile the dfl library for 64 bit, you will find this error:

 core.sys.windows.windows.WaitForMultipleObjects(uint
 nCount, void** lpHandles, ....) is not callable using argument
 types (ulong, void**, ...)

 The 'WaitForMultipleObjects' function is in
 dmd2/src/druntime/src/core/sys/windows/windows.d.

 Its first argument is a dfl value; it comes from a 'length',
 whose type is size_t, which is now 'ulong' on 64 bit.

 So should druntime stay consistent with Phobos about size_t?
 Or stay consistent with the Windows API and change the size_t to int?
I suggest using WaitForMultipleObjects(to!uint(xxx.length), ...) as it will both convert and check for overflow IIRC. I'm just happy D gives you an error here instead of silently truncating the value.
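A minimal sketch of that, assuming a 64-bit build where size_t is ulong:

import std.conv : to;

void main()
{
    size_t fits = 5;
    uint n = to!uint(fits);    // in range, converts quietly

    size_t huge = size_t.max;  // out of range for uint on 64 bit:
    // to!uint(huge);          // would throw a ConvOverflowException here
}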
Nov 21 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 2:03 AM, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 On Tue, 18 Nov 2014 15:01:25 +0000,
 "Frank Like" <1150015857 qq.com> wrote:

 but now, 'int' is enough for use, not huge and not small, only enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, it's easier to migrate code if 'length''s return
 value type is 'int'.
 How about your idea?
I get the idea of a broken record right now... Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) can express more than 2^31-1 items, which is appropriate to reflect the increase in usable memory per application on 64-bit platforms. Yes, the 64-bit version of a program or library can handle larger data sets. Just like it was when people transitioned from 16-bit to 32-bit. I won't use `int` just because the technically correct thing is `size_t`, even if it is a little harder to type.
This is difficult. Having arr.length return an unsigned type is a dreadful language mistake.
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetics
 decay to unsigned anyways and you don't need checks like
 `assert(idx >= 0)`. It is a matter of taste though and others
 prefer languages with no unsigned types at all.
No! No! No! This is completely wrong. Unsigned does not mean "positive". It means "no sign", and therefore "wrapping semantics". E.g. length - 4 > 0, if length is 2.

Weird consequence: using subtraction with an unsigned type is nearly always a bug.

I wish D hadn't called unsigned integers 'uint'. They should have been called '__uint' or something. They should look ugly. You need a very, very good reason to use an unsigned type.

We have a builtin type that is deadly but seductive.
I agree this applies to C and C++. Not quite to D. -- Andrei
Nov 19 2014
parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/19/14, 1:46 PM, Andrei Alexandrescu wrote:
 On 11/19/14 2:03 AM, Don wrote:
 We have a builtin type that is deadly but seductive.
I agree this applies to C and C++. Not quite to D. -- Andrei
See my response to Don. Don't you think that's counter-intuitive?
Nov 19 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 10:13 AM, Ary Borenszweig wrote:
 On 11/19/14, 1:46 PM, Andrei Alexandrescu wrote:
 On 11/19/14 2:03 AM, Don wrote:
 We have a builtin type that is deadly but seductive.
I agree this applies to C and C++. Not quite to D. -- Andrei
See my response to Don. Don't you think that's counter-intuitive?
No. -- Andrei
Nov 19 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 No. -- Andrei
Yet, experience in D has shown very well that having unsigned lengths is the wrong design choice. It's a D wart. Bye, bearophile
Nov 19 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 12:38 PM, bearophile wrote:
 Andrei Alexandrescu:

 No. -- Andrei
Yet, experience in D has shown very well that having unsigned lengths is the wrong design choice. It's a D wart.
Care to back that up? -- Andrei
Nov 19 2014
parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 20 November 2014 at 00:07:23 UTC, Andrei 
Alexandrescu wrote:
 On 11/19/14 12:38 PM, bearophile wrote:
 Andrei Alexandrescu:

 No. -- Andrei
Yet, experience in D has shown very well that having unsigned lengths is the wrong design choice. It's a D wart.
Care to back that up? -- Andrei
Don mentioned troubles with that at DConf.
Nov 19 2014
prev sibling parent reply "flamencofantasy" <flamencofantasy gmail.com> writes:
On Wednesday, 19 November 2014 at 20:38:15 UTC, bearophile wrote:
 Andrei Alexandrescu:

 No. -- Andrei
Yet, experience in D has shown very well that having unsigned lengths is the wrong design choice. It's a D wart. Bye, bearophile
Yet, MY experience in D has shown very well that having unsigned lengths is the RIGHT design choice. It's NOT a D wart.
Nov 19 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Nov 20, 2014 at 01:00:00AM +0000, flamencofantasy via Digitalmars-d
wrote:
 On Wednesday, 19 November 2014 at 20:38:15 UTC, bearophile wrote:
[...]
Yet, experience in D has shown very well that having unsigned lengths
is the wrong design choice. It's a D wart.

Bye,
bearophile
Yet, MY experience in D has shown very well that having unsigned lengths is the RIGHT design choice. It's NOT a D wart.
[...]

I concur. However, the fact that you can freely mix signed and unsigned types in unsafe ways without any warning is a fly that spoils the soup. If this kind of unsafe mixing wasn't allowed, or required explicit casts (to signify "yes I know what I'm doing and I'm prepared to face the consequences"), I suspect that bearophile would be much happier about this issue. ;-)

T

-- 
MAS = Mana Ada Sistem?
Nov 19 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote:
 If this kind of unsafe mixing wasn't allowed, or required explicit casts
 (to signify "yes I know what I'm doing and I'm prepared to face the
 consequences"), I suspect that bearophile would be much happier about
 this issue. ;-)
Explicit casts are worse than the problem - they can easily cause bugs. As for me personally, I like having a complete set of signed and unsigned integral types at my disposal. It's like having a full set of wrenches that are open end on one end and boxed on the other :-) Most of the time either end will work, but sometimes only one will. Now, if D were a non-systems language like Basic, Go or Java, unsigned types could be reasonably dispensed with. But D is a systems programming language, and it ought to have available types that match what the hardware supports.
Nov 20 2014
next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/20/14, 5:02 AM, Walter Bright wrote:
 On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote:
 If this kind of unsafe mixing wasn't allowed, or required explicit casts
 (to signify "yes I know what I'm doing and I'm prepared to face the
 consequences"), I suspect that bearophile would be much happier about
 this issue. ;-)
Explicit casts are worse than the problem - they can easily cause bugs. As for me personally, I like having a complete set of signed and unsigned integral types at my disposal. It's like having a full set of wrenches that are open end on one end and boxed on the other :-) Most of the time either end will work, but sometimes only one will. Now, if D were a non-systems language like Basic, Go or Java, unsigned types could be reasonably dispensed with. But D is a systems programming language, and it ought to have available types that match what the hardware supports.
Nobody is saying to remove unsigned types from the language. They have their uses. It's just that using them for an array's length leads to subtle bugs. That's all.
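For example, this classic loop sketch compiles cleanly but blows up on an empty array, because the unsigned subtraction wraps:

import std.stdio;

void main()
{
    int[] arr;                              // empty
    // arr.length - 1 wraps to size_t.max instead of -1, so the loop
    // runs and the out-of-bounds access throws a RangeError at runtime
    for (size_t i = 0; i < arr.length - 1; ++i)
        writeln(arr[i], " ", arr[i + 1]);
}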
Nov 20 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Nov 20, 2014 at 11:22:00AM -0300, Ary Borenszweig via Digitalmars-d
wrote:
 On 11/20/14, 5:02 AM, Walter Bright wrote:
[...]
As for me personally, I like having a complete set of signed and
unsigned integral types at my disposal. It's like having a full set
of wrenches that are open end on one end and boxed on the other :-)
Most of the time either end will work, but sometimes only one will.

Now, if D were a non-systems language like Basic, Go or Java,
unsigned types could be reasonably dispensed with. But D is a systems
programming language, and it ought to have available types that match
what the hardware supports.
Nobody is saying to remove unsigned types from the language. They have their uses. It's just that using them for an array's length leads to subtle bugs. That's all.
Using unsigned types for array length wouldn't necessarily lead to subtle bugs if the language were stricter about mixing signed and unsigned values.

T

-- 
Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
Nov 20 2014
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 20 November 2014 at 15:55:21 UTC, H. S. Teoh via
Digitalmars-d wrote:
 Using unsigned types for array length wouldn't necessarily lead
 to subtle bugs if the language were stricter about mixing signed
 and unsigned values.
Yes, I think that this is the real issue.
Nov 20 2014
parent reply "Wyatt" <wyatt.epp gmail.com> writes:
On Thursday, 20 November 2014 at 20:17:15 UTC, deadalnix wrote:
 On Thursday, 20 November 2014 at 15:55:21 UTC, H. S. Teoh via
 Digitalmars-d wrote:
 Using unsigned types for array length wouldn't necessarily lead
 to subtle bugs if the language were stricter about mixing signed
 and unsigned values.
Yes, I think that this is the real issue.
Thirded. Array lengths are always non-negative integers. This is axiomatic. But the subtraction thing keeps coming up in this thread; what to do?

There's probably something fundamentally wrong with this and I'll probably be called an idiot by both "sides", but my gut feeling is that if expressions with subtraction simply returned a signed type by default, much of the problem would disappear. It doesn't catch everything, and stuff like:

uint x = 2;
uint y = 4;
uint z = x - y;

...is still going to overflow, but maybe you know what you're doing? More importantly, changing it to

auto z = x - y;

actually works as expected for the majority of cases.

(I'm actually on the fence re: pass/warn/error on mixing, but I _will_ note C's promotion rules have bitten me in the ass a few times and I have no particular love for them.)

-Wyatt

PS: I can't even believe how this thread has blown up, considering how it started.
Nov 21 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
On Fri, 21 Nov 2014 16:32:20 +0000,
"Wyatt" <wyatt.epp gmail.com> wrote:

 Array lengths are always non-negative integers. This is
 axiomatic. But the subtraction thing keeps coming up in this
 thread; what to do?

 There's probably something fundamentally wrong with this and I'll
 probably be called an idiot by both "sides", but my gut feeling
 is that if expressions with subtraction simply returned a signed
 type by default, much of the problem would disappear. [...]

As I said above, I always order my unsigned variables by magnitude, and uint.max - uint.min should result in uint.max and not -1. In code dealing with lengths or offsets there is typically some "base" that is less than the "position", or an "index" that is less than the "length". The expression `base - position` is just wrong. If it is in fact below "base", then you will end up with an if-else later on, guaranteed. So why not place it up front:

if (position >= base) {
    auto offset = position - base;
} else {
    …
}

 [...]

 -Wyatt

 PS: I can't even believe how this thread has blown up,
 considering how it started.

Exactly my thought, but suddenly I couldn't stop myself from posting.

-- 
Marco
Nov 21 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2014 6:22 AM, Ary Borenszweig wrote:
 Nobody is saying to remove unsigned types from the language. They have their
 uses. It's just that using them for an array's length leads to subtle bugs.
 That's all.
If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs, plus you'll break the intuition about these things from everyone who has used C/C++ a lot.
Nov 20 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 If that is changed to a signed type, then you'll have a 
 same-only-different set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation. Bye, bearophile
Nov 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2014 3:25 PM, bearophile wrote:
 Walter Bright:

 If that is changed to a signed type, then you'll have a same-only-different
 set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation.
All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
Nov 20 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2014 7:11 PM, Walter Bright wrote:
 On 11/20/2014 3:25 PM, bearophile wrote:
 Walter Bright:

 If that is changed to a signed type, then you'll have a same-only-different
 set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation.
All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested & robust code. 0 crossing bugs tend to show up much sooner, and often immediately.
Nov 20 2014
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less 
 often, but paradoxically this can make the bug worse, because 
 then it only gets found much, much later in supposedly tested & 
 robust code.

 0 crossing bugs tend to show up much sooner, and often 
 immediately.
Yes, I have to say the current design has some issues, but the alternative seems worse.
Nov 20 2014
prev sibling next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Walter Bright"  wrote in message news:m4mggi$e1h$1 digitalmars.com...

 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but 
 paradoxically this can make the bug worse, because then it only gets found
 much, much later in supposedly tested & robust code.

 0 crossing bugs tend to show up much sooner, and often immediately.
I don't think I have ever written a D program where an array had more than 2^^31 elements. And I'm sure I've never had a case where 2^^31-1 wasn't enough and yet 2^^32-1 was. Zero, on the other hand, is usually quite near the typical array lengths and differences in lengths.
Nov 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 12:10 AM, Daniel Murphy wrote:
 "Walter Bright"  wrote in message news:m4mggi$e1h$1 digitalmars.com...

 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
 paradoxically this can make the bug worse, because then it only gets found
 much, much later in supposedly tested & robust code.

 0 crossing bugs tend to show up much sooner, and often immediately.
I don't think I have ever written a D program where an array had more than 2^^31 elements. And I'm sure I've never had a case where 2^^31-1 wasn't enough and yet 2^^32-1 was.
There turned out to be such a bug in one of the examples in "Programming Pearls" that remained undetected for many years: http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
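The bug in that article is the classic binary-search midpoint. A quick D sketch of the failure and the usual fix (the variable names and values are illustrative):

void main()
{
    int low = 1_500_000_000, high = 2_000_000_000;
    int bad  = (low + high) / 2;        // low + high wraps past int.max: wrong midpoint
    int good = low + (high - low) / 2;  // stays in range whenever 0 <= low <= high
    assert(good == 1_750_000_000 && bad != good);
}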
 Zero, on the other hand, is usually quite near the typical array lengths and
 differences in lengths.
That's true, that's why they are detected sooner, when it is less costly to fix them.
Nov 21 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Walter Bright"  wrote in message news:m4mu0q$sc5$1 digitalmars.com...

 Zero, on the other hand, is usually quite near the typical array lengths 
 and
 differences in lengths.
That's true, that's why they are detected sooner, when it is less costly to fix them.
It would be even less costly if they weren't possible.
Nov 21 2014
parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Friday, 21 November 2014 at 08:54:40 UTC, Daniel Murphy wrote:
 "Walter Bright"  wrote in message 
 news:m4mu0q$sc5$1 digitalmars.com...

 Zero, on the other hand, is usually quite near the typical 
 array lengths and
 differences in lengths.
That's true, that's why they are detected sooner, when it is less costly to fix them.
It would be even less costly if they weren't possible.
C# has checked and unchecked arithmetic contexts (http://msdn.microsoft.com/en-us/library/khy08726.aspx), which allow the programmer to specify whether overflows should wrap or fail within an arithmetic expression. That could be a useful addition to D. However, a language that doesn't have unsigned integers and modular arithmetic is IMHO not a system language, because that is how most hardware works internally.
Nov 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 1:01 AM, Matthias Bentrup wrote:

 C# has checked and unchecked arithmetic contexts
 (http://msdn.microsoft.com/en-us/library/khy08726.aspx), which allow the
 programmer to specify whether overflows should wrap or fail within an arithmetic
 expression. That could be a useful addition to D.
D already has them: https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d
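A short sketch of what that module provides (adds and subu are actual core.checkedint functions):

import core.checkedint : adds, subu;

void main()
{
    bool overflow = false;
    int a = adds(int.max, 1, overflow); // wraps to int.min ...
    assert(overflow && a == int.min);   // ... and the wrap is flagged

    overflow = false;
    uint b = subu(2u, 3u, overflow);    // borrows past zero ...
    assert(overflow && b == uint.max);  // ... and that is flagged too
}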
Nov 21 2014
parent "FrankLike" <1150015857 qq.com> writes:
 D already has them:

 https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d
Druntime's checkedint.d should be modified:

uint subu(uint x, uint y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}

ulong subu(ulong x, ulong y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}

Frank
Nov 21 2014
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less 
 often, but paradoxically this can make the bug worse, because 
 then it only gets found much, much later in supposedly tested & 
 robust code.

 0 crossing bugs tend to show up much sooner, and often 
 immediately.
Wrong. Unsigned integers can hold bigger values, so it takes more to make them overflow, hence the bug is harder to detect.
 http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
 Specifically, it fails if the sum of low and high is greater 
 than the maximum positive int value
So it fails sooner for signed integers than for unsigned integers.
Nov 21 2014
prev sibling next sibling parent reply "Don" <x nospam.com> writes:
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:
 On 11/20/2014 7:11 PM, Walter Bright wrote:
 On 11/20/2014 3:25 PM, bearophile wrote:
 Walter Bright:

 If that is changed to a signed type, then you'll have a 
 same-only-different
 set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation.
All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested & robust code. 0 crossing bugs tend to show up much sooner, and often immediately.
You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type.

Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not.

How would natint behave, in the type system?

typeof (natint - natint) == int     NOT natint !!!

This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1.

But typeof (uint - uint) == uint. The bit pattern is identical to the other case. But the type is wrong.

It is for this reason that uint is not appropriate as a model for positive integers. Having warnings about mixing int and uint operations in relational operators is a bit misleading, because mixing signed and unsigned is not usually the real problem. Instead, those warnings are a symptom of a type system mistake.

You are quite right in saying that with a signed length, overflows can still occur. But those are in principle detectable. The compiler could add runtime overflow checks for them, for example. But the situation for unsigned is not fixable, because it is a problem with the type system.

By making .length unsigned, we are telling people that if .length is used in a subtraction expression, the type will be wrong.

It is the incorrect use of the type system that is the underlying problem.
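To make the type mistake concrete, a small sketch of the earlier length - 4 example:

void main()
{
    size_t length = 2;
    auto x = length - 4;                    // typeof(x) is size_t: no sign, so it wraps
    static assert(is(typeof(x) == size_t));
    assert(x > 0);                          // "2 - 4 > 0" passes, which is the bug
}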
Nov 21 2014
next sibling parent "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Friday, 21 November 2014 at 15:36:02 UTC, Don wrote:
 On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright 
 wrote:
 On 11/20/2014 7:11 PM, Walter Bright wrote:
 On 11/20/2014 3:25 PM, bearophile wrote:
 Walter Bright:

 If that is changed to a signed type, then you'll have a 
 same-only-different
 set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation.
All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested & robust code. 0 crossing bugs tend to show up much sooner, and often immediately.
You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1.
So if i is a natint, would the expression i-- change the type of variable i on the fly to int?
Nov 21 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Nov 21, 2014 at 03:36:01PM +0000, Don via Digitalmars-d wrote:
[...]
 Suppose  D had a type 'natint', which could hold natural numbers in
 the range 0..uint.max.  Sounds like 'uint', right? People make the
 mistake of thinking that is what uint is. But it is not.
 
 How would natint behave, in the type system?
 
 typeof (natint - natint)  ==  int     NOT natint  !!!
Wrong. (uint.max - 0) == uint.max, which is of type uint. If you interpret it as int, you get a negative number, which is wrong. So your proposal breaks uint in even worse ways, in that now subtracting a smaller number from a larger number may overflow, whereas it wouldn't before. So that fixes nothing, you're just shifting the problem somewhere else. T -- Too many people have open minds but closed eyes.
Nov 21 2014
parent reply "Don" <x nospam.com> writes:
On Friday, 21 November 2014 at 15:50:05 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Fri, Nov 21, 2014 at 03:36:01PM +0000, Don via Digitalmars-d 
 wrote:
 [...]
 Suppose  D had a type 'natint', which could hold natural 
 numbers in
 the range 0..uint.max.  Sounds like 'uint', right? People make 
 the
 mistake of thinking that is what uint is. But it is not.
 
 How would natint behave, in the type system?
 
 typeof (natint - natint)  ==  int     NOT natint  !!!
Wrong. (uint.max - 0) == uint.max, which is of type uint.
It is not uint.max. It is natint.max. And yes, that's an overflow condition. Exactly the same as when you do int.max + int.max.
 If you
 interpret it as int, you get a negative number, which is wrong. 
 So your
 proposal breaks uint in even worse ways, in that now 
 subtracting a
 smaller number from a larger number may overflow, whereas it 
 wouldn't
 before. So that fixes nothing, you're just shifting the problem
 somewhere else.


 T
This is not a proposal!!!! I am just illustrating the difference between what people *think* uint does vs what it actually does.

The type that I think would be useful would be a number in the range 0..int.max. It has no risk of underflow.

To put it another way: natural numbers are a subset of mathematical integers (the range 0..infinity). Signed types are a subset of mathematical integers (the range -int.max..int.max). Unsigned types are not a subset of mathematical integers. They do not just have a restricted range. They have different semantics.

The question of what happens when a range is exceeded is a different question.
Nov 21 2014
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 21 November 2014 at 16:12:19 UTC, Don wrote:
 On Friday, 21 November 2014 at 15:50:05 UTC, H. S. Teoh via 
 Digitalmars-d wrote:
 On Fri, Nov 21, 2014 at 03:36:01PM +0000, Don via 
 Digitalmars-d wrote:
 [...]
 Suppose  D had a type 'natint', which could hold natural 
 numbers in
 the range 0..uint.max.  Sounds like 'uint', right? People 
 make the
 mistake of thinking that is what uint is. But it is not.
 
 How would natint behave, in the type system?
 
 typeof (natint - natint)  ==  int     NOT natint  !!!
Wrong. (uint.max - 0) == uint.max, which is of type uint.
It is not uint.max. It is natint.max. And yes, that's an overflow condition. Exactly the same as when you do int.max + int.max.
 If you
 interpret it as int, you get a negative number, which is 
 wrong. So your
 proposal breaks uint in even worse ways, in that now 
 subtracting a
 smaller number from a larger number may overflow, whereas it 
 wouldn't
 before. So that fixes nothing, you're just shifting the problem
 somewhere else.


 T
This is not a proposal!!!! I am just illustrating the difference between what people *think* uint does vs what it actually does. The type that I think would be useful would be a number in the range 0..int.max. It has no risk of underflow. To put it another way: natural numbers are a subset of mathematical integers (the range 0..infinity). Signed types are a subset of mathematical integers (the range -int.max..int.max). Unsigned types are not a subset of mathematical integers. They do not just have a restricted range. They have different semantics.
I was under the impression that in D:

uint = { x mod 2^32 | x ∈ Z_0 }
int  = { x - 2^31 | x ∈ uint }

which matches the hardware.
Nov 21 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 21 November 2014 at 16:12:19 UTC, Don wrote:
 It is not uint.max. It is natint.max. And yes, that's an 
 overflow condition.

 Exactly the same as when you do int.max + int.max.
This depends on how you look at it. From a formal perspective, assume zero as the base, then a predecessor function P and a successor function S. Then you have:

0u - 1u + 2u ==> SSP0

Then you do a normalization where you cancel out successor and predecessor pairs, and you get the result S0 ==> 1u. On the other hand, if you end up with P0 the result should be bottom (error).

In binary representation you need to collect the carry over N terms, so you need an extra accumulator, which you can get by extending the precision by ~ log2(N) bits. Then do a masking of the most significant bits to check for over/underflow. Advanced for a compiler, but possible.
 The type that I think would be useful, would be a number in the 
 range 0..int.max.
 It has no risk of underflow.
Yep, from a correctness perspective length should be an integer with a >= 0 constraint. Ada also acknowledges this by making unsigned integers 31 bits, like you suggest. And now that most CPUs go 64 bit, a 63-bit integer would be the right choice for array length.
 unsigned types are not a subset of mathematical integers.

 They do not just have a restricted range. They have different 
 semantics.


 The question of what happens when a range is exceeded, is a 
 different question.
There is really no difference between signed and unsigned in principle, since you only have an offset, but in practical programming 64-bit signed and 63-bit unsigned are enough for most situations, with the advantage that you have the same bit representation with only one interpretation.

What the semantics are depends on how you define the operators, right? So you can have both modular arithmetic and non-modular in the same type by providing more operators. This is, after all, how the hardware does it. Contrary to what is claimed by others in this thread, the general hardware ALU does not default to modular arithmetic; it preserves resolution:

32bit + 32bit ==> 33bit result
32bit * 32bit ==> 64bit result

Modular arithmetic is an artifact of the language, not the hardware.
Nov 23 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 7:36 AM, Don wrote:
 On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:
 0 crossing bugs tend to show up much sooner, and often immediately.
You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1. But typeof (uint - uint) == uint. The bit pattern is identical to the other case. But the type is wrong. It is for this reason that uint is not appropriate as a model for positive integers. Having warnings about mixing int and uint operations in relational operators is a bit misleading, because mixing signed and unsigned is not usually the real problem. Instead, those warnings are a symptom of a type system mistake. You are quite right in saying that with a signed length, overflows can still occur. But those are in principle detectable. The compiler could add runtime overflow checks for them, for example. But the situation for unsigned is not fixable, because it is a problem with the type system. By making .length unsigned, we are telling people that if .length is used in a subtraction expression, the type will be wrong. It is the incorrect use of the type system that is the underlying problem.
I believe I do understand the problem. As a practical matter, overflow checks are not going to be added for performance reasons. Also, in principle, uint-uint can generate a runtime check for underflow (i.e. the carry flag).
Nov 21 2014
parent reply "Don" <x nospam.com> writes:
On Friday, 21 November 2014 at 20:17:12 UTC, Walter Bright wrote:
 On 11/21/2014 7:36 AM, Don wrote:
 On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright 
 wrote:
 0 crossing bugs tend to show up much sooner, and often 
 immediately.
You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1. But typeof (uint - uint) == uint. The bit pattern is identical to the other case. But the type is wrong. It is for this reason that uint is not appropriate as a model for positive integers. Having warnings about mixing int and uint operations in relational operators is a bit misleading, because mixing signed and unsigned is not usually the real problem. Instead, those warnings are a symptom of a type system mistake. You are quite right in saying that with a signed length, overflows can still occur. But those are in principle detectable. The compiler could add runtime overflow checks for them, for example. But the situation for unsigned is not fixable, because it is a problem with the type system. By making .length unsigned, we are telling people that if .length is used in a subtraction expression, the type will be wrong. It is the incorrect use of the type system that is the underlying problem.
I believe I do understand the problem. As a practical matter, overflow checks are not going to be added for performance reasons.
The performance overhead would be practically zero. All we would need to do is restrict array slices such that the length cannot exceed ssize_t.max. This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation. Making this restriction would have been unreasonable in the 8 and 16 bit days, but D doesn't support those. For 32 bits, this is an extreme corner case. For 64 bit, this condition never happens at all. In exchange, 99% of uses of unsigned would disappear from D code, and with it, a whole category of bugs.
 Also, in principle, uint-uint can generate a runtime check for 
 underflow (i.e. the carry flag).
No it cannot. The compiler does not have enough information to know if the value is intended to be positive integer, or an unsigned. That information is lost from the type system. Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour. To make this clear: I am not proposing that size_t should be changed. I am proposing that for .length returns a signed type, that for array slices is guaranteed to never be negative.
Nov 24 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/24/14 2:20 AM, Don wrote:
 I am proposing that .length return a signed type which, for array
 slices, is guaranteed never to be negative.
Assuming you do make the case this change is an improvement, do you believe it's worth the breakage it would create? -- Andrei
Nov 24 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/24/2014 2:20 AM, Don wrote:
 I believe I do understand the problem. As a practical matter, overflow checks
 are not going to be added for performance reasons.
The performance overhead would be practically zero. All we would need to do, is restrict array slices such that the length cannot exceed ssize_t.max. This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation.
(length1 + length2) / 2
 In exchange, 99% of uses of unsigned would disappear from D code,
 and with it, a whole category of bugs.
You're not proposing changing size_t, so I believe this statement is incorrect.
 Also, in principle, uint-uint can generate a runtime check for underflow (i.e.
 the carry flag).
No it cannot. The compiler does not have enough information to know if the value is intended to be positive integer, or an unsigned. That information is lost from the type system. Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour.
I know it's not an error. It can be defined to be an error, and the compiler can insert a runtime check. (I'm not proposing this, just saying it can be done.)
 To make this clear: I am not proposing that size_t should be changed.
 I am proposing that .length return a signed type which, for array slices,
 is guaranteed never to be negative.
There'll be mass confusion if .length is not the same type as .sizeof.
Nov 24 2014
next sibling parent "Don" <x nospam.com> writes:
On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote:
 On 11/24/2014 2:20 AM, Don wrote:
 I believe I do understand the problem. As a practical matter, 
 overflow checks
 are not going to be added for performance reasons.
The performance overhead would be practically zero. All we would need to do, is restrict array slices such that the length cannot exceed ssize_t.max. This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation.
(length1 + length2) / 2
That's not an issue with length, that's an issue with doing a calculation with an insufficient bit width. Unsigned doesn't actually help, it's still wrong. For unsigned values, if length1 = length2 = 0x8000_0000, that gives an answer of 0.
 In exchange, 99% of uses of unsigned would disappear from D 
 code, and with it, a
 whole category of bugs.
You're not proposing changing size_t, so I believe this statement is incorrect.
From the D code that I've seen, almost all uses of size_t come directly from the use of .length. But I concede (see below) that many of them come from .sizeof.
 Also, in principle, uint-uint can generate a runtime check 
 for underflow (i.e.
 the carry flag).
No it cannot. The compiler does not have enough information to know if the value is intended to be positive integer, or an unsigned. That information is lost from the type system. Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour.
I know it's not an error. It can be defined to be an error, and the compiler can insert a runtime check. (I'm not proposing this, just saying it can be done.)
But it can't do that without turning unsigned into a different type. You'd be turning unsigned into a 'non-negative', which is a completely different type. This is my whole point.

unsigned has no sign; you just get the raw bit pattern with no interpretation. This can mean several things, for example:

1. extended_non_negative is where you are using it for the positive range 0 .. +0xFFFF_FFFF. Then, overflow and underflow are errors.

2. a value where the highest bit is always 0. This can be safely used as int or uint.

3. Or, it can be modulo 2^^32 arithmetic, where wrapping is intended.

4. It can be part of extended precision arithmetic, where you want the carry flag.

5. It can be just a raw bit pattern.

6. The high bit can be a sign bit. This is a signed type, cast to uint. If the sign bit ever flips because of a carry, that's an error.

The type system doesn't specify a meaning for the bit pattern. We've got a special type for case 6, but not for the others.

The problem with unsigned is that it can mean so many things; it's as if it were a union of these possibilities. So it's not strictly typed -- you need to be careful, requiring some element of faith-based programming.

And "signed-unsigned mismatch" is really where you are implicitly assuming that the unsigned value is case 2 or 6. But, if it is one of the other cases, you get nonsense. Those "signed unsigned mismatch" errors only catch some of the possible cases where you may forget which interpretation you are using, and act as if it were another one.
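One bit pattern, several of those readings; a small sketch of my own:

void main()
{
    uint raw = 0xFFFF_FFFF;        // case 5: just a raw bit pattern

    // case 3: modulo 2^^32 arithmetic -- the wrap is intended
    assert(raw + 1 == 0);

    // case 6: the very same bits read as a signed value
    assert(cast(int) raw == -1);

    // case 1: read as extended non-negative it is 4_294_967_295,
    // and the wrap above would have counted as an overflow error
}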
 To make this clear: I am not proposing that size_t should be changed.
 I am proposing that .length return a signed type, which for array
 slices is guaranteed never to be negative.
There'll be mass confusion if .length is not the same type as .sizeof
Ah, that is a good point. .sizeof is another source of unsigned, and again quite unnecessarily: can a single type ever actually use up half of the memory space? (It was possible in the 8- and 16-bit days, but it's hard to imagine today.) Even sillier, it is nearly always known at compile time! But still, .sizeof is low-level in a way that .length is not.
Nov 25 2014
prev sibling parent "Kagamin" <spam here.lot> writes:
On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote:
 In exchange, 99% of uses of unsigned would disappear from D 
 code, and with it, a
 whole category of bugs.
You're not proposing changing size_t, so I believe this statement is incorrect.
The idea is to make unsigned types opt-in, a deliberate choice of individual programmers, not forced by the language. Positive signed integers convert to unsigned integers perfectly without losing information, so mixing types will work perfectly for those who request it.
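For instance (sketch mine), that conversion direction already works implicitly today:

void main()
{
    int chosen = 42;     // a deliberately signed, non-negative value
    ulong u = chosen;    // implicit and lossless for non-negative values
    assert(u == 42);
}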
Nov 25 2014
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 20 Nov 2014 20:53:31 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 11/20/2014 7:11 PM, Walter Bright wrote:
 On 11/20/2014 3:25 PM, bearophile wrote:
 Walter Bright:

 If that is changed to a signed type, then you'll have a same-only-different
 set of subtle bugs,
This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation.
All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested & robust code. 0 crossing bugs tend to show up much sooner, and often immediately.
+1000. This is also the reason we have a special float .init in D. There is no plethora of bugs to show, because they are under the radar. Signed types are only more convenient in the scripting language sense, like using double for everything and array indexing in JavaScript. -- Marco
Nov 21 2014
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF crossing 
 issues, and pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less 
 often, but paradoxically this can make the bug worse, because 
 then it only gets found much, much later in supposedly tested & 
 robust code.
Is this true? Do you have some examples of buggy code? Bye, bearophile
Nov 21 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 12:10 AM, bearophile wrote:
 Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and
 pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFFFFFF crossing bugs is making the bug count less.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
 paradoxically this can make the bug worse, because then it only gets found
 much, much later in supposedly tested & robust code.
Is this true? Do you have some examples of buggy code?
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
Nov 21 2014
next sibling parent "Frank Like" <1150015857 qq.com> writes:
When taking the difference of unsigned values, the size comparison
should be done first, on the right-hand side of the assignment,
such as:

  l3 = (l1 > l2) ? (l1 - l2) : (l2 - l1);

If this work is done in druntime, such small bugs will become rare. D
will be a real systems language.

Frank
Nov 21 2014
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/21/14, 5:45 AM, Walter Bright wrote:
 On 11/21/2014 12:10 AM, bearophile wrote:
 Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF crossing
 issues, and
 pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFFFFFF crossing bugs is making the bug count less.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
 paradoxically this can make the bug worse, because then it only gets
 found
 much, much later in supposedly tested & robust code.
Is this true? Do you have some examples of buggy code?
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
"This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements)" How often does that happen in practice?
Nov 21 2014
next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 11:17:06 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "This bug can manifest itself for arrays whose length (in elements) is=20
 2^30 or greater (roughly a billion elements)"
=20
 How often does that happen in practice?
once in almost ten years is too often, as for me. i think that the answer must be "never". either no bug, or the code is broken. and some of the worst code is code that "works most of the time", but is still broken.
Nov 21 2014
parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/21/14, 11:47 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 11:17:06 -0300
 Ary Borenszweig via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "This bug can manifest itself for arrays whose length (in elements) is
 2^30 or greater (roughly a billion elements)"

 How often does that happen in practice?
once in almost ten years is too often, as for me. i think that the answer must be "never". either no bug, or the code is broken. and one of the worst code is the code that "works most of the time", but still broken.
You see, if you don't use a BigNum for everything then you will always have hidden bugs, be it with int, uint or whatever. The thing is that with int, bugs are much less frequent than with uint. So I don't know why you'd rather have uint than int...
Nov 21 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 14:38:26 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 You see, if you don't use a BigNum for everything then you will always
 have hidden bugs, be it with int, uint or whatever.
why do you believe that i'm not aware of overflows and am not checking for that? i'm used to thinking about overflows and doing overflow checking in production code since my Z80 days. and i don't believe that "infrequent bug" is better than "frequent bug". both are equally bad.
Nov 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 10:05 AM, ketmar via Digitalmars-d wrote:
 why do you believe that i'm not aware of overflows and don't checking
 for that? i'm used to think about overflows and do overflow checking in
 production code since my Z80 days. and i don't believe that "infrequent
 bug" is better than "frequent bug". both are equally bad.
Having coded with 16 bit computers for decades, one gets used to thinking about and dealing with overflows :-)
Nov 21 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Nov 21, 2014 at 10:57:44AM -0800, Walter Bright via Digitalmars-d wrote:
 On 11/21/2014 10:05 AM, ketmar via Digitalmars-d wrote:
why do you believe that i'm not aware of overflows and don't checking
for that? i'm used to think about overflows and do overflow checking
in production code since my Z80 days. and i don't believe that
"infrequent bug" is better than "frequent bug". both are equally bad.
Having coded with 16 bit computers for decades, one gets used to thinking about and dealing with overflows :-)
I used to write 8-bit assembly code on the 6502 (yeah I'm so dating myself), where overflows and underflows happen ALL THE TIME. :-) In fact, they happen so much, that I learned to embrace modulo arithmetic instead of fearing it. I would take advantage of value wrapping to shave off a few cycles here and a few cycles there -- they do add up, given that the CPU only ran at a meager 1MHz, so every little bit counts.

Then in the 16-bit days, I wrote a sliding-window buffer using a 64kB buffer where the wraparound of the 16-bit index variable was a feature rather than a bug. :-) Basically, once it reaches within a certain distance from the end of the window, the next 32kB block of data was paged in from disk into the other half of the buffer, so as long as the operation didn't span more than 32kB each time, you can just increment the 16-bit index without needing to check for the end of the buffer -- it'd automatically wrap around to the beginning where the new 32kB block has been loaded once you go past the end. A truly circular buffer! :-P

T

-- 
That's not a bug; that's a feature!
Nov 21 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 21 Nov 2014 11:41:55 -0800
schrieb "H. S. Teoh via Digitalmars-d"
<digitalmars-d puremagic.com>:

 On Fri, Nov 21, 2014 at 10:57:44AM -0800, Walter Bright via Digitalmars-d
wrote:
 On 11/21/2014 10:05 AM, ketmar via Digitalmars-d wrote:
why do you believe that i'm not aware of overflows and don't checking
for that? i'm used to think about overflows and do overflow checking
in production code since my Z80 days. and i don't believe that
"infrequent bug" is better than "frequent bug". both are equally bad.
Having coded with 16 bit computers for decades, one gets used to thinking about and dealing with overflows :-)
 I used to write 8-bit assembly code on the 6502 (yeah I'm so dating
 myself), where overflows and underflows happen ALL THE TIME. :-) In
 fact, they happen so much, that I learned to embrace modulo arithmetic
 instead of fearing it. I would take advantage of value wrapping to
 shave off a few cycles here and a few cycles there -- they do add up,
 given that the CPU only ran at a meager 1MHz, so every little bit
 counts.

 Then in the 16-bit days, I wrote a sliding-window buffer using a 64kB
 buffer where the wraparound of the 16-bit index variable was a feature
 rather than a bug. :-) Basically, once it reaches within a certain
 distance from the end of the window, the next 32kB block of data was
 paged in from disk into the other half of the buffer, so as long as
 the operation didn't span more than 32kB each time, you can just
 increment the 16-bit index without needing to check for the end of the
 buffer -- it'd automatically wrap around to the beginning where the
 new 32kB block has been loaded once you go past the end. A truly
 circular buffer! :-P

 T
I used to be a kid playing a GameBoy game called Mystic Quest (which some in Asia may know as part of the Final Fantasy franchise). One day I got really greedy and spent days collecting gold. Vast amounts of gold. I learned which enemies drop the most and soon got 10000, 20000, 30000, 40000, 50000, 60000, ... NOOOOOOOOOOO! That game taught me two things: that Dodos are extinct, and that not checking for overflows is a painful experience. -- Marco
Nov 21 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 6:17 AM, Ary Borenszweig wrote:
 On 11/21/14, 5:45 AM, Walter Bright wrote:
 On 11/21/2014 12:10 AM, bearophile wrote:
 Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF crossing
 issues, and
 pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFFFFFF crossing bugs is making the bug count less.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
 paradoxically this can make the bug worse, because then it only gets
 found
 much, much later in supposedly tested & robust code.
Is this true? Do you have some examples of buggy code?
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
"This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements)" How often does that happen in practice?
Every time you read a DVD image :o). I should say that in my doctoral work it was often the case I'd have very large arrays. Andrei
Nov 21 2014
parent Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/21/14, 1:32 PM, Andrei Alexandrescu wrote:
 On 11/21/14 6:17 AM, Ary Borenszweig wrote:
 On 11/21/14, 5:45 AM, Walter Bright wrote:
 On 11/21/2014 12:10 AM, bearophile wrote:
 Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF crossing
 issues, and
 pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFFFFFF crossing bugs is making the bug count less.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
 paradoxically this can make the bug worse, because then it only gets
 found
 much, much later in supposedly tested & robust code.
Is this true? Do you have some examples of buggy code?
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
"This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements)" How often does that happen in practice?
Every time you read a DVD image :o). I should say that in my doctoral work it was often the case I'd have very large arrays.
Oh, sorry, I totally forgot that when you open a DVD with VLC it reads the whole thing to memory. </sarcasm>
Nov 21 2014
prev sibling parent "Don" <x nospam.com> writes:
On Friday, 21 November 2014 at 08:46:20 UTC, Walter Bright wrote:
 On 11/21/2014 12:10 AM, bearophile wrote:
 Walter Bright:

 All you're doing is trading 0 crossing for 0x7FFFFFFF 
 crossing issues, and
 pretending the problems have gone away.
I'm not pretending anything. I am asking which of the two solutions leads, in practical programming, to fewer problems/bugs. So far I've seen the unsigned solution, and I've seen it's highly bug-prone.
I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFFFFFF crossing bugs is making the bug count less.
 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less 
 often, but
 paradoxically this can make the bug worse, because then it 
 only gets found
 much, much later in supposedly tested & robust code.
Is this true? Do you have some examples of buggy code?
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
Changing signed to unsigned in that example does NOT fix the bug. It just means it fails with length = 2^^31 instead of length = 2^^30.

uint a = 0x8000_0000u;
uint b = 0x8000_0002u;
assert((a + b) / 2 == 1);  // the sum wrapped; the true average is 0x8000_0001

But actually I don't understand that article. The arrays are int, not char. Since length fits into 32 bits, the largest possible value is 2^^32-1. Therefore, for an int array with 4-byte elements, the largest possible length is 2^^30-1. So I think the article is wrong. I don't think there is a bug in the code.
Nov 24 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 08:10:55 +0000
bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 BTW, granted the 0x7FFFFFFF problems exhibit the bugs less
 often, but paradoxically this can make the bug worse, because
 then it only gets found much, much later in supposedly tested &
 robust code.

 Is this true? Do you have some examples of buggy code?
any code which does something like `if (a-b < 0)` is broken. it will work in most cases, but it is broken. you MUST check values before subtracting. and if you must do checks anyway, what is the reason for making length signed?
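a minimal demo of why that test can never fire (snippet is mine):

import std.stdio;

void main()
{
    uint a = 3, b = 5;
    uint d = a - b;    // wraps modulo 2^^32 instead of going negative
    writeln(d);        // prints 4294967294
    writeln(d < 0);    // prints false -- a uint is never below zero
}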
Nov 21 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via 
Digitalmars-d wrote:

 any code which does something like `if (a-b < 0)` is broken. it
Modify it:
https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d

Modify the method subu(uint ...) or subu(ulong ...):

if (x < y)
    return y - x;
else
    return x - y;

It will not be broken.
Nov 21 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 14:55:45 +0000
FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via
 Digitalmars-d wrote:

 any code which does something like `if (a-b < 0)` is broken. it

 Modify it:
 https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d

 Modify the method subu(uint ...) or subu(ulong ...):

 if (x < y)
     return y - x;
 else
     return x - y;

 It will not be broken.
and it will no longer do the same thing. it's not a fix at all.
Nov 21 2014
parent "FrankLike" <1150015857 qq.com> writes:
On Friday, 21 November 2014 at 15:13:22 UTC, ketmar via 
Digitalmars-d wrote:


 and it will no longer do the same thing. it's not a fix at all.
But it is part of the bug. Surely, bugs which come from mixing signed and unsigned values should be fixed.
Nov 21 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Thursday, 20 November 2014 at 21:27:11 UTC, Walter Bright 
wrote:
 If that is changed to a signed type, then you'll have a 
 same-only-different set of subtle bugs
If people use signed length with unsigned integers, the length will implicitly convert to unsigned and behave like now, no difference.
 plus you'll break the intuition about these things from 
 everyone who has used C/C++ a lot.
C/C++ programmers disagree: http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/ Why do you think they can't handle signed integers?
Nov 21 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 09:23:01 +0000
Kagamin via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 C/C++ programmers disagree:
 http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
 Why do you think they can't handle signed integers?
being a C programmer, i disagree that the author of the article is a C programmer.
Nov 21 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Nov 20, 2014 at 12:02:42AM -0800, Walter Bright via Digitalmars-d wrote:
 On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote:
If this kind of unsafe mixing wasn't allowed, or required explict
casts (to signify "yes I know what I'm doing and I'm prepared to face
the consequences"), I suspect that bearophile would be much happier
about this issue. ;-)
Explicit casts are worse than the problem - they can easily cause bugs.
No worse than the bugs that are currently *silently accepted* by the compiler!
 As for me personally, I like having a complete set of signed and
 unsigned integral types at my disposal. It's like having a full set of
 wrenches that are open end on one end and boxed on the other :-) Most
 of the time either end will work, but sometimes only one will.
 
 Now, if D were a non-systems language like Basic, Go or Java, unsigned
 types could be reasonably dispensed with. But D is a systems
 programming language, and it ought to have available types that match
 what the hardware supports.
Please note that I never suggested anywhere that we get rid of unsigned types. In fact, I think it was a right decision to include unsigned types in the language and to use an unsigned type for array length.

What *could* be improved, is the prevention of obvious mistakes in *mixing* signed and unsigned types. Right now, D allows code like the following with no warning:

	uint x;
	int y;
	auto z = x - y;

BTW, this one is the same in essence as an actual bug that I fixed in druntime earlier this year, so downplaying it as a mistake people make 'cos they confound computer math with math math is fallacious.

T

-- 
He who laughs last thinks slowest.
Nov 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about:

	uint x;
	auto z = x - 1;

?
Nov 20 2014
next sibling parent reply "FrankLike" <1150015857 qq.com> writes:
 What about:

     uint x;
     auto z = x - 1;

 ?
When mixing signed and unsigned, treating the result as signed may avoid the mistake. Here is a small test: add 'cast(long)' before the - operand; if the compiler added it automatically, it might be fine.

-----
import std.stdio;
void main()
{
    size_t width = 10;
    size_t height = 20;
    writeln("before width is ", width, " ,height is ", height);
    height -= 15;
    width -= cast(long) height;
    writeln("after width is ", width, " ,height is ", height);
}
-----

Result: after width is 5. It's ok.

Frank
Nov 20 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2014 3:37 PM, FrankLike wrote:
 What about:

     uint x;
     auto z = x - 1;

 ?
 When mixing signed and unsigned, treating the result as signed may avoid the mistake.
I think you missed my question - is that legal code under H.S.Teoh's proposal?
Nov 20 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 20 Nov 2014 13:28:37 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
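for comparison, a sketch of what today's rules actually give:

import std.stdio;

void main()
{
    uint x;
    auto z = x - 1;                        // the int literal converts to uint
    static assert(is(typeof(z) == uint));  // so z is uint today, not long
    writeln(z);                            // prints 4294967295 -- it wrapped
}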
Nov 21 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote:
 On Thu, 20 Nov 2014 13:28:37 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
Would you agree that that would break a substantial amount of correct D code? -- Andrei
Nov 21 2014
next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 08:31:13 -0800
Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 Would you agree that that would break a substantial amount of correct D
 code? -- Andrei
i don't think that code with possible int wrapping and `auto` is correct, so the answer is "no". bad code must be made bad.
Nov 21 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 8:57 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 08:31:13 -0800
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 Would you agree that that would break a substantial amount of correct D
 code? -- Andrei
i don't think that code with possible int wrapping and `auto` is correct, so the answer is "no". bad code must be made bad.
I think you misunderstood the question. -- Andrei
Nov 21 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 17:45:11 -0800
Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On 11/21/14 8:57 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 08:31:13 -0800
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 Would you agree that that would break a substantial amount of correct D
 code? -- Andrei
i don't think that code with possible int wrapping and `auto` is correct, so the answer is "no". bad code must be made bad.
 I think you misunderstood the question. -- Andrei
i don't think so. you have two questions here: is such code correct, and will it break? if the answer to the first question is "no", then the second question makes no sense. i keep thinking that such code is not correct (albeit it compiles ok), so nothing will break. at least this is how i understood your question. maybe you are right, and i didn't get it.
Nov 21 2014
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Nov 21, 2014 at 08:31:13AM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote:
On Thu, 20 Nov 2014 13:28:37 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
What *could* be improved, is the prevention of obvious mistakes in
*mixing* signed and unsigned types. Right now, D allows code like
the following with no warning:

	uint x;
	int y;
	auto z = x - y;

BTW, this one is the same in essence as an actual bug that I fixed
in druntime earlier this year, so downplaying it as a mistake
people make 'cos they confound computer math with math math is
fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
What if x==uint.max?
 Would you agree that that would break a substantial amount of correct
 D code? -- Andrei
Yeah I don't think it's a good idea for subtraction to yield a different type from its operands. Non-closure of operators (i.e., results are of a different type than operands) leads to a lot of frustration because you keep ending up with the wrong type, and inevitably people will just throw in random casts everywhere just to make things work.

T

-- 
We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Nov 21 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 09:08:54 -0800
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

What about:

      uint x;
      auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
 What if x==uint.max?
nothing bad, long is perfectly able to represent that.
 Would you agree that that would break a substantial amount of correct
 D code? -- Andrei
 Yeah I don't think it's a good idea for subtraction to yield a
 different type from its operands. Non-closure of operators (i.e.,
 results are of a different type than operands) leads to a lot of
 frustration because you keep ending up with the wrong type, and
 inevitably people will just throw in random casts everywhere just to
 make things work.
not any subtraction, only that with `auto` vardecl.
Nov 21 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 6:03 AM, ketmar via Digitalmars-d wrote:
 On Thu, 20 Nov 2014 13:28:37 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about: uint x; auto z = x - 1; ?
here z must be `long`. and for `ulong` compiler must emit error.
So, any time an integer literal appears in an unsigned expression, the type of the expression becomes signed?
Nov 21 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Nov 21, 2014 at 10:59:13AM -0800, Walter Bright via Digitalmars-d wrote:
 On 11/21/2014 6:03 AM, ketmar via Digitalmars-d wrote:
On Thu, 20 Nov 2014 13:28:37 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
What *could* be improved, is the prevention of obvious mistakes in
*mixing* signed and unsigned types. Right now, D allows code like
the following with no warning:

	uint x;
	int y;
	auto z = x - y;

BTW, this one is the same in essence as an actual bug that I fixed
in druntime earlier this year, so downplaying it as a mistake
people make 'cos they confound computer math with math math is
fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
So, any time an integer literal appears in an unsigned expression, the type of the expression becomes signed?
And subtracting two ulongs gives a compile error?! Whoa, that is truly crippled.

T

-- 
Food and laptops don't mix.
Nov 21 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 10:59:13 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/21/2014 6:03 AM, ketmar via Digitalmars-d wrote:
 On Thu, 20 Nov 2014 13:28:37 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
 So, any time an integer literal appears in an unsigned expression, the type of
 the expression becomes signed?
nope. only for `auto` expressions.
Nov 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 11:36 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 10:59:13 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/21/2014 6:03 AM, ketmar via Digitalmars-d wrote:
 On Thu, 20 Nov 2014 13:28:37 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like the
 following with no warning:

 	uint x;
 	int y;
 	auto z = x - y;

 BTW, this one is the same in essence as an actual bug that I fixed in
 druntime earlier this year, so downplaying it as a mistake people make
 'cos they confound computer math with math math is fallacious.
What about:

	uint x;
	auto z = x - 1;

?
here z must be `long`. and for `ulong` compiler must emit error.
So, any time an integer literal appears in an unsigned expression, the type of the expression becomes signed?
nope. only for `auto` expressions.
So 'auto' has different type rules for expressions than anywhere else in D?

Consider:

     void foo(T)(T a) { ... }

     if (x - 1) foo(x - 1);
     if (auto a = x - 1) foo(a);

and now foo() is instantiated with a different type?

I'm afraid I can't sell that to anyone :-(
Nov 21 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 11:52:29 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 So 'auto' has different type rules for expressions than anywhere else in D?

 Consider:

      void foo(T)(T a) { ... }

      if (x - 1) foo(x - 1);
      if (auto a = x - 1) foo(a);

 and now foo() is instantiated with a different type?

 I'm afraid I can't sell that to anyone :-(
the whole thing of `auto` is to let compiler decide. i won't be surprised if `a` becomes 80-bit real or bigint here -- that's up to compiler to decide which type will be able to hold the whole range. `auto` for me means "i don't care what type you'll choose, just do it for me and don't lose any bits." this can be some kind of structure with overloaded operators, for example.
Nov 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 12:09 PM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 11:52:29 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 So 'auto' has different type rules for expressions than anywhere else in D?

 Consider:

       void foo(T)(T a) { ... }

       if (x - 1) foo(x - 1);
       if (auto a = x - 1) foo(a);

 and now foo() is instantiated with a different type?

 I'm afraid I can't sell that to anyone :-(
the whole thing of `auto` is to let compiler decide. i won't be surprised if `a` becomes 80-bit real or bigint here -- that's up to compiler to decide which type will be able to hold the whole range. `auto` for me means "i don't care what type you'll choose, just do it for me and don't lose any bits." this can be some kind of structure with overloaded operators, for example.
'auto' doesn't mean "break my code if I refactor out expressions into temporaries".
Nov 21 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 12:35:31 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 'auto' doesn't mean "break my code if I refactor out expressions into temporaries".
this can be easily avoided: just don't use `auto` for refactoring. i'm still thinking about `auto` as "any type that is able to hold the result". `uint` is obviously not able to hold the result of uint subtraction (as it can be negative, and uint can't). so when the compiler automatically chooses a type which can't hold the resulting value, i see this as a design flaw and a safety-breaking feature.

one of D's flaws -- as i can see it -- is trying to be both reasonably high-level and "close to metal". it's ok for a "metal" language to use uint to hold that result and wrap. but it's not ok for a high-level language; a high-level language should free me of thinking about "wrapping", "overflow" and so on.

i'm not telling that D is bad, i'm simply trying to tell that such confusions will arise again and again. "too far from metal" vs "too close to metal". i may just not get the whole concept, so i don't know what to expect in such cases. sometimes D amuses me with its shameless breakage of the principle of least astonishment. but yet again it very well can be a flaw in my brains and not in D.
Nov 21 2014
prev sibling parent Nick Treleaven <ntrel-pub mybtinternet.com> writes:
On 20/11/2014 08:02, Walter Bright wrote:
 On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote:
 If this kind of unsafe mixing wasn't allowed, or required explict casts
 (to signify "yes I know what I'm doing and I'm prepared to face the
 consequences"), I suspect that bearophile would be much happier about
 this issue. ;-)
Explicit casts are worse than the problem - they can easily cause bugs.
I recently explained to you that explicit casts are easily avoided using `import std.conv: signed, unsigned;`. D compilers badly need a way to detect bug-prone sign mixing. It is no exaggeration to say D is worse than C compilers in this regard. Usually we discuss how to compete with modern languages; here we are not even keeping up with C. It's disappointing this issue was pre-approved last year, but now neither you nor even Andrei seem particularly cognizant of the need to resolve it. If you belittle the problem, you discourage others from trying to solve it.
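A sketch of the idiom mentioned above (assuming std.conv's signed/unsigned helpers do what their names say):

import std.conv : signed, unsigned;

void main()
{
    uint u = 5;
    int n = -3;

    // convert deliberately at the point of mixing, no blind cast needed
    auto s = u.signed + n;    // u.signed is int, so s is int
    assert(s == 2);

    auto v = unsigned(s);     // and back, when unsigned is what you want
    assert(v == 2);
}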
Nov 22 2014
prev sibling parent "Kagamin" <spam here.lot> writes:
On Thursday, 20 November 2014 at 01:05:51 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 However, the fact that you can freely mix signed and unsigned 
 types in
 unsafe ways without any warning, is a fly that spoils the soup.

 If this kind of unsafe mixing wasn't allowed, or required 
 explict casts
 (to signify "yes I know what I'm doing and I'm prepared to face 
 the
 consequences"), I suspect that bearophile would be much happier 
 about
 this issue. ;-)
If usage of unsigned types is not controlled, they will systematically mix with signed types, and the mix becomes the normal flow of the code. Disallowing the normal flow of the code is even worse.
Nov 20 2014
prev sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/19/14, 7:03 AM, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:

 Weird consequence: using subtraction with an unsigned type is nearly
 always a bug.

 I wish D hadn't called unsigned integers 'uint'. They should have been
 called '__uint' or something. They should look ugly. You need a very,
 very good reason to use an unsigned type.

 We have a builtin type that is deadly but seductive.
I agree. An array's length makes sense as an unsigned ("an array can't have a negative length, right?") but it leads to the bugs you say. For example:

~~~
import std.stdio;

void main() {
    auto a = [1, 2, 3];
    auto b = [1, 2, 3, 4];
    if (a.length - b.length > 0) {
        writeln("Can you spot the bug that easily?");
    }
}
~~~

Yes, it makes sense, but at the same time it leads to super unintuitive math operations being involved.

Rust made the same mistake and now a couple of times I've seen bugs like these being reported. Never seen them in Java or .Net though. I wonder why...
Nov 19 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 10:09 AM, Ary Borenszweig wrote:
 On 11/19/14, 7:03 AM, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
>
 Weird consequence: using subtraction with an unsigned type is nearly
 always a bug.

 I wish D hadn't called unsigned integers 'uint'. They should have been
 called '__uint' or something. They should look ugly. You need a very,
 very good reason to use an unsigned type.

 We have a builtin type that is deadly but seductive.
 I agree. An array's length makes sense as an unsigned ("an array can't
 have a negative length, right?") but it leads to the bugs you say. For
 example:

 ~~~
 import std.stdio;

 void main() {
     auto a = [1, 2, 3];
     auto b = [1, 2, 3, 4];
     if (a.length - b.length > 0) {
         writeln("Can you spot the bug that easily?");
     }
 }
 ~~~

 Yes, it makes sense, but at the same time it leads to super unintuitive
 math operations being involved.

 Rust made the same mistake and now a couple of times I've seen bugs
 like these being reported. Never seen them in Java or .Net though. I
 wonder why...
There are related bugs in Java too, e.g. I remember one in binary search where (i + j) / 2 was wrong because of an overflow. Also, Java does have a package for unsigned integers so apparently it's necessary. Andrei
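For reference, the standard repair for that binary-search bug (sketch mine):

size_t mid(size_t lo, size_t hi)
{
    // compute the midpoint without ever forming lo + hi,
    // so nothing can overflow as long as lo <= hi
    return lo + (hi - lo) / 2;
}

void main()
{
    assert(mid(0, 10) == 5);
    assert(mid(size_t.max - 1, size_t.max) == size_t.max - 1);
}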
Nov 19 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 There are related bugs in Java too, e.g. I remember one in 
 binary search where (i + j) / 2 was wrong because of an 
 overflow.
This is possible in D too.
 Also, Java does have a package for unsigned integers so 
 apparently it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths. Bye, bearophile
Nov 19 2014
next sibling parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Wednesday, 19 November 2014 at 20:40:53 UTC, bearophile wrote:
 Andrei Alexandrescu:

 There are related bugs in Java too, e.g. I remember one in 
 binary search where (i + j) / 2 was wrong because of an 
 overflow.
This is possible in D too.
 Also, Java does have a package for unsigned integers so 
 apparently it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths. Bye, bearophile
The only signed types that are able to represent all possible array lengths on 64 bit systems are long double and cent.
Nov 19 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 19 Nov 2014 21:38:50 +0000
schrieb "Matthias Bentrup" <matthias.bentrup googlemail.com>:

 array lengths on 64 bit systems are long double and cent.
Last time I checked, null was a reserved address. ;) -- Marco
Nov 21 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 12:40 PM, bearophile wrote:
 Andrei Alexandrescu:

 There are related bugs in Java too, e.g. I remember one in binary
 search where (i + j) / 2 was wrong because of an overflow.
This is possible in D too.
 Also, Java does have a package for unsigned integers so apparently
 it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths.
I think we're in good shape with unsigned. -- Andrei
Nov 19 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Nov 19, 2014 at 04:08:11PM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 11/19/14 12:40 PM, bearophile wrote:
Andrei Alexandrescu:

There are related bugs in Java too, e.g. I remember one in binary
search where (i + j) / 2 was wrong because of an overflow.
This is possible in D too.
Also, Java does have a package for unsigned integers so apparently
it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths.
I think we're in good shape with unsigned. -- Andrei
Implicit conversion between signed/unsigned is the fly that spoils the soup, and the source of subtle bugs that persistently crop up when dealing with size_t. The fact of the matter is that humans are error-prone, even when they are aware of the pitfalls of mixing signed / unsigned types, and currently the language is doing nothing to help prevent these sorts of mistakes.

T

-- 
Help a man when he is in trouble and he will remember you when he is in trouble again.
Nov 19 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/19/14 4:24 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, Nov 19, 2014 at 04:08:11PM -0800, Andrei Alexandrescu via
Digitalmars-d wrote:
 On 11/19/14 12:40 PM, bearophile wrote:
 Andrei Alexandrescu:

 There are related bugs in Java too, e.g. I remember one in binary
 search where (i + j) / 2 was wrong because of an overflow.
This is possible in D too.
 Also, Java does have a package for unsigned integers so apparently
 it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths.
I think we're in good shape with unsigned. -- Andrei
Implicit conversion between signed/unsigned is the fly that spoils the soup, and the source of subtle bugs that persistently crop up when dealing with size_t. The fact of the matter is that humans are error-prone, even when they are aware of the pitfalls of mixing signed / unsigned types, and currently the language is doing nothing to help prevent these sorts of mistakes.
That I partially, fractionally even, agree with. We agonized for a long time about what to do to improve on the state of the art back in 2007 - literally months I recall. Part of the conclusion was that reverting to int for object lengths would be a net negative. Andrei
Nov 19 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Nov 19, 2014 at 04:42:53PM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 11/19/14 4:24 PM, H. S. Teoh via Digitalmars-d wrote:
On Wed, Nov 19, 2014 at 04:08:11PM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
On 11/19/14 12:40 PM, bearophile wrote:
Andrei Alexandrescu:

There are related bugs in Java too, e.g. I remember one in binary
search where (i + j) / 2 was wrong because of an overflow.
This is possible in D too.
Also, Java does have a package for unsigned integers so apparently
it's necessary.
This is irrelevant. No one here is saying that a system language should not have unsigned values. The discussion here is about the type of array lengths.
I think we're in good shape with unsigned. -- Andrei
Implicit conversion between signed/unsigned is the fly that spoils the soup, and the source of subtle bugs that persistently crop up when dealing with size_t. The fact of the matter is that humans are error-prone, even when they are aware of the pitfalls of mixing signed / unsigned types, and currently the language is doing nothing to help prevent these sorts of mistakes.
That I partially, fractionally even, agree with. We agonized for a long time about what to do to improve on the state of the art back in 2007 - literally months I recall. Part of the conclusion was that reverting to int for object lengths would be a net negative.
[...]

Actually, I agree about using unsigned for object lengths. I think it's a sound decision -- why use a signed value for something that can never be negative? OTOH, what spoils this (hence the fly in soup reference), is the fact that you can now take these unsigned values and randomly mix them up with signed values without a single warning from the compiler. Even requiring a cast to signify "I know what I'm doing, just do as I say" would be an improvement over the current silent acceptance of questionable code like `if (x.length - y.length < 0) ...`.

T

-- 
I am a consultant. My job is to make your job redundant. -- Mr Tom
Nov 19 2014
prev sibling parent reply "FrankLike" <1150015857 qq.com> writes:
 That I partially, fractionally even, agree with. We agonized 
 for a long time about what to do to improve on the state of the 
 art back in 2007 - literally months I recall. Part of the 
 conclusion was that reverting to int for object lengths would 
 be a net negative.

 Andrei
All of these discussions let me know what to do about 'length'. D is better than C; it's a systems language. For the Windows API: cast(something)length; otherwise modify the code to use size_t. Thank you all.
Nov 19 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 20 Nov 2014 07:42:20 +0000
schrieb "FrankLike" <1150015857 qq.com>:

 That I partially, fractionally even, agree with. We agonized 
 for a long time about what to do to improve on the state of the 
 art back in 2007 - literally months I recall. Part of the 
 conclusion was that reverting to int for object lengths would 
 be a net negative.

 Andrei
 All of these discussions let me know what to do about 'length'. D is
 better than C; it's a systems language. For the Windows API:
 cast(something)length; otherwise modify the code to use size_t. Thank
 you all.
The correct thing to use for the Windows API is:

  length.to!int

It checks whether the length actually fits into an int.

-- Marco
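A sketch of what that buys you (std.conv throws ConvOverflowException when the value doesn't fit):

import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

void main()
{
    size_t small = 42;
    assert(small.to!int == 42);                      // fits, so it converts

    size_t big = size_t.max;
    assertThrown!ConvOverflowException(big.to!int);  // too big: throws
}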
Nov 21 2014
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 20 November 2014 at 00:08:08 UTC, Andrei 
Alexandrescu wrote:
 I think we're in good shape with unsigned.
I'd actually prefer signed. Index-based algorithms can be tricky to write correctly with unsigned index values. The reason size_t is unsigned in Druntime is because I felt that half the memory range on 32-bit was potentially too small a maximum size in a systems language, and it's unsigned on 64-bit for the sake of consistency.
Nov 20 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/20/14 7:29 AM, Sean Kelly wrote:
 On Thursday, 20 November 2014 at 00:08:08 UTC, Andrei Alexandrescu wrote:
 I think we're in good shape with unsigned.
I'd actually prefer signed. Index-based algorithms can be tricky to write correctly with unsigned index values.
The most difficult pattern that comes to mind is the "long arrow" operator seen in backward iteration:

void fun(int[] a)
{
     for (auto i = a.length; i --> 0; )
     {
         // use i
     }
}

Andrei
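In case the arrow puzzles anyone (my annotation, not Andrei's): it is just post-decrement plus greater-than, so this is the same loop spelled out:

void fun(int[] a)
{
    // "(i--) > 0" tests first, then decrements, so the body
    // sees i = a.length - 1 down to 0 with no wraparound inside
    for (auto i = a.length; (i--) > 0; )
    {
        // use i
    }
}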
Nov 20 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Andrei Alexandrescu"  wrote in message 
news:m4l711$1t39$1 digitalmars.com...

 The most difficult pattern that comes to mind is the "long arrow" operator 
 seen in backward iteration:

 void fun(int[] a)
 {
      for (auto i = a.length; i --> 0; )
      {
          // use i
      }
 }
Over the years most of my unsigned-related bugs have been from screwing up various loop conditions.  Thankfully D solves this perfectly with:

void fun(int[] a)
{
    foreach_reverse (i; 0 .. a.length)
    {
    }
}

So I never have to write those again.
Nov 21 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Daniel Murphy:

 void fun(int[] a)
 {
    foreach_reverse (i; 0 .. a.length)
    {
    }
 }
Better (it's a workaround for a D design flaw that we're unwilling to fix):

foreach_reverse (immutable i; 0 .. a.length)

Bye,
bearophile
Nov 21 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"bearophile"  wrote in message news:rqyuiioyrrjgggctfpcx forum.dlang.org...

 Better (it's a workaround for a D design flaw that we're unwilling to 
 fix):

 foreach_reverse (immutable i; 0 .. a.length)
I know you feel that way, but I'd rather face the non-existent risk of accidentally mutating the induction variable than write immutable every time.
Nov 21 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Daniel Murphy:

 foreach_reverse (immutable i; 0 .. a.length)
I know you feel that way, but I'd rather face the non-existent risk of accidentally mutating the induction variable than write immutable every time.
It's not non-existent :-) (And the right default for a modern language is to have immutable by default and mutable on request. If D doesn't have this quality, better to add immutable every damn time). Bye, bearophile
Nov 21 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 12:16 AM, Daniel Murphy wrote:
 Over the years most of my unsigned-related bugs have been from screwing up
 various loop conditions.  Thankfully D solves this perfectly with:

 void fun(int[] a)
 {
      foreach_reverse (i; 0 .. a.length)
     {
     }
 }

 So I never have to write those again.
I thought everyone hated foreach_reverse! But, yeah, foreach and ranges+algorithms have virtually eliminated a large category of looping bugs.
Nov 21 2014
next sibling parent reply "Stefan Koch" <uplink.coder googlemail.com> writes:
On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:
 I thought everyone hated foreach_reverse!
I dislike foreach_reverse;
1. it's a keyword with an underscore in it;
2. complicates implementation of foreach and parsing.
3. key_word with under_score
Nov 21 2014
next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
 On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:
 I thought everyone hated foreach_reverse!
Not me. It's ugly but it gets the job done. All I have to do is add '_reverse' and it just works! "Stefan Koch" wrote in message news:mmvuvkdfnvwezyvtcceq forum.dlang.org...
 I dislike foreach_reverse;
 1. it's a keyword with an underscore in it;
So what.
 2. complicates implementation of foreach and parsing.
The additional complexity is trivial.
 3. key_word with under_score
Don't care.
Nov 21 2014
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 21 November 2014 at 09:47:32 UTC, Stefan Koch wrote:
 On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright 
 wrote:
 I thought everyone hated foreach_reverse!
 I dislike foreach_reverse;
 1. it's a keyword with an underscore in it;
 2. complicates implementation of foreach and parsing.
 3. key_word with under_score
These are compiler implementation issues and all solvable. People don't give a shit about how the compiler works, and rightly so. The language is made to fit the needs of the user, not the needs of the implementer.
Nov 21 2014
next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"deadalnix"  wrote in message news:qhirkvbtoiomkyjjugiy forum.dlang.org...

 These are compiler implementation issues and all solvable. People don't 
 give a shit about how the compiler works, and rightly so. The language is 
 made to fit the needs of the user, not the needs of the implementer.
Usually, although they start to care when the complexity results in a never-ending supply of bugs. foreach_reverse doesn't come anywhere near this, fortunately.
Nov 21 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sat, 22 Nov 2014 03:09:59 +0000
deadalnix via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Friday, 21 November 2014 at 09:47:32 UTC, Stefan Koch wrote:
 On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:
 I thought everyone hated foreach_reverse!
 I dislike foreach_reverse:
 1. it's a keyword with an underscore in it;
 2. it complicates implementation of foreach and parsing;
 3. key_word with under_score.
 These are compiler implementation issues and all solvable. People 
 don't give a shit about how the compiler works, and rightly so. The 
 language is made to fit the needs of the user, not the needs of the 
 implementer.
`foreach (auto n; ...)` anyone? and `foreach (; ...)`? nope. "cosmetic changes aren't needed". this is clearly "made for implementer". luckily, it's not me who will try to explain to newcomers why they have a new variable declaration in `foreach` which looks like variable reuse, why they must invent a new variable name for each nested `foreach` and so on. but please, don't tell me about "solvable" -- all this is "solvable" only in the sense of "make your own fork and fix it. ah, and support your fork. and don't forget that your code cannot be used with the vanilla compiler anymore." ok for me, but for others?
Nov 22 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 I thought everyone hated foreach_reverse!
I love it! Bye, bearophile
Nov 21 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 21 Nov 2014 11:24:37 +0000
schrieb "bearophile" <bearophileHUGS lycos.com>:

 Walter Bright:
 
 I thought everyone hated foreach_reverse!
I love it! Bye, bearophile
Hey, it is a bit ugly, but I'd pick

   foreach_reverse (i; 0 .. length)

anytime over

   import std.range;
   foreach (i; iota(length).retro())

-- Marco
Nov 21 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 12:22 PM, Marco Leise wrote:
 Am Fri, 21 Nov 2014 11:24:37 +0000
 schrieb "bearophile" <bearophileHUGS lycos.com>:

 Walter Bright:

 I thought everyone hated foreach_reverse!
I love it! Bye, bearophile
 Hey, it is a bit ugly, but I'd pick

    foreach_reverse (i; 0 .. length)

 anytime over

    import std.range;
    foreach (i; iota(length).retro())
I agree, though "foreach (i; length.iota.retro)" is no slouch either! -- Andrei
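For the record, a minimal runnable comparison of the two spellings (nothing assumed here beyond std.range and std.stdio):

import std.range : iota, retro;
import std.stdio : writeln;

void main()
{
    size_t length = 3;
    foreach_reverse (i; 0 .. length)
        writeln(i);               // prints 2, 1, 0
    foreach (i; length.iota.retro)
        writeln(i);               // prints 2, 1, 0 as well
}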
Nov 21 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 21 Nov 2014 17:50:11 -0800
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 I agree, though "foreach (i; length.iota.retro)" is no slouch either! -- 
 Andrei
Yes, no, well, it feels like too much science for a loop with a decrementing index instead of an incrementing one, no matter how few parentheses are used. It is not the place where I would want to introduce functional programming to someone who has never seen D code before. That said, I'd also be uncertain whether compilers transparently convert this to the equivalent of a reverse loop.

-- Marco
Nov 22 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:
 On 11/21/2014 12:16 AM, Daniel Murphy wrote:
 Over the years most of my unsigned-related bugs have been from 
 screwing up
 various loop conditions.  Thankfully D solves this perfectly 
 with:

 void fun(int[] a)
 {
    foreach_reverse (i; 0 .. a.length)
    {
    }
 }

 So I never have to write those again.
I thought everyone hated foreach_reverse! But, yeah, foreach and ranges+algorithms have virtually eliminated a large category of looping bugs.
Well yeah, it is kind of ugly looking, and a language construct for that when we have retro in phobos... foreach_reverse is essentially dead weight in the spec.
Nov 21 2014
prev sibling next sibling parent "uri" <uri gmail.com> writes:
On Wednesday, 19 November 2014 at 18:09:11 UTC, Ary Borenszweig 
wrote:
 On 11/19/14, 7:03 AM, Don wrote:
 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise 
 wrote:

 Weird consequence: using subtraction with an unsigned type is 
 nearly
 always a bug.

 I wish D hadn't called unsigned integers 'uint'. They should 
 have been
 called '__uint' or something. They should look ugly. You need 
 a very,
 very good reason to use an unsigned type.

 We have a builtin type that is deadly but seductive.
I agree. An array's length makes sense as an unsigned ("an array can't have a negative length, right?") but it leads to the bugs you say. For example:

~~~
import std.stdio;

void main() {
  auto a = [1, 2, 3];
  auto b = [1, 2, 3, 4];
  if (a.length - b.length > 0) {
    writeln("Can you spot the bug that easily?");
  }
}
~~~

Yes, it makes sense, but at the same time it leads to super unintuitive math operations being involved.

Rust made the same mistake and now a couple of times I've seen bugs like these being reported. Never seen them in Java or .Net though. I wonder why...
IMO array length should be unsigned, but I'd like to see unsafe operators on unsigned types made illegal. It is trivial to write (a.signed - b.signed), and it should be explicit in the code, i.e. not something that the compiler will do automatically behind the scenes.

Cheers,
uri
Nov 19 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/19/2014 10:09 AM, Ary Borenszweig wrote:
 I agree. An array's length makes sense as an unsigned ("an array can't have a
 negative length, right?") but it leads to the bugs you say. For example:

 ~~~
 import std.stdio;

 void main() {
    auto a = [1, 2, 3];
    auto b = [1, 2, 3, 4];
    if (a.length - b.length > 0) {
      writeln("Can you spot the bug that easily?");
Yes.
    }
 }
 ~~~

 Yes, it makes sense, but at the same time it leads to super unintuitive math
 operations being involved.
Computer math is not math math. It is its own beast, and if you're going to write in a systems programming language it is very important to learn how it works, or you'll be nothing but frustrated.
 Rust made the same mistake and now a couple of times I've seen bugs like these
 being reported. Never seen them in Java or .Net though. I wonder why...
D is meant to be easily used by C and C++ programmers. It follows the same model of signed/unsigned arithmetic and integral promotions. This is very, very deliberate. To change this would be a disaster.

For example, in America we drive on the right. In Australia, they drive on the left. When I visit Australia, I know this, but when stepping out into the road I instinctively check my left for cars, step into the road, and my foot gets run over by a car coming from the right. I've had to be very careful as a pedestrian there, as my intuition would get me killed.

Don't mess with systems programmers' intuitions. It'll cause more problems than it solves.
Nov 20 2014
next sibling parent "Kagamin" <spam here.lot> writes:
On Thursday, 20 November 2014 at 08:14:41 UTC, Walter Bright 
wrote:
 Computer math is not math math. It is its own beast, and if 
 you're going to write in a systems programming language it is 
 very important to learn how it works, or you'll be nothing but 
 frustrated.
Understanding how it works doesn't mean error-prone practices must be forced everywhere. It's not like D can't work with signed types.
 Rust made the same mistake and now a couple of times I've seen 
 bugs like these
 being reported. Never seen them in Java or .Net though. I 
 wonder why...
D is meant to be easily used by C and C++ programmers. It follows the same model of signed/unsigned arithmetic and integral promotions. This is very, very deliberate. To change this would be a disaster.
If unsigned types exist, it doesn't mean they must be forced everywhere.
 For example, in America we drive on the right. In Australia, 
 they drive on the left. When I visit Australia, I know this, 
 but when stepping out into the road I instinctively check my 
 left for cars, step into the road, and my foot gets run over by 
 a car coming from the right. I've had to be very careful as a 
 pedestrian there, as my intuition would get me killed.

 Don't mess with systems programmers' intuitions. It'll cause 
 more problems than it solves.
Bad things can happen, but why make them more probable instead of trying to make them less probable?
Nov 20 2014
prev sibling parent reply "CraigDillabaugh" <craig.dillabaugh gmail.com> writes:
On Thursday, 20 November 2014 at 08:14:41 UTC, Walter Bright 
wrote:
clip
 For example, in America we drive on the right. In Australia, 
 they drive on the left. When I visit Australia, I know this, 
 but when stepping out into the road I instinctively check my 
 left for cars, step into the road, and my foot gets run over by 
 a car coming from the right. I've had to be very careful as a 
 pedestrian there, as my intuition would get me killed.

 Don't mess with systems programmers' intuitions. It'll cause 
 more problems than it solves.
I live in Quebec and my intuition always tells me to look both ways - because you never know :o)
Nov 21 2014
parent "Meta" <jared771 gmail.com> writes:
On Friday, 21 November 2014 at 16:48:35 UTC, CraigDillabaugh 
wrote:
 I live in Quebec and my intuition always tells me to look both 
 ways - because you never know :o)
While doing my driver's training years ago, my instructor half-jokingly warned us never to jaywalk in Quebec unless we had a death wish and wanted to hear all about chalices and tabernacles.
Nov 21 2014
prev sibling next sibling parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetics
 decay to unsigned anyways and you don't need checks like
 `assert(idx >= 0)`.
A failing assert(-1 < arr.length) makes little sense though; -1 can't be bigger than a non-negative number.
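The mechanics behind that failure, as a minimal sketch: the -1 is implicitly converted to the unsigned type before the comparison is made:

void main()
{
    int[] arr = [1, 2, 3];
    // -1 is converted to size_t (becoming size_t.max) for the comparison,
    // so the mathematically true "-1 < 3" evaluates to false and this fails:
    assert(-1 < arr.length);
}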
Nov 19 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 19 Nov 2014 10:42:30 +0000
schrieb "Kagamin" <spam here.lot>:

 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 Aside from the size factor, I personally prefer unsigned types
 for countable stuff like array lengths. Mixed arithmetics
 decay to unsigned anyways and you don't need checks like
 `assert(idx >= 0)`.
A failing assert(-1 < arr.length) makes little sense though; -1 can't be bigger than a non-negative number.
It makes little sense, right. -- Marco
Nov 21 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 Mixed arithmetics decay to unsigned anyways and you don't need 
 checks like `assert(idx >= 0)`.
What does such an assert get you that bounds checking doesn't?
Nov 19 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 19 Nov 2014 11:01:12 +0000
schrieb "Kagamin" <spam here.lot>:

 On Tuesday, 18 November 2014 at 18:23:52 UTC, Marco Leise wrote:
 Mixed arithmetics decay to unsigned anyways and you don't need 
 checks like `assert(idx >= 0)`.
What does such an assert get you that bounds checking doesn't?
It gets me the same thing when idx is an index into a D slice. -- Marco
Nov 21 2014
prev sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 18 November 2014 at 14:24:18 UTC, FrankLike wrote:
 Many excellent projects such as dfl,dgui,tango, many 'length' 
 which type is 'int' or 'uint',they are D's,many people like 
 it.but they should migrate to 64 bit.So if 'length' type is 
 'int',they can work on 64 bit,but now,they must be modify for 
 'length''s type.
The type of 'length' has always been size_t, which is aliased to uint on x86 and ulong on x64. I think an argument could be made for using long instead of ulong, but the switch from unsigned to signed across bus widths could result in portability issues (when building an x64 program for x86).
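As a sketch, that aliasing can be confirmed at compile time (assuming the usual 32- and 64-bit targets):

   static if (size_t.sizeof == 4)
       static assert(is(size_t == uint));   // x86: length is a uint
   else static if (size_t.sizeof == 8)
       static assert(is(size_t == ulong));  // x64: length is a ulong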
Nov 18 2014
prev sibling next sibling parent "matovitch" <camille.brugel laposte.net> writes:
On Tuesday, 18 November 2014 at 12:33:52 UTC, FrankLike wrote:
 If you migrate your project from x86 to x64, you will find that the 
 length is an error; you must modify it, such as:
   int i = (something).length
 to
   size_t i = (something).length

 but now, 'int' is enough for use, not huge and not small, only 
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, it is easier to migrate code if 'length''s return 
 value type is 'int'.

 Thank you all.
I'm using size_t and std::size_t in C/C++...but sure I am a bit weird/extremist.
Nov 18 2014
prev sibling parent reply "Jeremy DeHaan" <dehaan.jeremiah gmail.com> writes:
On Tuesday, 18 November 2014 at 12:33:52 UTC, FrankLike wrote:
 If you migrate your project from x86 to x64, you will find that the 
 length is an error; you must modify it, such as:
   int i = (something).length
 to
   size_t i = (something).length

 but now, 'int' is enough for use, not huge and not small, only 
 enough.
 'int' is easy to write, and most people are used to it.
 Most importantly, it is easier to migrate code if 'length''s return 
 value type is 'int'.

 Thank you all.
This is a weird thing to argue. Just because an int is good enough for you does not mean that it is the best thing for all people.
Nov 18 2014
parent reply "David Eagen" <davideagen mailinator.com> writes:
Isn't the purpose of size_t to be large enough to address all 
available memory? A negative value is not only too small, but 
doesn't make sense when discussing lengths.

Correctness requires using size_t.
Nov 18 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 18 Nov 2014 17:59:04 +0000
David Eagen via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Isn't the purpose of size_t to be large enough to address all 
 available memory? A negative value is not only too small, but 
 doesn't make sense when discussing lengths.

 Correctness requires using size_t.
yes. besides, there is no such thing as "negative length", so it's somewhat... weird to use a signed integer for length.
Nov 18 2014
next sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 18 November 2014 at 18:03:35 UTC, ketmar via 
Digitalmars-d wrote:
 On Tue, 18 Nov 2014 17:59:04 +0000
 David Eagen via Digitalmars-d <digitalmars-d puremagic.com> 
 wrote:

 Isn't the purpose of size_t is to be large enough to address 
 all available memory? A negative value is not only too small 
 but doesn't make sense when discussing lengths.
 
 Correctness requires using size_t.
yes. besides, there is no such thing as "negative length", so it's somewhat... weird to use signed integer for length.
The reason is so that D won't mess with implicit signed-unsigned conversion, not negative length.
Nov 19 2014
prev sibling parent reply "Don" <x nospam.com> writes:
On Tuesday, 18 November 2014 at 18:03:35 UTC, ketmar via 
Digitalmars-d wrote:
 On Tue, 18 Nov 2014 17:59:04 +0000
 David Eagen via Digitalmars-d <digitalmars-d puremagic.com> 
 wrote:

 Isn't the purpose of size_t is to be large enough to address 
 all available memory? A negative value is not only too small 
 but doesn't make sense when discussing lengths.
 
 Correctness requires using size_t.
yes. besides, there is no such thing as "negative length", so it's somewhat... weird to use signed integer for length.
A length can certainly be negative. Just as a bank balance can be negative. It's just a number.

If I have two pencils of length 10 cm and 15 cm, then the first one is -5 cm longer than the other.

Of course any physical pencil is always of positive length, but that doesn't mean that typeof(pencil.length) can never be negative.
Nov 19 2014
next sibling parent "FrankLike" <1150015857 qq.com> writes:
 If I have two pencils of length 10 cm and 15 cm, then the first 
 one is -5 cm longer than the other.

 Of course any physical pencil is always of positive length, but 
 that doesn't mean that typeof(pencil.length) can never be 
 negative.
Right. 'int' is easy to use, is enough, and makes it easy to migrate code to 64 bit.
Nov 19 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 19 Nov 2014 13:47:50 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 yes. besides, there is no such thing as "negative length", so it's
 somewhat... weird to use a signed integer for length.
 A length can certainly be negative. Just as a bank balance can be 
 negative. It's just a number.
what is the relation between a bank balance and a length? please, show me an array with -1 elements, so i can see a negative length myself. seems to me that you are using "length" as "quantity", but that isn't right.
 If I have two pencils of length 10 cm and 15 cm, then the first 
 one is -5 cm longer than the other.
and again, "length" is not a relation. show me a pencil of length -10 cm. when you subtract lengths, you get a completely different type as a result, not a "length" anymore. ah, that untyped real-life math! ;-)
 Of course any physical pencil is always of positive length, but 
 that doesn't mean that typeof(pencil.length) can never be 
 negative.
it can't be negative. if it can be negative, it is wrongly named, it's not "length".
Nov 19 2014
parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 11/19/14, 6:57 AM, ketmar via Digitalmars-d wrote:
 On Wed, 19 Nov 2014 13:47:50 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 If I have two pencils of length 10 cm and 15 cm, then the first
 one is -5 cm longer than the other.
and again "length" is not a relation. show me pencil of length -10 cm. when you substractin lengthes, you got completely different type as a result, not "length" anymore. ah, that untyped real-life math! ;-)
To me the salient point is that this is not just a mess with real-life math but also with math in D: lengths are unsigned but people subtract them all the time. If they're (un)lucky this will return correct values for a while, but then someday the lengths may be reversed and the values will be hugely wrong: int[] a = [1, 2, 3]; int[] b = [5, 4, 3, 2, 1]; writefln("%s", b.length - a.length); // Yup, 2 writefln("%s", a.length - b.length); // WAT? 18446744073709551614 This is why I agree with Don that:
 Having arr.length return an unsigned type, is a dreadful language
 mistake.
Nov 19 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
David Gileadi:

     writefln("%s", b.length - a.length);  // Yup, 2
     writefln("%s", a.length - b.length);  // WAT? 
 18446744073709551614
Nowadays a better way to write such kind of code is using the Phobos "signed" function: writefln("%s", b.length.signed - a.length.signed); writefln("%s", a.length.signed - b.length.signed);
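For completeness, a self-contained sketch of that fix (assuming the `signed` helper referred to here is the one in std.conv):

import std.conv : signed;
import std.stdio : writefln;

void main()
{
    auto a = [1, 2, 3];
    auto b = [1, 2, 3, 4];
    writefln("%s", b.length.signed - a.length.signed); // prints 1
    writefln("%s", a.length.signed - b.length.signed); // prints -1, not a huge unsigned value
}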
 This is why I agree with Don that:

 Having arr.length return an unsigned type, is a dreadful
language mistake.
This mistake is by design. Walter has several times resisted turning them into signed values, or even making size_t a type strongly distinct from uint/ulong. I don't agree with this design decision, but it's unlikely that Walter will change his mind on it. So better to move on and discuss other things more likely to happen.

Bye, bearophile
Nov 19 2014
parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 11/19/14, 9:12 AM, bearophile wrote:
 David Gileadi:

     writefln("%s", b.length - a.length);  // Yup, 2
     writefln("%s", a.length - b.length);  // WAT? 18446744073709551614
Nowadays a better way to write this kind of code is using the Phobos "signed" function:

    writefln("%s", b.length.signed - a.length.signed);
    writefln("%s", a.length.signed - b.length.signed);
But this requires the following knowledge:

1. That the signed function exists, and its location. As a casual D programmer I didn't know about it.
2. That there's a need for it at all, which requires knowing that length is unsigned. I did know this, but I bet in the heat of programming I'd easily forget it. In a semi-complex algorithm the bug could easily hide for a long time before biting.
 This is why I agree with Don that:

 Having arr.length return an unsigned type, is a dreadful
language mistake.
 This mistake is by design. Walter has several times resisted turning 
 them into signed values, or even making size_t a type strongly distinct 
 from uint/ulong. I don't agree with this design decision, but it's 
 unlikely that Walter will change his mind on it. So better to move on 
 and discuss other things more likely to happen.
Yes, I bet you're right about the likelihood of change.
Nov 19 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
David Gileadi:

 But this requires the following knowledge:
 1. That the signed function exists, and its location. As a 
 casual D programmer I didn't know about it.
 2. That there's a need for it at all, which requires knowing 
 that length is unsigned. I did know this, but I bet in the heat 
 of programming I'd easily forget it. In a semi-complex 
 algorithm the bug could easily hide for a long time before 
 biting.
A solution for similar language design traps/mistakes is static analysis, with tools like Dscanner trained to recognize the troublesome coding patterns and suggest workarounds (or to use better-designed languages. But no language is perfect).
Nov 19 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/19/2014 8:22 AM, David Gileadi wrote:
 2. That there's a need for it at all, which requires knowing that length is
 unsigned. I did know this, but I bet in the heat of programming I'd easily
 forget it. In a semi-complex algorithm the bug could easily hide for a long time before biting.
If it was signed, you'd just have different issues hiding.
Nov 20 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 19 Nov 2014 09:02:49 -0700
David Gileadi via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/19/14, 6:57 AM, ketmar via Digitalmars-d wrote:
 On Wed, 19 Nov 2014 13:47:50 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 If I have two pencils of length 10 cm and 15 cm, then the first
 one is -5 cm longer than the other.
and again "length" is not a relation. show me pencil of length -10 cm. when you substractin lengthes, you got completely different type as a result, not "length" anymore. ah, that untyped real-life math! ;-)
=20 To me the salient point is that this is not just a mess with real-life=20 math but also with math in D: lengths are unsigned but people subtract=20 them all the time. If they're (un)lucky this will return correct values=20 for a while, but then someday the lengths may be reversed and the values=
=20
 will be hugely wrong:
=20
      int[] a =3D [1, 2, 3];
      int[] b =3D [5, 4, 3, 2, 1];
=20
      writefln("%s", b.length - a.length);  // Yup, 2
      writefln("%s", a.length - b.length);  // WAT? 18446744073709551614
=20
 This is why I agree with Don that:
=20
  > Having arr.length return an unsigned type, is a dreadful language
  > mistake.
ah, let range checking catch that. sure, there are edge cases where range checking fails, but not so many. besides, overflows are possible with signed ints too, so what signed length does is simply hiding the bad code. any code reviewer must ring a bell when he sees length subtraction without prior checking, be it signed or unsigned. so there is no reason in negative lengthes anyway.
Nov 19 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
ketmar:

 ah, let range checking catch that.
No thanks, I prefer to not have bugs in the first place.
 besides, overflows are possible with signed ints too,
From my experience in coding in D they are far more unlikely than sign-related bugs of array lengths.
 so what signed length does is simply hiding the bad code.
Signed lengths avoids traps that are quite easy to fall into.
 any code reviewer must ring
 a bell when he sees length subtraction without prior checking,
 be it signed or unsigned.
The unsigned nature of array lengths is trickier than that. It causes trouble even if you just compare (with <) a length with a signed value.
Nov 19 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"bearophile"  wrote in message news:lkcltlokangpzzdzzfjg forum.dlang.org...

 From my experience in coding in D they are far more unlikely than 
 sign-related bugs of array lengths.
Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths.

module sizediff;

import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}

The two ways this can fail (that I want to highlight) are:
1. If either file is too large to fit in a size_t the result will (probably) be wrong
2. If file2 is bigger than file1 the result will be wrong

If length was signed, problem 2 would not exist, and problem 1 would be more likely to occur. I think it's clear that signed lengths would work for more possible realistic inputs. While this is just an example, a similar pattern occurs in real code whenever array/range lengths are subtracted.
Nov 21 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 12:31 AM, Daniel Murphy wrote:
 Here's a simple program to calculate the relative size of two files, that will
 not work correctly with unsigned lengths.

 module sizediff;

 import std.file;
 import std.stdio;

 void main(string[] args)
 {
     assert(args.length == 3, "Usage: sizediff file1 file2");
     auto l1 = args[1].read().length;
     auto l2 = args[2].read().length;
     writeln("Difference: ", l1 - l2);
 }

 The two ways this can fail (that I want to highlight) are:
 1. If either file is too large to fit in a size_t the result will (probably) be
 wrong
Presumably read() will throw if the size is larger than it can handle. If it doesn't, this code is not buggy, but read() is.
Nov 21 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Walter Bright"  wrote in message news:m4mua1$shh$1 digitalmars.com...

 Presumably read() will throw if the size is larger than it can handle. If 
 it doesn't, this code is not buggy, but read() is.
You're right, but that's really not the point.
Nov 21 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 12:56 AM, Daniel Murphy wrote:
 "Walter Bright"  wrote in message news:m4mua1$shh$1 digitalmars.com...

 Presumably read() will throw if the size is larger than it can handle.
 If it doesn't, this code is not buggy, but read() is.
You're right, but that's really not the point.
What is your point? (Honest question.) Are you proposing that we make all array lengths signed? -- Andrei
Nov 21 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Andrei Alexandrescu"  wrote in message 
news:m4nn38$1lau$2 digitalmars.com...

 What is your point? (Honest question.)
That using signed integers exclusively eliminates one class of bugs, while making another class only marginally more likely.
 Are you proposing that we make all array lengths signed?
No, I think that ship has sailed. But I recommend avoiding unsigned types for general arithmetic.
Nov 21 2014
next sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Sat, 22 Nov 2014 06:34:11 +1100
schrieb "Daniel Murphy" <yebbliesnospam gmail.com>:

 "Andrei Alexandrescu"  wrote in message=20
 news:m4nn38$1lau$2 digitalmars.com...
=20
 What is your point? (Honest question.)
=20 That using signed integers exclusively eliminates one class of bugs, whil=
e=20
 making another class only marginally more likely.
=20
 Are you proposing that we make all array lengths signed?
=20 No, I think that ship has sailed. But I recommend avoiding unsigned type=
s=20
 for general arithmetic.=20
=20 I think it is more about getting into the right mind set. All hardware integer types are limited and need overflow checking. As someone using unsigned types all the time, all I need to keep in mind are two rules: 1) Overflow: uint number; =E2=80=A6 number =3D 10 * number + ch - '0'; It is handled with: if (number > uint.max / 10) if (number > uint.max - (ch - '0')) An underflow practically doesn't happen with unsigned arithmetic. 2) Subtraction Order Subtract the smaller value from the bigger one. a) Commonly one entity is of greater magnitude than the other: fileSize - offset length - idx b) If both entities are equal before the Lord I make them ordered to make rule 1) hold: if (fileSize1 > fileSize2) { // Do one thing } else { // Do the other thing } The length of an array is perfectly represented by a size_t. My goal is to do the technically correct thing and thereby make overflow bugs impossible. I.e. With unsigned types in general and size_t in particular you cannot pass anything that is prone to underflow/overflow. It is all natural numbers and any overflows must have happened already before the array got indexed. Inside opIndex, the unsigned types simplify the range checks (by the compiler or explicit) by removing the need to test for < 0. At the end of the day I find myself using unsigned types much more frequently than signed types because I find it easier to keep them in check and reason about. --=20 Marco
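Rule 2, condensed into a reusable sketch (the helper name absDiff is invented here):

   // underflow-free absolute difference of two unsigned sizes
   size_t absDiff(size_t a, size_t b)
   {
       return a > b ? a - b : b - a;
   }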
Nov 21 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2014 1:33 PM, Marco Leise wrote:
 As someone using unsigned types all the time, all I need to
 keep in mind are two rules:

 1) Overflow:

    uint number;
    …
    number = 10 * number + ch - '0';

    It is handled with:
    if (number > uint.max / 10)
    if (number > uint.max - (ch - '0'))
You're better off with core.checkedint here, as not only is it correct, but (eventually) it will enable the compiler to optimize it to simply checking the carry flag.
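A sketch of the same digit accumulation using core.checkedint's mulu/addu, which set a sticky overflow flag; the parsing function wrapped around them is illustrative:

import core.checkedint : addu, mulu;

uint parseUint(const(char)[] s)
{
    uint number = 0;
    bool overflow = false;
    foreach (ch; s)
    {
        // each call leaves 'overflow' set once any step wraps around
        number = mulu(number, 10u, overflow);
        number = addu(number, cast(uint)(ch - '0'), overflow);
    }
    if (overflow)
        throw new Exception("value does not fit in a uint");
    return number;
}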
Nov 21 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 21 Nov 2014 18:07:15 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 11/21/2014 1:33 PM, Marco Leise wrote:
 As someone using unsigned types all the time, all I need to
 keep in mind are two rules:

 1) Overflow:

    uint number;
    …
    number = 10 * number + ch - '0';

    It is handled with:
    if (number > uint.max / 10)
    if (number > uint.max - (ch - '0'))

 You're better off with core.checkedint here, as not only is it correct, 
 but (eventually) it will enable the compiler to optimize it to simply 
 checking the carry flag.
Ah right, I keep forgetting that module. By the way, are calls to core.checkedint inlined? Since they are in another module and not templated I'd think not.

-- Marco
Nov 22 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2014 1:07 PM, Marco Leise wrote:
 Am Fri, 21 Nov 2014 18:07:15 -0800
 You're better off with core.checkedint here, as not only is it correct, but
 (eventually) it will enable the compiler to optimize it to simply checking the
 carry flag.
Ah right, I keep forgetting that module.
It is brand new (!) so you're forgiven.
 By the way, are calls
 to core.checkedint inlined? Since they are in another module
 and not templated I'd think not.
Yes, since the import leaves the source for it in there.
Nov 22 2014
prev sibling parent "FrankLike" <1150015857 qq.com> writes:
On Friday, 21 November 2014 at 21:22:55 UTC, Marco Leise wrote:

 Inside opIndex, the unsigned types simplify the range checks
 (by the compiler or explicit) by removing the need to test for
 < 0.
 At the end of the day I find myself using unsigned types much
 more frequently than signed types because I find it easier to
 keep them in check and reason about.
Yes, I think so, except in individual cases.
Nov 22 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 11:34 AM, Daniel Murphy wrote:
 "Andrei Alexandrescu"  wrote in message
 news:m4nn38$1lau$2 digitalmars.com...

 What is your point? (Honest question.)
That using signed integers exclusively eliminates one class of bugs, while making another class only marginally more likely.
 Are you proposing that we make all array lengths signed?
No, I think that ship has sailed. But I recommend avoiding unsigned types for general arithmetic.
That's reasonable. Also I agree that the discussion is moot. -- Andrei
Nov 21 2014
prev sibling next sibling parent reply "Frank Like" <1150015857 qq.com> writes:
 Here's a simple program to calculate the relative size of two 
 files, that will not work correctly with unsigned lengths.

 module sizediff;

 import std.file;
 import std.stdio;

 void main(string[] args)
 {
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
 }
This will be ok:

   writeln("Difference: ", (l1 > l2) ? (l1 - l2) : (l2 - l1));

If 'length''s type is not 'size_t' but 'int' or 'long', it will be ok like this:

   import std.math;
   writeln("Difference: ", abs(l1 - l2));

When taking the mathematical difference of unsigned values, the size comparison should be done first, on the right-hand side of the assignment. If this work is done in druntime, D will be a real system language.
Nov 21 2014
parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Frank Like"  wrote in message news:zhejapfebcvxnzrezcqj forum.dlang.org...

 If this work is done in druntime, D will be a real system language.
Sure, this is obviously the fundamental thing holding D back from being a _real_ system language.
Nov 21 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 19:31:23 +1100
Daniel Murphy via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "bearophile"  wrote in message news:lkcltlokangpzzdzzfjg forum.dlang.org.=
..
=20
 From my experience in coding in D they are far more unlikely than=20
 sign-related bugs of array lengths.
=20 Here's a simple program to calculate the relative size of two files, that=
=20
 will not work correctly with unsigned lengths.
=20
 module sizediff
=20
 import std.file;
 import std.stdio;
=20
 void main(string[] args)
 {
     assert(args.length =3D=3D 3, "Usage: sizediff file1 file2");
     auto l1 =3D args[1].read().length;
     auto l2 =3D args[2].read().length;
     writeln("Difference: ", l1 - l2);
 }
=20
 The two ways this can fail (that I want to highlight) are:
 1. If either file is too large to fit in a size_t the result will (probab=
ly)=20
 be wrong
 2. If file2 is bigger than file1 the result will be wrong
=20
 If length was signed, problem 2 would not exist, and problem 1 would be m=
ore=20
 likely to occur.  I think it's clear that signed lengths would work for m=
ore=20
 possible realistic inputs.
no, the problem 2 just becomes hidden. while the given code works most of the time, it is still broken.
Nov 21 2014
next sibling parent reply "Araq" <rumpf_a web.de> writes:
 no, problem 2 just becomes hidden. while the given code works most
 of the time, it is still broken.
You cannot reliably handle stack overflow or out-of-memory conditions in C, so "fails in extreme edge cases" is true for every piece of software. "Broken" is not a black-and-white thing. "Works most of the time" surely is much more useful than "doesn't work". Otherwise you would throw away your phone the first time you get a busy signal.
Nov 21 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 14:37:39 +0000
Araq via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "broken" is not a black-white thing. "Works most of the time"
 surely is much more useful than "doesn't work". Otherwise you
 would throw away your phone the first time you get a busy signal.
"works most of the time" is the worst thing: the bug can be hidden for decades and then suddenly blows up stright into your face, making you wonder what happens with "good code". i will chose the code which "doesn't work" over "works most of the time" one: the first has a clearly visible problem, and the former has a carefully hidden problem. i prefer visible problems. btw, your phone example is totally wrong, 'case "busy" is a well-defined state. i for sure will throw the phone away if the phone accepts only *some* incoming calls and silently ignores some others (without me explicitly telling it to do so, of course). that's like a code that "works most of the time". but not in that time when they phoning you to tell that your house is on fire.
Nov 21 2014
prev sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 11/21/14, 11:29 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 19:31:23 +1100
 Daniel Murphy via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "bearophile"  wrote in message news:lkcltlokangpzzdzzfjg forum.dlang.org...

  From my experience in coding in D they are far more unlikely than
 sign-related bugs of array lengths.
 Here's a simple program to calculate the relative size of two files, 
 that will not work correctly with unsigned lengths.

 module sizediff;

 import std.file;
 import std.stdio;

 void main(string[] args)
 {
     assert(args.length == 3, "Usage: sizediff file1 file2");
     auto l1 = args[1].read().length;
     auto l2 = args[2].read().length;
     writeln("Difference: ", l1 - l2);
 }

 The two ways this can fail (that I want to highlight) are:
 1. If either file is too large to fit in a size_t the result will 
 (probably) be wrong
 2. If file2 is bigger than file1 the result will be wrong

 If length was signed, problem 2 would not exist, and problem 1 would be 
 more likely to occur. I think it's clear that signed lengths would work 
 for more possible realistic inputs.
 no, problem 2 just becomes hidden. while the given code works most of 
 the time, it is still broken.
So how would you solve problem 2?
Nov 21 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 21 Nov 2014 14:36:53 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 11/21/14, 11:29 AM, ketmar via Digitalmars-d wrote:
 On Fri, 21 Nov 2014 19:31:23 +1100
 Daniel Murphy via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 "bearophile"  wrote in message news:lkcltlokangpzzdzzfjg forum.dlang.org...

 From my experience in coding in D they are far more unlikely than
 sign-related bugs of array lengths.
 Here's a simple program to calculate the relative size of two files, 
 that will not work correctly with unsigned lengths.

 module sizediff;

 import std.file;
 import std.stdio;

 void main(string[] args)
 {
      assert(args.length == 3, "Usage: sizediff file1 file2");
      auto l1 = args[1].read().length;
      auto l2 = args[2].read().length;
      writeln("Difference: ", l1 - l2);
 }

 The two ways this can fail (that I want to highlight) are:
 1. If either file is too large to fit in a size_t the result will 
 (probably) be wrong
 2. If file2 is bigger than file1 the result will be wrong

 If length was signed, problem 2 would not exist, and problem 1 would 
 be more likely to occur. I think it's clear that signed lengths would 
 work for more possible realistic inputs.
 no, problem 2 just becomes hidden. while the given code works most
 of the time, it is still broken.

 So how would you solve problem 2?
with a proper check before doing the subtraction. or by switching to some Scheme compiler with a full numeric tower.
Nov 21 2014
prev sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Wednesday, 19 November 2014 at 16:02:50 UTC, David Gileadi 
wrote:
 On 11/19/14, 6:57 AM, ketmar via Digitalmars-d wrote:
 On Wed, 19 Nov 2014 13:47:50 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 If I have two pencils of length 10 cm and 15 cm, then the first
 one is -5 cm longer than the other.
and again "length" is not a relation. show me pencil of length -10 cm. when you substractin lengthes, you got completely different type as a result, not "length" anymore. ah, that untyped real-life math! ;-)
To me the salient point is that this is not just a mess with real-life math but also with math in D: lengths are unsigned but people subtract them all the time. If they're (un)lucky this will return correct values for a while, but then someday the lengths may be reversed and the values will be hugely wrong: int[] a = [1, 2, 3]; int[] b = [5, 4, 3, 2, 1]; writefln("%s", b.length - a.length); // Yup, 2 writefln("%s", a.length - b.length); // WAT? 18446744073709551614 This is why I agree with Don that:
 Having arr.length return an unsigned type, is a dreadful
language
 mistake.
I'd say length being unsigned is fine. The real mistake is that the difference between two unsigned values isn't signed, which would be the most "correct" behaviour. Let people cast the result if they want wrapping (or better, use a helper function to document the intention).
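A sketch of such a helper (the name sdiff is invented here); the cast makes the intended wrap-around reinterpretation explicit:

   // signed difference of two unsigned values; wrap-around is deliberate
   ptrdiff_t sdiff(size_t a, size_t b)
   {
       return cast(ptrdiff_t)(a - b);
   }

Then sdiff(a.length, b.length) reads as a deliberate signed difference instead of an accidental unsigned one.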
Nov 19 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 19 Nov 2014 18:20:24 +0000
schrieb "Marc Sch=C3=BCtz" <schuetzm gmx.net>:

 I'd say length being unsigned is fine. The real mistake is that=20
 the difference between two unsigned values isn't signed, which=20
 would be the most "correct" behaviour.
Now take my position where I explicitly write code relying on the fact that `bigger - smaller` yields correct results. uint bigger =3D uint.max; uint smaller =3D 2; if (bigger > smaller) { auto added =3D bigger - smaller; // Now 'added' is an int with the value -3 ! } else { auto removed =3D smaller - bigger; } In fact checking which value is larger is the only way to handle the full result range of subtracting two machine integers which is ~2 times larger than what the original type can handle: T.min - T.max .. T.max - T.min This is one reason why I'd like to just keep working with the original unsigned type, but split the range around the positive/negative pivot with an if-else. Implicit conversion of unsigned subtractions to signed values would make the above code unnecessarily hard.
 Let people cast the result=20
 if they want wrapping (or better, use a helper function to=20
 document the intentiion).
--=20 Marco
Nov 21 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/14 1:55 PM, Marco Leise wrote:
 Am Wed, 19 Nov 2014 18:20:24 +0000
 schrieb "Marc Schütz" <schuetzm gmx.net>:

 I'd say length being unsigned is fine. The real mistake is that
 the difference between two unsigned values isn't signed, which
 would be the most "correct" behaviour.
 Now take my position, where I explicitly write code relying on the 
 fact that `bigger - smaller` yields correct results:

    uint bigger = uint.max;
    uint smaller = 2;
    if (bigger > smaller)
    {
        auto added = bigger - smaller;
        // Now 'added' is an int with the value -3 !
    }
    else
    {
        auto removed = smaller - bigger;
    }
Interesting insight. Thanks for the many analytical examples you're giving in this thread. -- Andrei
Nov 21 2014
parent "FrankLike" <1150015857 qq.com> writes:
On Saturday, 22 November 2014 at 01:57:05 UTC, Andrei 
Alexandrescu wrote:
 On 11/21/14 1:55 PM, Marco Leise wrote:
 Am Wed, 19 Nov 2014 18:20:24 +0000
 schrieb "Marc Schütz" <schuetzm gmx.net>:

 I'd say length being unsigned is fine. The real mistake is 
 that
 the difference between two unsigned values isn't signed, which
 would be the most "correct" behaviour.
 Now take my position, where I explicitly write code relying on the 
 fact that `bigger - smaller` yields correct results:

    uint bigger = uint.max;
    uint smaller = 2;
    if (bigger > smaller)
    {
        auto added = bigger - smaller;
        // Now 'added' is an int with the value -3 !
    }
    else
    {
        auto removed = smaller - bigger;
    }
Interesting insight. Thanks for the many analytical examples you're giving in this thread. -- Andrei
It's right.

------smalltest----------------
import std.stdio;

void main()
{
    uint a = 100;
    int b = 80;
    auto c = a - b;
    auto d = b - a;
    writeln("c is ", c, ", c's type is ", typeid(c));
    writeln("d is ", d, ", d's type is ", typeid(d));
    auto e = b - a;
    writeln("e is ", e, " e's type is ", typeid(e));
    auto f = cast(int)b - cast(int)a;
    writeln("f is ", f, " f's type is ", typeid(f));
}
-------------end-----------------------------

Only c and f's results are ok.
Nov 21 2014
prev sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Friday, 21 November 2014 at 21:44:25 UTC, Marco Leise wrote:
 Am Wed, 19 Nov 2014 18:20:24 +0000
 schrieb "Marc Schütz" <schuetzm gmx.net>:

 I'd say length being unsigned is fine. The real mistake is 
 that the difference between two unsigned values isn't signed, 
 which would be the most "correct" behaviour.
 Now take my position, where I explicitly write code relying on the 
 fact that `bigger - smaller` yields correct results:

    uint bigger = uint.max;
    uint smaller = 2;
    if (bigger > smaller)
    {
        auto added = bigger - smaller;
        // Now 'added' is an int with the value -3 !
    }
    else
    {
        auto removed = smaller - bigger;
    }

 In fact, checking which value is larger is the only way to handle the 
 full result range of subtracting two machine integers, which is ~2 
 times larger than what the original type can handle:

    T.min - T.max .. T.max - T.min

 This is one reason why I'd like to just keep working with the original 
 unsigned type, but split the range around the positive/negative pivot 
 with an if-else. Implicit conversion of unsigned subtractions to 
 signed values would make the above code unnecessarily hard.
Yes, that's true. However, I doubt that this is a common case. I'd say that when two values are to be subtracted (signed or unsigned), and there's no knowledge about which one is larger, it's more useful to get a signed difference. This should be correct in most cases, because I believe it is more likely that the two values are close to each other. It only becomes a problem when they're on opposite sides of the value range.

Unfortunately, no matter how you turn it, there will always be corner cases that a) will be wrong and b) the compiler will allow silently. So the question becomes one of preferences between usefulness for common use cases, ease of detection of errors, and compatibility.
Nov 22 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 22 November 2014 at 11:12:06 UTC, Marc Schütz wrote:
 I'd say that when two values are to be subtracted (signed or 
 unsigned), and there's no knowledge about which one is larger, 
 it's more useful to get a signed difference. This should be 
 correct in most cases, because I believe it is more likely that 
 the two values are close to each other. It only becomes a 
 problem when they're an opposite sides of the value range.
Not being able to decrement unsigned types would be a disaster. Think about unsigned integers as an enumeration. You should be able to take both the predecessor and the successor of a value. This is also in line with how you formalize natural numbers in math:

   0 == zero
   1 == successor(zero)
   2 == successor(successor(zero))

This is basically a unary representation of natural numbers, and it allows both addition and subtraction. Unsigned int should be considered a binary representation of the same, capped at a max value.

Bearophile gave a sensible solution a long time ago: make type coercion explicit and add a weaker coercion operator. That operator should prevent senseless type coercion, but allow system-level coercion over signedness. Problem fixed.
Nov 22 2014