digitalmars.D - bearophile can say "i told you so" (re uint->int implicit conv)

reply "Adam D. Ruppe" <destructionator gmail.com> writes:
I was working on a project earlier today that stores IP addresses 
in a database as a uint. For some reason though, some addresses 
were coming out as 0.0.0.0, despite the fact that if(ip == 0) 
return; in the only place it actually saves them (which was my 
first attempted quick fix for the bug).

Turns out the problem was this:

if (arg == typeid(uint)) {
	int e = va_arg!uint(_argptr);
	a = to!string(e);
}


See, I copy/pasted it from the int check, but didn't update the 
type on the left hand side. So it correctly pulled a uint out of 
the varargs, but then assigned it to an int, which the compiler 
accepted silently, so to!string() printed -blah instead of 
bigblah... which then got truncated by the database, resulting in 
zero being stored.

I've since changed it to be "auto e = ..." and it all works 
correctly now.
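
For readers who want to reproduce the effect, here is a minimal sketch. It assumes nothing about the real project: `viaInt` and `viaAuto` are hypothetical stand-ins for the buggy and the fixed branch, and the value is 192.168.0.1 packed into 32 bits.

```d
import std.conv : to;

// Mirrors the buggy branch: the uint is copied into an int first.
string viaInt(uint value)
{
    int e = value;       // compiles silently; the bit pattern is reinterpreted
    return to!string(e);
}

// Mirrors the fix: auto infers uint from the right-hand side.
string viaAuto(uint value)
{
    auto e = value;
    return to!string(e);
}

unittest
{
    // 192.168.0.1 packed into 32 bits has the high bit set.
    uint ip = 0xC0A8_0001;
    assert(viaInt(ip) == "-1062731775");
    assert(viaAuto(ip) == "3232235521");
}
```

Only addresses whose first octet is 128 or higher have the high bit set, which would fit the report that only some addresses were affected.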



Anyway I thought I'd share this just because one of the many 
times bearophile has talked about this as a potentially buggy 
situation, I was like "bah humbug"... and now I've actually been 
there!

I still don't think I'm for changing the language though just 
because of potential annoyances in other places unsigned works 
(such as array.length) but at least I've actually felt the other 
side of the argument in real world code now.
Mar 28 2013
next sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 28-03-2013 21:03, Adam D. Ruppe wrote:
 I was working on a project earlier today that stores IP addresses in a
 database as a uint. For some reason though, some addresses were coming
 out as 0.0.0.0, despite the fact that if(ip == 0) return; in the only
 place it actually saves them (which was my first attempted quick fix for
 the bug).

 Turns out the problem was this:

 if (arg == typeid(uint)) {
      int e = va_arg!uint(_argptr);
      a = to!string(e);
 }


 See, I copy/pasted it from the int check, but didn't update the type on
 the left hand side. So it correctly pulled a uint out of the varargs,
 but then assigned it to an int, which the compiler accepted silently, so
 to!string() printed -blah instead of bigblah... which then got truncated
 by the database, resulting in zero being stored.

 I've since changed it to be "auto e = ..." and it all works correctly now.



 Anyway I thought I'd share this just because one of the many times
 bearophile has talked about this as a potentially buggy situation, I was
 like "bah humbug"... and now I've actually been there!

 I still don't think I'm for changing the language though just because of
 potential annoyances in other places unsigned works (such as
 array.length) but at least I've actually felt the other side of the
 argument in real world code now.

This is exactly why many new languages only allow implicit integer conversions where the target type is strictly a >= type with the same sign, i.e. uint -> ulong, short -> int, and so on. It is indeed very unfortunate that we have these dangerous implicit conversions in D. I would welcome a change to remove them (because it would likely catch real bugs in many cases).

...

And, you know, many other changes to the language/compiler over the last couple of releases have broken plenty of my code. I wonder when we'll finally say "this is the D programming language, period". The current situation where some breaking changes are perfectly OK while others are not is kind of ridiculous.

I'm personally in favor of fixing some of the serious issues we have in the language once and for all and then *finally* stabilizing the language. It's ridiculous that we claim the language to be stable (or stabilizing) while we're still actively breaking real code to fix language issues. Fixing language issues is good and we should do it more so we can actually get to a point where we can call D stable. The current situation where some changes get blocked because the reviewer happens to be in a "D is stable" mood is -- sorry, but really -- stupid.

I used to even tell people "we're stabilizing D" when they ask why we don't fix some particular language design issue. I don't anymore, because I realized just how ridiculous this situation has gotten.

Well... end of rant.

-- 
Alex Rønne Petersen
alex alexrp.com / alex lycus.org
http://alexrp.com / http://lycus.org
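
The rule described in the first paragraph (same sign, target at least as wide) can be written down as a compile-time predicate. This is only a sketch: `isSameSignWidening` is a hypothetical name, not something from Phobos or from any of the languages alluded to.

```d
import std.traits : isIntegral, isSigned;

// Hypothetical predicate: true when From -> To keeps the sign category
// and does not shrink the type.
enum bool isSameSignWidening(From, To) =
    isIntegral!From && isIntegral!To &&
    isSigned!From == isSigned!To &&
    From.sizeof <= To.sizeof;

static assert( isSameSignWidening!(uint, ulong));
static assert( isSameSignWidening!(short, int));
static assert(!isSameSignWidening!(uint, int));  // the conversion from the original post
static assert(!isSameSignWidening!(long, int));
```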
Mar 28 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Thu, 28 Mar 2013 21:03:07 +0100
"Adam D. Ruppe" <destructionator gmail.com> wrote:
 which then got truncated by the database,

While I won't necessarily disagree with the rest, that right there is "the real WTF". A database that silently alters data is unreliable, and therefore fundamentally broken as a database. It should have raised an error instead.

Is this MySQL, by any chance? And if so, are you making sure to use strict-mode? That might help. From what I can tell, having strict mode disabled is basically MySQL's "please fuck up half of my data" feature. Not that I necessarily trust its strict mode to always be right (which could very well be unfounded pessimism on my part), but it should at least help.
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 28 March 2013 at 21:29:55 UTC, Nick Sabalausky wrote:
 Is this MySQL, by any chance?

Yes, and no on strict mode, I didn't even know it had one!
Mar 28 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

 if (arg == typeid(uint)) {
 	int e = va_arg!uint(_argptr);
 	a = to!string(e);
 }


 See, I copy/pasted it from the int check, but didn't update the 
 type on the left hand side. So it correctly pulled a uint out 
 of the varargs, but then assigned it to an int, which the 
 compiler accepted silently,

If you remove the implicit uint==>int assignment from D you have to add many cast() in the code. And casts are dangerous, maybe even more than implicit casts. That's why D is the way it is. Maybe here a cast(signed) is a bit safer.

I didn't write a Bugzilla request to remove the implicit uint==>int assignment. (I think the signed-unsigned comparisons are more dangerous than those signed-unsigned assignments. But maybe that too is a problem with no solution.)

------------------

Alex Rønne Petersen:

I'm personally in favor of fixing some of the serious issues we 
have in

That's quite hard to do because the problems are not easy to fix/improve, it takes time and a _lot_ of thinking. You can't quickly fix "shared", memory ownership problems, redesign things to not preclude the future creation of a far more parallel GC, and so on. And even much simpler things like properties need time to be redesigned. Maybe in the D world there's some need for a theoretician, beside Andrei.

But I agree most of the time should now be used facing the larger holes, design problems and missing parts of D, and less on everything else. Because the more time passes, the less easy it becomes to fix/improve those things. It's a shame to have to leave D after all this work just because similar problems get essentially frozen.

Bye,
bearophile
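
The aside about signed-unsigned comparisons can be illustrated with a small sketch. Both function names are hypothetical; the first shows what the mixed comparison computes, the second is one hand-written sign-aware alternative.

```d
// The naive mixed comparison: a is converted to uint before comparing,
// so a negative a turns into a very large value.
bool naiveLess(int a, uint b)
{
    return a < b;
}

// A sign-aware comparison written out by hand: any negative a is
// smaller than every uint; otherwise the cast is lossless.
bool safeLess(int a, uint b)
{
    return a < 0 || cast(uint) a < b;
}

unittest
{
    assert(!naiveLess(-1, 1)); // -1 is treated as 4294967295
    assert(safeLess(-1, 1));
    assert(!safeLess(2, 1));
}
```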
Mar 28 2013
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/28/2013 09:03 PM, Adam D. Ruppe wrote:
 I was working on a project earlier today that stores IP addresses in a
 database as a uint. For some reason though, some addresses were coming
 out as 0.0.0.0, despite the fact that if(ip == 0) return; in the only
 place it actually saves them (which was my first attempted quick fix for
 the bug).

 Turns out the problem was this:

 if (arg == typeid(uint)) {
      int e = va_arg!uint(_argptr);
      a = to!string(e);
 }


 See, I copy/pasted it from the int check, but didn't update the type on
 the left hand side. ...

While I agree that implicit uint <-> int is a bad situation, I think the following practices deserve the larger part of the blame:

- Having too much redundant information in the code.
- Copypasta & edit instead of string mixins / static foreach.

Of course, sometimes there is a significant amount of temptation.

(Also, that code snippet is nowhere near the most convenient line length. Eliminating the temporary completely is a valid option. :o))
Mar 28 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 29.03.2013 02:28, schrieb Adam D. Ruppe:
 Part of why I did it this way was the annoyance that I can't do a
 variadic template in an interface. I'd REALLY prefer to do it that way
 so there wouldn't be a list of types at all - just plain to!string(foo).

Who says you can't? In fact you can, using the NVI idiom:

http://dpaste.dzfl.pl/d3b6dc77

Kind Regards
Benjamin Thaut
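
Since the paste link may not stay available, here is a minimal sketch of the idea applied to the to!string use case from the quoted post. It is not the content of the paste: `ArgJoiner`, `render`, `renderImpl` and `CommaJoiner` are hypothetical names, and the hook is left public to sidestep the protected/private question.

```d
import std.array : join;
import std.conv : to;

// Hypothetical interface for illustration only.
interface ArgJoiner
{
    // A final template is not virtual, so it may live in an interface
    // and may be variadic. The static type of every argument is known here.
    final string render(Args...)(Args args)
    {
        string[] parts;
        foreach (arg; args)
            parts ~= to!string(arg);
        return renderImpl(parts);
    }

    // The single virtual hook that implementations provide.
    string renderImpl(string[] parts);
}

class CommaJoiner : ArgJoiner
{
    string renderImpl(string[] parts)
    {
        return parts.join(",");
    }
}

unittest
{
    ArgJoiner j = new CommaJoiner;
    assert(j.render(1, "x", 3_000_000_000u) == "1,x,3000000000");
}
```

With this shape there is no list of typeid checks to copy and paste, so the uint value is never routed through an int.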
Mar 29 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 29.03.2013 13:10, schrieb Adam D. Ruppe:
 Is that fairly new in D? I'm almost certain I tried it and it didn't
 work when I originally wrote this code (which was a couple years ago).

Yes it's fairly new. I think dmd 2.060 or something along that line. I tend to use it a lot because it's so awesome ^^

For example my streaming interface for binary streams:

interface IInputStream
{
public:
  final size_t read(T)(ref T data) if(!thBase.traits.isArray!T)
  {
    static assert(!is(T == const) && !is(T == immutable), "can not read into const / immutable value");
    return readImpl((cast(void*)&data)[0..T.sizeof]);
  }

  final size_t read(T)(T data) if(thBase.traits.isArray!T)
  {
    static assert(!is(typeof(data[0]) == const) && !is(typeof(data[0]) == immutable), "can not read into const / immutable array");
    return readImpl((cast(void*)data.ptr)[0..(arrayType!T.sizeof * data.length)]);
  }

  size_t skip(size_t bytes);

protected:
  size_t readImpl(void[] buffer);
}

Kind Regards
Benjamin Thaut
Mar 29 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Mar 28, 2013 at 09:03:07PM +0100, Adam D. Ruppe wrote:
 I was working on a project earlier today that stores IP addresses in
 a database as a uint. For some reason though, some addresses were
 coming out as 0.0.0.0, despite the fact that if(ip == 0) return; in
 the only place it actually saves them (which was my first attempted
 quick fix for the bug).
 
 Turns out the problem was this:
 
 if (arg == typeid(uint)) {
 	int e = va_arg!uint(_argptr);
 	a = to!string(e);
 }
 
 
 See, I copy/pasted it from the int check, but didn't update the type
 on the left hand side. So it correctly pulled a uint out of the
 varargs, but then assigned it to an int, which the compiler accepted
 silently, so to!string() printed -blah instead of bigblah... which
 then got truncated by the database, resulting in zero being stored.

IMO, the compiler should insert bounds checks in non-release mode when implicitly converting between signed and unsigned.

Also, I don't like repeating types, precisely for this reason; if that second line had been written:

	auto e = va_arg!uint(_argptr);

then this bug wouldn't have happened. But once you repeat 'uint' twice, there's the risk that you'll forget to update both instances when changing/copying the code. DRY is a good principle to live by when it comes to coding.

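
What such a compiler-inserted check might amount to can be approximated by hand today. The helper below is a hypothetical sketch, not an existing feature; it relies on the fact that assert is compiled out with -release.

```d
// Hypothetical helper: roughly the check a compiler could insert
// for an implicit uint -> int conversion in non-release builds.
int checkedToInt(uint value)
{
    // Fails for any value with the high bit set; removed by -release.
    assert(value <= int.max, "uint value does not fit in int");
    return cast(int) value;
}

unittest
{
    assert(checkedToInt(5) == 5);
    assert(checkedToInt(int.max) == int.max);
}
```
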
 I've since changed it to be "auto e = ..." and it all works
 correctly now.

Yep! :)
 Anyway I thought I'd share this just because one of the many times
 bearophile has talked about this as a potentially buggy situation, I
 was like "bah humbug"... and now I've actually been there!
 
 I still don't think I'm for changing the language though just
 because of potential annoyances in other places unsigned works (such
 as array.length) but at least I've actually felt the other side of
 the argument in real world code now.

Maybe it's time to introduce cast(signed) or cast(unsigned) to the language, as bearophile suggests?


T

-- 
The state pretends to pay us a salary, and we pretend to work.
Mar 28 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, March 28, 2013 15:11:02 H. S. Teoh wrote:
 Maybe it's time to introduce cast(signed) or cast(unsigned) to the
 language, as bearophile suggests?

It's not terribly pretty, but you can always do this

auto foo = cast(Unsigned!(typeof(var)))var;

or

auto bar = to!(Unsigned!(typeof(var)))(var);

- Jonathan M Davis
Mar 28 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/28/2013 6:17 PM, Jonathan M Davis wrote:
 On Thursday, March 28, 2013 15:11:02 H. S. Teoh wrote:
 Maybe it's time to introduce cast(signed) or cast(unsigned) to the
 language, as bearophile suggests?

It's not terribly pretty, but you can always do this:

http://dlang.org/phobos/std_traits.html#.Unsigned
http://dlang.org/phobos/std_traits.html#.Signed
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 29 March 2013 at 01:18:03 UTC, Jonathan M Davis wrote:
 It's not terribly pretty, but you can always do this

We could also do more C++ looking: unsigned_cast!foo or IFTI or whatever;
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 28 March 2013 at 22:12:57 UTC, H. S. Teoh wrote:
 Also, I don't like repeating types, precisely for this reason; 
 if that second line had been written:

Yeah, I usually don't either, but apparently I did here. Murphy's law at work perhaps!
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 28 March 2013 at 21:58:05 UTC, bearophile wrote:
 I didn't write a Bugzilla request to remove the implicit 
 uint==>int assignment. (I think the signed-unsigned comparisons 
 are more dangerous than those signed-unsigned assignments. But 
 maybe too is a problem with no solution).

Oh maybe I got it mixed up, but I definitely remember talking about signed/unsigned something with you before!
Mar 28 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, March 29, 2013 02:19:49 Adam D. Ruppe wrote:
 On Friday, 29 March 2013 at 01:18:03 UTC, Jonathan M Davis wrote:
 It's not terribly pretty, but you can always do this

We could also do more C++ looking: unsigned_cast!foo or IFTI or whatever;

It would be pretty trivial to add a wrapper function to make it cleaner. I was just pointing out that we already provided a way to cast to an unsigned type of the same size without needing to add anything to the language. - Jonathan M Davis
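
A sketch of what such a wrapper could look like, assuming only `std.traits.Signed` and `std.traits.Unsigned`. The names `asUnsigned` and `asSigned` are hypothetical and chosen to avoid clashing with anything in Phobos.

```d
import std.traits : Signed, Unsigned, isIntegral;

// Hypothetical wrappers: reinterpret as the unsigned/signed type of the
// same size, so the width can never change by accident.
auto asUnsigned(T)(T value) if (isIntegral!T)
{
    return cast(Unsigned!T) value;
}

auto asSigned(T)(T value) if (isIntegral!T)
{
    return cast(Signed!T) value;
}

unittest
{
    int i = -1;
    auto u = asUnsigned(i);
    static assert(is(typeof(u) == uint));
    assert(u == uint.max);

    long big = 5;
    static assert(is(typeof(asUnsigned(big)) == ulong)); // width is preserved

    assert(asSigned(uint.max) == -1);
}
```

Unlike a hand-written cast(uint), the target type is derived from the argument, so a long can never be silently truncated to 32 bits.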
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 28 March 2013 at 22:04:36 UTC, Timon Gehr wrote:
 - Copypasta & edit instead of string mixins / static foreach.

Part of why I did it this way was the annoyance that I can't do a variadic template in an interface. I'd REALLY prefer to do it that way so there wouldn't be a list of types at all - just plain to!string(foo).

The actual line in the program is a little longer too more like this:

if(arg == typeid(string) || arg == typeid(immutable(string)) || arg == typeid(const(string)))

It annoyed me that there's so many different typeids even though it really doesn't matter for me here.

But oh well, I got this code to a point where it works (with a few practices I keep in mind) and now I generally don't think about it anymore.
Mar 28 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 29 March 2013 at 01:18:03 UTC, Jonathan M Davis wrote:
 On Thursday, March 28, 2013 15:11:02 H. S. Teoh wrote:
 Maybe it's time to introduce cast(signed) or cast(unsigned) to 
 the
 language, as bearophile suggests?

 It's not terribly pretty, but you can always do this

 auto foo = cast(Unsigned!(typeof(var)))var;

 or

 auto bar = to!(Unsigned!(typeof(var)))(var);

 - Jonathan M Davis

short signed(ushort n){ return cast(short)n; }
int signed(uint n){ return cast(int)n; }
long signed(ulong n){ return cast(long)n; }

int n = va_arg!uint(_argptr).signed;
Mar 28 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 29 March 2013 at 09:26:33 UTC, Benjamin Thaut wrote:
 Who says you can't ? In fact you can using the NVI idiom:

Is that fairly new in D? I'm almost certain I tried it and it didn't work when I originally wrote this code (which was a couple years ago). But it'd be worth redoing it now.

The other place I use runtime varargs is:

// vararg hack so property assignment works right, even with null
string opDispatch(string field, string file = __FILE__, size_t line = __LINE__)(...)

I think there's a better way to do that now too. I'll have to spend some weekend time on this.
Mar 29 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, March 29, 2013 13:10:02 Adam D. Ruppe wrote:
 On Friday, 29 March 2013 at 09:26:33 UTC, Benjamin Thaut wrote:
 Who says you can't ? In fact you can using the NVI idiom:

didn't work when I originally wrote this code (which was a couple years ago).

It'll work with classes and protected. It's _supposed_ to work with interfaces and private according to TDPL, but AFAIK, that hasn't been implemented yet (though it might be; I don't know). - Jonathan M Davis
Mar 29 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 29.03.2013 20:29, schrieb Jonathan M Davis:
 No. -w makes it so that warnings are errors, so you generally can't make
 anything a warning unless you're willing for it to be treated as an error at
 least some of the time (and a lot of people compile with -w), and this sort of
 thing is _supposed_ to work without a warning - primarily because if it
 doesn't, you're forced to cast all over the place when you're dealing with
 both signed and unsigned types, and the casts actually make your code more
 error-prone, because you could end up casting something other than uint to int
 or int to uint by accident (e.g. long to uint) and end up with bugs due to
 that.

Reading this tells me two things:

1) The D-Cast is seriously broken, the default behavior should not be one that "breaks" stuff if you don't use it right. I personally really like the idea of having different types of casts. Some of which still do checks and others that just do what you want because you know what you are doing.

2) The library needs something like an int_cast which checks casts from one integer type to another and asserts / throws on error. (For an example see https://github.com/Ingrater/thBase/blob/master/src/thBase/casts.d#L28)

Kind Regards
Benjamin Thaut
Mar 29 2013
prev sibling next sibling parent "Minas Mina" <minas_mina1990 hotmail.co.uk> writes:
Consider:
uint u = ...;
int x = u;

Wouldn't a warning be enough?
Mar 29 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, March 29, 2013 17:27:10 Minas Mina wrote:
 Consider:
 uint u = ...;
 int x = u;
 
 Wouldn't a warning be enough?

No. -w makes it so that warnings are errors, so you generally can't make anything a warning unless you're willing for it to be treated as an error at least some of the time (and a lot of people compile with -w), and this sort of thing is _supposed_ to work without a warning - primarily because if it doesn't, you're forced to cast all over the place when you're dealing with both signed and unsigned types, and the casts actually make your code more error-prone, because you could end up casting something other than uint to int or int to uint by accident (e.g. long to uint) and end up with bugs due to that.

There are definitely cases where it would be nice to warn about conversions between signed and unsigned values, but there's a definite cost to it as well, so the situation is not at all clear cut.

- Jonathan M Davis
Mar 29 2013
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Friday, 29 March 2013 at 19:38:32 UTC, Benjamin Thaut wrote:

 2) The library needs something like an int_cast which checks 
 casts from one integer type to another and asserts / throws on 
 error.

uint value = 3408924;
auto v = std.conv.to!int(value);

Exception is thrown if an overflow occurs.
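
A slightly fuller sketch of that behaviour, assuming only `std.conv.to` and `std.conv.ConvOverflowException`. The wrapper `toIntOr` is a hypothetical name used to make the two outcomes visible.

```d
import std.conv : ConvOverflowException, to;

// Hypothetical wrapper: returns the converted value, or fallback when
// std.conv.to rejects the conversion because the value does not fit.
int toIntOr(uint value, int fallback)
{
    try
        return to!int(value);
    catch (ConvOverflowException)
        return fallback;
}

unittest
{
    assert(toIntOr(3408924, -1) == 3408924); // fits, converted as-is
    assert(toIntOr(uint.max, -1) == -1);     // does not fit, exception caught
}
```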
Mar 29 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 If you remove the implicit uint==>int assignment from D you 
 have to add many cast() in the code. And casts are dangerous, 
 maybe even more than implicit casts.

On the other hand I have not tried D with such a change, so that's just a hypothesis. And maybe library-defined toSigned()/toUnsigned() functions are enough here.

Bye,
bearophile
Mar 29 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 29 March 2013 at 19:29:21 UTC, Jonathan M Davis wrote:
 because if it
 doesn't, you're forced to cast all over the place

There are so many implicit conversions between signed and unsigned? Are they all ok?
Mar 30 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Kagamin:

 Jonathan M Davis wrote:
 because if it
 doesn't, you're forced to cast all over the place

There are so many implicit conversions between signed and unsigned? Are they all ok?

I think Jonathan doesn't have enough proof that forbidding signed<->unsigned implicit casts in D is worse than the current situation that allows them.

Bye,
bearophile
Mar 30 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, March 30, 2013 22:12:30 bearophile wrote:
 Kagamin:
 Jonathan M Davis wrote:
 because if it
 doesn't, you're forced to cast all over the place

There are so many implicit conversions between signed and unsigned? Are they all ok?

I think Jonathan doesn't have enough proof that forbidding signed<->unsigned implicit casts in D is worse than the current situation that allows them.

Walter is the one that you have to convince, and I don't think that that's ever going to happen. - Jonathan M Davis
Mar 30 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 Walter is the one that you have to convince, and I don't think 
 that that's ever going to happen.

I understand. But maybe Walter too doesn't have that proof...

I compile C code with all warnings, and the compiler tells me most cases of mixing signed with unsigned. I usually remove most of them.

I think the Go language doesn't have that implicit cast and Go programmers seem able to survive.

Bye,
bearophile
Mar 30 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
I vaguely remember Walter said those diagnostics are mostly false 
positives. Though I don't remember whether it was about implicit 
conversions.
Mar 31 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
I can say C compilers bug me with InterlockedIncrement function: 
it can be called on both volatile and non-volatile variables so 
type qualification of argument can't always match that of 
parameter and the compiler complains. I found that silly.
Mar 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Kagamin:

 I vaguely remember Walter said those diagnostics are mostly 
 false positives. Though I don't remember whether if was about 
 implicit conversions.

I agree several of them seem innocuous. Bye, bearophile
Mar 31 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 29 March 2013 at 05:34:07 UTC, Kagamin wrote:
 On Friday, 29 March 2013 at 01:18:03 UTC, Jonathan M Davis 
 wrote:
 On Thursday, March 28, 2013 15:11:02 H. S. Teoh wrote:
 Maybe it's time to introduce cast(signed) or cast(unsigned) 
 to the
 language, as bearophile suggests?

 It's not terribly pretty, but you can always do this

 auto foo = cast(Unsigned!(typeof(var)))var;

 or

 auto bar = to!(Unsigned!(typeof(var)))(var);

 - Jonathan M Davis

 short signed(ushort n){ return cast(short)n; }
 int signed(uint n){ return cast(int)n; }
 long signed(ulong n){ return cast(long)n; }

 int n = va_arg!uint(_argptr).signed;

BTW phobos already has the function:
http://dlang.org/phobos/std_traits.html#.unsigned
I'm not sure if it's enough without a `signed` counterpart.
Apr 01 2013
prev sibling next sibling parent reply "Don" <turnyourkidsintocash nospam.com> writes:
On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe wrote:
 I was working on a project earlier today that stores IP 
 addresses in a database as a uint. For some reason though, some 
 addresses were coming out as 0.0.0.0, despite the fact that 
 if(ip == 0) return; in the only place it actually saves them 
 (which was my first attempted quick fix for the bug).

 Turns out the problem was this:

 if (arg == typeid(uint)) {
 	int e = va_arg!uint(_argptr);
 	a = to!string(e);
 }


 See, I copy/pasted it from the int check, but didn't update the 
 type on the left hand side. So it correctly pulled a uint out 
 of the varargs, but then assigned it to an int, which the 
 compiler accepted silently, so to!string() printed -blah 
 instead of bigblah... which then got truncated by the database, 
 resulting in zero being stored.

 I've since changed it to be "auto e = ..." and it all works 
 correctly now.



 Anyway I thought I'd share this just because one of the many 
 times bearophile has talked about this as a potentially buggy 
 situation, I was like "bah humbug"... and now I've actually 
 been there!

 I still don't think I'm for changing the language though just 
 because of potential annoyances in other places unsigned works 
 (such as array.length) but at least I've actually felt the 
 other side of the argument in real world code now.

IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed.

The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.
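
One commonly cited case of array.length misbehaving as a "positive integer" is subtracting from the length of an empty array. The sketch below uses two hypothetical helper functions to show the unsigned result next to a signed one.

```d
// Unsigned arithmetic: for an empty array, 0 - 1 wraps to size_t.max.
size_t lastIndexNaive(const int[] arr)
{
    return arr.length - 1;
}

// Signed arithmetic: the length is converted first, so the result is -1.
ptrdiff_t lastIndexSigned(const int[] arr)
{
    return cast(ptrdiff_t) arr.length - 1;
}

unittest
{
    int[] empty;
    assert(lastIndexNaive(empty) == size_t.max);
    assert(lastIndexSigned(empty) == -1);
    assert(lastIndexSigned([10, 20, 30]) == 2);
}
```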
Apr 02 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 3:49 AM, Don wrote:
 IMHO, array.length is *the* place where unsigned does *not* work. size_t
 should be an integer. We're not supporting 16 bit systems, and the few
 cases where a size_t value can potentially exceed int.max could be
 disallowed.

 The problem with unsigned is that it gets used as "positive integer",
 which it is not. I think it was a big mistake that D turned C's
 "unsigned long" into "ulong", thereby making it look more attractive.
 Nobody should be using unsigned types unless they have a really good
 reason. Unfortunately, size_t forces you to use them.

I used to lean a lot more toward this opinion until I got to work on a C++ codebase using signed integers as array sizes and indices. It's a pain all over the code - two tests instead of one or casts all over, more cases to worry about... changing the code to use unsigned throughout ended up being an improvement.

Andrei
Apr 02 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
 I used to lean a lot more toward this opinion until I got to work on a C++
 codebase using signed integers as array sizes and indices. It's an pain all
over
 the code - two tests instead of one or casts all over, more cases to worry
 about... changing the code to use unsigned throughout ended up being an
 improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.
Apr 02 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
 On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
 I used to lean a lot more toward this opinion until I got to work on
 a C++
 codebase using signed integers as array sizes and indices. It's an
 pain all over
 the code - two tests instead of one or casts all over, more cases to
 worry
 about... changing the code to use unsigned throughout ended up being an
 improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.

 Why?

 struct myArr
 {
     int length;
     int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
 }

 -Steve

As I said - either two tests or casts all over. Andrei
Apr 02 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 8:10 PM, Steven Schveighoffer wrote:
 On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright <newshound2 digitalmars.com>
 wrote:
 For example, with a signed array index, a bounds check is two comparisons
 rather than one.

 Why?

 struct myArr
 {
     int length;
     int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
 }

Being able to cast to unsigned implies that the unsigned types exist. So no improvement.
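
The two styles of bounds check under discussion, side by side, as a sketch. The function names are hypothetical, and the cast version assumes `length` is never negative.

```d
// Signed index, checked directly: two comparisons.
bool inBoundsSigned(int idx, int length)
{
    return idx >= 0 && idx < length;
}

// Signed index, checked through a cast: one comparison. A negative idx
// becomes a value above int.max and therefore fails the same test.
bool inBoundsViaCast(int idx, int length)
{
    return cast(uint) idx < cast(uint) length;
}

unittest
{
    foreach (idx; [-5, -1, 0, 3, 4, 100])
        assert(inBoundsSigned(idx, 4) == inBoundsViaCast(idx, 4));
}
```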
Apr 04 2013
prev sibling next sibling parent "renoX" <renozyx gmail.com> writes:
On Tuesday, 2 April 2013 at 07:49:04 UTC, Don wrote:
[cut]
 IMHO, array.length is *the* place where unsigned does *not* 
 work. size_t should be an integer. We're not supporting 16 bit 
 systems, and the few cases where a size_t value can potentially 
 exceed int.max could be disallowed.

 The problem with unsigned is that it gets used as "positive 
 integer", which it is not. I think it was a big mistake that D 
 turned C's  "unsigned long" into "ulong", thereby making it 
 look more attractive. Nobody should be using unsigned types 
 unless they have a really good reason. Unfortunately, size_t 
 forces you to use them.

You forgot something: an explanation why you feel that way..

I do consider unsigned int as "positive integer", why do you think that isn't the case?

IMHO the issues with unsigned are
1) implicit conversion: a C mistake and an even worse mistake to copy it from C knowing that this will lead to many errors!
2) lack of overflow checks by default.
Apr 02 2013
prev sibling next sibling parent "Don" <turnyourkidsintocash nospam.com> writes:
On Tuesday, 2 April 2013 at 08:29:41 UTC, renoX wrote:
 On Tuesday, 2 April 2013 at 07:49:04 UTC, Don wrote:
 [cut]
 IMHO, array.length is *the* place where unsigned does *not* 
 work. size_t should be an integer. We're not supporting 16 bit 
 systems, and the few cases where a size_t value can 
 potentially exceed int.max could be disallowed.

 The problem with unsigned is that it gets used as "positive 
 integer", which it is not. I think it was a big mistake that D 
 turned C's  "unsigned long" into "ulong", thereby making it 
 look more attractive. Nobody should be using unsigned types 
 unless they have a really good reason. Unfortunately, size_t 
 forces you to use them.

You forgot something: an explanation why you feel that way.. I do consider unsigned int as "positive integer", why do you think that isn't the case?

You can actually see it from the name. An unsigned number is exactly that -- it's a value with *no sign*. That's quite different from a positive integer, which is a number where the sign is known to be positive. If it has no sign, that means that the interpretation of the sign requires further information.

For example, it may be the low digits of a multi-byte number. (In fact, in the Intel docs, multi-word operations are the primary reason for the existence of unsigned operations). It might also be a bag of bits.

Mathematically, a positive integer is Z+, just with a limited range. If an operation exceeds the range, it's really an overflow error, the representation has broken down. An uint, however, is a value mod 2^^32, and follows completely normal modular arithmetic rules. It's the responsibility of the surrounding code to add meaning to it.

But very often, people use 'uint' when they really want an int, whose sign bit is zero.

 IMHO the issue with unsigned are
 1) implicit conversion: a C mistake and an even worse mistake 
 to copy it from C knowing that this will lead to many errors!
 2) lack of overflow checks by default.

I'm not sure how (2) is relevant. Note that overflow of unsigned operations is impossible. Only signed numbers can overflow. Unsigned numbers wrap instead, and this is not an error, it's the central feature of their semantics.
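
To make the wrap-around point concrete, here is a minimal D sketch. The helper name wrappingSub is hypothetical and only for illustration; the only assumption is that uint arithmetic is performed modulo 2^^32, as described above.

```d
import std.stdio : writeln;

// Hypothetical helper: plain uint subtraction.
// The result is taken modulo 2^^32, so it wraps instead of going negative.
uint wrappingSub(uint a, uint b)
{
    return a - b;
}

void main()
{
    // 0 - 1 wraps around to the largest uint value.
    writeln(wrappingSub(0, 1) == uint.max);

    // Adding 1 to the largest value wraps back to 0.
    uint big = uint.max;
    big += 1;
    writeln(big == 0);
}
```

Neither operation is flagged by the compiler or at run time; whether the wrapped value is meaningful is up to the surrounding code.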
Apr 02 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 02, 2013 09:49:03 Don wrote:
 On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe wrote:
 I was working on a project earlier today that stores IP
 addresses in a database as a uint. For some reason though, some
 addresses were coming out as 0.0.0.0, despite the fact that
 if(ip == 0) return; in the only place it actually saves them
 (which was my first attempted quick fix for the bug).
 
 Turns out the problem was this:
 
 if (arg == typeid(uint)) {
 
 int e = va_arg!uint(_argptr);
 a = to!string(e);
 
 }
 
 
 See, I copy/pasted it from the int check, but didn't update the
 type on the left hand side. So it correctly pulled a uint out
 of the varargs, but then assigned it to an int, which the
 compiler accepted silently, so to!string() printed -blah
 instead of bigblah... which then got truncated by the database,
 resulting in zero being stored.
 
 I've since changed it to be "auto e = ..." and it all works
 correctly now.
 
 
 
 Anyway I thought I'd share this just because one of the many
 times bearophile has talked about this as a potentially buggy
 situation, I was like "bah humbug"... and now I've actually
 been there!
 
 I still don't think I'm for changing the language though just
 because of potential annoyances in other places unsigned works
 (such as array.length) but at least I've actually felt the
 other side of the argument in real world code now.

IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed. The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.

Naturally, the biggest reason to have size_t be unsigned is so that you can access the whole address space, though on 64-bit machines, that's not particularly relevant, since you're obviously not going to have a machine with that much RAM (you're extremely unlikely to even have a machine with that much hard drive space, though I think that I've heard of some machines existing which have run into that problem on 64-bit machines, as crazy as that would be).

For some people though, it _is_ a big deal on 32-bit machines. For instance, IIRC, David Simcha needed 64-bit support for some of the stuff he was doing (biology stuff I think), because he couldn't address enough memory on a 32-bit machine to do what he was doing. And I know that one of the products where I work is going to have to move to a 64-bit OS, because they're failing at keeping its main process' memory footprint low enough to work on a 32-bit box. Having a signed size_t would make it even worse. Granted, they're using C++, not D, but the issue is the same.

So, it's arguably important on 32-bit machines that size_t be unsigned, but 64-bit doesn't really have that excuse. However, making size_t unsigned on 32-bit machines and signed on 64-bit machines would create its own set of problems, and I suspect that would be an even worse idea than making size_t signed on 64-bit machines.

I do agree though that in general, unsigned types should be used with discretion, and they tend to be overused IMHO. I'm not convinced that that's the case with size_t though, since 32-bit machines do make it a necessity sometimes.

- Jonathan M Davis
Apr 02 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Don:

 But very often, people use 'uint' when they really want an int, 
 whose sign bit is zero.

Sometimes you need the modular nature of unsigned values, and some other times you just need an integer that according to the logic of the program never gets negative and you want the full range of a word, not throwing away one bit, but you don't want it to wrap-around.

In programs I'd like to use:
1) integers of various sizes (with error if you try to go outside their range);
2) subranges of 1 (with error if you try to go outside their range);
3) unsigned integers of various sizes (with error if you try to go outside their range);
4) subranges of 3 (with error if you try to go outside their range);
5) unsigned integers with wrap-around;
6) multi-precision integers.

Bye,
bearophile
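
As a rough illustration of item 3 (unsigned integers that report an error instead of wrapping), here is a sketch built on druntime's core.checkedint module, which was added after this thread. The wrapper names checkedSub and checkedAdd are hypothetical; only subu and addu come from the library.

```d
import core.checkedint : addu, subu;

// Hypothetical wrapper: unsigned subtraction that throws instead of wrapping.
uint checkedSub(uint a, uint b)
{
    bool overflow = false;
    immutable result = subu(a, b, overflow); // sets overflow if a < b
    if (overflow)
        throw new Exception("uint subtraction left the range of uint");
    return result;
}

// Hypothetical wrapper: unsigned addition that throws instead of wrapping.
uint checkedAdd(uint a, uint b)
{
    bool overflow = false;
    immutable result = addu(a, b, overflow); // sets overflow if the sum exceeds uint.max
    if (overflow)
        throw new Exception("uint addition left the range of uint");
    return result;
}

void main()
{
    assert(checkedSub(7, 5) == 2);
    try
    {
        auto r = checkedSub(5, 7); // would wrap with plain uint arithmetic
        assert(false, "unreachable");
    }
    catch (Exception e)
    {
        // expected: the range violation was reported
    }
}
```

This covers only the full-width unsigned case; subranges (items 2 and 4) would need a separate type carrying its own bounds.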
Apr 02 2013
prev sibling next sibling parent "Franz" <franziskaner a.com> writes:
On Friday, 29 March 2013 at 19:29:21 UTC, Jonathan M Davis wrote:
 No. -w makes it so that warnings are errors, so you generally 
 can't make
 anything a warning unless you're willing for it to be treated 
 as an error at
 least some of the time (and a lot of people compile with -w), 
 and this sort of
 thing is _supposed_ to work without a warning - primarily 
 because if it
 doesn't, you're forced to cast all over the place when you're 
 dealing with
 both signed and unsigned types, and the casts actually make 
 your code more
 error-prone, because you could end up casting something other 
 than uint to int
 or int to uint by accident (e.g. long to uint) and end up with 
 bugs due to
 that.

from unsigned to signed and vice-versa. When I sum 2 short values I am forced to manually cast the result to short if I want to assign it to a short variable. Isn't that prone to errors, too? Yet the compiler forces me to cast. I really think we should eliminate this discrepancy.
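
The discrepancy can be sketched in a few lines of D. The function names formatAsInt and formatAsUint are hypothetical, and the sample value 0xC0A80001 (192.168.0.1 packed into a uint) is only an assumption in the spirit of the IP address bug from the start of the thread.

```d
import std.conv : to;

// short + short is promoted to int, so assigning the sum back to a short
// is rejected unless the programmer writes a cast.
static assert(is(typeof(short.init + short.init) == int));
static assert(!__traits(compiles, { short a, b; short c = a + b; }));
static assert(__traits(compiles, { short a, b; short c = cast(short)(a + b); }));

// uint -> int has the same size and is accepted implicitly.
static assert(__traits(compiles, { uint u; int i = u; }));

// Hypothetical reproduction of the copy/paste bug: the uint is silently
// reinterpreted as an int before being formatted.
string formatAsInt(uint value)
{
    int e = value; // compiles without any diagnostic
    return to!string(e);
}

// The fix from the thread: let the compiler infer the type.
string formatAsUint(uint value)
{
    auto e = value; // e is a uint
    return to!string(e);
}

void main()
{
    assert(formatAsInt(0xC0A80001) == "-1062731775");
    assert(formatAsUint(0xC0A80001) == "3232235521");
}
```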
Apr 02 2013
prev sibling next sibling parent "Don" <turnyourkidsintocash nospam.com> writes:
On Tuesday, 2 April 2013 at 09:43:37 UTC, Jonathan M Davis wrote:
 On Tuesday, April 02, 2013 09:49:03 Don wrote:
 On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe 
 wrote:
 I was working on a project earlier today that stores IP
 addresses in a database as a uint. For some reason though, 
 some
 addresses were coming out as 0.0.0.0, despite the fact that
 if(ip == 0) return; in the only place it actually saves them
 (which was my first attempted quick fix for the bug).
 
 Turns out the problem was this:
 
 if (arg == typeid(uint)) {
 
 int e = va_arg!uint(_argptr);
 a = to!string(e);
 
 }
 
 
 See, I copy/pasted it from the int check, but didn't update 
 the
 type on the left hand side. So it correctly pulled a uint out
 of the varargs, but then assigned it to an int, which the
 compiler accepted silently, so to!string() printed -blah
 instead of bigblah... which then got truncated by the 
 database,
 resulting in zero being stored.
 
 I've since changed it to be "auto e = ..." and it all works
 correctly now.
 
 
 
 Anyway I thought I'd share this just because one of the many
 times bearophile has talked about this as a potentially buggy
 situation, I was like "bah humbug"... and now I've actually
 been there!
 
 I still don't think I'm for changing the language though just
 because of potential annoyances in other places unsigned 
 works
 (such as array.length) but at least I've actually felt the
 other side of the argument in real world code now.

IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed. The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.

Naturally, the biggest reason to have size_t be unsigned is so that you can access the whole address space, though on 64-bit machines, that's not particularly relevant, since you're obviously not going to have a machine with that much RAM (you're extremely unlikely to even have a machine with that much hard drive space, though I think that I've heard of some machines existing which have run into that problem on 64-bit machines, as crazy as that would be). For some people though, it _is_ a big deal on 32-bit machines. For instance, IIRC, David Simcha needed 64-bit support for some of the stuff he was doing (biology stuff I think), because he couldn't address enough memory on a 32-bit machine to do what he was doing. And I know that one of the products where I work is going to have to move to a 64-bit OS, because they're failing at keeping its main process' memory footprint low enough to work on a 32-bit box. Having a signed size_t would make it even worse. Granted, they're using C++, not D, but the issue is the same.

My feeling is that, since the 16 bit days, using more than half of the address space is such an unusual activity that it deserves special treatment in the code. I don't think it's unreasonable to require a cast for every use of those super-sized sizes. Even if you have an array which doesn't fit into an int, you can only have one such array in your program! This really, really obscure corner case doesn't deserve to be polluting the language. All those signed/unsigned issues basically come from it. It's a helluva price to pay.

It's looking like an even worse deal now, because anybody with large memory requirements will be on 64 bits. We've made this sacrifice for the sake of a situation that is no longer relevant.
Apr 02 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
 I used to lean a lot more toward this opinion until I got to work on a  
 C++
 codebase using signed integers as array sizes and indices. It's an pain  
 all over
 the code - two tests instead of one or casts all over, more cases to  
 worry
 about... changing the code to use unsigned throughout ended up being an
 improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.

Why?

struct myArr
{
	int length;
	int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
}

-Steve
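
A small sketch of why the single cast-based comparison is equivalent to the two signed tests, assuming length is never negative. The function names inBoundsTwoTests and inBoundsOneTest are hypothetical.

```d
// The obvious signed check: two comparisons.
bool inBoundsTwoTests(int idx, int length)
{
    return idx >= 0 && idx < length;
}

// The cast-based check: one comparison.
// A negative idx reinterpreted as uint is at least 2^^31, which is larger
// than any non-negative int length, so it always fails the test.
// Assumption: length >= 0.
bool inBoundsOneTest(int idx, int length)
{
    return cast(uint) idx < cast(uint) length;
}

void main()
{
    enum length = 10;
    foreach (idx; -20 .. 20)
        assert(inBoundsOneTest(idx, length) == inBoundsTwoTests(idx, length));
    assert(!inBoundsOneTest(int.min, length));
    assert(!inBoundsOneTest(int.max, length));
}
```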
Apr 02 2013
prev sibling next sibling parent "Don" <turnyourkidsintocash nospam.com> writes:
On Wednesday, 3 April 2013 at 03:26:54 UTC, Andrei Alexandrescu 
wrote:
 On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
 On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
 I used to lean a lot more toward this opinion until I got to 
 work on
 a C++
 codebase using signed integers as array sizes and indices. 
 It's a
 pain all over
 the code - two tests instead of one or casts all over, more 
 cases to
 worry
 about... changing the code to use unsigned throughout ended 
 up being an
 improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.

Why?

struct myArr
{
	int length;
	int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
}

-Steve

As I said - either two tests or casts all over. Andrei

Yeah, but I think that what this is, is demonstrating what a useful concept a positive integer type is. There's huge value in statically knowing that the sign bit is never negative. Unfortunately, using uint for this purpose gives the wrong semantics, and introduces these signed/unsigned issues, which are basically silly. Personally I suspect there aren't many uses for unsigned types of sizes other than the full machine word. In all the other sizes, a positive integer would be more useful.
Apr 03 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 02 Apr 2013 23:26:54 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
 On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
 I used to lean a lot more toward this opinion until I got to work on
 a C++
 codebase using signed integers as array sizes and indices. It's a
 pain all over
 the code - two tests instead of one or casts all over, more cases to
 worry
 about... changing the code to use unsigned throughout ended up being  
 an
 improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.

Why?

struct myArr
{
	int length;
	int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
}

-Steve

As I said - either two tests or casts all over.

But this is not "all over", it's in one place, for bounds checking.

I find that using unsigned int doesn't really hurt much, but it can make things awkward. For example, it's better to do addition than subtraction:

for(int i = 0; i < arr.length - 1; ++i)
{
	if(arr[i] >= arr[i+1])
		throw new Exception("Not sorted!");
}

This has a bug, and is better written as:

for(int i = 0; i + 1 < arr.length; ++i)

These are the kinds of things that can get you into trouble. With a signed length, then both loops are equivalent, and we don't have that error.

I'm not sure which is better. It feels to me that if you CAN achieve the correct performance (even if this means casting), but the default errs on the side of safety, that might be a better option.

-Steve
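
The bug in the first loop shows up with an empty array: arr.length is an unsigned size_t, so arr.length - 1 wraps around to size_t.max and the loop body runs on an array with no elements. A minimal sketch follows; the function names buggyUpperBound and isStrictlyIncreasing are hypothetical.

```d
// What the buggy loop bound evaluates to.
// For an empty array, 0 - 1 wraps around to size_t.max.
size_t buggyUpperBound(const int[] arr)
{
    return arr.length - 1;
}

// The addition-based form of the loop: nothing is ever subtracted from
// the unsigned length, so the empty array is handled correctly.
bool isStrictlyIncreasing(const int[] arr)
{
    for (size_t i = 0; i + 1 < arr.length; ++i)
    {
        if (arr[i] >= arr[i + 1])
            return false;
    }
    return true;
}

void main()
{
    int[] empty;
    assert(buggyUpperBound(empty) == size_t.max);
    assert(isStrictlyIncreasing(empty));
    assert(isStrictlyIncreasing([1, 2, 3]));
    assert(!isStrictlyIncreasing([1, 3, 2]));
}
```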
Apr 03 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 03 Apr 2013 07:33:05 -0400, Don <turnyourkidsintocash nospam.com>  
wrote:

 Yeah, but I think that what this is, is demonstrating what a useful  
 concept a positive integer type is. There's huge value in statically  
 knowing that the sign bit is never negative. Unfortunately, using uint  
 for this purpose gives the wrong semantics, and introduces these  
 signed/unsigned issues, which are basically silly.

 Personally I suspect there aren't many uses for unsigned types of sizes  
 other than the full machine word. In all the other sizes, a positive  
 integer would be more useful.

Hm.. would it be useful to have a "guaranteed non-negative" integer type? Like array length. Then the compiler could make that assumption, and do something like what I did as an optimization? Subtracting from that type would result in a plain-old int. -Steve
Apr 03 2013
prev sibling next sibling parent "Don" <turnyourkidsintocash nospam.com> writes:
On Wednesday, 3 April 2013 at 14:54:03 UTC, Steven Schveighoffer 
wrote:
 On Wed, 03 Apr 2013 07:33:05 -0400, Don 
 <turnyourkidsintocash nospam.com> wrote:

 Yeah, but I think that what this is, is demonstrating what a 
 useful concept a positive integer type is. There's huge value 
 in statically knowing that the sign bit is never negative. 
 Unfortunately, using uint for this purpose gives the wrong 
 semantics, and introduces these signed/unsigned issues, which 
 are basically silly.

 Personally I suspect there aren't many uses for unsigned types 
 of sizes other than the full machine word. In all the other 
 sizes, a positive integer would be more useful.

Hm.. would it be useful to have a "guaranteed non-negative" integer type? Like array length. Then the compiler could make that assumption, and do something like what I did as an optimization? Subtracting from that type would result in a plain-old int. -Steve

I think it would be extremely useful. I think "always positive" is a fundamental mathematical property that isn't captured by the type system. But I fear the heritage from C just has too much momentum. One thing we could do immediately, without changing anything in the language definition at all, is add range propagation for array length. So that, for any array A, A.length is in the range 0 .. (size_t.max/A[0].sizeof) which would mean that unless A is of type byte, ubyte, void, or char, the length is known to be a positive integer. And of course for a static array, the exact length is known. Although that has very limited applicability (only works within a single expression), I think it might help quite a lot.
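
The bound on A.length can be checked at compile time. The sketch below only illustrates the arithmetic and is not how range propagation is implemented in the compiler; the template names maxLength and lengthFitsSigned are hypothetical.

```d
// Upper bound on the length of a T[] whose elements must all fit in the
// address space: size_t.max / T.sizeof.
enum size_t maxLength(T) = size_t.max / T.sizeof;

// True if that bound fits in the signed type of the same width,
// i.e. the length is known to have a zero sign bit.
enum bool lengthFitsSigned(T) = maxLength!T <= cast(size_t) ptrdiff_t.max;

// Elements of two or more bytes: the length always fits a signed word.
static assert(lengthFitsSigned!short);
static assert(lengthFitsSigned!int);
static assert(lengthFitsSigned!double);

// One-byte elements are the exception mentioned above.
static assert(!lengthFitsSigned!byte);
static assert(!lengthFitsSigned!ubyte);
static assert(!lengthFitsSigned!char);

void main()
{
}
```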
Apr 04 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 2 April 2013 at 09:43:37 UTC, Jonathan M Davis wrote:
 Naturally, the biggest reason to have size_t be unsigned is so 
 that you can
 access the whole address space

Length exists to limit access to memory. If you want unlimited access, use just a pointer.
 For some people though, it _is_ a big deal on 32-bit machines. 
 For
 instance, IIRC, David Simcha need 64-bit support for some of 
 the stuff he was
 doing (biology stuff I think), because he couldn't address 
 enough memory on a
 32-bit machine to do what he was doing. And I know that one of 
 the products
 where I work is going to have to move to 64-bit OS, because 
 they're failing at
 keeping its main process' memory footprint low enough to work 
 on a 32-bit box.
 Having a signed size_t would make it even worse. Granted, 
 they're using C++,
 not D, but the issue is the same.

I'm afraid, those applications are not tied to 32-bit ints. They just want a lot of memory because they have a lot of data. It means they want more than 4 gigs, so uint won't help in the slightest: it can't address more than 4 gigs, and applications will keep failing. There's a technology to use more than 4 gigs on 32-bit system: http://en.wikipedia.org/wiki/Address_Windowing_Extensions but uint still has no advantage over int, as it still can't address all the needed memory (which is more than 4 gigs).
Apr 04 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, April 04, 2013 15:20:26 Kagamin wrote:
 I'm afraid, those applications are not tied to 32-bit ints. They
 just want a lot of memory because they have a lot of data. It
 means they want more than 4 gigs, so uint won't help in the
 slightest: it can't address more than 4 gigs, and applications
 will keep failing.

It's a difference of a factor of 2. You can access twice as much memory with a uint as with an int. It's quite possible to need enough memory that an int wouldn't be enough and a uint would be. Of course, going 64-bit pretty much solves the problem, because you're not going to have enough memory to need anywhere near 64 bits of address space any time soon (and probably not ever), but uint _can_ make a difference on 32-bit machines, because it gives you twice as much memory to play around with.

- Jonathan M Davis
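
For reference, the factor of 2 works out as follows. This is just the arithmetic on the 32-bit limits; the names signedLimit, unsignedLimit and wholeGiB are hypothetical.

```d
// Largest byte count a 32-bit signed / unsigned length can express.
enum ulong signedLimit = int.max;    // 2^^31 - 1
enum ulong unsignedLimit = uint.max; // 2^^32 - 1

// The unsigned limit is twice the signed one, plus one.
static assert(unsignedLimit == 2 * signedLimit + 1);

// Hypothetical helper: number of whole GiB in a byte count.
ulong wholeGiB(ulong bytes)
{
    return bytes >> 30;
}

void main()
{
    assert(wholeGiB(signedLimit + 1) == 2);   // int: just under 2 GiB
    assert(wholeGiB(unsignedLimit + 1) == 4); // uint: just under 4 GiB
}
```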
Apr 04 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
I'm afraid, a factor of 2 is too small. If an application needs 
gigabytes, you'll have hard time trying to convince it to not use 
more than 4 gigs. Or more specifically between 2 and 4 gigs.

Your examples don't specify if those applications needed large 
contiguous allocations (which is another problem in itself), only 
a memory consumption. Actually a program can consume more memory 
in small allocations, because this way it can use fragmented 
address space to the fullest.
Apr 04 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Apr 2013 15:10:28 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 4/2/2013 8:10 PM, Steven Schveighoffer wrote:
 On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright  
 <newshound2 digitalmars.com>
 wrote:
 For example, with a signed array index, a bounds check is two  
 comparisons
 rather than one.

Why?

struct myArr
{
	int length;
	int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
}

Being able to cast to unsigned implies that the unsigned types exist. So no improvement.

The issue is the type of length, not that uints exist. In fact, opIndex can take a uint, and then you don't need any casts, as far as I know:

int opIndex(uint idx) { if(idx >= length) throw new RangeError(); ...}

I think length will be promoted to uint (and it is always positive), so it's fine, only requires one check.

-Steve
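
A fuller sketch of this variant, assuming a backing int[] and a signed length property; the struct MyArr and its data field are hypothetical and stand in for the elided parts of the snippet above.

```d
import core.exception : RangeError;

// Hypothetical array wrapper with a signed length and an unsigned index.
struct MyArr
{
    int[] data;

    // Signed length, assumed to be non-negative.
    int length() const
    {
        return cast(int) data.length;
    }

    int opIndex(uint idx) const
    {
        // length is converted to uint for the comparison, so a single
        // test rejects both too-large and (formerly) negative indices.
        if (idx >= length)
            throw new RangeError();
        return data[idx];
    }
}

void main()
{
    auto a = MyArr([10, 20, 30]);
    assert(a[1] == 20);

    int negative = -1;
    bool threw = false;
    try
    {
        auto v = a[negative]; // -1 converts implicitly to uint.max
    }
    catch (RangeError e)
    {
        threw = true;
    }
    assert(threw);
}
```

Note that the negative index is still accepted by the compiler without a cast; it is the implicit int to uint conversion that turns it into an out-of-range value.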
Apr 04 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
BTW don't we already have a hungry application *with* unsigned 
integers?
http://d.puremagic.com/issues/show_bug.cgi?id=4236
http://d.puremagic.com/issues/show_bug.cgi?id=6498
http://d.puremagic.com/issues/show_bug.cgi?id=3719

http://d.puremagic.com/issues/show_bug.cgi?id=4984 - and who 
reported this?
Apr 04 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, April 04, 2013 21:39:35 Kagamin wrote:
 BTW don't we already have a hungry application *with* unsigned
 integers?
 http://d.puremagic.com/issues/show_bug.cgi?id=4236
 http://d.puremagic.com/issues/show_bug.cgi?id=6498
 http://d.puremagic.com/issues/show_bug.cgi?id=3719
 
 http://d.puremagic.com/issues/show_bug.cgi?id=4984 - and who
 reported this?

I wasn't arguing otherwise. Some applications need 64 bits to do what they do. My point was that with 32-bit programs, using unsigned integers gives you lengths twice as long, so it's quite possible for a 32-bit program to work with size_t being unsigned but not work if it were signed. But regardless of whether size_t is signed or unsigned, there's a limit to how much memory you can deal with in a 32-bit program, and some programs will need to go to 64-bit. It's just that if size_t is unsigned, the limit is higher.

But I would point out that the bugs that you listed are not really related to this discussion. They're about dmd running out of memory when compiling, and it's running out of memory not because it needs 64-bit to have enough memory or because size_t is signed (because it isn't) but because it doesn't reuse memory like it's supposed to. It generally just eats more without releasing it properly. It should be perfectly possible for a 32-bit dmd to compile those programs without running out of memory. And if that issue has anything to do with this discussion, it would be to point out that dmd's problems would be made worse by making size_t signed, which would just underline the fact that making size_t signed on 32-bit systems would make things worse (though dmd is currently written in C++, so whether size_t is signed or unsigned in D doesn't really matter for it at the moment).

- Jonathan M Davis
Apr 04 2013
prev sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 5 April 2013 at 01:26:27 UTC, Jonathan M Davis wrote:
 But I would point out that the bugs that you listed are not at 
 really related
 to this discussion. They're about dmd running out of memory 
 when compiling,
 and it's running out of memory not because it needs 64-bit to 
 have enough
 memory or because size_t is signed (because it is) but because 
 it doesn't
 reuse memory like it's supposed to. It generally just eats more 
 without
 releasing it properly. It should be perfectly possible for a 
 32-bit dmd to
 compile those programs without running out of memory. And if 
 that issue has
 anything to do with this discussion, it would be to point out 
 that dmd's
 problems would be made worse by making size_t signed

How is that, if the problem is not in size_t? If dmd needed a large array, the problem couldn't be solved by properly releasing memory: if the array is needed, no matter what you release, there's nothing you can do about that array. The issues show the real mechanics of memory consumption: when an application needs gigabytes, it won't stop at 4 gigs just *because* there is a 32-bit limit, so if uint buys you anything, it's too negligible to be considered: it's much easier to migrate to 64-bit than to play Russian roulette, pushing the limits of 32-bit to see whether you hit them.
Apr 04 2013