
digitalmars.D - Integer conversions too pedantic in 64-bit

reply dsimcha <dsimcha yahoo.com> writes:
Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
of code to compile in 64-bit mode.  Frankly, the compiler is way too freakin'
pedantic when it comes to implicit conversions (or lack thereof) of
array.length.  99.999% of the time it's safe to assume an array is not going
to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
time than deal with the pedantic errors the rest of the time, because I think
it would be less total time and effort invested.  To force me to either put
casts in my code everywhere or change my entire codebase to use wider integers
(with ripple effects just about everywhere) strikes me as purity winning out
over practicality.
Feb 14 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 15:22:38 dsimcha wrote:
 Now that DMD has a 64-bit beta available, I'm working on getting a whole
 bunch of code to compile in 64-bit mode.  Frankly, the compiler is way too
 freakin' pedantic when it comes to implicit conversions (or lack thereof)
 of array.length.  99.999% of the time it's safe to assume an array is not
 going to be over 4 billion elements long.  I'd rather have a bug the
 0.001% of the time than deal with the pedantic errors the rest of the
 time, because I think it would be less total time and effort invested.  To
 force me to either put casts in my code everywhere or change my entire
 codebase to use wider integers (with ripple effects just about everywhere)
 strikes me as purity winning out over practicality.
I would have thought that you'd be using size_t when dealing with arrays, since that's what their length and indexing type is. - Jonathan M Davis
Feb 14 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
dsimcha wrote:
 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64-bit mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
We dealt with that in updating Phobos/Druntime to 64 bits. The end result was worth it (and yes, there would have been undiscovered bugs without those pedantic checks). Most of the issues are solved if you use auto and foreach where possible, and size_t for the rest of the cases.
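For the cases foreach can't cover, the pattern is simply to type indices and lengths as size_t. A minimal C sketch of the same idea (C and D share size_t here; the function name is purely illustrative):

```c
#include <stddef.h>

/* Index with size_t, the same type as the length, so the loop compiles
   cleanly on both 32- and 64-bit targets with no narrowing casts. */
long sum(const int *a, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; ++i)
        total += a[i];
    return total;
}
```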
Feb 14 2011
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Here's something I've noticed (x86 code):

void main()
{
    ulong size = 2;
    int[] arr = new int[](size);
}

This will error with:
sizetTest.d(8): Error: cannot implicitly convert expression (size) of
type ulong to uint

size_t is aliased to uint since I'm running 32bit.

I'm really not experienced at all with 64bit, so I don't know if it's
good to use uint explicitly (my hunch is that it's not good). uint as
the array size wouldn't even compile in 64bit, right?

If I'm correct, wouldn't it be better if the error showed that it
expects size_t which might be aliased to whatever type for a
particular machine?
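The error is the compiler refusing a narrowing conversion: with a value above 2^32 the low 32 bits would be kept silently. A C sketch of what the rejected conversion would do (uint64_t/uint32_t standing in for D's ulong/uint; the function name is made up for illustration):

```c
#include <stdint.h>

/* Truncating a 64-bit length to 32 bits keeps only the low 32 bits,
   i.e. the value wraps modulo 2^32. This is what an implicit
   ulong -> uint conversion would silently do. */
uint32_t narrow(uint64_t size)
{
    return (uint32_t)size;
}
```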
Feb 14 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):
 
 void main()
 {
     ulong size = 2;
     int[] arr = new int[](size);
 }
 
 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint
 
 size_t is aliased to uint since I'm running 32bit.
 
 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?
Use size_t for all array indices and sizes, and you'll be fine.
 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
The compiler doesn't actually know about size_t.
Feb 14 2011
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
auto, size_t, foreach & friends.

Once we get a stable 64bit compiler it's time to start advertising the
portability of D apps across 32/64bit, methinks! ;)
Feb 14 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):
 
 void main()
 {
     ulong size = 2;
     int[] arr = new int[](size);
 }
 
 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint
 
 size_t is aliased to uint since I'm running 32bit.
 
 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?
 
 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
Use size_t. It's the type which is used. It's aliased to whichever type is appropriate for the architecture. On 32 bits, that would be a 32-bit integer, so it's uint. On 64 bits, that would be a 64-bit integer, so it's ulong. - Jonathan M Davis
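The same split exists in C, where size_t also tracks the architecture. A small sketch of the compile-time facts (using C11 _Static_assert; the 4-or-8 check assumes a mainstream 32- or 64-bit target):

```c
#include <stddef.h>

/* size_t is the unsigned type matching the target's address width:
   these checks pass unchanged on both 32- and 64-bit builds. */
_Static_assert((size_t)-1 > 0, "size_t is unsigned");
_Static_assert(sizeof(size_t) == 4 || sizeof(size_t) == 8,
               "size_t is 32 or 64 bits on mainstream targets");
```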
Feb 14 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 01:56 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):

 void main()
 {
      ulong size = 2;
      int[] arr = new int[](size);
 }

 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint

 size_t is aliased to uint since I'm running 32bit.

 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?

 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
Use size_t. It's the type which is used. It's aliased to whichever type is appropriate for the architecture. On 32 bits, that would be a 32 bit integer, so it's uint. On 64 bits, that would be a 64 bit integer, so it's ulong.
Rename size_t, or rather introduce a meaningful standard alias? (would vote for Natural) Denis -- _________________ vita es estrany spir.wikidot.com
Feb 14 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.1648.1297732015.4748.digitalmars-d puremagic.com...
 On 02/15/2011 01:56 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):

 void main()
 {
      ulong size = 2;
      int[] arr = new int[](size);
 }

 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint

 size_t is aliased to uint since I'm running 32bit.

 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?

 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
Use size_t. It's the type which is used. It's aliased to whichever type is appropriate for the architecture. On 32 bits, that would be a 32 bit integer, so it's uint. On 64 bits, that would be a 64 bit integer, so it's ulong.
 Rename size_t, or rather introduce a meaningful standard alias? (would vote for Natural)
My bikeshed is painted "native" and "word" :)
Feb 14 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:ijcm8d$1lf5$1 digitalmars.com...
 "spir" <denis.spir gmail.com> wrote in message 
 news:mailman.1648.1297732015.4748.digitalmars-d puremagic.com...
 Rename size_t, or rather introduce a meaningful standard alias? (would 
 vote for Natural)
My bikeshed is painted "native" and "word" :)
...With some "wordsize" around the trim.
Feb 14 2011
parent spir <denis.spir gmail.com> writes:
On 02/15/2011 02:55 AM, Nick Sabalausky wrote:
 "Nick Sabalausky"<a a.a>  wrote in message
 news:ijcm8d$1lf5$1 digitalmars.com...
 "spir"<denis.spir gmail.com>  wrote in message
 news:mailman.1648.1297732015.4748.digitalmars-d puremagic.com...
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
My bikeshed is painted "native" and "word" :)
...With some "wordsize" around the trim.
Not bad, but how does "wordsize" convey usage (ordinal = index/position, cardinal = count/length) and semantics (unsigned)? "uint" is rather good; it means about the same as "natural" to me. But it's a bit cryptic and does not adapt to the platform's native word size, unfortunately. I use uint for now to avoid custom (but correct & meaningful) aliases for size_t. (I must have a blockage with using mindless terms like "size_t" ;-) denis
Feb 15 2011
prev sibling parent reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
Feb 14 2011
parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
Feb 15 2011
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 12:50, schrieb spir:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
.length of what? An array? I'm pretty sure it returns size_t. Cheers, - Daniel
Feb 15 2011
next sibling parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 02:01 PM, Daniel Gibson wrote:
 Am 15.02.2011 12:50, schrieb spir:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
.length of what? An array? I'm pretty sure it returns size_t.
unittest {
    int[] ints;
    auto l = ints.length;
    writeln(typeof(l).stringof);
}
press play ;-) denis
Feb 15 2011
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 15:18, schrieb spir:
 On 02/15/2011 02:01 PM, Daniel Gibson wrote:
 Am 15.02.2011 12:50, schrieb spir:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
.length of what? An array? I'm pretty sure it returns size_t.
unittest { int[] ints; auto l = ints.length; writeln(typeof(l).stringof); } press play ;-) denis
void main() {
    size_t x;
    writefln(typeof(x).stringof);
}
try this, too ;-)
Because it's an alias, the information about size_t is gone at runtime and the "real" type is shown: uint in your case (here, gdc on amd64, it's ulong). Cheers, - Daniel
Feb 15 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Daniel Gibson:

 void main() {
    size_t x;
    writefln(typeof(x).stringof);
 }
 try this, too ;-)
 
 Because it's an alias the information about size_t gone at runtime and 
 the "real" type is shown. uint in your case. (Here - gdc on amd64 - it's 
 ulong).
I think both typeof() and stringof are compile-time things. And regarding lost alias information, I suggest doing what Clang does: http://d.puremagic.com/issues/show_bug.cgi?id=5004 Bye, bearophile
Feb 15 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:ijefj9$25sm$1 digitalmars.com...
 Daniel Gibson:

 void main() {
    size_t x;
    writefln(typeof(x).stringof);
 }
 try this, too ;-)

 Because it's an alias the information about size_t gone at runtime and
 the "real" type is shown. uint in your case. (Here - gdc on amd64 - it's
 ulong).
I think both typeof() and stringof are compile-time things. And regarding lost alias information, I suggest doing what Clang does: http://d.puremagic.com/issues/show_bug.cgi?id=5004
That would *really* be nice. In my Goldie parsing lib, I make heavy use of templated aliases to provide maximally-reader-friendly types for strongly-typed tokens (ie, if the programmer desires, each symbol and each production rule has its own type, to ensure maximum compile-time safety). These aliases wrap much less readable internal types. Expecting the user to understand the internal type for any error message is not nice.
Feb 15 2011
prev sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 19:10, schrieb bearophile:
 Daniel Gibson:
 
 void main() {
    size_t x;
    writefln(typeof(x).stringof);
 }
 try this, too ;-)

 Because it's an alias the information about size_t gone at runtime and 
 the "real" type is shown. uint in your case. (Here - gdc on amd64 - it's 
 ulong).
I think both typeof() and stringof are compile-time things. And regarding lost alias information, I suggest doing what Clang does: http://d.puremagic.com/issues/show_bug.cgi?id=5004 Bye, bearophile
Hmm yeah, you're probably right. After sending my reply I thought about that myself. However: by the time typeof() is handled by the compiler, the aliases are already resolved. I agree that an "aka" for alias information in error messages would be helpful in general, but it wouldn't help here.
Feb 15 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 03:25 PM, Daniel Gibson wrote:
 Am 15.02.2011 15:18, schrieb spir:
 On 02/15/2011 02:01 PM, Daniel Gibson wrote:
 Am 15.02.2011 12:50, schrieb spir:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
.length of what? An array? I'm pretty sure it returns size_t.
unittest { int[] ints; auto l = ints.length; writeln(typeof(l).stringof); } press play ;-) denis
void main() {
    size_t x;
    writefln(typeof(x).stringof);
}
try this, too ;-)
Because it's an alias, the information about size_t is gone at runtime and the "real" type is shown: uint in your case (here, gdc on amd64, it's ulong).
Oops, you're right! I had not yet realised that names are de-aliased on output. denis
Feb 15 2011
prev sibling parent Adam Ruppe <destructionator gmail.com> writes:
spir wrote:
 press play
Since size_t is an alias, you wouldn't see its name anywhere except the source code.
Feb 15 2011
prev sibling parent reply Jens Mueller <jens.k.mueller gmx.de> writes:
spir wrote:
 On 02/15/2011 02:01 PM, Daniel Gibson wrote:
Am 15.02.2011 12:50, schrieb spir:
On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
spir wrote:
Rename size_t, or rather introduce a meaningful standard alias? (would
vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint. Denis
.length of what? An array? I'm pretty sure it returns size_t.
unittest { int[] ints; auto l = ints.length; writeln(typeof(l).stringof); } press play ;-)
I do not get it. The above returns uint, which is fine because my dmd v2.051 is 32-bit only. I.e. size_t is an alias to uint (see src/druntime/src/object_.d line 52). But somehow I think you are implying it does not return size_t. This is right in the sense that it does not return the alias name size_t, but it returns the aliased type name, namely uint. What's the problem? This writeln(size_t.stringof); also returns uint. I read that the compiler is free to return either name for an alias, i.e. the name of the alias or the name of the thing it was aliased to (which can be again an alias). I do not understand the rule for stringof (reading http://www.digitalmars.com/d/2.0/property.html#stringof) but I never had a problem. Jens
Feb 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Jens Mueller" <jens.k.mueller gmx.de> wrote in message 
news:mailman.1694.1297781518.4748.digitalmars-d puremagic.com...
 I read that the compiler is free to return whatever name of an alias,
 i.e. either the name of the alias or the name of the thing it was
 aliased to (which can be again an alias). I do not understand the rule
 for stringof (reading
 http://www.digitalmars.com/d/2.0/property.html#stringof) but I never had
 a problem.
DMD itself has never really understood stringof.
Feb 15 2011
prev sibling parent Piotr Szturmaj <bncrbme jadamspam.pl> writes:
spir wrote:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
Maybe ptrint and ptruint?
If ptr means pointer, then it's wrong: size_t is used for more than that, I guess. Strangely enough, while "size" may suggest it, .length does not return a size_t but a uint.
The ptr prefix shows that the int/uint part depends on the CPU word (32/64 bit), i.e. they have the same size as a pointer. However, it may lead to confusion about which type - signed or unsigned - is right for the job.
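For comparison, C99 already ships pointer-sized integer names in exactly this spirit: intptr_t/uintptr_t, which are specified to round-trip a pointer, while size_t only promises to hold any object's size (on common flat-memory targets all of them share the pointer's width). A small sketch, with illustrative helper names:

```c
#include <stdint.h>

/* uintptr_t can carry a pointer through an integer and back;
   this round-trip is the guarantee the "ptr" prefix advertises. */
uintptr_t to_int(const void *p)
{
    return (uintptr_t)p;
}

const void *from_int(uintptr_t v)
{
    return (const void *)v;
}
```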
Feb 15 2011
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-02-15 01:08, Walter Bright wrote:
 dsimcha wrote:
 Now that DMD has a 64-bit beta available, I'm working on getting a
 whole bunch
 of code to compile in 64 mode. Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length. 99.999% of the time it's safe to assume an array is not
 going
 to be over 4 billion elements long. I'd rather have a bug the 0.001%
 of the
 time than deal with the pedantic errors the rest of the time, because
 I think
 it would be less total time and effort invested. To force me to either
 put
 casts in my code everywhere or change my entire codebase to use wider
 integers
 (with ripple effects just about everywhere) strikes me as purity
 winning out
 over practicality.
We dealt with that in updating Phobos/Druntime to 64 bits. The end result was worth it (and yes, there would have been undiscovered bugs without those pedantic checks). Most of the issues are solved if you use auto and foreach where possible, and size_t for the rest of the cases.
Yes, exactly. What's the reason not to use size_t? I've used size_t for lengths and indices in arrays for as long as I've been using D. -- /Jacob Carlborg
Feb 15 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 17:06:43 spir wrote:
 On 02/15/2011 01:56 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):
 
 void main()
 {
 
      ulong size = 2;
      int[] arr = new int[](size);
 
 }
 
 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint
 
 size_t is aliased to uint since I'm running 32bit.
 
 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?
 
 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
Use size_t. It's the type which is used. It's aliased to whichever type is appropriate for the architecture. On 32 bits, that would be a 32 bit integer, so it's uint. On 64 bits, that would be a 64 bit integer, so it's ulong.
Rename size_t, or rather introduce a meaningful standard alias? (would vote for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it? - Jonathan M Davis
Feb 14 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would 
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
Feb 14 2011
next sibling parent reply Don <nospam nospam.com> writes:
Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would 
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
To my mind, a bigger problem is that size_t is WRONG. It should be an integer. NOT unsigned.
Feb 14 2011
next sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 03:11 AM, Don wrote:
 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
To my mind, a bigger problem is that size_t is WRONG. It should be an integer. NOT unsigned.
That would /also/ solve dark-corner issues & bugs. Let us define a standard alias to be used for indices, lengths, and such, and take the opportunity to give it a meaningful name. Then let core and lib functions expect & return integers. But this is a hard path, don't you think? Denis
Feb 15 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 18:11:10 Don wrote:
 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
To my mind, a bigger problem is that size_t is WRONG. It should be an integer. NOT unsigned.
Why exactly should it be signed? You're not going to index an array with a negative value (bounds checking would blow up on that, I would think, though IIRC you can do that in C/C++ - which is a fairly stupid thing to do IMHO). You lose half the possible length of arrays if you have a signed size_t (less of a problem in 64-bit land than in 32-bit land). I don't see any benefit to it being signed other than that you could have a for loop do something like this:

for(size_t i = a.length - 1; i >= 0; --i)

And while that can be annoying at times, it's not like it's all that hard to code around. Is there some low-level reason why size_t should be signed, or something I'm completely missing? - Jonathan M Davis
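That descending loop is the classic unsigned pitfall: with an unsigned i, the condition i >= 0 is always true, so the loop never terminates (the index wraps instead). A C sketch of one safe idiom, testing before the decrement (the function name is illustrative):

```c
#include <stddef.h>

/* Walk a[len-1] .. a[0] with an unsigned index. `i-- > 0` tests first,
   then decrements, so the body sees len-1 down to 0 and the loop stops
   cleanly; len == 0 never enters the body at all. */
long sum_reverse(const int *a, size_t len)
{
    long total = 0;
    for (size_t i = len; i-- > 0; )
        total += a[i];
    return total;
}
```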
Feb 15 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something I'm
 completely missing?
My personal issue with unsigned ints in general, as implemented in C-like languages, is that the range of non-negative signed integers is half the range of the corresponding unsigned integers (for the same size).

* practically: known issues, and bugs if not checked by the language
* conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)

denis
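The practical half of that complaint shows up in mixed signed/unsigned comparisons: the signed operand is converted to unsigned, so a negative value compares as a huge one. A minimal C sketch (the cast makes explicit the conversion the compiler would apply implicitly; the function name is made up):

```c
#include <stddef.h>

/* NOT the mathematical "i < len": a negative i converts to a huge
   unsigned value, so -1 compares as SIZE_MAX rather than as a number
   smaller than len. */
int in_bounds(int i, size_t len)
{
    return (size_t)i < len;
}
```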
Feb 15 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 17:58:17 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size_t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
I believe that t is for type. The same goes for types such as time_t. The size part of the name is probably meant to be short for either word size or pointer size. Personally, I see nothing wrong with size_t and see no reason to change it. If it were a particularly bad name and there was a good suggestion for a replacement, then perhaps I'd support changing it. But I see nothing wrong with size_t at all. - Jonathan M Davis
Feb 14 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1655.1297736016.4748.digitalmars-d puremagic.com...
 I believe that t is for type. The same goes for types such as time_t. The 
 size
 part of the name is probably meant to be short for either word size or 
 pointer
 size.

 Personally, I see nothing wrong with size_t and see no reason to change 
 it. If
 it were a particularly bad name and there was a good suggestion for a
 replacement, then perhaps I'd support changing it. But I see nothing wrong 
 with
 size_t at all.
So it's (modified) Hungarian notation? Didn't that go out with boy bands, Matrix spoofs and dancing CG babies?
Feb 14 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 14, 2011 18:19:35 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1655.1297736016.4748.digitalmars-d puremagic.com...
 
 I believe that t is for type. The same goes for types such as time_t. The
 size
 part of the name is probably meant to be short for either word size or
 pointer
 size.
 
 Personally, I see nothing wrong with size_t and see no reason to change
 it. If
 it were a particularly bad name and there was a good suggestion for a
 replacement, then perhaps I'd support changing it. But I see nothing
 wrong with
 size_t at all.
So it's (modified) hungarian notation? Didn't that go out with boy bands, Matrix spoofs and dancing CG babies?
How is it Hungarian notation? Hungarian notation puts the type of the variable in the name. size_t _is_ the type. I don't see any relation to Hungarian notation. And I'm pretty sure that size_t predates the invention of Hungarian notation by a fair margin anyway. - Jonathan M Davis
Feb 14 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1657.1297736740.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 18:19:35 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1655.1297736016.4748.digitalmars-d puremagic.com...

 I believe that t is for type. The same goes for types such as time_t. 
 The
 size
 part of the name is probably meant to be short for either word size or
 pointer
 size.

 Personally, I see nothing wrong with size_t and see no reason to change
 it. If
 it were a particularly bad name and there was a good suggestion for a
 replacement, then perhaps I'd support changing it. But I see nothing
 wrong with
 size_t at all.
So it's (modified) hungarian notation? Didn't that go out with boy bands, Matrix spoofs and dancing CG babies?
How is it hungarian notation? Hungarian notation puts the type of the variable in the name. size_t _is_ the type. I don't see any relation to hungarian notation. And I'm pretty sure that size_t predates the invention of hungarian notation by a fair margin anyway.
If the "t" means "type", then "size_t" puts "what the symbol is" into the name of the symbol. Even if that *technically* isn't hungarian notation, it's the same basic principle. Aside from that, what's the point of putting "type" in the name of a type? We don't say int_t, float_t, object_t, Widget_t, etc. That'd be stupid. They just simply *are* types. How about making a statement that has "_s" tacked on to the end of its name to specify that it's a statement? "foreach_s", "if_s". It's pointless. If "size" isn't a good name for the type (and it isn't), then the solution is to find a better name, not to tack a "_t" to the end of it. C/C++ has a lot of stupid stuff that C/C++ programmers are used to. Doesn't mean D should copy it.
Feb 14 2011
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
The question is then do you want to be more consistent with the
language (abolish size_t and make something nicer), or be consistent
with the known standards (C99 ISO, et all.).

I'd vote for a change, but I know it will never happen (even though it
just might not be too late if we're not coding for 64 bits yet). It's
hardcoded in the skin of C++ programmers, and Walter is at least one
of them.
Feb 14 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Andrej Mitrovic wrote:
 The question is then do you want to be more consistent with the
 language (abolish size_t and make something nicer), or be consistent
 with the known standards (C99 ISO, et all.).
 
 I'd vote for a change, but I know it will never happen (even though it
 just might not be too late if we're not coding for 64 bits yet). It's
 hardcoded in the skin of C++ programmers, and Walter is at least one
 of them.
We also don't go around renaming should to shud, or use dvorak keyboards. Having to constantly explain that "use 'ourfancyname' instead of size_t, it works exactly the same as size_t" is a waste of our time and potential users' time.
Feb 14 2011
parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 06:51 AM, Walter Bright wrote:
 Andrej Mitrovic wrote:
 The question is then do you want to be more consistent with the
 language (abolish size_t and make something nicer), or be consistent
 with the known standards (C99 ISO, et all.).

 I'd vote for a change, but I know it will never happen (even though it
 just might not be too late if we're not coding for 64 bits yet). It's
 hardcoded in the skin of C++ programmers, and Walter is at least one
 of them.
We also don't go around renaming should to shud, or use dvorak keyboards. Having to constantly explain that "use 'ourfancyname' instead of size_t, it works exactly the same as size_t" is a waste of our time and potential users' time.
Having to constantly explain that "_t" means type, that "size" does not mean size, what this type is supposed to mean instead, what it is used for in core and stdlib functionality, and what programmers are supposed to use it for... isn't this a waste of our time? This, only because the name is mindless? Please, just allow others to have a correct, meaningful (and hopefully style-guide compliant) alternative, defined as a standard just like size_t. And go on using size_t as you like. denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
Feb 15 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 22:20, schrieb Nick Sabalausky:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message 
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't.
They've got to learn some name for it anyway, so why not size_t? This also makes using C functions that take size_t easier and clearer.
 And there's a vast body who are used to C++, so let's just abandon D and 
 make it an implementation of C++ instead.
 
Feb 15 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message 
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
I would agree that D is a complete waste of time if all it consisted of was renaming things.
Feb 15 2011
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-02-15 16:33:33 -0500, Walter Bright <newshound2 digitalmars.com> said:

 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message 
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
I would agree that D is a complete waste of time if all it consisted of was renaming things.
I'm just wondering whether 'size_t', because it is named after its C counterpart, doesn't feel too alien in D, causing people to prefer 'uint' or 'ulong' instead even when they should not. We're seeing a lot of code failing on 64-bit because authors used the fixed-size types which are more D-like in naming. Wouldn't more D-like names that don't look like relics from C -- something like 'word' and 'uword' -- have helped prevent those bugs by making the word-sized type look worth consideration? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 15 2011
next sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 10:49 PM, Michel Fortin wrote:
 On 2011-02-15 16:33:33 -0500, Walter Bright <newshound2 digitalmars.com> said:

 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not
 mean size, what this type is supposed to mean instead, what it is used for
 in core and stdlib functionality, and what programmers are supposed to use
 it for... isn't this a waste of our time? This, only because the name is
 mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
I would agree that D is a complete waste of time if all it consisted of was renaming things.
I'm just wondering whether 'size_t', because it is named after its C counterpart, doesn't feel too alien in D, causing people to prefer 'uint' or 'ulong' instead even when they should not. We're seeing a lot of code failing on 64-bit because authors used the fixed-size types which are more D-like in naming. Wouldn't more D-like names that don't look like relics from C -- something like 'word' and 'uword' -- have helped prevent those bugs by making the word-sized type look worth consideration?
Exactly :-) Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
prev sibling parent Mafi <mafi example.org> writes:
Am 15.02.2011 22:49, schrieb Michel Fortin:
 On 2011-02-15 16:33:33 -0500, Walter Bright <newshound2 digitalmars.com>
 said:

 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does
 not mean size, what this type is supposed to mean instead, what it
 is used for in core and stdlib functionality, and what programmers
 are supposed to use it for... isn't this a waste of our time? This,
 only because the name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
I would agree that D is a complete waste of time if all it consisted of was renaming things.
I'm just wondering whether 'size_t', because it is named after its C counterpart, doesn't feel too alien in D, causing people to prefer 'uint' or 'ulong' instead even when they should not. We're seeing a lot of code failing on 64-bit because authors used the fixed-size types which are more D-like in naming. Wouldn't more D-like names that don't look like relics from C -- something like 'word' and 'uword' -- have helped prevent those bugs by making the word-sized type look worth consideration?
I am also for renaming it. It should begin with u to ensure everybody knows it's unsigned even if there's no signed counterpart. But what we definitely should avoid is having two names for the same thing. It's the same mistake C++ made by inheriting everything from C and _adding_ its own way. Mafi
Feb 16 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijerk4$2u3a$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message 
 news:ijeil4$2aso$3 digitalmars.com...
 spir wrote:
 Having to constantly explain that "_t" means type, that "size" does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?
No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
And there's a vast body who don't. And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
I would agree that D is a complete waste of time if all it consisted of was renaming things.
And since D *does* force C++ users to learn far bigger differences, learning a different name for something is trivial.
Feb 15 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 05:50 AM, Andrej Mitrovic wrote:
 The question is then do you want to be more consistent with the
 language (abolish size_t and make something nicer), or be consistent
 with the known standards (C99 ISO, et all.).

 I'd vote for a change, but I know it will never happen (even though it
 just might not be too late if we're not coding for 64 bits yet). It's
 hardcoded in the skin of C++ programmers, and Walter is at least one
 of them.
We don't need to change in the sense of replace. We just need a /standard/ correct and meaningful alternative. It must be standard to be "shared wealth" of the community, thus defined in the core stdlib or wherever (as opposed to people using their own terms, all different, as I did for a while).

alias size_t GoodTypeName; // always available

Possibly in a while there would be a consensus to get rid of such historic junk as size_t, but that's a different step, and probably a later phase of the language's evolution imo. All we need now is to be able to use a good name for an unsigned type sized to the machine word and usable for indices, lengths, etc. Maybe the #1 type in real code, by the way, or is it string? As long as such a name is not defined as standard, it may be counter-productive for the community, and annoying for others reading our code, to use our own preferred terms. Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 03:26 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 18:19:35 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.1655.1297736016.4748.digitalmars-d puremagic.com...

 I believe that t is for type. The same goes for types such as time_t. The
 size
 part of the name is probably meant to be short for either word size or
 pointer
 size.

 Personally, I see nothing wrong with size_t and see no reason to change
 it. If
 it were a particularly bad name and there was a good suggestion for a
 replacement, then perhaps I'd support changing it. But I see nothing
 wrong with
 size_t at all.
So it's (modified) hungarian notation? Didn't that go out with boy bands, Matrix spoofs and dancing CG babies?
How is it hungarian notation? Hungarian notation puts the type of the variable in the name. size_t _is_ the type. I don't see any relation to hungarian notation. And I'm pretty sure that size_t predates the invention of hungarian notation by a fair margin anyway.
size_t is not the type of size_t ;-) For sure it is Hungarian notation. What is the type of size_t? Type (at least conceptually, even if D does not have live type elements). Just what the name says. denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Nick Sabalausky wrote:
 I've been aware of size_t for years and still 
 don't have the slightest clue WTF that "t" means.
A _t postfix in C is a common convention to signify that an identifier is a type.
Feb 14 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 02:58 AM, Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size-t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
Agreed. While making something different... About the suffix "_t", I bet it means "type"; what do you think? (I may well be wrong; I've just here and there seen custom types like name_t or point_t.) Does anyone have a history of C/C++ at hand? Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 11:30, schrieb spir:
 On 02/15/2011 02:58 AM, Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size-t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
Agreed. While making something different... About the suffix "-_t", I bet it means type, what do you think? (may well be wrong, just because I have here and there seen custom types like name_t or point_t or such) Anyone has a history of C/++ at hand? Denis
I've seen _t in C code for typedef'ed types, like:

struct foo_s { ... };
typedef struct foo_s foo_t;

and then "foo_t myfoo; myfoo.x = 42;" etc. instead of "struct foo_s myfoo; myfoo.x = 42;" etc., and also stuff like:

typedef float vec_t;
typedef vec_t vec3_t[3];

So it is used to indicate that it's an aliased type. I don't see the problem with size_t. It's the type used for sizes. sizeof(foo) (or foo.sizeof in D) uses it. Cheers, - Daniel
Feb 15 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 14 Feb 2011 20:58:17 -0500, Nick Sabalausky <a a.a> wrote:

 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size-t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
Hey, bikeshedders, I found this cool easter-egg feature in D! It's called alias! Don't like the name of something? Well you can change it! alias size_t wordsize; Now, you can use wordsize instead of size_t in your code, and the compiler doesn't care! (in fact, that's all size_t is anyways *hint hint*) ;) -Steve
Feb 15 2011
next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Sometimes I think we should troll the users a little and make
a release with names like so:

alias size_t
TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;

alias ptrdiff_t
TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;

alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;


Cash money says everyone would be demanding an emergency release with
shorter names. We'd argue for months about it... and probably settle
back where we started.
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Adam Ruppe" <destructionator gmail.com> wrote in message 
news:ije0gi$18vo$1 digitalmars.com...
 Sometimes I think we should troll the users a little and make
 a release with names like so:

 alias size_t
 TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;

 alias ptrdiff_t
 TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;

 alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;


 Cash money says everyone would be demanding an emergency release with
 shorter names. We'd argue for months about it... and probably settle
 back where we started.
A small software company I once worked for, Main Sequence Technologies, had their heads so far up their asses it was trivial for me to get posted on TheDailyWTF's Code Snippet of the Day (This company had a rather...interesting...way of creating their "else" clauses). One of the many "Programming 101, Chapter 1" things they had a habit of screwing up was "Use meaningful variable names!". Throughout the codebase (VB6 - yea, that tells you a lot about their level of competence), there were variables like "aaa", "staaa", "bbb", "stbbb", "ccc", etc. Those are actual names they used. (I even found a file-loading function named "save".) Needless to say, trying to understand the twisted codebase enough to actually do anything with it was...well, you can imagine. So I would try to clean things up when I could, in large part just so I could actually keep it all straight in my own mind. Anyway, to bring this all back around to what you said above, there were times when I understood enough about a variable to know it wasn't relevant to whatever my main task was, and therefore didn't strictly need to go wasting even *more* time trying to figure out what the hell the variable actually did. So I ended up in the habit of just renaming those variables to things like: bbb -> thisVariableNeedsAMuchMoreMeaningfulNameThan_bbb That was satisfying ;) Call it "self-documenting code".
Feb 15 2011
parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 10:45 PM, Nick Sabalausky wrote:
 "Adam Ruppe"<destructionator gmail.com>  wrote in message
 news:ije0gi$18vo$1 digitalmars.com...
 Sometimes I think we should troll the users a little and make
 a release with names like so:

 alias size_t
 TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;

 alias ptrdiff_t
 TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;

 alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;


 Cash money says everyone would be demanding an emergency release with
 shorter names. We'd argue for months about it... and probably settle
 back where we started.
A small software company I once worked for, Main Sequence Technologies, had their heads so far up their asses it was trivial for me to get posted on TheDailyWTF's Code Snippet of the Day (This company had a rather...interesting...way of creating their "else" clauses). One of the many "Programming 101, Chapter 1" things they had a habit of screwing up was "Use meaningful variable names!". Throughout the codebase (VB6 - yea, that tells you a lot about their level of competence), there were variables like "aaa", "staaa", "bbb", "stbbb", "ccc", etc. Those are actual names they used. (I even found a file-loading function named "save".) Needless to say, trying to understand the twisted codebase enough to actually do anything with it was...well, you can imagine. So I would try to clean things up when I could, in large part just so I could actually keep it all straight in my own mind. Anyway, to bring this all back around to what you said above, there were times when I understood enough about a variable to know it wasn't relevant to whatever my main task was, and therefore didn't strictly need to go wasting even *more* time trying to figure out what the hell the variable actually did. So I ended up in the habit of just renaming those variables to things like: bbb -> thisVariableNeedsAMuchMoreMeaningfulNameThan_bbb
Did you actually type this yourself, Nick, or do you have a secret prototype of camel-case automaton, based on an English language lexing DFA? denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.1709.1297810216.4748.digitalmars-d puremagic.com...
 On 02/15/2011 10:45 PM, Nick Sabalausky wrote:
 "Adam Ruppe"<destructionator gmail.com>  wrote in message
 news:ije0gi$18vo$1 digitalmars.com...
 Sometimes I think we should troll the users a little and make
 a release with names like so:

 alias size_t
 TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;

 alias ptrdiff_t
 TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;

 alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;


 Cash money says everyone would be demanding an emergency release with
 shorter names. We'd argue for months about it... and probably settle
 back where we started.
A small software company I once worked for, Main Sequence Technologies, had their heads so far up their asses it was trivial for me to get posted on TheDailyWTF's Code Snippet of the Day (This company had a rather...interesting...way of creating their "else" clauses). One of the many "Programming 101, Chapter 1" things they had a habit of screwing up was "Use meaningful variable names!". Throughout the codebase (VB6 - yea, that tells you a lot about their level of competence), there were variables like "aaa", "staaa", "bbb", "stbbb", "ccc", etc. Those are actual names they used. (I even found a file-loading function named "save".) Needless to say, trying to understand the twisted codebase enough to actually do anything with it was...well, you can imagine. So I would try to clean things up when I could, in large part just so I could actually keep it all straight in my own mind. Anyway, to bring this all back around to what you said above, there were times when I understood enough about a variable to know it wasn't relevant to whatever my main task was, and therefore didn't strictly need to go wasting even *more* time trying to figure out what the hell the variable actually did. So I ended up in the habit of just renaming those variables to things like: bbb -> thisVariableNeedsAMuchMoreMeaningfulNameThan_bbb
Did you actually type this yourself, Nick, or do you have a secret prototype of camel-case automaton, based on an English language lexing DFA?
With all the coding I do, holding 'shift' between words is almost as natural to me as hitting 'space' between words. An automated English -> camel-case tool wouldn't need anything fancy though. Just toUpper() the first character after each space and then remove the spaces. I may be missing what you meant, though.
Feb 15 2011
prev sibling parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:
 On Mon, 14 Feb 2011 20:58:17 -0500, Nick Sabalausky <a a.a> wrote:

 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1650.1297733226.4748.digitalmars-d puremagic.com...
 On Monday, February 14, 2011 17:06:43 spir wrote:
 Rename size-t, or rather introduce a meaningful standard alias? (would
 vote
 for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer. And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
Hey, bikeshedders, I found this cool easter-egg feature in D! It's called alias! Don't like the name of something? Well you can change it! alias size_t wordsize; Now, you can use wordsize instead of size_t in your code, and the compiler doesn't care! (in fact, that's all size_t is anyways *hint hint*)
Sure, but it's not the point of this one bikeshedding thread. If you do that, then you're the only one who knows what "wordsize" means. Good, maybe, for app-specific semantic notions (alias Employee[] Staff;); certainly not for types at the highest degree of general purpose like size_t. We need a standard alias. Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 Feb 2011 09:26:21 -0500, spir <denis.spir gmail.com> wrote:

 On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:
 Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
 called
 alias! Don't like the name of something? Well you can change it!

 alias size_t wordsize;

 Now, you can use wordsize instead of size_t in your code, and the  
 compiler
 doesn't care! (in fact, that's all size_t is anyways *hint hint*)
Sure, but it's not the point of this one bikeshedding thread. If you do that, then you're the only one who knows what "wordsize" means. Good, maybe, for app-specific semantic notions (alias Employee[] Staff;); certainly not for types at the highest degree of general purpose like size_t. We need a standard alias.
The standard alias is size_t. If you don't like it, alias it to something else. Why should I have to use something that's unfamiliar to me because you don't like size_t? I guarantee whatever you came up with would not be liked by some people, so they would have to alias it, you can't please everyone. size_t works, it has a precedent, it's already *there*, just use it, or alias it if you don't like it. No offense, but this discussion is among the most pointless I've seen. -Steve
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.vqx78nkceav7ka steve-laptop...
 size_t works,  it has a precedent, it's already *there*, just use it, or 
 alias it if you  don't like it.
One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
Feb 15 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 22:48, schrieb Nick Sabalausky:
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
 news:op.vqx78nkceav7ka steve-laptop...
 size_t works,  it has a precedent, it's already *there*, just use it, or 
 alias it if you  don't like it.
One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
If your only problem with C++ is related to names.. you should probably do that.
Feb 15 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:ijesem$brd$1 digitalmars.com...
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
 news:op.vqx78nkceav7ka steve-laptop...
 size_t works,  it has a precedent, it's already *there*, just use it, or 
 alias it if you  don't like it.
One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
The whole reason I came to D was because, at the time, D was more interested in fixing C++'s idiocy than just merely aping C++ as the theme seems to be now.
Feb 15 2011
next sibling parent Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
Nick Sabalausky wrote:

 "Nick Sabalausky" <a a.a> wrote in message
 news:ijesem$brd$1 digitalmars.com...
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message
 news:op.vqx78nkceav7ka steve-laptop...
 size_t works,  it has a precedent, it's already *there*, just use it, or
 alias it if you  don't like it.
One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
The whole reason I came to D was because, at the time, D was more interested in fixing C++'s idiocy than just merely aping C++ as the theme seems to be now.
I don't see any difference, D has always kept a strong link to its C++ heritage. It's just a matter of what you define as idiocy.
Feb 15 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky <a a.a> wrote:

 "Nick Sabalausky" <a a.a> wrote in message
 news:ijesem$brd$1 digitalmars.com...
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message
 news:op.vqx78nkceav7ka steve-laptop...
 size_t works,  it has a precedent, it's already *there*, just use it,  
 or
 alias it if you  don't like it.
One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
The whole reason I came to D was because, at the time, D was more interested in fixing C++'s idiocy than just merely aping C++ as the theme seems to be now.
Nick, this isn't a feature, it's not a design, it's not a whole language, it's a *single name*, one which is easily changed if you want to change it. module nick; alias size_t wordsize; Now you can use it anywhere, it's sooo freaking simple, I don't understand the outrage. BTW, what I meant by "it's already there" is that any change to the size_t name would have to have some benefit besides "it's a different name" because it will break any code that currently uses it. If this whole argument is to just add another alias, then I'll just stop reading this thread since it has no point. -Steve
Feb 16 2011
parent reply gölgeliyele <usuldan gmail.com> writes:
On 2/16/11 9:09 AM, Steven Schveighoffer wrote:
 On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky <a a.a> wrote:

 "Nick Sabalausky" <a a.a> wrote in message
 module nick;

 alias size_t wordsize;

 Now you can use it anywhere, it's sooo freaking simple, I don't
 understand the outrage.
But that is somewhat selfish. Given size_t causes dissatisfaction with a lot of people, people will start creating their own aliases and then you end up having 5 different versions of it around. If this type is an important one for writing architecture-independent code that can take advantage of architectural limits, then we had better not have 5 different names for it in common code. I don't think changing stuff like this should be disruptive. size_t can be marked deprecated and could be removed in a future release, giving people enough time to adapt. Furthermore, with the 64-bit support in dmd approaching, this is the time to do it, if ever.
Feb 16 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 Feb 2011 09:23:09 -0500, gölgeliyele <usuldan gmail.com> wrote:

 On 2/16/11 9:09 AM, Steven Schveighoffer wrote:
 On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky <a a.a> wrote:

 "Nick Sabalausky" <a a.a> wrote in message
 module nick;

 alias size_t wordsize;

 Now you can use it anywhere, it's sooo freaking simple, I don't
 understand the outrage.
But that is somewhat selfish. Given size_t causes dissatisfaction with a lot of people, people will start creating their own aliases and then you end up having 5 different versions of it around. If this type is an important one for writing architecture-independent code that can take advantage of architectural limits, then we had better not have 5 different names for it in common code.
Sir, you've heard from the men who don't like size_t. But what about the silent masses who do? So we change it. And then people don't like what it's changed to, for example, I might like size_t or already have lots of code that uses size_t. So I alias your new name to size_t in my code. How does this make things better/different? bearophile doesn't like writeln. He uses something else in his libs, it's just an alias. Does that mean we should change writeln? IT'S A NAME!!! one which many are used to using/knowing. Whatever name it is, you just learn it, and once you know it, you just use it. If we hadn't been using it for the last 10 years, I'd say, sure, let's have a vote and decide on a name. You can't please everyone with every name. size_t isn't so terrible that it needs to be changed, so can we focus efforts on actual important things? This is the sheddiest bikeshed argument I've seen in a while. I'm done with this thread... -Steve
Feb 16 2011
parent reply gölgeliyele <usuldan gmail.com> writes:
On 2/16/11 9:45 AM, Steven Schveighoffer wrote:

 I'm done with this thread...

 -Steve
Ok, I don't want to drag on. But there is a reason why we have a style. size_t is against the D style and obviously does not match. I use size_t as much as Walter does in my day job, and I even like it. It just does not fit into D's type names. That is all.
Feb 16 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, February 16, 2011 06:51:21 gölgeliyele wrote:
 On 2/16/11 9:45 AM, Steven Schveighoffer wrote:
 I'm done with this thread...

 -Steve
Ok, I don't want to drag on. But there is a reason why we have a style. size_t is against the D style and obviously does not match. I use size_t as much as Walter does in my day job, and I even like it. It just does not fit into D's type names. That is all.
If we were much earlier in the D development process, then perhaps it would make some sense to change the name. But as it is, it's going to break a lot of code for a simple name change. Lots of C, C++, and D programmers are fine with size_t. I see no reason to break a ton of code just because a few people complain about a name on the mailing list. Not to mention, size_t isn't exactly normal anyway. Virtually every type in D has a fixed size, but size_t is different. It's an alias whose size varies depending on the architecture you're compiling on. As such, perhaps that fact that it doesn't follow the normal naming scheme is a _good_ thing. I tend to agree with Steve on this. This is core language stuff that's been the way that it is since the beginning. Changing it is just going to break code and cause even more headaches for porting code from C or C++ to D. This definitely comes across as bikeshedding. If we were way earlier in the development process of D, then I think that there would be a much better argument. But at this point, the language spec is supposed to be essentially stable. And just because the name doesn't quite fit in with the others is _not_ a good enough reason to go and change the language spec. - Jonathan M Davis
Feb 16 2011
parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.02.2011 19:20, schrieb Jonathan M Davis:
 On Wednesday, February 16, 2011 06:51:21 gölgeliyele wrote:
 On 2/16/11 9:45 AM, Steven Schveighoffer wrote:
 I'm done with this thread...

 -Steve
Ok, I don't want to drag on. But there is a reason why we have a style. size_t is against the D style and obviously does not match. I use size_t as much as Walter does in my day job, and I even like it. It just does not fit into D's type names. That is all.
If we were much earlier in the D development process, then perhaps it would make some sense to change the name. But as it is, it's going to break a lot of code for a simple name change. Lots of C, C++, and D programmers are fine with size_t. I see no reason to break a ton of code just because a few people complain about a name on the mailing list. Not to mention, size_t isn't exactly normal anyway. Virtually every type in D has a fixed size, but size_t is different. It's an alias whose size varies depending on the architecture you're compiling on. As such, perhaps that fact that it doesn't follow the normal naming scheme is a _good_ thing. I tend to agree with Steve on this. This is core language stuff that's been the way that it is since the beginning. Changing it is just going to break code and cause even more headaches for porting code from C or C++ to D. This definitely comes across as bikeshedding. If we were way earlier in the development process of D, then I think that there would be a much better argument. But at this point, the language spec is supposed to be essentially stable. And just because the name doesn't quite fit in with the others is _not_ a good enough reason to go and change the language spec. - Jonathan M Davis
Well IMHO it would be feasible to add another alias (keeping size_t), update phobos to use the new alias and to recommend to use the new alias instead of size_t. Or, even better, add a new *type* that behaves like size_t but prevents non-portable use without explicit casting, use it throughout phobos and keep size_t for compatibility reasons (and for interfacing with C). But I really don't care much.. size_t is okay for me the way it is. The best argument I've heard so far was from Michel Fortin, that having a more D-ish name may encourage the use of size_t instead of uint - but hopefully people will be more portability-aware once 64bit DMD is out anyway. IMHO it's definitely too late (for D2) to add a better type that is signed etc, like Don proposed. Also I'm not sure how well that would work when interfacing with C. It may make sense for the compiler to handle unsigned/signed comparisons and operations more strictly or more securely (=> implicit casting to the next bigger unsigned type before comparing or stuff like that), though. Cheers, - Daniel
Feb 16 2011
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
The very fact that you didn't have issues with size_t before compiling in 64
bit mode seems like a short-coming of D. It should be hard to write code that
isn't platform independent. One would kind of hope that size_t was a distinct
type that could have uints assigned to them without casts. It might even be
good to allow size_t to implicitly convert to ulong.

dsimcha Wrote:

 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
Feb 14 2011
parent dsimcha <dsimcha yahoo.com> writes:
Actually, the more I think about it the more I realize it was kind of a 
corner case.  Basically, in the two programs that were a royal PITA to 
convert to 64-bit, I was storing arrays of indices into other arrays. 
This needed to be reasonably memory-efficient.  There is no plausible 
way that any array that I was storing an index into could be >4 billion 
elements, so I used uints instead of size_t's.

On 2/14/2011 11:47 PM, Jason House wrote:
 The very fact that you didn't have issues with size_t before compiling in 64
bit mode seems like a short-coming of D. It should be hard to write code that
isn't platform independent. One would kind of hope that size_t was a distinct
type that could have uints assigned to it without casts. It might even be
good to allow size_t to implicitly convert to ulong.

 dsimcha Wrote:

 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
Feb 14 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 02/15/2011 02:28 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 17:06:43 spir wrote:
 On 02/15/2011 01:56 AM, Jonathan M Davis wrote:
 On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:
 Here's something I've noticed (x86 code):

 void main()
 {

       ulong size = 2;
       int[] arr = new int[](size);

 }

 This will error with:
 sizetTest.d(8): Error: cannot implicitly convert expression (size) of
 type ulong to uint

 size_t is aliased to uint since I'm running 32bit.

 I'm really not experienced at all with 64bit, so I don't know if it's
 good to use uint explicitly (my hunch is that it's not good). uint as
 the array size wouldn't even compile in 64bit, right?

 If I'm correct, wouldn't it be better if the error showed that it
 expects size_t which might be aliased to whatever type for a
 particular machine?
Use size_t. It's the type which is used. It's aliased to whichever type is appropriate for the architecture. On 32 bits, that would be a 32 bit integer, so it's uint. On 64 bits, that would be a 64 bit integer, so it's ulong.
Rename size_t, or rather introduce a meaningful standard alias? (would vote for Natural)
Why? size_t is what's used in C++. It's well known and what lots of programmers would expect. What would you gain by renaming it?
Then state on D's front page: "D is a language for C++ programmers..." "size"_t is wrong, wrong, wrong: * Nothing says it's a type alias (should be "Size"). * The name's "morphology" is weird. * It does not even tell about semantics & usage: a majority of use cases is probably as indices! (ordinal, not cardinal as suggested by the name). ("sizediff_t" also exists, but seems unused in D) "Natural" would be good according to all those points: the name tells it's unsigned, a natural number is either an ordinal or a cardinal, and it fits D style guidelines. Better proposals welcome :-) Aliasing does /not/ mean removing size_t, just proposing a correct, sensible, and meaningful alternative. If people like it, and if using the correct name is encouraged, then after a few years the legacy garbage can finally be recycled ;-) In any case, this alternative must be *standard*, for the whole community to know it. I have used Ordinal & Cardinal for a while, but stopped because of that: people reading my code had to guess a bit (not that hard, but still), or jump to declarations. Again: size_t is /wrong/. The fact that for you it means what it means, due to your experience as a C++ programmer, does not change an iota (lol!) to its wrongness. If we never improve languages just because of mindless conservatism, then in 3 generations programmers will still be stuck with junk from the 1970's. Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
== Quote from dsimcha (dsimcha yahoo.com)'s article
 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
I have a similar grudge about shorts being implicitly converted to ints, resulting in hundreds of unwanted casts thrown in everywhere cluttering up code. ie: short a, b, c; a = b + c; Hidden implicit casts should die.
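The promotion issue described above can be sketched as follows (a minimal example, relying on D's usual integral promotion, where operands of `+` are widened to `int`):

```d
void main()
{
    short a, b, c;
    b = 1;
    c = 2;
    // a = b + c;  // error: b + c is promoted to int, which does not
    //             // implicitly convert back down to short
    a = cast(short)(b + c); // the explicit narrowing cast the post laments
}
```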
Feb 15 2011
prev sibling next sibling parent reply foobar <foo bar.com> writes:
Steven Schveighoffer Wrote:

 On Tue, 15 Feb 2011 09:26:21 -0500, spir <denis.spir gmail.com> wrote:
 
 On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:
 Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
 called
 alias! Don't like the name of something? Well you can change it!

 alias size_t wordsize;

 Now, you can use wordsize instead of size_t in your code, and the  
 compiler
 doesn't care! (in fact, that's all size_t is anyways *hint hint*)
Sure, but it's not the point of this one bikeshedding thread. If you do that, then you're the only one who knows what "wordsize" means. Good, maybe, for app-specific semantic notions (alias Employee[] Staff;); certainly not for types at the highest degree of general purpose like size_t. We need a standard alias.
The standard alias is size_t. If you don't like it, alias it to something else. Why should I have to use something that's unfamiliar to me because you don't like size_t? I guarantee whatever you came up with would not be liked by some people, so they would have to alias it, you can't please everyone. size_t works, it has a precedent, it's already *there*, just use it, or alias it if you don't like it. No offense, but this discussion is among the most pointless I've seen. -Steve
I disagree that the discussion is pointless. On the contrary, the OP pointed out some valid points: 1. that size_t is inconsistent with D's style guide. the "_t" suffix is a C++ convention and not a D one. While it makes sense for [former?] C++ programmers it will confuse newcomers to D from other languages that would expect the language to follow its own style guide. 2. the proposed change is backwards compatible - the OP asked for an *additional* alias. 3. generic concepts should belong to the standard library and not user code which is also where size_t is already defined. IMO, we already have a byte type, it's plain common sense to extend this with a "native word" type.
Feb 15 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
foobar wrote:
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide. 
It's a C convention.
 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.
I do not believe that value is added by adding more and more aliases for the same thing. It makes the library large and complex but with no depth.
Feb 15 2011
parent spir <denis.spir gmail.com> writes:
On 02/15/2011 08:05 PM, Walter Bright wrote:
 foobar wrote:
 1. that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
 convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that would
 expect the language to follow its own style guide.
It's a C convention.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
I do not believe that value is added by adding more and more aliases for the same thing. It makes the library large and complex but with no depth.
If we asked for various aliases for numerous builtin terms of the language, your point would be fully valid. But here only a single standard alias is asked for, for what may well be the most used type in the language, which presently has an obscure name. Cost: one line of code in object.d: alias typeof(int.sizeof) size_t; alias typeof(int.sizeof) Abcdef; // add this As an aside, the opportunity may be taken to use machine-word-size signed values as a standard for indices/positions and sizes/counts/lengths (and offsets?), everywhere in the language, for the coming 64-bit version. Don, IIRC, and Bearophile, referred to issues due to unsigned values. This would also give an obvious name for the alias, "Integer", that probably few would contest (hope so). Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
prev sibling next sibling parent reply so <so so.so> writes:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is  
 a C++ convention and not a D one. While it makes sense for [former?] C++  
 programmers it will confuse newcomers to D from other languages that  
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an  
 *additional* alias.
 3. generic concepts should belong to the standard library and not user  
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this  
 with a "native word" type.
Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.
Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indicies rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways? That Nick doesn't like it? ;)
Feb 15 2011
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.
Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indicies rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?
I don't understand that point.
 
 That Nick doesn't like it? ;)
 
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Daniel Gibson" <metalcaedes gmail.com> wrote in message 
news:ijett7$1ie$5 digitalmars.com...
 Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indicies rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?
I don't understand that point.
If you're writing something in 32-bit and you use size_t, it may compile perfectly fine for 32-bit, but the compiler won't tell you about any problems that will appear when you compile the same code for 64-bit (such as "can't implicitly convert"). Presumably the same would apply to writing something on 64-bit and then suddenly compiling for 32-bit. I'm not actually asserting that this is a big issue. Maybe it is, maybe it isn't, I don't know. Just making guesses at what "so" sees as "the most important argument against size_t [that] got the least attention".
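A minimal sketch of that failure mode (assuming size_t aliases uint on 32-bit and ulong on 64-bit, as DMD defines it):

```d
void main()
{
    int[] arr = new int[](10);
    // uint len = arr.length; // compiles on 32-bit, where size_t is uint;
    //                        // on 64-bit it errors: cannot implicitly
    //                        // convert expression of type ulong to uint
    size_t n = arr.length;    // portable: compiles unchanged on both targets
}
```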
Feb 15 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 23:29, schrieb Nick Sabalausky:
 "Daniel Gibson" <metalcaedes gmail.com> wrote in message 
 news:ijett7$1ie$5 digitalmars.com...
 Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indicies rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?
I don't understand that point.
If you're writing something in 32-bit and you use size_t, it may compile perfectly fine for 32-bit, but the compiler won't tell you about any problems that will appear when you compile the same code for 64-bit (such as "can't implicitly convert"). Presumably the same would apply to writing something on 64-bit and then suddenly compiling for 32-bit. I'm not actually asserting that this is a big issue. Maybe it is, maybe it isn't, I don't know. Just making guesses at what "so" sees as "the most important argument against size_t [that] got the least attention".
Ok, that is right. Probably it would be helpful if size_t was a proper type that can't be mixed with other types in dangerous ways without explicit casting. Cheers, - Daniel
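A hypothetical sketch of such a distinct type (nothing like this exists in Phobos; the name `Size` and its operations are invented here purely for illustration):

```d
// A wrapper that keeps index/size arithmetic within the type and makes
// any conversion to a raw integer explicit and therefore visible.
struct Size
{
    private size_t value;
    this(size_t v) { value = v; }

    // arithmetic with another Size stays within the type
    Size opBinary(string op)(Size rhs) if (op == "+" || op == "-")
    {
        return Size(mixin("value " ~ op ~ " rhs.value"));
    }

    // the only way out is an explicit call, so narrowing can't hide
    size_t get() const { return value; }
}

void main()
{
    auto a = Size(3), b = Size(4);
    auto c = a + b;              // fine: Size + Size
    // uint u = c;               // error: no implicit conversion
    uint u = cast(uint) c.get(); // deliberate, visible cast
}
```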
Feb 15 2011
parent reply Adam Ruppe <destructionator gmail.com> writes:
Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.
Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.
Feb 15 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 23:43, schrieb Adam Ruppe:
 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.
Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.
You should only cast when you know what you're doing. If you get a compiler error on the new platform and just shut it up with an explicit cast, it's just as bad. But having to do an explicit cast either way forces you to think about what you're doing, hopefully avoiding large pieces of code that need to be rewritten because they only worked because size_t was uint or such. Cheers, - Daniel
Feb 15 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Adam Ruppe:

 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.
Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.
I'll keep this in mind. Bye, bearophile
Feb 15 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:
 
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.
 
 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.
Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?
What _does_ help with 32/64-bit portability until you actually compile both ways? Regardless of what the name is, it's still going to be the word size of the machine and vary between 32-bit and 64-bit anyway. - Jonathan M Davis
Feb 15 2011
parent Don <nospam nospam.com> writes:
Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...

 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.
Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?
What _does_ help with 32/64-bit portability until you actually compile both ways? Regardless of what the name is, it's still going to be the word size of the machine and vary between 32-bit and 64-bit anyway.
size_t could be made a genuine type, and given a range of 0..2^^64-1, even when it is a 32 bit value. Then, it'd fail to implicitly convert to int or uint on 32-bit systems. But, if you did certain operations on it (eg, & 0xFFFF_FFFF) then it could be stored in a uint without a cast.
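D's value range propagation already behaves this way for ordinary integers within a single expression, which suggests the idea is workable; a small sketch:

```d
void main()
{
    ulong n = 0x1_2345_6789;

    // Value range propagation: the compiler can prove the masked result
    // fits in 32 bits, so no cast is needed even though n is 64-bit.
    uint low = n & 0xFFFF_FFFF;
    assert(low == 0x2345_6789);

    // Without the mask the range doesn't fit, and this would not compile:
    // uint bad = n; // Error: cannot implicitly convert ulong to uint
}
```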
Feb 15 2011
prev sibling parent so <so so.so> writes:
 That variables of type "size_t" are frequently used to store indices  
 rather
 than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you  
 actually
 compile your code both ways?

 That Nick doesn't like it? ;)
Nice try! But I was referring to Don's argument. :)
Feb 15 2011
prev sibling parent reply gölgeliyele <usuldan gmail.com> writes:
On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.
 3. generic concepts should belong to the standard library and not user code
which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this with
a "native word" type.
Look at the basic data types: bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, char, wchar, dchar While size_t is just an alias, it will be used in a similar way to the above. One can see that it does not fit among these, stylistically speaking. There seems to be a common pattern here: a prefixing character is consistently used to differentiate basic types, such as u-short/short, c-float/float, w-char/char, etc. I wonder if something similar can be done for size_t. nint comes to mind, for native int, that is n-int. Sample code: nint end = 0; // nintendo :) Having too many aliases seems like a problem to me. Different developers will start using different names and reading code will become harder. One would need to learn two things that refer to the same thing. My 2 cents: I suggest deprecating size_t and replacing it with a better alternative that fits with the D language.
Feb 15 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"glgeliyele" <usuldan gmail.com> wrote in message 
news:ijfc4m$16p6$1 digitalmars.com...
 On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.
Look at the basic data types: bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, char, wchar, dchar While size_t is just an alias, it will be used in a similar way to the above. One can see that it does not fit among these, stylistically speaking. There seems to be a common pattern here, a prefixing character is consistently used to differentiate basic types, such as u-short/short, c-float/float, w-char/char, etc. I wonder if something similar can be done for size_t. nint comes to mind, for native int, that is n-int. Sample code:
I like "nint".
   nint end = 0; // nintendo :)
Heh, I like that even more. It's "int eger;" for a new generation :) And much less contrived, come to think of it.
Feb 15 2011
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 15 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Michel Fortin" <michel.fortin michelf.com> wrote in message 
news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too?
*shrug* Beats me. I can't even remember if size_t is signed or not.
 I think 'word' & 'uword' would be a better choice.
The only problem I have with that is that "word" seems like something you might want to use as a variable name in certain cases. However, I'd still prefer "word" over "size_t"
 I can't say I'm too displeased with 'size_t', but it's true that the 
 'size_t' feels out of place in D code because of its name.
Feb 15 2011
parent gölgeliyele <usuldan gmail.com> writes:
On 2/15/11 11:33 PM, Nick Sabalausky wrote:
 "Michel Fortin"<michel.fortin michelf.com>  wrote in message
 news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too?
*shrug* Beats me. I can't even remember if size_t is signed or not.
size_t is unsigned in C/C++, whereas ssize_t is signed. I like word/uword as well, but word is too common as a variable name. What about archint/uarchint?
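Since size_t is unsigned, the classic pitfall is the reverse loop: `i >= 0` is always true for an unsigned index. A small sketch of the bug and a safe idiom:

```d
void main()
{
    auto arr = [10, 20, 30];

    // Buggy: i is unsigned, so `i >= 0` never becomes false; past zero,
    // --i wraps i around to size_t.max and indexing blows up:
    // for (size_t i = arr.length - 1; i >= 0; --i) { ... }

    // Safe idiom: count down from length, index with i - 1.
    int sum = 0;
    for (size_t i = arr.length; i > 0; --i)
        sum += arr[i - 1];
    assert(sum == 60);
}
```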
Feb 15 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 02/16/2011 04:49 AM, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.
yop! Vote for word / uword. unint looks like meaning (x ∈ R / not (x ∈ Z)) lol! Denis -- _________________ vita es estrany spir.wikidot.com
Feb 16 2011
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
== Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 04:49 AM, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
It's the machine integer, so I think the word 'mint' would better match your naming logic. Also, reminds me of this small advert: http://www.youtube.com/watch?v=zuy6o8YXzDo ;)
 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of place in
 D code because of its name.
yop! Vote for word / uword. unint looks like meaning (x ∈ R / not (x ∈ Z)) lol! Denis
word/uword sits well with my understanding.
Feb 16 2011
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.
'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx
Feb 16 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"KennyTM~" <kennytm gmail.com> wrote in message 
news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.
'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx
That's just a legacy issue from when Windows was mainly on 16-bit machines. "Word" means native size.
Feb 16 2011
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 17.02.2011 9:09, Nick Sabalausky wrote:
 "KennyTM~"<kennytm gmail.com>  wrote in message
 news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word'& 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.
'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx
That's just a legacy issue from when Windows was mainly on 16-bit machines. "Word" means native size.
Tell that to the Intel guys; their assembler syntax (read: most x86 assemblers) uses the size prefixes word (2 bytes!), dword (4 bytes), qword (8 bytes), etc. And if that were only an assembler syntax issue... -- Dmitry Olshansky
Feb 16 2011
prev sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 16 Feb 2011 06:49:26 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".
But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.
I second that. word/uword are shorter than ssize_t/size_t and more in line with other type names. I like it.
Feb 16 2011
parent reply David Nadlinger <see klickverbot.at> writes:
On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.
I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David
Feb 17 2011
parent reply Don <nospam nospam.com> writes:
David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.
I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David
Me too. A word is two bytes. Any other definition seems to be pretty useless. The whole concept of "machine word" seems very archaic and incorrect to me anyway. It assumes that the data registers and address registers are the same size, which is very often not true. For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8 bits, yet size_t was definitely 16 bits. It's quite plausible that at some time in the future we'll get a machine with 128-bit registers and data bus, but retaining the 64 bit address bus. So we could get a size_t which is smaller than the machine word. In summary: size_t is not the machine word.
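What D's size_t actually guarantees can be checked directly: it is the unsigned type matching the pointer size, regardless of register or data-bus width. A small sketch (the 4-byte case assumes a typical 32-bit target):

```d
// size_t tracks pointer size, not any "machine word" of the ALU.
static assert(size_t.sizeof == (void*).sizeof);

version (D_LP64)
    static assert(size_t.sizeof == 8); // 64-bit pointer targets
else
    static assert(size_t.sizeof == 4); // typical 32-bit targets

void main() {}
```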
Feb 17 2011
next sibling parent reply Russel Winder <russel russel.org.uk> writes:
<minor-rant>

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty
 useless.
Sounds like people have been living with 8- and 16-bit processors for too long. A word is the natural length of an integer item in the processor. It is necessarily machine specific. cf. DEC-10 had 9-bit bytes and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were 24-bit. ix86 follows IBM 8-bit byte and 32-bit word. The really interesting question is whether on x86_64 the word is 32-bit or 64-bit.
 The whole concept of "machine word" seems very archaic and incorrect to
 me anyway. It assumes that the data registers and address registers are
 the same size, which is very often not true.
Machine words are far from archaic, even on the JVM. If you don't know the length of the word on the machine you are executing on, how do you know the set of values that can be represented? In floating point numbers, if you don't know the length of the word, how do you know the accuracy of the computation?
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was
 only 8 bits, yet size_t was definitely 16 bits.
The 8051 was only surpassed a couple of years ago by ARMs as the most numerous processor on the planet. 8-bit processors may only have had 8-bit ALUs -- leading to an hypothesis that the word was 8-bits -- but the word length was effectively 16-bit due to the hardware support for multi-byte integer operations.
 It's quite plausible that at some time in the future we'll get a machine
 with 128-bit registers and data bus, but retaining the 64-bit address
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.
Agreed! As long as the address bus is less wide than an integer, there are no apparent problems using integers as addresses. The problem comes when addresses are wider than integers. A good statically-typed programming language should manage this by having integers and addresses as distinct sets. C and C++ have led people astray. There should be an appropriate set of integer types and an appropriate set of address types, and using one from the other without active conversion is always going to lead to problems. Do not be afraid of the word. Fear leads to anger. Anger leads to hate. Hate leads to suffering. (*) </minor-rant> (*) With apologies to Master Yoda (**) for any misquote. (**) Or more likely whoever his script writer was. -- Russel. Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel russel.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Feb 17 2011
next sibling parent reply Don <nospam nospam.com> writes:
Russel Winder wrote:
 <minor-rant>
 
 On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
 [ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty 
 useless.
Sounds like people have been living with 8- and 16-bit processors for too long. A word is the natural length of an integer item in the processor. It is necessarily machine specific. cf. DEC-10 had 9-bit bytes and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were 24-bit. ix86 follows IBM 8-bit byte and 32-bit word.
Yes, I know. It's true but I think rather useless. We need a name for an 8-bit quantity, and a 16-bit quantity, and higher powers of two. 'byte' is an established name for the first one, even though historically there were 9-bit bytes. IMHO 'word' wasn't such a bad name for the second one, even though its etymology comes from the machine word size of some specific early processors. But the equally arbitrary name 'short' has become widely accepted.
 The really interesting question is whether on x86_64 the word is 32-bit
 or 64-bit.
With the rising importance of the SIMD instruction set, you could even argue that it is 128 bits in many cases...
 The whole concept of "machine word" seems very archaic and incorrect to 
 me anyway. It assumes that the data registers and address registers are 
 the same size, which is very often not true.
Machine words are far from archaic, even on the JVM, if you don't know the length of the word on the machine you are executing on, how do you know the set of values that can be represented? In floating point numbers, if you don't know the length of the word, how do you know the accuracy of the computation?
Yes, but they're not necessarily the same number. There is a native size for every type of operation, but it's not universal across all operations. I don't think there's a way you can define "machine word" in a way which is terribly useful. By the time you've got something unambiguous and well-defined, it doesn't have many interesting properties. It's valid in such limited cases that you'd be better off with a clearer name.
 Clearly data registers and address registers can be different lengths,
 it is not the job of a programming language that compiles to native code
 to ignore this and attempt to homogenize things beyond what is
 reasonable.
Agreed, and this is I think what makes the concept of "machine word" not very helpful.
 
 If you are working in native code then word length is a crucial property
 since it can change depending on which processor you compile for.
 
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
 only 8 bits, yet size_t was definitely 16 bits.
The 8051 was only surpassed a couple of years ago by ARMs as the most numerous processor on the planet. 8-bit processors may only have had 8-bit ALUs -- leading to an hypothesis that the word was 8-bits -- but the word length was effectively 16-bit due to the hardware support for multi-byte integer operations.
The 6502 was restricted to 8 bits in almost every way. About half of the instructions that involved 16-bit quantities would wrap on page boundaries. jmp (0x7FF) would do an indirect jump, getting the low byte from address 0x7FF and the high byte from 0x700!
 It's quite plausible that at some time in the future we'll get a machine 
 with 128-bit registers and data bus, but retaining the 64 bit address 
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.
Agreed ! As long as the address bus is less wide than an integer, there are no apparent problems using integers as addresses. The problem comes when addresses are wider than integers. A good statically-typed programming language should manage this by having integers and addresses as distinct sets. C and C++ have led people astray. There should be an appropriate set of integer types and an appropriate set of address types and using one from the other without active conversion is always going to lead to problems.
Indeed.
 
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 </minor-rant>
 
 (*) With apologies to Master Yoda (**) for any misquote.
 
 (**) Or more likely whoever his script writer was.
Feb 17 2011
parent Olivier Pisano <olivier.pisano laposte.net> writes:
On 17/02/2011 13:28, Don wrote:
 Yes, I know. It's true but I think rather useless.
 We need a name for an 8 bit quantity, and a 16 bit quantity, and higher
 powers of two. 'byte' is an established name for the first one, even
 though historically there were 9-bit bytes. IMHO 'word' wasn't such a
 bad name for the second one, even though its etymology comes from the
 machine word size of some specific early processors. But the equally
 arbitrary name 'short' has become widely accepted.
8 bits: octet -> http://en.wikipedia.org/wiki/Octet_%28computing%29
Feb 17 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 (*) With apologies to Master Yoda (**) for any misquote.
"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.
Feb 17 2011
next sibling parent reply Russel Winder <russel russel.org.uk> writes:
On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 (*) With apologies to Master Yoda (**) for any misquote.
"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.
"What -- me worry?" Alfred E Newman (*) Star Wars is like Dr Who: you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode. (*) Or whoever: http://en.wikipedia.org/wiki/Alfred_E._Neuman -- Russel. Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel russel.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Feb 17 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.
My trick was to lose all interest in SW. Have you seen the series "Defying Gravity"? The plot is that a spaceship is sent to pass by various planets in the solar system on a mission of discovery. The script writers apparently thought this was boring, so to liven things up they installed a ghost on the spaceship. It's really, really sad.
Feb 18 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijmnp7$433$1 digitalmars.com...
 Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.
My trick was to lose all interest in SW.
I must not be enough of a Star Wars guy; I don't know what anyone's talking about here. Was it the prequel trilogy that introduced the inconsistencies (I still haven't gotten around to episodes 2 or 3 yet), or were there things in the original trilogy that I managed to completely overlook? (Or something else entirely?)
 Have you seen the series "Defying Gravity"? The plot is a spaceship is 
 sent around a to pass by various planets in the solar system on a mission 
 of discovery. The script writers apparently thought this was boring, so to 
 liven things up they installed a ghost on the spaceship.

 It's really, really sad.
Sounds like Stargate Universe: a bunch of people trapped on an ancient spaceship of exploration... but to make that concept "interesting" the writers had to make every damn character on the show a certifiable drama queen. Unsurprisingly, dead after only two seasons - a record low for Stargate. Really looking forward to the movie sequels though (as well as the new SG-1/Atlantis movies that, I *think*, are still in the works).
Feb 18 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, February 18, 2011 14:20:03 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:ijmnp7$433$1 digitalmars.com...
 
 Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.
My trick was to lose all interest in SW.
I must not be enough of a Star Wars guy; I don't know what anyone's talking about here. Was it the prequel trilogy that introduced the inconsistencies (I still haven't gotten around to episodes 2 or 3 yet), or were there things in the original trilogy that I managed to completely overlook? (Or something else entirely?)
The prequel movies definitely have some inconsistencies with the originals, but for the most part, they weren't huge. I suspect that the real trouble comes in when you read the books (which I haven't). - Jonathan M Davis
Feb 18 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the originals,
but 
 for the most part, they weren't huge. I suspect that the real trouble comes in 
 when you read the books (which I haven't).
Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness. Ok, I have no right to be annoyed, but what an opportunity (to make a truly great movie) squandered.
Feb 18 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, February 18, 2011 17:39:34 Walter Bright wrote:
 Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the
 originals, but for the most part, they weren't huge. I suspect that the
 real trouble comes in when you read the books (which I haven't).
Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness. Ok, I have no right to be annoyed, but what an opportunity (to make a truly great movie) squandered.
Well, that's not really an inconsistency so much as not properly taking everything into account in the plot (though to be fair, IIRC, Vader had no clue that he even _had_ kids, so it's not like he would have gone looking in the first place). Regardless, I don't think that there's much question that those films could have been much better. - Jonathan M Davis
Feb 18 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 Vader had no clue 
So much for his force!
Feb 18 2011
parent Max Samukha <maxsamukha spambox.com> writes:
On 02/19/2011 07:39 AM, Walter Bright wrote:
 Jonathan M Davis wrote:
 Vader had no clue
So much for his force!
How can one expect consistency from a fairytale?
Feb 19 2011
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
 Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the 
 originals, but for the most part, they weren't huge. I suspect that 
 the real trouble comes in when you read the books (which I haven't).
Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness. Ok, I have no right to be annoyed, but what an opportunity (to make a truly great movie) squandered.
I nominate the second prequel for the worst movie of all time. I never saw the third one.
Feb 18 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 I nominate the second prequel for the worst movie of all time.
 I never saw the third one.
You didn't miss a thing.
Feb 18 2011
prev sibling next sibling parent reply Russel Winder <russel russel.org.uk> writes:
On Fri, 2011-02-18 at 17:52 -0800, Jonathan M Davis wrote:
 On Friday, February 18, 2011 17:39:34 Walter Bright wrote:
 Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the
 originals, but for the most part, they weren't huge. I suspect that the
 real trouble comes in when you read the books (which I haven't).
Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness.
The wikipedia article http://en.wikipedia.org/wiki/Star_Wars is quite interesting, and indicates why there are lots of little inconsistencies as well as quite a few big ones. As to the veracity of the material, who knows, it's the Web, lies have the exact same status as truth.
 Ok, I have no right to be annoyed, but what an opportunity (to make a truly
 great movie) squandered.
Well, that's not really an inconsistency so much as not properly taking
 everything into account in the plot (though to be fair, IIRC, Vader had no clue
 that he even _had_ kids, so it's not like he would have gone looking in the first
 place). Regardless, I don't think that there's much question that those films
 could have been much better.
I think there has been a loss of historical context here, leading to anti-rose coloured (colored?) spectacles. In 1977, Star Wars was a watershed film. Simple fairy tale storyline, space opera on film instead of book. Its impact was greater than 2001: A Space Odyssey, which had analogous impact albeit to a smaller audience in 1968. I am sure there are films from the 1940s and 1950s that deserve similar status, but television changed the nature of film impact, making 2001 and Star Wars more influential -- again, historical context is important. I think Return of the Jedi is quite fun and that the rest of the Star Wars films lost the simplicity and brilliance of Star Wars, pandering to the need for huge-budget special effects, essentially driving us to the computer-generated, poor-storyline stuff that gets churned out today. With the exception of The Lord of The Rings. Sadly all the effects companies are using C++ and Python; can D get traction as the language of choice for the post-production companies? Crikey, this thread has drifted a good few light years from the original title. -- Russel. ======================================================================== Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel russel.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Feb 18 2011
parent "Nick Sabalausky" <a a.a> writes:
"Russel Winder" <russel russel.org.uk> wrote in message 
news:mailman.1784.1298102229.4748.digitalmars-d puremagic.com...
 Sadly all the effects companies are using C++ and Python, can D get
traction as the language of choice for the post-production companies? IIRC, someone here said that they had written one of the effects tools used for Surrogates and that they wrote it in D.
Feb 19 2011
prev sibling parent Jeff Nowakowski <jeff dilacero.org> writes:
On 02/18/2011 08:39 PM, Walter Bright wrote:
 Huge? How about it never occurs to Vader to search for Luke at the most
 obvious location in the universe - his nearest living relatives (Uncle
 Owen)? That's just the start of the ludicrousness.

 Ok, I have no right to be annoyed, but what an opportunity (to make a
 truly great movie) squandered.
Lighten up, Francis. It was a truly great movie, for its time.
Feb 19 2011
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 (*) With apologies to Master Yoda (**) for any misquote.
"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.
"What -- me worry?" Alfred E Newman (*) Star Wars is like Dr Who: you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode.
The funny thing is that Doctor Who does a number of things which I would normally consider to make a show a bad show - such as being inconsistent in its timeline and generally being episodic rather than having real story arcs (though some of the newer Doctor Who stuff has had more of a story arc than was typical in the past) - but in spite of all that, it's an absolutely fantastic show - probably because the Doctor's just so much fun. Still, it's interesting how it generally breaks the rules of good storytelling and yet is still so great to watch. - Jonathan M Davis
Feb 17 2011
parent "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1758.1298013272.4748.digitalmars-d puremagic.com...
 On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)

 (*) With apologies to Master Yoda (**) for any misquote.
"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.
"What -- me worry?" Alfred E Newman (*) Star Wars is like Dr Who you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode.
The funny thing is that Doctor Who does a number of things which I would normally consider to make a show a bad show - such as being inconsistent in its timeline and generally being episodic rather than having real story arcs (though some of the newer Doctor Who stuff has had more of a story arc than was typical in the past) - but in spite of all that, it's an absolutely fantastic show - probably because the Doctor's just so much fun. Still, it's interesting how it generally breaks the rules of good storytelling and yet is still so great to watch.
One of the things that gets me about Doctor Who (at least the newer ones) is that The Doctor keeps getting companions from modern-day London who, like the Doctor, are enthralled by the idea of travelling anywhere in time and space, and yet...it seems like they still wind up spending most of their time in modern-day London anyway :) (I agree it's an enjoyable show though. The character of The Doctor is definitely a big part of what makes it work.)
Feb 18 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Russel Winder" <russel russel.org.uk> wrote in message 
news:mailman.1748.1297936806.4748.digitalmars-d puremagic.com...
 A word is the natural length of an integer item in the processor.
 It is necessarily machine specific.  cf. DEC-10 had 9-bit bytes
 and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word,
 though addresses were 24-bit.  ix86 follows IBM 8-bit byte and
 32-bit word.
Right. Programmers may have gotten used to "word" being 2-bytes due to things like the Win API and x86 Assemblers not updating their usage for the sake of backwards compatibility, but in the EE world where the term originates, "word" is device-specific and is very useful as such.
 Do not be afraid of the word.  Fear leads to anger.  Anger
 leads to hate.  Hate leads to suffering. (*)
This version is better: http://media.bigoo.ws/content/image/funny/funny_1309.jpg
Feb 17 2011
prev sibling parent reply spir <denis.spir gmail.com> writes:
On 02/17/2011 10:13 AM, Don wrote:
 David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.
I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David
Me too. A word is two bytes. Any other definition seems to be pretty useless. The whole concept of "machine word" seems very archaic and incorrect to me anyway. It assumes that the data registers and address registers are the same size, which is very often not true. For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8 bits, yet size_t was definitely 16 bits. It's quite plausible that at some time in the future we'll get a machine with 128-bit registers and data bus, but retaining the 64 bit address bus. So we could get a size_t which is smaller than the machine word. In summary: size_t is not the machine word.
Right, there is no single native machine word size; but I guess what we're interested in is, from those sizes, the one that ensures minimal processing time. I mean, the data size for which there are native computation instructions (logical, numeric), so that if we use it we get the least number of cycles for a given operation. Also, this size (on common modern architectures, at least) allows directly accessing all of the memory address space; not a negligible property ;-). Or are there points I'm overlooking? Denis -- _________________ vita es estrany spir.wikidot.com
Feb 17 2011
parent Don <nospam nospam.com> writes:
spir wrote:
 On 02/17/2011 10:13 AM, Don wrote:
 David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.
I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David
Me too. A word is two bytes. Any other definition seems to be pretty useless. The whole concept of "machine word" seems very archaic and incorrect to me anyway. It assumes that the data registers and address registers are the same size, which is very often not true. For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8 bits, yet size_t was definitely 16 bits. It's quite plausible that at some time in the future we'll get a machine with 128-bit registers and data bus, but retaining the 64 bit address bus. So we could get a size_t which is smaller than the machine word. In summary: size_t is not the machine word.
Right, there is no single native machine word size; but I guess what we're interesting in is, from those sizes, the one that ensures minimal processing time. I mean, the data size for which there are native computation instructions (logical, numeric), so that if we use it we get the least number of cycles for a given operation.
There's frequently more than one such size.
 Also, this size (on common modern architectures, at least) allows 
 directly accessing all of the memory address space; not a neglectable 
 property ;-).
This is not necessarily the same.
 Or are there points I'm overlooking?
 
 Denis
Feb 17 2011
prev sibling next sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
I think David has raised a good point here that seems to have been lost 
in the discussion about naming.

Please note that the C name of the machine word integer was usually 
called "int". The C standard only specifies a minimum bit-size for the 
different types (see for example 
http://www.ericgiguere.com/articles/ansi-c-summary.html). Most of 
current C++ implementations have identical "int" sizes, but now "long" 
is different. This approach has failed and has caused many headaches 
when porting software from one platform to another. D has recognized 
this and has explicitly defined the bit-size of the various integer 
types. That's good!

Now, with size_t the distinction between platforms creeps back into the 
language. It is everywhere across phobos, be it as length of ranges or 
size of containers. This can get viral, as everything that gets in touch 
with these values might have to stick to size_t. Is this really desired?

Consider saving an array to disk, trying to read it on another platform. 
How many bits should be written for the size of that array?

Consider a range that maps the contents of a file. The file can be 
larger than 4GB, though a lot of the ranges that wrap the file mapping 
range will truncate the length to 32 bit on 32-bit platforms.

I don't have a perfect solution, but maybe builtin arrays could be 
limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless 
signed/unsigned conversions), so the normal type to be used is still 
"int". Ranges should adopt the type sizes of the underlying objects.

Agreed, a type for the machine word integer must exist, and I don't care 
how it is called, but I would like to see its usage restricted to rare 
cases.

Rainer


dsimcha wrote:
 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
Feb 15 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 Feb 2011 14:15:06 -0500, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 I think David has raised a good point here that seems to have been lost  
 in the discussion about naming.

 Please note that the C name of the machine word integer was usually  
 called "int". The C standard only specifies a minimum bit-size for the  
 different types (see for example  
 http://www.ericgiguere.com/articles/ansi-c-summary.html). Most of  
 current C++ implementations have identical "int" sizes, but now "long"  
 is different. This approach has failed and has caused many headaches  
 when porting software from one platform to another. D has recognized  
 this and has explicitely defined the bit-size of the various integer  
 types. That's good!

 Now, with size_t the distinction between platforms creeps back into the  
 language. It is everywhere across phobos, be it as length of ranges or  
 size of containers. This can get viral, as everything that gets in touch  
 with these values might have to stick to size_t. Is this really desired?
Do you really want portable code? The thing is, size_t is specifically defined to be *the word size* whereas C defines int as a fuzzy size "should be at least 16 bits, and recommended to be equivalent to the natural size of the machine". size_t is *guaranteed* to be the same size on the same platform, even among different compilers. In addition size_t isn't actually defined by the compiler. So the library controls the size of size_t, not the compiler. This should make it extremely portable.
 Consider saving an array to disk, trying to read it on another platform.  
 How many bits should be written for the size of that array?
It depends on the protocol or file format definition. It should be irrelevant what platform/architecture you are on. Any format or protocol worth its salt will define what size integers you should store. Then you need a protocol implementation that converts between the native size and the stored size. This is just like network endianness vs. host endianness. You always use htonl and ntohl even if your platform has the same endianness as the network, because you want your code to be portable. Not using them is a no-no even if it works fine on your big-endian system.
 I don't have a perfect solution, but maybe builtin arrays could be  
 limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless  
 signed/unsigned conversions), so the normal type to be used is still  
 "int". Ranges should adopt the type sizes of the underlying objects.
No, this is too limiting. If I have 64GB of memory (not out of the question), and I want to have a 5GB array, I think I should be allowed to. This is one of the main reasons to go to 64-bit in the first place. -Steve
Feb 15 2011
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
Steven Schveighoffer wrote:
 
 In addition size_t isn't actually defined by the compiler.  So the 
 library controls the size of size_t, not the compiler.  This should make 
 it extremely portable.
 
I do not consider the language and the runtime as completely separate when it comes to writing code. BTW, though defined in object.di, size_t is tied to some compiler internals: alias typeof(int.sizeof) size_t; and the compiler will make assumptions about this when creating array literals.
 Consider saving an array to disk, trying to read it on another 
 platform. How many bits should be written for the size of that array?
It depends on the protocol or file format definition. It should be irrelevant what platform/architecture you are on. Any format or protocol worth its salt will define what size integers you should store.
Agreed, the example probably was not the best one.
 I don't have a perfect solution, but maybe builtin arrays could be 
 limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless 
 signed/unsigned conversions), so the normal type to be used is still 
 "int". Ranges should adopt the type sizes of the underlying objects.
No, this is too limiting. If I have 64GB of memory (not out of the question), and I want to have a 5GB array, I think I should be allowed to. This is one of the main reasons to go to 64-bit in the first place.
Yes, that's the imperfect part of the proposal. An array of ints could still use up to 16 GB, though. What bothers me is that you have to deal with these "portability issues" from the very moment you store the length of an array elsewhere. Not a really big deal, and I don't think it will change, but still feels a bit awkward.
Feb 15 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 Feb 2011 18:18:22 -0500, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 Steven Schveighoffer wrote:
  In addition size_t isn't actually defined by the compiler.  So the  
 library controls the size of size_t, not the compiler.  This should  
 make it extremely portable.
I do not consider the language and the runtime as completely seperate when it comes to writing code.
You are right, in some cases the runtime just extends the compiler features. However, I believe the runtime is meant to be used in multiple compilers. I would expect object.di to remain the same. Probably core too. This should be easily checkable with the newer gdc, which I believe uses a port of druntime.
 BTW, though defined in object.di, size_t is tied to some compiler  
 internals:

 	alias typeof(int.sizeof) size_t;

 and the compiler will make assumptions about this when creating array  
 literals.
This is true. This makes it depend on the compiler. However, I believe the spec is concrete about what the sizeof type should be (if not, it should be made concrete).
 I don't have a perfect solution, but maybe builtin arrays could be  
 limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless  
 signed/unsigned conversions), so the normal type to be used is still  
 "int". Ranges should adopt the type sizes of the underlying objects.
No, this is too limiting. If I have 64GB of memory (not out of the question), and I want to have a 5GB array, I think I should be allowed to. This is one of the main reasons to go to 64-bit in the first place.
Yes, that's the imperfect part of the proposal. An array of ints could still use up to 16 GB, though.
Unless you cast it to void[]. What would exactly happen there, a runtime error? Which means a runtime check for an implicit cast? I don't think it's really an option to make array length always be uint (or int). I wouldn't have a problem with using signed words for length. Using more than 2GB for one array in 32-bit land would be so rare that having to jump through special hoops would be fine by me. Obviously for now, 2^63-1 sized arrays is plenty of room for today's machines in 64-bit land.
 What bothers me is that you have to deal with these "portability issues"  
 from the very moment you store the length of an array elsewhere. Not a  
 really big deal, and I don't think it will change, but still feels a bit  
 awkward.
Java defines everything to be the same regardless of architecture, and the result is you just can't do certain things (like have a 5GB array). A system-level language should support the full range of architecture capabilities, so you necessarily have to deal with portability issues. If you want a super-portable language that runs the same everywhere, use an interpreted/bytecode language like Java, .Net or Python. D is for getting close to the metal. I see size_t as a way to *mostly* make things portable. It is not perfect, and really cannot be. It's necessary to expose the architecture so you can adapt to it, there's no getting around that. Really, it's rare that you have to use it anyways, most should use auto.
Feb 16 2011
prev sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 15.02.2011 20:15, schrieb Rainer Schuetze:
 
 I think David has raised a good point here that seems to have been lost in the
 discussion about naming.
 
 Please note that the C name of the machine word integer was usually called
 "int". The C standard only specifies a minimum bit-size for the different types
 (see for example http://www.ericgiguere.com/articles/ansi-c-summary.html). Most
 of current C++ implementations have identical "int" sizes, but now "long" is
 different. This approach has failed and has caused many headaches when porting
 software from one platform to another. D has recognized this and has
explicitely
 defined the bit-size of the various integer types. That's good!
 
 Now, with size_t the distinction between platforms creeps back into the
 language. It is everywhere across phobos, be it as length of ranges or size of
 containers. This can get viral, as everything that gets in touch with these
 values might have to stick to size_t. Is this really desired?
 
 Consider saving an array to disk, trying to read it on another platform. How
 many bits should be written for the size of that array?
 
This can indeed be a problem which actually exists in Phobos: std.stream's OutputStream has a write(char[]) method - and similar methods for wchar and dchar - that do exactly this: write a size_t first and then the data. In many places they used uint instead of size_t, but at the one method where this is a bad idea they used size_t ;-) (see also http://d.puremagic.com/issues/show_bug.cgi?id=5001 ) In general I think that you just have to define how you serialize data to disk/net/whatever (what endianness, what exact types) and you won't have problems. Just dumping the data to disk isn't portable anyway.
 Consider a range that maps the contents of a file. The file can be larger than
 4GB, though a lot of the ranges that wrap the file mapping range will truncate
 the length to 32 bit on 32-bit platforms.
 
 I don't have a perfect solution, but maybe builtin arrays could be limited to
 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless signed/unsigned
 conversions), so the normal type to be used is still "int". Ranges should adopt
 the type sizes of the underlying objects.
 
 Agreed, a type for the machine word integer must exist, and I don't care how it
 is called, but I would like to see its usage restricted to rare cases.
 
 Rainer
 
Cheers, - Daniel
Feb 15 2011
parent reply spir <denis.spir gmail.com> writes:
On 02/15/2011 10:40 PM, Daniel Gibson wrote:
 Am 15.02.2011 20:15, schrieb Rainer Schuetze:
 I think David has raised a good point here that seems to have been lost in the
 discussion about naming.

 Please note that the C name of the machine word integer was usually called
 "int". The C standard only specifies a minimum bit-size for the different types
 (see for example http://www.ericgiguere.com/articles/ansi-c-summary.html). Most
 of current C++ implementations have identical "int" sizes, but now "long" is
 different. This approach has failed and has caused many headaches when porting
 software from one platform to another. D has recognized this and has
explicitely
 defined the bit-size of the various integer types. That's good!

 Now, with size_t the distinction between platforms creeps back into the
 language. It is everywhere across phobos, be it as length of ranges or size of
 containers. This can get viral, as everything that gets in touch with these
 values might have to stick to size_t. Is this really desired?

 Consider saving an array to disk, trying to read it on another platform. How
 many bits should be written for the size of that array?
This can indeed be a problem which actually is existent in Phobos: std.streams Outputstream has a write(char[]) method - and similar methods for wchar and dchar - that do exactly this: write a size_t first and then the data.. in many places they used uint instead of size_t, but at the one method where this is a bad idea they used size_t ;-) (see also http://d.puremagic.com/issues/show_bug.cgi?id=5001 ) In general I think that you just have to define how you serialize data to disk/net/whatever (what endianess, what exact types) and you won't have problems. Just dumping the data to disk isn't portable anyway.
How do you, in general, cope with the issue that, when using machine-size types, programs or (program+data) combinations will work on some machines and not on others? This disturbs me a lot. I prefer having a constant field of applicability, even if artificially reduced for some set of machines. Similar reflection about "infinite"-size numbers. Note this is different from using machine-size (unsigned) integers on the implementation side, for implementation reasons. This could be done, I guess, without having language-side issues. Meaning int, for instance, could on the implementation side be the same thing as long on a 64-bit machine, but still be semantically limited to 32 bits; so that the code works the same way on all machines. Denis -- _________________ vita es estrany spir.wikidot.com
Feb 15 2011
parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.02.2011 00:03, schrieb spir:
 On 02/15/2011 10:40 PM, Daniel Gibson wrote:
 In general I think that you just have to define how you serialize data to
 disk/net/whatever (what endianess, what exact types) and you won't have
 problems. Just dumping the data to disk isn't portable anyway.
How do you, in general, cope with the issue that, when using machine-size types, programs or (program+data) combinations will work on some machines and not on others? This disturbs me a lot. I prefere having a constant field of applicability, even if artificially reduced for some set of machines. Similar reflexion about "infinite"-size numbers.
I'm not sure I understand your question correctly.
1. You can't always deal with it; there may always be platforms (e.g. 16-bit platforms) that just can't execute your code and can't handle your types.
2. When handling data that is exchanged between programs (that may run on different platforms) you just have to agree on a format for that data. You could, for example, serialize it to XML or JSON, or use a binary protocol that defines exactly what types (what size, what endianness, what encoding) are used and how. You can then decide for your application's data things like "this array will *never* exceed 65k elements, so I can store its size as ushort" and so on. You should enforce these constraints on all platforms, of course (e.g. assert(arr.length <= ushort.max); ) This also means that you can decide that you'll never have any arrays longer than uint.max, so they can be read and written on any platform - you just need to make sure that, when reading it from disk/net/..., you read the length in the right format (and analogously for writing).
Or would you prefer D to behave like a 32-bit language on every platform? That means arrays *never* have more than uint.max elements etc.? Such constraints are not acceptable for a system programming language. (The alternative - using ulong for array indexes on 32-bit platforms - is unacceptable as well because it'd slow things down too much).
 Note this is different from using machine-size (unsigned) integers on the
 implementation side, for implementation reasons. This could be done, I guess,
 without having language-side issues. Meaning int, for instance, could be on the
 implementation side the same thing as long on 64-bit machine, but still be
 semantically limited to 32-bit; so that the code works the same way on all
 machines.
 
 Denis
Cheers, - Daniel
Feb 15 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either. - Jonathan M Davis
Feb 15 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 It's inevitable in any systems language. What are you going to do, throw away
a 
 bit for unsigned integers? That's not acceptable for a systems language. On
some 
 level, you must live with the fact that you're running code on a specific
machine 
 with a specific set of constraints. Trying to do otherwise will pretty much 
 always harm efficiency. True, there are common bugs that might be better 
 prevented, but part of it ultimately comes down to the programmer having some 
 clue as to what they're doing. On some level, we want to prevent common bugs, 
 but the programmer can't have their hand held all the time either.
Yup. A systems language is going to map closely onto the target machine, and that means its characteristics will show up in the language. Trying to pretend that arithmetic on integers is something other than what the CPU natively does just will not work.
Feb 16 2011
prev sibling next sibling parent Bernard Helyer <b.helyer gmail.com> writes:
Disagree quite strongly -- use the correct type. Yes, lengths are ulongs 
on AMD64, yes this means a lot of uints turn into size_t, but that's how 
it's supposed to be IMO. 
Feb 15 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that.

First, the question of 1 bit. Think about this -- speaking of 64-bit sizes:
* 99.999% of all uses of unsigned fit under 2^63
* To benefit from the last bit, you must have the need to store a value 2^63 <= v < 2^64
* Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be >= 2^63, but /all/ possible values for v are guaranteed < 2^64

This can only be a very small fraction of the cases where your value does not fit in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)? Something like: "what luck! this value would not (always) fit in 31 bits, but (due to this constraint) I can be sure it will fit in 32 bits (always, whatever input data it depends on)". In fact, n bits do the job because (1) nearly all unsigned values are very small, and (2) the size used at a time covers the memory range at the same time.

Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is.

Upon correctness, intuitively I guess (just a wild guess indeed) that if unsigned values form a subset of signed ones, programmers will more easily reason correctly about them.

Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*)

Denis

(*) But you know, when as a young guy you have coded for 8 & 16-bit machines, having 63 or 64...

-- 
_________________
vita es estrany
spir.wikidot.com
Feb 16 2011
next sibling parent reply Don <nospam nospam.com> writes:
spir wrote:
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that. First, the question of 1 bit. Think at this -- speaking of 64 bit size: * 99.999% of all uses of unsigned fit under 2^63 * To benefit from the last bit, you must have the need to store a value 2^63 <= v < 2^64 * Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be >= 2^63, but /all/ possible values for v are guaranteed < 2^64 This can only be a very small fraction of cases where your value does not fit in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)? Something like: "what a luck! this value would not (always) fit in 31 bits, but (due to this constraint), I can be sure it will fit in 32 bits (always, whatever input data it depends on). In fact, n bits do the job because (1) nearly all unsigned values are very small (2) the size used at a time covers the memory range at the same time. Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is. Upon correctness, intuitively I guess (just a wild guess indeed) if unigned values form a subset of signed ones programmers will more easily reason correctly about them. Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*) Denis
 (*) But you know, when as a young guy you have coded for 8 & 16-bit 
 machines, having 63 or 64...
Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the fraction of cases where the MSB is important has been decreasing *exponentially* from the 8-bit days. It really was necessary to use the entire address space (or even more, in the case of the segmented architecture on the 286![1]) to measure the size of anything. D only supports 32 bit and higher, so it isn't hamstrung in the way that C is.

Yes, there are still cases where you need every bit. But they are very, very exceptional -- rare enough that I think the type could be called __uint, __ulong.

[1] What was size_t on the 286? Note that in the small memory model (all pointers 16 bits) it really was possible to have an object of size 0xFFFF, because the code was in a different address space.
Feb 16 2011
next sibling parent spir <denis.spir gmail.com> writes:
On 02/16/2011 12:21 PM, Don wrote:
 spir wrote:
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that. First, the question of 1 bit. Think at this -- speaking of 64 bit size: * 99.999% of all uses of unsigned fit under 2^63 * To benefit from the last bit, you must have the need to store a value 2^63 <= v < 2^64 * Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be >= 2^63, but /all/ possible values for v are guaranteed < 2^64 This can only be a very small fraction of cases where your value does not fit in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)? Something like: "what a luck! this value would not (always) fit in 31 bits, but (due to this constraint), I can be sure it will fit in 32 bits (always, whatever input data it depends on). In fact, n bits do the job because (1) nearly all unsigned values are very small (2) the size used at a time covers the memory range at the same time. Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is. Upon correctness, intuitively I guess (just a wild guess indeed) if unigned values form a subset of signed ones programmers will more easily reason correctly about them. Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*) Denis
 (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
 having 63 or 64...
Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the fraction of cases where the MSB is important has been decreasing *exponentially* from the 8-bit days. It really was necessary to use the entire address space (or even more, in the case of segmented architecture on the 286![1]) to measure the size of anything. D only supports 32 bit and higher, so it isn't hamstrung in the way that C is. Yes, there are still cases where you need every bit. But they are very, very exceptional -- rare enough that I think the type could be called __uint, __ulong.
Add this: in the case where one needs exactly all 64 bits, then the proper type to use is exactly ulong.
 [1] What was size_t on the 286 ?
 Note that in the small memory model (all pointers 16 bits) it really was
 possible to have an object of size 0xFFFF_FFFF, because the code was in a
 different address space.
Denis -- _________________ vita es estrany spir.wikidot.com
Feb 16 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 [1] What was size_t on the 286 ?
16 bits
 Note that in the small memory model (all pointers 16 bits) it really was 
 possible to have an object of size 0xFFFF_FFFF, because the code was in 
 a different address space.
Not really. I think the 286 had a hard limit of 16 Mb. There was a so-called "huge" memory model which attempted (badly) to fake a linear address space across the segmented model. It never worked very well (such as having wacky problems when an object straddled a segment boundary), and applications built with it sucked in the performance dept. I never supported it for that reason. A lot of the effort in 16 bit programming went to breaking up data structures so no individual part of it spanned more than 64K.
Feb 16 2011
parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
 Don wrote:
 [1] What was size_t on the 286 ?
 
 16 bits
 
 Note that in the small memory model (all pointers 16 bits) it really 
 was possible to have an object of size 0xFFFF_FFFF, because the code 
 was in a different address space.
Not really. I think the 286 had a hard limit of 16 Mb.
I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. So, you can conceivably have a 64K data item, using the full size of size_t. That isn't possible on a modern, linear address space, because the code has to go somewhere...
 
 There was a so-called "huge" memory model which attempted (badly) to 
 fake a linear address space across the segmented model. It never worked 
 very well (such as having wacky problems when an object straddled a 
 segment boundary), and applications built with it sucked in the 
 performance dept. I never supported it for that reason.
 
 A lot of the effort in 16 bit programming went to breaking up data 
 structures so no individual part of it spanned more than 64K.
Yuck. I just caught the very last of that era. I wrote a couple of 16-bit DLLs. From memory, you couldn't assume the stack was in the data segment, and you got horrific memory corruption if you did. I've got no nostalgia for those days...
Feb 16 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 Walter Bright wrote:
 Don wrote:
 [1] What was size_t on the 286 ?
 16 bits

 Note that in the small memory model (all pointers 16 bits) it really 
 was possible to have an object of size 0xFFFF_FFFF, because the code 
 was in a different address space.
Not really. I think the 286 had a hard limit of 16 Mb.
I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. So, you can concievably have a 64K data item, using the full size of size_t. That isn't possible on a modern, linear address space, because the code has to go somewhere...
Actually, you can have a segmented model on a 32 bit machine rather than a flat model, with separate segments for code, data, and stack. The Digital Mars DOS Extender actually does this. The advantage of it is you cannot execute data on the stack.
 There was a so-called "huge" memory model which attempted (badly) to 
 fake a linear address space across the segmented model. It never 
 worked very well (such as having wacky problems when an object 
 straddled a segment boundary), and applications built with it sucked 
 in the performance dept. I never supported it for that reason.

 A lot of the effort in 16 bit programming went to breaking up data 
 structures so no individual part of it spanned more than 64K.
Yuck. I just caught the very last of that era. I wrote a couple of 16-bit DLLs. From memory, you couldn't assume the stack was in the data segment, and you got horrific memory corruption if you did. I've got no nostalgia for those days...
I rather enjoyed it, and the pay was good <g>.
Feb 16 2011
next sibling parent dsimcha <dsimcha yahoo.com> writes:
This whole conversation makes me feel like The Naive Noob for 
complaining about how much 32-bit address space limitations suck and we 
need 64-bit support.

On 2/16/2011 8:52 PM, Walter Bright wrote:
 Don wrote:
 Walter Bright wrote:
 Don wrote:
 [1] What was size_t on the 286 ?
 16 bits

 Note that in the small memory model (all pointers 16 bits) it really
 was possible to have an object of size 0xFFFF_FFFF, because the code
 was in a different address space.
Not really. I think the 286 had a hard limit of 16 Mb.
I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. So, you can concievably have a 64K data item, using the full size of size_t. That isn't possible on a modern, linear address space, because the code has to go somewhere...
Actually, you can have a segmented model on a 32 bit machine rather than a flat model, with separate segments for code, data, and stack. The Digital Mars DOS Extender actually does this. The advantage of it is you cannot execute data on the stack.
 There was a so-called "huge" memory model which attempted (badly) to
 fake a linear address space across the segmented model. It never
 worked very well (such as having wacky problems when an object
 straddled a segment boundary), and applications built with it sucked
 in the performance dept. I never supported it for that reason.

 A lot of the effort in 16 bit programming went to breaking up data
 structures so no individual part of it spanned more than 64K.
Yuck. I just caught the very last of that era. I wrote a couple of 16-bit DLLs. From memory, you couldn't assume the stack was in the data segment, and you got horrific memory corruption if you did. I've got no nostalgia for those days...
I rather enjoyed it, and the pay was good <g>.
Feb 16 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
Walter Bright Wrote:

 Actually, you can have a segmented model on a 32 bit machine rather than a
flat 
 model, with separate segments for code, data, and stack. The Digital Mars DOS 
 Extender actually does this. The advantage of it is you cannot execute data on 
 the stack.
AFAIK you inevitably have segments even in the flat model; x86 just doesn't work any other way. On Windows the stack segment seems to be the same as the data segment, while the code segment is different. Are they needed for access checks? I thought access modes are checked in the page tables.
Feb 17 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Kagamin wrote:
 Walter Bright Wrote:
 
 Actually, you can have a segmented model on a 32 bit machine rather than a
 flat model, with separate segments for code, data, and stack. The Digital
 Mars DOS Extender actually does this. The advantage of it is you cannot
 execute data on the stack.
AFAIK you inevitably have segments in flat model, x86 just doesn't work in other way. On windows stack segment seems to be the same as data segment, code segment is different. Are they needed for access check? I thought access modes are checked in page tables.
Operating systems choose to set the segment registers to all the same value which results in the 'flat' model, but many other models are possible with the x86 hardware.
Feb 17 2011
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Is it true that you're not allowed to play with the segment registers
in 32bit flat protected mode?
Feb 17 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
Yes, that's the operating system's job.
Feb 17 2011
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/17/11, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
Yes, that's the operating system's job.
They took our jerbs!
Feb 17 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Andrej Mitrovic wrote:
 On 2/17/11, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
Yes, that's the operating system's job.
They took our jerbs!
You can always start your own company and hire yourself, or write your own operating system and set the segment registers!
Feb 17 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijk6la$1d9a$1 digitalmars.com...
 Andrej Mitrovic wrote:
 On 2/17/11, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
Yes, that's the operating system's job.
They took our jerbs!
You can always start your own company and hire yourself, or write your own operating system and set the segment registers!
"They took our jerbs!" is a South Park reference.
Feb 17 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message 
 news:ijk6la$1d9a$1 digitalmars.com...
 Andrej Mitrovic wrote:
 On 2/17/11, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
Yes, that's the operating system's job.
They took our jerbs!
You can always start your own company and hire yourself, or write your own operating system and set the segment registers!
"They took our jerbs!" is a South Park reference.
I've seen it everywhere on the intarnets.
Feb 17 2011
prev sibling parent "Alexander Malakhov" <anm programmer.net> writes:
Don <nospam nospam.com> wrote in his message of Wed, 16 Feb 2011 17:21:06 +0600:
 Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the 
 fraction of cases where the MSB is important has been decreasing 
 *exponentially* from the 8-bit days. [...]
Some facts to back your opinion:
* today's most powerful supercomputer has "just" 230 TB of RAM, which is 
between 2^47 and 2^48 (http://www.top500.org/site/systems/3154)
* Windows 7 x64 __virtual__ memory limit is 8 TB (= 2^43) 
(http://msdn.microsoft.com/en-us/library/aa366778(VS.85).aspx#physical_memory_limits_windows_7)
-- 
Alexander
Feb 18 2011
prev sibling parent reply Kevin Bealer <kevindangerbealer removedanger.gmail.com> writes:
== Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that. First, the question of 1 bit. Think at this -- speaking of 64 bit size: * 99.999% of all uses of unsigned fit under 2^63 * To benefit from the last bit, you must have the need to store a value 2^63 <= v < 2^64 * Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be >= 2^63, but /all/ possible values for v are guaranteed < 2^64 This can only be a very small fraction of cases where your value does not fit in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)? Something like: "what a luck! this value would not (always) fit in 31 bits, but (due to this constraint), I can be sure it will fit in 32 bits (always, whatever input data it depends on). In fact, n bits do the job because (1) nearly all unsigned values are very small (2) the size used at a time covers the memory range at the same time. Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is. Upon correctness, intuitively I guess (just a wild guess indeed) if unigned values form a subset of signed ones programmers will more easily reason correctly about them. Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*) Denis (*) But you know, when as a young guy you have coded for 8 & 16-bit machines, having 63 or 64...
If you write low level code, it happens all the time. For example, you can copy memory areas quickly on some machines by treating them as arrays of "long" and copying the values -- which requires the upper bit to be preserved. Or you compute a 64 bit hash value using an algorithm that is part of some standard protocol. Oops -- requires an unsigned 64 bit number, the signed version would produce the wrong result. And since the standard expects normal behaving int64's you are stuck -- you'd have to write a little class to simulate unsigned 64 bit math. E.g. a library that computes md5 sums. Not to mention all the code that uses 64 bit numbers as bit fields where the different bits or sets of bits are really subfields of the total range of values. What you are saying is true of high level code that models real life -- if the value is someone's salary or the number of toasters they are buying from a store you are probably fine -- but a lot of low level software (ipv4 stacks, video encoders, databases, etc) are based on designs that require numbers to behave a certain way, and losing a bit is going to be a pain. I've run into this with Java, which lacks unsigned types, and once you run into a case that needs that extra bit it gets annoying right quick. Kevin
Feb 16 2011
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 17.02.2011 05:19, schrieb Kevin Bealer:
 == Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that. First, the question of 1 bit. Think at this -- speaking of 64 bit size: * 99.999% of all uses of unsigned fit under 2^63 * To benefit from the last bit, you must have the need to store a value 2^63 <= v < 2^64 * Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be >= 2^63, but /all/ possible values for v are guaranteed < 2^64 This can only be a very small fraction of cases where your value does not fit in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)? Something like: "what a luck! this value would not (always) fit in 31 bits, but (due to this constraint), I can be sure it will fit in 32 bits (always, whatever input data it depends on). In fact, n bits do the job because (1) nearly all unsigned values are very small (2) the size used at a time covers the memory range at the same time. Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is. Upon correctness, intuitively I guess (just a wild guess indeed) if unigned values form a subset of signed ones programmers will more easily reason correctly about them. Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*) Denis (*) But you know, when as a young guy you have coded for 8 & 16-bit machines, having 63 or 64...
If you write low level code, it happens all the time. For example, you can copy memory areas quickly on some machines by treating them as arrays of "long" and copying the values -- which requires the upper bit to be preserved. Or you compute a 64 bit hash value using an algorithm that is part of some standard protocol. Oops -- requires an unsigned 64 bit number, the signed version would produce the wrong result. And since the standard expects normal behaving int64's you are stuck -- you'd have to write a little class to simulate unsigned 64 bit math. E.g. a library that computes md5 sums. Not to mention all the code that uses 64 bit numbers as bit fields where the different bits or sets of bits are really subfields of the total range of values. What you are saying is true of high level code that models real life -- if the value is someone's salary or the number of toasters they are buying from a store you are probably fine -- but a lot of low level software (ipv4 stacks, video encoders, databases, etc) are based on designs that require numbers to behave a certain way, and losing a bit is going to be a pain. I've run into this with Java, which lacks unsigned types, and once you run into a case that needs that extra bit it gets annoying right quick. Kevin
It was not proposed to alter ulong (int64), but to only a size_t equivalent. ;) And I agree that not having unsigned types (like in Java) just sucks. Wasn't Java even advertised as a programming language for network stuff? Quite ridiculous without unsigned types.. Cheers, - Daniel
Feb 16 2011
parent Kevin Bealer <kevindangerbealer removedanger.gmail.com> writes:
== Quote from Daniel Gibson (metalcaedes gmail.com)'s article
 It was not proposed to alter ulong (int64), but to only a size_t equivalent. ;)
 And I agree that not having unsigned types (like in Java) just sucks.
 Wasn't Java even advertised as a programming language for network stuff? Quite
 ridiculous without unsigned types..
 Cheers,
 - Daniel
Ah yes, but if you want to copy data quickly you want to use the efficient size for doing so. Since architectures vary, size_t (or the new name if one is added) would seem to new users to be the natural choice for that size. So it becomes a likely error if it doesn't behave as expected. My personal reaction to this thread is that I think most of the arguments of the people who want to change the name or add a new one are true -- but not sufficient to make it worth while. There is always some learning curve and size_t is not that hard to learn or that hard to accept. Kevin
Feb 17 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 02/17/2011 05:19 AM, Kevin Bealer wrote:
 == Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?
My personal issue with unsigned ints in general as implemented in C-like languages is that the range of non-negative signed integers is half of the range of corresponding unsigned integers (for same size). * practically: known issues, and bugs if not checked by the language * conceptually: contradicts the "obvious" idea that unsigned (aka naturals) is a subset of signed (aka integers)
It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
I cannot prove it, but I really think you're wrong on that. First, the question of 1 bit. Think at this -- speaking of 64 bit size: * 99.999% of all uses of unsigned fit under 2^63 * To benefit from the last bit, you must have the need to store a value 2^63<= v< 2^64 * Not only this, you must step on a case where /any/ possible value for v (depending on execution data) could be>= 2^63, but /all/ possible values for v are guaranteed< 2^64 This can only be a very small fraction of cases where your value does not fit in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)? Something like: "what a luck! this value would not (always) fit in 31 bits, but (due to this constraint), I can be sure it will fit in 32 bits (always, whatever input data it depends on). In fact, n bits do the job because (1) nearly all unsigned values are very small (2) the size used at a time covers the memory range at the same time. Upon efficiency, if unsigned is not a subset of signed, then at a low level you may be forced to add checks in numerous utility routines, the kind constantly used, everywhere one type may play with the other. I'm not sure where the gain is. Upon correctness, intuitively I guess (just a wild guess indeed) if unigned values form a subset of signed ones programmers will more easily reason correctly about them. Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-) (*) Denis (*) But you know, when as a young guy you have coded for 8& 16-bit machines, having 63 or 64...
If you write low level code, it happens all the time. For example, you can copy memory areas quickly on some machines by treating them as arrays of "long" and copying the values -- which requires the upper bit to be preserved.

Or you compute a 64 bit hash value using an algorithm that is part of some standard protocol. Oops -- that requires an unsigned 64 bit number; the signed version would produce the wrong result. And since the standard expects normally behaving int64's you are stuck -- you'd have to write a little class to simulate unsigned 64 bit math. E.g. a library that computes md5 sums.

Not to mention all the code that uses 64 bit numbers as bit fields, where the different bits or sets of bits are really subfields of the total range of values.

What you are saying is true of high level code that models real life -- if the value is someone's salary or the number of toasters they are buying from a store you are probably fine -- but a lot of low level software (ipv4 stacks, video encoders, databases, etc.) is based on designs that require numbers to behave a certain way, and losing a bit is going to be a pain. I've run into this with Java, which lacks unsigned types, and once you run into a case that needs that extra bit it gets annoying right quick.
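As a concrete illustration of that kind of protocol-style code, here is a minimal sketch of a 64-bit FNV-1a hash in D (a real published algorithm, used here purely as an illustration, not one of the md5 routines mentioned above); the multiply relies on unsigned wraparound modulo 2^64:

```d
// FNV-1a, 64-bit: the arithmetic must wrap modulo 2^64, which is exactly
// what ulong provides; a signed type would need extra simulation work.
ulong fnv1a64(const(ubyte)[] data)
{
    ulong h = 14695981039346656037UL; // standard FNV-1a 64-bit offset basis
    foreach (b; data)
    {
        h ^= b;
        h *= 1099511628211UL;         // standard FNV 64-bit prime
    }
    return h;
}

void main()
{
    // Empty input returns the offset basis unchanged.
    assert(fnv1a64([]) == 14695981039346656037UL);
    // Different inputs should (here) hash differently.
    assert(fnv1a64(cast(const(ubyte)[]) "a") != fnv1a64(cast(const(ubyte)[]) "b"));
}
```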
You're right indeed, but this is a different issue. If you need to perform bit-level manipulation, then the proper type to use is u-somesize. What we were discussing, I guess, is the standard type used by both stdlib and application code for indices/positions and counts/sizes/lengths:

SomeType count (E) (E[] elements, E element)
SomeType search (E) (E[] elements, E element, SomeType fromPos=0)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com
Feb 17 2011
prev sibling next sibling parent Kagamin <spam here.lot> writes:
Adam Ruppe Wrote:

 alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;
Ever wondered what iota is? At last it's self-documented.
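For readers who did wonder: a short sketch of what that long alias spells out, assuming today's std.range and std.algorithm:

```d
import std.algorithm : equal;
import std.range : iota;

void main()
{
    // A lazy range that goes from start to finish by the given step amount:
    // start 0, finish 10 (exclusive), step 2. Nothing is allocated.
    assert(iota(0, 10, 2).equal([0, 2, 4, 6, 8]));
}
```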
Feb 17 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
dsimcha Wrote:

 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
int ilength(void[] a) @property
{
   return cast(int)a.length;
}
---
int mylen = bb.ilength;
---
Feb 17 2011
parent reply dsimcha <dsimcha yahoo.com> writes:
Funny, as simple as it is, this is a great idea for std.array because it 
shortens the verbose cast(int) a.length to one extra character.  You 
could even put an assert in it to check in debug mode only that the 
conversion is safe.

On 2/17/2011 7:18 AM, Kagamin wrote:
 dsimcha Wrote:

 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.
int ilength(void[] a) @property
{
   return cast(int)a.length;
}
---
int mylen = bb.ilength;
---
Feb 17 2011
parent reply Kagamin <spam here.lot> writes:
dsimcha Wrote:

 Funny, as simple as it is, this is a great idea for std.array because it 
 shortens the verbose cast(int) a.length to one extra character.  You 
 could even put an assert in it to check in debug mode only that the 
 conversion is safe.
 int ilength(void[] a) @property
 {
    return cast(int)a.length;
 }
I'm not sure the code is correct. I have a vague impression that void[] is like byte[], at least, it's used as such, and conversion from int[] to byte[] multiplies the length by 4.
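Kagamin's vague impression is right: void[].length counts bytes, not elements, so the void[] version of ilength would return 12 for a three-element int[]. A quick check (assuming current DMD behavior):

```d
void main()
{
    int[] a = [1, 2, 3];
    void[] v = a;           // implicit conversion: any dynamic array converts to void[]
    assert(a.length == 3);
    assert(v.length == 12); // length is now measured in bytes (3 * int.sizeof)
}
```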
Feb 17 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 17 Feb 2011 09:45:14 -0500, Kagamin <spam here.lot> wrote:

 dsimcha Wrote:

 Funny, as simple as it is, this is a great idea for std.array because it
 shortens the verbose cast(int) a.length to one extra character.  You
 could even put an assert in it to check in debug mode only that the
 conversion is safe.
 int ilength(void[] a) @property
 {
    return cast(int)a.length;
 }
I'm not sure the code is correct. I have a vague impression that void[] is like byte[], at least, it's used as such, and conversion from int[] to byte[] multiplies the length by 4.
Yes, David has proposed a corrected version on the Phobos mailing list: http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html -Steve
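David's actual patch is at the link above and is not reproduced here; what follows is a hypothetical sketch in the same spirit, templated on the element type so the length stays in elements rather than bytes:

```d
// Hypothetical sketch, not the code from the Phobos list: templating on T
// avoids the void[] byte-counting problem, and the assert catches overlong
// arrays in debug builds only.
@property int ilength(T)(in T[] a)
{
    assert(a.length <= int.max, "array length does not fit in int");
    return cast(int) a.length;
}

void main()
{
    auto bb = new long[](7);
    int mylen = bb.ilength; // no manual cast(int) needed at the call site
    assert(mylen == 7);
}
```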
Feb 17 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Yes, David has proposed a corrected version on the Phobos mailing list:
 
 http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html
I suggest it return a signed value, like an int, but a signed long is OK too. I suggest a name like "len" (or "slen") because I often write "length" wrongly.

Does it support code like:

auto l = arr.len;
arr.len = 10;
arr.len++;

A big problem: it's limited to arrays, so aa.len, rbtree.len, set.len, etc. don't work. So I'd like something more standard... So I am not sure this is a good idea.

Bye,
bearophile
Feb 17 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 17 Feb 2011 13:08:08 -0500, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 Yes, David has proposed a corrected version on the Phobos mailing list:

 http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html
I suggest it return a signed value, like an int, but a signed long is OK too. I suggest a name like "len" (or "slen") because I often write "length" wrongly.
This isn't replacing length; it is in addition to length (which will continue to return size_t).
 Does it support code like:
 auto l = arr.len;
 arr.len = 10;
 arr.len++;
arr.length = 10 already works. It's int l = arr.length that doesn't. If arr.length++ doesn't work already, it should be made to work (separate bug).
 A big problem: it's limited to arrays, so aa.len or rbtree.len, set.len,  
 etc, don't work. So I'd like something more standard... So I am not sure  
 this is a good idea.
The point is to avoid code like cast(int)arr.length everywhere you can safely assume arr.length fits in a (u)int. This case is extremely common for arrays; you seldom have an array of more than 2 or 4 billion elements.

For other types, the case might not be as common, plus you can add properties to other types, something you cannot do to arrays.

As far as I'm concerned, this isn't going to affect me at all; I like to use size_t. But I don't see the harm in adding it.

-Steve
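For comparison, the size_t style mentioned at the end, which sidesteps the cast entirely (a trivial sketch):

```d
void main()
{
    auto data = new int[](100);
    // size_t is the type of data.length on every target, 32- or 64-bit,
    // so the index never needs a narrowing cast.
    for (size_t i = 0; i < data.length; ++i)
        data[i] = cast(int) i;
    assert(data[99] == 99);
}
```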
Feb 17 2011