
digitalmars.D - Re: Integer conversions too pedantic in 64-bit

foobar <foo bar.com> writes:
Steven Schveighoffer Wrote:

 On Tue, 15 Feb 2011 09:26:21 -0500, spir <denis.spir gmail.com> wrote:
 
 On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:

 Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
 called
 alias! Don't like the name of something? Well you can change it!

 alias size_t wordsize;

 Now, you can use wordsize instead of size_t in your code, and the  
 compiler
 doesn't care! (in fact, that's all size_t is anyways *hint hint*)

Sure, but it's not the point of this one bikeshedding thread. If you do that, then you're the only one who knows what "wordsize" means. Good, maybe, for app-specific semantic notions (alias Employee[] Staff;); certainly not for types at the highest degree of general purpose like size_t. We need a standard alias.

The standard alias is size_t. If you don't like it, alias it to something else. Why should I have to use something that's unfamiliar to me because you don't like size_t? I guarantee whatever you came up with would not be liked by some people, so they would have to alias it, you can't please everyone. size_t works, it has a precedent, it's already *there*, just use it, or alias it if you don't like it. No offense, but this discussion is among the most pointless I've seen. -Steve

I disagree that the discussion is pointless. On the contrary, the OP pointed out some valid points:

1. that size_t is inconsistent with D's style guide. the "_t" suffix is a C++ convention and not a D one. While it makes sense for [former?] C++ programmers it will confuse newcomers to D from other languages that would expect the language to follow its own style guide.
2. the proposed change is backwards compatible - the OP asked for an *additional* alias.
3. generic concepts should belong to the standard library and not user code which is also where size_t is already defined.

IMO, we already have a byte type, it's plain common sense to extend this with a "native word" type.
Feb 15 2011
Walter Bright <newshound2 digitalmars.com> writes:
foobar wrote:
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide. 

It's a C convention.
 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.

I do not believe that value is added by adding more and more aliases for the same thing. It makes the library large and complex but with no depth.
Feb 15 2011
spir <denis.spir gmail.com> writes:
On 02/15/2011 08:05 PM, Walter Bright wrote:
 foobar wrote:
 1. that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
 convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that would
 expect the language to follow its own style guide.

It's a C convention.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.

I do not believe that value is added by adding more and more aliases for the same thing. It makes the library large and complex but with no depth.

If we asked for various aliases for numerous builtin terms of the language, your point would be fully valid. But here we ask only for a single standard alias for what may well be the most used type in the language, which presently has an obscure alias name. Cost: one line of code in object.d:

    alias typeof(int.sizeof) size_t;
    alias typeof(int.sizeof) Abcdef; // add this

As an aside, the opportunity may be taken to use machine-word-size signed values as a standard for indices/positions and sizes/counts/lengths (and offsets?), everywhere in the language, for the coming 64-bit version. Don, IIRC, and Bearophile, referred to issues due to unsigned values. This would also give an obvious name for the alias, "Integer", that probably few would contest (hope so).

Denis
--
_________________
vita es estrany
spir.wikidot.com
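A minimal sketch of what such an alias amounts to, assuming a recent D2 compiler and reusing the placeholder name `Abcdef` from the post (it is not part of druntime): an alias only introduces a second name for the same type, not a new type.

```d
// Hypothetical second name for size_t; `Abcdef` is only a placeholder.
alias Abcdef = typeof(int.sizeof);

// Both names denote exactly the same type, so no conversion is ever involved.
static assert(is(Abcdef == size_t));

// Helper used only for this illustration.
size_t countOf(const(int)[] data)
{
    Abcdef n = data.length; // .length is a size_t
    return n;               // returning it as size_t needs no cast
}

void main()
{
    assert(countOf([10, 20, 30]) == 3);
}
```

Because the two names are interchangeable, adding the alias cannot break existing code, which is the backwards-compatibility point made earlier in the thread.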
Feb 15 2011
so <so so.so> writes:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is  
 a C++ convention and not a D one. While it makes sense for [former?] C++  
 programmers it will confuse newcomers to D from other languages that  
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an  
 *additional* alias.
 3. generic concepts should belong to the standard library and not user  
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this  
 with a "native word" type.

Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.
Feb 15 2011
"Nick Sabalausky" <a a.a> writes:
"so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.

Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything?

That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?

That Nick doesn't like it? ;)
Feb 15 2011
Daniel Gibson <metalcaedes gmail.com> writes:
On 15.02.2011 23:00, Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.

Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?

I don't understand that point.
 
 That Nick doesn't like it? ;)
 

Feb 15 2011
"Nick Sabalausky" <a a.a> writes:
"Daniel Gibson" <metalcaedes gmail.com> wrote in message 
news:ijett7$1ie$5 digitalmars.com...
 On 15.02.2011 23:00, Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?

I don't understand that point.

If you're writing something in 32-bit and you use size_t, it may compile perfectly fine for 32-bit, but the compiler won't tell you about any problems that will appear when you compile the same code for 64-bit (such as "can't implicitly convert"). Presumably the same would apply to writing something on 64-bit and then suddenly compiling for 32-bit. I'm not actually asserting that this is a big issue. Maybe it is, maybe it isn't, I don't know. Just making guesses at what "so" sees as "the most important argument against size_t [that] got the least attention".
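The situation described here can be sketched as follows. This is only an illustration, assuming a D2 compiler; the helper name `elementCount` is made up. The `is(size_t : uint)` test reports whether the implicit conversion exists on the target being compiled for, so the same file builds either way.

```d
// True where size_t is uint (32-bit targets), false where it is ulong.
enum bool convertsToUint = is(size_t : uint);

// Hypothetical helper used only for this illustration.
size_t elementCount(const(int)[] data)
{
    static if (convertsToUint)
    {
        uint n = data.length;      // accepted, but only on a 32-bit target
    }
    else
    {
        // uint n = data.length;   // rejected here: cannot implicitly convert
        size_t n = data.length;    // the portable spelling
    }
    return n;
}

void main()
{
    assert(elementCount([1, 2, 3]) == 3);
}
```

Nothing in a 32-bit build hints that the first branch is a problem; the error only shows up once the code is compiled for the other word size.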
Feb 15 2011
Daniel Gibson <metalcaedes gmail.com> writes:
On 15.02.2011 23:29, Nick Sabalausky wrote:
 "Daniel Gibson" <metalcaedes gmail.com> wrote in message 
 news:ijett7$1ie$5 digitalmars.com...
 On 15.02.2011 23:00, Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?

I don't understand that point.

If you're writing something in 32-bit and you use size_t, it may compile perfectly fine for 32-bit, but the compiler won't tell you about any problems that will appear when you compile the same code for 64-bit (such as "can't implicitly convert"). Presumably the same would apply to writing something on 64-bit and then suddenly compiling for 32-bit. I'm not actually asserting that this is a big issue. Maybe it is, maybe it isn't, I don't know. Just making guesses at what "so" sees as "the most important argument against size_t [that] got the least attention".

Ok, that is right. Probably it would be helpful if size_t was a proper type that can't be mixed with other types in dangerous ways without explicit casting. Cheers, - Daniel
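One way to picture a "proper type" is a thin wrapper struct. The sketch below is purely hypothetical (the name `Size` and its members are invented for illustration, and nothing like it exists in druntime); it assumes that leaving out `alias this` is enough to block implicit conversions.

```d
// Hypothetical wrapper; the name `Size` and its API are illustrative only.
struct Size
{
    private size_t value;

    this(size_t v) { value = v; }

    // The only way out is an explicit, named conversion.
    size_t raw() const { return value; }

    // Arithmetic stays inside the wrapper type.
    Size opBinary(string op)(Size rhs) const
        if (op == "+" || op == "-")
    {
        return Size(mixin("value " ~ op ~ " rhs.value"));
    }
}

// No `alias this`, so nothing converts implicitly on any target.
static assert(!is(Size : uint));
static assert(!is(Size : size_t));

void main()
{
    auto a = Size(40);
    auto b = Size(2);
    assert((a + b).raw == 42);
    // uint n = a;   // would not compile on 32-bit or on 64-bit
}
```

With such a type the compiler complains on both targets rather than only on one of them.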
Feb 15 2011
Adam Ruppe <destructionator gmail.com> writes:
Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.

Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.
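A small sketch of the hidden-bug scenario, assuming a D2 compiler with Phobos. A `ulong` stands in for a 64-bit `size_t` so the example behaves the same on any target; the helper names are made up. `std.conv.to` is shown as one checked alternative, not as something proposed in this thread.

```d
import std.conv : to, ConvOverflowException;

// An explicit cast compiles everywhere and silently drops the upper bits.
uint truncating(ulong v)
{
    return cast(uint) v;
}

// A checked conversion reports the overflow at run time instead.
bool checkedConversionThrows(ulong v)
{
    try
    {
        to!uint(v);
        return false;
    }
    catch (ConvOverflowException e)
    {
        return true;
    }
}

void main()
{
    ulong big = 0x1_0000_0005UL;          // needs more than 32 bits
    assert(truncating(big) == 5);         // wrong value, no diagnostic
    assert(checkedConversionThrows(big)); // overflow detected
    assert(!checkedConversionThrows(5));  // in-range values pass through
}
```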
Feb 15 2011
Daniel Gibson <metalcaedes gmail.com> writes:
On 15.02.2011 23:43, Adam Ruppe wrote:
 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.

Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.

You should only cast when you know what you're doing. If you get a compiler error on the new platform and just shut it up by doing an explicit cast then, it's just as bad. But having to do an explicit cast either way forces you to think about what you're doing, hopefully avoiding large pieces of code that need to be rewritten because they only worked because size_t was uint or such. Cheers, - Daniel
Feb 15 2011
bearophile <bearophileHUGS lycos.com> writes:
Adam Ruppe:

 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.

Bad idea: once you insert an explicit cast, you now have a *hidden* bug on the new platform instead of a compile error.

I'll keep this in mind. Bye, bearophile
Feb 15 2011
Don <nospam nospam.com> writes:
Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...

 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.

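Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?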

What _does_ have to do with 32/64-bit portability until you compile both ways? Regardless of what the name is, it's still going to be the word size of the machine and vary between 32-bit and 64-bit anyway.

size_t could be made a genuine type, and given a range of 0..2^^64-1, even when it is a 32 bit value. Then, it'd fail to implicitly convert to int, uint on 32 bit systems. But, if you did certain operations on it (eg, & 0xFFFF_FFFF) then it could be stored in a uint without a cast.
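The masking idea can be illustrated with the value range propagation that D2 compilers already apply to integer expressions. In this sketch a `ulong` stands in for the proposed always-64-bit-range size_t (an assumption made for illustration; the proposal itself is not how size_t currently works), and the helper name `lowWord` is made up.

```d
// Rejected without a cast: the full range of a ulong does not fit in uint.
static assert(!__traits(compiles, { ulong s; uint n = s; }));

// Accepted: after masking, the compiler can prove the result fits in 32 bits.
static assert(__traits(compiles, { ulong s; uint n = s & 0xFFFF_FFFF; }));

// Hypothetical helper: keeps the low 32 bits, no cast needed.
uint lowWord(ulong size)
{
    uint low = size & 0xFFFF_FFFF;
    return low;
}

void main()
{
    assert(lowWord(0x1_0000_0005UL) == 5);
}
```

The point of the design is that the narrowing is visible in the code as an operation rather than hidden behind a cast.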
Feb 15 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:
 
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.
 
 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.

Funny thing is the most important argument against size_t got the least attention. I will leave it as an exercise for the reader.

That variables of type "size_t" are frequently used to store indices rather than the actual *size* of anything? That it does nothing to help with 32/64-bit portability until you actually compile your code both ways?

What _does_ have to do with 32/64-bit portability until you compile both ways? Regardless of what the name is, it's still going to be the word size of the machine and vary between 32-bit and 64-bit anyway. - Jonathan M Davis
Feb 15 2011
so <so so.so> writes:
 That variables of type "size_t" are frequently used to store indices  
 rather
 than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you  
 actually
 compile your code both ways?

 That Nick doesn't like it? ;)

Nice try! But i was referring Don's argument. :)
Feb 15 2011
gölgeliyele <usuldan gmail.com> writes:
On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.
 3. generic concepts should belong to the standard library and not user code
which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this with
a "native word" type.

Look at the basic data types:

bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, char, wchar, dchar

While size_t is just an alias, it will be used in a similar way to the above. One can see that it does not fit among these, stylistically speaking. There seems to be a common pattern here: a prefixing character is consistently used to differentiate basic types, such as u-short/short, c-float/float, w-char/char, etc. I wonder if something similar can be done for size_t. nint comes to mind, for native int, that is n-int. Sample code:

   nint end = 0; // nintendo :)

Having too many aliases seems like a problem to me. Different developers will start using different names and reading code will become harder. One would need to learn two things that refer to the same thing.

My 2 cents: I suggest deprecating size_t and replacing it with a better alternative that fits with the D language.
Feb 15 2011
"Nick Sabalausky" <a a.a> writes:
"gölgeliyele" <usuldan gmail.com> wrote in message 
news:ijfc4m$16p6$1 digitalmars.com...
 On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.

Look at the basic data types: bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, char, wchar, dchar While size_t is just an alias, it will be used in a similar way to the above. One can see that it does not fit among these, stylistically speaking. There seems to be a common pattern here, a prefixing character is consistently used to differentiate basic types, such as u-short/short, c-float/float, w-char/char, etc. I wonder if something similar can be done for size_t. nint comes to mind, for native int, that is n-int. Sample code:

I like "nint".
   nint end = 0; // nintendo :)

Heh, I like that even more. It's "int eger;" for a new generation :) And much less contrived, come to think of it.
Feb 15 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 15 2011
"Nick Sabalausky" <a a.a> writes:
"Michel Fortin" <michel.fortin michelf.com> wrote in message 
news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too?

*shrug* Beats me. I can't even remember if size_t is signed or not.
 I think 'word' & 'uword' would be a better choice.

The only problem I have with that is that "word" seems like something you might want to use as a variable name in certain cases. However, I'd still prefer "word" over "size_t"
 I can't say I'm too displeased with 'size_t', but it's true that the 
 'size_t' feels out of place in D code because of its name.

Feb 15 2011
gölgeliyele <usuldan gmail.com> writes:
On 2/15/11 11:33 PM, Nick Sabalausky wrote:
 "Michel Fortin"<michel.fortin michelf.com>  wrote in message
 news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too?

*shrug* Beats me. I can't even remember if size_t is signed or not.

size_t is unsigned in C/C++, whereas ssize_t is signed. I like word/uword as well, but word is too common as a variable name. What about archint/uarchint ?
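For D itself, the signedness question can be checked at compile time. A minimal sketch, assuming a D2 compiler with Phobos; the helper names are made up. It also shows the wrap-around that makes unsigned sizes awkward for index arithmetic.

```d
import std.traits : isSigned, isUnsigned;

// In D, size_t is unsigned and ptrdiff_t is its signed counterpart.
static assert(isUnsigned!size_t);
static assert(isSigned!ptrdiff_t);
static assert(size_t.sizeof == ptrdiff_t.sizeof);

// Unsigned arithmetic wraps: for an empty array this is size_t.max, not -1.
size_t lastIndexUnsigned(const(int)[] a)
{
    return a.length - 1;
}

// The signed type gives the value index arithmetic usually expects.
ptrdiff_t lastIndexSigned(const(int)[] a)
{
    return cast(ptrdiff_t) a.length - 1;
}

void main()
{
    int[] empty;
    assert(lastIndexUnsigned(empty) == size_t.max);
    assert(lastIndexSigned(empty) == -1);
}
```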
Feb 15 2011
Iain Buclaw <ibuclaw ubuntu.com> writes:
== Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 04:49 AM, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".



It's the machine integer, so I think the word 'mint' would better match your naming logic. Also, reminds me of this small advert: http://www.youtube.com/watch?v=zuy6o8YXzDo ;)
 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of place in
 D code because of its name.

unint looks like meaning (x ∈ R / not (x ∈ Z)) lol! Denis

word/uword sits well with my understanding.
Feb 16 2011
KennyTM~ <kennytm gmail.com> writes:
On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.

'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx
Feb 16 2011
"Nick Sabalausky" <a a.a> writes:
"KennyTM~" <kennytm gmail.com> wrote in message 
news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.

'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx

That's just a legacy issue from when windows was mainly on 16-bit machines. "Word" means native size.
Feb 16 2011
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 17.02.2011 9:09, Nick Sabalausky wrote:
 "KennyTM~"<kennytm gmail.com>  wrote in message
 news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".

I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that the 'size_t' feels out of place in D code because of its name.

'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort'). http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx

That's just a legacy issue from when windows was mainly on 16-bit machines. "Word" means native size.

uses size prefixes word (2 bytes!), dword (4 bytes), qword (8) etc. And if that was only assembler syntax issue... -- Dmitry Olshansky
Feb 16 2011
David Nadlinger <see klickverbot.at> writes:
On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.

I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David
Feb 17 2011
Don <nospam nospam.com> writes:
David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.

I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David

Me too. A word is two bytes. Any other definition seems to be pretty useless. The whole concept of "machine word" seems very archaic and incorrect to me anyway. It assumes that the data registers and address registers are the same size, which is very often not true. For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8 bits, yet size_t was definitely 16 bits. It's quite plausible that at some time in the future we'll get a machine with 128-bit registers and data bus, but retaining the 64 bit address bus. So we could get a size_t which is smaller than the machine word. In summary: size_t is not the machine word.
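The distinction can be made concrete with a short check. This is a sketch under the assumption that, on the targets current D compilers support, size_t has the width of a pointer; it says nothing about the width of data registers.

```d
import std.stdio : writeln;

// size_t is tied to what can index memory, so it tracks the pointer size.
static assert(size_t.sizeof == (void*).sizeof);

void main()
{
    writeln("size_t: ", size_t.sizeof * 8, " bits");
    writeln("void*:  ", (void*).sizeof * 8, " bits");
    // Other widths the hardware handles natively need not agree with it:
    writeln("long:   ", long.sizeof * 8, " bits"); // 64 even on 32-bit targets
    writeln("real:   ", real.sizeof * 8, " bits"); // platform-dependent
}
```

The printed widths differ between targets, which is why "machine word" does not pin down a single size.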
Feb 17 2011
Don <nospam nospam.com> writes:
spir wrote:
 On 02/17/2011 10:13 AM, Don wrote:
 David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.

I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible. David

Me too. A word is two bytes. Any other definition seems to be pretty useless. The whole concept of "machine word" seems very archaic and incorrect to me anyway. It assumes that the data registers and address registers are the same size, which is very often not true. For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8 bits, yet size_t was definitely 16 bits. It's quite plausible that at some time in the future we'll get a machine with 128-bit registers and data bus, but retaining the 64 bit address bus. So we could get a size_t which is smaller than the machine word. In summary: size_t is not the machine word.

Right, there is no single native machine word size; but I guess what we're interested in is, from those sizes, the one that ensures minimal processing time. I mean, the data size for which there are native computation instructions (logical, numeric), so that if we use it we get the least number of cycles for a given operation.

There's frequently more than one such size.
 Also, this size (on common modern architectures, at least) allows 
 directly accessing all of the memory address space; not a negligible 
 property ;-).

This is not necessarily the same.
 Or are there points I'm overlooking?
 
 Denis

Feb 17 2011
Don <nospam nospam.com> writes:
Russel Winder wrote:
 <minor-rant>
 
 On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
 [ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty 
 useless.

Sounds like people have been living with 8- and 16-bit processors for too long. A word is the natural length of an integer item in the processor. It is necessarily machine specific. cf. DEC-10 had 9-bit bytes and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were 24-bit. ix86 follows IBM 8-bit byte and 32-bit word.

Yes, I know. It's true but I think rather useless. We need a name for an 8 bit quantity, and a 16 bit quantity, and higher powers of two. 'byte' is an established name for the first one, even though historically there were 9-bit bytes. IMHO 'word' wasn't such a bad name for the second one, even though its etymology comes from the machine word size of some specific early processors. But the equally arbitrary name 'short' has become widely accepted.
 The really interesting question is whether on x86_64 the word is 32-bit
 or 64-bit.

With the rising importance of the SIMD instruction set, you could even argue that it is 128 bits in many cases...
 The whole concept of "machine word" seems very archaic and incorrect to 
 me anyway. It assumes that the data registers and address registers are 
 the same size, which is very often not true.

Machine words are far from archaic, even on the JVM, if you don't know the length of the word on the machine you are executing on, how do you know the set of values that can be represented? In floating point numbers, if you don't know the length of the word, how do you know the accuracy of the computation?

Yes, but they're not necessarily the same number. There is a native size for every type of operation, but it's not universal across all operations. I don't think there's a way you can define "machine word" in a way which is terribly useful. By the time you've got something unambiguous and well-defined, it doesn't have many interesting properties. It's valid in such limited cases that you'd be better off with a clearer name.
 Clearly data registers and address registers can be different lengths,
 it is not the job of a programming language that compiles to native code
 to ignore this and attempt to homogenize things beyond what is
 reasonable.

Agreed, and this is I think what makes the concept of "machine word" not very helpful.
 
 If you are working in native code then word length is a crucial property
 since it can change depending on which processor you compile for.
 
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
 only 8 bits, yet size_t was definitely 16 bits.

The 8051 was only surpassed a couple of years ago by ARMs as the most numerous processor on the planet. 8-bit processors may only have had 8-bit ALUs -- leading to an hypothesis that the word was 8-bits -- but the word length was effectively 16-bit due to the hardware support for multi-byte integer operations.

The 6502 was restricted to 8 bits in almost every way. About half of the instructions that involved 16 bit quantities would wrap on page boundaries. jmp (0x7FF) would do an indirect jump, getting the low byte from address 0x7FF and the high byte from 0x700 !!
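The page-wrap behaviour of the indirect jump can be modelled in a few lines. This is a simplified sketch, not a full emulator; the function names and the memory contents are made up for the demonstration.

```d
// Simplified model of the 6502 indirect-jump address fetch.
ushort indirectTarget(const(ubyte)[] mem, ushort ptr)
{
    ubyte lo = mem[ptr];
    // Only the low byte of the pointer is incremented, so the fetch of the
    // high byte wraps around to the start of the same 256-byte page.
    ushort hiAddr = cast(ushort)((ptr & 0xFF00) | ((ptr + 1) & 0x00FF));
    ubyte hi = mem[hiAddr];
    return cast(ushort)((hi << 8) | lo);
}

// Hypothetical memory contents chosen for the demonstration.
ushort demo()
{
    auto mem = new ubyte[0x10000];
    mem[0x7FF] = 0x34; // low byte of the target
    mem[0x700] = 0x12; // what the 6502 actually reads as the high byte
    mem[0x800] = 0x56; // what a true 16-bit increment would have read
    return indirectTarget(mem, 0x7FF);
}

void main()
{
    assert(demo() == 0x1234); // not 0x5634
}
```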
 It's quite plausible that at some time in the future we'll get a machine 
 with 128-bit registers and data bus, but retaining the 64 bit address 
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.

Agreed ! As long as the address bus is less wide than an integer, there are no apparent problems using integers as addresses. The problem comes when addresses are wider than integers. A good statically-typed programming language should manage this by having integers and addresses as distinct sets. C and C++ have led people astray. There should be an appropriate set of integer types and an appropriate set of address types and using one from the other without active conversion is always going to lead to problems.

Indeed.
 
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 </minor-rant>
 
 (*) With apologies to Master Yoda (**) for any misquote.
 
 (**) Or more likely whoever his script writer was.

Feb 17 2011
Olivier Pisano <olivier.pisano laposte.net> writes:
On 17/02/2011 13:28, Don wrote:
 Yes, I know. It's true but I think rather useless.
 We need a name for an 8 bit quantity, and a 16 bit quantity, and higher
 powers of two. 'byte' is an established name for the first one, even
 though historically there were 9-bit bytes. IMHO 'word' wasn't such a
 bad name for the second one, even though its etymology comes from the
 machine word size of some specific early processors. But the equally
 arbitrary name 'short' has become widely accepted.

8 bits: octet -> http://en.wikipedia.org/wiki/Octet_%28computing%29
Feb 17 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)

 (*) With apologies to Master Yoda (**) for any misquote.

"Luke, trust your feelings!" -- Oggie Ben Doggie

Of course, expecting consistency from Star Wars is a waste of time.
Feb 17 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 (*) With apologies to Master Yoda (**) for any misquote.

"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.

"What -- me worry?" Alfred E Newman (*) Star Wars is like Dr Who you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode.

The funny thing is that Doctor Who does a number of things which I would normally consider to make a show a bad show - such as being inconsistent in its timeline and generally being episodic rather than having real story arcs (though some of the newer Doctor Who stuff has had more of a story arc than was typical in the past) - but in spite of all that, it's an absolutely fantastic show - probably because the Doctor's just so much fun. Still, it's interesting how it generally breaks the rules of good storytelling and yet is still so great to watch. - Jonathan M Davis
Feb 17 2011
parent "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1758.1298013272.4748.digitalmars-d puremagic.com...
 On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)

 (*) With apologies to Master Yoda (**) for any misquote.

"Luke, trust your feelings!" -- Oggie Ben Doggie Of course, expecting consistency from Star Wars is a waste of time.

"What -- me worry?" Alfred E Newman (*) Star Wars is like Dr Who you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode.

The funny thing is that Doctor Who does a number of things which I would normally consider to make a show a bad show - such as being inconsistent in its timeline and generally being episodic rather than having real story arcs (though some of the newer Doctor Who stuff has had more of a story arc than was typical in the past) - but in spite of all that, it's an absolutely fantastic show - probably because the Doctor's just so much fun. Still, it's interesting how it generally breaks the rules of good storytelling and yet is still so great to watch.

One of the things that gets me about Doctor Who (at least the newer ones) is that The Doctor keeps getting companions from modern-day London who, like the Doctor, are enthralled by the idea of travelling anywhere in time and space, and yet...it seems like they still wind up spending most of their time in modern-day London anyway :) (I agree it's an enjoyable show though. The character of The Doctor is definitely a big part of what makes it work.)
Feb 18 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

My trick was to lose all interest in SW. Have you seen the series "Defying Gravity"? The plot is that a spaceship is sent around to pass by various planets in the solar system on a mission of discovery. The script writers apparently thought this was boring, so to liven things up they installed a ghost on the spaceship. It's really, really sad.
Feb 18 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijmnp7$433$1 digitalmars.com...
 Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

My trick was to lose all interest in SW.

I must not be enough of a Star Wars guy, I don't know what anyone's talking about here. Was it the prequel trilogy that introduced the inconsistencies (I still haven't gotten around to episodes 2 or 3 yet), or were there things in the original trilogy that I managed to completely overlook? (Or something else entirely?)
 Have you seen the series "Defying Gravity"? The plot is that a spaceship is 
 sent around to pass by various planets in the solar system on a mission 
 of discovery. The script writers apparently thought this was boring, so to 
 liven things up they installed a ghost on the spaceship.

 It's really, really sad.

Sounds like Stargate Universe: A bunch of people trapped on an ancient spaceship of exploration...but to make that concept "interesting" the writers had to make every damn character on the show a certifiable drama queen. Unsurprisingly, dead after only two seasons - a record low for Stargate. Really looking forward to the movie sequels though (as well as the new SG-1/Atlantis movies that, I *think*, are still in the works).
Feb 18 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the originals,
 but for the most part, they weren't huge. I suspect that the real trouble
 comes in when you read the books (which I haven't).

Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness. Ok, I have no right to be annoyed, but what an opportunity (to make a truly great movie) squandered.
Feb 18 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 Vader had no clue 

So much for his force!
Feb 18 2011
parent Max Samukha <maxsamukha spambox.com> writes:
On 02/19/2011 07:39 AM, Walter Bright wrote:
 Jonathan M Davis wrote:
 Vader had no clue

So much for his force!

How can one expect consistency from a fairytale?
Feb 19 2011
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
 Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the 
 originals, but for the most part, they weren't huge. I suspect that 
 the real trouble comes in when you read the books (which I haven't).

Huge? How about it never occurs to Vader to search for Luke at the most obvious location in the universe - his nearest living relatives (Uncle Owen)? That's just the start of the ludicrousness. Ok, I have no right to be annoyed, but what an opportunity (to make a truly great movie) squandered.

I nominate the second prequel for the worst movie of all time. I never saw the third one.
Feb 18 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 I nominate the second prequel for the worst movie of all time.
 I never saw the third one.

You didn't miss a thing.
Feb 18 2011
prev sibling parent Jeff Nowakowski <jeff dilacero.org> writes:
On 02/18/2011 08:39 PM, Walter Bright wrote:
 Huge? How about it never occurs to Vader to search for Luke at the most
 obvious location in the universe - his nearest living relatives (Uncle
 Owen)? That's just the start of the ludicrousness.

 Ok, I have no right to be annoyed, but what an opportunity (to make a
 truly great movie) squandered.

Lighten up, Francis. It was a truly great movie, for its time.
Feb 19 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, February 18, 2011 14:20:03 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:ijmnp7$433$1 digitalmars.com...
 
 Russel Winder wrote:
 Star Wars is like Dr Who you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

My trick was to lose all interest in SW.

I must not be enough of a Star Wars guy, I don't know what anyone's talking about here. Was it the prequel trilogy that introduced the inconsistencies (I still haven't gotten around to episodes 2 or 3 yet), or were there things in the original trilogy that I managed to completely overlook? (Or something else entirely?)

The prequel movies definitely have some inconsistencies with the originals, but for the most part, they weren't huge. I suspect that the real trouble comes in when you read the books (which I haven't). - Jonathan M Davis
Feb 18 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Russel Winder" <russel russel.org.uk> wrote in message 
news:mailman.1748.1297936806.4748.digitalmars-d puremagic.com...
 A word is the natural length of an integer item in the processor.
 It is necessarily machine specific.  cf. DEC-10 had 9-bit bytes
 and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word,
 though addresses were 24-bit.  ix86 follows IBM 8-bit byte and
 32-bit word.

Right. Programmers may have gotten used to "word" being 2 bytes because things like the Win API and x86 assemblers never updated their usage, for the sake of backwards compatibility. But in the EE world where the term originates, "word" is device-specific and is very useful as such.
 Do not be afraid of the word.  Fear leads to anger.  Anger
 leads to hate.  Hate leads to suffering. (*)

This version is better: http://media.bigoo.ws/content/image/funny/funny_1309.jpg
Feb 17 2011
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 16 Feb 2011 06:49:26 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".

But is it unsigned or signed? Do we need 'unint' too? I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that 'size_t' feels out of place in D code because of its name.

I second that. word/uword are shorter than ssize_t/size_t and more in line with other type names. I like it.
Feb 16 2011
prev sibling next sibling parent Russel Winder <russel russel.org.uk> writes:

<minor-rant>

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty
 useless.

Sounds like people have been living with 8- and 16-bit processors for too long. A word is the natural length of an integer item in the processor. It is necessarily machine specific. cf. DEC-10 had 9-bit bytes and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were 24-bit. ix86 follows IBM 8-bit byte and 32-bit word. The really interesting question is whether on x86_64 the word is 32-bit or 64-bit.
 The whole concept of "machine word" seems very archaic and incorrect to
 me anyway. It assumes that the data registers and address registers are
 the same size, which is very often not true.

Machine words are far from archaic. Even on the JVM, if you don't know the length of the word on the machine you are executing on, how do you know the set of values that can be represented? In floating point numbers, if you don't know the length of the word, how do you know the accuracy of the computation? Clearly data registers and address registers can be different lengths; it is not the job of a programming language that compiles to native code to ignore this and attempt to homogenize things beyond what is reasonable. If you are working in native code then word length is a crucial property since it can change depending on which processor you compile for.
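
As a small illustration of how the word length fixes the set of representable values, here is a sketch in C. The helper name `word_max` is invented for this example, and it only covers unsigned words of up to 64 bits.

```c
#include <stdint.h>

/* Hypothetical helper: largest unsigned value that fits in a word of
   `bits` bits, for widths up to 64. */
uint64_t word_max(unsigned bits)
{
    /* Shifting a 64-bit value by 64 is undefined in C, so that width
       is handled separately. */
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}
```

For the machines mentioned above, `word_max(8)` is 255, `word_max(32)` is 4294967295, and a 36-bit DEC-10 word gives `word_max(36)`, which is 68719476735.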
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was
 only 8 bits, yet size_t was definitely 16 bits.

The 8051 was only surpassed a couple of years ago by ARMs as the most numerous processor on the planet. 8-bit processors may only have had 8-bit ALUs -- leading to a hypothesis that the word was 8 bits -- but the word length was effectively 16-bit due to the hardware support for multi-byte integer operations.
 It's quite plausible that at some time in the future we'll get a machine
 with 128-bit registers and data bus, but retaining the 64 bit address
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.

Agreed! As long as the address bus is less wide than an integer, there are no apparent problems using integers as addresses. The problem comes when addresses are wider than integers. A good statically-typed programming language should manage this by having integers and addresses as distinct sets. C and C++ have led people astray. There should be an appropriate set of integer types and an appropriate set of address types and using one from the other without active conversion is always going to lead to problems.

Do not be afraid of the word. Fear leads to anger. Anger leads to hate. Hate leads to suffering. (*)

</minor-rant>

(*) With apologies to Master Yoda (**) for any misquote.

(**) Or more likely whoever his script writer was.

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Feb 17 2011
prev sibling parent Russel Winder <russel russel.org.uk> writes:

On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)

 (*) With apologies to Master Yoda (**) for any misquote.

"Luke, trust your feelings!" -- Oggie Ben Doggie

Of course, expecting consistency from Star Wars is a waste of time.

"What -- me worry?" Alfred E Newman (*)

Star Wars is like Dr Who: you expect revisionist history in every episode. I hate an inconsistent storyline, so the trick is to assume each episode is a completely separate story unrelated to any other episode.

(*) Or whoever http://en.wikipedia.org/wiki/Alfred_E._Neuman

--
Russel.
Feb 17 2011