digitalmars.D - Re: Integer conversions too pedantic in 64-bit

Steven Schveighoffer Wrote:

 On Tue, 15 Feb 2011 09:26:21 -0500, spir <denis.spir gmail.com> wrote:
 
 On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:


 Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
 called
 alias! Don't like the name of something? Well you can change it!

 alias size_t wordsize;

 Now, you can use wordsize instead of size_t in your code, and the  
 compiler
 doesn't care! (in fact, that's all size_t is anyways *hint hint*)


 Sure, but it's not the point of this one bikeshedding thread. If you do  
 that, then you're the only one who knows what "wordsize" means. Good,  
 maybe, for app-specific semantic notions (alias Employee[] Staff;);  
 certainly not for types at the highest degree of general purpose like  
 size_t. We need a standard alias.


 The standard alias is size_t.  If you don't like it, alias it to something  
 else.  Why should I have to use something that's unfamiliar to me because  
 you don't like size_t?
 
 I guarantee whatever you came up with would not be liked by some people,  
 so they would have to alias it, you can't please everyone.  size_t works,  
 it has a precedent, it's already *there*, just use it, or alias it if you  
 don't like it.
 
 No offense, but this discussion is among the most pointless I've seen.
 
 -Steve
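
For readers unfamiliar with the feature under debate, here is a minimal sketch of what an extra alias amounts to. The name wordsize comes from the example above; the `alias name = type;` spelling is the newer D form and assumes a reasonably recent compiler, whereas the thread uses the older `alias size_t wordsize;` form.

```d
// Sketch: an alias introduces a second name, not a second type.
alias wordsize = size_t;   // thread-era spelling: alias size_t wordsize;

// Both names denote exactly the same type.
static assert(is(wordsize == size_t));

size_t countItems(const int[] items)
{
    wordsize n = items.length;  // no conversion involved
    return n;
}

void main()
{
    assert(countItems([1, 2, 3]) == 3);
}
```

Because the two names are the same type, the compiler cannot tell them apart, which is both why the alias is free and why it adds no type safety.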


I disagree that the discussion is pointless. 
On the contrary, the OP pointed out some valid points:

1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide. 
2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.
3. generic concepts should belong to the standard library and not user code
which is also where size_t is already defined. 

IMO, we already have a byte type, it's plain common sense to extend this with a
"native word" type.

Feb 15 2011

Walter Bright <newshound2 digitalmars.com> writes:

foobar wrote:
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide. 


It's a C convention.

 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.


I do not believe that value is added by adding more and more aliases for the 
same thing. It makes the library large and complex but with no depth.

Feb 15 2011

spir <denis.spir gmail.com> writes:

On 02/15/2011 08:05 PM, Walter Bright wrote:
 foobar wrote:
 1. that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
 convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that would
 expect the language to follow its own style guide.


 It's a C convention.

 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.


 I do not believe that value is added by adding more and more aliases for the
 same thing. It makes the library large and complex but with no depth.


If we asked for various aliases for numerous builtin terms of the language, 
your point would be fully valid. But all that is asked for here is a single 
standard alias for what may well be the most used type in the language, which 
presently has an obscure alias name.
Cost: one line of code in object.d:
     alias typeof(int.sizeof)                    size_t;
     alias typeof(int.sizeof)                    Abcdef; // add this

As an aside, the opportunity may be taken to use machine-word-size signed 
values as a standard for indices/positions and sizes/counts/lengths (and 
offsets?), everywhere in the language, for the coming 64-bit version. Don, 
IIRC, and Bearophile, referred to issues due to unsigned values.
This would also give an obvious name for the alias, "Integer", that probably 
few would contest (hope so).
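
The kind of issue with unsigned values alluded to above can be sketched as follows. This is only an illustration, not a reproduction of the cases Don or Bearophile had in mind; the helper name sumReversed is hypothetical, and ptrdiff_t stands in for the proposed signed machine-size integer.

```d
// Sum an array while walking it backwards with a signed index.
// With a size_t index the condition `i >= 0` would always be true.
int sumReversed(const int[] data)
{
    int sum = 0;
    for (ptrdiff_t i = cast(ptrdiff_t) data.length - 1; i >= 0; --i)
        sum += data[i];
    return sum;
}

void main()
{
    size_t count = 0;
    // Unsigned arithmetic wraps around instead of going negative.
    assert(count - 1 == size_t.max);

    ptrdiff_t signedCount = 0;
    assert(signedCount - 1 == -1);

    assert(sumReversed([10, 20, 30]) == 60);
}
```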

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Feb 15 2011

so <so so.so> writes:

 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is  
 a C++ convention and not a D one. While it makes sense for [former?] C++  
 programmers it will confuse newcomers to D from other languages that  
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an  
 *additional* alias.
 3. generic concepts should belong to the standard library and not user  
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this  
 with a "native word" type.


Funny thing is the most important argument against size_t got the least  
attention.
I will leave it as an exercise for the reader.

Feb 15 2011

"Nick Sabalausky" <a a.a> writes:

"so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.


 Funny thing is the most important argument against size_t got the least 
 attention.
 I will leave it as an exercise for the reader.


That variables of type "size_t" are frequently used to store indices rather 
than the actual *size* of anything?

That it does nothing to help with 32/64-bit portability until you actually 
compile your code both ways?

That Nick doesn't like it? ;)

Feb 15 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.


 Funny thing is the most important argument against size_t got the least 
 attention.
 I will leave it as an exercise for the reader.


 That variables of type "size_t" are frequently used to store indices rather 
 than the actual *size* of anything?
 
 That it does nothing to help with 32/64-bit portability until you actually 
 compile your code both ways?


I don't understand that point.

 
 That Nick doesn't like it? ;)

Feb 15 2011

"Nick Sabalausky" <a a.a> writes:

"Daniel Gibson" <metalcaedes gmail.com> wrote in message 
news:ijett7$1ie$5 digitalmars.com...
 Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.


 That variables of type "size_t" are frequently used to store indices 
 rather
 than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you 
 actually
 compile your code both ways?


 I don't understand that point.


If you're writing something in 32-bit and you use size_t, it may compile 
perfectly fine for 32-bit, but the compiler won't tell you about any 
problems that will appear when you compile the same code for 64-bit (such as 
"can't implicitly convert"). Presumably the same would apply to writing 
something on 64-bit and then suddenly compiling for 32-bit.

I'm not actually asserting that this is a big issue. Maybe it is, maybe it 
isn't, I don't know. Just making guesses at what "so" sees as "the most 
important argument against size_t [that] got the least attention".
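
A small sketch of the situation described above, assuming only 32-bit and 64-bit targets. The helper name lengthOf is hypothetical. The point is that the same conversion is accepted on one target and rejected on the other, so compiling for a single target tells you nothing about the second.

```d
// `is(A : B)` asks whether A implicitly converts to B on this target.
enum bool sizeFitsUint = is(size_t : uint);

static if (size_t.sizeof <= 4)
    static assert(sizeFitsUint);   // 32-bit: size_t is uint, nothing is reported
else
    static assert(!sizeFitsUint);  // 64-bit: size_t is ulong, narrowing is rejected

// Portable either way: keep lengths in size_t.
size_t lengthOf(const int[] items)
{
    return items.length;
}

void main()
{
    assert(lengthOf([1, 2, 3]) == 3);
}
```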

Feb 15 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 15.02.2011 23:29, schrieb Nick Sabalausky:
 "Daniel Gibson" <metalcaedes gmail.com> wrote in message 
 news:ijett7$1ie$5 digitalmars.com...
 Am 15.02.2011 23:00, schrieb Nick Sabalausky:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.


 That variables of type "size_t" are frequently used to store indices 
 rather
 than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you 
 actually
 compile your code both ways?


 I don't understand that point.


 If you're writing something in 32-bit and you use size_t, it may compile 
 perfectly fine for 32-bit, but the compiler won't tell you about any 
 problems that will appear when you compile the same code for 64-bit (such as 
 "can't implicitly convert"). Presumably the same would apply to writing 
 something on 64-bit and then suddenly compiling for 32-bit.
 
 I'm not actually asserting that this is a big issue. Maybe it is, maybe it 
 isn't, I don't know. Just making guesses at what "so" sees as "the most 
 important argument against size_t [that] got the least attention".
 


Ok, that is right.
Probably it would be helpful if size_t was a proper type that can't be mixed
with other types in dangerous ways without explicit casting.

Cheers,
- Daniel

Feb 15 2011

Adam Ruppe <destructionator gmail.com> writes:

Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.


Bad idea: once you insert an explicit cast, you now have a *hidden*
bug on the new platform instead of a compile error.
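
To make the "hidden bug" concrete, here is a hedged sketch. The function name lowBits is hypothetical; std.conv.to is shown as one checked alternative to a bare cast, not as something proposed in this thread.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// A bare cast silently discards the upper 32 bits.
uint lowBits(ulong value)
{
    return cast(uint) value;
}

void main()
{
    ulong len = 0x1_0000_0005UL;      // does not fit in 32 bits

    // Compiles everywhere, silently wrong for large values.
    assert(lowBits(len) == 5);

    // A checked conversion reports the problem at run time instead.
    assertThrown!ConvOverflowException(to!uint(len));

    // Values that fit convert normally.
    assert(to!uint(42UL) == 42);
}
```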

Feb 15 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 15.02.2011 23:43, schrieb Adam Ruppe:
 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.


 Bad idea: once you insert an explicit cast, you now have a *hidden*
 bug on the new platform instead of a compile error.


You should only cast when you know what you're doing.
If you get a compiler error on the new platform and just shut it up by doing an
explicit cast then, it's just as bad.
But having to do an explicit cast either way forces you to think about what
you're doing, hopefully avoiding large pieces of code that need to be rewritten
because they only worked because size_t was uint or such.

Cheers,
- Daniel

Feb 15 2011

bearophile <bearophileHUGS lycos.com> writes:

Adam Ruppe:

 Daniel Gibson wrote:
 Probably it would be helpful if size_t was a proper type that can't
 be mixed with other types in dangerous ways without explicit casting.


 Bad idea: once you insert an explicit cast, you now have a *hidden*
 bug on the new platform instead of a compile error.


I'll keep this in mind.

Bye,
bearophile

Feb 15 2011

Don <nospam nospam.com> writes:

Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...

 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.


 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.


 That variables of type "size_t" are frequently used to store indices
 rather than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you actually
 compile your code both ways?


 What _does_ have anything to do with 32/64-bit portability until you compile both ways? 
 Regardless of what the name is, it's still going to be the word size of the 
 machine and vary between 32-bit and 64-bit anyway.


size_t could be made a genuine type, and given a range of 0..2^^64-1, 
even when it is a 32 bit value. Then, it'd fail to implicitly convert to 
int, uint on 32 bit systems. But, if you did certain operations on it 
(eg, & 0xFFFF_FFFF) then it could be stored in a uint without a cast.
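
D's existing value range propagation already behaves this way for ordinary integer types, which gives a feel for the idea. The sketch below uses ulong as a stand-in for the proposed "genuine" size_t; it is not an implementation of the proposal, and the function name low32 is hypothetical.

```d
// Plain narrowing from a 64-bit value is rejected...
static assert(!is(ulong : uint));

// ...but masking bounds the result to 0 .. 0xFFFF_FFFF, so the compiler
// accepts the conversion without a cast.
uint low32(ulong x)
{
    return x & 0xFFFF_FFFF;
}

void main()
{
    assert(low32(0x1_2345_6789UL) == 0x2345_6789);
    assert(low32(42) == 42);
}
```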

Feb 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, February 15, 2011 14:00:12 Nick Sabalausky wrote:
 "so" <so so.so> wrote in message news:op.vqyk3emumpw3zg so-pc...
 
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:
 
 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is
 a C++ convention and not a D one. While it makes sense for [former?] C++
 programmers it will confuse newcomers to D from other languages that
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
 *additional* alias.
 3. generic concepts should belong to the standard library and not user
 code which is also where size_t is already defined.
 
 IMO, we already have a byte type, it's plain common sense to extend this
 with a "native word" type.


 Funny thing is the most important argument against size_t got the least
 attention.
 I will leave it as an exercise for the reader.


 That variables of type "size_t" are frequently used to store indices
 rather than the actual *size* of anything?
 
 That it does nothing to help with 32/64-bit portability until you actually
 compile your code both ways?


What _does_ have anything to do with 32/64-bit portability until you compile both ways? 
Regardless of what the name is, it's still going to be the word size of the 
machine and vary between 32-bit and 64-bit anyway.

- Jonathan M Davis

Feb 15 2011

so <so so.so> writes:

 That variables of type "size_t" are frequently used to store indices  
 rather
 than the actual *size* of anything?

 That it does nothing to help with 32/64-bit portability until you  
 actually
 compile your code both ways?

 That Nick doesn't like it? ;)


Nice try! But I was referring to Don's argument. :)

Feb 15 2011

gölgeliyele <usuldan gmail.com> writes:

On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is a C++
convention and not a D one. While it makes sense for [former?] C++ programmers
it will confuse newcomers to D from other languages that would expect the
language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.
 3. generic concepts should belong to the standard library and not user code
which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this with
a "native word" type.


Look at the basic data types:

bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, 
float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, 
char, wchar, dchar 	

While size_t is just an alias, it will be used in a similar way to the 
above. One can  see that it does not fit among these, stylistically 
speaking. There seems to be a common pattern here, a prefixing character 
is consistently used to differentiate basic types, such as 
u-short/short, c-float/float, w-char/char, etc. I wonder if something 
similar can be done for size_t. nint comes to mind, for native int, that 
is n-int. Sample code:

   nint end = 0; // nintendo :)

Having too many aliases seems like a problem to me. Different developers 
will start using different names and reading code will become harder. 
One would need to learn two things that refer to the same.

My 2 cents: I suggest deprecating size_t and replacing it with a better 
alternative that fits with the D language.

Feb 15 2011

"Nick Sabalausky" <a a.a> writes:

"gölgeliyele" <usuldan gmail.com> wrote in message 
news:ijfc4m$16p6$1 digitalmars.com...
 On 2/15/11 12:24 PM, foobar wrote:
 I disagree that the discussion is pointless.
 On the contrary, the OP pointed out some valid points:

 1.  that size_t is inconsistent with D's style guide. the "_t" suffix is 
 a C++ convention and not a D one. While it makes sense for [former?] C++ 
 programmers it will confuse newcomers to D from other languages that 
 would expect the language to follow its own style guide.
 2. the proposed change is backwards compatible - the OP asked for an 
 *additional* alias.
 3. generic concepts should belong to the standard library and not user 
 code which is also where size_t is already defined.

 IMO, we already have a byte type, it's plain common sense to extend this 
 with a "native word" type.


 Look at the basic data types:

 bool, byte, ubyte, short, ushort, int, uint, long, ulong, cent, ucent, 
 float, double, real, ifloat, idouble, ireal, cfloat, cdouble, creal, char, 
 wchar, dchar
 While size_t is just an alias, it will be used in a similar way to the 
 above. One can  see that it does not fit among these, stylistically 
 speaking. There seems to be a common pattern here, a prefixing character 
 is consistently used to differentiate basic types, such as u-short/short, 
 c-float/float, w-char/char, etc. I wonder if something similar can be done 
 for size_t. nint comes to mind, for native int, that is n-int. Sample 
 code:


I like "nint".

   nint end = 0; // nintendo :)


Heh, I like that even more. It's "int eger;" for a new generation :)  And 
much less contrived, come to think of it.

Feb 15 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".


But is it unsigned or signed? Do we need 'unint' too?

I think 'word' & 'uword' would be a better choice. I can't say I'm too 
displeased with 'size_t', but it's true that the 'size_t' feels out of 
place in D code because of its name.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 15 2011

"Nick Sabalausky" <a a.a> writes:

"Michel Fortin" <michel.fortin michelf.com> wrote in message 
news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".


 But is it unsigned or signed? Do we need 'unint' too?


*shrug* Beats me. I can't even remember if size_t is signed or not.

 I think 'word' & 'uword' would be a better choice.


The only problem I have with that is that "word" seems like something you 
might want to use as a variable name in certain cases. However, I'd still 
prefer "word" over "size_t"

 I can't say I'm too displeased with 'size_t', but it's true that the 
 'size_t' feels out of place in D code because of its name.

Feb 15 2011

gölgeliyele <usuldan gmail.com> writes:

On 2/15/11 11:33 PM, Nick Sabalausky wrote:
 "Michel Fortin"<michel.fortin michelf.com>  wrote in message
 news:ijfhkt$1fte$1 digitalmars.com...
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".


 But is it unsigned or signed? Do we need 'unint' too?


 *shrug* Beats me. I can't even remember if size_t is signed or not.


size_t is unsigned in C/C++, whereas ssize_t is signed.

I like word/uword as well, but word is too common as a variable name.

What about archint/uarchint ?
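
The signedness question can be checked directly in D. In the sketch below, the alias names word and uword are hypothetical, borrowed from the suggestion above; nothing of the sort exists in the standard library.

```d
import std.traits : isSigned, isUnsigned;

// size_t is unsigned; ptrdiff_t is its signed counterpart of the same width.
static assert(isUnsigned!size_t);
static assert(isSigned!ptrdiff_t);
static assert(size_t.sizeof == ptrdiff_t.sizeof);

// Hypothetical names from this thread, defined as plain aliases.
alias uword = size_t;
alias word  = ptrdiff_t;

void main()
{
    assert(uword.min == 0);
    assert(word.min < 0);
}
```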

Feb 15 2011

Iain Buclaw <ibuclaw ubuntu.com> writes:

== Quote from spir (denis.spir gmail.com)'s article
 On 02/16/2011 04:49 AM, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".






It's the machine integer, so I think the word 'mint' would better match your
naming logic. Also, reminds me of this small advert:
http://www.youtube.com/watch?v=zuy6o8YXzDo ;)

 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of place in
 D code because of its name.


 unint looks like it means (x ∈ R / not (x ∈ Z)) lol!
 Denis


word/uword sits well with my understanding.

Feb 16 2011

KennyTM~ <kennytm gmail.com> writes:

On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".


 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of
 place in D code because of its name.


'word' may be confusing to Windows programmers because in WinAPI a 
'WORD' means an unsigned 16-bit integer (aka 'ushort').

http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx

Feb 16 2011

"Nick Sabalausky" <a a.a> writes:

"KennyTM~" <kennytm gmail.com> wrote in message 
news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".


 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of
 place in D code because of its name.


 'word' may be confusing to Windows programmers because in WinAPI a 'WORD' 
 means an unsigned 16-bit integer (aka 'ushort').

 http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx


That's just a legacy issue from when Windows was mainly on 16-bit machines. 
"Word" means native size.

Feb 16 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 17.02.2011 9:09, Nick Sabalausky wrote:
 "KennyTM~"<kennytm gmail.com>  wrote in message
 news:ijghne$ts1$1 digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a a.a>  said:

 I like "nint".



 I think 'word'&  'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of
 place in D code because of its name.


 'word' may be confusing to Windows programmers because in WinAPI a 'WORD'
 means an unsigned 16-bit integer (aka 'ushort').

 http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx


 That's just a legacy issue from when windows was mainly on 16-bit machines.
 "Word" means native size.


uses size prefixes word (2 bytes!), dword (4 bytes), qword (8) etc.
And if only that were just an assembler syntax issue...

-- 
Dmitry Olshansky

Feb 16 2011

David Nadlinger <see klickverbot.at> writes:

On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.


I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
with fire, but when I read about »word«, I intuitively associated it 
with »two bytes« first – blame Intel or whoever else, but the potential 
for confusion is definitely not negligible.

David

Feb 17 2011

Don <nospam nospam.com> writes:

David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.


 I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
 with fire, but when I read about »word«, I intuitively associated it 
 with »two bytes« first – blame Intel or whoever else, but the potential 
 for confusion is definitely not negligible.
 
 David


Me too. A word is two bytes. Any other definition seems to be pretty 
useless.

The whole concept of "machine word" seems very archaic and incorrect to 
me anyway. It assumes that the data registers and address registers are 
the same size, which is very often not true.
For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
only 8 bits, yet size_t was definitely 16 bits.
It's quite plausible that at some time in the future we'll get a machine 
with 128-bit registers and data bus, but retaining the 64 bit address 
bus. So we could get a size_t which is smaller than the machine word.

In summary: size_t is not the machine word.
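
A hedged way to see this in code: size_t follows the addressing side of the machine, not whatever the widest or fastest arithmetic register happens to be. The assertions below hold on the common 32-bit and 64-bit targets; that they hold on every conceivable target is an assumption.

```d
// size_t tracks pointer width and the type of sizes and lengths.
static assert(size_t.sizeof == (void*).sizeof);
static assert(is(typeof(int.sizeof) == size_t));
static assert(is(typeof((int[]).init.length) == size_t));

// Other "native" operand sizes do not depend on it: long is 8 bytes
// even where size_t is 4.
static assert(long.sizeof == 8);

void main()
{
    int[] a = [1, 2, 3];
    size_t n = a.length;
    assert(n == 3);
}
```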

Feb 17 2011

Don <nospam nospam.com> writes:

spir wrote:
 On 02/17/2011 10:13 AM, Don wrote:
 David Nadlinger wrote:
 On 2/17/11 8:56 AM, Denis Koroskin wrote:
 I second that. word/uword are shorter than ssize_t/size_t and more in
 line with other type names.

 I like it.


 I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
 with
 fire, but when I read about »word«, I intuitively associated it with 
 »two
 bytes« first – blame Intel or whoever else, but the potential for 
 confusion
 is definitely not negligible.

 David


 Me too. A word is two bytes. Any other definition seems to be pretty 
 useless.

 The whole concept of "machine word" seems very archaic and incorrect 
 to me
 anyway. It assumes that the data registers and address registers are 
 the same
 size, which is very often not true.
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator 
 was only 8
 bits, yet size_t was definitely 16 bits.
 It's quite plausible that at some time in the future we'll get a 
 machine with
 128-bit registers and data bus, but retaining the 64 bit address bus. 
 So we
 could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.


 Right, there is no single native machine word size; but I guess what 
 we're interested in is, from those sizes, the one that ensures minimal 
 processing time. I mean, the data size for which there are native 
 computation instructions (logical, numeric), so that if we use it we get 
 the least number of cycles for a given operation.


There's frequently more than one such size.

 Also, this size (on common modern architectures, at least) allows 
 directly accessing all of the memory address space; not a negligible 
 property ;-).


This is not necessarily the same.

 Or are there points I'm overlooking?
 
 Denis

Feb 17 2011

Don <nospam nospam.com> writes:

Russel Winder wrote:
 <minor-rant>
 
 On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
 [ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty 
 useless.


 Sounds like people have been living with 8- and 16-bit processors for
 too long.
 
 A word is the natural length of an integer item in the processor.  It is
 necessarily machine specific.  cf. DEC-10 had 9-bit bytes and 36-bit
 word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were
 24-bit.  ix86 follows IBM 8-bit byte and 32-bit word.


Yes, I know. It's true but I think rather useless.
We need a name for an 8 bit quantity, and a 16 bit quantity, and higher 
powers of two. 'byte' is an established name for the first one, even 
though historically there were 9-bit bytes. IMHO 'word' wasn't such a 
bad name for the second one, even though its etymology comes from the 
machine word size of some specific early processors. But the equally 
arbitrary name 'short' has become widely accepted.

 The really interesting question is whether on x86_64 the word is 32-bit
 or 64-bit.


With the rising importance of the SIMD instruction set, you could even 
argue that it is 128 bits in many cases...


 The whole concept of "machine word" seems very archaic and incorrect to 
 me anyway. It assumes that the data registers and address registers are 
 the same size, which is very often not true.


 Machine words are far from archaic, even on the JVM, if you don't know
 the length of the word on the machine you are executing on, how do you
 know the set of values that can be represented?  In floating point
 numbers, if you don't know the length of the word, how do you know the
 accuracy of the computation?


Yes, but they're not necessarily the same number. There is a native size 
for every type of operation, but it's not universal across all operations.

I don't think there's a way you can define "machine word" in a way which 
is terribly useful. By the time you've got something unambiguous and 
well-defined, it doesn't have many interesting properties. It's valid in 
such limited cases that you'd be better off with a clearer name.

 Clearly data registers and address registers can be different lengths,
 it is not the job of a programming language that compiles to native code
 to ignore this and attempt to homogenize things beyond what is
 reasonable.


Agreed, and this is I think what makes the concept of "machine word" not 
very helpful.

 
 If you are working in native code then word length is a crucial property
 since it can change depending on which processor you compile for.
 
 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
 only 8 bits, yet size_t was definitely 16 bits.


 The 8051 was only surpassed a couple of years ago by ARMs as the most
 numerous processor on the planet.  8-bit processors may only have had
 8-bit ALUs -- leading to a hypothesis that the word was 8-bits -- but
 the word length was effectively 16-bit due to the hardware support for
 multi-byte integer operations.


The 6502 was restricted to 8 bits in almost every way. About half of the 
instructions that involved 16 bit quantities would wrap on page 
boundaries. jmp (0x7FF) would do an indirect jump, getting the low byte 
from address 0x7FF and the high byte from 0x700 !!
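
The page-wrap behaviour of the indirect jump can be modelled in a few lines. This is only a sketch of the address arithmetic described above, not an emulator; the function name indirectJmpTarget and the memory contents are made up for illustration.

```d
// Compute the target of a 6502-style `jmp (ptr)`.
// The pointer's low byte is incremented with 8-bit wraparound, so the
// high byte of the target is fetched from the start of the same page.
ushort indirectJmpTarget(const ubyte[] mem, ushort ptr)
{
    ubyte lo = mem[ptr];
    ushort hiAddr = cast(ushort)((ptr & 0xFF00) | ((ptr + 1) & 0x00FF));
    ubyte hi = mem[hiAddr];
    return cast(ushort)(lo | (hi << 8));
}

void main()
{
    auto mem = new ubyte[](0x10000);
    mem[0x07FF] = 0x34;   // low byte of the target
    mem[0x0700] = 0x12;   // high byte actually used (page wrap)
    mem[0x0800] = 0xAB;   // high byte a 16-bit increment would have used

    assert(indirectJmpTarget(mem, 0x07FF) == 0x1234);
}
```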


 It's quite plausible that at some time in the future we'll get a machine 
 with 128-bit registers and data bus, but retaining the 64 bit address 
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.


 Agreed !
 
 As long as the address bus is less wide than an integer, there are no
 apparent problems using integers as addresses.  The problem comes when
 addresses are wider than integers.  A good statically-typed programming
 language should manage this by having integers and addresses as distinct
 sets.  C and C++ have led people astray.  There should be an appropriate
 set of integer types and an appropriate set of address types and using
 one from the other without active conversion is always going to lead to
 problems.


Indeed.

 
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 </minor-rant>
 
 (*) With apologies to Master Yoda (**) for any misquote.
 
 (**) Or more likely whoever his script writer was.

Feb 17 2011

Olivier Pisano <olivier.pisano laposte.net> writes:

Le 17/02/2011 13:28, Don a écrit :
 Yes, I know. It's true but I think rather useless.
 We need a name for an 8 bit quantity, and a 16 bit quantity, and higher
 powers of two. 'byte' is an established name for the first one, even
 though historically there were 9-bit bytes. IMHO 'word' wasn't such a
 bad name for the second one, even though its etymology comes from the
 machine word size of some specific early processors. But the equally
 arbitrary name 'short' has become widely accepted.


8 bits: octet -> http://en.wikipedia.org/wiki/Octet_%28computing%29

Feb 17 2011

Walter Bright <newshound2 digitalmars.com> writes:

Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)


 (*) With apologies to Master Yoda (**) for any misquote.


"Luke, trust your feelings!" -- Oggie Ben Doggie

Of course, expecting consistency from Star Wars is a waste of time.

Feb 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)
 
 (*) With apologies to Master Yoda (**) for any misquote.


 "Luke, trust your feelings!" -- Oggie Ben Doggie
 
 Of course, expecting consistency from Star Wars is a waste of time.


 "What -- me worry?"  Alfred E Newman  (*)
 
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.


The funny thing is that Doctor Who does a number of things which I would
normally consider to make a show a bad show - such as being inconsistent in its
timeline and generally being episodic rather than having real story arcs
(though some of the newer Doctor Who stuff has had more of a story arc than was
typical in the past) - but in spite of all that, it's an absolutely fantastic
show - probably because the Doctor's just so much fun. Still, it's interesting
how it generally breaks the rules of good storytelling and yet is still so
great to watch.

- Jonathan M Davis

Feb 17 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1758.1298013272.4748.digitalmars-d puremagic.com...
 On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)

 (*) With apologies to Master Yoda (**) for any misquote.


 "Luke, trust your feelings!" -- Oggie Ben Doggie

 Of course, expecting consistency from Star Wars is a waste of time.


 "What -- me worry?"  Alfred E Newman  (*)

 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.


 The funny thing is that Doctor Who does a number of things which I would
 normally consider to make a show a bad show - such as being inconsistent in
 its timeline and generally being episodic rather than having real story arcs
 (though some of the newer Doctor Who stuff has had more of a story arc than
 was typical in the past) - but in spite of all that, it's an absolutely
 fantastic show - probably because the Doctor's just so much fun. Still, it's
 interesting how it generally breaks the rules of good storytelling and yet is
 still so great to watch.


One of the things that gets me about Doctor Who (at least the newer ones) is 
that The Doctor keeps getting companions from modern-day London who, like 
the Doctor, are enthralled by the idea of travelling anywhere in time and 
space, and yet...it seems like they still wind up spending most of their 
time in modern-day London anyway :)  (I agree it's an enjoyable show though. 
The character of The Doctor is definitely a big part of what makes it work.)

Feb 18 2011

Walter Bright <newshound2 digitalmars.com> writes:

Russel Winder wrote:
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.


My trick was to lose all interest in SW.

Have you seen the series "Defying Gravity"? The plot is that a spaceship is sent 
around to pass by various planets in the solar system on a mission of 
discovery. The script writers apparently thought this was boring, so to liven 
things up they installed a ghost on the spaceship.

It's really, really sad.

Feb 18 2011

"Nick Sabalausky" <a a.a> writes:

"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:ijmnp7$433$1 digitalmars.com...
 Russel Winder wrote:
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.


 My trick was to lose all interest in SW.


I must not be enough of a Star Wars guy, I don't know what anyone's talking 
about here. Was it the prequel trilogy that introduced the inconsistencies 
(I still haven't gotten around to episodes 2 or 3 yet), or were there things 
in the original trilogy that I managed to completely overlook? (Or something 
else entirely?)

 Have you seen the series "Defying Gravity"? The plot is that a spaceship is 
 sent around to pass by various planets in the solar system on a mission 
 of discovery. The script writers apparently thought this was boring, so to 
 liven things up they installed a ghost on the spaceship.

 It's really, really sad.


Sounds like Stargate Universe: A bunch of people trapped on an ancient 
spaceship of exploration...but to make that concept "interesting" the 
writers had to make every damn character on the show a certifiable drama 
queen. Unsurprisingly, dead after only two seasons - a record low for 
Stargate. Really looking forward to the movie sequels though (as well as the 
new SG-1/Atlantis movies that, I *think*, are still in the works).

Feb 18 2011

Walter Bright <newshound2 digitalmars.com> writes:

Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the originals,
 but for the most part, they weren't huge. I suspect that the real trouble
 comes in when you read the books (which I haven't).


Huge? How about it never occurs to Vader to search for Luke at the most obvious 
location in the universe - his nearest living relatives (Uncle Owen)? That's 
just the start of the ludicrousness.

Ok, I have no right to be annoyed, but what an opportunity (to make a truly 
great movie) squandered.

Feb 18 2011

Walter Bright <newshound2 digitalmars.com> writes:

Jonathan M Davis wrote:
 Vader had no clue 


So much for his force!

Feb 18 2011

Max Samukha <maxsamukha spambox.com> writes:

On 02/19/2011 07:39 AM, Walter Bright wrote:
 Jonathan M Davis wrote:
 Vader had no clue


 So much for his force!


How can one expect consistency from a fairytale?

Feb 19 2011

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Jonathan M Davis wrote:
 The prequel movies definitely have some inconsistencies with the 
 originals, but for the most part, they weren't huge. I suspect that 
 the real trouble comes in when you read the books (which I haven't).


 Huge? How about it never occurs to Vader to search for Luke at the most 
 obvious location in the universe - his nearest living relatives (Uncle 
 Owen)? That's just the start of the ludicrousness.
 
 Ok, I have no right to be annoyed, but what an opportunity (to make a 
 truly great movie) squandered.


I nominate the second prequel for the worst movie of all time.
I never saw the third one.

Feb 18 2011

Walter Bright <newshound2 digitalmars.com> writes:

Don wrote:
 I nominate the second prequel for the worst movie of all time.
 I never saw the third one.



You didn't miss a thing.

Feb 18 2011

Jeff Nowakowski <jeff dilacero.org> writes:

On 02/18/2011 08:39 PM, Walter Bright wrote:
 Huge? How about it never occurs to Vader to search for Luke at the most
 obvious location in the universe - his nearest living relatives (Uncle
 Owen)? That's just the start of the ludicrousness.

 Ok, I have no right to be annoyed, but what an opportunity (to make a
 truly great movie) squandered.


Lighten up, Francis. It was a truly great movie, for its time.

Feb 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, February 18, 2011 14:20:03 Nick Sabalausky wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:ijmnp7$433$1 digitalmars.com...
 
 Russel Winder wrote:
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.


 My trick was to lose all interest in SW.


 I must not be enough of a Star Wars guy, I don't know what anyone's talking
 about here. Was it the prequel trilogy that introduced the inconsistencies
 (I still haven't gotten around to episodes 2 or 3 yet), or were there
 things in the original trilogy that I managed to completely overlook? (Or
 something else entirely?)


The prequel movies definitely have some inconsistencies with the originals, but 
for the most part, they weren't huge. I suspect that the real trouble comes in 
when you read the books (which I haven't).

- Jonathan M Davis

Feb 18 2011

"Nick Sabalausky" <a a.a> writes:

"Russel Winder" <russel russel.org.uk> wrote in message 
news:mailman.1748.1297936806.4748.digitalmars-d puremagic.com...
 A word is the natural length of an integer item in the processor.
 It is necessarily machine specific.  cf. DEC-10 had 9-bit bytes
 and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word,
 though addresses were 24-bit.  ix86 follows IBM 8-bit byte and
 32-bit word.


Right. Programmers may have gotten used to "word" being 2-bytes due to 
things like the Win API and x86 Assemblers not updating their usage for the 
sake of backwards compatibility, but in the EE world where the term 
originates, "word" is device-specific and is very useful as such.

 Do not be afraid of the word.  Fear leads to anger.  Anger
 leads to hate.  Hate leads to suffering. (*)


This version is better:
http://media.bigoo.ws/content/image/funny/funny_1309.jpg

Feb 17 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Wed, 16 Feb 2011 06:49:26 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a a.a> said:

 I like "nint".


 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too  
 displeased with 'size_t', but it's true that the 'size_t' feels out of  
 place in D code because of its name.


I second that. word/uword are shorter than ssize_t/size_t and more in line  
with other type names.

I like it.
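
For anyone who wants to try the names out today, here is a minimal sketch of the proposal done as plain user-level aliases. `word`, `uword` and the `distance` helper are hypothetical names from this thread, not anything in druntime or Phobos; the sketch assumes `ptrdiff_t` is the signed counterpart meant by `ssize_t`.

```d
import std.stdio;

// Hypothetical aliases from the proposal above.
alias size_t uword;    // unsigned: exactly the same type as size_t
alias ptrdiff_t word;  // signed counterpart

static assert(is(uword == size_t));
static assert(word.sizeof == uword.sizeof);

/// Signed distance between two indices; the result may be negative,
/// which is why the signed alias is used for the return type.
word distance(uword start, uword end)
{
    return cast(word)(end) - cast(word)(start);
}

void main()
{
    int[] data = [10, 20, 30];
    uword len = data.length;            // .length is a size_t, so no cast is needed
    writeln(len, " ", distance(2, 0));  // prints "3 -2"
}
```

Because an alias is only another name for the same type, code using `uword` mixes freely with code using `size_t`.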

Feb 16 2011

Russel Winder <russel russel.org.uk> writes:

<minor-rant>

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty
 useless.


Sounds like people have been living with 8- and 16-bit processors for
too long.

A word is the natural length of an integer item in the processor.  It is
necessarily machine specific.  cf. DEC-10 had 9-bit bytes and 36-bit
word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were
24-bit.  ix86 follows IBM 8-bit byte and 32-bit word.

The really interesting question is whether on x86_64 the word is 32-bit
or 64-bit.

 The whole concept of "machine word" seems very archaic and incorrect to
 me anyway. It assumes that the data registers and address registers are
 the same size, which is very often not true.


Machine words are far from archaic, even on the JVM, if you don't know
the length of the word on the machine you are executing on, how do you
know the set of values that can be represented?  In floating point
numbers, if you don't know the length of the word, how do you know the
accuracy of the computation?

Clearly data registers and address registers can be different lengths,
it is not the job of a programming language that compiles to native code
to ignore this and attempt to homogenize things beyond what is
reasonable.

If you are working in native code then word length is a crucial property
since it can change depending on which processor you compile for.
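
A small D sketch of that point, assuming a typical flat-memory target. `bitsOf` and `targetDescription` are illustrative names, not library symbols. D fixes the width of int and long, but size_t follows whatever target the code is compiled for:

```d
import std.stdio;

/// Illustrative helper: width of a type in bits.
enum bitsOf(T) = T.sizeof * 8;

// D_LP64 is predefined by the compiler when pointers are 64 bits wide.
version (D_LP64)
    enum targetDescription = "64-bit pointers";
else
    enum targetDescription = "pointers narrower than 64 bits";

// Fixed by the language, independent of the target.
static assert(bitsOf!int == 32 && bitsOf!long == 64);

// Assumption: flat address space, where size_t and pointers share a width.
static assert(size_t.sizeof == (void*).sizeof);

void main()
{
    // The printed values depend on the processor the program was built for.
    writefln("%s: size_t has %s bits, size_t.max = %s",
             targetDescription, bitsOf!size_t, size_t.max);
}
```

Note that none of this says anything about the width of the data registers, only about addressing.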

 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was
 only 8 bits, yet size_t was definitely 16 bits.


The 8051 was only surpassed a couple of years ago by ARMs as the most
numerous processor on the planet.  8-bit processors may only have had
8-bit ALUs -- leading to an hypothesis that the word was 8-bits -- but
the word length was effectively 16-bit due to the hardware support for
multi-byte integer operations.

 It's quite plausible that at some time in the future we'll get a machine
 with 128-bit registers and data bus, but retaining the 64 bit address
 bus. So we could get a size_t which is smaller than the machine word.

 In summary: size_t is not the machine word.


Agreed !

As long as the address bus is less wide than an integer, there are no
apparent problems using integers as addresses.  The problem comes when
addresses are wider than integers.  A good statically-typed programming
language should manage this by having integers and addresses as distinct
sets.  C and C++ have led people astray.  There should be an appropriate
set of integer types and an appropriate set of address types and using
one from the other without active conversion is always going to lead to
problems.
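
One way to picture "integers and addresses as distinct sets" is a wrapper type whose conversions have to be spelled out. This is only a sketch in D under my own assumptions: `Address`, `fromInteger` and `toInteger` are hypothetical names, and nothing like this exists in the standard library.

```d
import std.stdio;

/// Hypothetical wrapper: an address is not an integer, so it gets its own type.
struct Address
{
    private size_t raw;

    /// Conversion from an integer must be requested by name.
    static Address fromInteger(size_t value)
    {
        return Address(value);
    }

    /// Conversion back to an integer is equally explicit.
    size_t toInteger() const
    {
        return raw;
    }

    /// Address + byte offset is meaningful; Address + Address is not defined.
    Address opBinary(string op : "+")(ptrdiff_t offset) const
    {
        return Address(raw + offset);
    }
}

void main()
{
    auto base = Address.fromInteger(0x1000);
    auto next = base + 16;
    writefln("0x%X", next.toInteger()); // prints "0x1010"

    // Neither of these is accepted by the compiler:
    static assert(!__traits(compiles, base + base));
    static assert(!__traits(compiles, { size_t n = base; }));
}
```

The integer width and the address width can then change independently without the two silently mixing.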

Do not be afraid of the word.  Fear leads to anger.  Anger leads to
hate.  Hate leads to suffering. (*)

</minor-rant>

(*) With apologies to Master Yoda (**) for any misquote.

(**) Or more likely whoever his script writer was.
-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

Feb 17 2011

Russel Winder <russel russel.org.uk> writes:

On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
 Do not be afraid of the word.  Fear leads to anger.  Anger leads to
 hate.  Hate leads to suffering. (*)


 (*) With apologies to Master Yoda (**) for any misquote.


 "Luke, trust your feelings!" -- Oggie Ben Doggie

 Of course, expecting consistency from Star Wars is a waste of time.


"What -- me worry?"  Alfred E Newman  (*)

Star Wars is like Dr Who: you expect revisionist history in every
episode.  I hate an inconsistent storyline, so the trick is to assume
each episode is a completely separate story unrelated to any other
episode.


(*) Or whoever http://en.wikipedia.org/wiki/Alfred_E._Neuman

Feb 17 2011
