
D - Ideas, thoughts, and criticisms, part three. Naming of "int" and friends.

Antti Sykäri <jsykari cc.hut.fi> writes:
About integer and other types, their names.

I had something else to say, too, but I forgot.

References are at the bottom, as usual. Now to the point.


I noticed that in D, there are well-defined sizes for the common integer
types.

byte     8 bits
short   16 bits
int     32 bits
long    64 bits
cent   128 bits

I also noticed that the new 64-bit processor from AMD may be coming out
as early as this year, and the Itanium has been around for some time
already. Not to mention the Alpha and other 64-bit processors commonly in
use. (Or on paper -- Knuth's MMIX (see knuth:mmix) comes to mind for
some reason ;)

Anyway, the trend seems to be heading in the direction that people start
to use the 64-bit registers more and more, and soon nobody will be using
the 32-bit registers any more. This might happen within a few years. C
compilers would then probably start using "int" as a 64-bit data type and
make "short" 32-bit instead.

Well, that's the reason the integral types in C were designed to be
upwards-compatible.

Let's take some perspective.

In the year 2015, I don't want to teach my children the basics of "D"
like this:

'Now, if you want to define an integer on this low-end 256-bit machine,
you gotta use this "übercent" keyword. And unsigned is "uübercent". No,
you cannot use "int" for integer like in C, because D was not designed
to be upwards compatible, and now the D standards committee has to invent
new names for every new, bigger integer type...'

This is not really the future I'd like to see. Don't look back at the
past and try to mimic C; look into the future.

Please don't use the word "int" to represent "a 32-bit integer"; instead
make it simply represent "an integer" (of a range which is suitable for
the machine at hand - that is, an int would be 32 bits on a 32-bit
machine, 64 bits on a 64-bit machine, etc.)

And please don't use the word "long" to represent "a 64-bit integer",
since in a few years 64 bits won't be long at all.

I'd use the following conventions for well-defined integers:

int8, uint8 (or byte, if you like- I do, actually)
int16
int32
int64
int128
...etc, to be continued. Hopefully. (And I think that now that we are
going to get a lot more parallelism in processors, even bigger registers
will emerge soon - think about MMX technology and its successors)

also:
float32
float64
float80
(and in the future, float128?)

Actually, I'd even prefer "i8, i16, i32, i64" etc., since that's quicker
to type. But it might make the language look more obfuscated.

On the char issue I have nothing to say at the moment; chars are a
problematic matter. (But wchar seems like something nobody will ever use
anyway - at least if it's named "wchar". On the other hand, if every
"char" were a 16/32-bit one, everybody would just use byte instead.)

Oh, actually, hey.

Use just an alias 'char', defined as whatever the proper default is on
the platform - and then define char8, char16 and char32. Right?

(I think this approach might suck, because there's no single "char type"
on a platform. Maybe if they could be converted implicitly somehow...
Maybe...)

Antti.


(By the way,

I was recently reading a paper about the development of the C language
(see ritchie:c-devel). (I actually found it via Google, while searching
for something about the history of the switch statement.) I bumped into
the following passage:

"Sethi [Sethi 81] observed that many of the nested declarations and
expressions would become simpler if the indirection operator had been
taken as a postfix operator instead of prefix, but by then it was too
late to change."

I couldn't get hold of that paper myself - www.acm.org says "NT machine
internal error" or something like that. Maybe one of you will be lucky
enough to know, or find out, what Sethi had in mind.)

References:

(ritchie:c-devel) http://cm.bell-labs.com/cm/cs/who/dmr/chist.html
(knuth:mmix) http://www-cs-faculty.stanford.edu/~knuth/fasc1.ps.gz
Aug 27 2002
Pavel Minayev <evilone omen.ru> writes:
On Wed, 28 Aug 2002 02:28:36 +0000 (UTC) Antti Sykäri <jsykari cc.hut.fi> wrote:

 I'd use the following conventions for well-defined integers:
 
 int8, uint8 (or byte, if you like- I do, actually)
 int16
 int32
 int64
 int128
 ....etc, to be continued. Hopefully. (And I think that now that we are
 going to get a lot more parallelism in processors, even bigger registers
 will emerge soon - think about MMX technology and its successors)
 
 also:
 float32
 float64
 float80
 (and in the future, float128?)

Personally, I like this idea (and even had it earlier). Not sure if someone else will, though.
 Use just an alias 'char', which is defined to whatever is proper default
 on the platform - and then define char8, char16 and char32. Right?
 
 (I think this approach might suck, because there's no single "char type"
 on a platform. Maybe if they could be converted implicitly somehow...
 Maybe...)

They can.
Aug 27 2002
"Sandor Hojtsy" <hojtsy index.hu> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374963858391088 news.digitalmars.com...
 On Wed, 28 Aug 2002 02:28:36 +0000 (UTC) Antti Sykäri <jsykari cc.hut.fi> wrote:

 I'd use the following conventions for well-defined integers:

 int8, uint8 (or byte, if you like- I do, actually)
 int16
 int32
 int64
 int128
 ....etc, to be continued. Hopefully. (And I think that now that we are
 going to get a lot more parallelism in processors, even bigger registers
 will emerge soon - think about MMX technology and its successors)

 also:
 float32
 float64
 float80
 (and in the future, float128?)

Personally, I like this idea (and even had it earlier). Not sure if someone
else will, though.

At least, if "int32", "int64", ... won't become keywords, make them aliases in the standard library by all means. It sucks to always have to alias your own "int16" and "int32" types in C++ if you want to use this style, because of differences in the POSIX standard implementation (of some similar alias concept).

And reserve two more levels of keywords. I don't think you need more than that: D won't live forever, after all.
Aug 28 2002
"Walter" <walter digitalmars.com> writes:
"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:aki8eq$1vdf$1 digitaldaemon.com...
 At least, if this "int32", "int64" ...  won't become a keyword, make them
 aliases in the standard library by all means.

That's probably a reasonable compromise.
 How it sux to always alias you own "int16" and "int32" type in C++ all the
 time, if you want to use this style, because of differences in the POSIX
 standard implementation. (of some similar alias concept)
 And reserve 2 levels more of keywords. I don't think you need more than
 that: D won't live forever, after all.

cent and ucent are reserved for 128 bit integers.
Sep 02 2002
Mac Reiter <Mac_member pathlink.com> writes:
In article <CFN374963858391088 news.digitalmars.com>, Pavel Minayev says...
On Wed, 28 Aug 2002 02:28:36 +0000 (UTC) Antti Sykäri <jsykari cc.hut.fi> wrote:

 I'd use the following conventions for well-defined integers:
 
 int8, uint8 (or byte, if you like- I do, actually)
 int16
 int32
 int64
 int128
 ....etc, to be continued. Hopefully. (And I think that now that we are
 going to get a lot more parallelism in processors, even bigger registers
 will emerge soon - think about MMX technology and its successors)
 
 also:
 float32
 float64
 float80
 (and in the future, float128?)

Personally, I like this idea (and even had it earlier). Not sure if someone else will, though.

Just for the vote, I like it too. Our company made a header for C++ so that we would have similar names available. Having it defined in the language saves having to conditionally compile based on platform, which you can't do in D anyway.

Mac
Aug 28 2002
"Walter" <walter digitalmars.com> writes:
"Mac Reiter" <Mac_member pathlink.com> wrote in message
news:akim55$2eu0$1 digitaldaemon.com...
 Having it defined in the language saves
 having to conditionally compile based on platform, which you can't do in D
 anyway.

Ah, but you can do that. Check out the version declaration and version statement. There is a set of predefined version identifiers for the hosting environment.
Sep 02 2002
"Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374963858391088 news.digitalmars.com...
 On Wed, 28 Aug 2002 02:28:36 +0000 (UTC) Antti Sykäri <jsykari cc.hut.fi> wrote:

 int8, uint8 (or byte, if you like- I do, actually)
 int16
 int32
 int64
 int128
 ....etc, to be continued. Hopefully. (And I think that now that we are
 going to get a lot more parallelism in processors, even bigger registers
 will emerge soon - think about MMX technology and its successors)


 Personally, I like this idea (and even had it earlier). Not sure if someone
 else will, though.

It is a good and sensible idea, it just is not aesthetically pleasing to the eye. Is that relevant? I think so. One of my beefs with Perl is that it just doesn't look good. An example from the Perl book:

    unlink "/tmp/myfile$$";
    $@ && ($@ =~ s/\(eval \d+\) at line (\d+)/$0 . " line " . ($1+$start)/e, die $@);
    exit 0;

Sorry, it just looks like C after being run over by a truck <g>. Not that int16 looks *that* bad, but in general identifiers with digits appended don't look good and are visually confusing when in algebraic expressions.
Sep 02 2002
Hanhua Feng <hanhua cs.columbia.edu> writes:
Hello, I am new here.

This is my suggestion about names of integers and floats.
For integers, we can follow D.E. Knuth's rule,

   byte       8 bits
   wyde      16 bits
   tetra     32 bits
   octa      64 bits
and more:
   biocta   128 bits
   quadocta 256 bits

For floating-point numbers, the names are
   float (or single)   32 bits
   double              64 bits
   triple              96 bits (or 80 bits; it may be aligned to 96 bits)
   quadruple          128 bits (as Sun already named it)

Antti Sykäri wrote:
 About integer and other types, their names.
 
 I had something else to say, too, but I forgot.
 
 References are at the bottom, as usual. Now to the point.
 
 
 I noticed that in D, there are well-defined sizes for the common integer
 types.
 
 byte     8 bits
 short   16 bits
 int     32 bits
 long    64 bits
 cent   128 bits
  .....

Aug 28 2002
"Walter" <walter digitalmars.com> writes:
"Hanhua Feng" <hanhua cs.columbia.edu> wrote in message
news:3D6CEEC1.2050204 cs.columbia.edu...
 Hello, I am new here.

Welcome!
 This is my suggestion about names of integers and floats.
 For integers, we can follow D.E. Knuth's rule,

    byte 8 bits
    wyde 16 bits
    tetra 32 bits
    octa  64 bits
 and more:
    biocta 128 bits
    quadocta 256 bits

Ack! <g>
 For floating numbers, names are
    float(or single)  32 bits
    double         64 bits
    triple         96 bits(or 80 bits, it may be aligned to 96 bits)
    quadruple     128 bits ( as SUN already named it )

D has "extended" right now. I confess I hate it. Perhaps "triple" is a better idea.
Sep 02 2002
Mark Evans <Mark_member pathlink.com> writes:
I often employ names indicating both the sign and bitsize:

int8
uint8
int16
uint16
int32
uint32
int64
uint64
int128
uint128

float32
float64
float80
float96
float128

complex32
complex64
complex80
complex96
complex128

char8utf
char8ascii
char16whatever
..
etc.

Perhaps D could adopt some aliases like these.  They are nice because they are
unambiguous (unlike C ints whose size can vary from machine to machine).

The term "double" is made explicit by the use of 64, which is 32x2.  I really
like having the size indicated in the type name.

You see this sort of thing all the time in cross-platform code and generic APIs
which must cover several compilers or OS's.

Mark
Sep 02 2002
Pavel Minayev <evilone omen.ru> writes:
Mark Evans wrote:

 Perhaps D could adopt some aliases like these.  They are nice because they are
 unambiguous (unlike C ints whose size can vary from machine to machine).
 
 The term "double" is made explicit by the use of 64, which is 32x2.  I really
 like having the size indicated in the type name.
 
 You see this sort of thing all the time in cross-platform code and generic APIs
 which must cover several compilers or OS's.

Putting these in some special module, something like types.d, seems like a good idea to me. Then one can import that module when needed.
Sep 02 2002
Mac Reiter <Mac_member pathlink.com> writes:
Having created an in-house .h file for our C++ work, I can state that it is very
easy to fall back on the good old 'int' when you are coding, even though you
know you were supposed to specify a particular type of int (int32, for
instance).  Picking a consistent, flexible, and extendible type naming system
and building it into the language avoids this problem.  If there is no such
thing as an 'int' to use, then you have to specify what you meant.  I don't mind
having to specify what I meant when I am programming ;)

The usual argument for 'int' in C/C++ is that it is supposed to represent the
fastest integral type on the given architecture.  Fine, but why not have things
like:

fast8  (8bit, on x86)
fast16 (32bit, on x86)
fast32 (32bit, on x86)
fast64 (??bit, on x86)

This way you can state that you are in a hurry, as well as specifying what kind
of range you'll need.

Our internal .h file actually had types for 'fast', 'tight', and 'exact'.
'fast' gave you a fast type for loops, 'tight' gave you the smallest type that
would hold your range (for arrays), and 'exact' gave you exactly what you asked
for (for struct/packet overlays).  This may be overkill for general use, but it
is a workable and explicit system...  For what it's worth, we ended up
shortening the names considerably, so that we had things like F8 (fast 8bit),
UT16 (unsigned tight 16bit), etc.  While it does make the names shorter to type,
I'm not sure yet what it does to readability...  I can say that you either want
highly abbreviated or fully expanded names, because trying to combine them is
just confusing -- tint8 looks like a color, rather than a char; eint16 looks
like an "extended" int16, whatever that would mean.

Mac

In article <al1hjr$2g87$1 digitaldaemon.com>, Pavel Minayev says...
Mark Evans wrote:

 Perhaps D could adopt some aliases like these.  They are nice because they are
 unambiguous (unlike C ints whose size can vary from machine to machine).
 
 The term "double" is made explicit by the use of 64, which is 32x2.  I really
 like having the size indicated in the type name.
 
 You see this sort of thing all the time in cross-platform code and generic APIs
 which must cover several compilers or OS's.

Putting these in some special module, something like types.d, seems like a good idea to me. Then one can import that module when needed.

Sep 03 2002
"Dario" <supdar yahoo.com> writes:
 D has "extended" right now. I confess I hate it. Perhaps "triple" is a
 better idea.

'extended' is indeed too long to type! 'integer' is one character shorter, but it got shortened anyway (we have 'int', in fact!). Why don't you choose 'ext'? People will be encouraged to use that type if it's fast enough to type. Few people use 'long double' in C: it's even two words long!!!

'triple' is pretty, but doesn't suit the definition: 'extended' isn't necessarily 80 bits (or 96); it's the largest representation available in the implementation. Maybe we should have both. Then we won't have to invent new types for implementations with 128-bit floats. (i.e. float:32, double:64, triple:96, ext:128)
Sep 05 2002