digitalmars.D - "native integer" - best practice..

Stephen Waits <steve waits.net> writes:
Hi all,

Like lots of you, for our portable C++ stuff, we use our own set of 
typedefs for ints and floats: uint32, int32, uint16, and so on.

However, we only use these types when we actually require a specific 
size.  If, for example, we just need a loop counter or an array index, 
we always use "int" or "unsigned int" because we can be (fairly) certain 
that this will be the machine's "native" type and it won't have to go 
through some extra hoops to access it on, say, a 64 bit machine.  These 
sorts of unnatural accesses can add up to quite a few cycles.

[The sad part of this, in C++, is that we can only be "fairly certain" 
as I stated above.]

So, in D, we have these types:

http://www.digitalmars.com/d/type.html

Which are absolutely great, because in C/C++ we never REALLY knew what 
size anything was going to be - so that's wonderfully predictable now, 
and good.  [though the "it may be bigger on some platforms" thing is a 
bit uncomfortable]

But what would you use when you don't need something size specific, but 
instead, just want the most natural integer or floating point type for 
the target machine?

Thanks,
Steve
Jul 10 2004
J Anderson <REMOVEanderson badmama.com.au> writes:
You can simply use another typedef. However, for things like loops, 32-bit
counters are still going to be more efficient (for caching reasons) on 64-bit
machines for a long time. 64-bit machines are more about a larger addressable
memory space than anything else.

-- 
-Anderson: http://badmama.com.au/~anderson/
Jul 11 2004
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
Unless I'm misremembering (in which case, apologies), Walter has said that size_t
and ptrdiff_t are the answer. That's fine, but for the names. I'd prefer
something that specifically connotes the "ambient platform integer" aspect, but
the hard part is coming up with a name. apint doesn't really grab one, does it?
Jul 11 2004
teqDruid <me teqdruid.com> writes:
How about machine-integer... "mint"?

On Sun, 11 Jul 2004 17:08:40 +1000, Matthew wrote:

 Unless I'm mis-remembering - in case which apologies - Walter's said that
 size_t and ptrdiff_t are the answer. That's fine, but for the names. I'd
 prefer something that specifically connotes the ambient platform integer
 aspect to it, but it's coming up with a name. apint doesn't really grab
 one, does it?

Jul 11 2004
Sigbjørn Lund Olsen <sigbjorn lundolsen.net> writes:
Matthew wrote:
 Unless I'm mis-remembering - in case which apologies - Walter's said that size_t
 and ptrdiff_t are the answer. That's fine, but for the names. I'd prefer
 something that specifically connotes the ambient platform integer aspect to it,
 but it's coming up with a name. apint doesn't really grab one, does it?

aint/uaint? (that's what I've used - "address" integer)

apint (to me) sounds like arbitrary precision integer, but that's because I
did a numerical analysis program using a C++ lib called apfloat :-o

My main qualm is still the lack of a separate unsigned keyword.

Cheers,
Sigbjørn Lund Olsen
Jul 11 2004
Antti Sykäri <jsykari gamma.hut.fi> writes:
In article <ccqror$8c3$1 digitaldaemon.com>, Sigbjørn Lund Olsen wrote:
 Matthew wrote:
 apint doesn't really grab one, does it?

 apint (to me) sounds like arbitrary precision integer, but that's because I
 did a numerical analysis program using a C++ lib called apfloat :-o

Sounds like "a pint" to me... Or something resembling apes.

-Antti

-- 
I will not be using Plan 9 in the creation of weapons of mass destruction
to be used by nations other than the US.
Jul 12 2004
Sigbjørn Lund Olsen <sigbjorn lundolsen.net> writes:
Antti Sykäri wrote:
 Sounds like "a pint" to me...

Hidden benefits!

Cheers,
Sigbjørn Lund Olsen
Jul 12 2004
Andy Friesen <andy ikagames.com> writes:
Matthew wrote:
 Unless I'm mis-remembering - in case which apologies - Walter's said that size_t
 and ptrdiff_t are the answer. That's fine, but for the names. I'd prefer
 something that specifically connotes the ambient platform integer aspect to it,
 but it's coming up with a name. apint doesn't really grab one, does it?

I'm sure this has been argued before, but I'll say it anyway:

This should just be plain old vanilla 'int'. It's what is almost always
needed, and the name connotes nothing more than a signed, whole number.

Sized types should have the size as part of their name: int8, int32, int64,
etc. This obviates ambiguity and makes it obvious what the next in line should
be called. (long long long int is a bit, uh, long.)

This does mean breaking a lot of things, but D is still pre-1.0, so there's
still a chance. Placing some aliases for byte, short, etc. in object.d would
keep existing code compiling.

 -- andy
Jul 11 2004
J Anderson <REMOVEanderson badmama.com.au> writes:
Andy Friesen wrote:
 This should just be plain old vanilla 'int'. It's what is almost always
 needed, and the name connotes nothing more than a signed, whole number.
 Sized types should have the size as part of their name: int8, int32,
 int64, etc.

The problem here is that we get stuck in the days of C++. C++ programs have
had many porting problems because people didn't use integer types of a
specific size. If you ensure that int will always be the same size, then these
problems go away. Users who know what they are doing can use the alias
instead.

-- 
-Anderson: http://badmama.com.au/~anderson/
Jul 11 2004
"Carlos Santander B." <carlos8294 msn.com> writes:
"Stephen Waits" <steve waits.net> wrote in message
news:ccqluq$31hp$1 digitaldaemon.com
| But what would you use when you don't need something size specific, but
| instead, just want the most natural integer or floating point type for
| the target machine?

You can always use std.stdint
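
As a rough illustration of what std.stdint offers (a sketch in present-day D syntax, not code from this thread): the module mirrors C's stdint.h, so besides the exact-width names it has "least" and "fast" families. The widths of the "fast" types are platform dependent, which is why the example prints them instead of assuming them.

```d
import std.stdint;
import std.stdio : writefln;

void main()
{
    // Exact-width types: always the stated number of bits.
    static assert(int32_t.sizeof == 4);
    static assert(uint16_t.sizeof == 2);

    // "least" types: the smallest type with at least that many bits.
    static assert(int_least16_t.sizeof >= 2);

    // "fast" types: at least that many bits, possibly wider if the
    // platform's C library considers a wider type quicker to work with.
    static assert(int_fast16_t.sizeof >= 2);
    static assert(uint_fast32_t.sizeof >= 4);

    // The real widths vary by platform, so report them rather than guess.
    writefln("int_fast16_t:  %s bytes", int_fast16_t.sizeof);
    writefln("uint_fast32_t: %s bytes", uint_fast32_t.sizeof);

    // A counter that only needs to reach 500 fits in any 16-bit-or-wider type.
    int_fast16_t count = 0;
    for (int_fast16_t i = 0; i < 500; ++i)
        ++count;
    assert(count == 500);
}
```

Whether the "fast" aliases really are the quickest choice on a given machine is up to the platform's definitions, so treat them as a hint rather than a guarantee.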

-----------------------
Carlos Santander Bernal
Jul 11 2004
"Walter" <newshound digitalmars.com> writes:
"Stephen Waits" <steve waits.net> wrote in message
news:ccqluq$31hp$1 digitaldaemon.com...
 However, we only use these types when we actually require a specific
 size.  If, for example, we just need a loop counter or an array index,
 we always use "int" or "unsigned int" because we can be (fairly) certain
 that this will be the machine's "native" type and it won't have to go
 through some extra hoops to access it on, say, a 64 bit machine.  These
 sorts of unnatural accesses can add up to quite a few cycles.

For a loop index, you should use 'size_t'. This is because the offset to a pointer will increase in size on 64 bit machines. size_t is defined in object.d.
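
A minimal sketch of the size_t advice, written in present-day D syntax (the function name sumAll and the sample data are made up for illustration). The static asserts only restate that size_t tracks the pointer width and is the type of an array's .length.

```d
import std.stdio : writeln;

// Sum an array using a size_t index, so the index has the same width as
// the array's length on both 32-bit and 64-bit targets.
long sumAll(const int[] values)
{
    long total = 0;
    for (size_t i = 0; i < values.length; ++i)
        total += values[i];
    return total;
}

void main()
{
    // size_t is as wide as a pointer on the target platform...
    static assert(size_t.sizeof == (void*).sizeof);
    // ...and it is the type of an array's length.
    static assert(is(typeof(int[].init.length) == size_t));

    writeln(sumAll([1, 2, 3, 4])); // prints 10
}
```
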
 Which are absolutely great, because in C/C++ we never REALLY knew what
 size anything was going to be - so that's wonderfully predictable now,
 and good.  [though the "it may be bigger on some platforms" thing is a
 bit uncomfortable]

 But what would you use when you don't need something size specific, but
 instead, just want the most natural integer or floating point type for
 the target machine?

Good question. And I think the answer is, don't worry about it. Use the
minimum size that gets you the precision you need. Going beyond that is
premature optimization; it's hopeless to try to optimize code for a machine
whose characteristics one knows nothing about. The goals of portability and
optimization are orthogonal.
Jul 11 2004
Stephen Waits <steve waits.net> writes:
Walter wrote:
 For a loop index, you should use 'size_t'. This is because the offset to a
 pointer will increase in size on 64 bit machines. size_t is defined in
 object.d.

Perfect, thank you.

 Good question. And I think the answer is, don't worry about it. Use the
 minimum size that gets you the precision you need. Going beyond that is
 premature optimization; it's hopeless to try and optimize code for a machine
 one has no idea what the characteristics of are. The goals of portability
 and optimization are orthogonal.

They aren't necessarily orthogonal in my opinion. Say, for instance, I follow
your advice, but then we later ("optimize last" and all) find that on one
platform we must use different types. That means on that platform we'll end up
with conditional code, something that breaks portability and/or fuglies up the
code.

This, of course, is a situation that should only really matter in inner loops
or similar, and the "native integer type" (size_t) solves my problem perfectly.

That said, another type such as "mint", or whatever others have suggested, is
IMO worth considering. BUT, you've said in the past that we really shouldn't
have two ways to do ONE thing, something I really agree with.

I guess I'm saying, "size_t" is fine! I'd rather you spend time with Norbert's
multi-dim array stuff than mess with this kinda stuff :)

Thanks,
Steve
Jul 11 2004
Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Stephen Waits wrote:
 But what would you use when you don't need something size specific, but 
 instead, just want the most natural integer or floating point type for 
 the target machine?

I think that we need to be careful here; the truth of the matter is that you
DO care about the size of your ints, at least within a certain range. For
instance, if you have a loop that has 500 iterations, you might say, "I don't
care about the size of my int" ... until you have to run on a machine where
8 bit ints are natural.

So, I would like to reconstruct your argument: "There are many times where we
care that an integer size cover at least a certain range, and don't care if
the integer is larger than that."

A number of people have suggested syntaxes where you define the ranges and the
compiler figures out the type. I would suggest that we can arrive at what
you're talking about with typedefs, something like (excuse my very-wordy
names):

    typedef <whatever> fastest_uint_min16;
    typedef <whatever> fastest_int_min64;
    etc.

Now you can write portable code that also has a hair of optimization; you know
that you will always have AT LEAST a certain range (so you're safe) but you've
also specified that it's ok to use a larger type if that is faster.

CAVEAT: I know that "fastest" may be hard to define...but this is a start, at
least...
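
One way the <whatever> placeholders could be filled in, sketched in present-day D syntax: the templates FastestUint and FastestInt are hypothetical helpers invented for this example, and they simply lean on the "fast" aliases from std.stdint, so "fastest" means whatever the platform's definitions say it means.

```d
import std.stdint;

// Hypothetical helper: maps a minimum bit count to the platform's
// "fast" unsigned type with at least that many bits.
template FastestUint(size_t minBits)
{
    static if (minBits <= 8)       alias FastestUint = uint_fast8_t;
    else static if (minBits <= 16) alias FastestUint = uint_fast16_t;
    else static if (minBits <= 32) alias FastestUint = uint_fast32_t;
    else static if (minBits <= 64) alias FastestUint = uint_fast64_t;
    else static assert(false, "no built-in unsigned type is that wide");
}

// Hypothetical helper: the signed counterpart.
template FastestInt(size_t minBits)
{
    static if (minBits <= 8)       alias FastestInt = int_fast8_t;
    else static if (minBits <= 16) alias FastestInt = int_fast16_t;
    else static if (minBits <= 32) alias FastestInt = int_fast32_t;
    else static if (minBits <= 64) alias FastestInt = int_fast64_t;
    else static assert(false, "no built-in signed type is that wide");
}

// The wordy names from the post, expressed as aliases.
alias fastest_uint_min16 = FastestUint!16;
alias fastest_int_min64  = FastestInt!64;

// The guarantee is a lower bound only; the type may be wider.
static assert(fastest_uint_min16.sizeof >= 2);
static assert(fastest_int_min64.sizeof >= 8);

void main()
{
    // 500 iterations are safe: the counter has at least 16 bits.
    fastest_uint_min16 count = 0;
    foreach (i; 0 .. 500)
        ++count;
    assert(count == 500);
}
```

The static asserts capture the "AT LEAST a certain range" part of the argument; nothing here measures speed.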
Jul 12 2004
J Anderson <REMOVEanderson badmama.com.au> writes:
Russ Lewis wrote:
 A number of people have suggested syntaxes where you define the ranges and
 the compiler figures out the type.

Or we could go the Ada route, which allows you to specify the range of the
data type. We don't care what the internals are. That has so many other
advantages, I don't know why it never made it into D.

However, I think that getting the largest efficient type for the current
system is mainly useful for things like copying large chunks of data, not for
iterating through loops (unless it's used in the copy, of course). In these
cases you still need to know the size of the data type you're dealing with so
that you can trim the algorithm off at the edges of the block.

-- 
-Anderson: http://badmama.com.au/~anderson/
Jul 13 2004
"me" <memsom interalpha.co.uk> writes:
 Or we could go the ada root, which allows you to specify the range of
 the data type.   We don't care what the internals are.  That has so many
 other advantages, I don't know why it never made it into D.

There's something to be said for this system. Pascal uses it too, for
example:

    type myint = 1..10;     //int with 10 elements numbered 1 - 10
    type myint2 = -1..255;  //byte alike with a single negative element

The compiler uses the smallest native type that the range fits into.

Pascal also has:

byte - 8bit unsigned int
word - 16bit unsigned int
longword - 32bit unsigned int

also the less well named:

shortint - 8bit signed int
smallint - 16bit signed int
longint - 32bit signed int
int64 - 64bit signed integer (Delphi 4 onwards)

But also:

integer - native signed int (16bit on Win3.1, 32bit for Win32)
cardinal - native unsigned int (16bit on Win3.1, 32bit for Win32)

The last two change with each platform, in line with the processor.

Matt
Jul 16 2004
"Matthew Wilson" <admin.hat stlsoft.dot.org> writes:
Isn't this something that can be done by a library?
Jul 17 2004
J Anderson <REMOVEanderson badmama.com.au> writes:
Matthew Wilson wrote:
 Isn't this something that can be done by a library?

I can't see it being done neatly and efficiently by a library, particularly
with static type checking, being able to use ranges as parameters to
standard arrays, etc.

-- 
-Anderson: http://badmama.com.au/~anderson/
Jul 17 2004
"Matthew Wilson" <admin.hat stlsoft.dot.org> writes:
"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:cdatag$1788$1 digitaldaemon.com...
 I can't see it being done neatly+efficiently by a library.  Particularly
 with static type checking, being able to use ranges as parameters to
 standard arrays, ect...

Please give more info. Can you show a couple of examples that support your case?
Jul 17 2004
"me" <memsom interalpha.co.uk> writes:
 I can't see it being done neatly+efficiently by a library.  Particularly
 with static type checking, being able to use ranges as parameters to
 standard arrays, ect...

 Please give more info. Can you show a couple of examples that support your case?

Excuse the Pascal...

    //define a subscript
    type myrange = -1..255; //new type defined - compiler will use the smallint to store

    //use this for your array
    type myarray = array[myrange] of string; //257 strings

    //create a var using array
    var message_text: myarray = ('error', ...etc...., 'another string');

Pascal also allows:

    type myenum = (meError, meUp, meDown, meLeft, meRight, meUndefined); //meError's ordinal value is 0..
    type myenumarray = array[myenum] of sometype;

another example... using the 'in' operator to test if a value is 'in' a
range/set.

    type myrange = 0..16;
    var t = 15;
    if (t in myrange) then ; //do something

I'm not clear how this could be cleanly done in a library...

Matt
Jul 18 2004
J Anderson <REMOVEanderson badmama.com.au> writes:
Matthew Wilson wrote:
 Please give more info. Can you show a couple of examples that support your case?

Something like:

    range tank as int = 1..12;
    range halftank as tanksize = 1..6;

    tank tank1;
    halftank tank2 = 4;

    tank1 = tank2; //ok
    tank2 = tank1; //compile time error

    tank2 [halftank] myarray;
    myarray[tank2] = 8; //compile time error because of assignment
    myarray[tank1] = 4; //compile time error because of index

    tank2 [halftank] myarray2;
    myarray[halftank] = myarray2[halftank]; //copy
    ...

    void GetRange (range r)
    {
        return array[r];
    }

I can't see how all this can be done neatly with a library at compile-time
with compile-time checks.

-- 
-Anderson: http://badmama.com.au/~anderson/
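
For comparison, here is a rough sketch of how far a library type can get with the tank/halftank assignments, using templates from present-day D that were not available in this form when the thread was written. Ranged, Storage, and get are hypothetical names made up for the example. It covers only part of the wish list: range-to-range assignment is checked at compile time and the storage type is picked from the bounds, but plain integer values are still checked at run time, and nothing here handles ranges as array index types.

```d
// Hypothetical library type: an integer restricted to lo .. hi inclusive.
struct Ranged(long lo, long hi)
{
    static assert(lo <= hi, "empty range");

    // Pick the smallest built-in signed type that holds every value.
    static if (lo >= byte.min && hi <= byte.max)        alias Storage = byte;
    else static if (lo >= short.min && hi <= short.max) alias Storage = short;
    else static if (lo >= int.min && hi <= int.max)     alias Storage = int;
    else                                                alias Storage = long;

    private Storage value = cast(Storage) lo;

    this(long v) { opAssign(v); }

    // Plain integers are checked at run time (assert is compiled out
    // when building with -release).
    void opAssign(long v)
    {
        assert(v >= lo && v <= hi, "value out of range");
        value = cast(Storage) v;
    }

    // Another Ranged is accepted only if its bounds fit inside ours;
    // otherwise no overload matches and compilation fails.
    void opAssign(long olo, long ohi)(Ranged!(olo, ohi) other)
        if (olo >= lo && ohi <= hi)
    {
        value = cast(Storage) other.get;
    }

    long get() const { return value; }
}

void main()
{
    alias Tank = Ranged!(1, 12);
    alias HalfTank = Ranged!(1, 6);

    Tank tank1;
    HalfTank tank2 = 4;

    tank1 = tank2;                                      // ok: 1..6 fits in 1..12
    static assert(!__traits(compiles, tank2 = tank1));  // rejected at compile time
    assert(tank1.get == 4);

    // -1..255 does not fit in a byte, so 16-bit storage is chosen.
    static assert(is(Ranged!(-1, 255).Storage == short));
}
```

Whether this counts as "neat" is a matter of taste; the sketch only shows that the two assignment checks from the example do not strictly need language support.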
Jul 19 2004