www.digitalmars.com         C & C++   DMDScript  

D - Basic Integral Data Types flawed?

reply "Mark T" <mt nospam.com> writes:
I think using the Java standard sizes for integral sizes is a mistake since
D does not need a VM :) and since "D is designed to fit comfortably with a C
compiler for the target system".

I think the D language spec is overly targeted to the IA32/x86 architecture.
Generally, the C programmer uses "int" as the most efficient type for the
CPU (it has been a while but I think "int" on the DEC Alpha is 64 bits). Of
course, there are still plenty of 16 bit CPUs and odd-ball DSPs which
could possibly use D.

The following D types should be modified to match the underlying C types for
the specific target (since you have a single target currently that wouldn't
break much code), then interfacing to existing C code would be very
straightforward.

short
ushort
int
uint
long
ulong

D should introduce the following types (similar to C99 with the "_t"
removed) for those times when you need an exact bit length data type.
  int8 - signed 8 bits
  uint8 - unsigned 8 bits
  int16 - signed 16 bits
  uint16 - unsigned 16 bits
  int32 - signed 32 bits
  uint32 - unsigned 32 bits
  etc

I do embedded programming and use the exact size C99 types quite often.
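
For readers who have not used them, the C99 exact-width types referred to here come from <stdint.h>. The following is only a sketch: the struct and its field names are invented for illustration and do not describe any real device.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout (names are made up). Each field has exactly
   the stated number of bits on every compiler that provides these types. */
struct sensor_frame {
    uint8_t  status;    /* unsigned 8 bits  */
    int8_t   trim;      /* signed 8 bits    */
    uint16_t sequence;  /* unsigned 16 bits */
    int32_t  reading;   /* signed 32 bits   */
};

/* Width in bits of an object that occupies size_in_bytes bytes. */
unsigned bits_of(size_t size_in_bytes)
{
    assert(size_in_bytes <= UINT_MAX / CHAR_BIT);  /* result must fit in unsigned */
    return (unsigned)(size_in_bytes * CHAR_BIT);
}
```

Because the exact-width types have no padding bits, `bits_of(sizeof(uint16_t))` is 16 wherever the type exists, whatever width that compiler gives `short` or `int`.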

Mark
May 02 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Mark T" <mt nospam.com> wrote in message
news:aar9td$3gu$1 digitaldaemon.com...

 I think using the Java standard sizes for integral sizes is a mistake since
 D does not need a VM :) and since "D is designed to fit comfortably with a C
 compiler for the target system".

 I think the D language spec is overly targeted to the IA32/x86 architecture.
 Generally, the C programmer uses "int" as the most efficient type for the
 CPU (it has been a while but I think "int" on the DEC Alpha is 64 bits). Of
 course, there are still plenty of 16 bit CPUs and odd-ball DSPs which
 could possibly use D.

D is not 16-bit.

For 64-bit computers, I think 32-bit int is not any slower than 64-bit, or am I wrong?

Non-fixed size of C data types was (and is) a constant source of bugs and troubles. Just look at the typical "platform.h" of any multi-platform library - you'll see a lot of #defines and typedefs there, just to provide some workaround.

I vote for fixed type sizes.
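
As an illustration of the kind of "platform.h" workaround described above, here is a minimal sketch in C. The names plat_i32 and plat_u32 are invented for this example; real libraries each use their own.

```c
#include <assert.h>
#include <limits.h>

/* Pick whichever built-in type is 32 bits wide on this compiler.
   plat_i32 / plat_u32 are hypothetical names. */
#if UINT_MAX == 0xFFFFFFFFu
typedef int           plat_i32;
typedef unsigned int  plat_u32;
#elif ULONG_MAX == 0xFFFFFFFFul
typedef long          plat_i32;
typedef unsigned long plat_u32;
#else
#error "no 32-bit integer type known for this platform"
#endif

/* Counts the value bits of plat_u32 at run time as a sanity check. */
unsigned plat_u32_bits(void)
{
    plat_u32 v = (plat_u32)-1;   /* all value bits set */
    unsigned n = 0;
    while (v != 0) {
        v >>= 1;
        ++n;
    }
    assert(n == 32);
    return n;
}
```

Every new compiler or CPU means another branch in the #if chain, which is the maintenance burden the post is complaining about.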
May 02 2002
parent Jonathan Cano <jonathan_95060 yahoo.com> writes:
 
 For 64-bit computers, I think 32-bit int is not any slower than 64-bit,
 or am I wrong?

No, you are not.
 Non-fixed size of C data types was (and is) a constant source of bugs and
 troubles. Just look at the typical "platform.h" of any multi-platform
 library - you'll see a lot of #defines and typedefs there, just to provide
 some workaround.
 
 I vote for fixed type sizes.

Me too! This is a major peeve for me.

Another point in the argument for fixed size data types:

+ The programmer really should know the domain of his variable. For example, you would never use "unsigned char i;" for a loop that you know is going to range from 1 to 10,000. Careful programmers should always consider the domain (i.e. the range of values) their variable is allowed to take on. Types without fixed sizes promote sloppy thinking.

Cheers,
--jfc
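
To make the "unsigned char i" example concrete, here is a small C sketch. The function names are invented; the point is only that the counter's domain can be checked against the type's range before the loop is written.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if every value in lo..hi can be held by an unsigned char. */
int domain_fits_uchar(long lo, long hi)
{
    assert(lo <= hi);
    return lo >= 0 && hi <= UCHAR_MAX;
}

/* What an unsigned char actually holds after being assigned `value`:
   conversion to an unsigned type reduces modulo UCHAR_MAX + 1. */
unsigned char store_in_uchar(long value)
{
    return (unsigned char)value;
}
```

With the usual 8-bit char, a counter of type unsigned char can never reach 10,000, so a loop condition such as `i <= 10000` would never become false.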
May 03 2002
prev sibling next sibling parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Mark T" <mt nospam.com> wrote in message
news:aar9td$3gu$1 digitaldaemon.com...
 I think using the Java standard sizes for integral sizes is a mistake since
 D does not need a VM :) and since "D is designed to fit comfortably with a C
 compiler for the target system".

 I think the D language spec is overly targeted to the IA32/x86 architecture.
 Generally, the C programmer uses "int" as the most efficient type for the
 CPU (it has been a while but I think "int" on the DEC Alpha is 64 bits). Of
 course, there are still plenty of 16 bit CPUs and odd-ball DSPs which
 could possibly use D.

What kind of target system are you thinking of?

D is not intended to be compatible with every target that C is; rather, D will be compatible with the target's C environment where D supports that environment. There may be some environments that, consequently, D won't be able to support.

Watch: This is me, not caring.

(IMHO I think ANSI went way too far in trying to make a standard for C that can support every weirdo legacy platform ever made. I'm sorry, but I'm not going to worry about 12 bit, one's complement, descriptor-based machines in which calloc *doesn't* set pointers to NULL and doubles to 0.0.)

 The following D types should be modified to match the underlying C types for
 the specific target (since you have a single target currently that wouldn't
 break much code), then interfacing to existing C code would be very
 straightforward.

 short
 ushort
 int
 uint
 long
 ulong

They *do* match the C types. They just don't have the same name ("int" = "long"; "ulong" = "unsigned long long").
 D should introduce the following types (similar to C99 with the "_t"
 removed) for those times when you need an exact bit length data type.
   int8 - signed 8 bits
   uint8 - unsigned 8 bits
   int16 - signed 16 bits
   uint16 - unsigned 16 bits
   int32 - signed 32 bits
   uint32 - unsigned 32 bits
   etc

 I do embedded programming and use the exact size C99 types quite often.

D has exact-sized types. They just have different names.

(Types with exact-sized type names in D don't solve the problem that, on some platforms, C's "int" is D's "int", but on others, C's "int" may be D's "short".)

--
Richard Krehbiel, Arlington, VA, USA
rich kastle.com (work) or krehbiel3 comcast.net (personal)
May 02 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:aarpct$1o03$1 digitaldaemon.com...
 (IMHO I think ANSI went way too far in trying to make a standard for C that
 can support every weirdo legacy platform ever made.  I'm sorry, but I'm not
 going to worry about 12 bit, one's complement, descriptor-based machines in
 which calloc *doesn't* set pointers to NULL and doubles to 0.0.)

You can see that in some of the postings to the C newsgroups. For instance, look at the bending over backwards to support CPUs with no stack. Apparently, some ancient IBM computer has no stack. I don't see much point in making things more difficult for 99.9999% of the machines out there to accommodate .00001% of them.

I myself have programmed machines with 10 bit bytes and with 18 bit words. But those machines are LONG obsolete, and for good reason.

I once annoyed a number of C purists by suggesting that, for 8 bit architectures, it made sense to make a non-compliant C variant that was adapted to the particular characteristics of, say, the 6502. Their position was that if it was possible to make a compliant C implementation for it, that should be used for all applications. Never mind the horrific inefficiency of it.

I'm much more pragmatic about bending the language to suit the need, not the other way around <g>. For another example, it is just a reality that to write professional C/C++ apps on DOS, you need to use near and far. Yes, that made it non-ANSI. That's life.
May 02 2002
parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aas2e5$2lmm$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:aarpct$1o03$1 digitaldaemon.com...
 (IMHO I think ANSI went way too far in trying to make a standard for C that
 can support every weirdo legacy platform ever made.  I'm sorry, but I'm not
 going to worry about 12 bit, one's complement, descriptor-based machines in
 which calloc *doesn't* set pointers to NULL and doubles to 0.0.)

 You can see that in some of the postings to the C newsgroups. For instance,
 look at the bending over backwards to support CPUs with no stack.
 Apparently, some ancient IBM computer has no stack.

The ancient, obsolete processor you're thinking of may well be the PowerPC! Subroutine calls place the return address in a link register, which, by *convention* *only*, the called function "pushes" onto a software-managed stack referred to by R1.

(I coded IBM 370 mainframe machine code in a former life, and it also has no stack. This machine architecture lives on in the current IBM mainframe lineup.)

--
Richard Krehbiel, Arlington, VA, USA
rich kastle.com (work) or krehbiel3 comcast.net (personal)
May 03 2002
next sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
Richard Krehbiel wrote:
 "Walter" <walter digitalmars.com> wrote in message
 news:aas2e5$2lmm$1 digitaldaemon.com...
look at the bending over backwards to support CPUs with no stack.
Apparently, some ancient IBM computer has no stack.

The ancient, obsolete processor you're thinking of may well be the PowerPC! Subroutine calls place the return address in a link register, which, by *convention* *only*, the called function "pushes" onto a software-managed stack referred to by R1.

That's not all that unusual. I believe the Hitachi SH architecture does the same thing.

I'm not sure what it means to "have no stack" -- all you need is a chunk of memory and equivalent functionality to an address register with inc/dec. I suppose if you have no address registers, or none that are preserved across function calls by convention, then you could be said to have no stack, but you could just reserve a word of memory to hold a stack pointer. Push and pop or call and return just become macro sequences in these cases.

Besides PPC, there are architectures which have indirect-with-predecrement or -with-postincrement which have no dedicated stack pointer -- just conventions.[1]

-Russell B

[1] I may be completely misremembering, but some even use a general register as the Program Counter/Instruction Pointer, meaning that the same circuitry that does "*p++" is doing instruction reads, and the same addressing modes available with address registers are available in PC-relative form.
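
The idea of a stack built from nothing more than a chunk of memory, one reserved word and a pair of macros can be sketched in C as follows. All names and the 64-word size are invented for the illustration; a real calling convention would do the same thing in a few machine instructions.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 64

/* A plain block of memory plus one reserved word acting as the stack
   pointer. The stack grows downward and is empty when the pointer
   equals STACK_WORDS. */
static long   stack_memory[STACK_WORDS];
static size_t stack_pointer = STACK_WORDS;

/* "Push" is a predecrement store, "pop" is a postincrement load. */
#define PUSH(value) (stack_memory[--stack_pointer] = (value))
#define POP()       (stack_memory[stack_pointer++])

void push_checked(long value)
{
    assert(stack_pointer > 0);            /* overflow is caught only by software */
    PUSH(value);
}

long pop_checked(void)
{
    assert(stack_pointer < STACK_WORDS);  /* underflow likewise */
    return POP();
}
```

Nothing in the hardware enforces the discipline here; it is a stack purely by convention, which is the point being made about link-register architectures.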
May 03 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Russell Borogove" <kaleja estarcion.com> wrote in message
news:3CD2C330.7050908 estarcion.com...
 [1] I may be completely misremembering,
 but some even use a general register as the
 Program Counter/Instruction Pointer, meaning that
 the same circuitry that does "*p++" is doing
 instruction reads, and the same addressing modes
 available with address registers are available
 in PC-relative form.

You remember correctly, that was the PDP-11. The 11 was a marvelously designed 16 bit instruction set, so marvelous that many later CPUs bragged about being "like" the 11, even though they screwed up their design.
May 03 2002
parent Russell Borogove <kaleja estarcion.com> writes:
Walter wrote:
 "Russell Borogove" <kaleja estarcion.com> wrote in message
 news:3CD2C330.7050908 estarcion.com...
 
[1] I may be completely misremembering,
but some even use a general register as the
Program Counter/Instruction Pointer, meaning that
the same circuitry that does "*p++" is doing
instruction reads, and the same addressing modes
available with address registers are available
in PC-relative form.

You remember correctly, that was the PDP-11.

Thought it might be, but I'm getting mistrustful of my memory in my old age. A little IIRC avoids a lot of public humiliation. :) -R
May 04 2002
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:aatrrm$1j1v$1 digitaldaemon.com...
 The ancient, obsolete processor you're thinking of may well be the PowerPC!
 Subroutine calls place the return address in a link register, which, by
 *convention* *only*, the called function "pushes" onto a software-managed
 stack referred to by R1.

It's still a stack.
 (I coded IBM 370 mainframe machine code in a former life, and it also has no
 stack.  This machine architecture lives on in the current IBM mainframe
 lineup.)

Does it emulate a stack?
May 03 2002
parent "Richard Krehbiel" <krehbiel3 comcast.net> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aauksb$2k56$3 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:aatrrm$1j1v$1 digitaldaemon.com...
 The ancient, obsolete processor you're thinking of may well be the PowerPC!
 Subroutine calls place the return address in a link register, which, by
 *convention* *only*, the called function "pushes" onto a software-managed
 stack referred to by R1.

It's still a stack.

Of course. It's just that its proper use is not hardware-enforced.
 (I coded IBM 370 mainframe machine code in a former life, and it also has no
 stack.  This machine architecture lives on in the current IBM mainframe
 lineup.)

Does it emulate a stack?

My code never did; the standard calling conventions did not call for a default stack pointer, and there was no memory space set aside for a stack. And oddly enough, I never used any recursive algorithms...
May 04 2002
prev sibling next sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
Mark T wrote:
 The following D types should be modified to match the underlying C types for
 the specific target

Hold it right there -- some hardware platforms currently support different C implementations with different sizes for the same underlying C type.

Example: some C compilers for 68000 Macs think an int is 16-bit ("most efficient" in terms of the 16-bit bus of older Macs) and some think an int is 32-bit ("most efficient" in that it's the biggest thing the CPU can eat).

There are also lots of C compilers where a command-line option or pragma or incompatible hack selects the int size. What's the "underlying" size of an int there?

-Russell B
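
One common defence when int may be 16 or 32 bits depending on the compiler or its switches is to state the assumption in the source and refuse to build otherwise. This is a sketch of that habit, not something proposed in the thread; the function names are invented.

```c
#include <assert.h>
#include <limits.h>

/* Refuse to build where int is narrower than this code assumes. */
#if INT_MAX < 2147483647
#error "this code assumes int has at least 32 bits"
#endif

/* Width of int on the compiler actually in use (assumes no padding bits). */
unsigned int_width_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* Safe only because of the #error guard above: 100000 does not fit in
   16 bits, so with a 16-bit int this multiplication would overflow. */
int hundred_thousand(void)
{
    int n = 100 * 1000;
    assert(n == 100000);
    return n;
}
```

The guard does not answer the question of what the "underlying" size is; it only makes the build fail loudly when the answer is not the one the code was written for.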
May 03 2002
parent "Walter" <walter digitalmars.com> writes:
"Russell Borogove" <kaleja estarcion.com> wrote in message
news:3CD2BFF2.8040103 estarcion.com...
 There are also lots of C compilers where a command-line
 option or pragma or incompatible hack selects the int
 size.

Yuk. I really don't like the semantics of the compiler being controlled by the command line. It should be done in the source code itself. I'm forced into doing that with the C compiler, but with D I can try to design out such things. I've seen some C programs where it seems much of the logic was transferred into the makefile that was generated by perl scripts - aaarrrggghh. It was *really* hard to figure out what was going on.
May 03 2002
prev sibling next sibling parent reply "Matthew Wilson" <mwilson nextgengaming.com> writes:
I would actually go farther, in suggesting a conservative subset of all the
ideas expressed in this item.

Basically, we could do away with short, int, long, etc, and use
s/uint8/16/32/64/128. Additionally (and please let me know whether or not
this is the case) the char type (whether ANSI or Unicode - again I am not
sure in whether this has been decided) should be a distinct type from any of
the aforementioned 10 integer types, as should bool.

I know this strikes at the C-ness of it all, but I am pretty sure it makes
for simpler coding and better maintainability. In my own C/C++ code I have
for years had types akin to the C99, and the only time I ever have cause to
use one of the "real" types is when defining post-inc/decrement operators
(ie. operator ++(int)).

Whether this is accepted (and I doubt, given the style offense it would
cause to most C/C++-heads), there could surely be these types provided
alongside the non-sized ones?

Alternatively, in Java the types sizes are all strictly defined, and that
causes no problems either. It is pretty easy to remember that int is 32 and
long 64 bits.

Undecidedly, ...

Matthew



"Mark T" <mt nospam.com> wrote in message
news:aar9td$3gu$1 digitaldaemon.com...
 I think using the Java standard sizes for integral sizes is a mistake since
 D does not need a VM :) and since "D is designed to fit comfortably with a C
 compiler for the target system".

 I think the D language spec is overly targeted to the IA32/x86 architecture.
 Generally, the C programmer uses "int" as the most efficient type for the
 CPU (it has been a while but I think "int" on the DEC Alpha is 64 bits). Of
 course, there are still plenty of 16 bit CPUs and odd-ball DSPs which
 could possibly use D.

 The following D types should be modified to match the underlying C types for
 the specific target (since you have a single target currently that wouldn't
 break much code), then interfacing to existing C code would be very
 straightforward.

 short
 ushort
 int
 uint
 long
 ulong

 D should introduce the following types (similar to C99 with the "_t"
 removed) for those times when you need an exact bit length data type.
   int8 - signed 8 bits
   uint8 - unsigned 8 bits
   int16 - signed 16 bits
   uint16 - unsigned 16 bits
   int32 - signed 32 bits
   uint32 - unsigned 32 bits
   etc

 I do embedded programming and use the exact size C99 types quite often.

 Mark

May 09 2002
parent "Pavel Minayev" <evilone omen.ru> writes:
"Matthew Wilson" <mwilson nextgengaming.com> wrote in message
news:abff2a$1i69$1 digitaldaemon.com...

 Basically, we could do away with short, int, long, etc, and use
 s/uint8/16/32/64/128. Additionally (and please let me know whether or not

Walter promised to give us a unit with appropriate typedefs:

    module types;
    typedef byte int8;
    typedef short int16;
    typedef int int32;
    ...

 Alternatively, in Java the types sizes are all strictly defined, and that
 causes no problems either. It is pretty easy to remember that int is 32 and
 long 64 bits.

This is exactly how D works, and I would really prefer it to remain.
May 10 2002
prev sibling parent reply "Matthew Wilson" <mwilson nextgengaming.com> writes:
I would actually go farther, in suggesting a conservative subset of all the
ideas expressed in this item.

Basically, we could do away with short, int, long, etc, and use
s/uint8/16/32/64/128. Additionally (and please let me know whether or not
this is the case) the char type (whether ANSI or Unicode - again I am not
sure in whether this has been decided) should be a distinct type from any of
the aforementioned 10 integer types, as should bool.

I know this strikes at the C-ness of it all, but I am pretty sure it makes
for simpler coding and better maintainability. In my own C/C++ code I have
for years had types akin to the C99, and the only time I ever have cause to
use one of the "real" types is when defining post-inc/decrement operators
(ie. operator ++(int)).

Whether this is accepted (and I doubt, given the style offense it would
cause to most C/C++-heads), there could surely be these types provided
alongside the non-sized ones?

Alternatively, in Java the types sizes are all strictly defined, and that
causes no problems either. It is pretty easy to remember that int is 32 and
long 64 bits.

Undecidedly, ...

Matthew


"Mark T" <mt nospam.com> wrote in message
news:aar9td$3gu$1 digitaldaemon.com...
 I think using the Java standard sizes for integral sizes is a mistake since
 D does not need a VM :) and since "D is designed to fit comfortably with a C
 compiler for the target system".

 I think the D language spec is overly targeted to the IA32/x86 architecture.
 Generally, the C programmer uses "int" as the most efficient type for the
 CPU (it has been a while but I think "int" on the DEC Alpha is 64 bits). Of
 course, there are still plenty of 16 bit CPUs and odd-ball DSPs which
 could possibly use D.

 The following D types should be modified to match the underlying C types for
 the specific target (since you have a single target currently that wouldn't
 break much code), then interfacing to existing C code would be very
 straightforward.

 short
 ushort
 int
 uint
 long
 ulong

 D should introduce the following types (similar to C99 with the "_t"
 removed) for those times when you need an exact bit length data type.
   int8 - signed 8 bits
   uint8 - unsigned 8 bits
   int16 - signed 16 bits
   uint16 - unsigned 16 bits
   int32 - signed 32 bits
   uint32 - unsigned 32 bits
   etc

 I do embedded programming and use the exact size C99 types quite often.

 Mark

May 09 2002
parent reply Karl Bochert <kbochert ix.netcom.com> writes:
On Fri, 10 May 2002 13:40:13 +1000, "Matthew Wilson"
<mwilson nextgengaming.com> wrote:
  would actually go farther, in suggesting a conservative subset of all the
 ideas expressed in this item.
 
 Basically, we could do away with short, int, long, etc, and use
 s/uint8/16/32/64/128. Additionally (and please let me know whether or not
 this is the case) the char type (whether ANSI or Unicode - again I am not
 sure in whether this has been decided) should be a distinct type from any of
 the aforementioned 10 integer types, as should bool.

Get rid of unsigned entirely. It adds nothing but confusion. Its very use implies that overflow behavior is important!

Karl Bochert
May 10 2002
next sibling parent reply "OddesE" <OddesE_XYZ hotmail.com> writes:
"Karl Bochert" <kbochert ix.netcom.com> wrote in message
news:1103_1021047954 bose...
 On Fri, 10 May 2002 13:40:13 +1000, "Matthew Wilson"
 <mwilson nextgengaming.com> wrote:
  I would actually go farther, in suggesting a conservative subset of all the
  ideas expressed in this item.

  Basically, we could do away with short, int, long, etc, and use
  s/uint8/16/32/64/128. Additionally (and please let me know whether or not
  this is the case) the char type (whether ANSI or Unicode - again I am not
  sure in whether this has been decided) should be a distinct type from any of
  the aforementioned 10 integer types, as should bool.

Get rid of unsigned entirely. It adds nothing but confusion. Its very use implies that overflow behavior is important! Karl Bochert

Mmmm....
It doubles the range of a type without any loss!
If a value is *always* positive (the index of an array)
why not express this using an unsigned type?

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
_________________________________________________
Remove _XYZ from my address when replying by mail
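
The "doubles the range" claim can be checked with the 32-bit exact-width types from C99. This is a small sketch; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Largest index each 32-bit type can represent. */
uint32_t largest_unsigned_index(void) { return UINT32_MAX; }  /* 4294967295 */
int32_t  largest_signed_index(void)   { return INT32_MAX;  }  /* 2147483647 */

/* Convert a signed index that is known to be non-negative. */
uint32_t as_unsigned_index(int32_t i)
{
    assert(i >= 0);
    return (uint32_t)i;
}

/* Nonzero if `index` would still fit had it been declared int32_t. */
int fits_in_signed(uint32_t index)
{
    return index <= (uint32_t)INT32_MAX;
}
```

The unsigned maximum is exactly twice the signed maximum plus one, so the sign bit is traded for one extra bit of magnitude.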
May 10 2002
parent reply Karl Bochert <kbochert ix.netcom.com> writes:
On Fri, 10 May 2002 20:39:29 +0200, "OddesE" <OddesE_XYZ hotmail.com> wrote:

 Get rid of unsigned entirely. It adds nothing but confusion.
 Its very use implies that overflow behavior is important!

 Karl Bochert


 
 Mmmm....
 It doubles the range of a type without any loss!

That was important when memory was small.
 If a value is *always* positive (the index of an array)
 why not express this using an unsigned type?
 

I used to feel the same way -- 'unsigned' was a sort of contract with myself. It's probably better (but not good) to use a comment:

    int i;  //an index -- must be positive.

It's true that:

    int i;
    i -= 1;
    arr[i];

will produce odd results, but then so (probably) would:

    unsigned int i;
    i -= 1;
    arr[i];

The more meaningful distinction between 'index' and 'sum' is that the former is (should be) an ordinal. If D used ordinal indexes, then it might be useful to have an 'unsigned' to declare them, but even that would be marginal.

Karl Bochert
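
The two snippets above leave i uninitialized; assuming it starts at 0, the difference between the signed and unsigned cases looks like this in C. The helper names are invented for the sketch.

```c
#include <assert.h>
#include <limits.h>

/* Index reached by stepping back once from i, in each signedness. */
int step_back_signed(int i)
{
    assert(i > INT_MIN);   /* signed overflow would be undefined */
    return i - 1;
}

unsigned step_back_unsigned(unsigned i)
{
    return i - 1u;         /* wraps modulo UINT_MAX + 1, never negative */
}

/* A bounds check rejects both results, but only the signed one is
   recognisable as "went below zero". */
int in_bounds_signed(int i, int length)
{
    return i >= 0 && i < length;
}

int in_bounds_unsigned(unsigned i, unsigned length)
{
    return i < length;
}
```

Starting from 0, the signed version yields -1 and the unsigned version yields UINT_MAX; both are bad array indexes, which is why the post says the unsigned declaration buys little by itself.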
May 10 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Karl Bochert wrote:

 I used to feel the same way -- 'unsigned' was a sort of contract
 with myself. Its probably better (but not good) to use a comment:
    int i;  //an index -- must be positive.

D allows you to make explicit contracts between yourself and the compiler, which I think is a Good Thing. Isn't 'unsigned' just a contract with the compiler that the variable should never go negative?
 Its true that:

      int i;
     i -= 1;
     arr[i];

Frankly, I think that if the compiler can detect that you're going to subtract from 0 on an unsigned number, it should register that as a contract violation.
 will produce odd results, but then so (probably) would:

     unsigned int i;
     i -= 1;
     arr[i];

 The more meaningful distinction between 'index' and 'sum' is that the
 former is (should be) an ordinal. If D used ordinal indexes, then it might be
 useful to have an 'unsigned' to declare them, but even that would be
 marginal.

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]
May 11 2002
parent reply "Stephen Fuld" <s.fuld.pleaseremove att.net> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3CDD170F.80589F76 deming-os.org...
 Karl Bochert wrote:

 I used to feel the same way -- 'unsigned' was a sort of contract
 with myself. Its probably better (but not good) to use a comment:
    int i;  //an index -- must be positive.

 D allows you to make explicit contracts between yourself and the compiler, which
 I think is a Good Thing.  Isn't 'unsigned' just a contract with the compiler that
 the variable should never go negative?

 Its true that:

      int i;
     i -= 1;
     arr[i];

 Frankly, I think that if the compiler can detect that you're going to subtract
 from 0 on an unsigned number, it should register that as a contract violation.

Yes. Walter has already acknowledged the advantages of range variables and IIRC agreed to put them in version 2. Unsigned is essentially a range restriction (>=0) on an integer. (In fact, the ability to eliminate the whole "unsigned" issue and its corresponding syntax is an argument for supporting ranges in version 1.) Without any further syntax, range restrictions are a kind of shortcut for equivalent design by contract constructs, and so any detected violation of such a range restriction should be treated equivalently to an assertion violation.

Note that with additional syntax, range variables become much more useful, but that is a different topic.

One other note. If it wasn't so bloody inconvenient, a language could require ranges on all its integer variables and thus eliminate the whole short/long/int/longlong/double/verylong/doublelong/ultralong/ etc. mess. Just specify the range and let the compiler figure out how much storage it needs.

--
- Stephen Fuld
e-mail address disguised to prevent spam
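
What a range variable would check, and how a compiler might size one, can be approximated by hand in C. Everything below is a hypothetical sketch: the struct, the function names and the 1/2/4/8 byte choices are assumptions, not anything from the D spec.

```c
#include <assert.h>

/* A hand-rolled "range variable": an int allowed to hold only lo..hi. */
struct ranged_int {
    int lo, hi;
    int value;
};

/* Stores v and returns 1 if it is in range; otherwise leaves the value
   untouched and returns 0. A language-level range would raise the
   violation itself, like a failed assertion. */
int ranged_set(struct ranged_int *r, int v)
{
    assert(r != 0 && r->lo <= r->hi);
    if (v < r->lo || v > r->hi)
        return 0;
    r->value = v;
    return 1;
}

/* Smallest of 1, 2, 4 or 8 bytes that can hold every value in lo..hi,
   using unsigned storage when lo >= 0 and two's complement otherwise. */
unsigned storage_bytes(long long lo, long long hi)
{
    assert(lo <= hi);
    if (lo >= 0) {
        if (hi <= 0xFFLL)       return 1;
        if (hi <= 0xFFFFLL)     return 2;
        if (hi <= 0xFFFFFFFFLL) return 4;
        return 8;
    }
    if (lo >= -128LL && hi <= 127LL)               return 1;
    if (lo >= -32768LL && hi <= 32767LL)           return 2;
    if (lo >= -2147483648LL && hi <= 2147483647LL) return 4;
    return 8;
}
```

In this picture, "unsigned" is just the range whose lower bound is 0, and the storage size falls out of the declared bounds instead of being chosen by the programmer.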
May 11 2002
parent reply "Sean L. Palmer" <spalmer iname.com> writes:
 Frankly, I think that if the compiler can detect that you're going to subtract
 from 0 on an unsigned number, it should register that as a contract violation.

 Yes. Walter has already acknowledged the advantages of range variables and
 IIRC agreed to put them in version 2. Unsigned is essentially a range
 restriction (>=0) on an integer.  (In fact, the ability to eliminate the
 whole "unsigned" issue and its corresponding syntax is an argument for
 supporting ranges in version 1) Without any further syntax, range
 restrictions are a kind of shortcut for equivalent design by contract
 constructs and so any detected violation of such a range restriction should
 be treated equivalently to an assertion violation.

Maybe you could have it clamp the value to the limits of the target range instead of just lopping off the top bits, have it lop off the excess value.

Note that this is more like the behavior of casting float to int or int to float (nevermind that mostly the hardware does the remapping)... something is remapping a value of one type into a possible value of the other type. Maybe some information has to be lost... I think the part of the value that goes outside the range should be lost, but it should turn into the most extreme value possible (the closest one can get to the original value) when this happens. For ints, NaN and 0 are the same value, or perhaps you could use 0x80000000. For floats, I'd have it measure the range and if necessary when converting if it's beyond the capabilities of the target int, it could turn into MAXINT (-0x80000000 thru 0x7FFFFFFF) instead of the low 32 bits of the integer representation of the float. Or one could use the old cast behavior to get bits converted in the fastest way possible (usually by lopping off extra hi bits... which is useful for carving up data but loses a different kind of information, actually loses the most important part of the information in most cases)

You could do for instance

    int a = 65536;
    short b = saturate(short) a; // value is 32767
    short c = cast(short) a;    // value is 0

or even

    enum tristate { false, maybe, true };
    tristate res1 = saturate(tristate) -45;  // value is false
    tristate res2 = saturate(tristate) 57;  // value is true
    tristate res3 = saturate(tristate) 1.1f;  // value is maybe

But actually it'd be nice if you could establish an attribute which lets the compiler know a particular variable is always needing to be clamped or saturated to the maximum range, that way you wouldn't need the cast, it'd be implicit. Kinda like const or volatile in C++.

Does D do same syntax for dynamic casts as for static casts? i.e. if (cast(ObjDerived)myobj) I seem to recall it uses special properties like "a.IsDerived(B)" or something. Maybe saturated could be one of those special properties.

Perhaps we can have some compile time mojo that works like so

    byte a;
    a.variable_saturated() = true;
    int x;
    a = x;   // saturates
    a.variable_saturated() = false;
    a = x;  // doesn't saturate

variable_saturated would have to be assigned a compile-time constant as its value, that would be our restriction. tweaking this bit which actually exists in the compiler would actually change the subsequent semantic processing. I don't know if you want your semantic processor to have state, or have that state modified by the program being compiled like this.

Another alternative is to use the same method public/private/etc use.

    struct pixelRGBA
    {
      saturated:
        ubyte R,G,B,A;
    };

Anyone like this?

Sean
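
The proposed saturate(...) cast does not exist in C either, but its behaviour can be sketched with ordinary functions. The names below are invented, the enumerators are prefixed with TS_ to avoid clashing with C's own true/false, and mapping NaN to 0 is just one of the options mentioned in the post.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the nearest representable value, in the spirit of saturate(short). */
int16_t saturate_to_i16(int32_t v)
{
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    assert(v >= INT16_MIN && v <= INT16_MAX);  /* now representable */
    return (int16_t)v;
}

/* The "old cast" behaviour: keep only the low 16 bits. Returned as an
   unsigned type so the result is fully defined by the C standard. */
uint16_t low_16_bits(int32_t v)
{
    return (uint16_t)v;
}

/* Float to 32-bit int with clamping. A plain cast of an out-of-range
   float is undefined in C, so the range is tested first. */
int32_t saturate_float_to_i32(float f)
{
    if (f != f)             return 0;          /* NaN: assumed to map to 0 */
    if (f >= 2147483648.0f) return INT32_MAX;
    if (f <= -2147483648.0f) return INT32_MIN;
    return (int32_t)f;                         /* in range: truncates toward zero */
}

enum tristate { TS_FALSE, TS_MAYBE, TS_TRUE };

/* Clamp a number onto the three enumerators. */
enum tristate saturate_to_tristate(float f)
{
    int32_t v = saturate_float_to_i32(f);
    if (v < TS_FALSE) return TS_FALSE;
    if (v > TS_TRUE)  return TS_TRUE;
    return (enum tristate)v;
}
```

With these, 65536 saturates to 32767 while its low 16 bits are 0, and -45, 57 and 1.1f land on false, true and maybe respectively, matching the values given in the post.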
 Note that with additional syntax, range variables become much more useful,
 but that is a different topic.

Cool
 One other note.  If it wasn't so bloody inconvenient, a language could
 require ranges on all its integer variables and thus eliminate the whole
 short/long/int/longlong/double/verylong/doublelong/ultralong/ etc. mess.
 Just specify the range and let the compiler figure out how much storage
 it needs.

Good idea.
May 12 2002
parent "Stephen Fuld" <s.fuld.pleaseremove att.net> writes:
"Sean L. Palmer" <spalmer iname.com> wrote in message
news:abl6qj$1fd5$1 digitaldaemon.com...
 Frankly, I think that if the compiler can detect that you're going to subtract
 from 0 on an unsigned number, it should register that as a contract violation.

 Yes. Walter has already acknowledged the advantages of range variables and
 IIRC agreed to put them in version 2. Unsigned is essentially a range
 restriction (>=0) on an integer.  (In fact, the ability to eliminate the
 whole "unsigned" issue and its corresponding syntax is an argument for
 supporting ranges in version 1) Without any further syntax, range
 restrictions are a kind of shortcut for equivalent design by contract
 constructs and so any detected violation of such a range restriction should
 be treated equivalently to an assertion violation.

 Maybe you could have it clamp the value to the limits of the target range
 instead of just lopping off the top bits, have it lop off the excess value.

 Note that this is more like the behavior of casting float to int or int to
 float (nevermind that mostly the hardware does the remapping)... something
 is remapping a value of one type into a possible value of the other type.
 Maybe some information has to be lost... I think the part of the value that
 goes outside the range should be lost, but it should turn into the most
 extreme value possible (the closest one can get to the original value) when
 this happens.  For ints, NaN and 0 are the same value, or perhaps you could
 use 0x80000000.  For floats, I'd have it measure the range and if necessary
 when converting if it's beyond the capabilities of the target int, it could
 turn into MAXINT (-0x80000000 thru 0x7FFFFFFF) instead of the low 32 bits of
 the integer representation of the float.  Or one could use the old cast
 behavior to get bits converted in the fastest way possible (usually by
 lopping off extra hi bits... which is useful for carving up data but loses a
 different kind of information, actually loses the most important part of the
 information in most cases)

 You could do for instance

 int a = 65536;
 short b = saturate(short) a; // value is 32767
 short c = cast(short) a;    // value is 0

 or even

 enum tristate { false, maybe, true };
 tristate res1 = saturate(tristate) -45;  // value is false
 tristate res2 = saturate(tristate) 57;  // value is true
 tristate res3 = saturate(tristate) 1.1f;  // value is maybe

 But actually it'd be nice if you could establish an attribute which lets

 compiler know a particular variable is always needing to be clamped or
 saturated to the maximum range, that way you wouldn't need the cast, it'd

 implicit.  Kinda like const or volatile in C++.

 Does D do same syntax for dynamic casts as for static casts?  i.e.  if
 (cast(ObjDerived)myobj)  I seem to recall it uses special properties like
 "a.IsDerived(B)" or something.  Maybe saturated could be one of those
 special properties.

 Perhaps we can have some compile time mojo that works like so

 byte a;
 a.variable_saturated() = true;
 int x;
 a = x;   // saturates
 a.variable_saturated() = false;
 a = x;  // doesn't saturate

 variable_saturated would have to be assigned a compile-time constant as

 value, that would be our restriction.  tweaking this bit which actually
 exists in the compiler would actually change the subsequent semantic
 processing.  I don't know if you want your semantic processor to have

 or have that state modified by the program being compiled like this.

 Another alternative is to use the same method public/private/etc use.

 struct pixelRGBA
 {
   saturated:
     ubyte R,G,B,A;
 };

 Anyone like this?

Assuming that range is going to be a possible attribute in the declaration, it seems that saturated should be a "modifier" of range. It modifies range in that it changes what happens when the variable goes out of range (clamp it instead of throwing an exception). That would allow saturating at any value, both for the min and the max. I don't know if saturating is enough of an advantage to be worth putting in the language, but it wouldn't be hard to implement if you are doing ranges already, and the tests shouldn't be too costly in the resulting code.

--
- Stephen Fuld
e-mail address disguised to prevent spam
May 12 2002
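
The difference between the proposed saturate(short) and an ordinary truncating cast, and the idea of saturation as a modifier on an arbitrary range, can be sketched in plain C. This is only an illustration under stated assumptions: the function names are hypothetical, nothing here is D syntax or a D library call, and the 32767 result assumes the usual 16-bit short.

```c
#include <limits.h>

/* Hypothetical helper: clamp v into [lo, hi]. This models a "saturated"
   range with an arbitrary minimum and maximum. */
long clamp_to_range(long v, long lo, long hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Hypothetical stand-in for the proposed saturate(short) cast:
   out-of-range values become the nearest representable short. */
short saturate_to_short(int v)
{
    return (short)clamp_to_range(v, SHRT_MIN, SHRT_MAX);
}

/* Truncating conversion: keep only the low 16 bits and discard the rest.
   Done on unsigned values so the result is well defined in C. */
unsigned short low_16_bits(int v)
{
    return (unsigned short)((unsigned)v & 0xFFFFu);
}
```

With a 16-bit short, saturate_to_short(65536) gives 32767 while low_16_bits(65536) gives 0, matching the two results in the quoted example. A saturated ubyte colour channel would correspond to clamp_to_range(v, 0, 255).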
prev sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
Karl Bochert wrote:
 On Fri, 10 May 2002 13:40:13 +1000, "Matthew Wilson"
<mwilson nextgengaming.com> wrote:
 
 would actually go farther, in suggesting a conservative subset of all the
ideas expressed in this item.

Basically, we could do away with short, int, long, etc, and use
s/uint8/16/32/64/128. Additionally (and please let me know whether or not
this is the case) the char type (whether ANSI or Unicode - again I am not
sure in whether this has been decided) should be a distinct type from any of
the aforementioned 10 integer types, as should bool.

Get rid of unsigned entirely. It adds nothing but confusion. Its very use implies that overflow behavior is important!

I use unsigned vs. signed to control the behavior of shifts, not the behavior of overflows.

-Russell B
May 10 2002
next sibling parent "Robert W. Cunningham" <rcunning acm.org> writes:
Russell Borogove wrote:

 Karl Bochert wrote:
 On Fri, 10 May 2002 13:40:13 +1000, "Matthew Wilson"
<mwilson nextgengaming.com> wrote:

 would actually go farther, in suggesting a conservative subset of all the
ideas expressed in this item.

Basically, we could do away with short, int, long, etc, and use
s/uint8/16/32/64/128. Additionally (and please let me know whether or not
this is the case) the char type (whether ANSI or Unicode - again I am not
sure in whether this has been decided) should be a distinct type from any of
the aforementioned 10 integer types, as should bool.

Get rid of unsigned entirely. It adds nothing but confusion. Its very use implies that overflow behavior is important!

I use unsigned vs. signed to control the behavior of shifts, not the behavior of overflows.

-Russell B

And I haven't seen many signed hardware registers...

-BobC
May 10 2002
prev sibling parent reply Karl Bochert <kbochert ix.netcom.com> writes:
On Fri, 10 May 2002 14:10:18 -0700, Russell Borogove <kaleja estarcion.com>
wrote:
 
 I use unsigned vs. signed to control the behavior of
 shifts, not the behavior of overflows.
 

It knows nothing of signed-ness. Therefore:

int val;
val = -1;
val >>= 2;   // shifts like C's unsigned
val += 2;    // adds like C's 'signed'

Shifting a signed is actually multiplication (or division), and if that's what you want, that's what you should say.

Karl Bochert
May 10 2002
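
The two shift behaviours being argued over can be written out explicitly in C. This is a sketch with hypothetical function names. It assumes a shift count below 32, and it avoids relying on >> applied to a negative signed value, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Logical right shift: zero-fills from the top, which is what C does
   for unsigned operands. */
uint32_t shift_right_logical(int32_t v, unsigned n)
{
    return (uint32_t)v >> n;
}

/* Arithmetic right shift: copies the sign bit down, i.e. floor(v / 2^n).
   Written without shifting a negative signed value. */
int32_t shift_right_arithmetic(int32_t v, unsigned n)
{
    if (v >= 0)
        return (int32_t)((uint32_t)v >> n);
    /* For negative v, ~v is non-negative: shift that, then flip back. */
    return ~(int32_t)((uint32_t)~v >> n);
}
```

For -1 shifted right by 2, the logical form yields 0x3FFFFFFF and the arithmetic form yields -1. The arithmetic shift also is not quite C's signed division: shift_right_arithmetic(-7, 1) is -4, whereas -7 / 2 truncates toward zero and gives -3.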
parent reply "OddesE" <OddesE_XYZ hotmail.com> writes:
"Karl Bochert" <kbochert ix.netcom.com> wrote in message
news:1103_1021079639 bose...
 On Fri, 10 May 2002 14:10:18 -0700, Russell Borogove

 I use unsigned vs. signed to control the behavior of
 shifts, not the behavior of overflows.

 It knows nothing of signed-ness. Therefore:

 int val;
 val = -1;
 val >>= 2;   // shifts like C's unsigned
 val += 2;    // adds like C's 'signed'

 Shifting a signed is actually multiplication (or division), and if that's
 what you want, that's what you should say.

 Karl Bochert

I disagree. The effects of a shift might be the same as a multiplication or a division, but the operation sure is different. It is an old optimisation trick to use shifts instead of multiplications to gain speed:

int i = 10;
// ... Draw some graphics
// Skip to the next scanline (on a 640x480 display)
i = (i << 9) + (i << 7);  // Same as i = i * 640 but faster

A lot of game graphics actually require that the width of textures or sprites be a multiple of 2, or even a power of two, to ease the use of these kinds of tricks.

The same goes for memory and unsigned numbers. Sure, we have got lots of memory these days, but our games also require more and more. Why waste memory or range of numbers when it is not necessary?

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
_________________________________________________
Remove _XYZ from my address when replying by mail
May 11 2002
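
The shift-and-add trick rests on 640 = 512 + 128 = 2^9 + 2^7, so exactly two shifted terms are needed. A minimal C sketch, with hypothetical function names, puts the shifted form next to the plain multiply so the two can be compared:

```c
#include <stdint.h>

/* Multiply by 640 using shifts: 512*y + 128*y. */
uint32_t times_640_shift(uint32_t y)
{
    return (y << 9) + (y << 7);
}

/* The same computation stated directly. */
uint32_t times_640_mul(uint32_t y)
{
    return y * 640u;
}
```

Both functions return 6400 for y = 10. Adding a bare y term to the shifted sum would give 641*y instead. Optimizing compilers typically perform this kind of strength reduction themselves when it pays off on the target CPU, which bears on the question of who should be doing it.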
parent Russell Borogove <kaleja estarcion.com> writes:
OddesE wrote:
 "Karl Bochert" <kbochert ix.netcom.com> wrote in message
 news:1103_1021079639 bose...
Shifting a signed is actually multiplication (or division)  and if that's
what you want, that's what you should say.

I disagree. The effects of a shift might be the same as a multiplication or a division, but the operation sure is different. It is an old optimisation trick to use shifts instead of multiplications to gain speed:

I believe Karl's position is that such things are for the compiler to optimize[1], not the programmer; although it's possible for the programmer to know that the divisor is a power of two in situations where the compiler doesn't.

-Russell B

[1] And whose responsibility is it to optimize shift versus multiply on a Pentium 4, where multiplies may well be faster than shifts?
May 11 2002