
digitalmars.D - [suggestion] std type aliases (updated)

reply Anders F Björklund <afb algonet.se> writes:
Revised,
Here is the full list of my suggested type aliases for D:

    TYPE        ALIAS   // RANGE

    void                // void

Integer: (std.stdint)
    byte        int8_t  // 8-bit signed
   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

   short       int16_t  // 16-bit signed
  ushort      uint16_t  // 16-bit unsigned (0x0000-0xFFFF)

     int       int32_t  // 32-bit signed
    uint      uint32_t  // 32-bit unsigned (0x00000000-0xFFFFFFFF)

    long       int64_t  // 64-bit signed (could be two int registers)
   ulong      uint64_t  // 64-bit unsigned (could be two uint registers)

    cent      int128_t  // 128-bit signed (reserved for future use)
   ucent     uint128_t  // 128-bit unsigned (reserved for future use)

Floating Point: (std.stdfloat)
      float  float32_t  // 32-bit single precision (about 6 digits)
     double  float64_t  // 64-bit double precision (about 15 digits)
   extended  float80_t  // 64/80/128-bit extended precision (platform)

     ifloat   imag32_t  // \
    idouble   imag64_t  // imaginary versions of the above real ones
  iextended   imag80_t  // /

     cfloat   comp32_t  // \
    cdouble   comp64_t  // complex (with both real and imaginary parts)
  cextended   comp80_t  // /

Character: (std.stdutf)
    char        utf8_t  // \x00-\x7F (ASCII)
   wchar       utf16_t  // \u0000-\uD7FF, \uE000-\uFFFF
   dchar       utf32_t  // \U00000000-\U0010FFFF (Unicode)

Boolean: (std.stdbool)
     bit          bool  // false (0) | true (1)
    byte         wbool  // false (zero) | true (non-zero)
     int         dbool  // false (zero) | true (non-zero)

String: (std.stdstr)
   char[]          str    // UTF-8, optimized for US-ASCII
  wchar[]         wstr    // UTF-16, optimized for Unicode
  dchar[]         dstr    // UTF-32, easy codepoint access


This is the updated version, after discussions...

It requires renaming "real" back to "extended".
(and adding the keywords "cent" and "ucent" too)

--anders

Implementation: (Public Domain)

 module std.stdint;
 
 /* Exact sizes */
 
 alias  byte    int8_t;
 alias ubyte   uint8_t;
 alias  short  int16_t;
 alias ushort uint16_t;
 alias   int   int32_t;
 alias  uint  uint32_t;
 alias  long   int64_t;
 alias ulong  uint64_t;
 alias  cent  int128_t;
 alias ucent uint128_t;

 module std.stdfloat;
 
 /* floating point types */
 
 alias     float  float32_t; // 32-bit single precision
 alias    double  float64_t; // 64-bit double precision
 alias  extended  float80_t; // 64|80|128-bit extended
 
 alias     float   real32_t; // \
 alias    double   real64_t; // Real
 alias  extended   real80_t; // /
 
 alias    ifloat   imag32_t; // \
 alias   idouble   imag64_t; // Imaginary
 alias iextended   imag80_t; // /
 
 alias    cfloat   comp32_t; // \
 alias   cdouble   comp64_t; // Complex (Real + Imaginary)
 alias cextended   comp80_t; // /

 module std.stdutf;
 
 /* UTF code units */
 
 alias  char  utf8_t; // UTF-8 code unit
 alias wchar utf16_t; // UTF-16 code unit
 alias dchar utf32_t; // UTF-32 code point

 module std.stdbool;
 
 /* boolean types */
 
 alias   bit   bool;  // boolean (true/false)
 alias  byte  wbool;  // wide boolean (like wchar)
 alias   int  dbool;  // double boolean (like dchar)

 module std.stdstr;
 
 /* string types */
 
 alias  char[]   str; // ASCII-optimized
 alias wchar[]  wstr; // Unicode-optimized
 alias dchar[]  dstr; // codepoint-optimized

Feb 11 2005
next sibling parent reply Mark Junker <mjscod gmx.de> writes:
Anders F Björklund schrieb:

 String: (std.stdstr)
   char[]          str    // UTF-8, optimized for US-ASCII
  wchar[]         wstr    // UTF-16, optimized for Unicode
  dchar[]         dstr    // UTF-32, easy codepoint access

Are you sure that you'll always use multi-byte-character strings? Maybe you can use UCS32 for dchar[]?

Regards,
Mark
Feb 11 2005
parent Anders F Björklund <afb algonet.se> writes:
Mark Junker wrote:

 String: (std.stdstr)
   char[]          str    // UTF-8, optimized for US-ASCII
  wchar[]         wstr    // UTF-16, optimized for Unicode
  dchar[]         dstr    // UTF-32, easy codepoint access

Are you sure that you'll always use multi-byte-character strings? Maybe you can use UCS32 for dchar[]?

What is UCS32 ? (I've only heard of UCS-4, which is obsolete)

And the decision that D should "only" support Unicode is not mine at all but Walter's, and something that I agree with... It's still possible to use e.g. Latin-1 strings by using the ubyte[] data type, but you can only use C functions then... All D functions take Unicode strings, like: str, wstr, dstr

--anders
Feb 11 2005
prev sibling next sibling parent reply Norbert Nemec <Norbert Nemec-online.de> writes:
Anders F Björklund wrote:

 Revised,
 Here is the full list of my suggested type aliases for D:
 
 Integer: (std.stdint)
     byte        int8_t  // 8-bit signed
    ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

are those _t suffixes really needed?
      ifloat   imag32_t  // \
     idouble   imag64_t  // imaginary versions of the above real ones
   iextended   imag80_t  // /

      cfloat   comp32_t  // \
     cdouble   comp64_t  // complex (with both real and imaginary parts)
   cextended   comp80_t  // /

I'm not really happy with these. I would suggest to

* either change 'imag'->'imaginary' and 'comp'->'complex' (or 'cpx' if you really want to stay short)

* or even better: Stick with

       float   float32   // \
      double   float64   // real versions (ordinary floating point)
    extended   float80   // /

      ifloat   ifloat32  // \
     idouble   ifloat64  // imaginary versions of the above real ones
   iextended   ifloat80  // /

      cfloat   cfloat32  // \
     cdouble   cfloat64  // complex (with both real and imaginary parts)
   cextended   cfloat80  // /

which avoids any questionable mixing of numerical and mathematical names and is very structured.
Feb 11 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Norbert Nemec wrote:

Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)
    byte        int8_t  // 8-bit signed
   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

[...] are those _t suffixes really needed?

There are two reasons:

1) to avoid mixing them up with integers.

    int8_t x = 8;
    int16_t y = 16;
    int32_t z = 32;

2) They are already in use, in ISO C99...

    #include <stdint.h>

So it's just an extension into "float"/"utf" ?
 I'm not really happy with these. I would suggest to
 
 * either change 'imag'->'imaginary' and 'comp'->'complex' (or 'cpx' if you
  really want to stay short)
 
 *  or even better: Stick with
 
       float   float32  // \
      double   float64  // real versions (ordinary floating point)
    extended   float80  // /
 
      ifloat   ifloat32  // \
     idouble   ifloat64  // imaginary versions of the above real ones
   iextended   ifloat80  // /
 
      cfloat   cfloat32  // \
     cdouble   cfloat64  // complex (with both real and imaginary parts)
   cextended   cfloat80  // /
 
 which avoids any questionable mixing of numerical and mathematical names and
 is very structured.

Sure, that works much better! (but with the _t suffix)

imaginary and complex are too much like "unsigned", and we are already using the "u" prefix for those...

Dropping the "real" names from DMD is really simple. Just a matter of will and decisions, in the end...
 ./lexer.c:1924:    {	"real",		TOKfloat80	},
 ./lexer.c:1933:    {	"ireal",	TOKimaginary80	},
 ./lexer.c:1937:    {	"creal",	TOKcomplex80	},

 ./mtype.c:666:			c = "real";
 ./mtype.c:681:			c = "ireal";
 ./mtype.c:696:			c = "creal";

I guess the old names will have to be provided as aliases, in order to not break existing code... (in object.d)

"real" is OK, but maybe deprecate "ireal" and "creal" ? ("imaginary" and "complex" could be added too, if wanted, but they are a little like "unsigned" or even "integer")

Put a page up at: http://www.prowiki.org/wiki4d/wiki.cgi?StdTypeAliases

Will change to the above, and drop "imag" and "comp". And maybe start a petition to get extended back ? :-)

--anders
Feb 11 2005
parent reply Norbert Nemec <Norbert Nemec-online.de> writes:
Anders F Björklund wrote:

 Norbert Nemec wrote:
 
Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)
    byte        int8_t  // 8-bit signed
   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

[...] are those _t suffixes really needed?

There are two reasons:

1) to avoid mixing them up with integers.

    int8_t x = 8;
    int16_t y = 16;
    int32_t z = 32;

I don't understand at all. What might someone mix up and where does the _t help? Of course, people have to think when reading the above lines, but I see no special danger of mixing anything up.
 2) They are already in use, in ISO C99...
 
     #include <stdint.h>

OK, but I never really liked that _t thing in the C libraries. It makes sense for identifiers that do not necessarily sound like a type ('size_t', 'date_t' and so on) but int, float & Co. immediately look like a type to anyone who has ever worked in C. Probably a pure matter of taste...
Feb 11 2005
next sibling parent Anders F Björklund <afb algonet.se> writes:
Norbert Nemec wrote:

There are two reasons:

1) to avoid mixing them up with integers.

    int8_t x = 8;
    int16_t y = 16;
    int32_t z = 32;

I don't understand at all. What might someone mix up and where does the _t help? Of course, people have to think when reading the above lines, but I see no special danger of mixing anything up.

I think it was mostly with cast(int8) and such expressions, but I'm not sure myself to be honest. :-) I know it was discussed on this newsgroup at great length (as usual), so search the archives for it perhaps ?

I do know that "int8" and friends were forever rejected as a D type... (similar to how boolean conditionals and a String class are rejected)
2) They are already in use, in ISO C99...

    #include <stdint.h>

OK, but I never really liked that _t thing in the C libraries. It makes sense for identifiers that do not necessarily sound like a type ('size_t', 'date_t' and so on) but int, float & Co. immediately look like a type to anyone who has ever worked in C. Probably a pure matter of taste...

Probably, and there isn't any debate of e.g. the sizes of int and wchar in D, but int32_t and utf16_t makes it self-explanatory which is which ? (they are also a *compromise* between SInt32 and INT32 and all variants) Having both (at the same time) is a matter of appealing to both...

Note that std.stdint is *not* imported by default, so you still have to explicitly import it to get the alias types. Ditto with std.stdfloat, and probably the proposed stdbool and stdstr as well - although the suggestion is that "bool" and "str" aliases should always be defined. The more esoteric wbool and dbool and wstr and dstr can be imported explicitly, if one wants to use them over the built-in types they represent.

Just that newcomers to D do find "bool" easier than bit, and might just find "str" easier than the actual char[] ? Then, when they are ready for it, you can slap them with the full bool/wbool/dbool and str/wstr/dstr performance considerations... Simple to begin with, but still with all the options open later ?

--anders
Feb 11 2005
prev sibling parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Norbert Nemec" <Norbert Nemec-online.de> wrote in message
news:cuibd3$1v0o$1 digitaldaemon.com...
 Anders F Björklund wrote:

 Norbert Nemec wrote:

Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)
    byte        int8_t  // 8-bit signed
   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

[...] are those _t suffixes really needed?

There are two reasons:

1) to avoid mixing them up with integers.

    int8_t x = 8;
    int16_t y = 16;
    int32_t z = 32;

I don't understand at all. What might someone mix up and where does the _t help? Of course, people have to think when reading the above lines, but I see no special danger of mixing anything up.

I don't like those t's either. int8_t wouldn't get my vote, but int8 would.
 2) They are already in use, in ISO C99...

     #include <stdint.h>

OK, but I never really liked that _t thing in the C libraries. It makes sense for identifiers that do not necessarily sound like a type ('size_t', 'date_t' and so on) but int, float & Co. immediately look like a type to anyone who has ever worked in C. Probably a pure matter of taste...

Feb 11 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Ivan Senji wrote:

Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)
   byte        int8_t  // 8-bit signed
  ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)



I don't understand at all. What might someone mix up and where does the _t help? Of course, people have to think when reading the above lines, but I see no special danger of mixing anything up.

I don't like those t's either. int8_t wouldn't get my vote, but int8 would.

int8_t has already been added to std.stdint long ago*, it's a bit late ? It also rhymes with size_t, time_t, pthread_t, ptrdiff_t, wchar_t etc...

--anders

* Like five years ago, or something ? At least in regular ISO C and C++.
Feb 11 2005
parent reply Jason Mills <jmills cs.mun.ca> writes:
Anders F Björklund wrote:
 Ivan Senji wrote:
 
 Here is the full list of my suggested type aliases for D:

 Integer: (std.stdint)
   byte        int8_t  // 8-bit signed
  ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)



I don't understand at all. What might someone mix up and where does the _t help? Of course, people have to think when reading the above lines, but I see no special danger of mixing anything up.

I don't like those t's either. int8_t wouldn't get my vote, but int8 would.

int8_t has already been added to std.stdint long ago*, it's a bit late ?

That's too bad.
 It also rhymes with size_t, time_t, pthread_t, ptrdiff_t, wchar_t etc...
 
 --anders
 
 * Like five years ago, or something ? At least in regular ISO C and C++.

I agree with Norbert. type_t seems so ugly. To me it ranks up there with Hungarian notation for ugliness. I always hated size_t, time_t, etc. Please, keep D easy on the eyes.

Jason
Feb 11 2005
parent Anders F Björklund <afb algonet.se> writes:
Jason Mills wrote:

 It also rhymes with size_t, time_t, pthread_t, ptrdiff_t, wchar_t etc...

with Hungarian notation for ugliness.

Most of these are totally optional extras, or portability aliases. So you don't have to use them, if you find them visually appalling ?

The only aliases suggested for always defining are: bool, str. The rest would be available as optional modules, import if you want. (as usual, there's always the risk of someone else using them, but)

Of course, there is also the issue of changing the "real" type back to "extended" again, just to avoid the silly "imaginary real" (ireal)
  I always hated size_t, time_t, etc. Please, keep D easy on the eyes.

But if you've always hated C, then maybe D will always be ugly too ? (I know, "_t" is more Unix and Posix than it is C. But they're related)

--anders
Feb 11 2005
prev sibling next sibling parent reply "Lionello Lunesu" <lionello.lunesu crystalinter.remove.com> writes:
Hi..

And the discussion continues....

I think it should rather be the other way around:

int8 should be the internal type, with "byte" as an alias.

If there's a change imminent, we should take this opportunity to get rid of 
those strange C names like: short, long, double. They're adjectives, for 
crying out loud.

The only types that actually mean something are bit, "int" for integers and 
"float" for floating point numbers. The others are basically based on these.

Weren't "short" and "long" originally meant as type modifiers? "short int" 
for a 16-bit number, "long int" for a "longer than normal" (multi-register) 
64-bit number. "double float" for a float with double precision? What about 
"short real" for float and "long real" for a double?

The current types are too much based on C, their origin long since 
forgotten.

Come to think of it, "complex" can be both a noun and an adjective, and 
since "complex int" will never be popular (and useful), I guess it'll be one 
of the bases: bit, int, float, complex ?

Ah, finally a coherent naming system: "short bool" for a bit/bool, "bool" 
for a byte (C++), "long bool" for dbool. Char's tricky: there's nothing 
shorter and two longer variants.. I'll have to think about this.

"This would break all existing programs" - make aliases;
"Too late to make such a change" - with aliases nothing breaks;
"We're only fixing bugs, not changing the language" - press delete (no wait, 
it's a newsgroup).

Lionello.
Feb 11 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Lionello Lunesu wrote:

 And the discussion continues....
 
 I think it should rather be the other way around:
 int8 should be the internal type, with "byte" as an alias.

I think that ship has sailed, like a few years ago... Besides, it would work the same in the end, wouldn't it?
 If there's a change eminent, we should take this opportunity to get rid of 
 those strange C names like: short, long, double. They're adjectives for 
 cyring out loud.

I don't think that D aims to get rid of C, just improve it... (and not too much either, Java and C# have changed much more)

How about changing { for BEGIN and } for END, or using := for assignment and = for equality ? Me, I think I'll pass ;-)
 The only types that actually mean something are bit, "int" for integers and 
 "float" for floating point numbers. The others are basically based on these.

Yup. And "char" should probably be renamed as "codeunit", too. (Since there are no characters in technical Unicode lingo...)

Only problem is that the "float" precision is too poor, just like the "int" was - back when it was a 16-bit type...

So today, int and double are now the "standard" types - short and float can be used when space/time is an issue more than precision, and long and extended can both be used if the hardware so allows (i.e. on a 64-bit / X86 machine), and bit and byte are provided for other special uses
 Weren't "short" and "long" originally meant as type modifiers? "short int" 
 for a 16-bit number, "long int" for a "longer than normal" (multi-register) 
 64-bit number. "double float" for a float with double precision? What about 
 "short real" for float and "long real" for a double?

That was then, this is now. And two-word keywords are *not* in D. (and a "long int" was a 32-bit type, the "long long" was 64-bits)

Shouldn't that include "regular int" and "single float" too then ? Then you can have "unsigned regular int" and "imaginary single float"

That isn't really practical, IMHO.
 The current types are too much based on C, their origin long since 
 forgotten.
 
 Come to think of it, "complex" can be both a noun and an adjective, and 
 since "complex int" will never be popular (and useful), I guess it'll be one 
 of the bases: bit, int, float, complex ?

But complex is now a type modifier: "complex float" (cfloat) A lot like "unsigned integer" (uint) or "wide character" (wchar).

signed is the default for the integers, so it has no prefix. (could have been "sint") real is the default for the floats, so it has no prefix either. (could have been "rfloat")
 Ah, finally a coherent naming system: "short bool" for a bit/bool, "bool" 
 for a byte (C++), "long bool" for dbool. Char's tricky: there's nothing 
 shorter and two longer variants.. I'll have to think about this.

Please do, but I don't think it will change the D types... Which reminds me, there is no boolean type. Live with it :-)
 "This would break all existing programs" - make aliases;
 "Too late to make such a change" - with aliases nothing breaks;
 "We're only fixing bugs, not changing the language" - press delete (no wait, 
 it's a newsgroup).

How about "There are more important things to fix before release" ?

And it would be nice if cent and ucent could be hacked in, so that I can get a type with 16-byte alignment to use for the vector types... (for linking to SIMD extensions outside of D, such as AltiVec or SSE)

--anders
Feb 11 2005
parent reply "Lionello Lunesu" <lionello.lunesu crystalinter.remove.com> writes:
Hi..

 I think it should rather be the other way around:
 int8 should be the internal type, with "byte" as an alias.

I think that ship has sailed, like a few years ago... Besides, it would work the same in the end wouldn't it?

Well, yes, but all those aliases will prevent you from using those identifiers for other purposes. It's like all those #undef 's I have to type to prevent conflicts when including windows.h (Is there an 'unalias' by the way?). That's why the built-in names should actually be the only ones; only aliased to ease a transition from C to D.

It just feels 'the wrong way around' to me that the compiler uses platform-dependent names and you have to use aliases (different on different platforms) to end up with the same struct, interface.
 How about changing { for BEGIN and } for END,
 or using := for assignment and = for equality ?

Just giving other extreme suggestions doesn't really build a case against mine.
 The only types that actually mean something are bit, "int" for integers 
 and "float" for floating point numbers. The others are basically based on 
 these.

Yup. And "char" should probably be renamed as "codeunit", too. (Since there are no characters in technical Unicode lingo...)

Only problem is that the "float" precision is too poor, just like the "int" was - back when it was a 16-bit type...

So today, int and double are now the "standard" types - short and float can be used when space/time is an issue more than precision, and long and extended can both be used if the hardware so allows (i.e. on a 64-bit / X86 machine), and bit and byte are provided for other special uses
 Weren't "short" and "long" originally meant as type modifiers? "short 
 int" for a 16-bit number, "long int" for a "longer than normal" 
 (multi-register) 64-bit number. "double float" for a float with double 
 precision? What about "short real" for float and "long real" for a 
 double?

That was then, this is now. And two-word keywords are *not* in D. (and a "long int" was a 32-bit type, the "long long" was 64-bits)

Read on, you yourself wrote about "signed int" and "complex real". What's the difference with "short int" ?
 Shouldn't that include "regular int" and "single float" too then ?
 Then you can have "unsigned regular int" and "imaginary single float"

Hey, I'm making a case for fewer keywords, not more :-)
 But complex is now a type modifier: "complex float" (cfloat)
 A lot like "unsigned integer" (uint) or "wide character" (wchar).

That's nice, but just as "long int" became simply "long", there's nothing that prevents "complex real" to become simply "complex" (seeing as there's no use for 'complex int' and even "complex float" will be of little use, because complex is used during big calculations, needing a lot of precision).
 Which reminds me, there is no boolean type. Live with it :-)

Hey, if it looks like a bool and it 'walks' like a bool, then I'm happy :-)
 How about "There are more important things to fix before release" ?

Uhm - then what's your post all about?
 And it would be nice if cent and ucent could be hacked in, so that
 I can get a type with 16-byte alignment to use for the vector types...
 (for linking to SIMD extensions outside of D, such as AltiVec or SSE)

You have to agree that it makes more sense to add "int128" to the language, and an alias "cent" (whatever that's derived from). What will "int256" be called? Why all those new names?

Seriously: is D primarily meant for C/C++ programmers that got tired of strcpy, ->, malloc/free?

Lionello.
Feb 11 2005
parent Anders F Björklund <afb algonet.se> writes:
Lionello Lunesu wrote:

 Well, yes, but all those aliases will prevent you from using those 
 identifiers for other purposes. It's like all those #undef 's I have to type 
 to prevent conflicts when including windows.h (Is there an 'unalias' by the 
 way?). That's why the built-in names should actually be the only ones; only 
 aliased to ease a transition from C to D.

That was actually another reason for the _t suffixes in stdint.
 It just feel 'the wrong way around' for me that the compiler uses platform 
 dependent names and you have to use aliases (different on different 
 platforms) to end up with the same struct, interface.

Just the way Walter designed the language, I suppose.
How about changing { for BEGIN and } for END,
or using := for assignment and = for equality ?

Just giving other extreme suggestions doesn't really build a case against mine.

No, that was called something in the meta-discussion language... Not "strawman", but something else? I'll try to stay on topic :-)
That was then, this is now. And two-word keywords are *not* in D.
(and a "long int" was a 32-bit type, the "long long" was 64-bits)

Read on, you yourself wrote about "signed int" and "complex real". What's the difference with "short int" ?

Walter didn't like it, and replaced it with "short", which was an allowed abbreviation in regular C. Java uses the same thing.

And that is nice, makes D easier to understand for Java programmers, especially for people who started programming with Java (a lot now) but want something closer to the machine and think C is "too low"...

The main differences to Java are: "boolean" -> "bit", and "char" -> wchar (and Java does not have the "extended" float or "cent" int types either)
Shouldn't that include "regular int" and "single float" too then ?
Then you can have "unsigned regular int" and "imaginary single float"

Hey, I'm making a case for fewer keywords, not more :-)

I don't think shorter keywords that make up bigger ones are coming back into fashion. At least not before D 1.0 ?
But complex is now a type modifier: "complex float" (cfloat)

That's nice, but just as "long int" became simply "long", there's nothing that prevents "complex real" to become simply "complex" (seeing as there's no use for 'complex int' and even "complex float" will be of little use, because complex is used during big calculations, needing a lot of precision).

It actually *was* "complex", before the float and double versions were added. When they were, it changed from "complex" to "c".

The only mistake was introducing the new "imaginary real" type... (which came as an unfortunate side effect of changing "extended")
Which reminds me, there is no boolean type. Live with it :-)

Hey, if it looks like a bool and it 'walks' like a bool, then I'm happy :-)

The bool type is alright. The issue is just with if(0) and if(null)...
How about "There are more important things to fix before release" ?

Uhm - then what's your post all about?

Three new std alias modules, and a DMD change from "real" to "extended". A few dozen lines of source code in total, and pretty trivial as well. Not rewriting the names of the built-in types, or something like that?
And it would be nice if cent and ucent could be hacked in, so that
I can get a type with 16-byte alignment to use for the vector types...
(for linking to SIMD extensions outside of D, such as AltiVec or SSE)

You have to agree that it makes more sense to add "int128" to the language, and an alias "cent" (whatever that's derived from).

But "cent" is already in the language, and has been for some time ? (and I don't think types with integer suffixes are a good idea, no)
 What will "int256" be called? Why all those new names?

I don't see any need for any more types, no. But 128-bit registers are a reality today...

And std.stdint was written by Walter Bright, and comes ported over from standard C99 / C++. I just extended it with the logical int128_t
 Seriously: is D primarily meant for C/C++ programmers that got tired of 
 strcpy, ->, malloc/free?

I think so ? At least that's the way that I am using it, to stay off C++. It also has a large potential for Java programmers that want more speed.

--anders
Feb 11 2005
prev sibling next sibling parent reply Alex Stevenson <ans104 cs.york.ac.uk> writes:
Anders F Björklund wrote:
 Revised,
 Here is the full list of my suggested type aliases for D:
 /* snipped out loads'o'stuff */
 
 
 
 module std.stdfloat;

 /* floating point types */

 alias     float  float32_t; // 32-bit single precision
 alias    double  float64_t; // 64-bit double precision
 alias  extended  float80_t; // 64|80|128-bit extended

 alias     float   real32_t; // \
 alias    double   real64_t; // Real
 alias  extended   real80_t; // /

 alias    ifloat   imag32_t; // \
 alias   idouble   imag64_t; // Imaginary
 alias iextended   imag80_t; // /

 alias    cfloat   comp32_t; // \
 alias   cdouble   comp64_t; // Complex (Real + Imaginary)
 alias cextended   comp80_t; // /


Is it a good idea to use the bit length in the real/extended aliases (float80_t) when the length of these types is platform defined? It would be misleading to call a type float80_t on platforms where it's implemented as a 64 or 128-bit type.
Feb 11 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Alex Stevenson wrote:

 Is it a good idea to use the bit length in the real/extended aliases 
 (float80_t) when the length of these types is platform defined?

No, but in reality the type is *always* 80 bits (when implemented). It's just that on CPU platforms that *do not have* the type, it falls back to using one or two doubles (which is better than simply failing)

On the official platforms, it's always 80 bits. (Win32 and Linux X86)

--anders
Feb 11 2005
parent reply Alex Stevenson <ans104 cs.york.ac.uk> writes:
Anders F Björklund wrote:
 Alex Stevenson wrote:
 
 Is it a good idea to use the bit length in the real/extended aliases 
 (float80_t) when the length of these types is platform defined?

No, but in reality the type is *always* 80 bits (when implemented). It's just that on CPU platforms that *do not have* the type, it falls back to using one or two doubles (which is better than simply failing)

On the official platforms, it's always 80 bits. (Win32 and Linux X86)

--anders

Hmm, all I can find in the D spec is this note in http://www.digitalmars.com/d/type.html:

"real : largest hardware implemented floating point size (Implementation Note: 80 bits for Intel CPU's)"

This suggests to me that while on x86 platforms it should be 80 bits, on other platforms it will change. This is fine while D only officially supports x86 platforms, but this situation isn't guaranteed to hold forever. Plus, should the language spec really limit the hardware platform?

The situation as I understood it was:

D Language Spec: Does not limit to specific platforms (but provides implementation guidance)
DMD compiler: Supports x86 Linux/Win32 only
GDC : Supports x86/Mac/Solaris

If my understanding is correct, the 'always 80-bit' assumption is based on the DMD compiler rather than the D language spec.
Feb 11 2005
parent Anders F Björklund <afb algonet.se> writes:
Alex Stevenson wrote:

 No, but in reality the type is *always* 80 bits (when implemented)

 It's just that on CPU platforms that *do not have* the type, it falls
 back to using one or two doubles (which is better than simply failing)

 On the offical platforms, it's always 80 bits. (Win32 and Linux X86)

Hmm, all I can find in the D spec is this note in http://www.digitalmars.com/d/type.html: "real : largest hardware implemented floating point size (Implementation Note: 80 bits for Intel CPU's)" This suggests to me that while on x86 platforms it should be 80 bits, on other platforms it will change.

Correct. It uses the C type "long double", so it depends on the implementation of that type. Mac OS X does not have it, so GDC simply uses the regular "double" type instead - for that type too.

Other platforms with PowerPC or Sparc processors might have a standard library that implements a "long double" using two double registers (making it 128 bits), just as they now implement a "long long" (64-bit integer) using two 32-bit registers, unless the platform processor is a 64-bit one ?

And that's "hardware support", as far as DMD is concerned. Note: DMD/GDC is written in C++, and not in D itself...
 This is fine while D only officially supports x86 platforms, but this 
 situation isn't guaranteed to hold forever.
 
 Plus should the language spec really limit the hardware platform? The 
 situation as I understood it was:
 
 D Language Spec: Does not limit to specific platforms (but provides 
 implementation guidance)
 DMD compiler: Supports x86 Linux/Win32 only
 GDC : Supports x86/Mac/Solaris
 
 If my understanding is correct, the 'always 80-bit' assumption is based 
 on the DMD compiler rather than the D language spec.

The D specification does not limit the hardware platform. It just says the "biggest precision available", and that will be one of 64, 80 or 128 bits depending on the CPU... But "float80_t" is just an alias, so it can be the same? (I do not think it has much use in reality, since one might as well use the more verbose "real", "imaginary", "complex", which don't have the same false implementation hints...)

The size of e.g. (void*) is not fixed either. It's usually 32 bits now, but can become 64 bits any day on the 64-bit platforms (that is: "version(AMD64)" or "version(PPC64)")
http://www.digitalmars.com/d/version.html
http://www.prowiki.org/wiki4d/wiki.cgi?DocComments/Version

There are currently some "char" issues in C with building GDC on Linux PPC platforms, but David is working on that... Solaris was finally added after fixing alignment issues in DMD that caused crashes (!) with the SPARC processor. So the list above should have been: GDC: Supports Linux/Mac OS X/Solaris (since GDC does not support Mac OS 9)

Currently GDC does not support inline assembler either, since it needs to be tied in with the D variables and such

--anders
Feb 11 2005
prev sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message 
news:cuhudm$1gla$2 digitaldaemon.com...
 Revised,
 Here is the full list of my suggested type aliases for D:

    TYPE        ALIAS   // RANGE

    void                // void

 Integer: (std.stdint)
    byte        int8_t  // 8-bit signed
   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

   short       int16_t  // 16-bit signed
  ushort      uint16_t  // 16-bit unsigned (0x0000-0xFFFF)

     int       int32_t  // 32-bit signed
    uint      uint32_t  // 32-bit unsigned (0x00000000-0xFFFFFFFF)

    long       int64_t  // 64-bit signed (could be two int registers)
   ulong      uint64_t  // 64-bit unsigned (could be two uint registers)

    cent      int128_t  // 128-bit signed (reserved for future use)
   ucent     uint128_t  // 128-bit unsigned (reserved for future use)

These are already done, right? I can't remember.
 Floating Point: (std.stdfloat)
      float  float32_t  // 32-bit single precision (about 6 digits)
     double  float64_t  // 64-bit double precision (about 15 digits)
   extended  float80_t  // 64/80/128-bit extended precision (platform)

     ifloat   imag32_t  // \
    idouble   imag64_t  // imaginary versions of the above real ones
  iextended   imag80_t  // /

     cfloat   comp32_t  // \
    cdouble   comp64_t  // complex (with both real and imaginary parts)
  cextended   comp80_t  // /

 Character: (std.stdutf)
    char        utf8_t  // \x00-\x7F (ASCII)
   wchar       utf16_t  // \u0000-\uD7FF, \uE000-\uFFFF
   dchar       utf32_t  // \U00000000-\U0010FFFF (Unicode)

 Boolean: (std.stdbool)
     bit          bool  // false (0) | true (1)
    byte         wbool  // false (zero) | true (non-zero)
     int         dbool  // false (zero) | true (non-zero)

The module std.stdint is defined because C programmers will look for it - and it has some useful "fast" and "least" aliases etc. I haven't heard of people needing these float, char or bool aliases except for individual naming preferences. The argument for including them is different from the argument for including std.stdint.

I'd be hesitant to define wbool and dbool unless they have some boolean behaviors (e.g. it would be confusing that cast(wbool)10 != cast(wbool)20).
 String: (std.stdstr)
   char[]          str    // UTF-8, optimized for US-ASCII
  wchar[]         wstr    // UTF-16, optimized for Unicode
  dchar[]         dstr    // UTF-32, easy codepoint access

"str" is a common variable name for strings. Plus it would hide the array semantics. [snip]
Feb 11 2005
next sibling parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Ben Hinkle wrote:

Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)

These are already done, right? I can't remember.

They are, except "cent", which is not implemented. But they are done with the _t suffix, and the versions without the suffix have *not* been added even though they have been suggested multiple times over three years. See the thread list at the top of this page:
http://www.prowiki.org/wiki4d/wiki.cgi?StdTypeAliases
Floating Point: (std.stdfloat)

The module std.stdint is defined because C programmers will look for it - and it has some useful "fast" and "least" aliases etc. I haven't heard of people needing these float, char or bool aliases except for individual naming preferences. The argument for including them is different from the argument for including std.stdint.

That's why it's just a suggestion... But if it is to be, better be the same all over ? (which is why I posted it to the group, yet again)
 I'd be hesitent to define wbool and dbool 
 unless they have some boolean behaviors (eg it would confusing that 
 cast(wbool)10 != cast(wbool)20)

All the bool types use the Zero-Is-False idiom (the inequality above would still be true). This would still "work", though: assert(cast(wbool)10 && cast(wbool)20);

It's not in any way a boolean type for D, which is something that Walter (and thus D) does *not* want... But the suggestion was to spare the "weirdo" wbool/dbool and wstr/dstr for explicit imports:

  import std.stdbool;
  import std.stdstr;
String: (std.stdstr)
  char[]          str    // UTF-8, optimized for US-ASCII
 wchar[]         wstr    // UTF-16, optimized for Unicode
 dchar[]         dstr    // UTF-32, easy codepoint access

"str" is a common variable name for strings. Plus it would hide the array semantics.

But this code still works just fine:

  alias char[] str;

  void main()
  {
      str str = "str";
  }

And it would hide the array semantics, which can be saved for later... str s = "hello," ~ "world!"; can be an easy start, just as void main(str[] args);

But yes, hiding that is half of the point (shortness of writing it being the other). It's not a String struct/class, though. That's something else D does not want...

It's not exactly a new issue, and this, my last posting, is my final attempt to bring the issues to closure... I'm not sure I will "succeed", but I won't try again. "bool" is currently always defined, thankfully, and I can just continue to define "str" locally... (and just "do an Arcane Jill" and get the hell out)

I've also tried to document boolean and strings in D, something that bit and char[] haven't really helped? It's much easier in e.g. Java, but I do think that D's types are an improvement over the C types (_Bool / char*)

--anders

PS. "imaginary real" is still very very silly, though. IMHO.
Feb 11 2005
next sibling parent reply Kris <Kris_member pathlink.com> writes:
In article <cuj45t$2pf3$1 digitaldaemon.com>,
=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= says...
  char[]          str    // UTF-8, optimized for US-ASCII
 wchar[]         wstr    // UTF-16, optimized for Unicode
 dchar[]         dstr    // UTF-32, easy codepoint access



Just a thought: how about naming those utf8, utf16, utf32, instead?
 I can just continue to just define "str" locally...
 (and just "do an Arcane Jill" and get the hell out)

Heck; don't do that ... for all the nice "Norman Rockwell" praise this NG occasionally gets, it is often a most despairing place to suggest "change". FWIW: I find that those who disagree are the only vocal ones -- those who agree just don't say anything, publicly -- a somewhat strange arrangement that often caters mostly to the squeaky wheels. It can be frustrating.
 I've also tried to document boolean and strings in D,
 something that bit and char[] haven't really helped ?

You try to do something nice for the community, yet some focker goes and cuts the heads off the flowers. What you're doing is cool. D has more facial warts than Lemmy -- they could have been removed long ago, but the patient has this argumentative multiple personality disorder.
"imaginary real" is still very very silly, though. IMHO.

Certainly is :-)
Feb 11 2005
parent =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Kris wrote:

 char[]          str    // UTF-8, optimized for US-ASCII
wchar[]         wstr    // UTF-16, optimized for Unicode
dchar[]         dstr    // UTF-32, easy codepoint access



Just a thought: how about naming those utf8, utf16, utf32, instead?

I used those names for the individual code units instead... "utf8string" or something could have been used, but I just thought it to be much too technical. I wanted the integer / float types to be techy, and the boolean and string types to be simple ?
I can just continue to just define "str" locally...
(and just "do an Arcane Jill" and get the hell out)

Heck; don't do that ... for all the nice "Norman Rockwell" praise this NG ocassionally gets, it is often a most despairing place to suggest "change". FWIW: I find that those who disagree are the only vocal ones -- those who agree just don't say anything, publicly -- a somewhat strange arrangement that often caters mostly to the squeaky wheels. It can be frustrating.

Well, it was only half-true... I'm going on vacation :-) I'll be back to check up on you guys later, upon return.
 What you're doing is cool. D has more facial warts than Lemmy -- they could
have
 been removed long ago, but the patient has this argumentative multiple
 personality disorder.

Well, it's mostly because some wants D to become "kewler than C#" or "easier than Java" when all it wants is to become a better C and a simpler C++ ? To myself, D fits just greatly between C and Java. (I'm trying to avoid C++ and C# as much as I can) --anders
Feb 11 2005
prev sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message 
news:cuj45t$2pf3$1 digitaldaemon.com...
 Ben Hinkle wrote:

Here is the full list of my suggested type aliases for D:

Integer: (std.stdint)

These are already done, right? I can't remember.

They are, except "cent" which is not implemented. But they are done, with the _t suffix, and the versions without the suffix has *not* been added even if suggested multiple times over 3 years time. See the thread list at the top of this page: http://www.prowiki.org/wiki4d/wiki.cgi?StdTypeAliases

I scanned the rest of the thread before posting, so I missed the updated update about not having _t. Those int32 etc. names were needed in C because of the unknown int sizes. Technically they aren't needed in D. IIRC Java and C# don't have aliases for "int" and friends. If you want a double in Java you say "double". If Walter wanted to change the _t's in std.stdint to non-_t's then that would be fine.

One question I have is: when should one use these aliases? In typical code, or do you see them mostly used for special situations? I don't really know when I'd use those aliases.
Floating Point: (std.stdfloat)

The module std.stdint is defined because C programmers will look for it - and it has some useful "fast" and "least" aliases etc. I haven't heard of people needing these float, char or bool aliases except for individual naming preferences. The argument for including them is different from the argument for including std.stdint.

That's why it's just a suggestion... But if it is to be, better be the same all over ? (which is why I posted it to the group, yet again)

That's fine. Though right now the "same all over" are the standard names like "int", "double", etc. The case for "bool" being an alias for "bit" was strong. The case for a standard "float64" as an alias for "double" is much weaker IMO. I'm perfectly fine with personal aliases for things - people can do what they want. But to put it in std.stdfloat you have to have a strong case that many people would benefit from the name.
 I'd be hesitant to define wbool and dbool unless they have some boolean 
 behaviors (e.g. it would be confusing that cast(wbool)10 != cast(wbool)20)

All the bool types use the Zero-Is-False idiom (the inequality above would still be true).

replace wbool with bool and the inequality is false.
 This would still "work", though:
 assert(cast(wbool)10 && cast(wbool)20);

true.
 It's not in any way a boolean type for D, which is
 something that Walter (and thus D) does *not* want...

The "bit" type has boolean behavior. It can only take on two values. The "byte" type can take many values and so it does not have boolean behavior.
 But the suggestion was to spare the "weirdo"
 wbool/dbool and wstr/dstr for explicit imports:
 import std.stdbool; import std.stdstr;

ok, but still in std, no?
String: (std.stdstr)
  char[]          str    // UTF-8, optimized for US-ASCII
 wchar[]         wstr    // UTF-16, optimized for Unicode
 dchar[]         dstr    // UTF-32, easy codepoint access

"str" is a common variable name for strings. Plus it would hide the array semantics.

But this code still works just fine:

  alias char[] str;

  void main()
  {
      str str = "str";
  }

shudder. :-)
 And it would hide the array semantics, which can
 be saved for later... str s = "hello," ~ "world!";
 can be an easy start, just as void main(str[] args);

 But yes, hiding that is half of the point.
 (shortness of writing it being the other)

 It's not a String struct/class, though.
 That's something else D does not want...


 It's not exactly a new issue, and this my last posting
 is my final attempt to bring the issues to closure...
 I'm not sure I will "succeed", but I won't try again.

 "bool" is currently always defined, thankfully, and
 I can just continue to just define "str" locally...
 (and just "do an Arcane Jill" and get the hell out)

I don't mean to be discouraging and I hope you don't abandon playing around with D and proposing changes. That's not the way D will get better. I posted to say what I see as some issues with the proposal. For example I'm all for a wbool but it should have boolean behavior. I would be for standardized names but not if we are going to keep the already-standard names. Aliases should be introduced carefully.
 I've also tried to document boolean and strings in D,
 something that bit and char[] haven't really helped ?

That was very nice to see.
 It's much easier in e.g. Java, but I do think that D's
 types are an improvement over the C types (_Bool / char*)

 --anders

 PS.
 "imaginary real" is still very very silly, though. IMHO.

yup
Feb 11 2005
parent =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Ben Hinkle wrote:

 I scanned the rest of the thread before posting so I missed the updated 
 update about not having _t.  Those int32 etc names were needed in C because 
 of the unknown int sizes. Technically they aren't needed in D. IIRC Java and 
 C# don't have aliases for "int" and friends. If you want a double in Java 
 you say "double". If Walter wanted to change the _t's in std.stdint to 
 non-_t's then that would be fine. One question I have is when should one use 
 these aliases? In typical code or do you see them mostly used for special 
 situations? I don't really know when I'd use those aliases.

I mostly tossed in the floats for "completeness", I don't know either :) Just thought them a nice complement to the integer ones already there ? And I agree there is a much weaker case in D, compared to what C has...
But if it is to be, better be the same all over ?
(which is why I posted it to the group, yet again)

That's fine. Though right now the "same all over" are the standard names like "int", "double", etc. The case for "bool" being an alias for "bit" was strong. The case for a standard "float64" as an alias for "double" is much weaker IMO. I'm perfectly fine with personal aliases for things - people can do what they want. But to put it in std.stdfloat you have to have a strong case that many people would benefit from the name.

The only strong argument I have is that "creal" and "ireal" are silly, but changing that back to "extended" will do, without adding stdfloat ? The ones I was more proud of were the utf#_t ones for char/wchar/dchar.
I'd be hesitant to define wbool and dbool unless they have some boolean 
behaviors (e.g. it would be confusing that cast(wbool)10 != cast(wbool)20)

All the bool types use the Zero-Is-False idiom (the inequality above would still be true).

replace wbool with bool and the inequality is false.

I think I wrote that poorly. I just meant what's below:
This would still "work", though:
assert(cast(wbool)10 && cast(wbool)20);

true.

Which means that 10 is still "true", assert(10); and that 20 is also still "true", assert(20); but *not* that 10 == 20, the way 1 == 1 for bit. And that works just like it does in C and C++.
It's not in any way a boolean type for D, which is
something that Walter (and thus D) does *not* want...

The "bit" type has boolean behavior. It can only take on two values. The "byte" type can take many values and so it does not have boolean behavior.

Bit is fine as a boolean type, exactly for that reason. And it also has the weird non-integer casting behaviour in that 2 becomes 1 when put in a bit variable - whereas if you put 256 in a byte it's 0, and 65536 in a short too. But bit's OK for "bool", now that it can be addressed and used as inout.

However, both byte and int can also be seen as having just two values: zero and non-zero, and that's what I wrote in the table and how D uses them. (byte is a good substitute for addressable arrays, and int is already being used in e.g. opEquals etc.) The aliases were just a tongue-in-cheek documentation of those facts...
But this code still works just fine:

alias char[] str;

void main()
{
  str str = "str";
}

shudder. :-)

So it shouldn't break any existing code, if "str" was in object.d ?
 I don't mean to be discouraging and I hope you don't abandon playing around 
 with D and proposing changes. That's not the way D will get better. I posted 
 to say what I see as some issues with the proposal. For example I'm all for 
 a wbool but it should have boolean behavior. I would be for standardized 
 names but not if we are going to keep the already-standard names. Aliases 
 should be introduced carefully.

* stdfloat is mostly useless, and can be killed at the spot...
* cent types are not needed until cent/ucent keywords are added
* stdutf would be a nice complement to the stdint types...
* wbool and dbool and wstr and dstr are not really needed
* "str" should be added to object.d right next to bool, I think

Order ranking:
1) extended (for "real")
2) str (for "char[]")
3) std.stdutf
4) std.stdbool (ignorable, just a complement to the article)
5) std.stdstr (ignorable, just a complement to the article)
6) std.stdfloat (ignorable)

--anders
Feb 11 2005
prev sibling parent Kris <Kris_member pathlink.com> writes:
In article <cuj2ro$2ofe$1 digitaldaemon.com>, Ben Hinkle says...
<snip>
and it has some useful "fast" and "least" aliases etc. I haven't heard of 
people needing these float, char or bool aliases except for individual 
naming preferences. The argument for including them is different from the 
argument for including std.stdint. I'd be hesitant to define wbool and dbool 
unless they have some boolean behaviors (e.g. it would be confusing that 
cast(wbool)10 != cast(wbool)20)

 String: (std.stdstr)
   char[]          str    // UTF-8, optimized for US-ASCII
  wchar[]         wstr    // UTF-16, optimized for Unicode
  dchar[]         dstr    // UTF-32, easy codepoint access

"str" is a common variable name for strings. Plus it would hide the array semantics.

All salient points ...
Feb 11 2005