digitalmars.D - char, wchar and dchar should be supported equally

James McComb <alan jamesmccomb.id.au> writes:

I like D having char, wchar and dchar. And I like the way that they will 
(soon?) implicitly convert between each other. But I don't like the way 
that D is biased towards char. I think that char, dchar and wchar should 
be supported equally.

For example, modern Windows systems support UTF-16 (via the W 
functions). So you might decide to use wchar, because that is also 
UTF-16. The Windows API expects zero-terminated strings, and you can 
clearly indicate this in your code by calling toStringz. But toStringz 
takes char, so your wchar will be implicitly converted to char and then 
implicitly converted back to wchar. So there is no point using wchar!

But what if every function in std.string had wchar and dchar versions?
Then you could use wchar and call wtoStringz. (At the end of this email, 
there is some working code showing how this could be implemented using 
templates and aliases. There are other ways that std.string could 
support wchar and dchar, such as function overloading or function 
templates.)

Also, in order for char, wchar and dchar to be supported equally, Object 
should have wtoString and dtoString methods. (Because toString cannot be 
overloaded based on its return type.)

Does anyone else out there feel the same? Or should I get over it and 
JUC (Just Use Char) like I already JUB (Just Use Bit)?

James McComb

<code>
import std.stdio;

template TStringFunctions(T) {
     T[] toStringz(T[] str) {
         if (!str)
             return "";

         T[] copy = str.dup;
         return copy ~= '\0';
     }

     // Other string functions...
}

alias TStringFunctions!(char)  stringFunctions;
alias TStringFunctions!(wchar) wstringFunctions;
alias TStringFunctions!(dchar) dstringFunctions;

alias stringFunctions.toStringz  toStringz;
alias wstringFunctions.toStringz wtoStringz;
alias dstringFunctions.toStringz dtoStringz;

// Other string function aliases...

// Example usage
void main() {
     char[]   str = "utf-8 string";
     wchar[] wstr = "utf-16 string";

     str  = toStringz(str);
     wstr = wtoStringz(wstr);
}
</code>
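For readers on a current compiler, the same idea can be sketched as one function template instead of a template plus aliases. This is only an illustration under assumptions: it targets D2 rather than the 2005 compiler discussed here, and `zeroTerminated` is a hypothetical name, not a std.string function.

```d
// Sketch only: zeroTerminated is a hypothetical helper, not part of Phobos.
// Assumes a D2 compiler.

/// Returns a pointer to a freshly allocated, zero-terminated copy of str.
const(C)* zeroTerminated(C)(const(C)[] str)
    if (is(C == char) || is(C == wchar) || is(C == dchar))
{
    auto copy = new C[str.length + 1];  // one extra slot for the terminator
    copy[0 .. str.length] = str[];      // copy the code units unchanged
    copy[$ - 1] = 0;                    // write the terminating zero
    return copy.ptr;
}

void main()
{
    // The character type is deduced from the argument, so no w/d prefixed
    // function names are needed.
    const(char)*  p8  = zeroTerminated("utf-8 string");
    const(wchar)* p16 = zeroTerminated("utf-16 string"w);

    assert(p8[12] == 0);   // "utf-8 string" has 12 code units
    assert(p16[13] == 0);  // "utf-16 string" has 13 code units
}
```

Because the wchar[] argument is never converted to char[] and back, this avoids the round trip described above.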

Jun 03 2005

Trevor Parscal <trevorparscal hotmail.com> writes:

James McComb wrote:
 I like D having char, wchar and dchar. And I like the way that they will 
 (soon?) implicitly convert between each other. But I don't like the way 
 that D is biased towards char. I think that char, dchar and wchar should 
 be supported equally.
 
 For example, modern Windows systems support UTF-16 (via the W 
 functions). So you might decide to use wchar, because that is also 
 UTF-16. The Windows API expects zero-terminated strings, and you can 
 clearly indicate this in your code by calling toStringz. But toStringz 
 takes char, so your wchar will be implicitly converted to char and then 
 implicitly converted back to wchar. So there is no point using wchar!
 
 But what if every function in std.string had wchar and dchar versions?
 Then you could use wchar and call wtoStringz. (At the end of this email, 
 there is some working code showing how this could be implemented using 
 templates and aliases. There are other ways that std.string could 
 support wchar and dchar, such as function overloading or function 
 templates.)
 
 *snip* Object should have wtoString and dtoString methods. 
 

well.. wtoString is a bad naming convention.. I think toWString or 
toDString makes a little more sense, but to be honest, I think it should 
work like read and write, and return char[], wchar[], or dchar[] based 
on what you cast.

That's my two cents anyhoo, as an avid dchar[] user.

-- 
Thanks,
Trevor Parscal
www.trevorparscal.com
trevorparscal hotmail.com

Jun 03 2005

Hasan Aljudy <hasan.aljudy gmail.com> writes:

Trevor Parscal wrote:
 James McComb wrote:
 
 I like D having char, wchar and dchar. And I like the way that they 
 will (soon?) implicitly convert between each other. But I don't like 
 the way that D is biased towards char. I think that char, dchar and 
 wchar should be supported equally.

 For example, modern Windows systems support UTF-16 (via the W 
 functions). So you might decide to use wchar, because that is also 
 UTF-16. The Windows API expects zero-terminated strings, and you can 
 clearly indicate this in your code by calling toStringz. But toStringz 
 takes char, so your wchar will be implicitly converted to char and 
 then implicitly converted back to wchar. So there is no point using 
 wchar!

 But what if every function in std.string had wchar and dchar versions?
 Then you could use wchar and call wtoStringz. (At the end of this 
 email, there is some working code showing how this could be 
 implemented using templates and aliases. There are other ways that 
 std.string could support wchar and dchar, such as function overloading 
 or function templates.)

 *snip* Object should have wtoString and dtoString methods.

 
 
 well.. wtoString is a bad naming convention.. I think toWString or 
 toDString makes a little more sense, but to be honest, I think it should 
 work like read and write, and return char[], wchar[], or dchar[] based 
 on what you cast.
 
 That's my two cents anyhoo, as an avid dchar[] user.
 

I think that toString or any std function that takes a string and 
processes it, should always take dchar and return dchar.

Assuming that dchar is implicitly convertible to char and wchar, there 
can be no loss of information when doing something like:

<code>
dchar[] someFunction(dchar[]) ...

...

wchar[] wtest = ...
wtest = someFunction(wtest); //no loss

...

char[] test = ..
test = someFunction(test); //no loss
</code>

Of course I may be wrong, but I'm assuming that converting a char to 
wchar is like converting an int to double, where any extra space is 
just filled with zeros (speaking at the bit level), and you can convert 
an int to double, process it, and convert it back to int, and assume 
that no information will be lost because of the conversion to double.
Of course information can be lost if "int" is not enough to store the 
value returned from the function, but this has nothing to do with 
converting back and forth to double then to int.

Jun 03 2005

Trevor Parscal <trevorparscal hotmail.com> writes:

Hasan Aljudy wrote:
 
 I think that toString or any std function that takes a string and 
 processes it, should always take dchar and return dchar.
 

The best idea for this I have heard thus far. Especially since, anytime 
you are doing a toString, you aren't going to be worried about the 
additional overhead of a dchar[] (or so I believe).

-- 
Thanks,
Trevor Parscal
www.trevorparscal.com
trevorparscal hotmail.com

Jun 03 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 03 Jun 2005 20:42:25 -0700, Trevor Parscal  
<trevorparscal hotmail.com> wrote:
 Hasan Aljudy wrote:
  I think that toString or any std function that takes a string and  
 processes it, should always take dchar and return dchar.

 The best idea for this I have heard thus far. Especially since, anytime  
 you are doing a toString, you aren't going to be worried about the  
 additional overhead of a dchar[] (or so I believe).

If you're using char[] then it gets converted to dchar[], processed, then  
converted back. That's not ideal IMO.

Ideally we only want conversion to happen in 1, or at most 2 places.

1. Data is converted on input from <input format> to <internal format>.
2. Data is converted on output from <internal format> to <output format>.


they will do both (for one reason or another). Each application will have  
a different <internal format> chosen for some specific reason, perhaps  
even a different <internal format> for each group of data.

So, ideally we require 3 variants of every single string function. But of  
course, we don't want to be repeating ourselves all the time. In fact we  
want only one 'function', and we just want to re-use it for all 3 string types.  
So, might I suggest using templates, e.g.

<code>
import std.stdio;
import std.ctype;

template toLowerT(Type) {
    Type[] toLowerT(Type[] input) {
        Type[] res = input.dup;
        foreach (inout Type c; res)
            c = tolower(c);
        return res;
    }
}

alias toLowerT!(char) toLower;
alias toLowerT!(wchar) toLower;
alias toLowerT!(dchar) toLower;

void main()
{
    char[] a = "REGAN";
    wchar[] b = "WAS";
    dchar[] c = "HERE";

    // we can even use the x.fn() form as opposed to fn(x) if we wish.
    writefln("%s=%s", a, a.toLower());
    writefln("%s=%s", b, b.toLower());
    writefln("%s=%s", c, c.toLower());
}
</code>

NOTE: I realise using ctype's tolower function will only work with ASCII,  
not the full complement of Unicode characters. This is a semi-functional  
example only.
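On a current D2 compiler the three aliases can be replaced by a single function template whose character type is deduced at the call site. The sketch below assumes D2; `toLowerAscii` is a hypothetical name, and like the example above it only handles ASCII letters.

```d
import std.stdio : writeln;

// Sketch only: toLowerAscii is a hypothetical name. It lowercases ASCII
// letters and leaves every other code unit untouched, so multi-unit
// UTF-8 and UTF-16 sequences pass through unchanged.
C[] toLowerAscii(C)(const(C)[] input)
    if (is(C == char) || is(C == wchar) || is(C == dchar))
{
    C[] res = input.dup;              // mutable copy of the code units
    foreach (ref c; res)
        if (c >= 'A' && c <= 'Z')
            c += 'a' - 'A';           // shift into the lowercase range
    return res;
}

void main()
{
    // The x.fn() form works here as well.
    writeln("REGAN".toLowerAscii());
    writeln("WAS"w.toLowerAscii());
    writeln("HERE"d.toLowerAscii());
}
```

One template body serves all three string types, and no conversion between encodings takes place.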

Regan

Jun 03 2005

James McComb <alan jamesmccomb.id.au> writes:

Regan Heath wrote:

 template toLowerT(Type) {
   Type[] toLowerT(Type[] input) {
     Type[] res = input.dup;
     foreach(inout Type c; res)
         c = tolower(c);
     return res;
   }
 }
 
 alias toLowerT!(char) toLower;
 alias toLowerT!(wchar) toLower;
 alias toLowerT!(dchar) toLower;

Thinks: so that's how you do it! :)

This is the kind of thing I had in mind. Is there any chance that 
std.string actually *will* be implemented like this?

James McComb

Jun 04 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 03 Jun 2005 21:37:23 -0600, Hasan Aljudy <hasan.aljudy gmail.com>  
wrote:
 Of course I may be wrong, but I'm assuming that converting a char to  
 wchar is like converting an int to double, where any extra space is  
 just filled with zeros (speaking at the bit level)

Yes and No. In many cases, yes, especially where ASCII is used. However  
some UTF-8 'characters'/'glyphs' (not sure what the correct term is  
exactly) take 2 or more chars (UTF-8 code units) to represent, so when  
converting them you might go from 3 chars to 1 wchar (1 UTF-16 code unit),  
which is a decrease in byte space required, and often a change in the  
values stored.

 , and you can convert an int to double, process it, and convert it back  
 to int, and assume that no information will be lost because of the  
 conversion to double.

Converting to/from char[], wchar[] and dchar[] causes no loss of data,  
ever. All existing glyphs can be represented in UTF-8 (char[]),  
UTF-16 (wchar[]) and UTF-32 (dchar[]), thus all existing strings can be  
represented in all types. Of course that representation uses a different  
number of bytes and may in fact use different bit patterns (code units) as  
well.
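The no-loss claim can be checked with a small round trip. This is a sketch assuming a D2 compiler and the `toUTF8`, `toUTF16` and `toUTF32` functions from `std.utf`; `roundTrips` is a hypothetical helper name, and the check only applies to valid UTF input.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

// Sketch only: roundTrips is a hypothetical helper.
/// True if s survives UTF-8 -> UTF-16 -> UTF-32 -> UTF-8 unchanged.
bool roundTrips(string s)
{
    wstring w = toUTF16(s);   // re-encode as UTF-16 code units
    dstring d = toUTF32(w);   // re-encode as UTF-32 code units
    return toUTF8(d) == s;    // back to UTF-8 and compare
}

void main()
{
    assert(roundTrips("plain ASCII"));
    // U+00EF needs 2 UTF-8 code units, U+20AC needs 3, U+1F600 needs 4.
    assert(roundTrips("na\u00EFve \u20AC \U0001F600"));
}
```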

Regan

Jun 03 2005

Hasan Aljudy <hasan.aljudy gmail.com> writes:

Regan Heath wrote:
 Converting to/from char[], wchar[] and dchar[] causes no loss of data,
 ever. All existing glyphs can be represented in UTF-8 (char[]),
 UTF-16 (wchar[]) and UTF-32 (dchar[]), thus all existing strings can be
 represented in all types. Of course that representation uses a
 different number of bytes and may in fact use different bit
 patterns (code units) as well.
 
 Regan

What then is the point of having all of these different types?

How does UTF-8 work? when you only have 256 possible values?

Jun 03 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Sat, 04 Jun 2005 00:05:46 -0600, Hasan Aljudy <hasan.aljudy gmail.com>  
wrote:
 Regan Heath wrote:
 Converting to/from char[], wchar[] and dchar[] causes no loss of data,
 ever. All existing glyphs can be represented in UTF-8 (char[]),
 UTF-16 (wchar[]) and UTF-32 (dchar[]), thus all existing strings can be
 represented in all types. Of course that representation uses a
 different number of bytes and may in fact use different bit
 patterns (code units) as well.
  Regan

 What then is the point of having all of these different types?

They're each better or worse depending on the data you're operating on.

Terminology: (I think this is correct)
   Code unit == one char, wchar, or dchar.
   Character == a symbol, made up of 1 or more code units.

UTF-8 is perfect if most/all of your data is ASCII, as ASCII characters  
have the same values in UTF-8 as they do in ASCII. ASCII is a sub-set of  
UTF-8 (which can also represent characters that do not exist in ASCII).

UTF-16 is better than UTF-8 in cases where most/all of your data would  
take 3 UTF-8 code units to represent. Essentially UTF-16 can store  
some characters in less space than UTF-8 can.

UTF-32 takes no more space than UTF-16 in cases where most/all of your  
data would take 2 UTF-16 code units to represent.

Some people choose to use UTF-32 as you can guarantee a code unit == a  
character, meaning the dchar[]'s length property is the 'string' length  
(this is not always the case with wchar[], or char[], due to some characters  
taking more than 1 code unit).

 How does UTF-8 work? when you only have 256 possible values?

In essence it uses between 1 and 4 code units (bytes) to represent a single  
character.
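To make the space trade-off concrete, the same three characters can be stored in each string type and the lengths compared. This is a sketch assuming a D2 compiler; the sample characters are chosen only for illustration.

```d
import std.stdio : writefln;

void main()
{
    // The same three characters in each encoding:
    // 'A' (U+0041), EURO SIGN (U+20AC), and U+1F600 (outside the BMP).
    string  s8  = "A\u20AC\U0001F600";
    wstring s16 = "A\u20AC\U0001F600"w;
    dstring s32 = "A\u20AC\U0001F600"d;

    // .length counts code units, not characters.
    writefln("char[]  code units: %s", s8.length);   // 1 + 3 + 4 = 8
    writefln("wchar[] code units: %s", s16.length);  // 1 + 1 + 2 = 4
    writefln("dchar[] code units: %s", s32.length);  // 1 + 1 + 1 = 3
}
```

In bytes that is 8 for UTF-8, 8 for UTF-16 and 12 for UTF-32, so which encoding is smallest depends entirely on the data.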

Someone probably has a better reference than this:
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IWS-AppendixA

I just quickly googled that up.

Regan

Jun 03 2005

Anders F Björklund <afb algonet.se> writes:

Hasan Aljudy wrote:

 I think that toString or any std function that takes a string and 
 processes it, should always take dchar and return dchar.

That's like saying that booleans should always be represented
with "int", and I'm afraid it won't fly around here since we're
obsessed with the size of variables more than processing time :-)

Conversion is a real problem, but at least you can do:
    char[] str; foreach(dchar c; str) { ... }
Plus some ASCII shortcuts, when the high bit isn't set.
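A minimal sketch of both points, assuming a D2 compiler: `foreach` with a `dchar` loop variable decodes a char[] one code point at a time, and the ASCII shortcut is a test of the high bit. `countCodePoints` and `isAsciiUnit` are hypothetical names used only for this illustration.

```d
// Sketch only: both function names are hypothetical.

/// Counts code points by letting foreach decode the UTF-8 code units.
size_t countCodePoints(const(char)[] str)
{
    size_t n = 0;
    foreach (dchar c; str)   // each iteration yields one decoded code point
        ++n;
    return n;
}

/// The ASCII shortcut: a code unit with the high bit clear is a complete
/// ASCII character and needs no decoding.
bool isAsciiUnit(char c)
{
    return (c & 0x80) == 0;
}

void main()
{
    string s = "h\u00E9llo";            // U+00E9 takes 2 UTF-8 code units
    assert(s.length == 6);              // code units
    assert(countCodePoints(s) == 5);    // code points
    assert(isAsciiUnit(s[0]) && !isAsciiUnit(s[1]));
}
```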


Much more on http://prowiki.org/wiki4d/wiki.cgi?CharsAndStrs
(and several other pages on the Wiki4D, like Derek's RFE:
  "FeatureRequestList/ImplicitConversionBetweenUTF")

--anders

PS. You probably meant to say "dchar[]", and not dchar ?

Jun 04 2005

Hasan Aljudy <hasan.aljudy gmail.com> writes:

Anders F Björklund wrote:
 Hasan Aljudy wrote:
 
 I think that toString or any std function that takes a string and 
 processes it, should always take dchar and return dchar.

 
 
 That's like saying that booleans should always be represented
 with "int", and I'm afraid it won't fly around here since we're
 obsessed with the size of variables more than processing time :-)
 

No, it's not like representing booleans with ints .. it's actually like 
saying ints should always be represented by doubles.

booleans are not numbers, there is no reason to represent them as 
numbers, and no one should ever store numbers in booleans.

But char, wchar, and dchar are all characters, just with different 
storage space.

I don't really think anybody cares about size, most people who care 
would care most about performance (processing time).

Imagine if all std functions used short instead of int ;) that could be 
a serious problem.

 Conversion is a real problem, but at least you can do:
    char[] str; foreach(dchar c; str) { ... }
 Plus some ASCII shortcuts, when the high bit isn't set.
 

I don't like having to read the Unicode specs to be able to deal with 
simple things like char. Your "ASCII shortcuts" would be low-level stuff 
dealing with how char and dchar are represented in memory.

C'mon people, D is a high level language.

Jun 04 2005

"Kris" <fu bar.com> writes:

It would be great to resolve this ongoing concern. However, you might
consider trying the ICU project for all your Unicode needs ~ it's what Java
uses under the covers:
http://www-306.ibm.com/software/globalization/icu/index.jsp

There's a D interface available over here, along with a well-rounded String
class: http://dsource.org/forums/viewtopic.php?t=148

- Kris

"Hasan Aljudy" <hasan.aljudy gmail.com> wrote in message
news:d7t8tc$b40$1 digitaldaemon.com...
 Anders F Björklund wrote:
 Hasan Aljudy wrote:

 I think that toString or any std function that takes a string and
 processes it, should always take dchar and return dchar.


 That's like saying that booleans should always be represented
 with "int", and I'm afraid it won't fly around here since we're
 obsessed with the size of variables more than processing time :-)

 No, it's not like representing booleans with ints .. it's actually like
 saying ints should always be represented by doubles.

 booleans are not numbers, there is no reason to represent them as
 numbers, and no one should ever store numbers in booleans.

 But char, wchar, and dchar are all characters, just with different
 storage space.

 I don't really think anybody cares about size, most people who care
 would care most about performance (processing time).

 imagine if all std functions used short instead of int ;) that could be
 a serious problem.

 Conversion is a real problem, but at least you can do:
    char[] str; foreach(dchar c; str) { ... }
 Plus some ASCII shortcuts, when the high bit isn't set.

 I don't like having to read the unicode specs to be able to deal with
 simple things like char. Your "ASCII shortcuts" would be low-level stuff
 dealing with how char and dchar are represented in memory.

 C'mon people, D is a high level language.

Jun 04 2005

Vathix <vathix dprogramming.com> writes:

 I don't like having to read the unicode specs to be able to deal with  
 simple things like char. Your "ASCII shortcuts" would be low-level stuff  
 dealing with how char and dchar are represented in memory.

 C'mon people, D is a high level language.

Maybe there should be isascii(char) somewhere :)
Would be inlined and self documenting.

Jun 04 2005

Anders F Björklund <afb algonet.se> writes:

Vathix wrote:

 I don't like having to read the unicode specs to be able to deal with  
 simple things like char. Your "ASCII shortcuts" would be low-level 
 stuff  dealing with how char and dchar are represented in memory.

 C'mon people, D is a high level language.

 
 Maybe there should be isascii(char) somewhere :)
 Would be inlined and self documenting.

I suggested that enhancement last year, but it wasn't popular...

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/2154

Or maybe it just got lost in this crippled "bug reporting system" ?

--anders

Jun 05 2005

Derek Parnell <derek psych.ward> writes:

On Sun, 05 Jun 2005 09:25:09 +0200, Anders F Björklund wrote:

 Vathix wrote:
 
 I don't like having to read the unicode specs to be able to deal with  
 simple things like char. Your "ASCII shortcuts" would be low-level 
 stuff  dealing with how char and dchar are represented in memory.

 C'mon people, D is a high level language.

 
 Maybe there should be isascii(char) somewhere :)
 Would be inlined and self documenting.

 
 I suggested that enhancement last year, but it wasn't popular...
 
 http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/2154
 
 Or maybe it just got lost in this crippled "bug reporting system" ?

You mean like this ...
import std.utf;   // provides UTF8stride, used in the out contract

//---------------------------
//  --- isASCII --
// Returns true if the supplied argument is an ASCII character.
//
// Parameters:
//      (1)   -- char -- The character to test.
//   (return) -- bool -- 'true' if the character is ASCII otherwise false.
//---------------------------
bool isASCII(char c)
out(result)
{
    // An ASCII character is exactly one UTF-8 code unit long.
    assert(result == (UTF8stride[c] == 1));
}
body{
    return (cast(uint)c <= 127U ? true : false);
}
unittest
{
   assert(isASCII('a') == true);
   assert(isASCII('~') == true);
   assert(isASCII('\xFF') == false);
   assert(isASCII('\x80') == false);
   assert(isASCII('\x00') == true);
   assert(isASCII(cast(char) -1) == false);
}
//---------------------------



-- 
Derek Parnell
Melbourne, Australia
5/06/2005 7:13:16 PM

Jun 05 2005

Anders F Björklund <afb algonet.se> writes:

Derek Parnell wrote:

 You mean like this ...
 //---------------------------
 //  --- isASCII --
 // Returns true if the supplied argument is an ASCII character.
 //
 // Parameters:
 //      (1)   -- char -- The character to test.
 //   (return) -- bool -- 'true' if the character is ASCII otherwise false.
 //---------------------------

Is that the "Natural Docs" format ?

I think I prefer Doxygen, myself:
/// Is the supplied code unit an ASCII character ?
///  param c    The UTF-8 code unit to test.
///  return     'true' if the character is ASCII

 bool isASCII(char c)
 out(result)
 {
     assert(result == (UTF8stride[c] == 1));
 }
 body{
     return (cast(uint)c <= 127U ? true : false);
 }

But surely this workaround shouldn't be needed ?

If a "bool" function can't return a comparison,
then there's something severely broken somewhere...

--anders

Jun 05 2005

Derek Parnell <derek psych.ward> writes:

On Sun, 05 Jun 2005 12:09:47 +0200, Anders F Björklund wrote:

 Derek Parnell wrote:
 
 You mean like this ...
 //---------------------------
 //  --- isASCII --
 // Returns true if the supplied argument is an ASCII character.
 //
 // Parameters:
 //      (1)   -- char -- The character to test.
 //   (return) -- bool -- 'true' if the character is ASCII otherwise false.
 //---------------------------

 
 Is that the "Natural Docs" format ?

Dunno. What's that ? I just made this up on the spot.

 I think I prefer Doxygen, myself:
 /// Is the supplied code unit an ASCII character ?
 ///  param c    The UTF-8 code unit to test.
 ///  return     'true' if the character is ASCII

Good on ya.

 bool isASCII(char c)
 out(result)
 {
     assert(result == (UTF8stride[c] == 1));
 }
 body{
     return (cast(uint)c <= 127U ? true : false);
 }

 
 But surely this workaround shouldn't be needed ?
 
 If a "bool" function can't return a comparison,
 then there's something severely broken somewhere...

I make a distinction between the machine code that is generated by a
compiler and the source code that is read by a human.

Yes, the compiler is able to work out that a bool is returned from a
comparison, but by writing it out explicitly, we also get a clear and
unambiguous statement of intent by the coder. We get the same machine code
generated and now it's also human readable.

In other words, it is self-documenting and does not rely on the
sophistication of the compiler. 

-- 
Derek Parnell
Melbourne, Australia
5/06/2005 8:39:19 PM

Jun 05 2005

Anders F Björklund <afb algonet.se> writes:

Derek Parnell wrote:

Is that the "Natural Docs" format ?

 
 Dunno. What's that ? I just made this up on the spot.

http://www.naturaldocs.org/

Whatever style is used, it should be parsable ?

 Yes, the compiler is able to work out that a bool is returned from a
 comparison, but by writing it out explicitly, we also get a clear and
 unambiguous statement of intent by the coder. We get the same machine code
 generated and now it's also human readable.

Ah, OK, then it wasn't a compiler bug <phew>.
Just a matter of opinion on readability... :-)

Like: "a < b" versus "(a < b) ? true : false"

--anders

Jun 05 2005

Derek Parnell <derek psych.ward> writes:

On Sat, 04 Jun 2005 11:20:47 +1000, James McComb wrote:

 I like D having char, wchar and dchar. And I like the way that they will 
 (soon?) implicitly convert between each other. But I don't like the way 
 that D is biased towards char. I think that char, dchar and wchar should 
 be supported equally.

Yes please. I've had to write dchar[] versions of a lot of things in
std.string and others. 

I tend to use char[] only when reading to and from files/streams, and use
dchar[] for internal routines. The application I'm working on now does a
lot of text processing and it is too slow to convert char[] -> dchar[],
process it, convert dchar[] -> char[]. 

The simplicity of dchar[] is that the array index always points to the
start of a character, whereas with char[] and wchar[] the index can point
to somewhere inside a character. (Remembering that each character in a
dchar[] string is the same size - a dchar - but characters in wchar[] and
char[] have variable sizes.)
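The indexing problem can be illustrated by listing where each character begins in a char[]. This is a sketch assuming a D2 compiler and the `stride` function from `std.utf`; `characterStarts` is a hypothetical name.

```d
import std.utf : stride;

// Sketch only: characterStarts is a hypothetical helper.
/// Returns the index of the first code unit of every character in s.
size_t[] characterStarts(const(char)[] s)
{
    size_t[] starts;
    // stride(s, i) gives the number of code units in the sequence at i,
    // so adding it to i always lands on the start of the next character.
    for (size_t i = 0; i < s.length; i += stride(s, i))
        starts ~= i;
    return starts;
}

void main()
{
    // 'a' is at index 0, U+00E9 occupies indexes 1 and 2, 'b' is at index 3.
    size_t[] expected = [0, 1, 3];
    assert(characterStarts("a\u00E9b") == expected);
}
```

Index 2 here falls inside a character, which is the situation that cannot occur with dchar[].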

The current Phobos routines are heavily biased to char[]. Also, the use of
templates is not always the best solution because there are some
optimizations available, depending on the UTF encoding format used.


-- 
Derek Parnell
Melbourne, Australia
4/06/2005 6:08:29 PM

Jun 04 2005

Anders F Björklund <afb algonet.se> writes:

Derek Parnell wrote:

 The current Phobos routines are heavily biased to char[]. Also, the use of
 templates is not always the best solution because there are some
 optimizations available, depending on the UTF encoding format used.

Not that anyone cares, but templates also have severe problems
on other D platforms such as with the GDC compiler on Mac OS X...

It's getting better, but it's like "the early days of C++" or so.

--anders

Jun 04 2005
