digitalmars.D.learn - dchar, wchar, char conversions: what is the best practice?

Kevin Bealer (81/81) Feb 03 2007 What is the 'best practice' when converting from dchar to char or vice

Anders F Björklund (4/11) Feb 03 2007 No Magic in Phobos, sorry...

Kevin Bealer (3/18) Feb 03 2007 Thanks -- nice writeup btw.

torhu (4/11) Feb 03 2007 import std.utf;

Kevin Bealer <kevinbealer gmail.com> writes:

What is the 'best practice' when converting from dchar to char or vice 
versa?  And wchar of course.

I expected this code to do a magic conversion:

dchar[] one = "one";
char[] one_a = cast(char[]) one;

Instead, it produces 'o...n...e...', where each '.' is a zero byte. In other
words, the cast changes the D type of the array but does not change the
encoding.

I think this is not an entirely unreasonable design. But since implicit
conversions don't work and explicit casts behave as shown above, there
must be some standard way of going from one encoding to another, right?

I have this (where c is a char[]):

     dchar[] okay;
     foreach(ch; c) {
         okay ~= ch;
     }

This seems to work fine, but feels a little piecemeal.  Is there a more
standard, idiomatic way to do this?

Kevin
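
A note on the loop above: `foreach(ch; c)` infers `ch` as `char`, so it copies UTF-8 code units one at a time, which matches the text only while everything is ASCII. The following is a minimal sketch of the difference, assuming a current D2 compiler (where string literals are immutable) rather than the D1 used in this thread. `widenUnits` and `decodePoints` are hypothetical names used only for illustration.

```d
// Hypothetical helper: widen code unit by code unit, like the loop above.
dchar[] widenUnits(const(char)[] s)
{
    dchar[] result;
    foreach (char ch; s)            // ch is a single UTF-8 code unit
        result ~= cast(dchar) ch;   // widened, not decoded
    return result;
}

// Hypothetical helper: decode code point by code point.
dchar[] decodePoints(const(char)[] s)
{
    dchar[] result;
    foreach (dchar ch; s)           // asking for dchar makes foreach decode UTF-8
        result ~= ch;
    return result;
}

void main()
{
    // Pure ASCII: both approaches agree.
    assert(widenUnits("abcd") == "abcd"d);
    assert(decodePoints("abcd") == "abcd"d);

    // Non-ASCII: 'ö' is two UTF-8 code units but one code point.
    assert(widenUnits("ö").length == 2);
    assert(decodePoints("ö") == "ö"d);
}
```

Declaring the loop variable as `dchar` is the only change needed to make the original loop decode instead of copy.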

// Code

import std.stdio;

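// Print the raw bytes of array s in hex, labelled with z, 8 bytes per row.
void hexdump(T)(char[] z, T[] s)
{
     // Blank string as wide as the label, used to indent continuation rows.
     char[] pad = z.dup;
     pad[] = ' ';

     // Header: label, element count, element size in bytes.
     writefln("%s, %s * %s", z, s.length, T.sizeof);

     writef("%s --> ", z);
     // ubyte rather than byte, so values >= 0x80 are not treated as negative.
     ubyte * b = cast(ubyte*) s.ptr;
     size_t N = s.length*T.sizeof;

     for(size_t i = 0; i < N; i++) {
         writef("%2.2x ", b[i]);

         // Start a new, indented row after every 8th byte.
         if ((i & 7) == 7)
             writef("\n %s    ", pad);
     }
     writefln("\n");
}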

int main()
{
     char[]  c = "abcd";
     wchar[] w = "1234";
     dchar[] d = "WXYZ";

     dchar[] okay;
     foreach(ch; c) {
         okay ~= ch;
     }

     char[] okay2;
     foreach(ch; okay) {
         okay2 ~= ch;
     }

     hexdump("char", c);
     hexdump("wchar", w);
     hexdump("dchar", d);
     hexdump("okay-C", okay);
     hexdump("okay-D", okay2);

     char[] dc = cast(char[]) d;
     char[] wc = cast(char[]) w;
     dchar[] cd = cast(dchar[]) c;

     hexdump!(char) ("d-to-c", dc);
     hexdump!(char) ("w-to-c", wc);
     hexdump!(dchar)("c-to-d", cd);

     return 0;
}

// Output

char, 4 * 1
char --> 61 62 63 64

wchar, 4 * 2
wchar --> 31 00 32 00 33 00 34 00

dchar, 4 * 4
dchar --> 57 00 00 00 58 00 00 00
           59 00 00 00 5a 00 00 00

okay-C, 4 * 4
okay-C --> 61 00 00 00 62 00 00 00
            63 00 00 00 64 00 00 00

okay-D, 4 * 1
okay-D --> 61 62 63 64

d-to-c, 16 * 1
d-to-c --> 57 00 00 00 58 00 00 00
            59 00 00 00 5a 00 00 00

w-to-c, 8 * 1
w-to-c --> 31 00 32 00 33 00 34 00

c-to-d, 1 * 4
c-to-d --> 61 62 63 64
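
The last three dumps show that an array cast only re-slices the same memory: the byte count stays fixed and the length is rescaled by the element size. This can be checked directly with a minimal sketch, assuming a current D2 compiler (hence the `.dup` calls, since D2 string literals are immutable). The byte-order checks assume a little-endian machine and are guarded accordingly.

```d
void main()
{
    dchar[] d = "WXYZ"d.dup;            // 4 elements of 4 bytes each
    char[] dc = cast(char[]) d;         // same 16 bytes viewed as chars

    assert(dc.length == 16);            // 4 * 4 / 1
    assert(cast(void*) dc.ptr == cast(void*) d.ptr);  // no copy was made

    version (LittleEndian)
    {
        assert(dc[0] == 'W');           // low byte of the first dchar
        assert(dc[1] == 0);             // followed by zero padding bytes
    }

    char[] c = "abcd".dup;              // 4 elements of 1 byte each
    dchar[] cd = cast(dchar[]) c;       // same 4 bytes viewed as one dchar

    assert(cd.length == 1);             // 4 * 1 / 4
    version (LittleEndian)
        assert(cd[0] == 0x64636261);    // "abcd" packed into one value, not 'a'
}
```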

Feb 03 2007

Anders F Björklund <afb algonet.se> writes:

Kevin Bealer wrote:

> What is the 'best practice' when converting from dchar to char or vice
> versa?  And wchar of course.

http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs

> I expected this code to do a magic conversion:
>
> dchar[] one = "one";
> char[] one_a = cast(char[]) one;

No Magic in Phobos, sorry...

--anders

Feb 03 2007

Kevin Bealer <kevinbealer gmail.com> writes:

Anders F Björklund wrote:
> Kevin Bealer wrote:
>> What is the 'best practice' when converting from dchar to char or vice
>> versa?  And wchar of course.
>
> http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs
>
>> I expected this code to do a magic conversion:
>>
>> dchar[] one = "one";
>> char[] one_a = cast(char[]) one;
>
> No Magic in Phobos, sorry...
>
> --anders

Thanks -- nice writeup btw.

Kevin

Feb 03 2007

torhu <fake address.dude> writes:

Kevin Bealer wrote:
> What is the 'best practice' when converting from dchar to char or vice
> versa?  And wchar of course.
>
> I expected this code to do a magic conversion:
>
> dchar[] one = "one";
> char[] one_a = cast(char[]) one;


import std.utf;
char[] one_a = toUTF8(one);

There's also toUTF16 and toUTF32.
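
For readers on a current compiler, here is a minimal sketch of the three `std.utf` functions mentioned above. It assumes D2 Phobos, where they return the immutable `string`, `wstring` and `dstring` types rather than the mutable `char[]` used in this thread (append `.dup` if a mutable copy is needed). The sample text is illustrative only.

```d
import std.stdio : writeln;
import std.utf : toUTF8, toUTF16, toUTF32;

void main()
{
    dstring one = "one"d;                // UTF-32: one code unit per code point

    string  one8  = toUTF8(one);         // transcoded to UTF-8
    wstring one16 = toUTF16(one8);       // transcoded to UTF-16
    dstring one32 = toUTF32(one16);      // and back to UTF-32

    assert(one8 == "one");
    assert(one8.length == 3);            // 3 code units, no embedded zero bytes
    assert(one32 == one);                // the round trip preserves the text

    // Non-ASCII text changes length between encodings.
    dstring name = "Björklund"d;
    assert(name.length == 9);            // 9 code points
    assert(toUTF8(name).length == 10);   // 'ö' takes 2 code units in UTF-8
    assert(toUTF16(name).length == 9);   // 'ö' is a single UTF-16 code unit

    writeln(toUTF8(name));
}
```

Unlike a cast, each call allocates a new array in the target encoding, which is why the lengths differ for non-ASCII input.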

Feb 03 2007
