digitalmars.D.learn - dchar, wchar, char conversions: what is the best practice?
- Kevin Bealer <kevinbealer gmail.com> Feb 03 2007
- Anders F Björklund <afb algonet.se> Feb 03 2007
- Kevin Bealer <kevinbealer gmail.com> Feb 03 2007
- torhu <fake address.dude> Feb 03 2007
What is the 'best practice' when converting from dchar to char or vice
versa? And wchar of course.
I expected this code to do a magic conversion:
    dchar[] one = "one";
    char[] one_a = cast(char[]) one;
Instead, it produces 'o...n...e...' where . is a 0 byte, in other words
it casts the D type of the array but does not change the encoding.
I think this is not an entirely unreasonable design, but since implicit
conversions don't work and explicit casts behave as shown above, there
must be some standard way of going from one encoding to another, right?
I have this:
    dchar[] okay;
    foreach(ch; c) {
        okay ~= ch;
    }
Which seems to work fine, but seems a little piecemeal. Is there a more
standard idiomatic way to do this?
Kevin
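A note on the loop above: in current D, `foreach (ch; c)` over a `char[]` infers `ch` as `char`, so each UTF-8 code unit is widened on its own. That matches the ASCII test data in this post, but it would split a multi-byte character into several wrong code points. Declaring the loop variable as `dchar` asks foreach to decode whole code points instead. The sketch below assumes a modern D2 compiler (the `string` and `dstring` types postdate this thread); `widen` and `narrow` are hypothetical helper names, not Phobos functions.

```d
import std.stdio;

// Hypothetical helper: UTF-8 -> UTF-32, one decoded code point at a time.
dstring widen(string s)
{
    dchar[] result;
    foreach (dchar ch; s)   // explicit dchar: foreach decodes UTF-8 sequences
        result ~= ch;
    return result.idup;
}

// Hypothetical helper: UTF-32 -> UTF-8.
// Appending a dchar to a char[] encodes it as UTF-8.
string narrow(dstring s)
{
    char[] result;
    foreach (dchar ch; s)
        result ~= ch;
    return result.idup;
}

void main()
{
    string utf8 = "h\u00E9llo";        // U+00E9 takes two UTF-8 code units
    dstring utf32 = widen(utf8);
    writeln(utf8.length, " ", utf32.length);   // prints "6 5"
    assert(narrow(utf32) == utf8);             // round trip is lossless
}
```

With pure ASCII input, as in the program below, both forms of the loop give the same result, which is probably why the original version appeared to work.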
// Code
import std.stdio;

void hexdump(T)(char[] z, T[] s)
{
    char[] pad = z.dup;
    pad[] = ' ';
    writefln("%s, %s * %s", z, s.length, T.sizeof);
    writef("%s --> ", z);
    byte * b = cast(byte*) s.ptr;
    int N = s.length*T.sizeof;
    for(int i = 0; i < N; i++) {
        writef("%2.2x ", b[i]);
        if ((i & 7) == 7)
            writef("\n %s ", pad);
    }
    writefln("\n");
}

int main()
{
    char[] c = "abcd";
    wchar[] w = "1234";
    dchar[] d = "WXYZ";

    dchar[] okay;
    foreach(ch; c) {
        okay ~= ch;
    }
    char[] okay2;
    foreach(ch; okay) {
        okay2 ~= ch;
    }

    hexdump("char", c);
    hexdump("wchar", w);
    hexdump("dchar", d);
    hexdump("okay-C", okay);
    hexdump("okay-D", okay2);

    char[] dc = cast(char[]) d;
    char[] wc = cast(char[]) w;
    dchar[] cd = cast(dchar[]) c;

    hexdump!(char) ("d-to-c", dc);
    hexdump!(char) ("w-to-c", wc);
    hexdump!(dchar)("c-to-d", cd);
    return 0;
}
// Output
char, 4 * 1
char --> 61 62 63 64
wchar, 4 * 2
wchar --> 31 00 32 00 33 00 34 00
dchar, 4 * 4
dchar --> 57 00 00 00 58 00 00 00
59 00 00 00 5a 00 00 00
okay-C, 4 * 4
okay-C --> 61 00 00 00 62 00 00 00
63 00 00 00 64 00 00 00
okay-D, 4 * 1
okay-D --> 61 62 63 64
d-to-c, 16 * 1
d-to-c --> 57 00 00 00 58 00 00 00
59 00 00 00 5a 00 00 00
w-to-c, 8 * 1
w-to-c --> 31 00 32 00 33 00 34 00
c-to-d, 1 * 4
c-to-d --> 61 62 63 64
Feb 03 2007
Kevin Bealer wrote:
> What is the 'best practice' when converting from dchar to char or vice
> versa? And wchar of course.
http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs
> I expected this code to do a magic conversion:
> dchar[] one = "one";
> char[] one_a = cast(char[]) one;
No Magic in Phobos, sorry...
--anders
Feb 03 2007
Anders F Björklund wrote:
> Kevin Bealer wrote:
>> What is the 'best practice' when converting from dchar to char or vice
>> versa? And wchar of course.
> http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs
>> I expected this code to do a magic conversion:
>> dchar[] one = "one";
>> char[] one_a = cast(char[]) one;
> No Magic in Phobos, sorry...
> --anders
Thanks -- nice writeup btw.
Kevin
Feb 03 2007
Kevin Bealer wrote:
> What is the 'best practice' when converting from dchar to char or vice
> versa? And wchar of course.
> I expected this code to do a magic conversion:
> dchar[] one = "one";
> char[] one_a = cast(char[]) one;
    import std.utf;
    char[] one_a = toUTF8(one);
There's also toUTF16 and toUTF32.
Feb 03 2007
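For reference, the std.utf route suggested in the last reply can be sketched as a round trip through all three encodings. This assumes a current D2 compiler and Phobos, where string literals are `string`, `wstring`, and `dstring` rather than the mutable arrays used in 2007; the sample text is made up for illustration.

```d
import std.stdio;
import std.utf : toUTF8, toUTF16, toUTF32;

void main()
{
    // 7 code points: six from the Basic Multilingual Plane plus one emoji.
    dstring d = "na\u00EFve \U0001F600"d;

    string  c = toUTF8(d);    // transcodes; does not reinterpret bytes
    wstring w = toUTF16(d);   // the emoji becomes a surrogate pair

    // Lengths are counted in code units of each encoding.
    writeln(d.length, " ", w.length, " ", c.length);   // prints "7 8 11"

    // Converting back recovers the original text, unlike cast().
    assert(toUTF32(c) == d);
    assert(toUTF32(w) == d);
}
```

Unlike `cast(char[])`, which keeps the bytes and only changes the element type, these functions produce a new array in the target encoding.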