digitalmars.D - string and utf aliases
- Anders F Björklund (47/47) Oct 14 2004
Would it be yet another "blasphemy" to
add a string *alias* to the language ?
(No, not a string typedef. Just alias)
I think that, and some char type aliases
similar to stdint.d, could do *wonders*
for the readability/understandability ?
alias char utf8_t;
alias wchar utf16_t;
alias dchar utf32_t;
alias utf8_t[] string; // ASCII-optimized
alias utf16_t[] ustring; // Unicode-optimized
Used like in the following example D program,
that will print all args in UTF-8 and UTF-32:
void main(string[] args)
{
    foreach(int a, string arg; args) {
        printf("%d: %.*s\n", a, arg);
        printf(" ");
        foreach (utf8_t b; arg) {
            printf("%02x ", b);
        }
        printf("\n");
        foreach (utf32_t c; arg) {
            printf("\t\\U%08x\n", c);
        }
    }
}
For simple ASCII, the output looks something like:
0: ./unichar
 2e 2f 75 6e 69 63 68 61 72
        \U0000002e
        \U0000002f
        \U00000075
        \U0000006e
        \U00000069
        \U00000063
        \U00000068
        \U00000061
        \U00000072
With Unicode arguments, it looks ... different.
(since each non-ASCII character takes a multi-byte
UTF-8 sequence, the byte dump gets longer than the
list of UTF-32 code points)
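
To make that difference concrete, here is a minimal sketch that counts
both for one non-ASCII string. It assumes a current D compiler and
std.stdio from Phobos (so writef and the newer "alias name = type"
syntax instead of the declarations above). The helper names codeUnits
and codePoints are made up for this illustration; only utf8_t and
utf32_t come from the proposal.

```d
import std.stdio;

// Aliases from the proposal above, written in the newer alias syntax.
alias utf8_t = char;
alias utf32_t = dchar;

// Hypothetical helper: number of UTF-8 code units (bytes) in s.
size_t codeUnits(const(utf8_t)[] s)
{
    return s.length;
}

// Hypothetical helper: number of code points in s.
// foreach with a dchar loop variable decodes the UTF-8 on the fly.
size_t codePoints(const(utf8_t)[] s)
{
    size_t n = 0;
    foreach (utf32_t c; s)
        ++n;
    return n;
}

void main()
{
    // \u00F6 is o-umlaut, which needs two bytes in UTF-8.
    const(utf8_t)[] arg = "Bj\u00F6rklund";

    // Dump the raw UTF-8 code units.
    foreach (utf8_t b; arg)
        writef("%02x ", cast(uint) b);
    writeln();

    // Dump the decoded code points.
    foreach (utf32_t c; arg)
        writefln("\t\\U%08x", cast(uint) c);

    writefln("%d code units, %d code points",
             codeUnits(arg), codePoints(arg));
}
```

For this string the counts come out as 10 code units against 9 code
points, while for a plain ASCII argument such as ./unichar the two
numbers are equal.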
--anders
PS:
I think this string alias and UTF-8 chars are way
better than Java's String class and UTF-16 chars!
(pretty much the same way that the compiled D code
vastly outperforms the Java code with JVM startup)
Oct 14 2004
Anders F Björklund <afb algonet.se>