
digitalmars.D.bugs - UTF-32 bug

It seems incredible to me that toUTF8(dchar[]) can ever /return/ invalid
UTF-8. But it does, when given invalid input! The following code

















compiles successfully. Its output is






The problem is that the output SHOULD be...





Fortunately, the fix is very simple. All you have to do is modify
toUTF8(dchar[]) to verify that std.utf.isValidDchar() returns true for every
dchar in the input. (Observe that isValidDchar(0xD800) correctly returns
false, since 0xD800 is a surrogate and not a valid code point on its own.)
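
To make the proposed check concrete, here is a minimal sketch in C rather than D, so it does not depend on any particular Phobos version. It assumes that "valid" means a Unicode scalar value: at most 0x10FFFF and outside the surrogate range 0xD800..0xDFFF. The names is_valid_code_point and first_invalid are hypothetical, invented for this illustration; this is not the source of std.utf.isValidDchar, which may apply additional rules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (assumption, not Phobos code): true if c is a
   Unicode scalar value, i.e. not a surrogate and not above 0x10FFFF. */
static bool is_valid_code_point(uint32_t c)
{
    return c < 0xD800 || (c > 0xDFFF && c <= 0x10FFFF);
}

/* Hypothetical pre-check an encoder could run before emitting any bytes:
   returns the index of the first invalid code point in s, or len if
   every code point is valid. */
static size_t first_invalid(const uint32_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!is_valid_code_point(s[i]))
            return i;
    }
    return len;
}
```

An encoder that calls first_invalid(s, len) and refuses to proceed unless the result equals len can never emit the bytes for a lone surrogate such as 0xD800. How the failure is reported (error code, exception, replacement character) is a separate design choice that the sketch leaves open.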

Arcane Jill
Jul 09 2004