digitalmars.D.bugs - doFormat counts bytes, not characters

Nick (67/68) Feb 09 2005 When calculating padding, doFormat counts the number of bytes in the str...

When calculating padding, doFormat counts the number of bytes in the string, not
the number of characters. As a result, UTF-8 strings containing multibyte
characters are padded incorrectly.

A quick and dirty fix is to apply the following to format.d:

141c141
<           int padding = field_width - (strlen(prefix) + s.length);
---
>           int padding = field_width - (strlen(prefix) + toUTF32(s).length);

A better solution is to add functions to std.utf for counting the number of
characters in a string. This is slightly faster than converting with toUTF32()
and avoids the unnecessary memory allocation.

One possible way to do this (for UTF-8) follows below. It's basically a
stripped-down version of decode().
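
A minimal sketch of such a counter, written for current D and not the listing
originally posted. The name countCodePoints is hypothetical and is not an
existing std.utf function. It assumes that "characters" means Unicode code
points, and it only inspects the lead byte of each sequence, so it does not
validate continuation bytes the way decode() does.

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of std.utf): counts the code points in a
/// UTF-8 string by reading each lead byte and skipping the whole sequence.
/// No memory is allocated and no code point is actually decoded.
size_t countCodePoints(const(char)[] s)
{
    size_t count = 0;
    size_t i = 0;
    while (i < s.length)
    {
        const uint b = s[i];   // lead byte of the next sequence
        size_t len;            // sequence length in bytes

        if (b < 0x80)                len = 1; // 0xxxxxxx: ASCII
        else if ((b & 0xE0) == 0xC0) len = 2; // 110xxxxx
        else if ((b & 0xF0) == 0xE0) len = 3; // 1110xxxx
        else if ((b & 0xF8) == 0xF0) len = 4; // 11110xxx
        else
            throw new Exception("invalid UTF-8 lead byte");

        if (i + len > s.length)
            throw new Exception("truncated UTF-8 sequence");

        i += len;
        ++count;
    }
    return count;
}

void main()
{
    string s = "h\u00E9llo";      // U+00E9 occupies two bytes in UTF-8
    writeln(s.length);            // number of bytes
    writeln(countCodePoints(s));  // number of code points
}
```

With a helper like this, the padding expression could use countCodePoints(s)
in place of s.length without first building a UTF-32 copy of the string.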

Nick

Feb 09 2005
