
digitalmars.D - convert ANSI to UTF-8

gertje gertje.org writes:
Hello,

Does anybody have or know how to write a function to convert an ANSI string to a
UTF-8 string? I am not using windows, so I cannot rely on the functions in
std.windows.charset, since they use the MultiByteToWideChar function from the
windows API...

Geert
Jul 07 2006
Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
gertje gertje.org wrote:
 Hello,
 
 Does anybody have or know how to write a function to convert an ANSI string to a
 UTF-8 string? I am not using windows, so I cannot rely on the functions in
 std.windows.charset, since they use the MultiByteToWideChar function from the
 windows API...

I don't know what you mean by an ANSI string. ASCII is a subset of UTF-8, so an ASCII string needs no conversion. If your source string is in ISO 8859-1 (aka Latin-1), which is common in western countries, the character values are identical to the first 256 Unicode code points, and you can convert each one directly using std.utf.encode. If the encoding is different, you need to supply the mapping to Unicode code points yourself (which shouldn't be too hard).

/Oskar
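
For the Latin-1 case, the conversion is small enough to write by hand. The following is a minimal sketch in C rather than D (a choice made here for illustration; in D, std.utf.encode would do the per-code-point encoding). The name latin1_to_utf8 and its buffer contract are hypothetical and not taken from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): converts `len` bytes of
   ISO 8859-1 text in `src` to UTF-8 in `dst`, NUL-terminates the result
   and returns the number of bytes written, excluding the terminator.
   Assumption: `dst` has room for 2 * len + 1 bytes. */
size_t latin1_to_utf8(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII range: the byte is already valid UTF-8. */
            dst[out++] = c;
        } else {
            /* Code points 0x80..0xFF become a two-byte sequence:
               110000xx 10xxxxxx. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

This only holds for true ISO 8859-1 input. If "ANSI" means a Windows code page such as Windows-1252, the bytes 0x80 to 0x9F map to different code points and need a lookup table instead.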
Jul 07 2006
Lionello Lunesu <lio lunesu.remove.com> writes:
gertje gertje.org wrote:
 Hello,
 
 Does anybody have or know how to write a function to convert an ANSI string to a
 UTF-8 string? I am not using windows, so I cannot rely on the functions in
 std.windows.charset, since they use the MultiByteToWideChar function from the
 windows API...

I suppose you need something like libiconv: http://www.gnu.org/software/libiconv/
It has the mapping tables for a lot of encodings. I have never had the 'pleasure' of working with libiconv myself, but I see it's used in many projects.

L.
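
As a rough illustration of the iconv route, here is a sketch in C using the standard iconv_open / iconv / iconv_close calls from <iconv.h>. The wrapper name to_utf8, its parameters and its error convention are assumptions made for this example. Which charset names are accepted (for instance "ISO-8859-1" or "WINDOWS-1252") depends on the iconv implementation installed.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper (not from the thread): converts `src_len` bytes
   of text encoded in `from_charset` to UTF-8. Writes at most
   dst_cap - 1 bytes plus a terminating NUL into `dst`. Returns the
   number of UTF-8 bytes written, or (size_t)-1 on any failure
   (unknown charset, invalid input, or output buffer too small). */
size_t to_utf8(const char *from_charset, const char *src, size_t src_len,
               char *dst, size_t dst_cap)
{
    assert(from_charset != NULL && src != NULL && dst != NULL);
    if (dst_cap == 0)
        return (size_t)-1;

    /* Note the argument order: target encoding first, source second. */
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;          /* iconv takes a non-const pointer */
    size_t in_left = src_len;
    char *out = dst;
    size_t out_left = dst_cap - 1;   /* reserve one byte for the NUL */

    /* iconv advances `in`/`out` and decrements the two counters. */
    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    *out = '\0';
    return (size_t)(out - dst);
}
```

On systems where iconv is part of the C library no extra library is needed; with GNU libiconv installed separately, linking with -liconv is typically required. From D, the same three functions can be declared extern (C) and called directly.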
Jul 07 2006
Sean Kelly <sean f4.ca> writes:
gertje gertje.org wrote:
 Hello,
 
 Does anybody have or know how to write a function to convert an ANSI string to a
 UTF-8 string? I am not using windows, so I cannot rely on the functions in
 std.windows.charset, since they use the MultiByteToWideChar function from the
 windows API...

You might want to look at 'mbsrtowcs', which is a standard C function. It's supposed to be in wchar.h, but as wchar is a keyword in D I've placed it in string.d instead: http://svn.dsource.org/projects/ares/trunk/src/ares/std/c/string.d

Sean
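
To show what a call to mbsrtowcs looks like, here is a small C sketch. The helper name locale_to_wide is hypothetical. Note that mbsrtowcs decodes according to the current LC_CTYPE locale and produces wide characters, so it covers only the first half of the job: the wide characters still have to be encoded as UTF-8 afterwards.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from the thread): decodes the NUL-terminated
   multibyte string `src`, interpreted in the current LC_CTYPE locale,
   into at most `cap` wide characters in `dst`. Returns the number of
   wide characters written, not counting the terminating L'\0', or
   (size_t)-1 if `src` contains a sequence invalid in that locale. */
size_t locale_to_wide(const char *src, wchar_t *dst, size_t cap)
{
    mbstate_t state;

    assert(src != NULL && dst != NULL);
    memset(&state, 0, sizeof state);   /* begin in the initial shift state */

    /* mbsrtowcs updates its local copy of `src` as it consumes input;
       the terminator is stored only if it fits within `cap`. */
    return mbsrtowcs(dst, &src, cap, &state);
}
```

For non-ASCII input this relies on the program having selected a matching locale first, for example with setlocale(LC_CTYPE, "") from <locale.h>, and on that locale (say, a Latin-1 one) being installed on the system. Without that call the program runs in the "C" locale, where only ASCII is guaranteed to decode.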
Jul 07 2006