digitalmars.D - Solution to the encoding problem

Nick <Nick_member pathlink.com> writes:
Here is my "solution" to the encoding problem that a lot of people have been
complaining about. The problem is that D library functions cannot read or write
anything but Unicode from stdin/stdout. So for fun I made:

http://folk.uio.no/mortennk/d/locale_v0.1.zip

Quick usage example:

# import locale.translate;
#
# void main()
# {
#    // Read a line from stdin; it is converted to UTF-8.
#    char[] a = tin.readLine();
#
#    // a is now Unicode (UTF-8); you can do what you want with it here
#    // ... 
#
#    // Write the string out again, converting it automatically back to the
#    // local charset.
#    tout.writefln("Your string is ", a);
# }

(The file also contains some general character conversion routines.)

OK, it's not meant as a perfect solution, or even a good one, just a
demonstration of _one_ way to solve the problem.

Oh, and it uses iconv and is currently Linux-only. Sorry :) Any feedback is
appreciated.
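
For anyone curious what a wrapper like this has to do underneath, here is a minimal C sketch (C because iconv is a C API): ask the C library which charset the environment selects, then push the bytes through iconv. The helper names local_charset and convert_charset are made up for illustration and are not taken from the locale_v0.1 archive. It assumes a POSIX system with iconv in libc or in a separate libiconv.

```c
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name of the charset selected by the environment
 * (LANG, LC_ALL, ...), for example "ISO-8859-1" or "UTF-8". */
const char *local_charset(void)
{
    setlocale(LC_CTYPE, "");        /* adopt the user's locale settings */
    return nl_langinfo(CODESET);    /* POSIX: codeset of the current locale */
}

/* Hypothetical helper: convert the NUL-terminated string `in` from charset
 * `from` to charset `to`, writing a NUL-terminated result into `out`.
 * Returns 0 on success, -1 on any failure (unknown charset, invalid input,
 * output buffer too small). */
int convert_charset(const char *from, const char *to,
                    const char *in, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    iconv_t cd = iconv_open(to, from);   /* note the order: target first */
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;              /* iconv does not modify the input */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;        /* keep one byte for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *outp = '\0';
    return 0;
}
```

A line reader in this style would call convert_charset(local_charset(), "UTF-8", line, buf, sizeof buf) on input and swap the two charset arguments on output. Where iconv is a separate library, the program has to be linked with -liconv.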

Nick
Mar 02 2005
Anders F Björklund <afb algonet.se> writes:
Nick wrote:

 Here is my "solution" to the encoding problem that a lot of people have been
 complaining about. The problem is that d library functions cannot read or write
 anything but unicode from stdin/stdout. So for fun I made:

 Oh, and it uses iconv and is currently linux-only. Sorry :) Any feedback is
 appreciated.

Here is a more portable way of using iconv:
http://www.algonet.se/~afb/d/libiconv.d

Still, the iconv library is kinda huge?
 968K    libiconv.2.dylib
 872K    iconv.dll

Good for the general case, but sorta overkill for Latin-1 and other simple
8-bit encodings?

And then I haven't even begun on locale... ;-)
(the code you posted uses the GNU glibc version)

You see, the BSD locale.h has these instead:
 #define LC_ALL          0
 #define LC_COLLATE      1
 #define LC_CTYPE        2
 #define LC_MONETARY     3
 #define LC_NUMERIC      4
 #define LC_TIME         5
 #define LC_MESSAGES     6
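
One way to sidestep the differing numbering, sketched here in C under the assumption that a binding can link a small C shim, is to let the C compiler supply the values from the platform's own locale.h instead of restating them by hand. The function name lc_category is hypothetical. For comparison, glibc numbers LC_CTYPE as 0 and LC_ALL as 6.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shim: look up an LC_* category by name and return the value
 * this platform's <locale.h> assigns to it, or -1 if the name is unknown. */
int lc_category(const char *name)
{
    static const struct { const char *name; int value; } table[] = {
        { "LC_ALL",      LC_ALL      },
        { "LC_COLLATE",  LC_COLLATE  },
        { "LC_CTYPE",    LC_CTYPE    },
        { "LC_MONETARY", LC_MONETARY },
        { "LC_NUMERIC",  LC_NUMERIC  },
        { "LC_TIME",     LC_TIME     },
#ifdef LC_MESSAGES                      /* POSIX addition, not in ISO C */
        { "LC_MESSAGES", LC_MESSAGES },
#endif
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(name, table[i].name) == 0)
            return table[i].value;
    return -1;
}
```

A caller can then write setlocale(lc_category("LC_CTYPE"), "") and get the right category on either numbering scheme, at the cost of one extra object file to build per platform.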

And that's even more fun to port than the subtle differences between GNU
libiconv and BSD iconv...

--anders
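
On the point about simple 8-bit encodings: Latin-1 needs no conversion tables at all, because each Latin-1 byte is the Unicode code point of the same value. A minimal C sketch of that special case follows; the function name latin1_to_utf8 is made up for illustration and does not come from libiconv.d.

```c
#include <stddef.h>

/* Hypothetical helper: convert `inlen` bytes of ISO-8859-1 to UTF-8.
 * Writes a NUL-terminated result into `out` and returns the number of
 * bytes written (not counting the NUL), or (size_t)-1 if `out` is too
 * small. */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
                      char *out, size_t outsize)
{
    size_t o = 0;

    for (size_t i = 0; i < inlen; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII range: one byte, unchanged */
            if (o + 1 >= outsize)
                return (size_t)-1;
            out[o++] = (char)c;
        } else {
            /* U+0080..U+00FF: two bytes, 110000xx 10xxxxxx */
            if (o + 2 >= outsize)
                return (size_t)-1;
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }

    if (o >= outsize)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

The output never needs more than 2 * inlen + 1 bytes. Other 8-bit encodings would need a 128-entry lookup table for the upper half instead of the identity mapping used here.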
Mar 02 2005
"Ben Hinkle" <bhinkle mathworks.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message 
news:d04ce0$2nis$1 digitaldaemon.com...
 Nick wrote:

 Here is my "solution" to the encoding problem that a lot of people have been
 complaining about. The problem is that d library functions cannot read or write
 anything but unicode from stdin/stdout. So for fun I made:

 Oh, and it uses iconv and is currently linux-only. Sorry :) Any feedback is
 appreciated.

 Here is a more portable way of using iconv:
 http://www.algonet.se/~afb/d/libiconv.d

Ah - cool! I had forgotten you had done that. Can we put the link in a FAQ
somewhere? Together with a link to Mango's ICU port. Plus examples on how to
use them.

By the way, how many FAQs do we have anyway? I see
http://www.digitalmars.com/d/faq.html
http://www.prowiki.org/wiki4d/wiki.cgi?FaqRoadmap
Mar 02 2005
David L. Davis <SpottedTiger yahoo.com> writes:
In article <d04lhg$oe$1 digitaldaemon.com>, Ben Hinkle says...
 By the way, how many FAQs do we have anyway? I see
 http://www.digitalmars.com/d/faq.html
 http://www.prowiki.org/wiki4d/wiki.cgi?FaqRoadmap

Ben: Well I'd say we have at least 4 FAQs, if you add the following two:

DedicateD (http://int19h.tamb.ru/faq.html)
MKoD (my D site) (http://spottedtiger.tripod.com/D_Language/D_FAQ_XP.html)

David L.
-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
Mar 02 2005
Anders F Björklund <afb algonet.se> writes:
Ben Hinkle wrote:

 Here is a more portable way of using iconv:
 http://www.algonet.se/~afb/d/libiconv.d

 Ah - cool! I had forgotten you had done that. Can we put the link in a FAQ
 somewhere? Together with a link to Mango's ICU port. Plus examples on how to
 use them.

I throw a bunch of hacks on my personal site that I don't really know where
else to put... (or didn't have the time to clean up for release)

I'll be happy to contribute some elsewhere?

--anders
Mar 02 2005