digitalmars.D.learn - Writing unicode strings to the console

"Jeremy DeHaan" <dehaan.jeremiah gmail.com> writes:
I was playing with unicode strings the other day, and have been 
searching for a way to correctly write unicode to the console.

If I try something like:

dstring String = "さいごの果実";
		
writeln(String);

All I get is a bunch of nonsense as if it converts the dstring 
into a regular string. Is it possible to write the unicode string 
to the console correctly?
Dec 17 2012
"Adam D. Ruppe" <destructionator gmail.com> writes:
I suggest you use string instead of dstring, because utf-8 
(string) has better output support than utf-32 (dstring), and 
both support the complete unicode character set.

If string doesn't work, the question is: Windows or Linux?

On Windows, the API call SetConsoleOutputCP will help:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx

The magic number for UTF-8 is 65001. (see here: 
http://msdn.microsoft.com/en-us/library/dd317756%28v=VS.85%29.aspx 
)

The link says utf32 is only available to managed applications, so 
you probably want to use utf-8.
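
A minimal sketch of how the call might be wrapped, assuming the two console functions are declared by hand instead of being imported from a Windows binding module. The names utf8CodePage and writeUtf8Line are hypothetical and exist only for this illustration. The previous code page is saved and restored so the console is left as it was found.

```d
import std.stdio;

// Code page identifier for UTF-8, as given in the post above.
enum uint utf8CodePage = 65001;

version (Windows)
{
    // Win32 console functions from kernel32, declared by hand.
    extern (Windows) uint GetConsoleOutputCP();
    extern (Windows) int SetConsoleOutputCP(uint codePage);

    // Hypothetical helper: switch to UTF-8, print, then restore.
    void writeUtf8Line(string text)
    {
        immutable previous = GetConsoleOutputCP();
        if (SetConsoleOutputCP(utf8CodePage) == 0)
            throw new Exception("SetConsoleOutputCP failed");
        scope (exit) SetConsoleOutputCP(previous);

        writeln(text);
        stdout.flush(); // make sure the bytes go out before restoring
    }
}
else
{
    // On other systems nothing needs to be switched here.
    void writeUtf8Line(string text)
    {
        writeln(text);
    }
}

void main()
{
    writeUtf8Line("さいごの果実");
}
```

Whether the characters actually show up still depends on the console font, so this only takes care of the encoding side.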



If you are on Linux, you need a terminal that supports 
UTF-8. Writing "\033%G" to an xterm will switch it to UTF-8, but 
this is the default most of the time... so you'll probably be OK on 
that.
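
As a rough sketch, the escape sequence could be sent from D like this. The name selectUtf8 is hypothetical, and the escape byte is written as \x1B, which is the same byte as \033. Terminals that are already in UTF-8 mode should not need it.

```d
import std.stdio;

// ESC % G: asks the terminal to switch to UTF-8.
// \x1B is the escape character (octal 033).
enum string selectUtf8 = "\x1B%G";

void main()
{
    write(selectUtf8);   // write, not writef, so '%' is not a format specifier
    stdout.flush();
    writeln("さいごの果実");
}
```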


Again though, writing strings is probably going to give better 
results than dstring on either OS with any set of options.
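
A small sketch of the relationship between the string types, using std.conv.to to transcode. It assumes nothing beyond Phobos; the variable names are made up for the example. The text is the same in every encoding, only the number of code units differs.

```d
import std.conv : to;
import std.stdio;

void main()
{
    dstring utf32 = "さいごの果実"d;
    string  utf8  = utf32.to!string;   // transcode UTF-32 to UTF-8
    wstring utf16 = utf32.to!wstring;  // transcode UTF-32 to UTF-16

    writeln(utf32.length); // 6 code units, one per character
    writeln(utf8.length);  // 18 code units, three bytes per character here
    writeln(utf16.length); // 6 code units, no surrogate pairs needed here

    // Converting back loses nothing.
    assert(utf8.to!dstring == utf32);
    assert(utf16.to!dstring == utf32);

    writeln(utf8);
}
```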
Dec 17 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Jeremy DeHaan:

 Is it possible to write the unicode string to the console 
 correctly?

What is your operating system? On oldish Windows you have to set the console to Unicode or nearly Unicode. I don't know about Windows 7/8.

Bye,
bearophile
Dec 17 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 18, 2012 at 01:29:55AM +0100, Jeremy DeHaan wrote:
 I was playing with unicode strings the other day, and have been
 searching for a way to correctly write unicode to the console.
 
 If I try something like:
 
 dstring String = "さいごの果実";
 		
 writeln(String);
 
 All I get is a bunch of nonsense as if it converts the dstring into a
 regular string. Is it possible to write the unicode string to the
 console correctly?

It works for me (urxvt on Linux/64bit). What console are you using? Does your console support Unicode output? Did you set the console's encoding to UTF-8?

T

-- 
One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
Dec 17 2012
"Jeremy DeHaan" <dehaan.jeremiah gmail.com> writes:
Adam D. Ruppe:

 I suggest you use string instead of dstring, because utf-8 
 (string) has better output support than utf-32 (dstring), and 
 both support the complete unicode character set.

Tried string and wstring. Both had the same results as my dstring.

 On Windows, the api call SetConsoleOutputCP will help
 http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx
 The magic number for UTF-8 is 65001. (see here: 
 http://msdn.microsoft.com/en-us/library/dd317756%28v=VS.85%29.aspx 
 )

How/Where would I call this? I am using 64-bit Win7.

As for the console, I do my D development with MonoDevelop, so it's the console used in that. I looked, but I didn't see any settings relating to this.

Thanks for your help guys!
Dec 17 2012
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 18 December 2012 at 00:59:12 UTC, Jeremy DeHaan wrote:
 How/Where would I call this?

Right at the beginning of your main, but after trying it, I don't think this is going to fix your problem anyway... I think it is fonts. But:

import std.stdio;

// Win32 console function, declared by hand
extern(Windows) int SetConsoleOutputCP(uint);

void main() {
    // 65001 is the UTF-8 code page; the call returns 0 on failure
    if(SetConsoleOutputCP(65001) == 0)
        throw new Exception("failure");
    // \&ldquo; and \&rdquo; are D named character entities (curly quotes)
    string String = "さいごの果\&ldquo;hello\&rdquo;";
    writeln(String);
}

is how you'd use it. But I just tried on my Windows 7 computer, and the default was utf8, so the call was unnecessary. You can try though and see what happens.

A problem I did have, though, was that my console font didn't support unicode. If you bring up the properties option on the console window, under the Font tab, you can pick a font. Mine was set to Raster Fonts, which had no unicode support. It output gibberish.

Lucida Console gave better results - it had the right number of characters and showed the curly quotes, but not the other characters. Could be because I only have the English language pack for Windows installed on my computer.

But anyway, I suggest you try playing with the different fonts you have and see what happens.
Dec 17 2012
"Sam Hu" <samhudotsamhu gmail.com> writes:
On Tuesday, 18 December 2012 at 00:29:56 UTC, Jeremy DeHaan wrote:
 I was playing with unicode strings the other day, and have been 
 searching for a way to correctly write unicode to the console.

 If I try something like:

 dstring String = "さいごの果実";
 		
 writeln(String);

 All I get is a bunch of nonsense as if it converts the dstring 
 into a regular string. Is it possible to write the unicode 
 string to the console correctly?

http://forum.dlang.org/thread/suzymdzjeifnfirtbnrc@dfeed.kimsufi.thecybershadow.net#post-suzymdzjeifnfirtbnrc:40dfeed.kimsufi.thecybershadow.net
Dec 18 2012
"monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 18 December 2012 at 00:29:56 UTC, Jeremy DeHaan wrote:
 I was playing with unicode strings the other day, and have been 
 searching for a way to correctly write unicode to the console.

 If I try something like:

 dstring String = "さいごの果実";
 		
 writeln(String);

 All I get is a bunch of nonsense as if it converts the dstring 
 into a regular string. Is it possible to write the unicode 
 string to the console correctly?

If all else fails, you can always just print to a file instead. That's what I do.
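
A minimal sketch of that fallback, assuming std.file is acceptable for the job. The helper saveText and the file name unicode_output.txt are hypothetical. D strings are already UTF-8, so the bytes go to disk unchanged and can be viewed in any editor that understands UTF-8.

```d
import std.file : readText, write;

// Hypothetical helper: writes the UTF-8 bytes of `text` to `path`.
void saveText(string path, string text)
{
    write(path, text); // std.file.write, not std.stdio.write
}

void main()
{
    enum path = "unicode_output.txt"; // hypothetical file name
    saveText(path, "さいごの果実\n");

    // readText validates the UTF-8 and returns the contents as a string.
    assert(readText(path) == "さいごの果実\n");
}
```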
Dec 18 2012