
digitalmars.D.learn - How to print unicode like: こんにちは 世界

reply Matthew Ong <ongbp yahoo.com> writes:
Hi,

import std.stdio;
alias immutable(wchar)[] String;
String str="Hello, world; or &#922;&#945;&#955;&#951;&#956;&#941;&#961;&#945;
&#954;&#972;&#963;&#956;&#949;; or &#12371;&#12435;&#12395;&#12385;&#12399;
&#19990;&#30028;";
writeln(str); // It prints garbage on console.

In Java and Go, that just works. I believe UTF-X handles that.

How do I do that in D?

Yes, I am still new to D. No, I am not Japanese but Chinese.


Matthew Ong
May 19 2011
next sibling parent Matthew Ong <ongbp yahoo.com> writes:
Ah... The web encoder corrupted the string into a non-human-readable form.
May 19 2011
prev sibling next sibling parent reply Matthew Ong <ongbp yahoo.com> writes:
On 5/19/2011 11:22 PM, Matthew Ong wrote:
 Hi,

 import std.stdio;
 alias immutable(wchar)[] String;
 String str="Hello, world;
or&#922;&#945;&#955;&#951;&#956;&#941;&#961;&#945;&#954;&#972
&#963;&#956;&#949;; or&#12371;&#12435;&#12395;&#12385;&#12399;&#19990;&#30028;";
 writeln(str); // It prints garbage on console.

 In Java and Go, that just works. I believe UTF-X is handles that.

 How to do that within D?

 Yes. I am still new to D. No. I am not japanese but chinese.


 Matthew Ong

String str="Hello, world; or Καλημέρα κόσμε; or こんにちは 世界";

--
Matthew Ong
email: ongbp yahoo.com
May 19 2011
next sibling parent Robert Clipsham <robert octarineparrot.com> writes:
On 19/05/2011 16:19, Matthew Ong wrote:
 On 5/19/2011 11:22 PM, Matthew Ong wrote:
 Hi,

 import std.stdio;
 alias immutable(wchar)[] String;
 String str="Hello, world;
 or&#922;&#945;&#955;&#951;&#956;&#941;&#961;&#945;&#954;&#972;&#963;&#956;&#949;;
 or&#12371;&#12435;&#12395;&#12385;&#12399;&#19990;&#30028;";
 writeln(str); // It prints garbage on console.

 In Java and Go, that just works. I believe UTF-X is handles that.

 How to do that within D?

 Yes. I am still new to D. No. I am not japanese but chinese.


 Matthew Ong

String str="Hello, world; or Καλημέρα κόσμε; or こんにちは 世界";

There is a built-in alias for immutable(wchar)[]: wstring. Try postfixing the literal with w, e.g.:

wstring str = "Hello, world; or Καλημέρα κόσμε; or こんにちは 世界"w;

You could even use auto, given that you postfixed the string with w.

--
Robert
http://octarineparrot.com/
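
As an illustration of the suffix suggestion, here is a minimal sketch (the variable names are made up for this example) showing the three built-in string types side by side. The text is the same in each case; only the encoding, and therefore the number of code units reported by .length, differs.

```d
import std.conv : to;
import std.range : walkLength;
import std.stdio;

void main()
{
    string  s8  = "こんにちは";   // UTF-8 (no suffix, or c)
    wstring s16 = "こんにちは"w;  // UTF-16
    dstring s32 = "こんにちは"d;  // UTF-32
    auto inferred = "世界"w;      // the w suffix makes this a wstring

    static assert(is(typeof(inferred) == wstring));

    // .length counts code units, not characters.
    assert(s8.length == 15);     // 5 characters, 3 bytes each in UTF-8
    assert(s16.length == 5);
    assert(s32.length == 5);
    assert(s8.walkLength == 5);  // walkLength decodes to code points

    // Converting between the encodings preserves the text.
    assert(s16.to!string == s8);

    writeln(s8);
    writeln(s16);
}
```

Whether the output is readable still depends on the console's code page and font, not on which of the three types is used.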
May 19 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
If you're on Windows, you need to set the proper codepage for the
console, you can do it programmatically like so:

import std.stdio;

version(Windows)
{
   import std.c.windows.windows;
   extern(Windows) BOOL SetConsoleOutputCP(UINT);
}

void main()
{
   // 65001 is the Windows code page identifier for UTF-8.
   version(Windows) SetConsoleOutputCP(65001);
   writeln("Hello, world; or Καλημέρα κόσμε; or こんにちは 世界");
}

You would also need a Unicode-aware font, maybe Lucida or something
similar. Fixed-width fonts used for programming typically have little
support for Unicode characters, so you would get a black or white box
like "[]" in place of each character the font lacks.
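
A variant of the same idea, sketched under the assumption of a current compiler where the Windows bindings live in core.sys.windows.windows (which already declares GetConsoleOutputCP and SetConsoleOutputCP). It remembers the previous code page and restores it on exit, so the console is left as it was found. The name utf8CodePage is only an illustrative constant.

```d
import std.stdio;

version (Windows)
{
    import core.sys.windows.windows; // GetConsoleOutputCP, SetConsoleOutputCP
}

// Windows code page identifier for UTF-8.
enum uint utf8CodePage = 65001;

void main()
{
    version (Windows)
    {
        // Remember the current code page and put it back when main returns.
        immutable previous = GetConsoleOutputCP();
        SetConsoleOutputCP(utf8CodePage);
        scope (exit) SetConsoleOutputCP(previous);
    }

    writeln("Hello, world; or Καλημέρα κόσμε; or こんにちは 世界");
    stdout.flush(); // make sure the text is written before the code page is restored
}
```

On other platforms the version block compiles to nothing and the program simply writes UTF-8 to standard output. The font caveat still applies: the code page only controls how bytes are decoded, not which glyphs are available.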


On 5/19/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 Which OS?

May 19 2011
prev sibling next sibling parent reply Russel Winder <russel russel.org.uk> writes:

On Thu, 2011-05-19 at 22:37 +0200, Andrej Mitrovic wrote:
[ . . . ]
 You would also need a Unicode-aware font, maybe Lucida or something
 similar. Typically fixed-point fonts used for programming have little
 support for Unicode characters, and you would get back a black or
 white box like "[]".

Alternatively use a nice proportional font with Unicode support for code so it is more readable? Backward compatibility with 80x24 terminals is just so last millennium.

--
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
May 19 2011
parent reply Matthew Ong <ongbp yahoo.com> writes:
On 5/20/2011 2:55 PM, Russel Winder wrote:
 On Thu, 2011-05-19 at 22:37 +0200, Andrej Mitrovic wrote:
 [ . . . ]
 You would also need a Unicode-aware font, maybe Lucida or something
 similar. Typically fixed-point fonts used for programming have little
 support for Unicode characters, and you would get back a black or
 white box like "[]".

Alternatively use a nice proportional font with Unicode support for code so it is more readable? Backward compatibility with 80x24 terminals is just so last millennium.

Thanks for the example code on the Unicode issue.

Hi Andrej,

version(Windows)
{
   import std.c.windows.windows;
   extern(Windows) BOOL SetConsoleOutputCP(UINT);
}

void main()
{
   version(Windows) SetConsoleOutputCP(65001);
   writeln("Hello, world; or Καλημέρα κόσμε; or こんにちは 世界");
}

I am running this on M$ Vista Professional edition. Unfortunately, that code sample does not work... Please see the attachment.

The reason I am testing this is to understand how the stream library works. A console test is a fundamental check of whether the language handles Unicode within the source code itself automatically and displays it correctly when it does file I/O, network I/O and GUI output. A modern language normally should, because that allows a developer to define a simple resource bundle file that is loaded automatically within the same fragment of code. If not, there is going to be a problem. I hope we do not need to write code like this:

version(English){
   // some code
}version(Chinese_MainLand){
   // some other code
}version(Chinese_HongKong){
   // yet another code
}...etc

I had originally planned to send an attached screenshot, but that does not work for the newsgroup. Perhaps someone can show me how to do that here.

--
Matthew Ong
email: ongbp yahoo.com
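
On the worry about per-language version blocks: version selects code at compile time, so it is not the natural tool for choosing a language at run time. Since D source files may be UTF-8, literals in several scripts can sit in one source file and be picked with ordinary run-time logic. The sketch below is only an illustration; the function name greeting and the locale codes are hypothetical, and a real program would more likely load such a table from a UTF-8 resource file.

```d
import std.stdio;

// Hypothetical lookup: returns the greeting for a locale code,
// falling back to English for anything unknown.
string greeting(string locale)
{
    switch (locale)
    {
        case "el": return "Καλημέρα κόσμε";
        case "ja": return "こんにちは 世界";
        default:   return "Hello, world";
    }
}

void main(string[] args)
{
    // The locale comes from the first command-line argument, if any.
    immutable locale = args.length > 1 ? args[1] : "en";
    writeln(greeting(locale));
}
```

One binary then serves every language, and no version identifiers are needed.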
May 20 2011
parent reply Matthew Ong <ongbp yahoo.com> writes:
Correction on my part: it does seem like a font issue. Strangely, the same console program that used to show Unicode correctly for both Java and Go no longer works. I think it is true that this has nothing to do with D; it is more of a font thing.

The output displays correctly within the Eclipse Java and Go consoles, but NOT in the DDT D console. Without an IDE, all of them show garbage...

Perhaps someone can kindly show me how the console font settings work on M$ Vista.
Unicode-aware font, maybe Lucida< How I tried Lucida it also does not 

Tab.

--
Matthew Ong
email: ongbp yahoo.com
May 21 2011
parent Matthew Ong <ongbp yahoo.com> writes:
On 5/21/2011 7:42 PM, Andrej Mitrovic wrote:
 Oh yeah cmd.exe doesn't really have that many font options. Personally
 I use console2 from http://sourceforge.net/projects/console/ , which
 has font options and nice things compared to cmd.exe. (It's really
 just a GUI wrapper with some extras).

-- Matthew Ong email: ongbp yahoo.com
May 23 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Oh yeah cmd.exe doesn't really have that many font options. Personally
I use console2 from http://sourceforge.net/projects/console/ , which
has font options and nice things compared to cmd.exe. (It's really
just a GUI wrapper with some extras).
May 21 2011
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
Try using the hex unicode characters like this:

"\u0123"

I don't think decimal character references like the ones you used work
in D strings. The &name; syntax sometimes works, but only for named
characters, and you need to put a \ before them like this:

writeln("\&ldquo; hello \&rdquo;");

Prints:
“ hello ”

You'll have to make sure your console is set to a unicode code
page though. This might be done automatically on Windows; I'm
not sure.
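
To make the escape suggestion concrete, here is a small sketch that spells the thread's greeting with \u escapes. The code points are the decimal values from the corrupted post (for example &#12371; is U+3053) converted to hexadecimal; the variable names are illustrative.

```d
import std.stdio;

void main()
{
    // \uXXXX takes exactly four hex digits.
    string greek    = "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1"; // Kalimera
    string japanese = "\u3053\u3093\u306B\u3061\u306F";                   // Konnichiwa
    string world    = "\u4E16\u754C";                                     // "world"

    // An escaped literal is identical to the same text typed directly.
    assert(japanese == "こんにちは");
    assert(world == "世界");
    assert(greek.length == 16); // 8 Greek letters, 2 UTF-8 bytes each

    writeln(greek, "; ", japanese, " ", world);
}
```

Escapes are mainly useful when the editor or the transport cannot be trusted to preserve the characters; a UTF-8 source file with the text typed directly compiles to the same string.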
May 19 2011
prev sibling next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Matthew Ong Wrote:

 Hi,
 
 import std.stdio;
 alias immutable(wchar)[] String;
 String str="Hello, world; or &#922;&#945;&#955;&#951;&#956;&#941;&#961;&#945;
&#954;&#972;&#963;&#956;&#949;; or &#12371;&#12435;&#12395;&#12385;&#12399;
&#19990;&#30028;";
 writeln(str); // It prints garbage on console.
 
 In Java and Go, that just works. I believe UTF-X is handles that.
 
 How to do that within D?
 
 Yes. I am still new to D. No. I am not japanese but chinese.
 
 
 Matthew Ong

Consoles tend not to display Unicode by default. Generally you need to tell the console to use a font that supports it, and I've heard people say you can pass a string to the console to say the output is Unicode. This has been the only issue I've seen people have; D operates just fine with Unicode.
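
One way to check that the string itself is intact, independent of what the console shows, is to validate and decode it in the program. This is a minimal sketch using std.utf.validate and a foreach over dchar; the variable names are illustrative.

```d
import std.stdio;
import std.utf : validate;

void main()
{
    string text = "こんにちは 世界";

    // Throws a UTFException if the bytes are not well-formed UTF-8.
    validate(text);

    // Iterating with a dchar loop variable decodes code units into code points.
    size_t codePoints;
    foreach (dchar c; text)
        ++codePoints;

    assert(codePoints == 8);   // 5 + 1 space + 2
    assert(text.length == 22); // 7 characters of 3 bytes each, plus 1 for the space

    writefln("%s code points in %s bytes", codePoints, text.length);
}
```

If these assertions pass but the console still shows garbage, the problem lies in the console's code page or font rather than in the D program.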
May 19 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Which OS?
May 19 2011