digitalmars.D - utf-8?

Steve Teale <steve.teale britseyeview.com> writes:
import std.stdio;

void main()
{
   string s = "Die Walküre";
   writefln(s);
}

Gives error - invalid utf-8 sequence. I pasted the text from a Wiki page that
claims to be utf-8. What's happening?
Mar 17 2009
Ary Borenszweig <ary esperanto.org.ar> writes:
Did you save the file as UTF-8?
Mar 17 2009
Trass3r <mrmocool gmx.de> writes:
Yeah, you must save the file as UTF-8.
Mar 17 2009
Steve Teale <steve.teale britseyeview.com> writes:
Was not offered the opportunity. I pasted the text from the web page into Windows Notepad, and saved it - this seemed to me to be as generic as possible given the facilities I have available. What editor will do the trick? Also, if I'm using the Windows version of the compiler, shouldn't it be smart enough to convert the Windows character set to UTF-8?
Mar 17 2009
Ary Borenszweig <ary esperanto.org.ar> writes:
Notepad will do the trick, but you must save it as UTF-8. In the Save As dialog select UTF-8 in Encoding.
Mar 17 2009
Walter Bright <newshound1 digitalmars.com> writes:
Steve Teale wrote:
 Also, if I'm using the Windows version of
 the compiler, shouldn't it be smart enough to convert the Windows
 character set to UTF-8?
The huge problem with code pages is, given a file with some text in it, there is NO CLUE what code page the file's text encoding is in. Just because dmd is running on Windows is not any reason at all that the character set of the file must be the Windows character set. I ran into this problem all the time with C code.
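
A minimal sketch of why the compiler reports an invalid UTF-8 sequence, assuming Notepad saved the file in Windows-1252 (the thread does not say which code page was actually used). In that encoding "ü" is the single byte 0xFC, which is not a legal UTF-8 sequence; in UTF-8 the same character is the two bytes 0xC3 0xBC. The variable names are illustrative only.

```d
import std.stdio;
import std.utf : validate, UTFException;

void main()
{
    // Assumption: "Walküre" as saved in Windows-1252, where 'ü' is the byte 0xFC.
    immutable(ubyte)[] ansiBytes = [0x57, 0x61, 0x6C, 0x6B, 0xFC, 0x72, 0x65];
    // The same word in UTF-8, where 'ü' is the two-byte sequence 0xC3 0xBC.
    immutable(ubyte)[] utf8Bytes = [0x57, 0x61, 0x6C, 0x6B, 0xC3, 0xBC, 0x72, 0x65];

    foreach (bytes; [ansiBytes, utf8Bytes])
    {
        // Reinterpret the raw bytes as a D string without converting them.
        auto text = cast(string) bytes;
        try
        {
            validate(text); // throws UTFException on malformed UTF-8
            writeln(bytes.length, " bytes: valid UTF-8");
        }
        catch (UTFException e)
        {
            writeln(bytes.length, " bytes: invalid UTF-8 sequence");
        }
    }
}
```

The 7-byte version should be rejected and the 8-byte version accepted. Nothing in the bytes themselves says which code page produced them, which is the point made above.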
Mar 17 2009
Georg Wrede <georg.wrede iki.fi> writes:
I pasted your code above and compiled and ran it with no problem. I'm on linux: Fedora 10, and dmd v2.026.
Mar 17 2009
Gide Nwawudu <gide btinternet.com> writes:
Works for me, you should save the file as UTF-8 and set your codepage to 65001.

C:\> dmd test.d
C:\> test
Die Walk+?re
C:\>chcp 65001
Active code page: 65001
C:\>test
Die Walküre

Gide
Mar 17 2009
"Denis Koroskin" <2korden gmail.com> writes:
I believe Phobos should do it manually.
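
A hedged sketch of what doing this from inside the program could look like, rather than running chcp by hand. It is not what Phobos does; it simply calls the Win32 function SetConsoleOutputCP through the druntime binding in core.sys.windows.wincon. It assumes the source file is saved as UTF-8 and that the console font can display the character.

```d
import std.stdio;

version (Windows)
{
    import core.sys.windows.wincon : SetConsoleOutputCP;
}

void main()
{
    version (Windows)
    {
        // 65001 is the UTF-8 code page, the same value passed to chcp.
        // This sketch does not restore the previous code page on exit.
        SetConsoleOutputCP(65001);
    }

    string s = "Die Walküre"; // this source file must itself be saved as UTF-8
    writeln(s);
}
```

On other platforms the version block compiles to nothing and the program just prints the string.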
Mar 17 2009
Steve Teale <steve.teale britseyeview.com> writes:
Yup, that does it. I'd missed the encoding option in notepad. What were you running the program in - in a cmd window I see graphics characters.
Mar 18 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
You have to configure CMD to use Lucida Console as the font. Also note that CMD won't do fallbacks like virtually every other Windows app: if a character isn't in Lucida Console, you won't see it.

-- Daniel
Mar 18 2009