digitalmars.D.bugs - odd writefln behaviour on windows

"Regan Heath" <regan netwin.co.nz> writes:

Compile the attached UTF-8 encoded source file. Follow the directions in  
the source. I'll paste it here too, in case someone knows the answer  
already:

/+
To replicate this bug on windows:
- left-click top left corner of command prompt window
- select "properties"
- select "font"
- select "Lucida Console"
- type "chcp 65001" into command prompt

Now, compile/run this example. Things to try:
- nothing, note that it writes "smörgåsbord" twice
- uncomment the FPUTWC line, note that it prevents the 2nd "smörgåsbord"
- comment char[] s = "smörgåsbord" and uncomment char[] s = "test", note  
that it prints test<newline>test
+/

import std.stdio;

void main()
{
	char[] s = "smörgåsbord";
	//this string does not exhibit the same problems.
	//char[] s = "test";

	writef("%s",s);
	//uncomment this, note that it prevents the next writef
	//FPUTWC('\n', stdout);
	writef("%s",s);
}

Regan
Nov 24 2005
Georg Wrede <georg.wrede nospam.org> writes:
Regan Heath wrote:
> Compile the attached UTF-8 encoded source file.

Ehh, I'm not sure what the point here is?
Nov 25 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 26 Nov 2005 02:01:36 +0200, Georg Wrede <georg.wrede nospam.org> wrote:
> Regan Heath wrote:
>> Compile the attached UTF-8 encoded source file.
>
> Ehh, I'm not sure what the point here is?

On my PC, following the instructions I posted, using writefln or otherwise writing the \n character to the console prevents any further output from being written to the console. I suspect it only occurs on Windows.

Regan
Nov 25 2005
Manfred Nowak <svv1999 hotmail.com> writes:
Regan Heath wrote:

> - type "chcp 65001" into command prompt

Why do you set an unsupported cp?
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=133659&SiteID=1

-manfred
Nov 26 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 26 Nov 2005 08:35:38 +0000 (UTC), Manfred Nowak <svv1999 hotmail.com> wrote:
> Regan Heath wrote:
>
>> - type "chcp 65001" into command prompt
>
> Why do you set an unsupported cp?
> http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=133659&SiteID=1

Because AFAIK it's the only way to get UTF-8 output from D to appear on the screen correctly. Also, AFAIK writef will only output UTF-8, so I don't really have any choice in the matter, unless you know something I don't?

Regan
Nov 26 2005
Thomas Kuehne <thomas-dloop kuehne.cn> writes:
Regan Heath wrote on 2005-11-26:
> On Sat, 26 Nov 2005 08:35:38 +0000 (UTC), Manfred Nowak <svv1999 hotmail.com> wrote:
>> Regan Heath wrote:
>>
>>> - type "chcp 65001" into command prompt
>>
>> Why do you set an unsupported cp?
>> http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=133659&SiteID=1
>
> Because AFAIK it's the only way to get UTF-8 output from D to appear on the screen correctly. Also, AFAIK writef will only output UTF-8, so I don't really have any choice in the matter, unless you know something I don't?

I haven't got a Windows box right now, thus the code below might require some fixes.

import std.c.windows.windows, std.windows.syserror, std.c.stdio;

enum CODE_PAGE : uint {
	ANSI = 0,
	OEM = 1,
}

void windows_fprintf(FILE* file, CODE_PAGE codePage, wchar* msg){
	char[] result;
	result.length = WideCharToMultiByte(codePage, 0, msg, -1, null, 0, null, null);

	if(result.length){
		size_t len = WideCharToMultiByte(codePage, 0, msg, -1, result.ptr, result.length, null, null);
		if(len != result.length){
			throw new Exception("transcoding failure: " ~ sysErrorString(GetLastError()));
		}
	}

	fprintf(file, "%.*s", result);
}

Thomas
Nov 26 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 26 Nov 2005 21:38:08 +0000 (UTC), Thomas Kuehne <thomas-dloop kuehne.cn> wrote:
> Regan Heath wrote on 2005-11-26:
>> On Sat, 26 Nov 2005 08:35:38 +0000 (UTC), Manfred Nowak <svv1999 hotmail.com> wrote:
>>> Regan Heath wrote:
>>>
>>>> - type "chcp 65001" into command prompt
>>>
>>> Why do you set an unsupported cp?
>>> http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=133659&SiteID=1
>>
>> Because AFAIK it's the only way to get UTF-8 output from D to appear on the screen correctly. Also, AFAIK writef will only output UTF-8, so I don't really have any choice in the matter, unless you know something I don't?
>
> I haven't got a Windows box right now, thus the code below might require some fixes.
>
> import std.c.windows.windows, std.windows.syserror, std.c.stdio;
>
> enum CODE_PAGE : uint {
> 	ANSI = 0,
> 	OEM = 1,
> }
>
> void windows_fprintf(FILE* file, CODE_PAGE codePage, wchar* msg){
> 	char[] result;
> 	result.length = WideCharToMultiByte(codePage, 0, msg, -1, null, 0, null, null);
>
> 	if(result.length){
> 		size_t len = WideCharToMultiByte(codePage, 0, msg, -1, result.ptr, result.length, null, null);
> 		if(len != result.length){
> 			throw new Exception("transcoding failure: " ~ sysErrorString(GetLastError()));
> 		}
> 	}
>
> 	fprintf(file, "%.*s", result);
> }

Thanks for this, but I'm not actually after a solution, per se :)

Is writefln broken, or is the Windows console broken? That is, is either of them not correctly supporting UTF-8?

I suspect the latter; I was really just posting to find that out.

Regan
Nov 26 2005
Jari-Matti Mäkelä <jmjmak invalid_utu.fi> writes:
Regan Heath wrote:
> Thanks for this, but I'm not actually after a solution, per se :)
>
> Is writefln broken, or is the Windows console broken? That is, is either of them not correctly supporting UTF-8?
>
> I suspect the latter; I was really just posting to find that out.

The Windows 9x/NT/XP command prompt is a piece of crap. I'm pretty sure the Vista command prompt won't be any better.
Nov 28 2005
Manfred Nowak <svv1999 hotmail.com> writes:
Regan Heath wrote:

[...]
> Because AFAIK it's the only way to get UTF-8 output from D to appear on the screen correctly.

As you found out yourself, setting the CP to the unsupported 65001 does not work correctly. But it would be wrong to conclude that UTF-8 output from D cannot appear on the _screen_ correctly. If you want to use a particular display program, like the command prompt under Windows, you have to adapt your D program to the capabilities of that display program.

> Also, AFAIK writef will only output UTF-8, so I don't really have any choice in the matter, unless you know something I don't?

I am quite sure that you know the solution: for every piece of UTF-8 output that the intended display program is incapable of presenting, use a substitute presentation. For example, an "oe" instead of the intended `\ouml' might do the job.

Needless to say, UTF-8 and the like are intended as a replacement for utilities like `iconv', and therefore Walter's design decision to restrict D's capabilities to (UTF-8, ...) is the right way to go?

-manfred
Nov 26 2005
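
For reference, the "substitute presentation" suggested in the last post could look roughly like the sketch below. It is written in present-day D rather than the 2005 dialect used in the thread, and it is only an illustration: the function name asciiFallback is hypothetical, and the "aa" stand-in for å is an assumed convention. Only the "oe" for ö comes from the post itself.

```d
import std.stdio;

// Hypothetical helper (not from the thread): swaps characters that the
// console may fail to display for a plain ASCII stand-in.
string asciiFallback(string s)
{
    string result;
    foreach (dchar c; s) // decode the UTF-8 input one code point at a time
    {
        switch (c)
        {
            case 'ö': result ~= "oe"; break; // substitution suggested in the post
            case 'å': result ~= "aa"; break; // assumed convention, not from the post
            default:  result ~= c;    break; // all other characters pass through
        }
    }
    return result;
}

void main()
{
    // Only ASCII reaches the console, so the active code page no longer matters.
    writeln(asciiFallback("smörgåsbord"));
}
```

The trade-off is that the output is no longer the original text, so this probably only suits cases where a readable approximation on the console is enough.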