digitalmars.D.bugs - DMD 144 Windows command line not in Locale

Georg Wrede <georg.wrede nospam.org> writes:
Below, I give some parameters to hello in the DOS window, and it gets
the non-US-ASCII characters wrong.

My W2000 is set to Finland.

-----------

C:\dmd\samples\d>dmd hello.d
C:\dmd\bin\..\..\dm\bin\link.exe hello,,,user32+kernel32/noi;

C:\dmd\samples\d>hello itse senkin parjaava ämälämkä k,,,ölkjölkjölkj s
hello world
args.length = 7
args[0] = 'C:\dmd\samples\d\hello.exe'
args[1] = 'itse'
args[2] = 'senkin'
args[3] = 'parjaava'
args[4] = 'õmõlõmkõ'
args[5] = 'k,,,÷lkj÷lkj÷lkj'
args[6] = 's'

C:\dmd\samples\d>hello öööÖÖÖäÄåÅ
hello world
args.length = 1
args[0] = 'C:\dmd\samples\d\hello.exe'

C:\dmd\samples\d>hello öööÖÖÖäÄåÅ x
hello world
args.length = 2
args[0] = 'C:\dmd\samples\d\hello.exe'
args[1] = 'x'

C:\dmd\samples\d>hello aöööa x
hello world
args.length = 3
args[0] = 'C:\dmd\samples\d\hello.exe'
args[1] = 'a÷÷÷a'
args[2] = 'x'

C:\dmd\samples\d>hello öööö x
hello world
args.length = 2
args[0] = 'C:\dmd\samples\d\hello.exe'
args[1] = 'x'

C:\dmd\samples\d>dmd
Digital Mars D Compiler v0.144
Jan 25 2006
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Georg Wrede wrote:
> Below, I give some parameters in the dos window to hello, and it gets
> the non-usascii characters wrong.

The console is probably using an encoding other than UTF-8. Try executing "chcp 65001" before running the D program.

--
Tomasz Stachowiak /+ a.k.a. h3r3tic +/
Jan 25 2006
Georg Wrede <georg.wrede nospam.org> writes:
Tom S wrote:
> Georg Wrede wrote:
>> Below, I give some parameters in the dos window to hello, and it gets
>> the non-usascii characters wrong.
>
> The console is probably using encoding other than UTF8. Try executing "chcp 65001" before running the D program.

Ehh, that's not the point, is it?
Jan 25 2006
Anders F Björklund <afb algonet.se> writes:
Georg Wrede wrote:
>> The console is probably using encoding other than UTF8. Try executing
>> "chcp 65001" before running the D program.
>
> Ehh, that's not the point, is it?

Not sure where the "bug" is here, though...

Either D does not "support" any consoles other than UTF-8, and using anything else falls into undefined behaviour (i.e. what it does now), or it is *meant* to convert the local encoding into UTF-8 before stuffing it into the args[][] table. Right now, it just copies whatever it gets.

It's pretty simple to get invalid Unicode in the arguments array... just as it's very simple to get undefined return codes* from programs. But I haven't heard whether the problem is in the spec or the implementation.

--anders

* void main() {}
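
To make the "invalid Unicode" point concrete, here is a minimal D sketch. It assumes that the runtime hands over the ANSI (Windows-1252) bytes of the command line unchanged. That assumption would be consistent with the output above, since 0xE4 (ä) and 0xF6 (ö) display as õ and ÷ in OEM code page 850, but it is not confirmed anywhere in this thread. The helper names isValidUtf8 and fromWindows1252 are made up for illustration, and the code uses present-day Phobos (std.encoding, std.utf), not the library that shipped with 0.144.

```d
import std.encoding : Windows1252String, transcode;
import std.utf : UTFException, validate;

/// Returns true if `s` is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Hypothetical helper: treats the bytes of `raw` as Windows-1252 text
/// and converts them to UTF-8.
string fromWindows1252(const(char)[] raw)
{
    string utf8;
    transcode(cast(Windows1252String) raw.idup, utf8);
    return utf8;
}

void main()
{
    import std.stdio : writeln;

    // "ämälämkä" as Windows-1252 bytes (ä = 0xE4). Assumption: this is
    // what ends up in args[4] in the report above.
    string raw = "\xE4m\xE4l\xE4mk\xE4";

    writeln(isValidUtf8(raw));                      // false
    writeln(fromWindows1252(raw) == "ämälämkä");    // true
}
```

The byte 0xE4 looks like the start of a three-byte UTF-8 sequence, but the next byte is a plain ASCII letter, so validation fails. Any code that later decodes such an argument as UTF-8 is working with malformed input.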
Jan 25 2006
Sean Kelly <sean f4.ca> writes:
Anders F Björklund wrote:
> Georg Wrede wrote:
>>> The console is probably using encoding other than UTF8. Try executing
>>> "chcp 65001" before running the D program.
>>
>> Ehh, that's not the point, is it?
>
> Not sure where the "bug" is here, though... Either D does not "support" any consoles other than UTF-8, and using anything else falls into undefined behaviour (i.e. what it does now) Or, if it is *meant* to convert the local encoding into UTF-8 before stuffing into the args[][] table. Right now, it just copies whatever. It's pretty simple to get invalid Unicode, in the arguments array... Just as it's very simple to get undefined return codes* from programs ? But I haven't heard if the problem is in the spec or the implementation.

I think the data should be converted, if possible, by the compiler runtime before main() is called. This should simply be a matter of adding some code to dmain2.d in phobos/internal. How should this be handled? I.e., is there a specific known console encoding that should be converted from?

Sean
Jan 25 2006
"Chris Miller" <chris dprogramming.com> writes:
On Wed, 25 Jan 2006 17:49:44 -0500, Sean Kelly <sean f4.ca> wrote:

> Anders F Björklund wrote:
>> Georg Wrede wrote:
>>>> The console is probably using encoding other than UTF8. Try executing
>>>> "chcp 65001" before running the D program.
>>>
>>> Ehh, that's not the point, is it?
>>
>> Either D does not "support" any consoles other than UTF-8, and using anything else falls into undefined behaviour (i.e. what it does now) Or, if it is *meant* to convert the local encoding into UTF-8 before stuffing into the args[][] table. Right now, it just copies whatever. It's pretty simple to get invalid Unicode, in the arguments array... Just as it's very simple to get undefined return codes* from programs ? But I haven't heard if the problem is in the spec or the implementation.
>
> I think the data should be converted, if possible, by the compiler runtime before main() is called. This should simply be a matter of adding some code to dmain2.d in phobos/internal. How should this be handled? ie. is there a specific known console encoding that should be converted from?
>
> Sean

It's simple to fix main args[][]: just use GetCommandLineW() (which is supported on win9x/me) and run the result through std.utf.toUTF8(). I announced this simple fix a while ago.
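
A rough sketch of that approach in present-day D follows. It is not the actual dmain2.d change. Splitting the command line with CommandLineToArgvW is an addition that the post does not mention, and whether that function is available on win9x/me is not checked here. The names toUtf8Args and commandLineArgs are hypothetical, and the module paths are those of current druntime.

```d
import std.utf : toUTF8;

/// Converts UTF-16 arguments to UTF-8 strings.
string[] toUtf8Args(const(wchar[])[] wideArgs)
{
    string[] result;
    foreach (arg; wideArgs)
        result ~= toUTF8(arg);
    return result;
}

version (Windows)
{
    import core.sys.windows.shellapi : CommandLineToArgvW;
    import core.sys.windows.winbase : GetCommandLineW, LocalFree;

    pragma(lib, "shell32"); // CommandLineToArgvW lives in shell32

    /// Hypothetical helper: reads the process command line as UTF-16,
    /// splits it into arguments, and returns them as UTF-8.
    string[] commandLineArgs()
    {
        int argc;
        wchar** argv = CommandLineToArgvW(GetCommandLineW(), &argc);
        if (argv is null)
            return null;
        scope (exit) LocalFree(argv); // runs after the result is copied

        const(wchar)[][] wide;
        foreach (i; 0 .. argc)
        {
            // Each argument is NUL-terminated; find its length.
            size_t len = 0;
            while (argv[i][len] != 0)
                ++len;
            wide ~= argv[i][0 .. len];
        }
        return toUtf8Args(wide);
    }
}

void main()
{
    import std.stdio : writefln;

    version (Windows)
        string[] args = commandLineArgs();
    else
        string[] args = toUtf8Args(["hello"w, "ämälämkä"w]); // stand-in data off Windows

    foreach (i, arg; args)
        writefln("args[%s] = '%s'", i, arg);
}
```

Because the wide-character API returns UTF-16 regardless of the active code page, no guess about the console encoding is needed for the arguments themselves. Printing them correctly still depends on the console code page, which is the separate issue raised earlier with chcp 65001.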
Jan 25 2006