digitalmars.D.bugs - std.process.shell character encoding crash

Hello.
I was trying to run this simple code:

---

import std.stdio;
import std.process;

void main(){
    string a = shell("dir");
}

---

With DMD v2.049 on Linux this works fine, but with DMD v2.051 on Windows I get these errors:

C:\test>rdmd shell_test.d

C:\test>dchar decode(in char[], ref size_t): Invalid UTF-8 sequence [32, 83, 118
, 97, 122, 101, 107, 32, 118, 32, 106, 101, 100, 110, 111, 116, 99, 101, 32, 67,
 32, 110, 101, 109, 160, 32, 167, 160, 100, 110, 111, 117, 32, 106, 109, 101, 11
0, 111, 118, 107, 117, 46, 13, 10, 32, 83, 130, 114, 105, 111, 118, 130, 32, 159
, 161, 115, 108, 111, 32, 115, 118, 97, 122, 107, 117, 32, 106, 101, 32, 56, 56,
 54, 65, 45, 67, 51, 52, 49, 46, 13, 10, 13, 10, 32, 86, 236, 112, 105, 115, 32,
 97, 100, 114, 101, 115, 160, 253, 101, 32, 67, 58, 92, 116, 101, 115, 116, 13,
10, 13, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49, 49, 32, 32, 49, 57, 58, 49, 52,
32, 32, 32, 32, 60, 68, 73, 82, 62, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 46,
13, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49, 49, 32, 32, 49, 57, 58, 49, 52, 32,
32, 32, 32, 60, 68, 73, 82, 62, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 46, 46,
13, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49, 49, 32, 32, 49, 57, 58, 49, 52, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 48, 32, 98, 100,
 52, 57, 53, 48, 102, 97, 54, 55, 54, 100, 49, 55, 97, 97, 54, 101, 55, 100, 56,
 51, 100, 55, 53, 102, 48, 53, 102, 51, 55, 100, 49, 48, 55, 97, 50, 50, 99, 56,
 54, 52, 51, 51, 54, 97, 100, 54, 101, 101, 50, 97, 49, 99, 56, 98, 50, 55, 57,
55, 102, 56, 98, 101, 13, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49, 49, 32, 32, 49
, 56, 58, 48, 57, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 51
, 51, 50, 32, 100, 46, 116, 120, 116, 13, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49
, 49, 32, 32, 49, 57, 58, 49, 51, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32
, 32, 50, 255, 53, 55, 51, 32, 115, 104, 101, 108, 108, 95, 116, 101, 115, 116,
45, 100, 45, 49, 48, 48, 70, 66, 53, 56, 49, 65, 48, 48, 70, 53, 69, 54, 51, 69,
 69, 56, 54, 55, 50, 65, 51, 49, 56, 49, 50, 56, 54, 51, 69, 46, 109, 97, 112, 1
3, 10, 48, 52, 46, 48, 49, 46, 50, 48, 49, 49, 32, 32, 49, 57, 58, 49, 49, 32, 3
2, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 57, 48, 32, 115, 104,
 101, 108, 108, 95, 116, 101, 115, 116, 46, 100, 13, 10, 48, 52, 46, 48, 49, 46,
 50, 48, 49, 49, 32, 32, 49, 57, 58, 49, 52, 32, 32, 32, 32, 32, 32, 32, 32, 32,
 32, 32, 32, 32, 53, 255, 55, 53, 51, 32, 115, 104, 101, 108, 108, 95, 116, 101,
 115, 116, 46, 100, 46, 100, 101, 112, 115, 13, 10, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 53, 32, 115, 111, 117, 98, 111, 114, 133, 44, 32
, 32, 32, 32, 32, 32, 32, 32, 32, 32, 56, 255, 55, 52, 56, 32, 98, 97, 106, 116,
 133, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 65, 100, 114, 101, 115
, 160, 253, 133, 58, 32, 32, 32, 32, 32, 50, 44, 32, 32, 32, 86, 111, 108, 110,
236, 99, 104, 32, 98, 97, 106, 116, 133, 58, 32, 32, 51, 255, 52, 53, 56, 255, 4
8, 56, 52, 255, 56, 54, 52, 13, 10] around index 24

---
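The failure can be reproduced without calling shell() at all. This is a minimal sketch that feeds the first 25 bytes of the dump above to std.utf.validate; the helper name isValidUtf8 is made up for this illustration and is not part of Phobos.

```d
import std.stdio;
import std.utf : validate, UTFException;

// Hypothetical helper: true if the bytes form well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // validate() throws UTFException on the first malformed sequence.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // First 25 bytes of the dump: " Svazek v jednotce C nem" followed by byte 160.
    immutable ubyte[] raw = [32, 83, 118, 97, 122, 101, 107, 32, 118, 32,
                             106, 101, 100, 110, 111, 116, 99, 101, 32, 67,
                             32, 110, 101, 109, 160];

    writeln("valid UTF-8: ", isValidUtf8(raw));
    writeln("byte at index 24: ", raw[24]);
}
```

Byte 160 (0xA0) is a UTF-8 continuation byte, so it cannot start a character on its own. That matches the "around index 24" in the error message.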

This was on Windows with a Czech locale. The standard output of dir should
look something like this:

C:\test>dir
 Svazek v jednotce C nemá žádnou jmenovku.




04.01.2011  19:13    <DIR>          .
04.01.2011  19:13    <DIR>          ..
04.01.2011  18:09               332 d.txt
04.01.2011  19:13             2 573 shell_test-d-100FB581A00F5E63EE8672A31812863
E.map
04.01.2011  19:11                90 shell_test.d
04.01.2011  19:13             5 753 shell_test.d.deps



---

I think shell() performs some character encoding conversion, but it fails
because the input is not UTF-8 or ASCII but some obscure Windows encoding
(Windows-1252 or something like that).

Is there any way to fix this problem? Or is shell() unusable for everyone
whose system encoding is something other than UTF-8 or ASCII?
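One possible workaround, sketched here under assumptions: the bytes in the dump look like the OEM console code page 852 (Central European) rather than Windows-1252, because 160, 167 and 253 decode to á, ž and ř there. If the raw bytes can be obtained, they could be translated to UTF-8 by hand. The functions cp852ToUnicode and cp852ToUtf8 are hypothetical names for this illustration, and the table covers only the non-ASCII bytes that occur in the dump above.

```d
import std.stdio;

// Hypothetical, partial CP852 -> Unicode mapping (only bytes seen in the dump).
dchar cp852ToUnicode(ubyte b)
{
    if (b < 0x80)
        return cast(dchar) b;       // ASCII maps to itself

    switch (b)
    {
        case 0x82: return '\u00E9'; // é
        case 0x85: return '\u016F'; // ů
        case 0x9F: return '\u010D'; // č
        case 0xA0: return '\u00E1'; // á
        case 0xA1: return '\u00ED'; // í
        case 0xA7: return '\u017E'; // ž
        case 0xEC: return '\u00FD'; // ý
        case 0xFD: return '\u0159'; // ř
        case 0xFF: return '\u00A0'; // no-break space (thousands separator)
        default:   return '\uFFFD'; // byte not covered by this sketch
    }
}

// Hypothetical helper: convert raw CP852 bytes to a UTF-8 string.
string cp852ToUtf8(const(ubyte)[] raw)
{
    string result;
    foreach (b; raw)
        result ~= cp852ToUnicode(b); // appending a dchar encodes it as UTF-8
    return result;
}

void main()
{
    // First line of the dump: " Svazek v jednotce C nemá žádnou jmenovku."
    immutable ubyte[] raw = [32, 83, 118, 97, 122, 101, 107, 32, 118, 32,
                             106, 101, 100, 110, 111, 116, 99, 101, 32, 67,
                             32, 110, 101, 109, 160, 32, 167, 160, 100, 110,
                             111, 117, 32, 106, 109, 101, 110, 111, 118, 107,
                             117, 46];

    writeln(cp852ToUtf8(raw));
}
```

A real fix would need the full code page table or the operating system's own conversion routine, and would have to ask the system which code page the console uses instead of hard-coding 852.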

PS: Sorry for my English. It's bad, I know.
Jan 04 2011