digitalmars.D.bugs - [Issue 12493] New: std.file.readText doesn't convert Windows newlines correctly

d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493

           Summary: std.file.readText doesn't convert Windows newlines
                    correctly
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc


--- Comment #0 from bearophile_hugs eml.cc 2014-03-30 07:16:32 PDT ---
If I create a text file named "text.txt" like this, that contains four lines,
and I save it with Windows newlines:


this
is
a
test




Then if I read and print the whole file like this:

void main() {
    import std.stdio: writeln;
    import std.file: readText;
    auto t = readText("text.txt");
    writeln(t);
}


Output:

this

is

a

test


Expected output:

this
is
a
test

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 30 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493


yebblies <yebblies gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |yebblies gmail.com
           Platform|x86                         |All
         Resolution|                            |INVALID


--- Comment #1 from yebblies <yebblies gmail.com> 2014-04-05 01:28:10 EST ---
stdout is opened in text mode, so it converts all \n to \r\n giving \r\n\n

Everything is working as designed, as terrible as that design might be.

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493


Vladimir Panteleev <thecybershadow gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thecybershadow gmail.com


--- Comment #2 from Vladimir Panteleev <thecybershadow gmail.com> 2014-04-04
18:40:16 EEST ---
I can't reproduce this issue on my machine.

> stdout is opened in text mode, so it converts all \n to \r\n giving \r\n\n

I thought text mode converted both DOS and UNIX newlines to OS newlines, so it
shouldn't double-convert?

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #3 from Vladimir Panteleev <thecybershadow gmail.com> 2014-04-04
18:42:03 EEST ---
If I redirect the output to a file, I get \r\r\n newlines, not \r\n\n. That
makes sense; I don't know where \r\n\n would come from.

Consecutive \r symbols are a no-op, so I don't see why it would print empty
lines between text lines.

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #4 from yebblies <yebblies gmail.com> 2014-04-05 02:54:29 EST ---
(In reply to comment #3)
> If I redirect the output to a file, I get \r\r\n newlines, not \r\n\n. That
> makes sense, I don't know where \r\n\n come from.

Oops, that's what I meant. It converts \n to \r\n, so the original \r\n becomes
\r\r\n.

> Consecutive \r symbols are a no-op, so I don't see why it would print empty
> lines between text lines.

What \r does depends on your terminal.

Apr 04 2014
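
As a rough illustration of the conversion described in comment #4, here is a
minimal sketch that imitates a text-mode stream in plain D. The helper
textModeWrite is hypothetical (it is not part of Phobos or the C runtime); it
only assumes that text mode expands every \n to \r\n without looking at the
preceding byte, which is what the thread describes.

```d
import std.array : replace;
import std.stdio : writefln;
import std.string : representation;

// Hypothetical helper: imitates a Windows text-mode stream by expanding
// every LF into CR LF, regardless of what comes before it.
string textModeWrite(string s)
{
    return s.replace("\n", "\r\n");
}

void main()
{
    // The file contents as readText returns them (CR LF kept as-is).
    string fromFile = "this\r\nis\r\na\r\ntest";

    string written = textModeWrite(fromFile);

    // Print the bytes so that the doubled CR (13 13 10) is visible.
    writefln("%(%d %)", written.representation);
}
```

Under that assumption, each 13 10 pair in the input shows up as 13 13 10 in the
printed bytes, matching the \r\r\n newlines seen when redirecting to a file.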
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #5 from Vladimir Panteleev <thecybershadow gmail.com> 2014-04-04
18:56:55 EEST ---
(In reply to comment #4)
> What \r does depends on your terminal.

What terminal prints a newline when it sees \r? A 90's Macintosh?

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #6 from yebblies <yebblies gmail.com> 2014-04-05 02:59:26 EST ---
(In reply to comment #5)
> (In reply to comment #4)
> > What \r does depends on your terminal.
>
> What terminal prints a newline when it sees \r? A 90's Macintosh?

I'd be very impressed if that's what bearophile is running Windows on.

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #7 from Vladimir Panteleev <thecybershadow gmail.com> 2014-04-04
19:01:34 EEST ---
Windows has a very defined meaning of what \r means. It means, "move the
caret/cursor to the beginning of the line" (\r is CR for Carriage Return, \n is
LF for Line Feed). I recall that my matrix printer interpreted these control
characters in the same way.

Which is why I'm curious how bearophile got that program to print interleaving
lines in the first place.

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #8 from bearophile_hugs eml.cc 2014-04-04 10:20:54 PDT ---
(In reply to comment #7)

> Which is why I'm curious how bearophile got that program to print interleaving
> lines in the first place.

Let's see. If I add a print of the bytes, like this:

void main() {
    import std.stdio;
    import std.file: readText;
    auto t = readText("text.txt");
    writef("%(%d %)", t);
}


The output is:

116 104 105 115 13 10 105 115 13 10 97 13 10 116 101 115 116

Is your output the same?

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #9 from Vladimir Panteleev <thecybershadow gmail.com> 2014-04-04
20:22:40 EEST ---
Yes.

Apr 04 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12493



--- Comment #10 from bearophile_hugs eml.cc 2014-04-04 10:34:08 PDT ---
(In reply to comment #9)
> Yes.

OK. So I presume it's my terminal that acts a little strangely. Thank you all
for the answers, and sorry for the invalid report.

Apr 04 2014
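
Since readText returns the file contents unchanged, one possible way to avoid
the \r\r\n output discussed above is to normalize the newlines before printing.
The following is only a sketch: toUnixNewlines is a hypothetical helper name,
and the string literal stands in for what readText("text.txt") would return for
the file in the report.

```d
import std.array : replace;
import std.stdio : writeln;
import std.string : splitLines;

// Hypothetical helper: collapse CR LF pairs into a single LF.
string toUnixNewlines(string s)
{
    return s.replace("\r\n", "\n");
}

void main()
{
    // Stand-in for: auto t = readText("text.txt");
    string t = "this\r\nis\r\na\r\ntest";

    // Option 1: normalize once, then let the output stream handle newlines.
    writeln(toUnixNewlines(t));

    // Option 2: split into lines (splitLines accepts \r\n, \n and \r)
    // and print them one at a time.
    foreach (line; t.splitLines)
        writeln(line);
}
```

Either way the string handed to writeln no longer contains \r, so a text-mode
stdout has only bare \n characters left to expand.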