
digitalmars.D.bugs - [Issue 11995] New: std.File.ByLine strips trailing empty line

d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11995

           Summary: std.File.ByLine strips trailing empty line
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: monarchdodra gmail.com



//----
import std.stdio;
void main()
{
    File("stuff0", "w").write("line 1\nline 2");
    File("stuff1", "w").write("line 1\nline 2\n");
    File("stuff2", "w").write("line 1\nline 2\na");
    File("stuff3", "w").write("line 1\nline 2\n\n\n");
    writeln(File("stuff0").byLine());
    writeln(File("stuff1").byLine());
    writeln(File("stuff2").byLine());
    writeln(File("stuff3").byLine());
}
//----
["line 1", "line 2"]
["line 1", "line 2"]
["line 1", "line 2", "a"]
["line 1", "line 2", "", ""]
//----

The problem is that the last empty line is stripped away. This can be a problem
for algorithms that copy/mutate files, as they can't reliably preserve the
exact number of lines to produce: stuff0 and stuff1 produce exactly the same
output, even though their contents differ :/
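
For code that only needs a faithful copy today, one possible workaround is to
ask byLine to keep the terminators, so that a missing final terminator stays
visible. The following is a minimal sketch, not a fix for the issue itself:
copyLines is a hypothetical helper, the copy0/copy1 file names are made up, and
the files are opened in binary mode on the assumption that no newline
translation is wanted.

```d
import std.file : readText;
import std.stdio : File, KeepTerminator;

// Hypothetical helper: copies src to dst line by line, terminators included.
void copyLines(string src, string dst)
{
    auto output = File(dst, "wb");
    // With KeepTerminator.yes each line carries its own '\n', and a final
    // line without a terminator is passed through unchanged.
    foreach (line; File(src).byLine(KeepTerminator.yes))
        output.write(line);
}

void main()
{
    File("stuff0", "wb").write("line 1\nline 2");
    File("stuff1", "wb").write("line 1\nline 2\n");

    copyLines("stuff0", "copy0");
    copyLines("stuff1", "copy1");

    // The two copies stay distinguishable, unlike the default byLine output.
    assert(readText("copy0") == "line 1\nline 2");
    assert(readText("copy1") == "line 1\nline 2\n");
}
```

This only helps when the terminators can be carried along; code that needs the
stripped lines still cannot tell the two files apart.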

I think the correct output should be:
//----
["line 1", "line 2"]
["line 1", "line 2", ""]
["line 1", "line 2", "a"]
["line 1", "line 2", "", "", ""]
//----

With this scheme, it's simple: every line in the file gets an entry, and the
last line of byLine does not actually have a terminator.

This means that something like:
File("out.txt", "w").writef("%-(%s\n%)", File("in.txt").byLine());
would *exactly* duplicate the input file.
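
The proposed scheme can be sketched in memory, without touching byLine, by
splitting on the terminator with std.algorithm.splitter, which yields an empty
entry after a trailing separator. This is only an illustration of the intended
semantics, not the Phobos implementation; splitLines is a hypothetical helper
name.

```d
import std.algorithm : splitter;
import std.array : array, join;
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: one entry per line, including the
// empty entry that follows a trailing '\n'.
string[] splitLines(string text)
{
    return text.splitter('\n').array;
}

void main()
{
    auto samples = [
        "line 1\nline 2",
        "line 1\nline 2\n",
        "line 1\nline 2\na",
        "line 1\nline 2\n\n\n",
    ];

    foreach (text; samples)
    {
        auto lines = splitLines(text);
        writeln(lines);
        // Joining with the terminator restores the original text exactly.
        assert(lines.join("\n") == text);
    }

    assert(splitLines(samples[0]) == ["line 1", "line 2"]);
    assert(splitLines(samples[1]) == ["line 1", "line 2", ""]);
    assert(splitLines(samples[3]) == ["line 1", "line 2", "", "", ""]);
}
```

Because the last entry never carries a terminator, the split/join round trip is
lossless for all four sample inputs, which is the property asked for above.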

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 25 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11995


Peter Alexander <peter.alexander.au gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au gmail.com



16:07:40 PST ---
I agree, but I wonder if this is too much of a breaking change? I imagine
byLine is used a lot, and I could easily see use cases that would break on this
change.

Jan 25 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11995


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrovich gmail.com



04:09:27 PST ---
Is this similar/dupe of http://d.puremagic.com/issues/show_bug.cgi?id=11830 and
http://d.puremagic.com/issues/show_bug.cgi?id=11465?

Jan 26 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11995




04:16:01 PST ---

> Is this similar/dupe of http://d.puremagic.com/issues/show_bug.cgi?id=11830 and
> http://d.puremagic.com/issues/show_bug.cgi?id=11465?

No, they are both separate bugs. 11830 is a bug with the laziness of the
implementation assuming front is called. 11465 is to do with line endings. This
bug is about the treatment of trailing empty lines.
Jan 26 2014