digitalmars.D.bugs - [Issue 9599] New: File.byLine doesn't function properly with take

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599

           Summary: File.byLine doesn't function properly with take
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: zshazz gmail.com


--- Comment #0 from Chris Cain <zshazz gmail.com> 2013-02-26 23:46:15 PST ---
Using 2.062, regarding the following code:

---
import std.stdio, std.range;

void main() {
    auto file = File.tmpfile();
    file.write("1\n2\n3\n");
    file.rewind();

    auto fbl = file.byLine();
    foreach(line; fbl.take(1)) writeln(line);
    foreach(line; fbl) writeln(line);
}
---

The expected output for this would be:
---
1
2
3

---

but actual output:
---
1
3

---

Generalized observation: When take is used on a ByLine range, it takes the
appropriate number of elements and then consumes one additional element,
which prevents anything else from reading that element.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 26 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com


--- Comment #1 from monarchdodra gmail.com 2013-02-27 03:35:11 PST ---
(In reply to comment #0)
 Using 2.062, Regarding the following code:

The bug is actually inside byLine itself, so we can remove take from the
equation. The problem is that byLine is over-eager:
1) Creating a front element eagerly pops that element.
2) Popping an element eagerly parses the next, effectively popping it off too
if it is never read.

Reduced test showing this:
//----
import std.stdio;

void main()
{
    auto file = File.tmpfile();
    file.write("1\n2\n3\n4\n5");
    file.rewind();

    auto fbl1 = file.byLine();
    writeln(fbl1.front); //prints 1.

    auto fbl2 = file.byLine();
    writeln(fbl2.front); //prints 2... Wait. Who popped off 1?
    fbl2.popFront(); //pops off 2, and consumes 3.

    auto fbl3 = file.byLine();
    writeln(fbl3.front); //prints 4.
}
//----

Ideally, byLine should be reworked to be a little more lazy, and better
preserve the integrity of its underlying stream:
- "front means do NOT modify the referenced container"
- "pop means remove the CURRENT element, and stop there"

byLine is obviously not doing that. The fact that it is *just* an input range
does not mean it gets to bypass standard rules.
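
For illustration only, here is a minimal sketch of a range that follows the two rules quoted above. It is not how Phobos implements byLine: `Source` and `LazyLines` are hypothetical names, and an in-memory string with a cursor stands in for the file so that `front` can peek without consuming anything. A real stream may not be able to do that cheaply.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical stand-in for a file: the text plus a shared read cursor.
final class Source
{
    string data;
    size_t pos;
    this(string data) { this.data = data; }
}

// Hypothetical lazy line range (an assumption, not the Phobos ByLine).
struct LazyLines
{
    private Source src;

    this(Source src) { this.src = src; }

    @property bool empty() { return src.pos >= src.data.length; }

    // front only peeks: the cursor of the underlying source is not moved.
    @property string front()
    {
        auto rest = src.data[src.pos .. $];
        auto nl = rest.indexOf('\n');
        return nl < 0 ? rest : rest[0 .. nl];
    }

    // popFront removes the current line and nothing more.
    void popFront()
    {
        auto rest = src.data[src.pos .. $];
        auto nl = rest.indexOf('\n');
        src.pos = nl < 0 ? src.data.length : src.pos + nl + 1;
    }
}

void main()
{
    auto src = new Source("1\n2\n3\n4\n5");

    auto a = LazyLines(src);
    writeln(a.front); // prints 1

    auto b = LazyLines(src);
    writeln(b.front); // prints 1 again: reading front consumed nothing
    b.popFront();     // removes 1 only

    auto c = LazyLines(src);
    writeln(c.front); // prints 2
}
```

With these semantics, creating a second range over the same source does not lose a line, and popping one element leaves the next one available to whoever reads the source afterwards.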
Feb 27 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599



--- Comment #2 from monarchdodra gmail.com 2013-02-27 10:09:30 PST ---
(In reply to comment #1)
 (In reply to comment #0)
 Using 2.062, Regarding the following code:

The bug is actually inside byLine itself

byChunk is subject to the exact same issue. The fact that they don't behave
according to normal range semantics could be a potentially serious problem
when they are not used linearly.
Feb 27 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|nobody puremagic.com        |monarchdodra gmail.com


--- Comment #3 from monarchdodra gmail.com 2013-03-13 00:20:19 PDT ---
(In reply to comment #1)
 Ideally, byLine should be reworked to be a little more lazy, and better
 preserve the integrity of its underlying stream:
 - "front means do NOT modify the referenced container"
 - "pop means remove the CURRENT element, and stop there"

Either that, or take the
Mar 13 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599


Nick Treleaven <ntrel-public yahoo.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull
                 CC|                            |ntrel-public yahoo.co.uk


--- Comment #4 from Nick Treleaven <ntrel-public yahoo.co.uk> 2013-07-22 08:32:06 PDT ---
(In reply to comment #1)
 The bug is actually inside byLine itself, so we can remove take from the
 equation. The problem is that byLine is over-eager:
 1) Creating a front element eagerly pops that element.
 2) Popping an element eagerly parses the next, effectively popping it off too
 if it is never read:

https://github.com/D-Programming-Language/phobos/pull/1433
    auto file = File.tmpfile();
    file.write("1\n2\n3\n4\n5");
    file.rewind();
 
    auto fbl1 = file.byLine();
    writeln(fbl1.front); //prints 1.
 
    auto fbl2 = file.byLine();
    writeln(fbl2.front); //prints 2... Wait. Who popped off 1?

I think the above behaviour is understandable for a range like ByLine.
Jul 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599



--- Comment #5 from github-bugzilla puremagic.com 2013-08-19 10:50:18 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/4c2a8bea355e2a980b21d41c5454fe7a34de1777
Add unittest for issue 9599, plus some other byLine cases

https://github.com/D-Programming-Language/phobos/commit/ec1f0fdb9d3f4b9ffd3acd444d27195ffc6a15fb
Fix Issue 9599 - File.byLine doesn't function properly with take

Calling take could wrongly pop an extra line from the range.
Solved by making ByLine use reference-counting.

Note: Just changing ByLine not to eagerly read the next line was not
sufficient to handle all cases properly (plus that makes empty() less
efficient).

Note: ByLine was documented until recently.

https://github.com/D-Programming-Language/phobos/commit/7bc6e8153921b10eb61179ec318e01b825ff94c5
Merge pull request #1433 from ntrel/byLine-take

Fix Issue 9599 - File.byLine doesn't function properly with take
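
To illustrate the idea behind the fix (copies of the range sharing one reference-counted state), here is a rough sketch. It is not the code from the pull request: `LineState` and `SharedLines` are hypothetical names, an array of strings stands in for the file, and only `std.typecons.RefCounted` and `std.range.take` are real library pieces. The range still reads the next line eagerly, as the commit message says ByLine does.

```d
import std.range : take;
import std.stdio : writeln;
import std.typecons : RefCounted, RefCountedAutoInitialize;

// Hypothetical state shared by every copy of the range.
struct LineState
{
    string[] pending; // lines not yet read from the stand-in "stream"
    string current;   // the line that front returns
    bool done;        // true once the stream is exhausted

    // Eagerly read the next line into current.
    void readNext()
    {
        if (pending.length == 0)
        {
            done = true;
            return;
        }
        current = pending[0];
        pending = pending[1 .. $];
    }
}

// Hypothetical range whose copies all point at the same LineState.
struct SharedLines
{
    private alias State = RefCounted!(LineState, RefCountedAutoInitialize.no);
    private State state;

    this(string[] lines)
    {
        state = State(lines, "", false);
        state.readNext(); // prime the first line
    }

    @property bool empty() { return state.done; }
    @property string front() { return state.current; }
    void popFront() { state.readNext(); }
}

void main()
{
    auto r = SharedLines(["1", "2", "3"]);
    foreach (line; r.take(1)) writeln(line); // prints 1
    foreach (line; r) writeln(line);         // prints 2, then 3
}
```

Because `take` and `foreach` operate on copies that share the same state, the line read ahead by the copy inside `take` is still the front of `r` afterwards, so nothing is lost.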

Aug 19 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


Aug 19 2013