www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 8084] New: std.stdio.ByLine is not true input range

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084

           Summary: std.stdio.ByLine is not true input range
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: k.hara.pg gmail.com



version(A) and version(B) should output same result, but A is completely
broken.

import std.array;
import std.stdio;
void main()
{
    string fname = __FILE__;
    version(A)
    {
        auto lines = File(fname).byLine().array();
        foreach (ln; lines)
            writeln(ln);
    }
    version(B)
    {
        foreach (ln; File(fname).byLine())
            writeln(ln);
    }
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 11 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |schveiguy yahoo.com
         Resolution|                            |WONTFIX



12:10:30 PDT ---
In order for byLine to be efficient, it must reuse the buffer it's reading.  In
order for the above code to be equivlent, it must dup every line, which is
hugely inefficient if you aren't going to use them beyond the scope of the
foreach statement.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 11 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084


timon.gehr gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr gmx.ch



This is true, but byLine is not a true InputRange, therefore it is questionable
whether or not it should expose an InputRange interface. int opApply(scope int
delegate(scope string ln)) would be more descriptive.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084


Brad Roberts <braddr puremagic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |braddr puremagic.com



---
What part of the definition of an InputRange does it violate?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084





 In order for byLine to be efficient, it must reuse the buffer it's reading.  In
 order for the above code to be equivlent, it must dup every line, which is
 hugely inefficient if you aren't going to use them beyond the scope of the
 foreach statement.
OK. ByLine.front returns the slice of internal line buffer, so it specifies 'temporary' data. We should get dup in each iteration. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com



PDT ---
 OK. ByLine.front returns the slice of internal line buffer, so it specifies
'temporary' data. So? It's returning a reference type, so it could change on you. It _is_ potentially problematic in some cases, but I don't see how that makes it so that it isn't an input range
 We should get dup in each iteration.
It's specifically _avoiding_ that for efficiency purposes. Now, maybe having ByLine's front return the same array with changes values rather than a new array makes it so that it doesn't work with a lot of range-based functions, effectively making it useless as a range, even if it _does_ follow the API. If that's the case, maybe we should make it use opApply for the loop case (having it use the current non-duping behavior) and have front do a dup (and if front as pure, that result could even be implicitly converted to an immutable result, avoiding having to dup it twice to get an immutable copy). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084




19:38:27 PDT ---
ByLine is a true input range, it has empty, popFront and front.

If it was not a true input range, foreach would not work on it.

I also think isInputRange would pass.

The major issue I see with ByLine is that its default type of front is char[]. 
It really should be const(char)[].  And in fact, I don't think char[] should be
possible.

Perhaps, this bug can be redone to address that.


 OK. ByLine.front returns the slice of internal line buffer, so it specifies
 'temporary' data. We should get dup in each iteration.
If you mean ByLine should dup for each iteration, that is extremely wasteful. What if you only need the data during the loop and not afterward? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2012
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8084







 OK. ByLine.front returns the slice of internal line buffer, so it specifies
 'temporary' data. We should get dup in each iteration.
I mean we should dup the front of ByLine in *user code* if it is used after whole iteration, not ByLine should return dup-ed front. And I can agree that ByLine is *true* input range because it has empty, front, and popFront. Therefore, the summary was wrong. Now I think it is correct that this issue was marked as 'resolved wontfix'. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2012