digitalmars.D.announce - File.byLine ought to return dups?
- Graham Fawcett <fawcett uwindsor.ca> Jun 04 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Jun 04 2010
- Adam Ruppe <destructionator gmail.com> Jun 04 2010
- Graham Fawcett <fawcett uwindsor.ca> Jun 04 2010
Hi folks,
I expected the following program to print the lines of a file in
reverse:
// bad.d
import std.stdio;
import std.range;
void main() {
auto f = File("bad.d");
foreach(char[] line; retro(array(f.byLine())))
writeln(line);
}
However, this produces very unusual output: fragments of the same
lines are printed repeatedly. I suspect it's because the byLine()
'generator' is not dup'ing the arrays it reads from the file.
This works as expected:
// good.d
import std.stdio;
import std.range;
Retro!(char[][]) retroLines(File f) {
char[][] lines;
foreach(line; f.byLine())
lines ~= line.dup; // note the .dup
return retro(lines);
}
void main() {
auto f = File("good.d");
foreach(line; retroLines(f))
writeln(line);
}
If you remove the '.dup', then this behaves badly as well.
So is this a bug in File.byLine, or am I just using it badly? :)
Thanks,
Graham
Jun 04 2010
On Fri, 04 Jun 2010 15:27:03 -0400, Graham Fawcett <fawcett uwindsor.ca> wrote:Hi folks, I expected the following program to print the lines of a file in reverse: // bad.d import std.stdio; import std.range; void main() { auto f = File("bad.d"); foreach(char[] line; retro(array(f.byLine()))) writeln(line); } However, this produces very unusual output: fragments of the same lines are printed repeatedly. I suspect it's because the byLine() 'generator' is not dup'ing the arrays it reads from the file. This works as expected: // good.d import std.stdio; import std.range; Retro!(char[][]) retroLines(File f) { char[][] lines; foreach(line; f.byLine()) lines ~= line.dup; // note the .dup return retro(lines); } void main() { auto f = File("good.d"); foreach(line; retroLines(f)) writeln(line); } If you remove the '.dup', then this behaves badly as well. So is this a bug in File.byLine, or am I just using it badly? :)
The latter. File is re-using the buffer for each line, so you are seeing the data get overwritten. This is for performance reasons. Not everyone wants incur heap allocations for every line of a file ;) As you showed, it's possible to get the desired behavior if you need it. The reverse would be impossible. Now, that being said, a nice addition would be to create a duper range that allows you to do one expression: foreach(char[] line; retro(array(duper(f.byLine())))) -Steve
Jun 04 2010
Looking at the code, it appears to be by design, though not explictly documented - it probably should be. Something to note is if you specifically ask for a string in the foreach, it will refuse to compile - the compiler knows it is a mutable char[] too.
Jun 04 2010
Sorry for the double-post to .announce -- I had deleted my .announce post, but obviously not thoroughly enough. I'll follow up on the list. Graham On Fri, 04 Jun 2010 15:41:43 -0400, Steven Schveighoffer wrote:On Fri, 04 Jun 2010 15:27:03 -0400, Graham Fawcett <fawcett uwindsor.ca> wrote:Hi folks, I expected the following program to print the lines of a file in reverse: // bad.d import std.stdio; import std.range; void main() { auto f = File("bad.d"); foreach(char[] line; retro(array(f.byLine()))) writeln(line); } However, this produces very unusual output: fragments of the same lines are printed repeatedly. I suspect it's because the byLine() 'generator' is not dup'ing the arrays it reads from the file. This works as expected: // good.d import std.stdio; import std.range; Retro!(char[][]) retroLines(File f) { char[][] lines; foreach(line; f.byLine()) lines ~= line.dup; // note the .dup return retro(lines); } void main() { auto f = File("good.d"); foreach(line; retroLines(f)) writeln(line); } If you remove the '.dup', then this behaves badly as well. So is this a bug in File.byLine, or am I just using it badly? :)
The latter. File is re-using the buffer for each line, so you are seeing the data get overwritten. This is for performance reasons. Not everyone wants incur heap allocations for every line of a file ;) As you showed, it's possible to get the desired behavior if you need it. The reverse would be impossible. Now, that being said, a nice addition would be to create a duper range that allows you to do one expression: foreach(char[] line; retro(array(duper(f.byLine())))) -Steve
Jun 04 2010









"Steven Schveighoffer" <schveiguy yahoo.com> 