digitalmars.D - Phobos usability with text files

reply bearophile <bearophileHUGS lycos.com> writes:
This is related to this (closed) issue, but this time I prefer to discuss the
topic on the newsgroup:
http://d.puremagic.com/issues/show_bug.cgi?id=4474

To show why this enhancement request is useful I use a little scripting task. A
D2 program has to read a file that contains one word in each line, and print
all the words with the highest length.

A first naive functional-style implementation follows. (Keep in mind that File
opens files in binary mode by default. This differs from the Python default,
and I am not sure which default is best. Most files I open are text files, so
I think opening in text mode by default would be a bit better.) It doesn't
work; it produces an access violation:

// #1
import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto maxLen = reduce!max(map!"a.length"(lazyWords));
    writeln(filter!((w){ return w.length == maxLen; })(lazyWords));
}


Finding the max length consumes the lazy iterable, so if the text file is small
a possible functional style solution is to convert the lazy iterable into an
array (other functional style solutions are possible, like reading the file
twice, etc):

// #2
import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto words = array(lazyWords);
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}

But #2 doesn't work either: although "words" is an array of the words, byLine()
reuses the same buffer for every line, so the items of "words" all refer to
that one buffer and their contents are useless.

To fix that code you need to dup each array item:

// #3
import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto words = array(map!"a.dup"(lazyWords));
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}


#3 works, but for scripting-like programs it's not the first version I write; I
had to debug the code to get there. I think other programmers will have the
same troubles. So I think the default behaviour of byLine() is a bit bug-prone.
Reusing the same buffer gives a good performance boost, but it's often a
premature optimization in scripts. I prefer a language and library to use
unsafe optimizations only on request.

My suggested solution is to split File.byLine() into two member functions: one
that returns a lazy iterable that reuses the line buffer, and one that doesn't
reuse it. And my preferred solution is to give the shorter name (so it becomes
the "default") to the method that dups the line, because this is the safer
(less bug-prone) behaviour. This is also in accord with D Zen. They may be
named byLine() and byFastLine() or something similar.

If you prefer the "default" one to be the less safe version, then the safe
version may be named byDupLine():

// #4
import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byDupLine();
    auto words = array(lazyWords);
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}


Using the "maxs" (from http://d.puremagic.com/issues/show_bug.cgi?id=4705 ),
the code becomes:

// #5
import std.stdio, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byDupLine();
    writeln(maxs!"a.length"(lazyWords));
}

Bye,
bearophile
Dec 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
But you don't need a new string for each line to evaluate max over line lengths; the current byLine works.

Generally I think buffer reuse in byLine() is too valuable to let go.

Andrei
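To illustrate the point that finding the longest words needs no per-line copy, here is a minimal single-pass sketch. The helper name `longestLines` and the temporary demo file are assumptions made for this example, not part of Phobos or of the thread. A line is copied with idup only when it is kept, so the buffer reuse of byLine() does no harm here.

```d
import std.stdio : File, writeln;
import std.path : buildPath;
static import std.file;

/// Hypothetical helper: returns the longest lines of `lines` in one pass.
/// A line is copied (idup) only when it is kept, so a buffer-reusing range
/// such as File.byLine() can be passed in safely.
string[] longestLines(R)(R lines)
{
    string[] best;
    size_t maxLen = 0;
    foreach (line; lines)
    {
        if (line.length > maxLen)
        {
            maxLen = line.length;
            best = [line.idup];     // new maximum: restart the list
        }
        else if (line.length == maxLen)
        {
            best ~= line.idup;      // ties with the current maximum
        }
    }
    return best;
}

void main()
{
    // Write a small demo file so the example is self-contained.
    auto path = buildPath(std.file.tempDir(), "words_demo.txt");
    std.file.write(path, "abc\ndefgh\nijk\nopqrs\nyz\n");
    scope (exit) std.file.remove(path);

    writeln(longestLines(File(path, "r").byLine()));
}
```

The trade-off is that this is imperative rather than functional style, which is the style the original post was aiming for.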
Dec 26 2010
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 But you don't need a new string for each line to evaluate max over line
 lengths; the current byLine works.

That's true.

 Generally I think buffer reuse in byLine() is too valuable to let go.

I also agree it's wasteful. But I think bearophile's experiment has illustrated two noteworthy problems.

The first issue is that calling filter! on the already-consumed result of byLine() gives you a seg fault. I reproduced this locally, but haven't pinpointed the problem.

The second one is this:

	array(file.byLine())

which gives a wrong result because of the buffer reuse. Either it should not compile or it should idup every line (both of which are not entirely satisfactory, but they're better than getting wrong results).

I think a range should be able to express if the value can be reused or not. If the value can't be reused, then the algorithm should either not instantiate, or in some cases it might create a copy.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Dec 26 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:

 But you don't need a new string for each line to evaluate max over line 
 lengths; the current byLine works.

That's true.
 Generally I think buffer reuse in byLine() is too valuable to let go.

I also agree it's wasteful.

See my answers to Andrei.
 But I think bearophile's experiment has illustrated two noteworthy 
 problems. The first issue is that calling filter! on the 
 already-consumed result of byLine() gives you a seg fault. I reproduced 
 this locally, but haven't pinpointed the problem.

Besides that one there may be yet another problem: I have had trouble creating a version that opens and scans the file twice.
 The second one is this:
 
 	array(file.byLine())
 	
 which gives a wrong result because of the buffer reuse. Either it 
 should not compile or it should idup every line (both of which are not 
 entirely satisfactory, but they're better than getting wrong results).

I don't agree. Dupping every line is like performing a limited deep-dup, and this is not the job of array(). Refusing to compile is not the job of array() either. So the problem needs to be solved elsewhere; array() is OK, I think.
 I think a range should be able to express if the value can be reused or 
 not. If the value can't be reused, then the algorithm should either not 
 instantiate, or in some cases it might create a copy.

Generally in a nonfunctional language iterables need to exhaust :-)

Bye,
bearophile
Dec 26 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/26/10 11:53 AM, Michel Fortin wrote:
 On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu
 Generally I think buffer reuse in byLine() is too valuable to let go.

 I also agree it's wasteful. But I think bearophile's experiment has
 illustrated two noteworthy problems. The first issue is that calling
 filter! on the already-consumed result of byLine() gives you a seg fault.
 I reproduced this locally, but haven't pinpointed the problem.

That should be filed as a Phobos bug. Could you please do the honors?
 The second one is this:

 array(file.byLine())

 which gives a wrong result because of the buffer reuse. Either it should
 not compile or it should idup every line (both of which are not entirely
 satisfactory, but they're better than getting wrong results).

I agree it would be great if that meaningless expression wouldn't compile.
 I think a range should be able to express if the value can be reused or
 not. If the value can't be reused, then the algorithm should either not
 instantiate, or in some cases it might create a copy.

I thought a couple of days about this. There's a host of desirable properties that could ideally be expressed during compilation. My experience is that inevitably they add complexity to the range definition. For example, we could define a trait like this:

template iteratesDistinctObjects(R) if (isInputRange!R)
{
    enum iteratesDistinctObjects = true;
}

That would cover the common case, then the ByLine range would say:

template iteratesDistinctObjects(R) if (is(R == ByLine))
{
    enum iteratesDistinctObjects = false;
}

That means all existing ranges and algorithms need to be reviewed to take that property into account. The complexity of the library rises.

Even if we cover that, there are many properties like that, and users may still define their own ranges, algorithms, or plain old code that do the wrong thing in one way or another. So it all becomes a matter of deciding at which point we can reasonably ask the library user to Read The Fine Manual. I'm not sure where the decision about this particular issue should be.

By the way, a simple way to define a range that issues one new string per line is:

auto r = map!(to!string)(f.byLine());

Andrei
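A self-contained variant of the trait idea can be sketched as follows. It is only an illustration under stated assumptions: instead of specializing on ByLine, a range opts out through a member `enum bool reusesBuffer`, and the names `reusesBuffer`, `ReusingLines` and `safeArray` are hypothetical, not Phobos API. `ReusingLines` is a toy stand-in for a buffer-reusing line range.

```d
import std.algorithm : map;
import std.array : array;
import std.range.primitives : isInputRange;

/// Hypothetical trait: a range opts out by declaring
/// `enum bool reusesBuffer = true;`. Every other input range is assumed
/// to yield distinct objects.
template iteratesDistinctObjects(R) if (isInputRange!R)
{
    static if (is(typeof(R.reusesBuffer) : bool))
        enum iteratesDistinctObjects = !R.reusesBuffer;
    else
        enum iteratesDistinctObjects = true;
}

/// Toy stand-in for a line range that reuses one buffer for every front.
struct ReusingLines
{
    enum bool reusesBuffer = true;   // hypothetical opt-out marker
    string[] source;
    char[] buffer;                   // shared by all returned slices

    @property bool empty() const { return source.length == 0; }
    @property char[] front()
    {
        buffer[0 .. source[0].length] = source[0][];
        return buffer[0 .. source[0].length];
    }
    void popFront() { source = source[1 .. $]; }
}

/// Hypothetical array() wrapper that copies elements only when needed.
auto safeArray(R)(R r) if (isInputRange!R)
{
    static if (iteratesDistinctObjects!R)
        return r.array;
    else
        return r.map!(e => e.idup).array;
}

void main()
{
    static assert(iteratesDistinctObjects!(int[]));
    static assert(!iteratesDistinctObjects!ReusingLines);

    // Plain array(): both stored slices alias the same buffer.
    auto naive = ReusingLines(["ab", "cd"], new char[8]).array;
    assert(naive[0] == "cd" && naive[1] == "cd");

    // safeArray(): each element is copied before the buffer is overwritten.
    auto safe = safeArray(ReusingLines(["ab", "cd"], new char[8]));
    assert(safe == ["ab", "cd"]);
}
```

Even this small sketch shows the cost mentioned above: every algorithm that stores elements would need a branch like the one in `safeArray`.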
Dec 27 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

But you don't need a new string for each line to evaluate max over line
lengths; the current byLine works.<

Right. There are various ways to implement that little task in functional style. But the task requires printing the longest words, so you have to filter according to the max length. And if you filter you consume the iterable (I think). So you need to dup the array again. It looks simple, but I tried writing several versions along those lines and got several 'access violations'. This is very bad for people who want to write a script in D.
 Generally I think buffer reuse in byLine() is too valuable to let go.

In my post you can see I never suggested removing buffer reuse. I suggested two possible alternatives. In both, byLine() is split into two different methods.

First possibility, my preferred one:
1a) Modify byLine() so it doesn't reuse the buffer.
1b) Add another method, like byFastLine(), that reuses the buffer.

Alternative:
2a) Keep byLine() as it is now, so it reuses the buffer.
2b) Add another method, like byDupLine(), that doesn't reuse the buffer.

In the post I have explained the rationale behind this. I prefer the first possibility because the D Zen says that the safer option is the default one, and the faster, less safe one is on request (and I agree with this part of the D Zen).

Talking about usability and ergonomics in this newsgroup is sometimes a very tiring job :-) Even very little things seem to require a lot of work and discussions.

Bye,
bearophile
Dec 26 2010
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
bearophile wrote:
 Talking about usability and ergonomics in this newsgroup
 is sometimes a very tiring job :-)

Remember, a lot of your posts are actually very subjective. Those things tend to get a lot of debate without an objective right or wrong coming out of it. It could go both ways.
Dec 26 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/26/10 11:56 AM, bearophile wrote:
 In my post you can see I never suggested removing buffer reuse. I suggested
 two possible alternatives. In both, byLine() is split into two different
 methods.

 First possibility, my preferred one:
 1a) Modify byLine() so it doesn't reuse the buffer.
 1b) Add another method, like byFastLine(), that reuses the buffer.

 Alternative:
 2a) Keep byLine() as it is now, so it reuses the buffer.
 2b) Add another method, like byDupLine(), that doesn't reuse the buffer.

 In the post I have explained the rationale behind this. I prefer the first
 possibility because the D Zen says that the safer option is the default one,
 and the faster, less safe one is on request (and I agree with this part of
 the D Zen).

Well the D Zen would attempt to reconcile the two such that the obvious option is the safest and the fastest. Let's also not forget that "safe" is a bit abused here - we're not talking about lack of safety as much as incorrect results. Maybe a byLine!string() would be the best of both worlds by automatically calling to!string against each line. That can actually be nicely extended to e.g. byLine!(double[]) to automatically read lines of whitespace-separated doubles.
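As a rough illustration of what a typed per-line reader could do for lines of whitespace-separated doubles, the sketch below uses a hypothetical helper `parseDoubles`. Note that byLine!(double[]) is only a proposal in this thread, not an existing API, and the in-memory lines stand in for File.byLine(). Because the helper returns a freshly allocated array, the result stays valid even when the line buffer is overwritten afterwards.

```d
import std.algorithm : map;
import std.array : array, split;
import std.conv : to;
import std.stdio : writeln;

/// Hypothetical helper: parses one line of whitespace-separated numbers
/// into a newly allocated double[].
double[] parseDoubles(const(char)[] line)
{
    return line.split.map!(to!double).array;
}

void main()
{
    // In-memory stand-in for File.byLine(); with a real file one could
    // write File("data.txt").byLine().map!parseDoubles instead.
    auto rows = ["1.5 2 30", "4 0.25"].map!parseDoubles.array;
    writeln(rows);
}
```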
 Talking about usability and ergonomics in this newsgroup is sometimes a
 very tiring job :-) Even very little things seem to require a lot of
 work and discussions.

One possible issue is that you start with the assumption you're unequivocally right in matters that are highly debatable. That sometimes makes it tiring for the rest of us, too, but I'm not one to complain :o).

Andrei
Dec 26 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Well the D Zen would attempt to reconcile the two such that the obvious option
is the safest and the fastest.<

That's indeed better (but not always possible).
Let's also not forget that "safe" is a bit abused here - we're not talking
about lack of safety as much as incorrect results.<

Right, I meant that it's easier to arrive at a working version, with fewer bugs and fewer segfaults. (In dlibs1 I used a third design: a run-time boolean argument (or a compile-time template argument) that by default performs the copies, and on request reuses the same buffer on each iteration. This avoids the function/method duplication.)
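The compile-time flavour of that flag design might look like the sketch below. The dlibs1 interface itself is not shown in this thread, so the name `linesOf` and its `copy` parameter are hypothetical; the in-memory array stands in for File.byLine().

```d
import std.algorithm : map;
import std.array : array;
import std.stdio : writeln;

/// Hypothetical wrapper: copies each line by default, and passes the source
/// range through untouched when instantiated with `copy = false`.
auto linesOf(bool copy = true, R)(R source)
{
    static if (copy)
        return source.map!(line => line.idup);
    else
        return source;
}

void main()
{
    // In-memory stand-in for File.byLine().
    char[][] raw = ["abc".dup, "defgh".dup];

    auto safe = linesOf(raw).array;      // string[]: independent copies
    auto fast = linesOf!false(raw);      // char[][]: same slices as `raw`

    raw[0][0] = 'X';                     // simulate the buffer being overwritten
    writeln(safe, " ", fast);
    assert(safe[0] == "abc");            // the copy is unaffected
    assert(fast[0] == "Xbc");            // the uncopied slice sees the change
}
```

A single function with a defaulted flag keeps the safe behaviour as the default while still exposing the faster path on request.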
Maybe a byLine!string() would be the best of both worlds by automatically
calling to!string against each line.<

Let's see what other people think about this.
That can actually be nicely extended to e.g. byLine!(double[]) to automatically
read lines of whitespace-separated doubles.<

Sometimes you tend to over-engineer things. Be careful.
One possible issue is that you start with the assumption you're unequivocally
right in matters that are highly debatable. That sometimes makes it tiring for
the rest of us, too, but I'm not one to complain :o).<

I am sorry.

Bye and thank you,
bearophile
Dec 26 2010
prev sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
If the function is left as is, I expect questions like "Why isn't this working?"
to appear on D.learn about once a month.

Exactly like what happens with property += value and other things that lead to
incorrect results or don't work, viewed from the most obvious point of view.
Dec 27 2010
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On Sun, 26 Dec 2010 12:53:29 -0500
Michel Fortin <michel.fortin michelf.com> wrote:

 On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 But you don't need a new string for each line to evaluate max over line
 lengths; the current byLine works.

 That's true.

 Generally I think buffer reuse in byLine() is too valuable to let go.

 I also agree it's wasteful.

That's true, indeed. But here the point, I guess, is that you need to reuse the line sequence (whatever it is), as argument to filter.
 But I think bearophile's experiment has illustrated two noteworthy
 problems. The first issue is that calling filter! on the
 already-consumed result of byLine() gives you a seg fault. I reproduced
 this locally, but haven't pinpointed the problem.

 The second one is this:

 	array(file.byLine())

 which gives a wrong result because of the buffer reuse. Either it
 should not compile or it should idup every line (both of which are not
 entirely satisfactory, but they're better than getting wrong results).

I tried it myself, and the result is very strange:

unittest {
    // test.txt holds: "abc\ndefgh\nijk\nlmn\nopqrs\ntuvwx\nyz\n"
    auto lazyWords = File("test.txt").byLine();
    //~ auto words = Array(map!((ll){return ll.dup;})(lazyWords));
    auto words = Array(lazyWords);
    writeln(words);
    auto lengths = map!((l){return l.length;})(words);
    writeln(lengths);
    auto maxLength = reduce!max(lengths);
    writeln(maxLength);
    auto longWords = filter!((l){return l.length==maxLength;})(words);
    writeln(longWords);
}

==>
[yz , yz wx, yz , yz , yz wx, yz wx, yz]
[3, 5, 3, 3, 5, 5, 2]
5
[yz wx, yz wx, yz wx]

I don't understand why/how, esp. that lengths is correct ;-)

When swapping the two "auto words = ..." statements, we get as expected:

[abc, defgh, ijk, lmn, opqrs, tuvwx, yz]
[3, 5, 3, 3, 5, 5, 2]
5
[defgh, opqrs, tuvwx]
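One plausible reading of that output is that every stored slice keeps its own length but points into the same buffer, so all of them show whatever was written last. The sketch below simulates this without touching Phobos. It assumes (this is not confirmed in the thread) that the reader keeps the line terminator in its buffer and hands out a slice without it; the function name `readAllSharingBuffer` is hypothetical.

```d
import std.stdio : writeln;

/// Simulates a line reader that reuses one buffer. Each raw line must end
/// with '\n' and fit in the buffer. Every stored slice keeps its own length
/// but points into the same memory.
char[][] readAllSharingBuffer(string[] rawLines)
{
    char[] buf = new char[16];                 // single shared buffer
    char[][] stored;
    foreach (raw; rawLines)
    {
        buf[0 .. raw.length] = raw[];          // overwrite the buffer in place
        stored ~= buf[0 .. raw.length - 1];    // slice without the trailing '\n'
    }
    return stored;
}

void main()
{
    auto words = readAllSharingBuffer(
        ["abc\n", "defgh\n", "ijk\n", "lmn\n", "opqrs\n", "tuvwx\n", "yz\n"]);

    foreach (w; words)
        writeln(w.length, " ", w.ptr is words[0].ptr);   // own length, shared pointer

    // The last write was "yz\n" on top of "tuvwx\n", leaving "yz\nwx\n".
    assert(words[0] == "yz\n");      // a length-3 slice of the final buffer
    assert(words[1] == "yz\nwx");    // a length-5 slice of the final buffer
    assert(words[6] == "yz");        // a length-2 slice of the final buffer
}
```

Under that assumption the lengths come out right because a slice stores its length separately from the data it points to, while the contents all reflect the final state of the buffer.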
 I think a range should be able to express if the value can be reused or
 not. If the value can't be reused, then the algorithm should either not
 instantiate, or in some cases it might create a copy.

Makes sense. I also like Bearophile's proposal. But I would keep 'byLines' for the lazy/single-iteration-only version and call the dupping method simply 'lines'. For me (is it just me?), the latter correctly expresses an array-like collection of elements one can safely (re)use.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Dec 28 2010
prev sibling parent spir <denis.spir gmail.com> writes:
On Mon, 27 Dec 2010 18:21:24 +0000 (UTC)
Ary Borenszweig <ary esperanto.org.ar> wrote:

 If the function is left as is I expect questions about "Why isn't this working?"
 for it to appear on D.learn about 1 time each month.

 Exactly like what happens with property += value and other things that lead to
 incorrect result or don't work, viewed from the most obvious point of view.

Very probable. If those recurrent complaints are not drawn from another language's behaviour (*), they are sure signs of incorrect design.

Denis

(*) I mean someone trying to program Java or C++ or Python in D.
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Dec 28 2010