digitalmars.D - Phobos usability with text files

bearophile <bearophileHUGS lycos.com> writes:

This is related to this (closed) issue, but this time I prefer to discuss the
topic on the newsgroup:
http://d.puremagic.com/issues/show_bug.cgi?id=4474

To show why this enhancement request is useful I use a little scripting task. A
D2 program has to read a file that contains one word in each line, and print
all the words with the highest length.

A first naive functional-style implementation follows. (Keep in mind that File
opens files in binary mode by default; this differs from Python, and I am not
sure which default is best. Most files I open are text ones, so I think opening
in text mode would be a slightly better default.) It doesn't work; it produces
an access violation:


import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto maxLen = reduce!max(map!"a.length"(lazyWords));
    writeln(filter!((w){ return w.length == maxLen; })(lazyWords));
}


Finding the max length consumes the lazy iterable, so if the text file is small
a possible functional style solution is to convert the lazy iterable into an
array (other functional style solutions are possible, like reading the file
twice, etc):


import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto words = array(lazyWords);
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}


byLine() uses the same buffer, so words is a sequence of useless stuff.

To debug that code you need to dup each array item:


import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byLine();
    auto words = array(map!"a.dup"(lazyWords));
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}



I have had to debug the code. I think other programmers will have the same
troubles. So I think byLine()'s default behaviour is a bit bug-prone. Re-using the
same buffer gives a good performance boost, but it's often a premature
optimization in scripts. I prefer a language and library to use unsafe
optimizations only on request.

My suggested solution is to split File.byLine() into two member functions: one
returns a lazy iterable that reuses the line buffer and the other one doesn't
reuse it. And my preferred solution is to give the shorter name (so it becomes the
"default") to the method that dups the line, because this is the safer (less
bug-prone) behaviour. This is also in accord with D Zen. They may be named
byLine() and byFastLine() or something similar.

If you prefer the "default" one to be the less safe version, then the safe
version may be named byDupLine():


import std.stdio, std.array, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byDupLine();
    auto words = array(lazyWords);
    auto maxLen = reduce!max(map!"a.length"(words));
    writeln(filter!((w){ return w.length == maxLen; })(words));
}


Using the "maxs" (from http://d.puremagic.com/issues/show_bug.cgi?id=4705 ),
the code becomes:


import std.stdio, std.algorithm;
void main() {
    auto lazyWords = File("words.txt", "r").byDupLine();
    writeln(maxs!"a.length"(lazyWords));
}

Bye,
bearophile

Dec 26 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 12/26/10 10:12 AM, bearophile wrote:
 Using the "maxs" (from http://d.puremagic.com/issues/show_bug.cgi?id=4705 ),
the code becomes:


 import std.stdio, std.algorithm;
 void main() {
      auto lazyWords = File("words.txt", "r").byDupLine();
      writeln(maxs!"a.length"(lazyWords));
 }

But you don't need a new string for each line to evaluate max over line 
lengths; the current byLine works.

Generally I think buffer reuse in byLine() is too valuable to let go.


Andrei

Dec 26 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 But you don't need a new string for each line to evaluate max over line 
 lengths; the current byLine works.

That's true.	


 Generally I think buffer reuse in byLine() is too valuable to let go.

I also agree it's wasteful.

But I think bearophile's experiment has illustrated two noteworthy 
problems. The first issue is that calling filter! on the 
already-consumed result of byLine() gives you a seg fault. I reproduced 
this locally, but haven't pinpointed the problem.

The second one is this:

	array(file.byline())
	
which gives a wrong result because of the buffer reuse. Either it 
should not compile or it should idup every line (both of which are not 
entirely satisfactory, but they're better than getting wrong results).

I think a range should be able to express if the value can be reused or 
not. If the value can't be reused, then the algorithm should either not 
instantiate, or in some cases it might create a copy.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Dec 26 2010

bearophile <bearophileHUGS lycos.com> writes:

Michel Fortin:

 But you don't need a new string for each line to evaluate max over line 
 lengths; the current byLine works.

 
 That's true.	
 
 
 Generally I think buffer reuse in byLine() is too valuable to let go.

 
 I also agree it's wasteful.

See my answers to Andrei.


 But I think bearophile's experiment has illustrated two noteworthy 
 problems. The first issue is that calling filter! on the 
 already-consumed result of byLine() gives you a seg fault. I reproduced 
 this locally, but haven't pinpointed the problem.

Maybe besides that one there is yet another problem: I have had trouble
creating a version that opens and scans the file twice.


 The second one is this:
 
 	array(file.byline())
 	
 which gives a wrong result because of the buffer reuse. Either it 
 should not compile or it should idup every line (both of which are not 
 entirely satisfactory, but they're better than getting wrong results).

I don't agree. Dupping every line is like performing a limited deep-dup, and
this is not the job of array().
Refusing to compile is not the job of array() either. So the problem needs to be
solved elsewhere; array() is OK, I think.


 I think a range should be able to express if the value can be reused or 
 not. If the value can't be reused, then the algorithm should either not 
 instantiate, or in some cases it might create a copy.

Generally in a nonfunctional language iterables need to exhaust :-)

Bye,
bearophile

Dec 26 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 12/26/10 11:53 AM, Michel Fortin wrote:
 On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu
 Generally I think buffer reuse in byLine() is too valuable to let go.

 I also agree it's wasteful.

 But I think bearophile's experiment has illustrated two noteworthy
 problems. The first issue is that calling filter! on the
 already-consumed result of byLine() gives you a seg fault. I reproduced
 this locally, but haven't pinpointed the problem.

That should be filed as a Phobos bug. Could you please do the honors?

 The second one is this:

 array(file.byline())

 which gives a wrong result because of the buffer reuse. Either it should
 not compile or it should idup every line (both of which are not entirely
 satisfactory, but they're better than getting wrong results).

I agree it would be great if that meaningless expression wouldn't compile.

 I think a range should be able to express if the value can be reused or
 not. If the value can't be reused, then the algorithm should either not
 instantiate, or in some cases it might create a copy.

I thought a couple of days about this. There's a host of desirable 
properties that could ideally be expressed during compilation. My 
experience is that inevitably they add complexity to the range 
definition. For example, we could define a trait like this:

template iteratesDistinctObjects(R) if (isInputRange!R) {
     enum iteratesDistinctObjects = true;
}

That would cover the common case, then the ByLine range would say:

template iteratesDistinctObjects(R) if (is(R == ByLine)) {
     enum iteratesDistinctObjects = false;
}

That means all existing ranges and algorithms need to be reviewed to 
take that property into account. The complexity of the library rises.
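
A minimal sketch of how such a trait could be consulted by an algorithm. Everything here is an assumption made for illustration: ReusingLines is a hypothetical stand-in for the ByLine range (it hands out slices of one shared buffer), and distinctArray is a hypothetical array() variant, not a Phobos function. The two declarations above are folded into a single template, because as far as I can tell two same-named templates whose constraints both accept ByLine would be ambiguous.

```d
import std.algorithm : map;
import std.array : array;
import std.range.primitives : isInputRange;
import std.stdio : writeln;

// Hypothetical stand-in for ByLine: every front is a slice of one shared
// buffer, which the next popFront overwrites.
struct ReusingLines
{
    private string[] pending; // lines still to be produced
    private char[] buffer;    // the single shared line buffer

    this(string[] lines)
    {
        pending = lines;
        size_t longest = 0;
        foreach (line; lines)
            if (line.length > longest)
                longest = line.length;
        buffer = new char[](longest); // allocated once, never reallocated
        load();
    }

    private void load()
    {
        if (pending.length)
            buffer[0 .. pending[0].length] = pending[0][];
    }

    @property bool empty() const { return pending.length == 0; }
    @property char[] front() { return buffer[0 .. pending[0].length]; }
    void popFront() { pending = pending[1 .. $]; load(); }
}

// The trait: true for ordinary ranges, false for the buffer-reusing one.
template iteratesDistinctObjects(R)
if (isInputRange!R)
{
    enum iteratesDistinctObjects = !is(R == ReusingLines);
}

// Hypothetical array() variant that copies elements when the trait says
// the range does not iterate distinct objects.
auto distinctArray(R)(R range)
if (isInputRange!R)
{
    static if (iteratesDistinctObjects!R)
        return range.array;
    else
        return range.map!(e => e.dup).array;
}

void main()
{
    writeln(distinctArray(ReusingLines(["abc", "defgh", "yz"])));
    writeln(distinctArray([1, 2, 3]));
}
```

The cost mentioned above is visible even in this small sketch: every algorithm that stores elements would need a static if like the one in distinctArray.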

Even if we cover that, there are many properties like that, and users may 
still define their own ranges, algorithms, or plain old code that do the 
wrong thing in one way or another. So it all becomes a matter of 
deciding at which point we can reasonably ask the library user to Read 
The Fine Manual. I'm not sure where the decision about this particular 
issue should be.

By the way, a simple way to define a range that issues one new string 
per line is:

auto r = map!(to!string)(f.byLine());
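
Applied to the task from the start of the thread, that idea might look like the sketch below. The helper name longestLines and the temporary file name are assumptions made for this example, and maxElement is used in place of reduce!max; it is a sketch under those assumptions, not the one right way to write it.

```d
import std.algorithm : filter, map, maxElement;
import std.array : array;
import std.conv : to;
import std.stdio : File, writeln;

// Hypothetical helper: returns every line of the file that has maximal length.
string[] longestLines(string path)
{
    // to!string copies each char[] line into a fresh immutable string,
    // so the stored elements no longer alias byLine()'s buffer.
    string[] words = File(path, "r").byLine().map!(to!string).array;
    if (words.length == 0)
        return null;

    immutable maxLen = words.map!(w => w.length).maxElement;
    return words.filter!(w => w.length == maxLen).array;
}

void main()
{
    import std.file : remove, tempDir, write;
    import std.path : buildPath;

    // Create a small input file so the example is self-contained.
    auto path = buildPath(tempDir, "longestLines_demo.txt");
    write(path, "abc\ndefgh\nijk\nopqrs\nyz\n");
    scope (exit) remove(path);

    writeln(longestLines(path));
}
```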


Andrei

Dec 27 2010

spir <denis.spir gmail.com> writes:

On Sun, 26 Dec 2010 12:53:29 -0500
Michel Fortin <michel.fortin michelf.com> wrote:

 On 2010-12-26 12:13:41 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 But you don't need a new string for each line to evaluate max over line
 lengths; the current byLine works.

 That's true.

 Generally I think buffer reuse in byLine() is too valuable to let go.

 I also agree it's wasteful.

That's true, indeed. But here the point, I guess, is that you need to reuse the line sequence (whatever it is), as argument to filter.

 But I think bearophile's experiment has illustrated two noteworthy
 problems. The first issue is that calling filter! on the
 already-consumed result of byLine() gives you a seg fault. I reproduced
 this locally, but haven't pinpointed the problem.

 The second one is this:

 	array(file.byline())

 which gives a wrong result because of the buffer reuse. Either it
 should not compile or it should idup every line (both of which are not
 entirely satisfactory, but they're better than getting wrong results).

I tried it myself, and the result is very strange:
import std.stdio, std.array, std.algorithm;
unittest {
    // test.txt holds: "abc\ndefgh\nijk\nlmn\nopqrs\ntuvwx\nyz\n"
    auto lazyWords = File("test.txt").byLine();
//~     auto words = array(map!((ll){return ll.dup;})(lazyWords));
    auto words = array(lazyWords);
    writeln(words);
    auto lengths = map!((l){return l.length;})(words);
    writeln(lengths);
    auto maxLength = reduce!max(lengths);
    writeln(maxLength);
    auto longWords = filter!((l){return l.length==maxLength;})(words);
    writeln(longWords);
}
==>
[yz
, yz
wx, yz
, yz
, yz
wx, yz
wx, yz]
[3, 5, 3, 3, 5, 5, 2]
5
[yz
wx, yz
wx, yz
wx]

I don't understand why/how, especially given that lengths is correct ;-)
When swapping the two "auto words = ..." statements (so that each line is dupped), we get as expected:
[abc, defgh, ijk, lmn, opqrs, tuvwx, yz]
[3, 5, 3, 3, 5, 5, 2]
5
[defgh, opqrs, tuvwx]
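
One plausible reading of the first output: each array element is a slice that keeps its own length but points into the one buffer that byLine() keeps overwriting, so the lengths stay right while the text is whatever the last line left behind. The sketch below reproduces that effect without a file. ReusingLines is a hypothetical stand-in for byLine(), written only for this illustration.

```d
import std.algorithm : map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical stand-in for byLine(): every front is a slice of one shared
// buffer, which the next popFront overwrites.
struct ReusingLines
{
    private string[] pending; // lines still to be produced
    private char[] buffer;    // the single shared line buffer

    this(string[] lines)
    {
        pending = lines;
        size_t longest = 0;
        foreach (line; lines)
            if (line.length > longest)
                longest = line.length;
        buffer = new char[](longest); // allocated once, never reallocated
        load();
    }

    private void load()
    {
        if (pending.length)
            buffer[0 .. pending[0].length] = pending[0][];
    }

    @property bool empty() const { return pending.length == 0; }
    @property char[] front() { return buffer[0 .. pending[0].length]; }
    void popFront() { pending = pending[1 .. $]; load(); }
}

void main()
{
    auto input = ["abc", "defgh", "yz"];

    // Storing the slices directly: each keeps its length, all share storage.
    char[][] aliased = ReusingLines(input).array;
    writeln(aliased.map!(w => w.length));
    writeln(aliased);

    // Copying each line before storing it keeps the contents as well.
    char[][] copied = ReusingLines(input).map!(w => w.dup).array;
    writeln(copied);
}
```

In the aliased case every element starts at the beginning of the same buffer, so each one shows a prefix of the final buffer contents, cut to its own original length.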

 I think a range should be able to express if the value can be reused or
 not. If the value can't be reused, then the algorithm should either not
 instantiate, or in some cases it might create a copy.

Makes sense. I also like Bearophile's proposal. But I would keep 'byLines' for the lazy/single-iteration-only version and call the dupping method simply 'lines'. For me (is it just me?), the latter correctly expresses an array-like collection of elements one can safely (re)use.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Dec 28 2010

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

But you don't need a new string for each line to evaluate max over line
lengths; the current byLine works.<

Right. There are various different ways to implement that little task in
functional style. But the task requires printing the longest ones. So you have
to filter according to the max length. And if you filter you consume the
iterable (I think). So you need to dup the array again.

It looks simple, but I have tried to create a new version like that, writing
some versions, and I have had several 'access violations'. This is very bad for
people that want to write a script with D.


 Generally I think buffer reuse in byLine() is too valuable to let go.

In my post you see I have never suggested to remove buffer reuse.

I have suggested two possible alternatives. In both alternatives the byLine()
is split into two different methods:

First possibility, this is my preferred:
1a) Modify byLine() so it doesn't reuse the buffer.
1b) Add another method, like byFastLine() that reuses the buffer.


Alternative:
2a) Keep byLine() as it is now, so it reuses the buffer.
2b) Add another method, like byDupLine() that doesn't reuse the buffer.


In the post I have explained the rationale behind this.

I prefer the first possibility because the D Zen says that the safer option
is the default one, and the faster, less safe one is on request (and I agree with
this part of the D Zen).

Talking about usability and ergonomics in this newsgroup is sometimes a very
tiring job :-) Even very little things seem to require a lot of work and
discussions.

Bye,
bearophile

Dec 26 2010

Adam D. Ruppe <destructionator gmail.com> writes:

bearophile wrote:
 Talking about usability and ergonomics in this newsgroup
 is sometimes a very tiring job :-)

Remember, a lot of your posts are actually very subjective. Those
things tend to get a lot of debate without an objective right or
wrong coming out of it. It could go both ways.

Dec 26 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 12/26/10 11:56 AM, bearophile wrote:
 I prefer the first possibility because the D Zen says that the more
 safe option is the default one, and the faster less safe is on
 request (and I agree with this part of the D Zen).

Well the D Zen would attempt to reconcile the two such that the obvious 
option is the safest and the fastest.

Let's also not forget that "safe" is a bit abused here - we're not 
talking about lack of safety as much as incorrect results.

Maybe a byLine!string() would be the best of both worlds by 
automatically calling to!string against each line. That can actually be 
nicely extended to e.g. byLine!(double[]) to automatically read lines of 
whitespace-separated doubles.
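
A rough sketch of what that could look like as a free function. The name linesAs and the temporary file are hypothetical and chosen for this example; nothing below is an existing Phobos API, and a real byLine!T would need more thought about which element types to support.

```d
import std.algorithm : map;
import std.array : array, split;
import std.conv : to;
import std.stdio : File, writeln;

// Hypothetical helper: yields each line of `f` converted to T.
auto linesAs(T)(File f)
{
    static if (is(T == string))
    {
        // One freshly allocated string per line.
        return f.byLine().map!(line => line.idup);
    }
    else static if (is(T == E[], E))
    {
        // Split on whitespace and convert every field to the element type.
        return f.byLine().map!(line => line.split.map!(to!E).array);
    }
    else
        static assert(false, "linesAs: unsupported line type");
}

void main()
{
    import std.file : remove, tempDir, write;
    import std.path : buildPath;

    // Create a small input file so the example is self-contained.
    auto path = buildPath(tempDir, "linesAs_demo.txt");
    write(path, "1 2.5\n3\n");
    scope (exit) remove(path);

    writeln(linesAs!string(File(path)).array);
    writeln(linesAs!(double[])(File(path)).array);
}
```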

 Talking about usability and ergonomics in this newsgroup is sometimes a
 very tiring job :-) Even very little things seem to require a lot of
 work and discussions.

One possible issue is that you start with the assumption you're 
unequivocally right in matters that are highly debatable. That sometimes 
makes it tiring for the rest of us, too, but I'm not one to complain :o).


Andrei

Dec 26 2010

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 Well the D Zen would attempt to reconcile the two such that the obvious option
is the safest and the fastest.<

That's indeed better (but not always possible).


Let's also not forget that "safe" is a bit abused here - we're not talking
about lack of safety as much as incorrect results.<

Right, I meant that it is less hard to arrive at a working version, with fewer
bugs and fewer segfaults.

(In dlibs1 I have used a third design: a run-time boolean argument (or a
compile-time template one) that by default performs the copies, and on request
reuses the same buffer on each iteration. This avoids the function/method
duplication.)
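
A minimal sketch of that third design with a compile-time flag. The name fileLines and the reuseBuffer parameter are assumptions made for this example; this is not the dlibs1 code, only an illustration of the idea on top of the current byLine().

```d
import std.algorithm : map;
import std.array : array;
import std.stdio : File, writeln;

// Hypothetical helper: copies each line by default, reuses the buffer on request.
auto fileLines(bool reuseBuffer = false)(File f)
{
    static if (reuseBuffer)
        return f.byLine();                         // fast, elements alias one buffer
    else
        return f.byLine().map!(line => line.idup); // one new string per line
}

void main()
{
    import std.file : remove, tempDir, write;
    import std.path : buildPath;

    // Create a small input file so the example is self-contained.
    auto path = buildPath(tempDir, "fileLines_demo.txt");
    write(path, "abc\ndefgh\nyz\n");
    scope (exit) remove(path);

    // Default: the lines are safe to store.
    string[] words = fileLines(File(path)).array;
    writeln(words);

    // On request: buffer reuse, fine when each line is used immediately.
    size_t longest = 0;
    foreach (line; fileLines!true(File(path)))
        if (line.length > longest)
            longest = line.length;
    writeln(longest);
}
```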


Maybe a byLine!string() would be the best of both worlds by automatically
calling to!string against each line.<

Let's see what other people think about this.


That can actually be nicely extended to e.g. byLine!(double[]) to automatically
read lines of whitespace-separated doubles.<

Sometimes you tend to over-engineer things. Be careful.


One possible issue is that you start with the assumption you're unequivocally
right in matters that are highly debatable. That sometimes makes it tiring for
the rest of us, too, but I'm not one to complain :o).<

I am sorry.

Bye and thank you,
bearophile

Dec 26 2010

Ary Borenszweig <ary esperanto.org.ar> writes:

If the function is left as is, I expect questions like "Why isn't this working?"
to appear on D.learn about once a month.

Exactly like what happens with property += value and other things that lead to
incorrect results or don't work, viewed from the most obvious point of view.

Dec 27 2010

spir <denis.spir gmail.com> writes:

On Mon, 27 Dec 2010 18:21:24 +0000 (UTC)
Ary Borenszweig <ary esperanto.org.ar> wrote:

 If the function is left as is, I expect questions like "Why isn't this working?"
 to appear on D.learn about once a month.

 Exactly like what happens with property += value and other things that lead to
 incorrect results or don't work, viewed from the most obvious point of view.

Very probable. If those recurrent complaints are not drawn from another language's behaviour (*), they are sure signs of incorrect design.

Denis

(*) I mean someone trying to program Java or C++ or Python in D.
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Dec 28 2010

bearophile <bearophileHUGS lycos.com> writes:

Adam D. Ruppe:

 Remember, a lot of your posts are actually very subjective. Those
 things tend to get a lot of debate without an objective right or
 wrong coming out of it. It could go both ways.

Right. But I have discussed a lot about usability & ergonomics matters in the
Python newsgroups too (some thousands of posts), and that place was somewhat
different. Python users seem to take ergonomics more seriously, they are more
interested in this topic. For Python devs usability is one of the most
important things (because often for Python performance is less important than
usability). I don't know, maybe it's just a cultural difference among the two
newsgroups.

Bye,
bearophile

Dec 26 2010

Brad Roberts <braddr slice-2.puremagic.com> writes:

On Sun, 26 Dec 2010, bearophile wrote:

 Adam D. Ruppe:
 
 Remember, a lot of your posts are actually very subjective. Those
 things tend to get a lot of debate without an objective right or
 wrong coming out of it. It could go both ways.

 
 Right. But I have discussed a lot about usability & ergonomics matters in 
 the Python newsgroups too (some thousands of posts), and that place was 
 somewhat different. Python users seem to take ergonomics more seriously, 
 they are more interested in this topic. For Python devs usability is one 
 of the most important things (because often for Python performance is 
 less important than usability). I don't know, maybe it's just a cultural 
 difference among the two newsgroups.
 
 Bye,
 bearophile

I don't use Python, so I don't hang out on those groups.  But for me, my 
objection to your style of posts is that you state your opinions as if they 
were facts.  They're not and shouldn't be presented as if they were.

Later,
Brad

Dec 26 2010
