www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Overhauling the notion of output range

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
The notion of output range has been a tad vague in the past; up until 
now a range that wanted to qualify as an output range had to define a 
method called put.

That definition is awkward though. For example, the std.algorithm.copy() 
primitive should work for an output range, but also for some other 
ranges that allow assignment to front.

In related news, there's been this burning desire regarding a 
lightweight output range defined as simply a delegate that accepts 
const(char)[]. That range is an output range all right, and the way you 
output to it is by simply calling the delegate!

All of the three mechanisms above are desirable for certain types. So 
imagine this dialog (paraphrased from 
http://www.youtube.com/watch?v=MrTsuvykUZk):

Guy 1: We need a good output range interface, and we have three good 
candidates. We should make every one an output range.

Guy 2: What do you mean, every one?

Guy 1: E-V-E-R-Y O-N-E!!!!

So, I defined isOutputRange!R to yield true if at least one of these 
conditions above is met: put() definition, input range with assignable 
front, delegate.

Please refer to this code and this doc:

http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227

http://erdani.com/d/phobos/std_range.html

This confers a ton of flexibility to programmers who need to define 
output ranges. I've also reworked std.format to use put(r, e) so now it 
works with all output ranges seamlessly.

Any thoughts would be appreciated. Thanks!


Andrei
Jul 11 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/11/2010 08:17 PM, Andrei Alexandrescu wrote:
[snip]

Forgot to mention one detail: now the free function std.range.put() 
serves as a convenient dispatcher for any kind of output range.

Andrei
Jul 11 2010
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 The notion of output range has been a tad vague in the past; up until
 now a range that wanted to qualify as an output range had to define a
 method called put.
 That definition is awkward though. For example, the std.algorithm.copy()
 primitive should work for an output range, but also for some other
 ranges that allow assignment to front.
 In related news, there's been this burning desire regarding a
 lightweight output range defined as simply a delegate that accepts
 const(char)[]. That range is an output range all right, and the way you
 output to it is by simply calling the delegate!
 All of the three mechanisms above are desirable for certain types. So
 imagine this dialog (paraphrased from
 http://www.youtube.com/watch?v=MrTsuvykUZk):
 Guy 1: We need a good output range interface, and we have three good
 candidates. We should make every one an output range.
 Guy 2: What do you mean, every one?
 Guy 1: E-V-E-R-Y O-N-E!!!!
 So, I defined isOutputRange!R to yield true if at least one of these
 conditions above is met: put() definition, input range with assignable
 front, delegate.
 Please refer to this code and this doc:
 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227
 http://erdani.com/d/phobos/std_range.html
 This confers a ton of flexibility to programmers who need to define
 output ranges. I've also reworked std.format to use put(r, e) so now it
 works with all output ranges seamlessly.
 Any thoughts would be appreciated. Thanks!
 Andrei
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 The notion of output range has been a tad vague in the past; up until
 now a range that wanted to qualify as an output range had to define a
 method called put.
 That definition is awkward though. For example, the std.algorithm.copy()
 primitive should work for an output range, but also for some other
 ranges that allow assignment to front.
 In related news, there's been this burning desire regarding a
 lightweight output range defined as simply a delegate that accepts
 const(char)[]. That range is an output range all right, and the way you
 output to it is by simply calling the delegate!
 All of the three mechanisms above are desirable for certain types. So
 imagine this dialog (paraphrased from
 http://www.youtube.com/watch?v=MrTsuvykUZk):
 Guy 1: We need a good output range interface, and we have three good
 candidates. We should make every one an output range.
 Guy 2: What do you mean, every one?
 Guy 1: E-V-E-R-Y O-N-E!!!!
 So, I defined isOutputRange!R to yield true if at least one of these
 conditions above is met: put() definition, input range with assignable
 front, delegate.
 Please refer to this code and this doc:
 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227
 http://erdani.com/d/phobos/std_range.html
 This confers a ton of flexibility to programmers who need to define
 output ranges. I've also reworked std.format to use put(r, e) so now it
 works with all output ranges seamlessly.
 Any thoughts would be appreciated. Thanks!
 Andrei
Nice. I had quietly wondered why input ranges with assignable front aren't also output ranges, but never gotten around to asking. Two comments: 1. What happens if you run out of space in the input range variety? I guess it throws? 2. Would there be a "standard" way of signaling how much stuff was written to the input range variety? I guess that since functions that output their results via output ranges would usually return void, they could return an integer instead indicating how much stuff was written. 3. While we're on the subject of improving output ranges, I was just thinking before I read your post that it would be nice to have a Tee type in std.range, since outputting to exactly one output range is kind of inflexible. Such a type would itself be an output range. It would take in N output ranges where N > 1 as instantiation parameters, and pass any input received to all of the underlying ranges. The use case I thought of for this is when I'm generating tons of data through a monte carlo simulation and don't want to store it all in memory. Both Summary in dstats and Histogram in dflplot can be used as output ranges. I'd love to write a function that outputs results to an output range and use Tee to get both summary statistics and a histogram.
Jul 11 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/11/2010 08:34 PM, dsimcha wrote:
 1.  What happens if you run out of space in the input range variety?  I guess
it
 throws?
It's up to the range. Most will throw.
 2.  Would there be a "standard" way of signaling how much stuff was written to
the
 input range variety?  I guess that since functions that output their results
via
 output ranges would usually return void, they could return an integer instead
 indicating how much stuff was written.
There is no standard way at the moment. I think not throwing implies all passed data has been written.
 3.  While we're on the subject of improving output ranges, I was just thinking
 before I read your post that it would be nice to have a Tee type in std.range,
 since outputting to exactly one output range is kind of inflexible.  Such a
type
 would itself be an output range.  It would take in N output ranges where N>  1
as
 instantiation parameters, and pass any input received to all of the underlying
 ranges.
Yah, tee would be great.
 The use case I thought of for this is when I'm generating tons of data through
a
 monte carlo simulation and don't want to store it all in memory.  Both Summary
in
 dstats and Histogram in dflplot can be used as output ranges.  I'd love to
write a
 function that outputs results to an output range and use Tee to get both
summary
 statistics and a histogram.
... and checkpoint all that to a file. Andrei
Jul 11 2010
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 In related news, there's been this burning desire regarding a lightweight
 output range defined as simply a delegate that accepts const(char)[]. That
 range is an output range all right, and the way you output to it is by
 simply calling the delegate!
 <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>
 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227
It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...
Jul 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 12:45 AM, Philippe Sigaud wrote:
 On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     In related news, there's been this burning desire regarding a
     lightweight output range defined as simply a delegate that accepts
     const(char)[]. That range is an output range all right, and the way
     you output to it is by simply calling the delegate!
     <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>

     http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227


 It's still early where I live, but... For the callable case, why just
 accepting E[] instead of any range with E as element? Though, thinking
 about it, I right now have no idea how to put that into a template
 constraint, given only R and E.

 Hmm...
Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion. Andrei
Jul 11 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 12:45 AM, Philippe Sigaud wrote:
 On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     In related news, there's been this burning desire regarding a
     lightweight output range defined as simply a delegate that accepts
     const(char)[]. That range is an output range all right, and the way
     you output to it is by simply calling the delegate!
     <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>

     http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227


 It's still early where I live, but... For the callable case, why just
 accepting E[] instead of any range with E as element? Though, thinking
 about it, I right now have no idea how to put that into a template
 constraint, given only R and E.

 Hmm...
Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.
Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range? For example, a delegate that accepts a string is a range of strings, is it not? Or do you consider it a range of immutable(char)? For example, I'd expect to be able to use a push_back delegate on an array-type of integers as an output range. -Steve
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 07:44 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 12:45 AM, Philippe Sigaud wrote:
 On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

 In related news, there's been this burning desire regarding a
 lightweight output range defined as simply a delegate that accepts
 const(char)[]. That range is an output range all right, and the way
 you output to it is by simply calling the delegate!
 <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>


 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227



 It's still early where I live, but... For the callable case, why just
 accepting E[] instead of any range with E as element? Though, thinking
 about it, I right now have no idea how to put that into a template
 constraint, given only R and E.

 Hmm...
Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.
Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?
Efficiency - see the doFormat disaster.
 For example, a delegate that accepts a string is a range of strings, is
 it not? Or do you consider it a range of immutable(char)?
Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)
 For example, I'd expect to be able to use a push_back delegate on an
 array-type of integers as an output range.
Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow. Andrei
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 10:41:51 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 07:44 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 12:45 AM, Philippe Sigaud wrote:
 On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

 In related news, there's been this burning desire regarding a
 lightweight output range defined as simply a delegate that accepts
 const(char)[]. That range is an output range all right, and the way
 you output to it is by simply calling the delegate!
 <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>


 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227



 It's still early where I live, but... For the callable case, why just
 accepting E[] instead of any range with E as element? Though, thinking
 about it, I right now have no idea how to put that into a template
 constraint, given only R and E.

 Hmm...
Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.
Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?
Efficiency - see the doFormat disaster.
Yes, I'm not saying that a delegate that accepts a range of T cannot be an output range of T, I just wondered why it *has* to be a range.
 For example, a delegate that accepts a string is a range of strings, is
 it not? Or do you consider it a range of immutable(char)?
Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)
It seems to me to be similar to appending -- you can append an element or an array of elements. But appending the individual element makes sense in a lot of cases, despite performance. If output ranges were restricted to deal only with stream data (i.e. char) then I'd agree only accepting an array of data is a good idea.
 For example, I'd expect to be able to use a push_back delegate on an
 array-type of integers as an output range.
Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow.
If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]); Then the concept of output ranges is much less attractive to me. In some cases, appending a single element is all that works. For example, a linked list could be an output range, and passing it an array is not going to be any more optimal than passing individual elements. -Steve
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 10:41:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 07:44 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 12:45 AM, Philippe Sigaud wrote:
 On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

 In related news, there's been this burning desire regarding a
 lightweight output range defined as simply a delegate that accepts
 const(char)[]. That range is an output range all right, and the way
 you output to it is by simply calling the delegate!
 <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>



 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227




 It's still early where I live, but... For the callable case, why just
 accepting E[] instead of any range with E as element? Though, thinking
 about it, I right now have no idea how to put that into a template
 constraint, given only R and E.

 Hmm...
Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.
Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?
Efficiency - see the doFormat disaster.
Yes, I'm not saying that a delegate that accepts a range of T cannot be an output range of T, I just wondered why it *has* to be a range.
 For example, a delegate that accepts a string is a range of strings, is
 it not? Or do you consider it a range of immutable(char)?
Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)
It seems to me to be similar to appending -- you can append an element or an array of elements. But appending the individual element makes sense in a lot of cases, despite performance. If output ranges were restricted to deal only with stream data (i.e. char) then I'd agree only accepting an array of data is a good idea.
 For example, I'd expect to be able to use a push_back delegate on an
 array-type of integers as an output range.
Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow.
If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]);
No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306 Andrei
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:
 If I always have to do something like this in order to append a single
 element:

 put(r, (&elem)[0..1]);
No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306
So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element? Well, all that will end up happening is cases where appending a single element is the only possibility will produce overloaded add functions, one that takes a single element, and one that takes an array. The one that takes an array will look like this: foreach(e; arr) add(e); I can tell you this for sure, because it's exactly what's in many dcollections classes. So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable. -Steve
Jul 12 2010
next sibling parent reply torhu <no spam.invalid> writes:
On 12.07.2010 18:48, Steven Schveighoffer wrote:
[...]
 So what happens when you call put(r, e) for one of these output
 classes? Instead of just calling add(e), it calls (add((&e)[0..1]))
 which in turn goes through some needless loop, which then ends up
 calling add(e).  I don't see why this is preferable.
put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:

 On 12.07.2010 18:48, Steven Schveighoffer wrote:
 [...]
 So what happens when you call put(r, e) for one of these output
 classes? Instead of just calling add(e), it calls (add((&e)[0..1]))
 which in turn goes through some needless loop, which then ends up
 calling add(e).  I don't see why this is preferable.
put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -Steve
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 12:55 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:

 On 12.07.2010 18:48, Steven Schveighoffer wrote:
 [...]
 So what happens when you call put(r, e) for one of these output
 classes? Instead of just calling add(e), it calls (add((&e)[0..1]))
 which in turn goes through some needless loop, which then ends up
 calling add(e). I don't see why this is preferable.
put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -Steve
Efficiency? Andrei
Jul 12 2010
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 On 07/12/2010 12:55 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:

 On 12.07.2010 18:48, Steven Schveighoffer wrote:
 [...]
 So what happens when you call put(r, e) for one of these output
 classes? Instead of just calling add(e), it calls (add((&e)[0..1]))
 which in turn goes through some needless loop, which then ends up
 calling add(e). I don't see why this is preferable.
put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -Steve
Efficiency? Andrei
But efficiency only matters some of the time, assuming the kind of inefficiency in question is small constant term inefficiency (as is the case here) and not O(N!) inefficiency or 1000-fold constant term inefficiency. Calling a delegate once per element is inefficient, but not **that** inefficient, and it's convenient at times, such as when the delegate will be used in contexts other than as an output range. I think inefficiency is a poor excuse for not supporting this, and the users of Phobos should decide on a case-by-case basis whether calling a delegate for every element is efficient enough. The following benchmark takes a whopping 830 milliseconds on my computer, for a **hundred million** iterations: import std.stdio, std.perf; void main() { void foo() {} auto pc = new PerformanceCounter; pc.start; auto del = &foo; foreach(i; 0..100_000_000) { del(); } pc.stop; writeln(pc.milliseconds); }
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 01:31 PM, dsimcha wrote:
 But efficiency only matters some of the time, assuming the kind of
inefficiency in
 question is small constant term inefficiency (as is the case here) and not
O(N!)
 inefficiency or 1000-fold constant term inefficiency.
Historically the constant was large.
 Calling a delegate once per element is inefficient, but not **that**
inefficient,
 and it's convenient at times, such as when the delegate will be used in
contexts
 other than as an output range.  I think inefficiency is a poor excuse for not
 supporting this, and the users of Phobos should decide on a case-by-case basis
 whether calling a delegate for every element is efficient enough.
I guess that's fine, though I clearly recall that doFormat was, at the time I decided to rewrite it (2008 or so), virtually unusable for serious workloads. I don't know to what extent improvement in the compiler have eroded that margin. Andrei
Jul 12 2010
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 On 07/12/2010 01:31 PM, dsimcha wrote:
 But efficiency only matters some of the time, assuming the kind of
inefficiency in
 question is small constant term inefficiency (as is the case here) and not
O(N!)
 inefficiency or 1000-fold constant term inefficiency.
Historically the constant was large.
I'm not saying don't encourage the use of delegates that take ranges where it makes sense, such as for formatting text. I'm just saying support both and don't force their use even where it doesn't matter.
Jul 12 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 11:48 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:
 If I always have to do something like this in order to append a single
 element:

 put(r, (&elem)[0..1]);
No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306
So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element?
Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].
 Well, all that will end up happening is cases where appending a single
 element is the only possibility will produce overloaded add functions,
 one that takes a single element, and one that takes an array. The one
 that takes an array will look like this:

 foreach(e; arr)
 add(e);
That's fine. the point is that if you put this loop on the wrong side of the delegate, it's much less efficient.
 I can tell you this for sure, because it's exactly what's in many
 dcollections classes.

 So what happens when you call put(r, e) for one of these output classes?
 Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn
 goes through some needless loop, which then ends up calling add(e). I
 don't see why this is preferable.
Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es. Let me copy the current definition of isOutputRange: template isOutputRange(R, E) { enum bool isOutputRange = is(typeof( { R r; E e; r.put(e); // can write element to range }())) || isInputRange!R && is(typeof( { R r; E e; r.front = e; // can assign to the front of range }())) || is(typeof( { R r; E[] es; r(es); }())); } Notice how the E[] requirement is for delegates only. Andrei
Jul 12 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 11:48 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:
 If I always have to do something like this in order to append a single
 element:

 put(r, (&elem)[0..1]);
No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306
So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element?
Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].
But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something. Indeed, a delegate that takes a string takes a single element, but because a string happens to be defined as a range of chars, it passes the test for output ranges. I could loop on an array of strings of one character, and output that to a valid output range no problem. The only thing that solves this problem correctly is buffering. What if I have my own container types that are large chunks of data, but don't happen to define the input range primitives? Why should I be artificially prevented from using those as input to output ranges? Really to me, you are saying, "I want your delegate to be efficient", but you defined something that is related to that in a small set of circumstances (when the arrays being passed in are large).
 Well, all that will end up happening is cases where appending a single
 element is the only possibility will produce overloaded add functions,
 one that takes a single element, and one that takes an array. The one
 that takes an array will look like this:

 foreach(e; arr)
 add(e);
That's fine. the point is that if you put this loop on the wrong side of the delegate, it's much less efficient.
Here's a proposal for put/isOutputRange which would solve my problem and not have any for loops in it: template isOutputRange(R, E) { enum bool isOutputRange = is(typeof( { R r; E e; r.put(e); // can write element to range }())) || isInputRange!R && is(typeof( { R r; E e; r.front = e; // can assign to the front of range }())) || is(typeof( { R r; E[] es; r(es); }())) || is(typeof( { R r; E es; r(es); }())); } void put(R, E)(ref R r, E e) if (isOutputRange!(R, E)) { static if (!isArray!R && is(typeof(r.put(e)))) { r.put(e); } else static if (isInputRange!R && is(typeof(r.front = e))) { r.front = e; r.popFront(); } else static if (is(typeof(r(e)))) { r(e); } else { static assert(false); } }
 I can tell you this for sure, because it's exactly what's in many
 dcollections classes.

 So what happens when you call put(r, e) for one of these output classes?
 Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn
 goes through some needless loop, which then ends up calling add(e). I
 don't see why this is preferable.
Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es.
Why the discrepency? A naive coder can define inefficient ranges just as well as he can define inefficient delegates. -Steve
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 01:47 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Yes. The point is that with a delegate you must choose between
 accepting E and E[]. Given the constraint, it's better for everyone to
 accept E[] and let put() take care of the occasional E by doing the
 wraparoo (&elem)[0..1].
But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something.
void delegate(int) perItem; void ofCourseThereIsAWay(int[] items) { foreach (e; items) perItem(i); }
 Indeed, a delegate that takes a string takes a single element, but
 because a string happens to be defined as a range of chars, it passes
 the test for output ranges.
Yah, that's a good point.
 I could loop on an array of strings of one character, and output that to
 a valid output range no problem. The only thing that solves this problem
 correctly is buffering.
I don't understand the point here.
 What if I have my own container types that are large chunks of data, but
 don't happen to define the input range primitives? Why should I be
 artificially prevented from using those as input to output ranges?
I don't understand this either.
 Really to me, you are saying, "I want your delegate to be efficient",
 but you defined something that is related to that in a small set of
 circumstances (when the arrays being passed in are large).
What I'm saying is, "If you know how to output one item you may as well output several, as I'm producing them in bulk".
 Here's a proposal for put/isOutputRange which would solve my problem and
 not have any for loops in it:
[snip] I'm uncomfortable about allowing inefficiency by design if I can help it, but I guess the costs all depend on the costs of trafficking one item.
 I can tell you this for sure, because it's exactly what's in many
 dcollections classes.

 So what happens when you call put(r, e) for one of these output classes?
 Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn
 goes through some needless loop, which then ends up calling add(e). I
 don't see why this is preferable.
Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es.
Why the discrepency? A naive coder can define inefficient ranges just as well as he can define inefficient delegates.
That doesn't mean we shouldn't foster the mechanisms that reduce the chance of bad designs happening. Andrei
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 15:18:17 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 01:47 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Yes. The point is that with a delegate you must choose between
 accepting E and E[]. Given the constraint, it's better for everyone to
 accept E[] and let put() take care of the occasional E by doing the
 wraparoo (&elem)[0..1].
But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something.
void delegate(int) perItem; void ofCourseThereIsAWay(int[] items) { foreach (e; items) perItem(i); }
Yeah, I realized after posting that it was incorrect :) But encouraging this kind of thing is probably not fostering efficient code (roll your own delegate when the compiler complains). It goes to my example with dcollections and add.
 I could loop on an array of strings of one character, and output that to
 a valid output range no problem. The only thing that solves this problem
 correctly is buffering.
I don't understand the point here.
My point is, if the reason behind requiring arrays instead of single elements is to force efficiency, then the reason is flawed. I can make inefficient code even when having to pass arrays to an output range. To solve this, you could make an output range that buffers elements until it has enough to pass to an underlying sink that takes an array of elements. The point is, you've only gone halfway to ensuring efficiency. But going the full way means you are imposing possibly bad buffering semantics on everything. If you can only go halfway, then I think the design is more annoying than successful.
 What if I have my own container types that are large chunks of data, but
 don't happen to define the input range primitives? Why should I be
 artificially prevented from using those as input to output ranges?
I don't understand this either.
Well, I had started constructing an example of how this would work, but I realize, I have no idea how you intend to use such a delegate as an output range... I don't really know how such output ranges will be generically usable in conjunction with output ranges which support the put interface.
 Really to me, you are saying, "I want your delegate to be efficient",
 but you defined something that is related to that in a small set of
 circumstances (when the arrays being passed in are large).
What I'm saying is, "If you know how to output one item you may as well output several, as I'm producing them in bulk".
I have no control over being able to output items in bulk or not, all I have is a delegate. What do I do?
 Here's a proposal for put/isOutputRange which would solve my problem and
 not have any for loops in it:
[snip] I'm uncomfortable about allowing inefficiency by design if I can help it, but I guess the costs all depend on the costs of trafficking one item.
I'm unsure how it will work either. I admit now that I didn't think through how this will be used. I imagined that the delegate version would be substituted for an output range which takes an element at a time, but now I'm not sure how you could write generic code that does that, given that the delegate must take an array of elements instead. I guess I'll wait to see how it works before objecting any further. -Steve
Jul 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:
 I'm unsure how it will work either. I admit now that I didn't think
 through how this will be used.
It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient. Andrei
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:
 I'm unsure how it will work either. I admit now that I didn't think
 through how this will be used.
It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range. -Steve
Jul 12 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:
 I'm unsure how it will work either. I admit now that I didn't think
 through how this will be used.
It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.
A char[] should be a valid output range for a dchar. I forgot to encode all valid strings situations, working on that now. Andrei
Jul 12 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:
 I'm unsure how it will work either. I admit now that I didn't think
 through how this will be used.
It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.
Actually a char[] is not a valid output range. Overwriting variable-length codes with other variable-length codes might mess up the string. Here's what I have. Works? void put(R, E)(ref R r, E e) { static if (!isArray!R && is(typeof(r.put(e)))) { r.put(e); } else static if (!isArray!R && is(typeof(r.put((&e)[0..1])))) { r.put((&e)[0..1]); } else static if (is(typeof(r.front = e, r.popFront()))) { r.front = e; r.popFront(); } else static if (isInputRange!E && is(typeof(put(r, e.front)))) { for (; !e.empty; e.popFront()) put(r, e.front); } else static if (is(typeof(r(e)))) { r(e); } else static if (is(typeof(r((&e)[0..1])))) { r((&e)[0..1]); } else { static assert(false, "Cannot put a "~E.stringof~" into a "~R.stringof); } } Andrei
Jul 12 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/12/2010 09:58 PM, Andrei Alexandrescu wrote:
 Here's what I have. Works?
[snip] To clarify: http://erdani.com/d/phobos/std_range.html Andrei
Jul 12 2010
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Jul 13, 2010 at 05:18, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org>  wrote:

    To clarify:

    http://erdani.com/d/phobos/std_range.html


Seems good to me. Lots of flexibility. I may begin to write output ranges
:-)

So in the very first case, e may well be a range, but the way it's used
internally is up to the output range designer. Maybe it uses a loop, maybe
not. OK.

You call r(e) and r([e])) 'delegates'. Are you just using it as a generic
term, or does that mean using any callable (struct with opCall) is not a
good idea in this case, for whatever reason?


Philippe
Jul 12 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 13 Jul 2010 01:52:52 -0400, Philippe Sigaud  
<philippe.sigaud gmail.com> wrote:

 You call r(e) and r([e])) 'delegates'. Are you just using it as a generic
 term, or does that mean using any callable (struct with opCall) is not a
 good idea in this case, for whatever reason?
I think struct with opCall is ok. That's pretty much all a delegate is anyways. -Steve
Jul 13 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/13/2010 06:18 AM, Steven Schveighoffer wrote:
 On Tue, 13 Jul 2010 01:52:52 -0400, Philippe Sigaud
 <philippe.sigaud gmail.com> wrote:

 You call r(e) and r([e])) 'delegates'. Are you just using it as a generic
 term, or does that mean using any callable (struct with opCall) is not a
 good idea in this case, for whatever reason?
I think struct with opCall is ok. That's pretty much all a delegate is anyways.
Yes, the delegate is mostly used as an example. Any syntactically valid r(e) should be accepted. Andrei
Jul 13 2010
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Jul 2010 22:58:07 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:
 I'm unsure how it will work either. I admit now that I didn't think
 through how this will be used.
It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.
Actually a char[] is not a valid output range. Overwriting variable-length codes with other variable-length codes might mess up the string.
Hm... I think it should be, and here is why. Imagine this situation: char[1024] buffer = void; // allocate some blank space on the stack put(buffer, someInputRange); But the above won't compile anyways, because a ref char[1024] isn't a range, and even if it was, it would be left in a state where it pointed to the uninitialized data. What we need is a helper struct, and then we are covered. char[1024] buffer = void; CharBuilder builder(buffer[]); // defines put put(builder, someInputRange); So I think we are good. Does Appender work here?
 Here's what I have. Works?

 void put(R, E)(ref R r, E e)
 {
      static if (!isArray!R && is(typeof(r.put(e))))
      {
          r.put(e);
      }
      else static if (!isArray!R && is(typeof(r.put((&e)[0..1]))))
      {
          r.put((&e)[0..1]);
      }
      else static if (is(typeof(r.front = e, r.popFront())))
      {
          r.front = e;
          r.popFront();
      }
      else static if (isInputRange!E && is(typeof(put(r, e.front))))
      {
          for (; !e.empty; e.popFront()) put(r, e.front);
      }
      else static if (is(typeof(r(e))))
      {
          r(e);
      }
      else static if (is(typeof(r((&e)[0..1]))))
      {
          r((&e)[0..1]);
      }
      else
      {
          static assert(false, "Cannot put a "~E.stringof~" into a  
 "~R.stringof);
      }
 }
That is satisfactory, it encompasses what I was saying, thanks! -Steve
Jul 13 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/13/2010 06:15 AM, Steven Schveighoffer wrote:
 On Mon, 12 Jul 2010 22:58:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually a char[] is not a valid output range. Overwriting
 variable-length codes with other variable-length codes might mess up
 the string.
Hm... I think it should be, and here is why. Imagine this situation: char[1024] buffer = void; // allocate some blank space on the stack put(buffer, someInputRange); But the above won't compile anyways, because a ref char[1024] isn't a range, and even if it was, it would be left in a state where it pointed to the uninitialized data. What we need is a helper struct, and then we are covered. char[1024] buffer = void; CharBuilder builder(buffer[]); // defines put put(builder, someInputRange); So I think we are good. Does Appender work here?
Appender doesn't currently work with fixed-size buffers, but it could and should be made to work. It's a good idea. Andrei
Jul 13 2010
prev sibling next sibling parent Christian Kamm <kamm-incasoftware removethis.de> writes:
Andrei Alexandrescu wrote:
 Yes. The point is that with a delegate you must choose between accepting
 E and E[]. Given the constraint, it's better for everyone to accept E[]
 and let put() take care of the occasional E by doing the wraparoo
 (&elem)[0..1].
I don't think the current implementation of put allows passing E and E[] correctly: void put(R, E)(ref R r, E e) if (isOutputRange!(R, E)) { if (isArray!E && is(typeof(r(e)))) { r(e); } else static if (is(typeof(r(new E[])))) { r((&e)[0 .. 1]); } else { static assert(false); } } Example: typeof(fn) == void function(int[][]). put(fn, int[]) -> ok put(fn, int[][]) -> no match, isOutputRange!(fn, int[][]) fails. You'll need a void put(R, E)(ref R r, E[] a) if (isOutputRange!(R, E)) overload get the desired behavior. Implementing that one generically also makes explicit what Steve was saying: front-assignable ranges and classic output ranges are stuck using put with a single element in a loop. Associating different performance tradeoffs with abstractions that should be interchangeable sounds odd. In the end it comes down to the question why R.put(E) and void R(E[]) should both be output ranges for E. They should either take E or E[]. I'd go with void foo(E) being an output range for E and define the overloads void put(ref R r, T t) if (isOutputRange!(R, T[])) // makes 1-element slice void put(ref R r, E e) if (isOutputRange!(R, E)) void put(ref R r, A[] a) if (isOutputRange!(R, A)) // uses foreach Christian
Jul 12 2010
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-07-12 13:49:50 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 The point is that with a delegate you must choose between accepting E 
 and E[]. Given the constraint, it's better for everyone to accept E[] 
 and let put() take care of the occasional E by doing the wraparoo 
 (&elem)[0..1].
If this means what I think, it means put() cannot be memory safe. Making an array from a stack variable and passing it around cannot be safe unless you can trust this reference won't escape the scope of the delegate you're calling (and there's no way to enforce that for dynamic arrays). To be safe, all you can do is copy the element on the heap. Am I wrong? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jul 12 2010
prev sibling parent BLS <windevguy hotmail.de> writes:
On 12/07/2010 03:17, Andrei Alexandrescu wrote:
 he notion of output range has been a tad vague in the past; up until now
 a range that wanted to qualify as an output range had to define a method
 called put.
What if we have to deal with non orthogonal structures, or .. simple directed graphs ? well well.. yet we do not even have a simple map!(string , T) in std.container .. So, For me the Range Stuff still lacks proof of product. I remain pretty sceptical. bjoern
Jul 12 2010