digitalmars.D - ideas about ranges

Steven Schveighoffer (90/90) May 21 2009 The thread discussing what to do for input ranges vs. forward ranges got...

Jason House (5/126) May 21 2009 This is exactly like what Andrei posted as his design.

Steven Schveighoffer (10/26) May 21 2009 If that's true, then I apologize, I was looking at the current docs, and...

Andrei Alexandrescu (5/38) May 21 2009 There are quite a few subtle points about the discussion you started,

Andrei Alexandrescu (78/194) May 22 2009 The hierarchy of concepts is:

Steven Schveighoffer (36/40) May 22 2009 I'll respond to this one point separately.

Andrei Alexandrescu (4/6) May 22 2009 I agree.

Steven Schveighoffer (89/204) May 22 2009 Right, a stream doesn't fit really well within any of those categories, ...

Andrei Alexandrescu (20/46) May 22 2009 We considered that as well. It is problematic because looking at

Steven Schveighoffer (32/57) May 22 2009 This is why I was suggesting that foreach has to support both the simple...

Andrei Alexandrescu (8/11) May 22 2009 Defining null references has been on the table too. I wrote a paragraph

Steven Schveighoffer (12/22) May 22 2009 &(r.popNext()) still gets you back to pointer-land. Might as well retur...
Daniel Keep (7/22) May 23 2009 Why not have something along these lines:

Steven Schveighoffer (25/42) May 22 2009 I misspoke, what I meant was maybe range isn't the best fit for *ref

Andrei Alexandrescu (3/8) May 22 2009 Plenty of algorithms work on input ranges.

Steven Schveighoffer (11/18) May 22 2009 I'm confused, by input range you mean a stream? Because I'm operating

Andrei Alexandrescu (8/30) May 22 2009 There isn't. All I'm saying is, we have the following constraints:

Steven Schveighoffer (9/32) May 22 2009 This is the assumption I am challenging. I don't think you need this

Andrei Alexandrescu (7/18) May 22 2009 I believe there is indeed a terminology problem. To me, "input range" ==...

Steven Schveighoffer (9/25) May 22 2009 OK, I can see you're not going to alter your opinion, so I'll stop

Andrei Alexandrescu (8/59) May 22 2009 I'm using STL's definitions. I'd be glad to depart if there was a solid

Steven Schveighoffer (46/53) May 26 2009 I reread the chapter on iterators in my C++ book. I was convinced that ...

Steve Teale (5/74) May 26 2009 Steve,

Steven Schveighoffer (17/24) May 26 2009 Ranges to me are a great concept. They are safer than C++ iterators, bu...

dsimcha (15/41) May 26 2009 You may be right. D's ranges may be in some ways slightly less powerful...

Lionello Lunesu (4/37) May 22 2009 You don't need a pointer to T, you need a nullable T :)

Steven Schveighoffer (5/37) May 22 2009 No, you need a pointer. If T is a reference type, how do you distinguis...

Jason House (4/6) May 22 2009 Do you have any common cases where a range would generate nulls as part ...

Andrei Alexandrescu (3/12) May 22 2009 You could have a vector containing plenty of null references.
Rainer Deyke (9/12) May 22 2009 Iterating over an array that contains nulls?

Andrei Alexandrescu (4/18) May 22 2009 They can't store a nullable ref int. They'd have to store a nullable

Daniel Keep (2/20) May 23 2009 The way C# solves this is to have IsSet and Get properties or something.

Lionello Lunesu (9/15) May 22 2009 [snip]

Andrei Alexandrescu (3/40) May 22 2009 You need a nullable reference to T...

Lionello Lunesu (8/16) May 22 2009 [snip]

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

The thread discussing what to do for input ranges vs. forward ranges got  
me thinking.

The range concept may be defined backwards in terms of which is more  
specialized.  Consider that an input range is always usable as a stream.   
But a stream is not easy to use as an input range (the range primitive).

Case in point, a file.  To fit into the most primitive range concept, it  
must define 3 functions:

front()
popFront()
empty()

empty is easy, it's just "am I at end of file"

But front is not so easy.  In order to know what's at the front, you need  
to read it.  And at that point you've altered the underlying file.

Look at the implementation of front and popFront for a possible FileByChar  
implementation:

dchar front()
{
   if(!bufferValid)
     popFront();
   return buffer;
}

void popFront()
{
    // read buffer from source, setting bufferValid if the range isn't  
empty by default
}

What sucks about this is, we have to introduce a buffer in the range, just  
because we can't look at data until we've popped it.  Not only that, but  
calling front a file before anything is read requires a check to fill the  
buffer in case we haven't read anything yet.  This could be alleviated by  
filling in the constructor, but it's still more complex than necessary.   
Consider also that the underlying stream might already be buffered, so we  
are buffering a buffer.

And finally, if you copy such a range, the buffer might be copied while  
the stream itself may not.  this could result in strange garbage data.

But since the primitives for input range are set by the compiler (it uses  
them to do foreach), we have to implement them to make our stream ranges  
friendly to foreach.

Round peg, meet square hole.

But what are the true requirements for iteration using foreach?

1. check if there's anything left
2. get the next element

Step 2 now is split into popFront and front.  So a foreach loop is a  
rewritten for loop like this:

foreach(x; range)
{
   ...
}

translates to:
{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
}

What if step 2 was one function?  Call it popNext(), and make it  
equivalent to calling _r.front() and popFront() in one step on ranges that  
implement that method.

How does this work with foreach?

{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
}

Basically, the same code, one less line.

Consider that any range defined today with front() and popFront() can  
implement popNext (and popNext could be an external function if we can get  
3015 resolved).

So what I think we may need is a different range primitive:

An iterable range defines: (name to be decided)

bool empty()
T popNext()

An input range is an iterable range that also defines:

T front();
popFront();

Now look at our FileByChar example as an iterable range:

T popNext()
{
    return source.get(); // no buffering required
}

And it works perfectly with the new foreach requirements.

And it correctly doesn't work with algorithms that require front and  
popFront.

-Steve

May 21 2009

Jason House <jason.james.house gmail.com> writes:

Steven Schveighoffer Wrote:

 
 The thread discussing what to do for input ranges vs. forward ranges got  
 me thinking.
 
 The range concept may be defined backwards in terms of which is more  
 specialized.  Consider that an input range is always usable as a stream.   
 But a stream is not easy to use as an input range (the range primitive).

 Case in point, a file.  To fit into the most primitive range concept, it  
 must define 3 functions:
 
 front()
 popFront()
 empty()
 
 empty is easy, it's just "am I at end of file"
 
 But front is not so easy.  In order to know what's at the front, you need  
 to read it.  And at that point you've altered the underlying file.
 
 Look at the implementation of front and popFront for a possible FileByChar  
 implementation:
 
 dchar front()
 {
    if(!bufferValid)
      popFront();
    return buffer;
 }
 
 void popFront()
 {
     // read buffer from source, setting bufferValid if the range isn't  
 empty by default
 }
 
 What sucks about this is, we have to introduce a buffer in the range, just  
 because we can't look at data until we've popped it.  Not only that, but  
 calling front a file before anything is read requires a check to fill the  
 buffer in case we haven't read anything yet.  This could be alleviated by  
 filling in the constructor, but it's still more complex than necessary.   
 Consider also that the underlying stream might already be buffered, so we  
 are buffering a buffer.
 
 And finally, if you copy such a range, the buffer might be copied while  
 the stream itself may not.  this could result in strange garbage data.
 
 But since the primitives for input range are set by the compiler (it uses  
 them to do foreach), we have to implement them to make our stream ranges  
 friendly to foreach.
 
 Round peg, meet square hole.
 
 But what are the true requirements for iteration using foreach?
 
 1. check if there's anything left
 2. get the next element
 
 Step 2 now is split into popFront and front.  So a foreach loop is a  
 rewritten for loop like this:
 
 foreach(x; range)
 {
    ...
 }
 
 translates to:
 {
    auto _r = range;
    while(!_r.empty)
    {
      auto x = _r.front();
      _r.popFront();
      ...
    }
 }
 
 What if step 2 was one function?  Call it popNext(), and make it  
 equivalent to calling _r.front() and popFront() in one step on ranges that  
 implement that method.
 
 How does this work with foreach?
 
 {
    auto _r = range;
    while(!_r.empty)
    {
      auto x = _r.popNext();
      ...
    }
 }
 
 Basically, the same code, one less line.
 
 Consider that any range defined today with front() and popFront() can  
 implement popNext (and popNext could be an external function if we can get  
 3015 resolved).
 
 So what I think we may need is a different range primitive:
 
 An iterable range defines: (name to be decided)
 
 bool empty()
 T popNext()
 
 An input range is an iterable range that also defines:
 
 T front();
 popFront();

This is exactly like what Andrei posted as his design.

Your iterable range = Andrei's input range
Your input range = Andrei's forward range

 
 Now look at our FileByChar example as an iterable range:
 
 T popNext()
 {
     return source.get(); // no buffering required
 }
 
 And it works perfectly with the new foreach requirements.
 
 And it correctly doesn't work with algorithms that require front and  
 popFront.
 
 -Steve

May 21 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 21 May 2009 17:40:27 -0400, Jason House  
<jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:
 So what I think we may need is a different range primitive:

 An iterable range defines: (name to be decided)

 bool empty()
 T popNext()

 An input range is an iterable range that also defines:

 T front();
 popFront();

 This is exactly like what Andrei posted as his design.

 Your iterable range = Andrei's input range
 Your input range = Andrei's forward range

If that's true, then I apologize, I was looking at the current docs, and  
Andrei's post about having some sort of "save" method.  The current docs  
definitely are not the same as what I proposed.

And in any case, my proposal requires a change to the compiler for  
foreach.  I haven't seen that talked about yet.  If that's in the works,  
then that's cool.  As I've said before, I don't need credit, I just like  
to have the ideas knocked around.

-Steve

May 21 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Thu, 21 May 2009 17:40:27 -0400, Jason House 
 <jason.james.house gmail.com> wrote:
 
 Steven Schveighoffer Wrote:
 So what I think we may need is a different range primitive:

 An iterable range defines: (name to be decided)

 bool empty()
 T popNext()

 An input range is an iterable range that also defines:

 T front();
 popFront();

 This is exactly like what Andrei posted as his design.

 Your iterable range = Andrei's input range
 Your input range = Andrei's forward range

 
 If that's true, then I apologize, I was looking at the current docs, and 
 Andrei's post about having some sort of "save" method.  The current docs 
 definitely are not the same as what I proposed.
 
 And in any case, my proposal requires a change to the compiler for 
 foreach.  I haven't seen that talked about yet.  If that's in the works, 
 then that's cool.  As I've said before, I don't need credit, I just like 
 to have the ideas knocked around.
 
 -Steve

There are quite a few subtle points about the discussion you started, 
and a few subtler subpoints still. I've had a busy day so I couldn't 
answer, but I'll try tomorrow.

Andrei

May 21 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 
 The thread discussing what to do for input ranges vs. forward ranges got 
 me thinking.
 
 The range concept may be defined backwards in terms of which is more 
 specialized.  Consider that an input range is always usable as a 
 stream.  But a stream is not easy to use as an input range (the range 
 primitive).

The hierarchy of concepts is:

input range < forward range < bidirectional range < random-access range

Having length, having slicing, and infinity are orthogonal properties.

 Case in point, a file.  To fit into the most primitive range concept, it 
 must define 3 functions:
 
 front()
 popFront()
 empty()
 
 empty is easy, it's just "am I at end of file"
 
 But front is not so easy.  In order to know what's at the front, you 
 need to read it.  And at that point you've altered the underlying file.

Not even empty is easy. If you're defining a file range that gives you 
e.g. whitespace-separated integers, you can't have empty return feof(p) 
because there might be only spaces through the end of file. So you need 
to cache one element ahead to be able to implement empty().

I've discussed this with Walter for the longest time. The correct 
primitive for an input range that needs to do no caching is:

Tuple!(T, "data", bool, "got") popNext();

When you call popNext it decides in the same place whether there is 
data, and also gives it to you. If "got" comes off as false, you know 
you're done.

There are two issues with this form. It's not easy to use (even if you 
play around with the signature by e.g. passing data or bool by 
reference) and it's not easy to inline for non-input ranges. The second 
problem could possibly be eliminated, but the first stays. It's just too 
clunky a primitive to use, you'll need temporaries and all that stuff.

 Look at the implementation of front and popFront for a possible 
 FileByChar implementation:
 
 dchar front()
 {
   if(!bufferValid)
     popFront();
   return buffer;
 }
 
 void popFront()
 {
    // read buffer from source, setting bufferValid if the range isn't 
 empty by default
 }
 
 What sucks about this is, we have to introduce a buffer in the range, 
 just because we can't look at data until we've popped it.  Not only 
 that, but calling front a file before anything is read requires a check 
 to fill the buffer in case we haven't read anything yet.  This could be 
 alleviated by filling in the constructor, but it's still more complex 
 than necessary.  Consider also that the underlying stream might already 
 be buffered, so we are buffering a buffer.

Yah, I agree. I've implemented a few of those in Phobos. It would be 
good to have a more solid solution. This problem, however, also collides 
with returning an rvalue versus a ref. A container wants to return a 
ref. An input range wants to return by value. Then it's difficult to use 
a container as an input range.

 And finally, if you copy such a range, the buffer might be copied while 
 the stream itself may not.  this could result in strange garbage data.

I don't understand this. You could make sure copy does the right thing.

 But since the primitives for input range are set by the compiler (it 
 uses them to do foreach), we have to implement them to make our stream 
 ranges friendly to foreach.
 
 Round peg, meet square hole.
 
 But what are the true requirements for iteration using foreach?
 
 1. check if there's anything left
 2. get the next element
 
 Step 2 now is split into popFront and front.  So a foreach loop is a 
 rewritten for loop like this:
 
 foreach(x; range)
 {
   ...
 }
 
 translates to:
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
 }
 
 What if step 2 was one function?  Call it popNext(), and make it 
 equivalent to calling _r.front() and popFront() in one step on ranges 
 that implement that method.

This will not solve all problems. It will improve things like defining 
ranges that read one character at a time from a stream. But a function 
that does read-and-check-for-empty in one shot is the true solution.

 How does this work with foreach?
 
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
 }
 
 Basically, the same code, one less line.
 
 Consider that any range defined today with front() and popFront() can 
 implement popNext (and popNext could be an external function if we can 
 get 3015 resolved).
 
 So what I think we may need is a different range primitive:
 
 An iterable range defines: (name to be decided)
 
 bool empty()
 T popNext()
 
 An input range is an iterable range that also defines:
 
 T front();
 popFront();

I think you are using "iterable range" instead of "input range" and 
"input range" instead of "forward range". This is compared to STL 
terminology, which I borrowed.

 Now look at our FileByChar example as an iterable range:
 
 T popNext()
 {
    return source.get(); // no buffering required
 }
 
 And it works perfectly with the new foreach requirements.
 
 And it correctly doesn't work with algorithms that require front and 
 popFront.

That's great. Again, popNext integrated with check for empty is the best 
solution. The correct way to go is to define this:

T popNext(ref bool gotSomething);

and leave it to the discretion of the range if they prefer returning by 
reference:

ref T popNext(ref bool gotSomething);

(this might be good for many forward ranges.) Then we nicely ask Walter 
to allow inlining of such functions, and finally implement this in 
std.range:

ref ElementType!R popNext(R)(R r, ref bool gotSomething)
     if (isForwardRange!R)
{
     if (r.empty)
     {
         gotSomething = false;
         static typeof(return) dumbo;
         return dumbo;
     }
     auto result = &(r.front());
     r.popFront();
     gotSomething = true;
     return *result;
}

Admittedly this looks considerably messier than it ought to, and that's 
never a good sign. For starters, I could predict with accuracy the 
sneering remarks of certain posters who shall remain unnamed. Worse, 
they'd have a point (only) this one time :o). Messiness was one of the 
factors that made me decide to steer away from this design. A simpler 
solution is to just return by value:

ElementType!R popNext(R)(R r, ref bool gotSomething)
     if (isForwardRange!R)
{
     if (r.empty)
     {
         gotSomething = false;
         return typeof(return).init;
     }
     auto result = r.front;
     r.popFront();
     return result;
}

This looks a tad more sane. But then it copies data more than it 
strictly should, and for whom? For everything that's not strictly a file 
input, which is most things you want to iterate! Wrong default.

As of this time, I am undecided on what's the best way to go. Opinions 
are welcome.


Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 11:48:34 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 And finally, if you copy such a range, the buffer might be copied while  
 the stream itself may not.  this could result in strange garbage data.

 I don't understand this. You could make sure copy does the right thing.

I'll respond to this one point separately.

consider you have a range with a statically defined buffer (not a  
heap-allocated):

struct range
{
   uint buf;
   FILE *source;
}

Let's assume the data in the source reads 1,2,3,4,5

The buffer contains 1, and 2,3,4,5 is still on the source

Without any alterations, copying the range copies the buffer, so now one  
instance of the range will read correctly, say:

1,2,3

and the copy will read:

1,4,5

Now, let's say when copying, we kill the buffer from the source range,  
re-run our example, and it now correctly reads:

1,2,3
4,5

But now, you might reorder the data in a strange way:

auto r2 = r1; // r1 now has no buffer, r2 contains '1'

output(take(r1, 3));
output(take(r2,2));

outputs:
2,3,4
1,5

This can be a problem if you are depending on order.  This isn't a likely  
scenario, but without a static buffer per range, you don't have any  
reordering issues.

I just think having weird bugs like that will be troublesome in certain  
cases.  Calling it 'expected behavior' is not going to help matters.

In addition, without a buffer to worry about, the code becomes much  
simpler.

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
[snip]
 In addition, without a buffer to worry about, the code becomes much 
 simpler.

I agree.

Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 11:48:34 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
  The thread discussing what to do for input ranges vs. forward ranges  
 got me thinking.
  The range concept may be defined backwards in terms of which is more  
 specialized.  Consider that an input range is always usable as a  
 stream.  But a stream is not easy to use as an input range (the range  
 primitive).

 The hierarchy of concepts is:

 input range < forward range < bidirectional range < random-access range

 Having length, having slicing, and infinity are orthogonal properties.

Right, a stream doesn't fit really well within any of those categories,  
because it's even more simple than an input range.  That was my point.   
You have to bolt functionality and a buffer on top of it to get it to fit  
the model.  Maybe this is inevitable, I was wondering if it would be  
easier to deal with if the range paradigm had another simpler construct  
than input range.

 Case in point, a file.  To fit into the most primitive range concept,  
 it must define 3 functions:
  front()
 popFront()
 empty()
  empty is easy, it's just "am I at end of file"
  But front is not so easy.  In order to know what's at the front, you  
 need to read it.  And at that point you've altered the underlying file.

 Not even empty is easy. If you're defining a file range that gives you  
 e.g. whitespace-separated integers, you can't have empty return feof(p)  
 because there might be only spaces through the end of file. So you need  
 to cache one element ahead to be able to implement empty().

Even a stream that does nothing but read straight characters, but has no  
idea how long the stream is, such as stdin, cannot specify empty without  
trying to read more data.  I see your point.

 I've discussed this with Walter for the longest time. The correct  
 primitive for an input range that needs to do no caching is:

 Tuple!(T, "data", bool, "got") popNext();

 When you call popNext it decides in the same place whether there is  
 data, and also gives it to you. If "got" comes off as false, you know  
 you're done.

Yes, I see now that my model won't cut it.  I had a feeling that my model  
wasn't quite correct, but as you say, yours still is clunky.

Looking at other iteration models:


start the enumeration. (yuck!)
Java: This is similar to what I was proposing: only has "get next element"  
and "has more elements".

I'm not sure if the right answer has been found by any of these yet.  I  
can feel there is a solution that should be simple, and accurately reflect  
the underlying objects' API, but I have to think more about it.

  What sucks about this is, we have to introduce a buffer in the range,  
 just because we can't look at data until we've popped it.  Not only  
 that, but calling front a file before anything is read requires a check  
 to fill the buffer in case we haven't read anything yet.  This could be  
 alleviated by filling in the constructor, but it's still more complex  
 than necessary.  Consider also that the underlying stream might already  
 be buffered, so we are buffering a buffer.

 Yah, I agree. I've implemented a few of those in Phobos. It would be  
 good to have a more solid solution. This problem, however, also collides  
 with returning an rvalue versus a ref. A container wants to return a  
 ref. An input range wants to return by value. Then it's difficult to use  
 a container as an input range.

don't you mean the other way around?  That is, it's easy to convert a ref  
into a value, but not vice versa.

 But since the primitives for input range are set by the compiler (it  
 uses them to do foreach), we have to implement them to make our stream  
 ranges friendly to foreach.
  Round peg, meet square hole.
  But what are the true requirements for iteration using foreach?
  1. check if there's anything left
 2. get the next element
  Step 2 now is split into popFront and front.  So a foreach loop is a  
 rewritten for loop like this:
  foreach(x; range)
 {
   ...
 }
  translates to:
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
 }
  What if step 2 was one function?  Call it popNext(), and make it  
 equivalent to calling _r.front() and popFront() in one step on ranges  
 that implement that method.

 This will not solve all problems. It will improve things like defining  
 ranges that read one character at a time from a stream. But a function  
 that does read-and-check-for-empty in one shot is the true solution.

Yes, agreed.  I was focusing on not having to keep around transient data  
to avoid reordering or caching issues.  But it does need to be all-in-one  
as you have shown.

 How does this work with foreach?
  {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
 }
  Basically, the same code, one less line.
  Consider that any range defined today with front() and popFront() can  
 implement popNext (and popNext could be an external function if we can  
 get 3015 resolved).
  So what I think we may need is a different range primitive:
  An iterable range defines: (name to be decided)
  bool empty()
 T popNext()
  An input range is an iterable range that also defines:
  T front();
 popFront();

 I think you are using "iterable range" instead of "input range" and  
 "input range" instead of "forward range". This is compared to STL  
 terminology, which I borrowed.

I don't know what the terminology should be.  I was using your  
terminology.  when I said input range, I meant input range as you defined  
it on your std.range documentation.  An input range has the following  
methods:

T front()
void popFront()
bool empty()
and by composition: T popNext() (this wasn't in the docs, but it's  
something that we're throwing around here)

An "iterable" range, or one that does not have persistent access to  
elements and must modify the range and get data at the same time, was  
supposed to be a subset of that. I was just defining it as T popNext() and  
empty(), but it looks like those have to be combined into one function as  
you say.

 (this might be good for many forward ranges.) Then we nicely ask Walter  
 to allow inlining of such functions, and finally implement this in  
 std.range:

 ref ElementType!R popNext(R)(R r, ref bool gotSomething)
      if (isForwardRange!R)
 {
      if (r.empty)
      {
          gotSomething = false;
          static typeof(return) dumbo;
          return dumbo;
      }
      auto result = &(r.front());
      r.popFront();
      gotSomething = true;
      return *result;
 }

 Admittedly this looks considerably messier than it ought to, and that's  
 never a good sign. For starters, I could predict with accuracy the  
 sneering remarks of certain posters who shall remain unnamed. Worse,  
 they'd have a point (only) this one time :o). Messiness was one of the  
 factors that made me decide to steer away from this design. A simpler  
 solution is to just return by value:

Yeah, it's ugly.  Dealing with orthogonal returns is not a graceful  
proposition.

Another idea is to make the T the ref'd arg, similar to how the system  
call read() works:

bool popNext(ref T);

This has some benefits:

1) you aren't checking a temporary for emptyness, so it fits well within a  
loop construct
2) you can avoid returning a dummy element in the empty range case.
3) you avoid returning a piece of possibly large data on the stack when  
it's probably just going to get copied anyways (although the compiler can  
sometimes optimize this).

One thing this does though, is remove the possibility of returning a  
reference to an element.  Not sure if this is a bad thing, since you can  
always convert a reference to a value, and when operating in this mode,  
you are working with the lowest common denominator (as everything has to  
be able to implement this).

However, with foreach, you may want to have references sometimes.  For  
that, I'd guess that foreach might have to support 3 apis: opApply, range,  
and stream (popNext API).

I'm not convinced that range is the best fit for ref'd data anyways,  
opApply is much more adept at it.

How does implementation of popNext look?

bool popNext(R)(R r, ref ElementType!R elem)
      if (isForwardRange!R)
{
      if (r.empty)
          return false;
      elem = r.front();
      r.popFront();
      return true;
}

usage:

R r;
for(ElementType!R x; r.popNext(x);)
{
    // use x
}

vs. your idea for popNext:

R r;
bool done;
// ... geez, I don't even know an easy way to do this???  does this work?
while(ref x = r.popNext(done), done)
{
    // use x
}

Still thinking, may have more ideas...  How do we get a reference to the  
element...

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system 
 call read() works:
 
 bool popNext(ref T);
 
 This has some benefits:
 
 1) you aren't checking a temporary for emptyness, so it fits well within 
 a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack when 
 it's probably just going to get copied anyways (although the compiler 
 can sometimes optimize this).

We considered that as well. It is problematic because looking at 
elements always entails copying them, which is rather silly if you do it 
for e.g. an array.

By golly, I kid you not but the interface I personally like the most is:

struct ArchetypalInputRange(T)
{
     T* popNext();
}

popNext returns a pointer to a T (the value may be reused) or null when 
the range is done. Iteration becomes easy and efficient for all types. 
An input range would still have to keep a buffer (and return a pointer 
to it), but things are not awkward to implement.

Given the bad reputation that pointers have, I guess people wouldn't 
like this all that much.

 I'm not convinced that range is the best fit for ref'd data anyways, 
 opApply is much more adept at it.

If ranges aren't fit to mutate stuff, we failed. Many algorithms in 
std.algorithm mutate their data and I can't bring myself to think how 
I'd implement them all with opApply.

 How does implementation of popNext look?
 
 bool popNext(R)(R r, ref ElementType!R elem)
      if (isForwardRange!R)
 {
      if (r.empty)
          return false;
      elem = r.front();
      r.popFront();
      return true;
 }

It forces a copy whenever you want to take a look at something.


Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 15:55:52 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system  
 call read() works:
  bool popNext(ref T);
  This has some benefits:
  1) you aren't checking a temporary for emptyness, so it fits well  
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack when  
 it's probably just going to get copied anyways (although the compiler  
 can sometimes optimize this).

 We considered that as well. It is problematic because looking at  
 elements always entails copying them, which is rather silly if you do it  
 for e.g. an array.

This is why I was suggesting that foreach has to support both the simple  
popNext API and the range API.  Which sucks, but I can't see a way around  
it without using pointers...

 By golly, I kid you not but the interface I personally like the most is:

 struct ArchetypalInputRange(T)
 {
      T* popNext();
 }

 popNext returns a pointer to a T (the value may be reused) or null when  
 the range is done. Iteration becomes easy and efficient for all types.  
 An input range would still have to keep a buffer (and return a pointer  
 to it), but things are not awkward to implement.

 Given the bad reputation that pointers have, I guess people wouldn't  
 like this all that much.

Then maybe we have to come up with a better pointer...  I kind of like the  
way classes work for this.  Comparing them to null does what you want it  
to do, but doing anything else with it forwards to the underlying data.   
It's rebindable, and you can't do arithmetic on the pointer.

ref accomplishes all of this, except you can't get at the underlying  
pointer to do things like compare to null or rebind.  Maybe we simply need  
some new operators to get at the ref addresses.

ref T popNext();
ref T t; // initialized to null
while((t =& popNext()) ==& null)
...
Still doesn't look as good as:

while(auto t = popNext())

Hm.. what about returning an array of 0 or one element? (ugh).

What about a type which acts like a class reference for simple data types:

struct rref(T)
{
   prviate T *data;
   ref T opStar() {return *data;}
   ref T opDot() {return *data;}
   void opAssign(ref T t) {data = &t;}
}

Need some compiler help to allow comparing to null, probably want this to  
be a builtin type, and have some better syntax.

BTW, I like your solution the best too.  I thought of using pointers, but  
decided against it because of the taboo factor.

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying 
 pointer to do things like compare to null or rebind.  Maybe we simply 
 need some new operators to get at the ref addresses.

Defining null references has been on the table too. I wrote a paragraph 
about them and then deleted it in fear of major aggravation.

There's no need for new syntax. We can easily define things such that 
you can return *null when a reference is expected and checking 
&(r.popNext()) is null to figure out what's happening. I think it's bad 
design though.


Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 16:40:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying  
 pointer to do things like compare to null or rebind.  Maybe we simply  
 need some new operators to get at the ref addresses.

 Defining null references has been on the table too. I wrote a paragraph  
 about them and then deleted it in fear of major aggravation.

 There's no need for new syntax. We can easily define things such that  
 you can return *null when a reference is expected and checking  
 &(r.popNext()) is null to figure out what's happening. I think it's bad  
 design though.

&(r.popNext()) still gets you back to pointer-land.  Might as well return  
a pointer.

Also, I don't even think this works, because you can either check for null  
or get the data, but not both:

ref T t; // syntax error, no?

while(&(t = r.popNext()) !is null)
{
}

doesn't work because you're not rebinding t.

-Steve

May 22 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying
 pointer to do things like compare to null or rebind.  Maybe we simply
 need some new operators to get at the ref addresses.

 
 Defining null references has been on the table too. I wrote a paragraph
 about them and then deleted it in fear of major aggravation.
 
 There's no need for new syntax. We can easily define things such that
 you can return *null when a reference is expected and checking
 &(r.popNext()) is null to figure out what's happening. I think it's bad
 design though.
 
 
 Andrei

Why not have something along these lines:

bool isrefset(T)(ref T r)
{
    return (&r !is null);
}

Then you don't have to dirty yourself by directly using pointers.  :)

May 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 15:55:52 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:

 I'm not convinced that range is the best fit for ref'd data anyways,  
 opApply is much more adept at it.

 If ranges aren't fit to mutate stuff, we failed. Many algorithms in  
 std.algorithm mutate their data and I can't bring myself to think how  
 I'd implement them all with opApply.

I misspoke, what I meant was maybe range isn't the best fit for *ref  
iteration using foreach*.  Of course, having ranges that can ref data is  
important for other uses of ranges.

So would it be acceptable to change foreach to use either the popNext api  
or the opApply api, and not have foreach support the input range api?   
That was my ponder.

 How does implementation of popNext look?
  bool popNext(R)(R r, ref ElementType!R elem)
      if (isForwardRange!R)
 {
      if (r.empty)
          return false;
      elem = r.front();
      r.popFront();
      return true;
 }

 It forces a copy whenever you want to take a look at something.

Yes, and in the case of:

foreach(x; r)

You are going to do copying.  This is what I'm guessing would be the  
standard model for getting data from a stream using foreach (it is now,  
with opApply).

In the case of:

foreach(ref x; r)

you *need* a reference, so at this point you'd need to use the opApply or  
input-range style (front, popfront, empty) api to run foreach.  If streams  
don't support ref data (which they shouldn't IMO), this is a compiler  
error.

I'm concentrating mostly on usages with foreach, not algorithms.  If we  
are to have streams that fit into the range model, then they need to be  
foreach'able.  I don't know that they need a lot of support to feed into  
std.algorithm as reference data.  I.e. you aren't going to sort a network  
stream.

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If we 
 are to have streams that fit into the range model, then they need to be 
 foreach'able.  I don't know that they need a lot of support to feed into 
 std.algorithm as reference data.  I.e. you aren't going to sort a 
 network stream.

Plenty of algorithms work on input ranges.

Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If we  
 are to have streams that fit into the range model, then they need to be  
 foreach'able.  I don't know that they need a lot of support to feed  
 into std.algorithm as reference data.  I.e. you aren't going to sort a  
 network stream.

 Plenty of algorithms work on input ranges.

I'm confused, by input range you mean a stream?  Because I'm operating  
under the assumption that an input range is anything that defines front,  
popFront, and empty.  While you can shoehorn a stream into being an input  
range, they don't necessarily implement the required elements easily.  We  
may be confusing terminology.

Can you name an example of an input range that is a stream, and an  
algorithm that mutates the stream in place (thereby requiring ref  
elements)?

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If 
 we are to have streams that fit into the range model, then they need 
 to be foreach'able.  I don't know that they need a lot of support to 
 feed into std.algorithm as reference data.  I.e. you aren't going to 
 sort a network stream.

 Plenty of algorithms work on input ranges.

 
 I'm confused, by input range you mean a stream?  Because I'm operating 
 under the assumption that an input range is anything that defines front, 
 popFront, and empty.  While you can shoehorn a stream into being an 
 input range, they don't necessarily implement the required elements 
 easily.  We may be confusing terminology.
 
 Can you name an example of an input range that is a stream, and an 
 algorithm that mutates the stream in place (thereby requiring ref 
 elements)?

There isn't. All I'm saying is, we have the following constraints:

1. Any range should be seamlessly and efficiently used as an input 
range. This probably entails return by ref for efficiency.

2. Input ranges should define popNext to do at the same time reading, 
testing for empty, and advancing.

That's it!


Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If  
 we are to have streams that fit into the range model, then they need  
 to be foreach'able.  I don't know that they need a lot of support to  
 feed into std.algorithm as reference data.  I.e. you aren't going to  
 sort a network stream.

 Plenty of algorithms work on input ranges.

  I'm confused, by input range you mean a stream?  Because I'm operating  
 under the assumption that an input range is anything that defines  
 front, popFront, and empty.  While you can shoehorn a stream into being  
 an input range, they don't necessarily implement the required elements  
 easily.  We may be confusing terminology.
  Can you name an example of an input range that is a stream, and an  
 algorithm that mutates the stream in place (thereby requiring ref  
 elements)?

 There isn't. All I'm saying is, we have the following constraints:

 1. Any range should be seamlessly and efficiently used as an input range.

This is the assumption I am challenging.  I don't think you need this  
assumptions for ranges to work.  You can always bolt input range  
functionality on top of a stream range if you want to treat a stream as an  
input range for some reason.  But if foreach doesn't utilize the popNext  
api, then streams require an unnecessary layer on top, just to use foreach  
with them.

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input range.

 
 This is the assumption I am challenging.  I don't think you need this 
 assumptions for ranges to work.  You can always bolt input range 
 functionality on top of a stream range if you want to treat a stream as 
 an input range for some reason.

I believe there is indeed a terminology problem. To me, "input range" == 
"stream" == "socket" == "bridge that is sinking under your feet as you 
walk it". So to me there exists no "stream range". That to me is an 
"input range".

 But if foreach doesn't utilize the 
 popNext api, then streams require an unnecessary layer on top, just to 
 use foreach with them.

We can arrange that foreach uses popNext, but it must be worth it.

Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 21:50:03 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input  
 range.

  This is the assumption I am challenging.  I don't think you need this  
 assumptions for ranges to work.  You can always bolt input range  
 functionality on top of a stream range if you want to treat a stream as  
 an input range for some reason.

 I believe there is indeed a terminology problem. To me, "input range" ==  
 "stream" == "socket" == "bridge that is sinking under your feet as you  
 walk it". So to me there exists no "stream range". That to me is an  
 "input range".

OK, I can see you're not going to alter your opinion, so I'll stop  
debating.  I respectfully disagree with your definitions.

 But if foreach doesn't utilize the popNext api, then streams require an  
 unnecessary layer on top, just to use foreach with them.

 We can arrange that foreach uses popNext, but it must be worth it.

I can't say for certain that it is, but you definitely will know if when  
you start implementing ranges based on stuff like files, and things just  
seem difficult to implement or have unsolvable problems.  I guess we wait  
and see.

-Steve

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input 
 range.

 This is the assumption I am challenging.  I don't think you need this 
 assumptions for ranges to work.  You can always bolt input range 
 functionality on top of a stream range if you want to treat a stream 
 as an input range for some reason.

 
 I believe there is indeed a terminology problem. To me, "input range" == 
 "stream" == "socket" == "bridge that is sinking under your feet as you 
 walk it". So to me there exists no "stream range". That to me is an 
 "input range".
 
 But if foreach doesn't utilize the popNext api, then streams require 
 an unnecessary layer on top, just to use foreach with them.

 
 We can arrange that foreach uses popNext, but it must be worth it.
 
 Andrei

 On Fri, 22 May 2009 21:50:03 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input 
 range.

  This is the assumption I am challenging.  I don't think you need 
 this assumptions for ranges to work.  You can always bolt input range 
 functionality on top of a stream range if you want to treat a stream 
 as an input range for some reason.

 I believe there is indeed a terminology problem. To me, "input range" 
 == "stream" == "socket" == "bridge that is sinking under your feet as 
 you walk it". So to me there exists no "stream range". That to me is 
 an "input range".

 
 OK, I can see you're not going to alter your opinion, so I'll stop 
 debating.  I respectfully disagree with your definitions.

I'm using STL's definitions. I'd be glad to depart if there was a solid 
reason. How would you name things?

 But if foreach doesn't utilize the popNext api, then streams require 
 an unnecessary layer on top, just to use foreach with them.

 We can arrange that foreach uses popNext, but it must be worth it.

 
 I can't say for certain that it is, but you definitely will know if when 
 you start implementing ranges based on stuff like files, and things just 
 seem difficult to implement or have unsolvable problems.  I guess we 
 wait and see.

One hopeful element is that there are usually more containers than 
streams/files. Besides, most streams are buffered. So maybe things 
aren't that bad.


Andrei

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 22:35:39 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
  OK, I can see you're not going to alter your opinion, so I'll stop  
 debating.  I respectfully disagree with your definitions.

 I'm using STL's definitions. I'd be glad to depart if there was a solid  
 reason. How would you name things?

I reread the chapter on iterators in my C++ book.  I was convinced that I  
would see something to fuel my beliefs.

I'm now convinced that ranges can stay as is.  Here is why:

I was focusing on a stream range as a more basic range than an input  
range.  However, I wasn't thinking about the whole picture.  There are  
more than just input streams.  There are output streams and input/output  
streams.  It makes sense to support all of them, not just input streams.

So really, I'm thinking of a different tree of entities, which already  
exist: streams!  That is, the unification of ranges and streams really  
isn't necessary.  You can wrap a stream as an input, output, or forward  
range, and deal with the shoehorning (and buffer issues) when you really  
need it.  I don't think viewing streams as ranges is going to be a useful  
idiom unless you need to convert from one world to the other.  So trying  
to find a way to make files be ranges might be simply an overenthusiastic  
view of ranges, similar to how people try to make everything object  
oriented when they learn how OO design works.

As far as foreach, we still have opApply for streams (or shoehorning if  
you need range-ification).

However, out of all this may come one interesting idea: popNext  
(specifically the 'bool popNext(ref T val)' form).

Note that there are a couple of advantages to using popNext with foreach:
1. Multiple variables, and easy overloading.  i.e. foreach(int idx, T val;  
range) would be possible (and much less cumbersome than opApply!).
2. 1 function call per loop instead of 3.  This may have a distinct  
advantage if you absolutely have to use a virtual method, or your range  
must call virtual methods on it's container.

I still don't know how to make ref values work with popNext without using  
pointers.  It would be nice to have a rebindable ref type.

I thought of this syntax, does this seem appealing at all?

int i = 5, j = 6;
ref int ri = i;
ri = 7;
assert(i == 7);
ri.& = &j;
ri = 8;
assert(j == 8);

type of ref.& is *not* a pointer, the arithmetic operators are not defined  
for it, and it can't be assigned to a pointer (although it can be assigned  
 from a pointer).

You also need to be able to pass the type of ref.& to rebind references  
inside a popNext function.

If we could solve these problems, we might have a better case of being  
able to get rid of opApply.

-Steve

May 26 2009

Steve Teale <steve.teale britseyeview.com> writes:

Steven Schveighoffer Wrote:

 On Fri, 22 May 2009 22:35:39 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
  OK, I can see you're not going to alter your opinion, so I'll stop  
 debating.  I respectfully disagree with your definitions.

 I'm using STL's definitions. I'd be glad to depart if there was a solid  
 reason. How would you name things?

 
 I reread the chapter on iterators in my C++ book.  I was convinced that I  
 would see something to fuel my beliefs.
 
 I'm now convinced that ranges can stay as is.  Here is why:
 
 I was focusing on a stream range as a more basic range than an input  
 range.  However, I wasn't thinking about the whole picture.  There are  
 more than just input streams.  There are output streams and input/output  
 streams.  It makes sense to support all of them, not just input streams.
 
 So really, I'm thinking of a different tree of entities, which already  
 exist: streams!  That is, the unification of ranges and streams really  
 isn't necessary.  You can wrap a stream as an input, output, or forward  
 range, and deal with the shoehorning (and buffer issues) when you really  
 need it.  I don't think viewing streams as ranges is going to be a useful  
 idiom unless you need to convert from one world to the other.  So trying  
 to find a way to make files be ranges might be simply an overenthusiastic  
 view of ranges, similar to how people try to make everything object  
 oriented when they learn how OO design works.
 
 As far as foreach, we still have opApply for streams (or shoehorning if  
 you need range-ification).
 
 However, out of all this may come one interesting idea: popNext  
 (specifically the 'bool popNext(ref T val)' form).
 
 Note that there are a couple of advantages to using popNext with foreach:
 1. Multiple variables, and easy overloading.  i.e. foreach(int idx, T val;  
 range) would be possible (and much less cumbersome than opApply!).
 2. 1 function call per loop instead of 3.  This may have a distinct  
 advantage if you absolutely have to use a virtual method, or your range  
 must call virtual methods on it's container.
 
 I still don't know how to make ref values work with popNext without using  
 pointers.  It would be nice to have a rebindable ref type.
 
 I thought of this syntax, does this seem appealing at all?
 
 int i = 5, j = 6;
 ref int ri = i;
 ri = 7;
 assert(i == 7);
 ri.& = &j;
 ri = 8;
 assert(j == 8);
 
 type of ref.& is *not* a pointer, the arithmetic operators are not defined  
 for it, and it can't be assigned to a pointer (although it can be assigned  
  from a pointer).
 
 You also need to be able to pass the type of ref.& to rebind references  
 inside a popNext function.
 
 If we could solve these problems, we might have a better case of being  
 able to get rid of opApply.
 
 -Steve

Steve,

I'm glad to see that there are others out there who are equally skeptical about
ranges..

Your point about Ranges being similar to an early view of OOP is a good one.
Just because someone comes up with an interesting concept, it does not mean you
need to use it all the time, or even ever if it does not sit well with you.

Steve

May 26 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 26 May 2009 13:45:05 -0400, Steve Teale  
<steve.teale britseyeview.com> wrote:

 Steve,

 I'm glad to see that there are others out there who are equally  
 skeptical about ranges..

 Your point about Ranges being similar to an early view of OOP is a good  
 one. Just because someone comes up with an interesting concept, it does  
 not mean you need to use it all the time, or even ever if it does not  
 sit well with you.

Ranges to me are a great concept.  They are safer than C++ iterators, but  
they are not exactly new.  For example Java Iterators are essentially  
input ranges.

However, STL algorithms applied to such concepts has not often been done  
(if ever before), probably because not many languages have such extensive  
template support.  It's certainly an interesting pairing, and I think it  
will prove to be a great fit.

On the skeptical side, I don't think ranges are a great fit for all cases  
where C++ iterators work well.  Using iterators as a marker, for instance,  
is not handled well by a range.  It seems to me that streams also don't  
fit well as a range (but certainly there needs to be some connector code  
between ranges and streams).

So I guess, you wouldn't call me a skeptic, but I'm certainly not  
convinced that ranges are the solution to all problems :)

-Steve

May 26 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 On Tue, 26 May 2009 13:45:05 -0400, Steve Teale
 <steve.teale britseyeview.com> wrote:
 Steve,

 I'm glad to see that there are others out there who are equally
 skeptical about ranges..

 Your point about Ranges being similar to an early view of OOP is a good
 one. Just because someone comes up with an interesting concept, it does
 not mean you need to use it all the time, or even ever if it does not
 sit well with you.

 Ranges to me are a great concept.  They are safer than C++ iterators, but
 they are not exactly new.  For example Java Iterators are essentially
 input ranges.
 However, STL algorithms applied to such concepts has not often been done
 (if ever before), probably because not many languages have such extensive
 template support.  It's certainly an interesting pairing, and I think it
 will prove to be a great fit.
 On the skeptical side, I don't think ranges are a great fit for all cases
 where C++ iterators work well.  Using iterators as a marker, for instance,
 is not handled well by a range.  It seems to me that streams also don't
 fit well as a range (but certainly there needs to be some connector code
 between ranges and streams).
 So I guess, you wouldn't call me a skeptic, but I'm certainly not
 convinced that ranges are the solution to all problems :)
 -Steve

You may be right.  D's ranges may be in some ways slightly less powerful than
C++
iterators, for reasons you pointed out, but IMHO they're better even if they
are.
 I believe that trying to make something flexible enough to handle 99-100% of
use
cases is a recipe for disaster, as whatever you come up with will be way too
complex and difficult to use for the easiest 90-95% of cases.  It's better to
make
it handle the easiest 90-95% of cases well, i.e. with as much conceptual
simplicity and consistency and as little boilerplate as possible, and assume
that
people will roll their own custom solution for the really hard cases, since they
probably will anyhow.  In other words, worse really is better sometimes.

I've programmed very little in C++, but during my forays into it, I've always
felt
that STL iterators were so mind-numbingly verbose and ugly that I really didn't
want to use them no matter how flexible or efficient they were.  They are so
messy
that ad-hoc roll-your-own solutions start to seem like a better idea for a lot
of
the simpler cases, and this is exactly what I usually did.

May 26 2009

Lionello Lunesu <lio lunesu.remove.com> writes:

Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system 
 call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well 
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack 
 when it's probably just going to get copied anyways (although the 
 compiler can sometimes optimize this).

 
 We considered that as well. It is problematic because looking at 
 elements always entails copying them, which is rather silly if you do it 
 for e.g. an array.
 
 By golly, I kid you not but the interface I personally like the most is:
 
 struct ArchetypalInputRange(T)
 {
     T* popNext();
 }
 
 popNext returns a pointer to a T (the value may be reused) or null when 
 the range is done. Iteration becomes easy and efficient for all types. 
 An input range would still have to keep a buffer (and return a pointer 
 to it), but things are not awkward to implement.
 
 Given the bad reputation that pointers have, I guess people wouldn't 
 like this all that much.

You don't need a pointer to T, you need a nullable T :)
...which doesn't exist...

L.

May 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 22 May 2009 20:43:22 -0400, Lionello Lunesu  
<lio lunesu.remove.com> wrote:

 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system  
 call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well  
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack  
 when it's probably just going to get copied anyways (although the  
 compiler can sometimes optimize this).

  We considered that as well. It is problematic because looking at  
 elements always entails copying them, which is rather silly if you do  
 it for e.g. an array.
  By golly, I kid you not but the interface I personally like the most  
 is:
  struct ArchetypalInputRange(T)
 {
     T* popNext();
 }
  popNext returns a pointer to a T (the value may be reused) or null  
 when the range is done. Iteration becomes easy and efficient for all  
 types. An input range would still have to keep a buffer (and return a  
 pointer to it), but things are not awkward to implement.
  Given the bad reputation that pointers have, I guess people wouldn't  
 like this all that much.

 You don't need a pointer to T, you need a nullable T :)
 ...which doesn't exist...

No, you need a pointer.  If T is a reference type, how do you distinguish  
a null element from the end of the iteration?

-Steve

May 22 2009

Jason House <jason.james.house gmail.com> writes:

Steven Schveighoffer wrote:

 No, you need a pointer.  If T is a reference type, how do you distinguish
 a null element from the end of the iteration?

Do you have any common cases where a range would generate nulls as part of 
its normal output?  I think using the natural null feature of reference 
types is a clean and reasonable solution.

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Jason House wrote:
 Steven Schveighoffer wrote:
 
 No, you need a pointer.  If T is a reference type, how do you distinguish
 a null element from the end of the iteration?

 
 Do you have any common cases where a range would generate nulls as part of 
 its normal output?  I think using the natural null feature of reference 
 types is a clean and reasonable solution.
 

You could have a vector containing plenty of null references.

Andrei

May 22 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jason House wrote:
 Do you have any common cases where a range would generate nulls as part of 
 its normal output?  I think using the natural null feature of reference 
 types is a clean and reasonable solution.

Iterating over an array that contains nulls?

Nullable!(T) must be able to hold all values of T, plus null.  By
inference, Nullable!(Nullable!(T)) must be able to hold all values of T,
plus two different null values.  'Maybe' in Haskell can do this.  (The
two null values are called 'Nothing' and 'Just Nothing'.)
'boost::optional' in C++ can do this.


-- 
Rainer Deyke - rainerd eldwood.com

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Rainer Deyke wrote:
 Jason House wrote:
 Do you have any common cases where a range would generate nulls as part of 
 its normal output?  I think using the natural null feature of reference 
 types is a clean and reasonable solution.

 
 Iterating over an array that contains nulls?
 
 Nullable!(T) must be able to hold all values of T, plus null.  By
 inference, Nullable!(Nullable!(T)) must be able to hold all values of T,
 plus two different null values.  'Maybe' in Haskell can do this.  (The
 two null values are called 'Nothing' and 'Just Nothing'.)
 'boost::optional' in C++ can do this.
 
 

They can't store a nullable ref int. They'd have to store a nullable 
pointer to an int. Then wait, there exists a null pointer to an int.

Andrei

May 22 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Andrei Alexandrescu wrote:
 Rainer Deyke wrote:
 Jason House wrote:
 Do you have any common cases where a range would generate nulls as
 part of its normal output?  I think using the natural null feature of
 reference types is a clean and reasonable solution.

 Iterating over an array that contains nulls?

 Nullable!(T) must be able to hold all values of T, plus null.  By
 inference, Nullable!(Nullable!(T)) must be able to hold all values of T,
 plus two different null values.  'Maybe' in Haskell can do this.  (The
 two null values are called 'Nothing' and 'Just Nothing'.)
 'boost::optional' in C++ can do this.

 
 They can't store a nullable ref int. They'd have to store a nullable
 pointer to an int. Then wait, there exists a null pointer to an int.
 
 Andrei

May 23 2009

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.uucxvnxleav7ka steves.networkengines.com...
 On Fri, 22 May 2009 20:43:22 -0400, Lionello Lunesu 
 <lio lunesu.remove.com> wrote:

[snip]
 You don't need a pointer to T, you need a nullable T :)
 ...which doesn't exist...

 No, you need a pointer.  If T is a reference type, how do you distinguish 
 a null element from the end of the iteration?

If the container/range contains pointers then T would be a pointer, let's 
say S*, and the function would return a nullable-T, let's say nullable<S*> 
or somesuch.

Or what Andrei said: a nullable reference to T, which might be a nullable 
reference to a pointer when iterating over pointers.

L.

May 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the 
 system call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well 
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack 
 when it's probably just going to get copied anyways (although the 
 compiler can sometimes optimize this).

 We considered that as well. It is problematic because looking at 
 elements always entails copying them, which is rather silly if you do 
 it for e.g. an array.

 By golly, I kid you not but the interface I personally like the most is:

 struct ArchetypalInputRange(T)
 {
     T* popNext();
 }

 popNext returns a pointer to a T (the value may be reused) or null 
 when the range is done. Iteration becomes easy and efficient for all 
 types. An input range would still have to keep a buffer (and return a 
 pointer to it), but things are not awkward to implement.

 Given the bad reputation that pointers have, I guess people wouldn't 
 like this all that much.

 
 You don't need a pointer to T, you need a nullable T :)
 ...which doesn't exist...

You need a nullable reference to T...

Andrei

May 22 2009

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gv7j9t$m9g$2 digitalmars.com...
 Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:


[snip]
 Given the bad reputation that pointers have, I guess people wouldn't 
 like this all that much.

 You don't need a pointer to T, you need a nullable T :)
 ...which doesn't exist...

 You need a nullable reference to T...

Agreed. I just wanted to make a point (again) about nullable types. A 
nullable reference would be great, as long as it works for all types, 
builtins/classes/structs/... Without nullable references, SafeD will be a 
p.i.t.a....

L.

May 22 2009

D Programming

C/C++ Programming

Other

digitalmars.D - ideas about ranges