www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - ideas about ranges

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
The thread discussing what to do for input ranges vs. forward ranges got  
me thinking.

The range concept may be defined backwards in terms of which is more  
specialized.  Consider that an input range is always usable as a stream.   
But a stream is not easy to use as an input range (the range primitive).

Case in point, a file.  To fit into the most primitive range concept, it  
must define 3 functions:

front()
popFront()
empty()

empty is easy, it's just "am I at end of file"

But front is not so easy.  In order to know what's at the front, you need  
to read it.  And at that point you've altered the underlying file.

Look at the implementation of front and popFront for a possible FileByChar  
implementation:

dchar front()
{
   if(!bufferValid)
     popFront();
   return buffer;
}

void popFront()
{
    // read buffer from source, setting bufferValid if the range isn't  
empty by default
}

What sucks about this is, we have to introduce a buffer in the range, just  
because we can't look at data until we've popped it.  Not only that, but  
calling front a file before anything is read requires a check to fill the  
buffer in case we haven't read anything yet.  This could be alleviated by  
filling in the constructor, but it's still more complex than necessary.   
Consider also that the underlying stream might already be buffered, so we  
are buffering a buffer.

And finally, if you copy such a range, the buffer might be copied while  
the stream itself may not.  this could result in strange garbage data.

But since the primitives for input range are set by the compiler (it uses  
them to do foreach), we have to implement them to make our stream ranges  
friendly to foreach.

Round peg, meet square hole.

But what are the true requirements for iteration using foreach?

1. check if there's anything left
2. get the next element

Step 2 now is split into popFront and front.  So a foreach loop is a  
rewritten for loop like this:

foreach(x; range)
{
   ...
}

translates to:
{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
}

What if step 2 was one function?  Call it popNext(), and make it  
equivalent to calling _r.front() and popFront() in one step on ranges that  
implement that method.

How does this work with foreach?

{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
}

Basically, the same code, one less line.

Consider that any range defined today with front() and popFront() can  
implement popNext (and popNext could be an external function if we can get  
3015 resolved).

So what I think we may need is a different range primitive:

An iterable range defines: (name to be decided)

bool empty()
T popNext()

An input range is an iterable range that also defines:

T front();
popFront();

Now look at our FileByChar example as an iterable range:

T popNext()
{
    return source.get(); // no buffering required
}

And it works perfectly with the new foreach requirements.

And it correctly doesn't work with algorithms that require front and  
popFront.

-Steve
May 21 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

 
 The thread discussing what to do for input ranges vs. forward ranges got  
 me thinking.
 
 The range concept may be defined backwards in terms of which is more  
 specialized.  Consider that an input range is always usable as a stream.   
 But a stream is not easy to use as an input range (the range primitive).

 Case in point, a file.  To fit into the most primitive range concept, it  
 must define 3 functions:
 
 front()
 popFront()
 empty()
 
 empty is easy, it's just "am I at end of file"
 
 But front is not so easy.  In order to know what's at the front, you need  
 to read it.  And at that point you've altered the underlying file.
 
 Look at the implementation of front and popFront for a possible FileByChar  
 implementation:
 
 dchar front()
 {
    if(!bufferValid)
      popFront();
    return buffer;
 }
 
 void popFront()
 {
     // read buffer from source, setting bufferValid if the range isn't  
 empty by default
 }
 
 What sucks about this is, we have to introduce a buffer in the range, just  
 because we can't look at data until we've popped it.  Not only that, but  
 calling front a file before anything is read requires a check to fill the  
 buffer in case we haven't read anything yet.  This could be alleviated by  
 filling in the constructor, but it's still more complex than necessary.   
 Consider also that the underlying stream might already be buffered, so we  
 are buffering a buffer.
 
 And finally, if you copy such a range, the buffer might be copied while  
 the stream itself may not.  this could result in strange garbage data.
 
 But since the primitives for input range are set by the compiler (it uses  
 them to do foreach), we have to implement them to make our stream ranges  
 friendly to foreach.
 
 Round peg, meet square hole.
 
 But what are the true requirements for iteration using foreach?
 
 1. check if there's anything left
 2. get the next element
 
 Step 2 now is split into popFront and front.  So a foreach loop is a  
 rewritten for loop like this:
 
 foreach(x; range)
 {
    ...
 }
 
 translates to:
 {
    auto _r = range;
    while(!_r.empty)
    {
      auto x = _r.front();
      _r.popFront();
      ...
    }
 }
 
 What if step 2 was one function?  Call it popNext(), and make it  
 equivalent to calling _r.front() and popFront() in one step on ranges that  
 implement that method.
 
 How does this work with foreach?
 
 {
    auto _r = range;
    while(!_r.empty)
    {
      auto x = _r.popNext();
      ...
    }
 }
 
 Basically, the same code, one less line.
 
 Consider that any range defined today with front() and popFront() can  
 implement popNext (and popNext could be an external function if we can get  
 3015 resolved).
 
 So what I think we may need is a different range primitive:
 
 An iterable range defines: (name to be decided)
 
 bool empty()
 T popNext()
 
 An input range is an iterable range that also defines:
 
 T front();
 popFront();

This is exactly like what Andrei posted as his design. Your iterable range = Andrei's input range Your input range = Andrei's forward range
 Now look at our FileByChar example as an iterable range:
 
 T popNext()
 {
     return source.get(); // no buffering required
 }
 
 And it works perfectly with the new foreach requirements.
 
 And it correctly doesn't work with algorithms that require front and  
 popFront.
 
 -Steve

May 21 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Thu, 21 May 2009 17:40:27 -0400, Jason House 
 <jason.james.house gmail.com> wrote:
 
 Steven Schveighoffer Wrote:
 So what I think we may need is a different range primitive:

 An iterable range defines: (name to be decided)

 bool empty()
 T popNext()

 An input range is an iterable range that also defines:

 T front();
 popFront();

This is exactly like what Andrei posted as his design. Your iterable range = Andrei's input range Your input range = Andrei's forward range

If that's true, then I apologize, I was looking at the current docs, and Andrei's post about having some sort of "save" method. The current docs definitely are not the same as what I proposed. And in any case, my proposal requires a change to the compiler for foreach. I haven't seen that talked about yet. If that's in the works, then that's cool. As I've said before, I don't need credit, I just like to have the ideas knocked around. -Steve

There are quite a few subtle points about the discussion you started, and a few subtler subpoints still. I've had a busy day so I couldn't answer, but I'll try tomorrow. Andrei
May 21 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 21 May 2009 17:40:27 -0400, Jason House  
<jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:
 So what I think we may need is a different range primitive:

 An iterable range defines: (name to be decided)

 bool empty()
 T popNext()

 An input range is an iterable range that also defines:

 T front();
 popFront();

This is exactly like what Andrei posted as his design. Your iterable range = Andrei's input range Your input range = Andrei's forward range

If that's true, then I apologize, I was looking at the current docs, and Andrei's post about having some sort of "save" method. The current docs definitely are not the same as what I proposed. And in any case, my proposal requires a change to the compiler for foreach. I haven't seen that talked about yet. If that's in the works, then that's cool. As I've said before, I don't need credit, I just like to have the ideas knocked around. -Steve
May 21 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 
 The thread discussing what to do for input ranges vs. forward ranges got 
 me thinking.
 
 The range concept may be defined backwards in terms of which is more 
 specialized.  Consider that an input range is always usable as a 
 stream.  But a stream is not easy to use as an input range (the range 
 primitive).

The hierarchy of concepts is: input range < forward range < bidirectional range < random-access range Having length, having slicing, and infinity are orthogonal properties.
 Case in point, a file.  To fit into the most primitive range concept, it 
 must define 3 functions:
 
 front()
 popFront()
 empty()
 
 empty is easy, it's just "am I at end of file"
 
 But front is not so easy.  In order to know what's at the front, you 
 need to read it.  And at that point you've altered the underlying file.

Not even empty is easy. If you're defining a file range that gives you e.g. whitespace-separated integers, you can't have empty return feof(p) because there might be only spaces through the end of file. So you need to cache one element ahead to be able to implement empty(). I've discussed this with Walter for the longest time. The correct primitive for an input range that needs to do no caching is: Tuple!(T, "data", bool, "got") popNext(); When you call popNext it decides in the same place whether there is data, and also gives it to you. If "got" comes off as false, you know you're done. There are two issues with this form. It's not easy to use (even if you play around with the signature by e.g. passing data or bool by reference) and it's not easy to inline for non-input ranges. The second problem could possibly be eliminated, but the first stays. It's just too clunky a primitive to use, you'll need temporaries and all that stuff.
 Look at the implementation of front and popFront for a possible 
 FileByChar implementation:
 
 dchar front()
 {
   if(!bufferValid)
     popFront();
   return buffer;
 }
 
 void popFront()
 {
    // read buffer from source, setting bufferValid if the range isn't 
 empty by default
 }
 
 What sucks about this is, we have to introduce a buffer in the range, 
 just because we can't look at data until we've popped it.  Not only 
 that, but calling front a file before anything is read requires a check 
 to fill the buffer in case we haven't read anything yet.  This could be 
 alleviated by filling in the constructor, but it's still more complex 
 than necessary.  Consider also that the underlying stream might already 
 be buffered, so we are buffering a buffer.

Yah, I agree. I've implemented a few of those in Phobos. It would be good to have a more solid solution. This problem, however, also collides with returning an rvalue versus a ref. A container wants to return a ref. An input range wants to return by value. Then it's difficult to use a container as an input range.
 And finally, if you copy such a range, the buffer might be copied while 
 the stream itself may not.  this could result in strange garbage data.

I don't understand this. You could make sure copy does the right thing.
 But since the primitives for input range are set by the compiler (it 
 uses them to do foreach), we have to implement them to make our stream 
 ranges friendly to foreach.
 
 Round peg, meet square hole.
 
 But what are the true requirements for iteration using foreach?
 
 1. check if there's anything left
 2. get the next element
 
 Step 2 now is split into popFront and front.  So a foreach loop is a 
 rewritten for loop like this:
 
 foreach(x; range)
 {
   ...
 }
 
 translates to:
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
 }
 
 What if step 2 was one function?  Call it popNext(), and make it 
 equivalent to calling _r.front() and popFront() in one step on ranges 
 that implement that method.

This will not solve all problems. It will improve things like defining ranges that read one character at a time from a stream. But a function that does read-and-check-for-empty in one shot is the true solution.
 How does this work with foreach?
 
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
 }
 
 Basically, the same code, one less line.
 
 Consider that any range defined today with front() and popFront() can 
 implement popNext (and popNext could be an external function if we can 
 get 3015 resolved).
 
 So what I think we may need is a different range primitive:
 
 An iterable range defines: (name to be decided)
 
 bool empty()
 T popNext()
 
 An input range is an iterable range that also defines:
 
 T front();
 popFront();

I think you are using "iterable range" instead of "input range" and "input range" instead of "forward range". This is compared to STL terminology, which I borrowed.
 Now look at our FileByChar example as an iterable range:
 
 T popNext()
 {
    return source.get(); // no buffering required
 }
 
 And it works perfectly with the new foreach requirements.
 
 And it correctly doesn't work with algorithms that require front and 
 popFront.

That's great. Again, popNext integrated with check for empty is the best solution. The correct way to go is to define this: T popNext(ref bool gotSomething); and leave it to the discretion of the range if they prefer returning by reference: ref T popNext(ref bool gotSomething); (this might be good for many forward ranges.) Then we nicely ask Walter to allow inlining of such functions, and finally implement this in std.range: ref ElementType!R popNext(R)(R r, ref bool gotSomething) if (isForwardRange!R) { if (r.empty) { gotSomething = false; static typeof(return) dumbo; return dumbo; } auto result = &(r.front()); r.popFront(); gotSomething = true; return *result; } Admittedly this looks considerably messier than it ought to, and that's never a good sign. For starters, I could predict with accuracy the sneering remarks of certain posters who shall remain unnamed. Worse, they'd have a point (only) this one time :o). Messiness was one of the factors that made me decide to steer away from this design. A simpler solution is to just return by value: ElementType!R popNext(R)(R r, ref bool gotSomething) if (isForwardRange!R) { if (r.empty) { gotSomething = false; return typeof(return).init; } auto result = r.front; r.popFront(); return result; } This looks a tad more sane. But then it copies data more than it strictly should, and for whom? For everything that's not strictly a file input, which is most things you want to iterate! Wrong default. As of this time, I am undecided on what's the best way to go. Opinions are welcome. Andrei
May 22 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
[snip]
 In addition, without a buffer to worry about, the code becomes much 
 simpler.

I agree. Andrei
May 22 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system 
 call read() works:
 
 bool popNext(ref T);
 
 This has some benefits:
 
 1) you aren't checking a temporary for emptyness, so it fits well within 
 a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack when 
 it's probably just going to get copied anyways (although the compiler 
 can sometimes optimize this).

We considered that as well. It is problematic because looking at elements always entails copying them, which is rather silly if you do it for e.g. an array. By golly, I kid you not but the interface I personally like the most is: struct ArchetypalInputRange(T) { T* popNext(); } popNext returns a pointer to a T (the value may be reused) or null when the range is done. Iteration becomes easy and efficient for all types. An input range would still have to keep a buffer (and return a pointer to it), but things are not awkward to implement. Given the bad reputation that pointers have, I guess people wouldn't like this all that much.
 I'm not convinced that range is the best fit for ref'd data anyways, 
 opApply is much more adept at it.

If ranges aren't fit to mutate stuff, we failed. Many algorithms in std.algorithm mutate their data and I can't bring myself to think how I'd implement them all with opApply.
 How does implementation of popNext look?
 
 bool popNext(R)(R r, ref ElementType!R elem)
      if (isForwardRange!R)
 {
      if (r.empty)
          return false;
      elem = r.front();
      r.popFront();
      return true;
 }

It forces a copy whenever you want to take a look at something. Andrei
May 22 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying 
 pointer to do things like compare to null or rebind.  Maybe we simply 
 need some new operators to get at the ref addresses.

Defining null references has been on the table too. I wrote a paragraph about them and then deleted it in fear of major aggravation. There's no need for new syntax. We can easily define things such that you can return *null when a reference is expected and checking &(r.popNext()) is null to figure out what's happening. I think it's bad design though. Andrei
May 22 2009
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying
 pointer to do things like compare to null or rebind.  Maybe we simply
 need some new operators to get at the ref addresses.

Defining null references has been on the table too. I wrote a paragraph about them and then deleted it in fear of major aggravation. There's no need for new syntax. We can easily define things such that you can return *null when a reference is expected and checking &(r.popNext()) is null to figure out what's happening. I think it's bad design though. Andrei

Why not have something along these lines: bool isrefset(T)(ref T r) { return (&r !is null); } Then you don't have to dirty yourself by directly using pointers. :)
May 23 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If we 
 are to have streams that fit into the range model, then they need to be 
 foreach'able.  I don't know that they need a lot of support to feed into 
 std.algorithm as reference data.  I.e. you aren't going to sort a 
 network stream.

Plenty of algorithms work on input ranges. Andrei
May 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If 
 we are to have streams that fit into the range model, then they need 
 to be foreach'able.  I don't know that they need a lot of support to 
 feed into std.algorithm as reference data.  I.e. you aren't going to 
 sort a network stream.

Plenty of algorithms work on input ranges.

I'm confused, by input range you mean a stream? Because I'm operating under the assumption that an input range is anything that defines front, popFront, and empty. While you can shoehorn a stream into being an input range, they don't necessarily implement the required elements easily. We may be confusing terminology. Can you name an example of an input range that is a stream, and an algorithm that mutates the stream in place (thereby requiring ref elements)?

There isn't. All I'm saying is, we have the following constraints: 1. Any range should be seamlessly and efficiently used as an input range. This probably entails return by ref for efficiency. 2. Input ranges should define popNext to do at the same time reading, testing for empty, and advancing. That's it! Andrei
May 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input range.

This is the assumption I am challenging. I don't think you need this assumptions for ranges to work. You can always bolt input range functionality on top of a stream range if you want to treat a stream as an input range for some reason.

I believe there is indeed a terminology problem. To me, "input range" == "stream" == "socket" == "bridge that is sinking under your feet as you walk it". So to me there exists no "stream range". That to me is an "input range".
 But if foreach doesn't utilize the 
 popNext api, then streams require an unnecessary layer on top, just to 
 use foreach with them.

We can arrange that foreach uses popNext, but it must be worth it. Andrei
May 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input 
 range.

This is the assumption I am challenging. I don't think you need this assumptions for ranges to work. You can always bolt input range functionality on top of a stream range if you want to treat a stream as an input range for some reason.

I believe there is indeed a terminology problem. To me, "input range" == "stream" == "socket" == "bridge that is sinking under your feet as you walk it". So to me there exists no "stream range". That to me is an "input range".
 But if foreach doesn't utilize the popNext api, then streams require 
 an unnecessary layer on top, just to use foreach with them.

We can arrange that foreach uses popNext, but it must be worth it. Andrei

 On Fri, 22 May 2009 21:50:03 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input 
 range.

this assumptions for ranges to work. You can always bolt input range functionality on top of a stream range if you want to treat a stream as an input range for some reason.

I believe there is indeed a terminology problem. To me, "input range" == "stream" == "socket" == "bridge that is sinking under your feet as you walk it". So to me there exists no "stream range". That to me is an "input range".

OK, I can see you're not going to alter your opinion, so I'll stop debating. I respectfully disagree with your definitions.

I'm using STL's definitions. I'd be glad to depart if there was a solid reason. How would you name things?
 But if foreach doesn't utilize the popNext api, then streams require 
 an unnecessary layer on top, just to use foreach with them.

We can arrange that foreach uses popNext, but it must be worth it.

I can't say for certain that it is, but you definitely will know if when you start implementing ranges based on stuff like files, and things just seem difficult to implement or have unsolvable problems. I guess we wait and see.

One hopeful element is that there are usually more containers than streams/files. Besides, most streams are buffered. So maybe things aren't that bad. Andrei
May 22 2009
parent reply Steve Teale <steve.teale britseyeview.com> writes:
Steven Schveighoffer Wrote:

 On Fri, 22 May 2009 22:35:39 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
  OK, I can see you're not going to alter your opinion, so I'll stop  
 debating.  I respectfully disagree with your definitions.

I'm using STL's definitions. I'd be glad to depart if there was a solid reason. How would you name things?

I reread the chapter on iterators in my C++ book. I was convinced that I would see something to fuel my beliefs. I'm now convinced that ranges can stay as is. Here is why: I was focusing on a stream range as a more basic range than an input range. However, I wasn't thinking about the whole picture. There are more than just input streams. There are output streams and input/output streams. It makes sense to support all of them, not just input streams. So really, I'm thinking of a different tree of entities, which already exist: streams! That is, the unification of ranges and streams really isn't necessary. You can wrap a stream as an input, output, or forward range, and deal with the shoehorning (and buffer issues) when you really need it. I don't think viewing streams as ranges is going to be a useful idiom unless you need to convert from one world to the other. So trying to find a way to make files be ranges might be simply an overenthusiastic view of ranges, similar to how people try to make everything object oriented when they learn how OO design works. As far as foreach, we still have opApply for streams (or shoehorning if you need range-ification). However, out of all this may come one interesting idea: popNext (specifically the 'bool popNext(ref T val)' form). Note that there are a couple of advantages to using popNext with foreach: 1. Multiple variables, and easy overloading. i.e. foreach(int idx, T val; range) would be possible (and much less cumbersome than opApply!). 2. 1 function call per loop instead of 3. This may have a distinct advantage if you absolutely have to use a virtual method, or your range must call virtual methods on it's container. I still don't know how to make ref values work with popNext without using pointers. It would be nice to have a rebindable ref type. I thought of this syntax, does this seem appealing at all? int i = 5, j = 6; ref int ri = i; ri = 7; assert(i == 7); ri.& = &j; ri = 8; assert(j == 8); type of ref.& is *not* a pointer, the arithmetic operators are not defined for it, and it can't be assigned to a pointer (although it can be assigned from a pointer). You also need to be able to pass the type of ref.& to rebind references inside a popNext function. If we could solve these problems, we might have a better case of being able to get rid of opApply. -Steve

Steve, I'm glad to see that there are others out there who are equally skeptical about ranges.. Your point about Ranges being similar to an early view of OOP is a good one. Just because someone comes up with an interesting concept, it does not mean you need to use it all the time, or even ever if it does not sit well with you. Steve
May 26 2009
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 On Tue, 26 May 2009 13:45:05 -0400, Steve Teale
 <steve.teale britseyeview.com> wrote:
 Steve,

 I'm glad to see that there are others out there who are equally
 skeptical about ranges..

 Your point about Ranges being similar to an early view of OOP is a good
 one. Just because someone comes up with an interesting concept, it does
 not mean you need to use it all the time, or even ever if it does not
 sit well with you.

they are not exactly new. For example Java Iterators are essentially input ranges. However, STL algorithms applied to such concepts has not often been done (if ever before), probably because not many languages have such extensive template support. It's certainly an interesting pairing, and I think it will prove to be a great fit. On the skeptical side, I don't think ranges are a great fit for all cases where C++ iterators work well. Using iterators as a marker, for instance, is not handled well by a range. It seems to me that streams also don't fit well as a range (but certainly there needs to be some connector code between ranges and streams). So I guess, you wouldn't call me a skeptic, but I'm certainly not convinced that ranges are the solution to all problems :) -Steve

You may be right. D's ranges may be in some ways slightly less powerful than C++ iterators, for reasons you pointed out, but IMHO they're better even if they are. I believe that trying to make something flexible enough to handle 99-100% of use cases is a recipe for disaster, as whatever you come up with will be way too complex and difficult to use for the easiest 90-95% of cases. It's better to make it handle the easiest 90-95% of cases well, i.e. with as much conceptual simplicity and consistency and as little boilerplate as possible, and assume that people will roll their own custom solution for the really hard cases, since they probably will anyhow. In other words, worse really is better sometimes. I've programmed very little in C++, but during my forays into it, I've always felt that STL iterators were so mind-numbingly verbose and ugly that I really didn't want to use them no matter how flexible or efficient they were. They are so messy that ad-hoc roll-your-own solutions start to seem like a better idea for a lot of the simpler cases, and this is exactly what I usually did.
May 26 2009
prev sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system 
 call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well 
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack 
 when it's probably just going to get copied anyways (although the 
 compiler can sometimes optimize this).

We considered that as well. It is problematic because looking at elements always entails copying them, which is rather silly if you do it for e.g. an array. By golly, I kid you not but the interface I personally like the most is: struct ArchetypalInputRange(T) { T* popNext(); } popNext returns a pointer to a T (the value may be reused) or null when the range is done. Iteration becomes easy and efficient for all types. An input range would still have to keep a buffer (and return a pointer to it), but things are not awkward to implement. Given the bad reputation that pointers have, I guess people wouldn't like this all that much.

You don't need a pointer to T, you need a nullable T :) ...which doesn't exist... L.
May 22 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer wrote:

 No, you need a pointer.  If T is a reference type, how do you distinguish
 a null element from the end of the iteration?

Do you have any common cases where a range would generate nulls as part of its normal output? I think using the natural null feature of reference types is a clean and reasonable solution.
May 22 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Steven Schveighoffer wrote:
 
 No, you need a pointer.  If T is a reference type, how do you distinguish
 a null element from the end of the iteration?

Do you have any common cases where a range would generate nulls as part of its normal output? I think using the natural null feature of reference types is a clean and reasonable solution.

You could have a vector containing plenty of null references. Andrei
May 22 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Jason House wrote:
 Do you have any common cases where a range would generate nulls as part of 
 its normal output?  I think using the natural null feature of reference 
 types is a clean and reasonable solution.

Iterating over an array that contains nulls? Nullable!(T) must be able to hold all values of T, plus null. By inference, Nullable!(Nullable!(T)) must be able to hold all values of T, plus two different null values. 'Maybe' in Haskell can do this. (The two null values are called 'Nothing' and 'Just Nothing'.) 'boost::optional' in C++ can do this. -- Rainer Deyke - rainerd eldwood.com
May 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
 Jason House wrote:
 Do you have any common cases where a range would generate nulls as part of 
 its normal output?  I think using the natural null feature of reference 
 types is a clean and reasonable solution.

Iterating over an array that contains nulls? Nullable!(T) must be able to hold all values of T, plus null. By inference, Nullable!(Nullable!(T)) must be able to hold all values of T, plus two different null values. 'Maybe' in Haskell can do this. (The two null values are called 'Nothing' and 'Just Nothing'.) 'boost::optional' in C++ can do this.

They can't store a nullable ref int. They'd have to store a nullable pointer to an int. Then wait, there exists a null pointer to an int. Andrei
May 22 2009
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 Rainer Deyke wrote:
 Jason House wrote:
 Do you have any common cases where a range would generate nulls as
 part of its normal output?  I think using the natural null feature of
 reference types is a clean and reasonable solution.

Iterating over an array that contains nulls? Nullable!(T) must be able to hold all values of T, plus null. By inference, Nullable!(Nullable!(T)) must be able to hold all values of T, plus two different null values. 'Maybe' in Haskell can do this. (The two null values are called 'Nothing' and 'Just Nothing'.) 'boost::optional' in C++ can do this.

They can't store a nullable ref int. They'd have to store a nullable pointer to an int. Then wait, there exists a null pointer to an int. Andrei

The way C# solves this is to have IsSet and Get properties or something.
May 23 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the 
 system call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well 
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack 
 when it's probably just going to get copied anyways (although the 
 compiler can sometimes optimize this).

We considered that as well. It is problematic because looking at elements always entails copying them, which is rather silly if you do it for e.g. an array. By golly, I kid you not but the interface I personally like the most is: struct ArchetypalInputRange(T) { T* popNext(); } popNext returns a pointer to a T (the value may be reused) or null when the range is done. Iteration becomes easy and efficient for all types. An input range would still have to keep a buffer (and return a pointer to it), but things are not awkward to implement. Given the bad reputation that pointers have, I guess people wouldn't like this all that much.

You don't need a pointer to T, you need a nullable T :) ...which doesn't exist...

You need a nullable reference to T... Andrei
May 22 2009
parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gv7j9t$m9g$2 digitalmars.com...
 Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:


 Given the bad reputation that pointers have, I guess people wouldn't 
 like this all that much.

You don't need a pointer to T, you need a nullable T :) ...which doesn't exist...

You need a nullable reference to T...

Agreed. I just wanted to make a point (again) about nullable types. A nullable reference would be great, as long as it works for all types, builtins/classes/structs/... Without nullable references, SafeD will be a p.i.t.a.... L.
May 22 2009
prev sibling parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.uucxvnxleav7ka steves.networkengines.com...
 On Fri, 22 May 2009 20:43:22 -0400, Lionello Lunesu 
 <lio lunesu.remove.com> wrote:

 You don't need a pointer to T, you need a nullable T :)
 ...which doesn't exist...

No, you need a pointer. If T is a reference type, how do you distinguish a null element from the end of the iteration?

If the container/range contains pointers then T would be a pointer, let's say S*, and the function would return a nullable-T, let's say nullable<S*> or somesuch. Or what Andrei said: a nullable reference to T, which might be a nullable reference to a pointer when iterating over pointers. L.
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 11:48:34 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 And finally, if you copy such a range, the buffer might be copied while  
 the stream itself may not.  this could result in strange garbage data.

I don't understand this. You could make sure copy does the right thing.

I'll respond to this one point separately. consider you have a range with a statically defined buffer (not a heap-allocated): struct range { uint buf; FILE *source; } Let's assume the data in the source reads 1,2,3,4,5 The buffer contains 1, and 2,3,4,5 is still on the source Without any alterations, copying the range copies the buffer, so now one instance of the range will read correctly, say: 1,2,3 and the copy will read: 1,4,5 Now, let's say when copying, we kill the buffer from the source range, re-run our example, and it now correctly reads: 1,2,3 4,5 But now, you might reorder the data in a strange way: auto r2 = r1; // r1 now has no buffer, r2 contains '1' output(take(r1, 3)); output(take(r2,2)); outputs: 2,3,4 1,5 This can be a problem if you are depending on order. This isn't a likely scenario, but without a static buffer per range, you don't have any reordering issues. I just think having weird bugs like that will be troublesome in certain cases. Calling it 'expected behavior' is not going to help matters. In addition, without a buffer to worry about, the code becomes much simpler. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 11:48:34 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
  The thread discussing what to do for input ranges vs. forward ranges  
 got me thinking.
  The range concept may be defined backwards in terms of which is more  
 specialized.  Consider that an input range is always usable as a  
 stream.  But a stream is not easy to use as an input range (the range  
 primitive).

The hierarchy of concepts is: input range < forward range < bidirectional range < random-access range Having length, having slicing, and infinity are orthogonal properties.

Right, a stream doesn't fit really well within any of those categories, because it's even more simple than an input range. That was my point. You have to bolt functionality and a buffer on top of it to get it to fit the model. Maybe this is inevitable, I was wondering if it would be easier to deal with if the range paradigm had another simpler construct than input range.
 Case in point, a file.  To fit into the most primitive range concept,  
 it must define 3 functions:
  front()
 popFront()
 empty()
  empty is easy, it's just "am I at end of file"
  But front is not so easy.  In order to know what's at the front, you  
 need to read it.  And at that point you've altered the underlying file.

Not even empty is easy. If you're defining a file range that gives you e.g. whitespace-separated integers, you can't have empty return feof(p) because there might be only spaces through the end of file. So you need to cache one element ahead to be able to implement empty().

Even a stream that does nothing but read straight characters, but has no idea how long the stream is, such as stdin, cannot specify empty without trying to read more data. I see your point.
 I've discussed this with Walter for the longest time. The correct  
 primitive for an input range that needs to do no caching is:

 Tuple!(T, "data", bool, "got") popNext();

 When you call popNext it decides in the same place whether there is  
 data, and also gives it to you. If "got" comes off as false, you know  
 you're done.

Yes, I see now that my model won't cut it. I had a feeling that my model wasn't quite correct, but as you say, yours still is clunky. Looking at other iteration models: C#: similar to D ranges, but moveNext (a.k.a. popFront) is required to start the enumeration. (yuck!) Java: This is similar to what I was proposing: only has "get next element" and "has more elements". I'm not sure if the right answer has been found by any of these yet. I can feel there is a solution that should be simple, and accurately reflect the underlying objects' API, but I have to think more about it.
  What sucks about this is, we have to introduce a buffer in the range,  
 just because we can't look at data until we've popped it.  Not only  
 that, but calling front a file before anything is read requires a check  
 to fill the buffer in case we haven't read anything yet.  This could be  
 alleviated by filling in the constructor, but it's still more complex  
 than necessary.  Consider also that the underlying stream might already  
 be buffered, so we are buffering a buffer.

Yah, I agree. I've implemented a few of those in Phobos. It would be good to have a more solid solution. This problem, however, also collides with returning an rvalue versus a ref. A container wants to return a ref. An input range wants to return by value. Then it's difficult to use a container as an input range.

don't you mean the other way around? That is, it's easy to convert a ref into a value, but not vice versa.
 But since the primitives for input range are set by the compiler (it  
 uses them to do foreach), we have to implement them to make our stream  
 ranges friendly to foreach.
  Round peg, meet square hole.
  But what are the true requirements for iteration using foreach?
  1. check if there's anything left
 2. get the next element
  Step 2 now is split into popFront and front.  So a foreach loop is a  
 rewritten for loop like this:
  foreach(x; range)
 {
   ...
 }
  translates to:
 {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
 }
  What if step 2 was one function?  Call it popNext(), and make it  
 equivalent to calling _r.front() and popFront() in one step on ranges  
 that implement that method.

This will not solve all problems. It will improve things like defining ranges that read one character at a time from a stream. But a function that does read-and-check-for-empty in one shot is the true solution.

Yes, agreed. I was focusing on not having to keep around transient data to avoid reordering or caching issues. But it does need to be all-in-one as you have shown.
 How does this work with foreach?
  {
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
 }
  Basically, the same code, one less line.
  Consider that any range defined today with front() and popFront() can  
 implement popNext (and popNext could be an external function if we can  
 get 3015 resolved).
  So what I think we may need is a different range primitive:
  An iterable range defines: (name to be decided)
  bool empty()
 T popNext()
  An input range is an iterable range that also defines:
  T front();
 popFront();

I think you are using "iterable range" instead of "input range" and "input range" instead of "forward range". This is compared to STL terminology, which I borrowed.

I don't know what the terminology should be. I was using your terminology. when I said input range, I meant input range as you defined it on your std.range documentation. An input range has the following methods: T front() void popFront() bool empty() and by composition: T popNext() (this wasn't in the docs, but it's something that we're throwing around here) An "iterable" range, or one that does not have persistent access to elements and must modify the range and get data at the same time, was supposed to be a subset of that. I was just defining it as T popNext() and empty(), but it looks like those have to be combined into one function as you say.
 (this might be good for many forward ranges.) Then we nicely ask Walter  
 to allow inlining of such functions, and finally implement this in  
 std.range:

 ref ElementType!R popNext(R)(R r, ref bool gotSomething)
      if (isForwardRange!R)
 {
      if (r.empty)
      {
          gotSomething = false;
          static typeof(return) dumbo;
          return dumbo;
      }
      auto result = &(r.front());
      r.popFront();
      gotSomething = true;
      return *result;
 }

 Admittedly this looks considerably messier than it ought to, and that's  
 never a good sign. For starters, I could predict with accuracy the  
 sneering remarks of certain posters who shall remain unnamed. Worse,  
 they'd have a point (only) this one time :o). Messiness was one of the  
 factors that made me decide to steer away from this design. A simpler  
 solution is to just return by value:

Yeah, it's ugly. Dealing with orthogonal returns is not a graceful proposition. Another idea is to make the T the ref'd arg, similar to how the system call read() works: bool popNext(ref T); This has some benefits: 1) you aren't checking a temporary for emptyness, so it fits well within a loop construct 2) you can avoid returning a dummy element in the empty range case. 3) you avoid returning a piece of possibly large data on the stack when it's probably just going to get copied anyways (although the compiler can sometimes optimize this). One thing this does though, is remove the possibility of returning a reference to an element. Not sure if this is a bad thing, since you can always convert a reference to a value, and when operating in this mode, you are working with the lowest common denominator (as everything has to be able to implement this). However, with foreach, you may want to have references sometimes. For that, I'd guess that foreach might have to support 3 apis: opApply, range, and stream (popNext API). I'm not convinced that range is the best fit for ref'd data anyways, opApply is much more adept at it. How does implementation of popNext look? bool popNext(R)(R r, ref ElementType!R elem) if (isForwardRange!R) { if (r.empty) return false; elem = r.front(); r.popFront(); return true; } usage: R r; for(ElementType!R x; r.popNext(x);) { // use x } vs. your idea for popNext: R r; bool done; // ... geez, I don't even know an easy way to do this??? does this work? while(ref x = r.popNext(done), done) { // use x } Still thinking, may have more ideas... How do we get a reference to the element... -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 15:55:52 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system  
 call read() works:
  bool popNext(ref T);
  This has some benefits:
  1) you aren't checking a temporary for emptyness, so it fits well  
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack when  
 it's probably just going to get copied anyways (although the compiler  
 can sometimes optimize this).

We considered that as well. It is problematic because looking at elements always entails copying them, which is rather silly if you do it for e.g. an array.

This is why I was suggesting that foreach has to support both the simple popNext API and the range API. Which sucks, but I can't see a way around it without using pointers...
 By golly, I kid you not but the interface I personally like the most is:

 struct ArchetypalInputRange(T)
 {
      T* popNext();
 }

 popNext returns a pointer to a T (the value may be reused) or null when  
 the range is done. Iteration becomes easy and efficient for all types.  
 An input range would still have to keep a buffer (and return a pointer  
 to it), but things are not awkward to implement.

 Given the bad reputation that pointers have, I guess people wouldn't  
 like this all that much.

Then maybe we have to come up with a better pointer... I kind of like the way classes work for this. Comparing them to null does what you want it to do, but doing anything else with it forwards to the underlying data. It's rebindable, and you can't do arithmetic on the pointer. ref accomplishes all of this, except you can't get at the underlying pointer to do things like compare to null or rebind. Maybe we simply need some new operators to get at the ref addresses. ref T popNext(); ref T t; // initialized to null while((t =& popNext()) ==& null) ... Still doesn't look as good as: while(auto t = popNext()) Hm.. what about returning an array of 0 or one element? (ugh). What about a type which acts like a class reference for simple data types: struct rref(T) { prviate T *data; ref T opStar() {return *data;} ref T opDot() {return *data;} void opAssign(ref T t) {data = &t;} } Need some compiler help to allow comparing to null, probably want this to be a builtin type, and have some better syntax. BTW, I like your solution the best too. I thought of using pointers, but decided against it because of the taboo factor. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 16:40:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 ref accomplishes all of this, except you can't get at the underlying  
 pointer to do things like compare to null or rebind.  Maybe we simply  
 need some new operators to get at the ref addresses.

Defining null references has been on the table too. I wrote a paragraph about them and then deleted it in fear of major aggravation. There's no need for new syntax. We can easily define things such that you can return *null when a reference is expected and checking &(r.popNext()) is null to figure out what's happening. I think it's bad design though.

&(r.popNext()) still gets you back to pointer-land. Might as well return a pointer. Also, I don't even think this works, because you can either check for null or get the data, but not both: ref T t; // syntax error, no? while(&(t = r.popNext()) !is null) { } doesn't work because you're not rebinding t. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 15:55:52 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:

 I'm not convinced that range is the best fit for ref'd data anyways,  
 opApply is much more adept at it.

If ranges aren't fit to mutate stuff, we failed. Many algorithms in std.algorithm mutate their data and I can't bring myself to think how I'd implement them all with opApply.

I misspoke, what I meant was maybe range isn't the best fit for *ref iteration using foreach*. Of course, having ranges that can ref data is important for other uses of ranges. So would it be acceptable to change foreach to use either the popNext api or the opApply api, and not have foreach support the input range api? That was my ponder.
 How does implementation of popNext look?
  bool popNext(R)(R r, ref ElementType!R elem)
      if (isForwardRange!R)
 {
      if (r.empty)
          return false;
      elem = r.front();
      r.popFront();
      return true;
 }

It forces a copy whenever you want to take a look at something.

Yes, and in the case of: foreach(x; r) You are going to do copying. This is what I'm guessing would be the standard model for getting data from a stream using foreach (it is now, with opApply). In the case of: foreach(ref x; r) you *need* a reference, so at this point you'd need to use the opApply or input-range style (front, popfront, empty) api to run foreach. If streams don't support ref data (which they shouldn't IMO), this is a compiler error. I'm concentrating mostly on usages with foreach, not algorithms. If we are to have streams that fit into the range model, then they need to be foreach'able. I don't know that they need a lot of support to feed into std.algorithm as reference data. I.e. you aren't going to sort a network stream. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 20:43:22 -0400, Lionello Lunesu  
<lio lunesu.remove.com> wrote:

 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 Another idea is to make the T the ref'd arg, similar to how the system  
 call read() works:

 bool popNext(ref T);

 This has some benefits:

 1) you aren't checking a temporary for emptyness, so it fits well  
 within a loop construct
 2) you can avoid returning a dummy element in the empty range case.
 3) you avoid returning a piece of possibly large data on the stack  
 when it's probably just going to get copied anyways (although the  
 compiler can sometimes optimize this).

elements always entails copying them, which is rather silly if you do it for e.g. an array. By golly, I kid you not but the interface I personally like the most is: struct ArchetypalInputRange(T) { T* popNext(); } popNext returns a pointer to a T (the value may be reused) or null when the range is done. Iteration becomes easy and efficient for all types. An input range would still have to keep a buffer (and return a pointer to it), but things are not awkward to implement. Given the bad reputation that pointers have, I guess people wouldn't like this all that much.

You don't need a pointer to T, you need a nullable T :) ...which doesn't exist...

No, you need a pointer. If T is a reference type, how do you distinguish a null element from the end of the iteration? -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If we  
 are to have streams that fit into the range model, then they need to be  
 foreach'able.  I don't know that they need a lot of support to feed  
 into std.algorithm as reference data.  I.e. you aren't going to sort a  
 network stream.

Plenty of algorithms work on input ranges.

I'm confused, by input range you mean a stream? Because I'm operating under the assumption that an input range is anything that defines front, popFront, and empty. While you can shoehorn a stream into being an input range, they don't necessarily implement the required elements easily. We may be confusing terminology. Can you name an example of an input range that is a stream, and an algorithm that mutates the stream in place (thereby requiring ref elements)? -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 17:10:45 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 I'm concentrating mostly on usages with foreach, not algorithms.  If  
 we are to have streams that fit into the range model, then they need  
 to be foreach'able.  I don't know that they need a lot of support to  
 feed into std.algorithm as reference data.  I.e. you aren't going to  
 sort a network stream.

Plenty of algorithms work on input ranges.

under the assumption that an input range is anything that defines front, popFront, and empty. While you can shoehorn a stream into being an input range, they don't necessarily implement the required elements easily. We may be confusing terminology. Can you name an example of an input range that is a stream, and an algorithm that mutates the stream in place (thereby requiring ref elements)?

There isn't. All I'm saying is, we have the following constraints: 1. Any range should be seamlessly and efficiently used as an input range.

This is the assumption I am challenging. I don't think you need this assumptions for ranges to work. You can always bolt input range functionality on top of a stream range if you want to treat a stream as an input range for some reason. But if foreach doesn't utilize the popNext api, then streams require an unnecessary layer on top, just to use foreach with them. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 21:50:03 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Fri, 22 May 2009 21:22:55 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 1. Any range should be seamlessly and efficiently used as an input  
 range.

assumptions for ranges to work. You can always bolt input range functionality on top of a stream range if you want to treat a stream as an input range for some reason.

I believe there is indeed a terminology problem. To me, "input range" == "stream" == "socket" == "bridge that is sinking under your feet as you walk it". So to me there exists no "stream range". That to me is an "input range".

OK, I can see you're not going to alter your opinion, so I'll stop debating. I respectfully disagree with your definitions.
 But if foreach doesn't utilize the popNext api, then streams require an  
 unnecessary layer on top, just to use foreach with them.

We can arrange that foreach uses popNext, but it must be worth it.

I can't say for certain that it is, but you definitely will know if when you start implementing ranges based on stuff like files, and things just seem difficult to implement or have unsolvable problems. I guess we wait and see. -Steve
May 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 May 2009 22:35:39 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
  OK, I can see you're not going to alter your opinion, so I'll stop  
 debating.  I respectfully disagree with your definitions.

I'm using STL's definitions. I'd be glad to depart if there was a solid reason. How would you name things?

I reread the chapter on iterators in my C++ book. I was convinced that I would see something to fuel my beliefs. I'm now convinced that ranges can stay as is. Here is why: I was focusing on a stream range as a more basic range than an input range. However, I wasn't thinking about the whole picture. There are more than just input streams. There are output streams and input/output streams. It makes sense to support all of them, not just input streams. So really, I'm thinking of a different tree of entities, which already exist: streams! That is, the unification of ranges and streams really isn't necessary. You can wrap a stream as an input, output, or forward range, and deal with the shoehorning (and buffer issues) when you really need it. I don't think viewing streams as ranges is going to be a useful idiom unless you need to convert from one world to the other. So trying to find a way to make files be ranges might be simply an overenthusiastic view of ranges, similar to how people try to make everything object oriented when they learn how OO design works. As far as foreach, we still have opApply for streams (or shoehorning if you need range-ification). However, out of all this may come one interesting idea: popNext (specifically the 'bool popNext(ref T val)' form). Note that there are a couple of advantages to using popNext with foreach: 1. Multiple variables, and easy overloading. i.e. foreach(int idx, T val; range) would be possible (and much less cumbersome than opApply!). 2. 1 function call per loop instead of 3. This may have a distinct advantage if you absolutely have to use a virtual method, or your range must call virtual methods on it's container. I still don't know how to make ref values work with popNext without using pointers. It would be nice to have a rebindable ref type. I thought of this syntax, does this seem appealing at all? int i = 5, j = 6; ref int ri = i; ri = 7; assert(i == 7); ri.& = &j; ri = 8; assert(j == 8); type of ref.& is *not* a pointer, the arithmetic operators are not defined for it, and it can't be assigned to a pointer (although it can be assigned from a pointer). You also need to be able to pass the type of ref.& to rebind references inside a popNext function. If we could solve these problems, we might have a better case of being able to get rid of opApply. -Steve
May 26 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 May 2009 13:45:05 -0400, Steve Teale  
<steve.teale britseyeview.com> wrote:

 Steve,

 I'm glad to see that there are others out there who are equally  
 skeptical about ranges..

 Your point about Ranges being similar to an early view of OOP is a good  
 one. Just because someone comes up with an interesting concept, it does  
 not mean you need to use it all the time, or even ever if it does not  
 sit well with you.

Ranges to me are a great concept. They are safer than C++ iterators, but they are not exactly new. For example Java Iterators are essentially input ranges. However, STL algorithms applied to such concepts has not often been done (if ever before), probably because not many languages have such extensive template support. It's certainly an interesting pairing, and I think it will prove to be a great fit. On the skeptical side, I don't think ranges are a great fit for all cases where C++ iterators work well. Using iterators as a marker, for instance, is not handled well by a range. It seems to me that streams also don't fit well as a range (but certainly there needs to be some connector code between ranges and streams). So I guess, you wouldn't call me a skeptic, but I'm certainly not convinced that ranges are the solution to all problems :) -Steve
May 26 2009