digitalmars.D - "the last change" for ranges

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
In the wake of a few discussions I've witnessed, I'm thinking of one last 
change for ranges. (In fact there's one more, but that's minor.)

The problem is that input ranges and forward ranges have the same 
syntactic interface, but different semantic interfaces. Consider the 
problem of finding the first two identical adjacent items in a range:

R adjacentFind(R)(R r)
{
     if (r.empty) return r;
     R last = r;
     r.popFront;
     for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
     {
     }
     return r;
}

This will work properly on lists and vectors, but horrendously on files 
and sockets. This is because input ranges can't be saved for later use: 
last is merely a copy of r that refers to the same underlying stream, so 
advancing r also advances last and essentially forces both to look at 
the same current value.

I'm thinking a better design is to require any range that's forward or 
better to define a function save(). Ranges that don't implement it are 
input ranges; those that do, will guarantee a brand new range is 
returned from save(). So then adjacentFind would look like this:

R adjacentFind(R)(R r)
{
     if (r.empty) return r;
     R last = r.save;
     r.popFront;
     for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
     {
     }
     return r;
}

Obviously, when you pass a range that doesn't support save, adjacentFind 
will not compile, which is what we want.
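
A minimal sketch of the save-based version as it could be compiled against a present-day Phobos. It assumes std.range.primitives supplies isForwardRange and a save for arrays; the template constraint, the name adjacentFindSaved and the OneShot type are illustrative additions that do not appear in the proposal itself.

```d
import std.range.primitives : empty, front, popFront, save, isForwardRange;

// Returns the range positioned at the second element of the first pair of
// equal adjacent elements, or an empty range if there is no such pair.
R adjacentFindSaved(R)(R r)
if (isForwardRange!R)
{
    if (r.empty) return r;
    R last = r.save;   // independent snapshot of the iteration state
    r.popFront();
    for (; !r.empty && last.front != r.front; last.popFront(), r.popFront())
    {
    }
    return r;
}

// Hypothetical input-only range: it has empty/front/popFront but no save.
struct OneShot
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Without save the call is rejected at compile time.
static assert(!isForwardRange!OneShot);
static assert(!__traits(compiles, adjacentFindSaved(OneShot.init)));
```

With the constraint in place the rejection shows up as a failed template match rather than as an error inside the function body.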

Andrei

P.S. There is a way to implement adjacentFind for forward ranges by 
saving data instead of ranges. I've used a limited version above for 
illustration purposes.
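
One way the data-saving variant mentioned in the P.S. might look. This is only a sketch: the name adjacentFindByValue and the SinglePass type are made up for illustration, and it assumes that a copied element stays valid after popFront (which is not true of every input range, for example ones that reuse a buffer).

```d
import std.range.primitives : empty, front, popFront, isInputRange;

// Remembers the previous element instead of a second range, so it needs
// nothing beyond the input range primitives.
R adjacentFindByValue(R)(R r)
if (isInputRange!R)
{
    if (r.empty) return r;
    auto previous = r.front;   // copy the element, not the range
    r.popFront();
    while (!r.empty && previous != r.front)
    {
        previous = r.front;
        r.popFront();
    }
    return r;   // positioned at the second element of the equal pair
}

// Hypothetical range without save, standing in for a file or socket.
struct SinglePass
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}
```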
May 20 2009
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
Why not r.dup?
May 20 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Denis Koroskin (2korden gmail.com)'s article
 Why not r.dup?

.dup is supposed to imply copying of the range's contents, not copying of the range's iteration state.
May 20 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Denis Koroskin (2korden gmail.com)'s article
 Why not r.dup?

.dup is supposed to imply copying of the range's contents, not copying of the range's iteration state.

Yes, for arrays save() is:

T[] save(T)(T[] r) { return r; }

Andrei
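
The difference between the two can be checked on a plain array. The sketch below assumes the array save from std.range.primitives in current Phobos, which behaves like the one-liner quoted in this post.

```d
import std.range.primitives : front, popFront, save;

unittest
{
    int[] a = [1, 2, 3];

    // save copies only the slice (pointer and length): a new iteration state.
    int[] s = a.save;
    s.popFront();
    assert(a.front == 1 && s.front == 2);

    // Both slices still view the same elements.
    s[0] = 42;
    assert(a == [1, 42, 3]);

    // dup copies the elements themselves.
    int[] d = a.dup;
    d[0] = 7;
    assert(a[0] == 1 && d[0] == 7);
}
```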
May 20 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 20 May 2009 20:23:27 +0400, Denis Koroskin <2korden gmail.com> wrote:

 Why not r.dup?

Never mind, I don't want to turn it into a bicycle shed discussion, but .dup would be consistent with arrays.

Do you suggest that ranges now have reference semantics? That's quite a big change, but I do believe that it's for the good, because it makes classes rightful range citizens.
May 20 2009
prev sibling next sibling parent BLS <windevguy hotmail.de> writes:
I REALLY hope that ranges will have some room for a "Let's have a closer look" chapter in your book. Sometimes I found them quite hard to understand.

Björn
May 20 2009
prev sibling next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 20, 2009 at 9:19 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 I'm thinking a better design is to require any range that's forward or
 better to define a function save(). Ranges that don't implement it are input
 ranges; those that do, will guarantee a brand new range is returned from
 save(). So then adjacentFind would look like this:

 R adjacentFind(R)(R r)
 {
     if (r.empty) return r;
     R last = r.save;
     r.popFront;
     for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
     {
     }
     return r;
 }

 Obviously, when you pass a range that doesn't support save, adjacentFind
 will not compile, which is what we want.

The only other alternative that comes to mind would be forcing input ranges to hide their copy constructor, or whatever the D equivalent is, making R last = r; fail. But that would make input ranges very difficult to use.

So, of those two options at least, requiring a .save sounds like the better choice.

The down side is you will get no error if you write the code the first way, without a .save. I see this as turning into tip #5 in "Effective D" -- "Know when to use .save". It would be nice if that potential mistake could be eliminated somehow. You could perhaps require input ranges to implement transfer semantics, and have them implement a .clone for cases when you really do want to make an aliasing copy.

--bb
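
For illustration, this is roughly what the rejected "hide the copy" alternative looks like in present-day D, where the closest equivalent is a disabled postblit. That mapping is an assumption, since the thread does not name a mechanism, and NoCopyInput, sum and takesByValue are hypothetical names.

```d
// Hypothetical input range whose copies are forbidden.
struct NoCopyInput
{
    int[] data;
    this(int[] d) { data = d; }
    @disable this(this);   // no copies allowed

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Has to take the range by reference, because by value would need a copy.
int sum(ref NoCopyInput r)
{
    int total = 0;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

void takesByValue(NoCopyInput r) {}

// `auto last = r;` is now a compile-time error...
static assert(!__traits(compiles, { auto r = NoCopyInput([1]); auto last = r; }));
// ...but so is handing an existing range to a by-value parameter.
static assert(!__traits(compiles, { auto r = NoCopyInput([1]); takesByValue(r); }));
```

The second static assert shows the usability cost: every consumer has to be written to accept the range by reference.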
May 20 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 The only other alternative that comes to mind would be forcing input
 ranges to hide their copy constructor, or whatever the D equivalent
 is, making R last = r; fail. But that would make input ranges very
 difficult to use.

Exactly. I thought of that design, and it was difficult to even pass a range to a function.

 So, of those two options at least, requiring a .save sounds like the
 better choice.
 
 The down side is you will get no error if you write the code the first
 way, without a .save.   I see this as turning into tip #5 in
 "Effective D" -- "Know when to use .save"   It would be nice if that
 potential mistake could be eliminated somehow.  You could perhaps
 require input ranges to implement transfer semantics, and have them
 implement a .clone for cases when you really do want to make an
 aliasing copy.

Good point. I don't have a solution for that. Giving ranges move semantics would probably make for another Effective D tip (or perhaps more... move semantics are pretty brutal).

Another partial solution is to define a different interface for input ranges, one that combines front() and popFront(). Something like popNext. That way, people who use only the primitives empty() and popNext() know they are using an input range, and with hope they'll remember they can't really save copies of it and expect them to "remember" where they are in the input.

Andrei
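
A rough sketch of what an empty()/popNext() interface might look like. Only the primitive names empty and popNext come from the post; the WordSource type and the drain helper are hypothetical.

```d
// Hypothetical one-pass source exposing only empty and popNext.
struct WordSource
{
    string[] pending;

    @property bool empty() const { return pending.length == 0; }

    // Returns the current element and advances past it in a single step,
    // so there is no separate front to come back to later.
    string popNext()
    {
        string head = pending[0];
        pending = pending[1 .. $];
        return head;
    }
}

// Collects everything that is left in the source, consuming it.
string[] drain(ref WordSource src)
{
    string[] result;
    while (!src.empty)
        result ~= src.popNext();
    return result;
}
```

Because reading and advancing are a single call, code written against this interface cannot accidentally rely on front staying put while a copy is advanced.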
May 20 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Robert Jacques wrote:
 Bicycle shed: Well, since output ranges use 'put', how about 'get' for 
 input ranges?

Nice color :o). In fact, "put" is a poor choice because it doesn't reflect advancement. Probably putNext and getNext are better.

Andrei
May 20 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
I feel like there are too many differences between input and forward ranges for
such a minor difference. Many range functions are written assuming no side
effects on the caller. This can restrict the use of helper functions. It may be
best to make their usage different... 

May 20 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any ranges should be seamlessly and efficiently treated as input ranges.

Andrei
May 20 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 So how do you think we should go about it? Also don't forget that any
 ranges should be seamlessly and efficiently treated as input ranges.

You won't like my answer!

Like you've already said, the semantics of forward ranges and input ranges are different. I would argue that forward ranges have value semantics but input ranges do not. Any function that implicitly assumes value semantics is wrong. Sadly, overlapping APIs make it all too easy for someone to write bad code that passes simplistic tests with forward ranges and then fails with input ranges.

My initial thought is that input ranges should have two changes:
1. A different API from forward ranges
2. Be a reference type (AKA class instead of struct)

In short, I disagree with your basic premise of treating the two ranges similarly.
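
The value-versus-reference point can be made concrete with a small sketch. The Stream class and countAll helper are hypothetical; Stream only imitates a one-pass source by being a reference type.

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical reference-type range: every copy shares one cursor.
final class Stream
{
    int[] data;
    this(int[] d) { data = d; }
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Written as if consuming its by-value parameter had no effect on the caller.
size_t countAll(R)(R r)
{
    size_t n = 0;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

unittest
{
    int[] a = [1, 2, 3];
    assert(countAll(a) == 3);
    assert(a.length == 3);   // the caller's slice is untouched

    auto s = new Stream([1, 2, 3]);
    assert(countAll(s) == 3);
    assert(s.empty);         // the caller's stream has been drained
}
```

The same generic function passes a test with an array and silently consumes the caller's data when given the reference type.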
May 20 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I don't want to treat them similarly, but we should be able to treat forward ranges as input ranges. Otherwise, all algorithms that work for input ranges would have to be written twice.

Andrei
May 20 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 No need to write the algos twice now, but you do have to add a line or
 two of code to every input range algo. Or force the user to call the
 converter function.
 --bb

But if you make the input range a class as Jason proposed, then:
1. Unless it's final, its methods will be virtual (slow).
2. You trigger a heap allocation every time you want to make this conversion. (slow)
May 20 2009
next sibling parent Jason House <jason.james.house gmail.com> writes:
dsimcha Wrote:

But if you make the input range a class as Jason proposed, then: 1. Unless it's final, its methods will be virtual (slow). 2. You trigger a heap allocation every time you want to make this conversion. (slow)

Scope classes avoid the heap allocations. Classes are not required for reference semantics. Specially constructed structs can also satisfy the requirement.

By declaring the typical input range to be a (scope final) class, I was hoping to emphasize the fundamental difference with forward ranges. It should be trivial to write a (scope) wrapper that converts a forward range into an input range. The compiler should be able to optimize away the wrapper or at least inline the functions.
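
A possible shape for such a wrapper, sketched with a struct that holds a pointer rather than with a scope class (that substitution is an assumption, as are the names InputView and asInput). The view must not outlive the range it points to.

```d
import std.range.primitives : empty, front, popFront, isInputRange, isForwardRange;

// Hypothetical adapter: exposes a forward range through the input range
// primitives only, with reference semantics and no heap allocation.
struct InputView(R)
if (isForwardRange!R)
{
    R* source;   // the view never owns or copies the underlying range

    @property bool empty() { return (*source).empty; }
    @property auto front() { return (*source).front; }
    void popFront() { (*source).popFront(); }
}

InputView!R asInput(R)(ref R r)
{
    return InputView!R(&r);
}

unittest
{
    int[] a = [1, 2, 3];
    auto v = asInput(a);
    auto w = v;              // copying the view does not copy the position
    w.popFront();
    assert(v.front == 2);
    assert(a == [2, 3]);     // the original range was advanced as well

    // The view deliberately offers no save.
    static assert(isInputRange!(typeof(v)) && !isForwardRange!(typeof(v)));
}
```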
May 20 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Bill Baxter Wrote:

Maybe, but I don't really agree that input ranges should be forced to be classes. Seems like they should be allowed to be either as long as they support the required methods.

If you really mean methods and semantics, then I agree.

It's becoming increasingly clear to me that D users struggle against the struct/class division. I frequently think of the division as value/reference type while many others think of it as non-virtual/virtual. These different perspectives mean there is a large set of "objects" where the two definitions disagree on which data type (struct or class) is more appropriate. IMHO, D should have a type with low size and function call overhead like a struct as well as reference semantics like a class.

 Actually that's a good argument for not making  a = b part of the
 Forward Range concept.   If you get rid of that one, then Forward
 Ranges can be either classes or structs too.

Having undefined behavior (for assignments and passing as arguments) is bad.
May 20 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 IMHO, D should have a type with low size and function call overhead like a
 struct as well as reference semantics like a class.

What's wrong with a pointer to a heap-allocated struct? I sometimes need what you describe, too, and I've never seen a case where this doesn't do the job.
May 20 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
dsimcha Wrote:

What's wrong with a pointer to a heap-allocated struct? I sometimes need what you describe, too, and I've never seen a case where this doesn't do the job.

That does the job, but it looks ugly ;) I think it's also not allowed in SafeD.
May 20 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
Well then that argues more for a generic reference type than for a whole new aggregate type different from both classes and structs. IIRC, Ref was supposed to be coming to std.typecons soon. Also, if you don't care about the few bytes of overhead for vtbl and monitor, there's always final classes.

The bottom line is that I can see where what you're asking for could be useful, but the cases where neither a final class nor a pointer to a heap-allocated struct cut it are way too few and far between to justify a whole new aggregate type.
May 20 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Maybe I'm a bit cynical, but I never expect my posts to cause a change in D, or for my bug reports to even get a comment. My long posts with well thought out ideas either get no response or a reaction like Andrei's recent switch range thread. I no longer try to work out the details and merely hope my efforts plant a seed for thought. It's far less frustrating that way.

As far as a struct-like reference type, my only goal was to point out a gap that is affecting users. Your suggestion about a library implementation sounds reasonable.

Personally, I use final classes and don't care about the extra overhead. I used to not worry about making classes final, especially since the D "spec" says a compiler can be smart enough to detect when they're appropriate. I can't help but wonder if struct pointers are in my future as I continue to push for performance. What would irk me most about doing that is if such a decision causes a ripple of changes throughout my code base.
May 21 2009
parent reply Georg Wrede <georg.wrede iki.fi> writes:
Jason House wrote:
 Maybe I'm a bit cynical, but I never expect my posts to cause a
 change in D, or for my bug reports to even get a comment.

For that, I must confess you've got a more mature attitude than most participants.

 My long posts with well thought out ideas either get no response or a
 reaction like Andrei's recent switch range thread.

(Not familiar with the particular post/response, but believe me, I've been there, and so have scores of other regulars here.) For such situations, I've decided to Presume: either the post was ill positioned (maybe down in a thread, maybe posted at the same time some of the Celebrities Dropped a Bomb in the Pond), or then simply at a wrong moment. (Discourcially, temporally, psychologically, or socially.)

 I no longer try to work out the details and merely hope my efforts
 plant a seed for thought. It's far less frustrating that way.

Sadly, ( /thoroughly/ sadly), this is like a party group. If you really want exposure, your first post should be no longer than 3 lines long. Then, on the 4th level of the thread, you might piecemeal start exposing the details of what you really wanted to say, in the first post. Hell, our current celebrities do that, and the success, you undoubtedly see.

 What would irk me most about doing that is if such a
 decision causes a ripple of changes throughout my code base.

Life's not fair, especially in the fast lane, and definitely not with D2.0. OTOH, with D1.x, if "things change" there /will/ be a riot. (Actually, there _should_ be War!)
May 21 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Georg Wrede Wrote:

 Jason House wrote:  
 I no longer try to work out the details and merely hope my efforts
 plant a seed for thought. It's far less frustrating that way.

Sadly, ( /thoroughly/ sadly), this is like a party group. If you really want exposure, your first post should be no longer than 3 lines long. Then, on the 4th level of the thread, you might piecemeal start exposing the details of what you really wanted to say, in the first post. Hell, our current celebrities do that, and the success, you undoubtedly see.

Who do you consider to be a celebrity that doesn't have commit access to Phobos? I take people like Walter, Andrei, and Sean because they usually mean pending changes to the language or libraries. I won't say our celebrities don't deserve their status. All I'm saying is that celebrity status gives a lot of leeway for how to share info.
May 21 2009
parent Georg Wrede <georg.wrede iki.fi> writes:
Jason House wrote:
 Georg Wrede Wrote:
 Jason House wrote:  
 I no longer try to work out the details and merely hope my efforts
 plant a seed for thought. It's far less frustrating that way.

Sadly, ( /thoroughly/ sadly), this is like a party group. If you really want exposure, your first post should be no longer than 3 lines long. Then, on the 4th level of the thread, you might piecemeal start exposing the details of what you really wanted to say, in the first post. Hell, our current celebrities do that, and the success, you undoubtedly see.

Who do you consider to be a celebrity that doesn't have commit access to Phobos? I take people like Walter, Andrei, and Sean because they usually mean pending changes to the language or libraries.

Well, it's not that simple. Andrei was a celebrity from day one, although (I think) he got write access only later. Same thing with Don. Matthew Wilson never had write access, and he enjoyed celebrity status. For all I know, there may be several people with write access who don't even appear on this newsgroup. (I have no idea, but I suspect.) And then, of course, everybody's list of celebrities is different.
 I won't say our celebrities don't deserve their status. All I'm
 saying is that celebrity status gives a lot of leeway for how to
 share info.

Sure.
May 21 2009
prev sibling parent Jason House <jason.james.house gmail.com> writes:
Bill Baxter Wrote:

 On Wed, May 20, 2009 at 4:03 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Jason House (jason.james.house gmail.com)'s article
 IMHO, D should have a type with low size and function call overhead like a
 struct as well as reference semantics like a class.

What's wrong with a pointer to a heap-allocated struct?  I sometimes need what you describe, too, and I've never seen a case where this doesn't do the job.

And you can alias Foo_* Foo, so that it doesn't even look like you're passing around a pointer. :-) --bb

Interesting thought. Wouldn't calls to new still require use of Foo__?
May 20 2009
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Wed, May 20, 2009 at 11:44 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any ranges should be seamlessly and efficiently treated as input ranges. Andrei

You won't like my answer! Like you've already said, the semantics of forward ranges and input ranges are different. I would argue that forward ranges have value semantics but input ranges do not. Any function that implicitly assumes value semantics is wrong. Sadly, overlapping API's makes that all too easy for someone to write bad code that passes simplistic tests with forward ranges and then fail with input ranges. My initial thoughts is that input ranges should have two changes: 1. A different API from forward ranges 2. Be a reference type (AKA class instead of struct) In short, I disagree with your basic premise of treating the two ranges similarly.

I don't want to treat them similarly, but we should be able to treat forward ranges as input ranges. Otherwise, all algorithms that work for input ranges would have to be written twice.

No need to write the algos twice now, but you do have to add a line or two of code to every input range algo. Or force the user to call the converter function. --bb

On a broader note, I think that people need to understand that, just as a free society can never be a perfectly safe society, a language that allows programmers the freedom to create concise, general and efficient constructs can never be a perfectly safe language. Yes, we could make D as safe as Java, but then programming in D would be like programming in Java--an exercise in superfluous verbosity and getting around the language's rigidity.
May 20 2009
prev sibling next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Andrei Alexandrescu Wrote:
 Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any ranges should be seamlessly and efficiently treated as input ranges. Andrei

Like you've already said, the semantics of forward ranges and input ranges are different. I would argue that forward ranges have value semantics but input ranges do not. Any function that implicitly assumes value semantics is wrong. Sadly, overlapping API's makes that all too easy for someone to write bad code that passes simplistic tests with forward ranges and then fail with input ranges.
 My initial thoughts is that input ranges should have two changes:
 1. A different API from forward ranges
 2. Be a reference type (AKA class instead of struct)
 In short, I disagree with your basic premise of treating the two ranges
similarly.

Then how would you handle functions that only require lowest common denominator functionality? This is true for *a lot* of cases, including some important ones like finding the mean and standard deviation of some object. (Incidentally, this is also where iterable comes in, since such a function doesn't even need to care if the iteration is with ranges, opApply, or the fairy #$() * god mother, as long as foreach somehow works.) You mean to tell me you'd require explicit handling of both the input range and forward range case separately in these cases? The day this happens, I switch to Java because D will have become just as much of an insanely verbose bondage and discipline language, but with less libraries.
May 20 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 20, 2009 at 5:11 PM, Jason House
<jason.james.house gmail.com> wrote:
 Bill Baxter Wrote:

 On Wed, May 20, 2009 at 4:03 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Jason House (jason.james.house gmail.com)'s article
 IMHO, D should have a type with low size and function call overhead like a
 struct as well as reference semantics like a class.

 What's wrong with a pointer to a heap-allocated struct? I sometimes need what you
 describe, too, and I've never seen a case where this doesn't do the job.

 And you can alias Foo_* Foo, so that it doesn't even look like you're
 passing around a pointer. :-)

 --bb

Interesting thought. Wouldn't calls to new still require use of Foo__?

Yes. Also if you're creating 'em on the stack. --bb
May 20 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 20, 2009 at 11:44 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any ranges should be seamlessly and efficiently treated as input ranges. Andrei

You won't like my answer! Like you've already said, the semantics of forward ranges and input ranges are different. I would argue that forward ranges have value semantics but input ranges do not. Any function that implicitly assumes value semantics is wrong. Sadly, overlapping API's makes that all too easy for someone to write bad code that passes simplistic tests with forward ranges and then fail with input ranges. My initial thoughts is that input ranges should have two changes: 1. A different API from forward ranges 2. Be a reference type (AKA class instead of struct) In short, I disagree with your basic premise of treating the two ranges similarly.

I don't want to treat them similarly, but we should be able to treat forward ranges as input ranges. Otherwise, all algorithms that work for input ranges would have to be written twice.

auto inp = std.typecons.inputRangeFromForwardRange(fwd);

No need to write the algos twice now, but you do have to add a line or two of code to every input range algo. Or force the user to call the converter function. --bb
May 20 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 20, 2009 at 12:05 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Wed, May 20, 2009 at 11:44 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any
 ranges should be seamlessly and efficiently treated as input ranges.

 Andrei

You won't like my answer! Like you've already said, the semantics of forward ranges and input ranges
 are different. I would argue that forward ranges have value semantics but
 input ranges do not. Any function that implicitly assumes value semantics is
 wrong. Sadly, overlapping API's makes that all too easy for someone to write
 bad code that passes simplistic tests with forward ranges and then fail with
 input ranges.

 My initial thoughts is that input ranges should have two changes:
 1. A different API from forward ranges
 2. Be a reference type (AKA class instead of struct)

 In short, I disagree with your basic premise of treating the two ranges
 similarly.

I don't want to treat them similarly, but we should be able to treat forward
 ranges as input ranges. Otherwise, all algorithms that work for input ranges
 would have to be written twice.

No need to write the algos twice now, but you do have to add a line or two of code to every input range algo. Or force the user to call the converter function. --bb

But if you make the input range a class as Jason proposed, then: 1. Unless it's final, its methods will be virtual (slow). 2. You trigger a heap allocation every time you want to make this conv...

Maybe, but I don't really agree that input ranges should be forced to be classes. Seems like they should be allowed to be either as long as they support the required methods. Actually that's a good argument for not making a = b part of the Forward Range concept. If you get rid of that one, then Forward Ranges can be either classes or structs too. --bb
May 20 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 20 May 2009 13:35:14 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Jason House wrote:
 I feel like there are too many differences between input and forward
 ranges for such a minor difference. Many range functions are written
 assuming no side effects on the caller. This can restrict the use of
 helper functions. It may be best to make their usage different...

So how do you think we should go about it? Also don't forget that any ranges should be seamlessly and efficiently treated as input ranges. Andrei

struct Input(R) // enforce that R is a forward range for type T, not versed well enough in D2 to know how to do this yet
{
   R _range; // make this R * if you want Input(R) to be a reference type
   alias _range this;
   auto popNext()
   {
     auto result = _range.front;
     _range.popFront();
     return result;
   }
   // and so-on, can leave out truly duplicate functions like empty as the alias this will take care of that.
}

// convenience method
Input!(R) asInput(R)(R r)
{
   return Input!(R)(r); // or Input!(R)(new R(r)); if reference type is the right answer
}

No extra code required in any forward range, just wrap it. Input still retains the forward range functions as an underlying base if you need BOTH input and forward range functionality (not sure why).

-Steve
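
A self-contained variant of the wrapper above, offered only as a sketch. The isForwardRange constraint comes from today's std.range.primitives and is an assumption about how the "enforce" comment could be filled in; the unittest is an added usage example, not part of the original post.

```d
import std.range.primitives;

// Wraps a forward range and exposes a combined "read and advance" primitive.
struct Input(R)
if (isForwardRange!R) // assumption: one way to enforce the comment in the post
{
    R _range;
    alias _range this; // empty, front, etc. are forwarded to the wrapped range

    // Returns the current element and advances past it.
    auto popNext()
    {
        auto result = _range.front;
        _range.popFront();
        return result;
    }
}

// Convenience method, as in the post.
Input!R asInput(R)(R r)
{
    return Input!R(r);
}

// Hypothetical helper that touches only empty and popNext.
int sumAll(R)(R r)
{
    int total = 0;
    while (!r.empty)
        total += r.popNext();
    return total;
}

unittest
{
    assert(sumAll(asInput([1, 2, 3])) == 6);
}
```

An algorithm written like sumAll never copies the range and never looks at front twice, which is the discipline a separate input-range API is meant to encourage.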
May 20 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 20, 2009 at 4:03 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Jason House (jason.james.house gmail.com)'s article
 IMHO, D should have a type with low size and function call overhead like a
 struct as well as reference semantics like a class.

 What's wrong with a pointer to a heap-allocated struct? I sometimes need what you
 describe, too, and I've never seen a case where this doesn't do the job.

And you can alias Foo_* Foo, so that it doesn't even look like you're passing around a pointer. :-) --bb
May 20 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 20 May 2009 13:04:42 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Wed, May 20, 2009 at 9:19 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I'm thinking a better design is to require any range that's forward or
 better to define a function save(). Ranges that don't implement it are  
 input
 ranges; those that do, will guarantee a brand new range is returned  
 from
 save(). So then adjacentFind would look like this:

 R adjacentFind(R)(R r)
 {
    if (r.empty) return r;
    R last = r.save;
    r.popFront;
    for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
    {
    }
    return r;
 }

 Obviously, when you pass a range that doesn't support save,  
 adjacentFind
 will not compile, which is what we want.

ranges to hide their copy constructor, or whatever the D equivalent is, making R last = r; fail. But that would make input ranges very difficult to use.

Exactly. I thought of that design, and it was difficult to even pass a range to a function.
 So, of those two options at least, requiring a .save sounds like the
 better choice.
  The down side is you will get no error if you write the code the first
 way, without a .save.   I see this as turning into tip #5 in
 "Effective D" -- "Know when to use .save"   It would be nice if that
 potential mistake could be eliminated somehow.  You could perhaps
 require input ranges to implement transfer semantics, and have them
 implement a .clone for cases when you really do want to make an
 aliasing copy.

Good point. I don't have a solution for that. Giving ranges move semantics would probably make for another Effective D tip (or perhaps more... move semantics are pretty brutal). Another partial solution is to define a different interface for input ranges, one that combines front() and popFront(). Something like popNext. That way, people who use only the primitives empty() and popNext() know they are using an input range and with hope they'll remember they can't really save copies of it and expect them to "remember" where they are in the input. Andrei

Bicycle shed: Well, since output ranges use 'put', how about 'get' for input ranges?
May 20 2009
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 In wake of a few discussion I've witnessed, I'm thinking of a last
 change for ranges. (In fact there's one more, but that's minor.)
 The problem is that input ranges and forward ranges have the same
 syntactic interface, but different semantic interfaces. Consider the
 problem of finding the first two identical adjacent items in a range:
 R adjacentFind(R)(R r)
 {
      if (r.empty) return r;
      R last = r;
      r.popFront;
      for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
      {
      }
      return r;
 }
 This will work properly on lists and vectors, but horrendously on files
 and sockets. This is because input ranges can't be saved for later use:
 incrementing r also increments last and essentially forces both to
 look at the same current value.
 I'm thinking a better design is to require any range that's forward or
 better to define a function save(). Ranges that don't implement it are
 input ranges; those that do, will guarantee a brand new range is
 returned from save(). So then adjacentFind would look like this:
 R adjacentFind(R)(R r)
 {
      if (r.empty) return r;
      R last = r.save;
      r.popFront;
      for (; !r.empty && last.front != r.front; last.popFront, r.popFront)
      {
      }
      return r;
 }
 Obviously, when you pass a range that doesn't support save, adjacentFind
 will not compile, which is what we want.
 Andrei
 P.S. There is a way to implement adjacentFind for forward ranges by
 saving data instead of ranges. I've used a limited version above for
 illustration purposes.

Sounds like at least a reasonable solution. The thing I like about it is that, in addition to safety, it allows for the range to do fancy and arbitrary stuff under the hood if necessary to allow for saving. Also, while we're fine tuning input ranges vs. forward ranges, I think the concept of iterables as a catch-all for ranges, opApply, builtins, etc. needs to be introduced and fine tuned, too. We've shown on this NG previously that, while ranges are usually preferable for the flexibility they offer, opApply does have its legitimate use cases.
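
For reference, a minimal sketch of the save-based version with the requirement spelled out as a template constraint. It assumes the isForwardRange trait and the array save primitive from today's std.range.primitives; adjacentFindSketch and OneShot are hypothetical names used only for illustration.

```d
import std.range.primitives;

// Returns the range positioned at the second of the first two equal
// adjacent elements, or an empty range if there is no such pair.
R adjacentFindSketch(R)(R r)
if (isForwardRange!R)
{
    if (r.empty) return r;
    auto last = r.save; // independent copy that trails r by one element
    r.popFront();
    for (; !r.empty && last.front != r.front; last.popFront(), r.popFront())
    {
    }
    return r;
}

// Hypothetical input-only range: it has no save, so it is rejected.
struct OneShot
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

static assert(isInputRange!OneShot && !isForwardRange!OneShot);
static assert(!__traits(compiles, adjacentFindSketch(OneShot([1, 1]))));

unittest
{
    assert(adjacentFindSketch([1, 2, 2, 3]) == [2, 3]);
    assert(adjacentFindSketch([1, 2, 3]).empty);
}
```

The constraint rejects ranges without save at compile time, but nothing here catches an algorithm that writes last = r where it should write last = r.save, which is the gap raised elsewhere in this thread.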
May 20 2009
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
dsimcha wrote:
 Also, while we're fine tuning input
 ranges vs. forward ranges, I think the concept of iterables as a catch-all for
 ranges, opApply, builtins, etc. needs to be introduced and fine tuned, too. 
We've
 shown on this NG previously that, while ranges are usually preferable for the
 flexibility they offer, opApply does have its legitimate use cases.

An input/forward range is basically just another name/syntax for an iterable. Perhaps algorithms that work on input ranges should be written using foreach instead of front/popFront?
May 20 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Robert Fraser wrote:
 dsimcha wrote:
 Also, while we're fine tuning input
 ranges vs. forward ranges, I think the concept of iterables as a 
 catch-all for
 ranges, opApply, builtins, etc. needs to be introduced and fine tuned, 
 too.  We've
 shown on this NG previously that, while ranges are usually preferable 
 for the
 flexibility they offer, opApply does have its legitimate use cases.

An input/forward range is basically just another name/syntax for an iterable. Perhaps algorithms that work on input ranges should be written using foreach instead of front/popFront?

For most, foreach is not sufficient. Andrei
May 20 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 20 May 2009 14:02:02 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Robert Jacques wrote:
 Bicycle shed: Well, since output ranges use 'put', how about 'get' for  
 input ranges?

Nice color :o). In fact, "put" is a poor choice because it doesn't reflect advancement. Probably putNext and getNext are better. Andrei

Well, on that note, more shed colors and common use cases:

            stacks  queues  message passing  file I/O  network I/O  confusion factor
send/recv   weak    okay    good             weak      good         low
sink/rise   bad     bad     bad              bad       bad          high
push/pop    good    okay    okay             okay      okay         high
May 20 2009
prev sibling parent "Kristian Kilpi" <kjkilpi gmail.com> writes:
On Wed, 20 May 2009 21:02:02 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Robert Jacques wrote:
 Bicycle shed: Well, since output ranges use 'put', how about 'get' for  
 input ranges?

Nice color :o). In fact, "put" is a poor choice because it doesn't reflect advancement. Probably putNext and getNext are better. Andrei

(Just thinking aloud... :) )

I have been using get() + set() and read() + write(). read() and write() advance to the next item; get() + set() do not.

Actually I have implemented my iterator classes (in C++) as follows (simplified):

BasicIFlow:
   read()
   toNext()
   isEnd()

IFlow:
   get()
   read()
   toNext()
   isEnd()

Iter:
   get()
   read()
   toNext()
   toPrev()
   isBegin()
   isFirst()
   isLast()
   isEnd()

As seen, Iter is a two-way iterator, and the other two are one-way iterators. (And there are corresponding output iterators too, of course.)

There are also functions like toFirst(), toEnd(), etc (only Iter has toFirst()). And for convenience, Iter also has functions like getNext() and getPrev() that return the next/previous item without moving the iterator. So there are quite many functions, which is not necessarily good ;) (although many of them have default functionality that simply calls the other "core" functions; for example, read() *could* be written with get() + toNext()).

I know very little about Ranges (when I have time, that will change), but if I'm not mistaken, they hold and modify the beginning and end of the iterated area? That's an interesting and unique approach. :) My classes move a cursor inside the iterated area. Of course, with the Flow classes, the beginning of the area is moved together with the cursor (as the cursor cannot move backwards).
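
On the question of whether ranges hold and modify the beginning and end: a small sketch using a built-in slice as the range, assuming the array primitives in today's std.range.primitives. trimBothEnds is a hypothetical helper named only for this example.

```d
import std.range.primitives : empty, popFront, popBack;

// Shrinks the view by one element at each end, when possible.
int[] trimBothEnds(int[] view)
{
    if (!view.empty) view.popFront(); // the beginning moves forward
    if (!view.empty) view.popBack();  // the end moves backward
    return view;
}

unittest
{
    int[] data = [10, 20, 30, 40];
    auto view = trimBothEnds(data);
    assert(view == [20, 30]);
    // Only the view shrank; the underlying elements are untouched.
    assert(data == [10, 20, 30, 40]);
}
```

There is no separate cursor: the range itself is the remaining area, so advancing means shrinking it from the front.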
May 20 2009