www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - front stability

reply Steven Schveighoffer <schveiguy yahoo.com> writes:
I have always treated ranges with this expectation:

1. front gets you the current element of the range. Calling front 
multiple times without calling popFront should get you the same value.
2. popFront moves you to the next element of the range if it exists.

However, there are some ranges which can violate this. For example map:

int i = 0;
auto r = iota(int.max).map!((a) => ++i);

auto e1 = r.front;
auto e2 = r.front;
assert(e1 + 1 == e2);

But I consider this a programming error. It's acceptable to me that it's 
possible to write bugs using ranges, no programming language is immune 
to that.

However, Joseph Wakeling pointed out[1] the generate range[2]:

auto r = generate(() => ++i);

// same result

But unlike map, there isn't a way to make generate work as a proper 
range, because as Joe pointed out, popFront does nothing, front does all 
the work that popFront would do.

There was a bit of discussion when this function was being vetted[3], 
and IMO, this did not result in the correct function.

For example, generate.popFrontN does nothing. generate.drop does 
nothing. And of course, calling front multiple times does not yield the 
same answer, unless you provide a lambda that does the same thing every 
time (a useless case).

The counter-argument seems to be that if you cache the front element, 
then then making a copy of the range via take can repeat the cached 
element[4]. I find this argument severely lacking -- non-forward ranges 
are not meant to be copied and expected to operate properly, it's why we 
require calling save.

I'd say violating the expectations of what popFront and front do is more 
egregious than a particular use case, no matter how valid that case is. 
I'd like to fix this bug, but I see there were quite a few major D 
contributors in the camp that wanted this function to behave the way it 
is. So I'd rather first debate the merits here.

-Steve

[1] https://forum.dlang.org/post/ttyzjfmegkmbluumiczf forum.dlang.org
[2] http://dlang.org/phobos/std_range.html#.generate
[3] https://github.com/dlang/phobos/pull/2606
[4] https://github.com/dlang/phobos/pull/2606#issuecomment-58740715
Jun 02 2016
next sibling parent Atila Neves <atila.neves gmail.com> writes:
On Thursday, 2 June 2016 at 12:51:18 UTC, Steven Schveighoffer 
wrote:
 I have always treated ranges with this expectation:
So have I. It's troubling that's not how some of them work. Atila
Jun 02 2016
prev sibling next sibling parent Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 2 June 2016 at 12:51:18 UTC, Steven Schveighoffer 
wrote:
 1. front gets you the current element of the range. Calling 
 front multiple times without calling popFront should get you 
 the same value.
 2. popFront moves you to the next element of the range if it 
 exists.
Agreed. To quote Ali, * empty: specifes whether the range is empty; it must return true when the range is considered to be empty, and false otherwise * front: provides access to the element at the beginning of the range * popFront(): shortens the range from the beginning by removing the first element
 However, there are some ranges which can violate this. For 
 example map:
 ...
 But I consider this a programming error.
Completely agree. Comparing to the quote from Ali above, this violates the basic rules of ranges, rules which the vast majority of code expects ranges to follow. Also, these rules cannot (and shouldn't) be loosened because there is no way to check this violation at compile-time. Unless we ask people to add yet another enum to their ranges and have even more static-if's in their code. I don't see how benefit to breaking these rules is greater than the cost.
 However, Joseph Wakeling pointed out[1] the generate range[2]:

 auto r = generate(() => ++i);
I never liked generate, it always seemed like a hack, now I see my opinions have been validated.
 But unlike map, there isn't a way to make generate work as a 
 proper range, because as Joe pointed out, popFront does 
 nothing, front does all the work that popFront would do.

 For example, generate.popFrontN does nothing. generate.drop 
 does nothing. And of course, calling front multiple times does 
 not yield the same answer, unless you provide a lambda that 
 does the same thing every time (a useless case).
These are some pretty bad bugs. It's clear to me than generate should not return a range, but an opApply iterable.
Jun 02 2016
prev sibling next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thursday, June 02, 2016 08:51:18 Steven Schveighoffer via Digitalmars-d 
wrote:
 I'd say violating the expectations of what popFront and front do is more
 egregious than a particular use case, no matter how valid that case is.
 I'd like to fix this bug, but I see there were quite a few major D
 contributors in the camp that wanted this function to behave the way it
 is. So I'd rather first debate the merits here.
I am in violent agreement with you on this. Copying ranges is unspecified behavior. If you want an actual copy, you have to call save. front must return the same element on every call until popFront is called, and while in some cases, that means returning an element that is equal every time rather than being an identical value (e.g. range.map!(a => to!string(a))()), it's not okay for front to return a different element without popFront being called. I think that for a range to do otherwise is a _very_ clear violation of the range API. If you have to do stuff like caching the value of front in order to keep it consistent, then you're stuck doing that, otherwise it's not a valid range. I really don't see how there's much of an argument otherwise. - Jonathan M Davis
Jun 02 2016
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Thursday, 2 June 2016 at 12:51:18 UTC, Steven Schveighoffer 
wrote:
 [...]
I opened some bugs about that in the past, but got shut down. It is especially problematic since some ranges (like filter) are calling front several time when iterating.
Jun 02 2016
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2016 8:21 AM, deadalnix wrote:
 I opened some bugs about that in the past, but got shut down. It is especially
 problematic since some ranges (like filter) are calling front several time when
 iterating.
Please reopen them.
Jun 30 2016
prev sibling next sibling parent ag0aep6g <anonymous example.com> writes:
On 06/02/2016 02:51 PM, Steven Schveighoffer wrote:
 For example, generate.popFrontN does nothing. generate.drop does
 nothing. And of course, calling front multiple times does not yield the
 same answer, unless you provide a lambda that does the same thing every
 time (a useless case).
That's just broken.
 The counter-argument seems to be that if you cache the front element,
 then then making a copy of the range via take can repeat the cached
 element[4]. I find this argument severely lacking -- non-forward ranges
 are not meant to be copied and expected to operate properly, it's why we
 require calling save.
Completely agree with you here. You can only use one of the copies. When you call popFront on one of them, that invalidates the other one. To get two valid copies, use .save. generate should not try to go beyond the contract here when it means broken behavior elsewhere.
Jun 02 2016
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/2/16 8:51 AM, Steven Schveighoffer wrote:
 I have always treated ranges with this expectation:

 1. front gets you the current element of the range. Calling front
 multiple times without calling popFront should get you the same value.
 2. popFront moves you to the next element of the range if it exists.
Jack Stouffer has created a PR to formalize this. Please comment if you have objections (the group that argued for current generate behavior was absent from this post that was meant to be a debate). I think this is the right thing to do. https://github.com/dlang/phobos/pull/4511 -Steve
Jun 30 2016
parent reply jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 30 June 2016 at 14:17:22 UTC, Steven Schveighoffer 
wrote:
 Jack Stouffer has created a PR to formalize this.

 Please comment if you have objections (the group that argued 
 for current generate behavior was absent from this post that 
 was meant to be a debate). I think this is the right thing to 
 do.
It looks like this is just documentation changes. This really doesn't prevent anyone from making a Range that violates front stability. For instance, suppose isInputRange is changed so that auto h = r.front; // can get the front of the range becomes auto h = r.front; // can get the front of the range auto j = r.front; assert(h == j); //ensure front stability This doesn't seem to work for the map example above, but could it be adapted so that it does?
Jun 30 2016
next sibling parent Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 30 June 2016 at 14:59:11 UTC, jmh530 wrote:
 It looks like this is just documentation changes. This really 
 doesn't prevent anyone from making a Range that violates front 
 stability.
No, but what it will do is to make it clear that these are either bugs or intended behavior, as this is still a point of disagreement.
 For instance, suppose isInputRange is changed so that

 auto h = r.front; // can get the front of the range

 becomes

 auto h = r.front; // can get the front of the range
 auto j = r.front;
 assert(h == j);  //ensure front stability
This cannot be done at compile time.
Jun 30 2016
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/30/16 10:59 AM, jmh530 wrote:
 On Thursday, 30 June 2016 at 14:17:22 UTC, Steven Schveighoffer wrote:
 Jack Stouffer has created a PR to formalize this.

 Please comment if you have objections (the group that argued for
 current generate behavior was absent from this post that was meant to
 be a debate). I think this is the right thing to do.
It looks like this is just documentation changes. This really doesn't prevent anyone from making a Range that violates front stability. For instance, suppose isInputRange is changed so that auto h = r.front; // can get the front of the range becomes auto h = r.front; // can get the front of the range auto j = r.front; assert(h == j); //ensure front stability
This doesn't solve it, and it can't be solved -- halting problem. We have some expectations that are assumed and some that are mechanically tested. This PR clarifies the assumptions. -Steve
Jun 30 2016
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2016 8:32 AM, Steven Schveighoffer wrote:
 This doesn't solve it, and it can't be solved -- halting problem. We have some
 expectations that are assumed and some that are mechanically tested. This PR
 clarifies the assumptions.
I made some range drivers that brutally test the protocol: http://www.digitalmars.com/sargon/asinputrange.html http://www.digitalmars.com/sargon/asforwardrange.html I think they should be part of Phobos.
Jun 30 2016
prev sibling next sibling parent reply Mathias Lang via Digitalmars-d <digitalmars-d puremagic.com> writes:
2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via Digitalmars-d <
digitalmars-d puremagic.com>:

 I have always treated ranges with this expectation:
I think the case is pretty clear here, and I'm in agreement with you. I just want to add a note on the following point: 2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via Digitalmars-d < digitalmars-d puremagic.com>:
 The counter-argument seems to be that if you cache the front element, then
 then making a copy of the range via take can repeat the cached element[4].
 I find this argument severely lacking -- non-forward ranges are not meant
 to be copied and expected to operate properly, it's why we require calling
 save.
The compiler is blatantly guilty of doing so: https://issues.dlang.org/show_bug.cgi?id=15413
Jun 30 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/30/16 11:56 AM, Mathias Lang via Digitalmars-d wrote:
 2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>>:

     I have always treated ranges with this expectation:


 I think the case is pretty clear here, and I'm in agreement with you.

 I just want to add a note on the following point:
 2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via
 Digitalmars-d <digitalmars-d puremagic.com
 <mailto:digitalmars-d puremagic.com>>:

     The counter-argument seems to be that if you cache the front
     element, then then making a copy of the range via take can repeat
     the cached element[4]. I find this argument severely lacking --
     non-forward ranges are not meant to be copied and expected to
     operate properly, it's why we require calling save.


 The compiler is blatantly guilty of doing so:
 https://issues.dlang.org/show_bug.cgi?id=15413
I don't think that's a bug. The range definition says nothing about what happens on copying, or that non-forward ranges shouldn't be copied. What it says is that if you make a copy and use both, then there is no guarantee what happens. This is the use case in the argument I referenced. Unfortunately, for the likes of forward ranges, copying mainly always does the same thing as .save does. So you have tons and tons of code that doesn't properly use .save. Such is the way things are, and I'm not sure it will ever get better :( -Steve
Jun 30 2016
parent reply default0 <Kevin.Labschek gmx.de> writes:
On Thursday, 30 June 2016 at 18:07:41 UTC, Steven Schveighoffer 
wrote:
 On 6/30/16 11:56 AM, Mathias Lang via Digitalmars-d wrote:
 2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via 
 Digitalmars-d
 <digitalmars-d puremagic.com 
 <mailto:digitalmars-d puremagic.com>>:

     I have always treated ranges with this expectation:


 I think the case is pretty clear here, and I'm in agreement 
 with you.

 I just want to add a note on the following point:
 2016-06-02 14:51 GMT+02:00 Steven Schveighoffer via
 Digitalmars-d <digitalmars-d puremagic.com
 <mailto:digitalmars-d puremagic.com>>:

     The counter-argument seems to be that if you cache the 
 front
     element, then then making a copy of the range via take can 
 repeat
     the cached element[4]. I find this argument severely 
 lacking --
     non-forward ranges are not meant to be copied and expected 
 to
     operate properly, it's why we require calling save.


 The compiler is blatantly guilty of doing so:
 https://issues.dlang.org/show_bug.cgi?id=15413
I don't think that's a bug. The range definition says nothing about what happens on copying, or that non-forward ranges shouldn't be copied. What it says is that if you make a copy and use both, then there is no guarantee what happens. This is the use case in the argument I referenced. Unfortunately, for the likes of forward ranges, copying mainly always does the same thing as .save does. So you have tons and tons of code that doesn't properly use .save. Such is the way things are, and I'm not sure it will ever get better :( -Steve
Can we have ranges that disallow using them like rangeCache = range; without sacrificing usability in other cases? Going forward having ranges that simply do not compile when used that way would probably make lots of people actually honor .save and allow the compiler to point out buggy code (even if, in the end, .save really comes down to a no-op for many ranges).
Jun 30 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 30, 2016 at 07:23:24PM +0000, default0 via Digitalmars-d wrote:
 On Thursday, 30 June 2016 at 18:07:41 UTC, Steven Schveighoffer wrote:
[...] [...]
 Unfortunately, for the likes of forward ranges, copying mainly
 always does the same thing as .save does. So you have tons and tons
 of code that doesn't properly use .save. Such is the way things are,
 and I'm not sure it will ever get better :(
 
 -Steve
Can we have ranges that disallow using them like rangeCache = range; without sacrificing usability in other cases? Going forward having ranges that simply do not compile when used that way would probably make lots of people actually honor .save and allow the compiler to point out buggy code (even if, in the end, .save really comes down to a no-op for many ranges).
Andrei has mentioned that, in retrospect, we should have defined forward ranges such that copying the range object *is* saving. (Though it is unclear to me what is to be done with by-reference ranges like range classes, since you'd still need a separate method to be able to copy the class object. Unless you wrap them in a struct that calls the copy method in its postblit.) That aside, though, I'm not sure how you'd enforce not copying a range. The range API is defined by constraints rather than a language-level Concept, so the actual range object could be any arbitrary type. It doesn't seem to make sense that the assignment operator should suddenly be prohibited just because it happens to have members named .front, .empty, .popFront. Besides, outside of generic code, what if the user-defined range is actually such that assignment does the Right Thing(tm)? The cases that we want to prohibit are restricted only to generic code, since code that already "knows" the concrete type shouldn't be arbitrarily restricted like this. T -- I think Debian's doing something wrong, `apt-get install pesticide', doesn't seem to remove the bugs on my system! -- Mike Dresser
Jun 30 2016
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Thursday, 30 June 2016 at 19:34:47 UTC, H. S. Teoh wrote:
 arbitrary type. It doesn't seem to make sense that the 
 assignment operator should suddenly be prohibited just because 
 it happens to have members named .front, .empty, .popFront.
Seems like this could be resolved by requiring all range code to use an enclosure type that wraps the range functionality.
Jun 30 2016
prev sibling next sibling parent Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
On Thursday, 2 June 2016 at 12:51:18 UTC, Steven Schveighoffer 
wrote:
 The counter-argument seems to be that if you cache the front 
 element, then then making a copy of the range via take can 
 repeat the cached element[4]. I find this argument severely 
 lacking -- non-forward ranges are not meant to be copied and 
 expected to operate properly, it's why we require calling save.

 [4] 
 https://github.com/dlang/phobos/pull/2606#issuecomment-58740715
I think front returning the same value every time is the appropriate behavior for every range (I've had many times I wish this weren't true so I wouldn't have to implement caching). Map is no different, yes the function could technically change each time front is called, but that is an issue in the user code and not the implementation of map (map should only accept pure functions :)) I think the example of take() is a valid criticism. It is the issue of value ranges vs reference ranges all over again. Due to these distinct range types, providing a range to a function invalidates that range within the scope. It is not possible to know if the range will start at the same spot prior to calling function or if it will start after. In the case generator, we see a hybrid which isn't the only range that will do front() caching. Practically we don't run into issues with undefined behavior in our code, probably because we don't change out the type of range being used very often. I don't have a solution for this problem, but think having generator() add a new uncertainty (does calling front() a second time advance my range?) is a very bad precedence to set in Phobos.
Jun 30 2016
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2016 5:51 AM, Steven Schveighoffer wrote:
 I have always treated ranges with this expectation:

 1. front gets you the current element of the range. Calling front multiple
times
 without calling popFront should get you the same value.
 2. popFront moves you to the next element of the range if it exists.
Right. Also: 3. empty can be called multiple times in a row, and must always return the same result. 4. it is not legal to call front() on an empty range, but it is not required to call empty() first 5. it is not legal to call popFront() on an empty range, but it is not required to call empty() or front() first.
 I'd say violating the expectations of what popFront and front do is more
 egregious than a particular use case, no matter how valid that case is. I'd
like
 to fix this bug, but I see there were quite a few major D contributors in the
 camp that wanted this function to behave the way it is. So I'd rather first
 debate the merits here.
This was debated at length in the past. The result was the rules above. Ranges that do not follow those rules need to be fixed. I don't think there's a need to debate it again.
Jun 30 2016
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/30/16 7:50 PM, Walter Bright wrote:
 On 6/2/2016 5:51 AM, Steven Schveighoffer wrote:
 I have always treated ranges with this expectation:

 1. front gets you the current element of the range. Calling front
 multiple times
 without calling popFront should get you the same value.
 2. popFront moves you to the next element of the range if it exists.
Right. Also: 3. empty can be called multiple times in a row, and must always return the same result. 4. it is not legal to call front() on an empty range, but it is not required to call empty() first 5. it is not legal to call popFront() on an empty range, but it is not required to call empty() or front() first.
Jack, can you update the rules to reflect this?
 I'd say violating the expectations of what popFront and front do is more
 egregious than a particular use case, no matter how valid that case
 is. I'd like
 to fix this bug, but I see there were quite a few major D contributors
 in the
 camp that wanted this function to behave the way it is. So I'd rather
 first
 debate the merits here.
This was debated at length in the past. The result was the rules above. Ranges that do not follow those rules need to be fixed. I don't think there's a need to debate it again.
Thank you, I will set up a PR to fix generate. -Steve
Jun 30 2016