digitalmars.D - range behaviour

Benjamin Thaut (8/8) May 13 2014 I know that there was a recent discussion about how the methods of

Jonathan M Davis via Digitalmars-d (13/21) May 13 2014 Certainly, ranges are pretty much always used this way, but there was so...

Steven Schveighoffer (12/28) May 13 2014 I don't agree there was a consensus. I think empty should not have to be...

H. S. Teoh via Digitalmars-d (38/43) May 13 2014 I only partially agree with this, up to the point where .empty has been

Steven Schveighoffer (26/38) May 13 2014 You can be generic, but still know that empty is not necessary.
luka8088 (13/27) May 13 2014 This is a potential issue because if it turns out that empty _must_ be

Jonathan M Davis via Digitalmars-d (7/38) May 13 2014 On Tue, 13 May 2014 13:29:32 -0400

H. S. Teoh via Digitalmars-d (35/64) May 13 2014 [...]

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (8/17) May 13 2014 An API is a "user" interface. It should be intuitive.

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (10/25) May 14 2014 I wouldn't say never, because optimizers have become damn good. I

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (28/35) May 14 2014 Well, "never" without a proof is of course a too strong claim.

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (72/83) May 14 2014 As a demonstration of what I mean:

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (26/36) May 14 2014 I only have DMD installed, but I trust your word for it. However

Dicebot (64/67) May 14 2014 There is no single reason why this should be true. Actually

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (4/13) May 14 2014 This isn't really a loop. It is an expression: n*(n+1)

Dicebot (8/21) May 15 2014 I think it is your turn to make a counter-example ;) It does not

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (9/10) May 15 2014 I've given 2 examples that no compiler can get at… because they

Dicebot (14/20) May 15 2014 If compiler lacks contextual knowledge, than only means that

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (70/79) May 16 2014 Not really. Here is a simple example, a sawtooth generator that

Jonathan M Davis via Digitalmars-d (19/22) May 13 2014 Oh, but that's part of the fun. If the range's front returns by ref or a...
Dicebot (2/13) May 13 2014 I don't remember any final decision made from that discussion.

Benjamin Thaut <code benjamin-thaut.de> writes:

I know that there was a recent discussion about how the methods of 
ranges should behave.

E.g.

- Does empty always have to be called before calling front or popFront?
- Is it allowed to call front multiple times between two calls to popFront?

Was there a result of that discussion? Is it documented somewhere?

Kind Regards
Benjamin Thaut

May 13 2014

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Tue, 13 May 2014 18:38:44 +0200
Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 I know that there was a recent discussion about how the methods of
 ranges should behave.

 E.g.

 - Does empty always have to be called before calling front or
 popFront?

Certainly, ranges are pretty much always used this way, but there was some
debate as to whether empty could have work done in it (and thus _had_ to be
called). However, I believe that the consensus was that yes, empty had to be
called (certainly, both Walter and Andrei felt that way).

 - Is it allowed to call front multiple times between two calls to
 popFront?

Definitely. _Lots_ of range-based code would break otherwise - though there
are casese where that can cause problems depending on what you rely on (e.g.
map!(a => to!string(a)) will return equal strings, but they aren't the _same_
string).

 Was there a result of that discussion? Is it documented somewhere?

AFAIK, there's just the semi-recent newsgroup discussion on the matter, though
maybe someone put something up on the wiki.

- Jonathan M Davis

May 13 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Tue, 13 May 2014 18:38:44 +0200
 Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 I know that there was a recent discussion about how the methods of
 ranges should behave.

 E.g.

 - Does empty always have to be called before calling front or
 popFront?

 Certainly, ranges are pretty much always used this way, but there was  
 some
 debate as to whether empty could have work done in it (and thus _had_ to  
 be
 called). However, I believe that the consensus was that yes, empty had  
 to be
 called (certainly, both Walter and Andrei felt that way).

I don't agree there was a consensus. I think empty should not have to be  
called if it's already logically known that the range is not empty. The  
current documentation states that, and I don't think there was an  
agreement that we should change it, despite the arguments from Walter.

In any case, I think generic code for an unknown range type in an unknown  
condition should have to call empty, since it cannot logically prove that  
it's not.

Even if it was required, it would be an unenforceable policy, just like  
range.save.

-Steve

May 13 2014

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 I don't agree there was a consensus. I think empty should not have to
 be called if it's already logically known that the range is not empty.

I only partially agree with this, up to the point where .empty has been
checked previously, such as:

	void f1() {
		void f2(R)(R r) if (isInputRange!R) {
			doSomething(r.front);
		}

		auto r = makeMeARange();
		if (!r.empty)
			f2(r);
		...
	}

Even in this case, I'd put an in-contract on f2 that verifies that the
range is indeed non-empty:

	...
		void f2(R)(R r)
			if (isInputRange!R)
		in { assert(!r.empty); }
		body {
			doSomething(r.front);
		}

[...]
 In any case, I think generic code for an unknown range type in an
 unknown condition should have to call empty, since it cannot logically
 prove that it's not.

[...]

In my mind, *all* range-based code is generic. If you need to depend on
something about your range beyond what the range API guarantees, then
you should be using the concrete type rather than a template argument,
which means that your code is no longer range-based code -- it's
operating on a concrete type. (Of course, if the concrete type
implements the range API, then there's nothing wrong with using range
API methods, but one shouldn't be under the illusion that one is writing
range-based code. It is type-dependent code that just happens to have
range-based methods. If the code will break if the range is substituted
by a different range that doesn't have the same guarantees (outside of
what the range API specifies), then it's not range-based code.)


T

-- 
Many open minds should be closed for repairs. -- K5 user

May 13 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 13 May 2014 13:40:55 -0400, H. S. Teoh via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via  
 Digitalmars-d wrote:
 [...]

 In any case, I think generic code for an unknown range type in an
 unknown condition should have to call empty, since it cannot logically
 prove that it's not.

 [...]

 In my mind, *all* range-based code is generic. If you need to depend on
 something about your range beyond what the range API guarantees, then
 you should be using the concrete type rather than a template argument,
 which means that your code is no longer range-based code -- it's
 operating on a concrete type.

You can be generic, but still know that empty is not necessary.

This is a potential use case (somewhat similar to yours):

void foo(R)(R r)
{
    if(!r.empty)
    {
       auto r2 = map!(x => x.bar)(r); // or some similar translation range
       // have to check r2.empty here? The "empty must be called" rules say  
yes.
    }
}

I will note, there are cases in phobos that don't check empty because it's  
provably not necessary.

The issue arose because of the filter range, which someone was trying to  
make fully lazy. The whole thrust of the argument is that we want to force  
people to call empty not because we want them to write generically safe  
code but because we want to *instrument* empty to do something other than  
check for emptiness. In other words, if we are guaranteed that empty will  
be called before anything else, we can add extra bits to empty for e.g.  
lazy initialization. Otherwise, we have to add the bits to all the  
primitives.

It also results in an awkward call when it's not strictly necessary:

r2.empty;

-Steve

May 13 2014

luka8088 <luka8088 owave.net> writes:

On 13.5.2014. 19:40, H. S. Teoh via Digitalmars-d wrote:
 On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 [...]
 Even in this case, I'd put an in-contract on f2 that verifies that the
 range is indeed non-empty:
 
 	...
 		void f2(R)(R r)
 			if (isInputRange!R)
 		in { assert(!r.empty); }
 		body {
 			doSomething(r.front);
 		}
 
 [...]

This is a potential issue because if it turns out that empty _must_ be
called than the author could put the front population logic inside empty.

Consider:

struct R {
  bool empty () { front = 1; return false; }
  int front = 0;
  void popFront () { front = 0; }
}

This is a valid code if empty _must_ be called, but it will behave
differently if passed to f2 in case asserts are compiled out. In case
asserts are compiled out empty is never called and front in never populated.

Because of this I think that it is necessary to document range behavior.

May 13 2014

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Tue, 13 May 2014 13:29:32 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via
 Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Tue, 13 May 2014 18:38:44 +0200
 Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 I know that there was a recent discussion about how the methods of
 ranges should behave.

 E.g.

 - Does empty always have to be called before calling front or
 popFront?

 Certainly, ranges are pretty much always used this way, but there
 was some
 debate as to whether empty could have work done in it (and thus
 _had_ to be
 called). However, I believe that the consensus was that yes, empty
 had to be
 called (certainly, both Walter and Andrei felt that way).

 I don't agree there was a consensus. I think empty should not have to
 be called if it's already logically known that the range is not
 empty. The current documentation states that, and I don't think there
 was an agreement that we should change it, despite the arguments from
 Walter.

 In any case, I think generic code for an unknown range type in an
 unknown condition should have to call empty, since it cannot
 logically prove that it's not.

 Even if it was required, it would be an unenforceable policy, just
 like range.save.

Yeah, and they're both cases where the common case will work just fine if you
do it wrong but which will break in less common cases, meaning that the odds
are much higher that the bug won't be caught. :(

- Jonathan M Davis

May 13 2014

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Tue, May 13, 2014 at 09:58:09AM -0700, Jonathan M Davis via Digitalmars-d
wrote:
 On Tue, 13 May 2014 18:38:44 +0200
 Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 
 I know that there was a recent discussion about how the methods of
 ranges should behave.

 E.g.

 - Does empty always have to be called before calling front or
 popFront?

 
 Certainly, ranges are pretty much always used this way, but there was
 some debate as to whether empty could have work done in it (and thus
 _had_ to be called). However, I believe that the consensus was that
 yes, empty had to be called (certainly, both Walter and Andrei felt
 that way).
 
 - Is it allowed to call front multiple times between two calls to
 popFront?

 
 Definitely. _Lots_ of range-based code would break otherwise - though
 there are casese where that can cause problems depending on what you
 rely on (e.g.  map!(a => to!string(a)) will return equal strings, but
 they aren't the _same_ string).
 
 Was there a result of that discussion? Is it documented somewhere?

 
 AFAIK, there's just the semi-recent newsgroup discussion on the
 matter, though maybe someone put something up on the wiki.

[...]

I second all of the above.

In generic code (all range-based code is generic, since otherwise
there's no point in using the range abstraction, you should just use the
concrete type directly), you cannot assume anything about what .empty,
.front, and .popFront() might do apart from what is specified in the
range API.

Therefore, since .front and .popFront have unspecified behaviour if
.empty is false, the only correct way to write range-based code is to
call .empty first to determine whether it's OK to call .front and
.popFront.

Furthermore, it doesn't make sense to restrict .front to be called only
once, since otherwise we should simplify the API by merging .front and
.popFront and have the same function both return the current element and
advance to the next element. The fact that they were designed to be
separate calls implies that .front can be called multiple times.

Of course, for efficiency purposes range-based code (esp. Phobos code)
should try their best to only call .front once. But it should be
perfectly permissible to call .front multiple times.

Lastly, since the range API is an *abstraction*, it should not dictate
any concrete implementation details such as whether .empty can do
non-trivial initialization work. Properly-written range-based code
should be able to handle all possible implementations of the range API,
including those that do non-trivial work in .empty.

Of course, that doesn't mean it's a *good* idea for .empty to do
non-trivial work. The code is probably cleaner if that work is done
somewhere else (like in the ctor or the function that returns the
range). But that's an issue for the implementer of the range, not the
consumers. The consumers of the range should not depend on one behaviour
or the other, so that they will still work correctly when given a range
that, for whatever reason, needs to do non-trivial work in .empty.


T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com

May 13 2014