digitalmars.D - Forward ranges in Phobos v2

reply Dukc <ajieskola gmail.com> writes:
It seems we're going to do Phobos v2. I don't know whether it 
will be the idea Andrei just published or something else, but 
anyway.

So I think it's time to discuss whether we want to change the 
definition of forward ranges. It seems to be the 
[consensus](https://forum.dlang.org/thread/ztgtmumenampiobbuiwd@forum.dlang.org?page=1) 
that its current API is error-prone to use correctly.

For example, we could define that a range that does not offer 
`save()` is still a forward range if it can be copy constructed, 
and that copy constructors for reference ranges would be 
forbidden. But one big problem with that: classes can not be 
ranges then.

At the very least, I think we must take the stance that all 
Phobos v2 ranges must be save-on-copy where the documentation 
does not explicitly declare them reference ranges.

Ideas?
Nov 01 2021
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
Based on what Andrei said in the past, there will no longer be .save (which has always been a point of confusion in the API and interacts poorly with the type system), but forward ranges will be based on by-value semantics, and input ranges on by-reference semantics. So if it's a by-value type, it's a forward range; if it's a by-reference type, it's an input range.

If you have a struct but it only supports input range semantics, then you could pass it by ref or pass a pointer to it.

If you have a class but it supports forward range semantics, then wrap it in a struct with a copy ctor that saves state in the copy ctor.

T

-- Questions are the beginning of intelligence, but the fear of God is the beginning of wisdom.
Nov 01 2021
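A minimal sketch of the by-value versus by-reference distinction described above, assuming that a plain struct copy takes over the role of today's `.save`. `Countdown` and `Stream` are hypothetical illustration types, not Phobos APIs.

```d
// Hypothetical illustration types; not part of Phobos.
// Run the checks with: dmd -unittest -main -run example.d

// By-value type: copying the struct duplicates the iteration state.
struct Countdown
{
    int n;
    bool empty() { return n == 0; }
    int front() { return n; }
    void popFront() { --n; }
}

// By-reference type: every copy of the reference shares one iteration state.
final class Stream
{
    int n;
    this(int n) { this.n = n; }
    bool empty() { return n == 0; }
    int front() { return n; }
    void popFront() { --n; }
}

unittest
{
    auto a = Countdown(3);
    auto b = a;              // plain copy, acts like `a.save` does today
    b.popFront();
    assert(a.front == 3);    // the original is unaffected
    assert(b.front == 2);

    auto s = new Stream(3);
    auto t = s;              // copies only the reference
    t.popFront();
    assert(s.front == 2);    // advancing one advances both
}
```

Under this reading, the struct behaves as a forward range and the class as an input range without either type declaring anything extra.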
next sibling parent Dukc <ajieskola gmail.com> writes:
Simple and effective! I think that's what I'm voting for.
Nov 01 2021
prev sibling parent reply Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Monday, 1 November 2021 at 14:13:00 UTC, H. S. Teoh wrote:
 If you have a struct but it only supports input range 
 semantics, then you could pass it by ref or pass a pointer to 
 it.

 If you have a class but it supports forward range semantics, 
 then wrap it in a struct with a copy ctor that saves state in 
 the copy ctor.


 T
What about random access ranges?

If you don't care about the implementation of the range, only that it is a forward range, how would you express that as a parameter type for a method?

Best regards,
Alexandru.
Nov 01 2021
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 01, 2021 at 11:34:20PM +0000, Alexandru Ermicioi via Digitalmars-d
wrote:
 What about random access range?
Presumably, it will be a struct with additional methods needed for random access (opIndex, et al).
 If you don't care for the implementation of the range but only being a
 forward range, how would you express it as a parameter for a method?
Good question, ask Andrei. ;-)

Presumably, if we standardize on structs/classes, it could be as simple as:

	auto myFunc(R)(R range) if (is(R == struct)) {
		... // forward range
	}

	auto myFunc(R)(R range) if (is(R == class)) {
		... // input range
	}

But given that Andrei thinks it's a mistake for ranges to be implemented as classes, I've no idea.

T

-- Don't modify spaghetti code unless you can eat the consequences.
Nov 01 2021
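A compilable sketch of the struct/class dispatch suggested above. All names here (`rangeKind`, `Numbers`, `Lines`) are hypothetical, and the bodies only report which overload was chosen.

```d
// Hypothetical names throughout; a sketch of dispatching on the kind of type.
// Run the checks with: dmd -unittest -main -run example.d

string rangeKind(R)(R range) if (is(R == struct)) { return "forward"; }
string rangeKind(R)(R range) if (is(R == class))  { return "input"; }

struct Numbers
{
    int[] data;
    bool empty() { return data.length == 0; }
    int front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

final class Lines
{
    string[] pending;
    this(string[] pending) { this.pending = pending; }
    bool empty() { return pending.length == 0; }
    string front() { return pending[0]; }
    void popFront() { pending = pending[1 .. $]; }
}

unittest
{
    assert(rangeKind(Numbers([1, 2, 3])) == "forward");
    assert(rangeKind(new Lines(["a", "b"])) == "input");

    // Built-in slices are neither structs nor classes, so neither
    // overload matches them in this sketch.
    static assert(!__traits(compiles, rangeKind([1, 2, 3])));
}
```

The last check shows one gap in the simple constraint: plain arrays would need their own overload or a broader trait.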
parent reply Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Monday, 1 November 2021 at 23:46:07 UTC, H. S. Teoh wrote:
 Good question, ask Andrei. ;-)
Well, I hope he will check this thread and comment on it.
 Presumably, if we standardize on structs/classes, it could be 
 as simple as:

 	auto myFunc(R)(R range) if (is(R == struct)) {
 		... // forward range
 	}

 	auto myFunc(R)(R range) if (is(R == class)) {
 		... // input range
 	}

 But given that Andrei thinks it's a mistake for ranges to be 
 implemented as classes, I've no idea.
That would work if you have templated functions, but what if you need it in an interface?

If class-based ranges are to stay in the D language, I doubt it will be possible to avoid a .save function completely. At least for range interfaces, saving a forward range will have to be expressed through a method, such as .save.
Nov 01 2021
parent reply Paul Backus <snarwin gmail.com> writes:
You can always wrap a class/interface method in a struct that calls .save on copy:

struct ClassRangeWrapper(T)
    if (is(T == class) || is(T == interface))
{
    T payload;
    alias payload this;

    this(ref inout typeof(this) other) inout
    {
        this.payload = other.payload.save;
    }
}

This way, range algorithms don't need to know about .save, so it can be removed from the official forward range requirements even though it still exists as an implementation detail.
Nov 01 2021
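A self-contained sketch of the save-on-copy wrapper idea, under a few simplifying assumptions: `Counter` and `SaveOnCopy` are hypothetical names, the copy constructor is the plain mutable variant rather than the `inout` one, and the range primitives are forwarded explicitly instead of through `alias this`.

```d
// Hypothetical names; a simplified variant of the wrapper idea above.
// Run the checks with: dmd -unittest -main -run example.d

// A class-based forward range with an explicit `save` method.
final class Counter
{
    int current, limit;
    this(int current, int limit) { this.current = current; this.limit = limit; }
    bool empty() { return current >= limit; }
    int front() { return current; }
    void popFront() { ++current; }
    Counter save() { return new Counter(current, limit); }
}

// Struct wrapper whose copy constructor calls `save`, so a plain copy
// of the wrapper yields an independent iteration state.
struct SaveOnCopy(T) if (is(T == class) || is(T == interface))
{
    T payload;

    this(T payload) { this.payload = payload; }

    // Copy constructor (mutable-only, to keep the sketch short).
    this(ref SaveOnCopy other) { payload = other.payload.save; }

    bool empty() { return payload.empty; }
    auto front() { return payload.front; }
    void popFront() { payload.popFront(); }
}

unittest
{
    auto a = SaveOnCopy!Counter(new Counter(0, 3));
    auto b = a;                       // invokes the copy constructor
    b.popFront();
    assert(a.front == 0);             // `a` kept its own position
    assert(b.front == 1);
    assert(a.payload !is b.payload);  // two distinct class instances
}
```

Algorithms taking `SaveOnCopy!Counter` by value never call `.save` themselves; the call happens only inside the wrapper.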
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/1/21 8:13 PM, Paul Backus wrote:
You can always wrap a class/interface method in a struct that calls .save on copy:
Exactly. No need to support class ranges - simple wrappers can do everything class-like indirection does. Thanks.
Nov 01 2021
parent reply Dukc <ajieskola gmail.com> writes:
On Tuesday, 2 November 2021 at 02:45:11 UTC, Andrei Alexandrescu 
wrote:
 Exactly. No need to support class ranges - simple wrappers can 
 do everything class-like indirection does. Thanks.
Trying to write up a plan based on that one, so you can correct and/or spot weaknesses:

- Stuff in `std.v2.range.interfaces` and `std.v2.concurrency.Generator` will continue to be ranges from Phobos v1 viewpoint but not from Phobos v2 viewpoint.
- We add a function, let's say `std.range.valueRange`, in both versions, that will convert any v1 forward range to a value range that works in both versions.
- We also add some other function, or perhaps a flag to the aforementioned one, that can convert any v1 input range to a v2 input range. `valueRange` as default must not accept non-forward ranges, because then it cannot guarantee that the result will be a value range.
- We need some way to prevent Phobos v2 from using v1 reference forward ranges accidentally. Making v2 `isInputRange` an automatic negative for classes can suffice for now.
- Phobos v2 ranges should still continue to provide the `save` method so they can be passed to v1 ranges. We also might provide an `assumeValueRange` function that will add the `save` method on top of any existing input range, assuming value semantics and making it a forward range from v1 perspective.
Nov 02 2021
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Nov 02, 2021 at 12:11:24PM +0000, Dukc via Digitalmars-d wrote:
Trying to write up a plan based on that one, so you can correct and/or spot weaknesses - stuff in `std.v2.range.interfaces` and `std.v2.concurrency.Generator` will continue to be ranges from Phobos v1 viewpoint but not from Phobos v2 viewpoint.
Why is this necessary? I thought we're getting rid of std.range.interfaces.
 - We add a function, let's say `std.range.valueRange`, in both
   versions, that will convert any v1 forward range to a value range
   that works in both versions.
What's a value range?
 - We also add some other function, or perhaps a flag to aforementioned
   one, that can convert any v1 input ranges to v2 input range.
   `valueRange` as default must not accept non-forward ranges, because
   then it cannot guarantee that the result will be a value range.
[...] Interesting idea. So basically a shim for easy translation of v1-based code to v2-based code? That would be nice for gradual migration. It would have to exclude certain incompatible things like autodecoded strings, though. Otherwise it will result in a mess.
 - Phobos v2 ranges should still continue to provide the `save` method
   so they can be passed to v1 ranges.
I'm not sure this is such a good idea, because v2 ranges may have fundamental incompatibilities with v1 algorithms, e.g., a v2 string range (non-autodecoded) being passed to a v1 algorithm (autodecoded) will probably produce the wrong results, likely silently, which is bad. Now imagine mixing v1 algorithms and v2 algorithms in the same UFCS chain (via shims) over a string, and you're in for a heck of a time trying to debug the resulting mess.

IMO it's better to just keep v1 code distinct from v2 code, and migrate v1-based code to v2-based code on a case-by-case basis. In most cases, you could probably just change `import std` to `import stdv2` and it should work. In cases involving e.g. autodecoding you'd add an adapter or two in your UFCS code, then change to `import stdv2` and that should fix it. For the rest of the cases, just leave `import std` as-is, and existing code should still function as before, with existing semantics, without any surprise breakages.

T

-- That's not a bug; that's a feature!
Nov 02 2021
next sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of 
 std.range.interfaces.
It is actually really, really, useful. If phobos didn't offer it, someone would reinvent it anyway. (In fact, there's a lot of cases where using them is more efficient than generating more and more code...)
Nov 02 2021
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
I find it very useful as well, but according to Andrei, v2 will get rid of class-based ranges, and if those are wanted we should use a struct wrapper instead.

In any case, if struct wrappers are the way to go, then we better have a standard way of constructing them. It just won't be the same thing as std.range.interfaces.

T

-- They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Nov 02 2021
parent Adam D Ruppe <destructionator gmail.com> writes:
On Tuesday, 2 November 2021 at 19:44:02 UTC, H. S. Teoh wrote:
 I find it very useful as well, but according to Andrei, v2 will 
 get rid of class-based ranges, and if those are wanted we 
 should use a struct wrapper instead.
Well, you'd keep the class for the actual wrapper, then the struct is just a thin thing on top of that.

interface IInputRange {
    // yada
}

struct InputRange {
    IInputRange c;
}

So you'd wrap it like that. Then for forward range make the copy constructor call the clone/save/deepCopy/whatever method on the class.

Then the std.algorithm things just take InputRange. A few details to work out but it is a simple enough thing to do.
Nov 02 2021
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-02 15:32, Adam D Ruppe wrote:
It is actually really, really, useful. If phobos didn't offer it, someone would reinvent it anyway. (In fact, there's a lot of cases where using them is more efficient than generating more and more code...)
Yah, polymorphism has its place. The only problem is passing around reference ranges. They should have a thin struct wrapper that carries the proper copy semantics.
Nov 02 2021
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
Yes, so we need a standard way of constructing such wrappers. Possibly an addition to stdv2.range.interfaces? Or maybe just have the wrapper constructors return the constructed polymorphic range pre-(shrink)wrapped. :-)

T

-- People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
Nov 02 2021
prev sibling parent Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Tuesday, 2 November 2021 at 20:17:06 UTC, Andrei Alexandrescu 
wrote:
Yah, polymorphism has its place. The only problem is passing around reference ranges. They should have a thin struct wrapper that carries the proper copy semantics.
So, if the forward range interface (from std.range.interfaces) is to be kept in Phobos, it should provide a .save method that can be used in place of a copy constructor.

Then it is possible to have a single wrapper struct that turns it into a value type (i.e. one that behaves the same as a struct forward range) by calling .save when the wrapper's copy constructor is invoked. The wrapper could also be used as a method parameter type, which forces callers to use it instead of occasionally forgetting to wrap the range.
Nov 03 2021
prev sibling parent reply Dukc <ajieskola gmail.com> writes:
On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
Trying to write up a plan based on that one, so you can correct and/or spot weaknesses - stuff in `std.v2.range.interfaces` and `std.v2.concurrency.Generator` will continue to be ranges from Phobos v1 viewpoint but not from Phobos v2 viewpoint.
Why is this necessary? I thought we're getting rid of std.range.interfaces.
I guess we could write a more advanced alternative, but I'd prefer to keep the range interfaces around until someone does, to avoid scope creep. The downside is going to be that Phobos v2 cannot use the interfaces directly as they aren't ranges anymore, but that's what the `valueRange` and the "other function" I mentioned are for.
 What's a value range?
Opposite of a reference range - copying implies `save()`.
 Interesting idea. So basically a shim for easy translation of 
 v1-based code to v2-based code?  That would be nice for gradual 
 migration.  It would have to exclude certain incompatible 
 things like autodecoded strings, though. Otherwise it will 
 result in a mess.
I propose that the shim will autodecode if imported from `v1` (if we even add it to `v1`) but not if imported from `v2` - just like the rest of the range-accepting functions.
 - Phobos v2 ranges should still continue to provide the `save` 
 method
   so they can be passed to v1 ranges.
[...] I'm not sure this is such a good idea, because v2 ranges may have fundamental incompatibilities with v1 algorithms, e.g., a v2 string range (non-autodecoded) being passed to a v1 algorithm (autodecoded) will probably produce the wrong results, likely silently, which is bad.
I agree that it's better to avoid function chains like that if easily possible. But the underlying rule is simple and unambiguous: a Phobos v1 function will autodecode a character array, a v2 function will not. If the character range is anything other than a plain array, they behave identically: the decoding or lack thereof depends on the range itself.

It's not worth making interoperability difficult just because of that, IMO. If users start having problems, they can voluntarily avoid mixing the character-handling `v1` and `v2` ranges - they still enjoy easier interoperability with other ranges.
Nov 02 2021
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-02 17:44, Dukc wrote:
 
 What's a value range?
Opposite of a reference range - copying implies `save()`.
Yah, one simple improvement we could make is to assume all forward ranges copy their iteration state when copying the range. Then input ranges do NOT do that, i.e. all copies of an input range refer to the same stream and iterate it together (advancing one advances all). The differentiation can be made with a nested enum tag:

struct MyInputRange {
    enum inputRangeTag = true;
    ...
}

Client code can inspect R.inputRangeTag to figure whether the range is input (if present) or forward (if missing).
Nov 02 2021
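A small sketch of how client code might inspect such a tag. Only `inputRangeTag` comes from the post above; `isTaggedInputRange`, `strategy`, `SocketLines` and `ArrayView` are hypothetical names, and the range primitives are left out because only the tag matters here.

```d
// Hypothetical trait and types; only `inputRangeTag` is from the thread.
// Run the checks with: dmd -unittest -main -run example.d

// True when the type declares the nested tag, whatever its value.
enum bool isTaggedInputRange(R) = __traits(hasMember, R, "inputRangeTag");

struct SocketLines
{
    enum inputRangeTag = true;
    // range primitives elided
}

struct ArrayView
{
    int[] data;
    // range primitives elided
}

// Compile-time branch on the tag: present means input, missing means forward.
string strategy(R)(R range)
{
    static if (isTaggedInputRange!R)
        return "single pass";
    else
        return "multi pass";
}

unittest
{
    static assert(isTaggedInputRange!SocketLines);
    static assert(!isTaggedInputRange!ArrayView);
    assert(strategy(SocketLines()) == "single pass");
    assert(strategy(ArrayView([1, 2])) == "multi pass");
}
```

Note that in this scheme any type without the tag lands in the forward-range branch by default.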
parent reply Paul Backus <snarwin gmail.com> writes:
Not sure this is the best idea--it means new-style algorithms will silently treat old-style input ranges as though they were forward ranges, which could lead to incorrect behavior at runtime. If we are going to make incompatible changes to the range API, we should do it in such a way that version mismatches are caught at compile time.
Nov 02 2021
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Nov 02, 2021 at 11:07:08PM +0000, Paul Backus via Digitalmars-d wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
[...]
 The differentiation can be made with a nested enum tag:
 
 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }
 
 Client code can inspect R.inputRangeTag to figure whether the range
 is input (if present) or forward (if missing).
Not sure this is the best idea--it means new-style algorithms will silently treat old-style input ranges as though they were forward ranges, which could lead to incorrect behavior at runtime. If we are going to make incompatible changes to the range API, we should do it in such a way that version mismatches are caught at compile time.
The problem with manually-added tags of this sort is that people forget to do it, and that leads to trouble. Preferably, it should be something already implicit in the range type itself, that does not require additional effort to tag.

I'm kinda toying with the idea of struct == forward range, class == input range: the difference is inherent in the type itself and requires no further effort beyond the decision to use a by-value type vs. a by-reference type, which coincides with the decision to make something an input range or a forward range.

T

-- Political correctness: socially-sanctioned hypocrisy.
Nov 02 2021
parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 3 November 2021 at 00:01:33 UTC, H. S. Teoh wrote:
 The problem with manually-added tags of this sort is that 
 people forget to do it, and that leads to trouble.  Preferably, 
 it should be something already implicit in the range type 
 itself, that does not require additional effort to tag.

 I'm kinda toying with the idea of struct == forward range, 
 class == input range: the difference is inherent in the type 
 itself and requires no further effort beyond the decision to 
 use a by-value type vs. a by-reference type, which coincides 
 with the decision to make something an input range or a forward 
 range.
Having input ranges implement `next` and forward ranges implement `head` and `tail` would also make them easy to distinguish.
Nov 02 2021
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
That would work too, but makes the input range API no longer a subset of the forward range API. This would lead to code duplication in algorithms that only require an input range but could work equally well with a forward range.

T

-- Once bitten, twice cry...
Nov 02 2021
parent reply Paul Backus <snarwin gmail.com> writes:
Not necessarily. It's possible to implement `next` as a UFCS function for mutable forward ranges using the `head`/`tail` API:

auto next(R)(ref R r)
    if (isForwardRangeV2!R && isMutable!R)
{
    alias E = ElementType!R;
    if (r.empty)
        return none!E();
    else
    {
        auto result = some(r.head);
        r = r.tail;
        return result;
    }
}
Nov 02 2021
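A runnable approximation of the `next`-from-`head`/`tail` idea, with some substitutions that are assumptions of this sketch: `std.typecons.Nullable` stands in for the hypothetical `Option`/`some`/`none`, the `hasHeadTail` trait stands in for `isForwardRangeV2`, and `Slice` is a made-up range type.

```d
// Sketch only: Nullable replaces the thread's hypothetical Option type.
// Run the checks with: dmd -unittest -main -run example.d
import std.typecons : Nullable, nullable;

// Hypothetical v2-style forward range exposing `empty`, `head` and `tail`.
struct Slice
{
    int[] data;
    bool empty() { return data.length == 0; }
    int head() { return data[0]; }
    Slice tail() { return Slice(data[1 .. $]); }
}

// Stand-in for the hypothetical `isForwardRangeV2` trait.
enum bool hasHeadTail(R) = __traits(hasMember, R, "empty")
    && __traits(hasMember, R, "head")
    && __traits(hasMember, R, "tail");

// `next` built on top of `head`/`tail`: returns the current element
// (or a null Nullable when exhausted) and advances the range in place.
auto next(R)(ref R r) if (hasHeadTail!R)
{
    alias E = typeof(r.head());
    if (r.empty)
        return Nullable!E.init;
    auto result = nullable(r.head);
    r = r.tail;
    return result;
}

unittest
{
    auto r = Slice([10, 20]);
    assert(r.next().get == 10);
    assert(r.next().get == 20);
    assert(r.next().isNull);   // exhausted
}
```

An algorithm written only against `next` would then accept both kinds of range, which is the point being made about avoiding duplication.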
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
OK, so the signature of next for all ranges is:

Option!(ElementType!R) next(Range)(ref Range);

Is that correct?
Nov 03 2021
parent reply Paul Backus <snarwin gmail.com> writes:
More precisely, to use the Phobos convention: `is(ReturnType!((Range r) => r.next) == Option!(ElementType!R))`. So, `next` could be a function, a property, or a member variable, and it does not necessarily require an lvalue to call (just like `front` today).
Nov 03 2021
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
We've considered this way back when. I'm talking like 2006. It was like this:

T next(Range)(ref Range r, ref bool done);

The main problem is that iterating such a forward range would entail a copy of each element of the range. This is not scalable in general. This is a showstopper.
Nov 03 2021
parent reply Paul Backus <snarwin gmail.com> writes:
If we want to avoid copying, we can have `next` return a `Ref!T` in the case where the forward range has lvalue elements:

struct Ref(T)
{
    T* ptr;

    ref inout(T) deref() inout
    {
        return *ptr;
    }

    alias deref this;
}

I've tested some simple uses of this wrapper on run.dlang.io, and it seems like DIP 1000 is good enough to make it work in safe code.

If "returns either `T` or `Ref!T`" sounds like a suspect design for an API, consider that it is basically the same thing as an `auto ref` return value--just with the distinction between ref and non-ref brought inside the type system.
Nov 03 2021
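A brief usage sketch of the `Ref!T` wrapper from the post above, showing that access goes through the pointer instead of copying the element. `Big` is a hypothetical element type, and the example is plain `@system` code, so it says nothing about the DIP 1000 safety question.

```d
// `Ref` is reproduced from the post above; `Big` is a hypothetical type.
// Run the checks with: dmd -unittest -main -run example.d

struct Ref(T)
{
    T* ptr;

    ref inout(T) deref() inout
    {
        return *ptr;
    }

    alias deref this;
}

// An element that would be costly to copy on every iteration step.
struct Big
{
    int[64] payload;
    int id;
}

unittest
{
    auto items = new Big[2];
    auto r = Ref!Big(&items[0]);

    r.id = 42;                         // forwarded through `alias this`
    assert(items[0].id == 42);         // the original element was modified
    assert(&r.deref() is &items[0]);   // same storage, no copy was made
}
```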
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/3/21 11:25 PM, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 17:41:10 UTC, Andrei Alexandrescu wrote:
 On 2021-11-03 12:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 15:40:41 UTC, Andrei Alexandrescu 
 wrote:
 On 2021-11-02 20:38, Paul Backus wrote:
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }
OK, so the signature of next for all ranges is: Option!(ElementType!R) next(Range)(ref Range); Is that correct?
More precisely, to use the Phobos convention: `is(ReturnType!((Range r) => r.next) == Option!(ElementType!R))`. So, `next` could be a function, a property, or a member variable, and it does not necessarily require an lvalue to call (just like `front` today).
We've considered this way back when. I'm talking like 2006. It was like this: T next(Range)(ref Range r, ref bool done); The main problem is that iterating such a forward range would entail a copy of each element of the range. This is not scalable in general. This is a showstopper.
If we want to avoid copying, we can have `next` return a `Ref!T` in the case where the forward range has lvalue elements:

struct Ref(T)
{
    T* ptr;

    ref inout(T) deref() inout
    {
        return *ptr;
    }
    alias deref this;
}

I've tested some simple uses of this wrapper on run.dlang.io, and it seems like DIP 1000 is good enough to make it work in @safe code.

If "returns either `T` or `Ref!T`" sounds like a suspect design for an API, consider that it is basically the same thing as an `auto ref` return value--just with the distinction between ref and non-ref brought inside the type system.
That was on the table, too, in the form of a raw pointer. I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc). Overall this seems to create more problems than it solves.
Nov 03 2021
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu 
wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 
 If we want to avoid copying, we can have `next` return a 
 `Ref!T` in the case where the forward range has lvalue 
 elements:
 
 struct Ref(T)
 {
      T* ptr;
 
      ref inout(T) deref() inout
      {
          return *ptr;
      }
      alias deref this;
 }
 
 I've tested some simple uses of this wrapper on run.dlang.io, 
 and it seems like DIP 1000 is good enough to make it work in 
 @safe code.
 
 If "returns either `T` or `Ref!T`" sounds like a suspect 
 design for an API, consider that it is basically the same 
 thing as an `auto ref` return value--just with the distinction 
 between ref and non-ref brought inside the type system.
That was on the table, too, in the form of a raw pointer. I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc). Overall this seems to create more problems than it solves.
I'd be curious to see any examples of such problems you have in mind. As far as I'm aware, no special effort should be required to make this safe, aside from enabling -preview=dip1000 (which, granted, is still a work in progress).
Nov 03 2021
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/21 12:43 AM, Paul Backus wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 If we want to avoid copying, we can have `next` return a `Ref!T` in 
 the case where the forward range has lvalue elements:

 struct Ref(T)
 {
      T* ptr;

      ref inout(T) deref() inout
      {
          return *ptr;
      }
      alias deref this;
 }

 I've tested some simple uses of this wrapper on run.dlang.io, and it 
 seems like DIP 1000 is good enough to make it work in @safe code.

 If "returns either `T` or `Ref!T`" sounds like a suspect design for 
 an API, consider that it is basically the same thing as an `auto ref` 
 return value--just with the distinction between ref and non-ref 
 brought inside the type system.
That was on the table, too, in the form of a raw pointer. I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc). Overall this seems to create more problems than it solves.
I'd be curious to see any examples of such problems you have in mind. As far as I'm aware, no special effort should be required to make this safe, aside from enabling -preview=dip1000 (which, granted, is still a work in progress).
Pointers are problematic because of aliasing and lifetime (what if the pointer survives the data structure it points into). So the `Ref` struct needs to be qualified appropriately with `scope`. So yes, DIP1000 would need to be tight.

Usability is another matter that hasn't been quite looked at. Once you have a scoped pointer wrapper, what can and what can't you do with it easily? I'm not very sure.

Alias this is just poorly done. I think we shouldn't base a fundamental API on it.

Anyway, I'm cautiously optimistic. At the very least this should be explored.

Note that the whole thing still doesn't address unbuffered ranges. There must be a buffer of at least one element somewhere. That's... problematic.
Nov 04 2021
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu 
wrote:
 Usability is another matter that hasn't been quite looked at. 
 Once you have a scoped pointer wrapper, what can and what can't 
 you do with it easily? I'm not very sure.

 Alias this is just poorly done. I think we shouldn't base a 
 fundamental API on it.
Both good points. It will take some experimentation to find out where the rough edges of this approach are, and whether they can be adequately sanded down.
 Anyway, I'm cautiously optimistic. At the very least this 
 should be explored.

 Note that the whole thing still doesn't address unbuffered 
 ranges. There must be a buffer of at least one element 
 somewhere. That's... problematic.
Unbuffered ranges will return `Option!T` from `next`, rather than `Option!(Ref!T)`. Again, this is the same distinction we already have between rvalue `front` and lvalue `front`, so I don't think the inconsistency is a problem, as long as we can make `Ref!T` function as a subtype of `T` (either via `alias this` or some more principled mechanism).
Nov 04 2021
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between rvalue 
 `front` and lvalue `front`
That reminds me, we should drop that like a bad habit too :o). Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):

- At least at some point `empty` did not have to return bool, just something convertible to bool. Like immutable(bool).
- For a while we had a lively discussion about length returning ulong instead of size_t (relevant on 32-bit).
- front could return pretty much what it damn well pleased, including qualified data, rvalues vs lvalues, noncopyable stuff, etc.
- Thinking how inout interacts with everything ranges is just depressing.
- I seem to recall there was at least one popFront that returned something meaningful. (Maybe that's not too disruptive.)

Based on past experience we could and should simplify the range interface in places where genericity has little value and the implementation effort is high.
Nov 04 2021
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between rvalue
 `front` and lvalue `front`
That reminds me, we should drop that like a bad habit too :o). Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):
Yeah, we need to get rid of useless genericity, and also exactly what is expected of range operations should be stated clearly and unambiguously in the API docs. The current range API suffers from insufficient clarity, so many such cases went "under the radar" and inevitably ended up being implemented when some kind soul decided that it would be nice to support this or that niche case.
 - At least at some point `empty` did not have to return bool, just
 something convertible to bool. Like immutable(bool).
Yeah, .empty should return bool, and only bool. Not immutable(bool), not something that alias this to bool, none of that sort. Also, the spec should specify precisely whether .empty must be a function (and whether it should be a member function, a free function, or both), or it's allowed to be a member variable. Currently in my own code I have a few cases where .empty is a variable rather than a function. It hasn't run into any problems yet so far, but things like this must be explicitly stated, otherwise somebody will inevitably write code that assumes one way or the other, and break things for no good reason.
 - For a while we had a lively discussion about length returning ulong
 instead of size_t (relevant on 32-bit).
Whichever way we decide, this should be specified clearly and not left up to interpretation.
 - front could return pretty much what it damn well pleased, including
 qualified data, rvalues vs lvalues, noncopyable stuff, etc.
Yeah, this has been especially troublesome. I think we should specify exactly what type(s) and qualifier(s) are permitted to be returned from .front. Don't forget transient values returned by .front that are invalidated by the next call to .popFront (e.g., std.stdio.File.byLine, which reuses the line buffer). The range API needs to explicitly state whether .popFront is allowed to do this, and if it is allowed, range algorithms that attempt to cache .front past the next invocation to .popFront must be rewritten. (This used to be a pretty big problem, but I think we've fixed most of the cases in Phobos by now. But it still turns its ugly head up every now and then in user code that makes wrong assumptions about the lifetime of the value returned by .front.)
 - Thinking how inout interacts with everything ranges is just
 depressing.
inout is the source of all kinds of nastiness in the language. It's a cute hack that works for the trivial cases, but once you combine it with other language features it's a mess. Consider this:

    inout T myFunc(T)(inout T delegate(inout T t) dg, inout T u) {...}

Does inout apply to the return value of dg, dg itself, or both? How does it interact with the inout on the function's return value? How exactly does inout on t interact with the delegate's inout return, and how do they correlate with the inout of the outer function? This is just one of many cases of ambiguity; it's not hard to construct other examples. In short, it's a mess.

And don't forget that inout behaves like const inside the function body, but when passed as a template argument triggers a different instantiation (template bloat). And trying to work with inout in generic code where you have to deal with arbitrary incoming type qualifiers is an exercise in pain.

I think we should just flat out *not* support inout in ranges.
 - I seem to recall there was at least one popFront that returned
 something meaningful. (Maybe that's not too disruptive.)
It should be mandated by spec to return void.
 Based on past experience we could and should simplify the range
 interface in places where genericity has little value and the
 implementation effort is high.
+1. Plus, the *exact* expectations of the various range functions should be spelled out in clear, unambiguous terms. Such as ref or non-ref, const or mutable, function or member variable (or free function), transient .front or not, copyable or not, what exactly .popFront returns, etc.. There must be no room left for interpretation except where explicitly allowed. Leave any small detail unspecified, and we can almost be guaranteed to be bitten by it later. Best spell out the exact permitted function signatures and types with list of allowed qualifiers to leave no room for misinterpretation. T -- I am not young enough to know everything. -- Oscar Wilde
Nov 04 2021
parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:
 On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu 
 via Digitalmars-d wrote:
 On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between 
 rvalue `front` and lvalue `front`
That reminds me, we should drop that like a bad habit too :o). Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):
Yeah, we need to get rid of useless genericity, and also exactly what is expected of range operations should be stated clearly and unambiguously in the API docs. The current range API suffers from insufficient clarity, so many such cases went "under the radar" and inevitably ended up being implemented when some kind soul decided that it would be nice to support this or that niche case.
Sometimes genericity is a good thing. Take C++, where range for was originally specified in C++11 such that the begin and end iterators had to be the same type, which on the face of it seems to make sense. But then that was found to be overly constraining, and to make room for ranges, C++17 had to change the definition of the range for loop so that end only had to be comparable to begin and could be a different type.
 - At least at some point `empty` did not have to return bool, 
 just something convertible to bool. Like immutable(bool).
Yeah, .empty should return bool, and only bool. Not immutable(bool), not something that alias this to bool, none of that sort. Also, the spec should specify precisely whether .empty must be a function (and whether it should be a member function, a free function, or both), or it's allowed to be a member variable.
Similarly to what I said above, I don't think the spec should do this at all. Plasticity is what D is good at, and leaving it to "range.empty is a bool" is, IMHO, far better. I *love* not using parens for functions with no args and being able to use a function/variable/enum, then being able to change that and not have to touch the rest of the code at all.
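As a small illustration of the plasticity being described, the following sketch shows three spellings of `empty` (member variable, method, and `enum`) that the current `std.range.primitives.isInputRange` accepts. The struct names and the `firstOr` helper are hypothetical and exist only for this example.

```d
import std.range.primitives : isInputRange, isInfinite;

// `empty` as a plain member variable.
struct FieldEmpty
{
    bool empty;
    int front;
    void popFront() { empty = true; }
}

// `empty` as a member function, called without parentheses.
struct MethodEmpty
{
    int n;
    bool empty() const { return n <= 0; }
    int front() const { return n; }
    void popFront() { --n; }
}

// `empty` as a compile-time constant, which marks an infinite range.
struct EnumEmpty
{
    enum bool empty = false;
    int front = 0;
    void popFront() { ++front; }
}

static assert(isInputRange!FieldEmpty);
static assert(isInputRange!MethodEmpty);
static assert(isInputRange!EnumEmpty);
static assert(isInfinite!EnumEmpty);
static assert(!isInfinite!FieldEmpty);

// Hypothetical generic helper: the same source text works for all three,
// because `r.empty` and `r.front` are written without parentheses.
int firstOr(R)(R r, int fallback)
    if (isInputRange!R)
{
    return r.empty ? fallback : r.front;
}
```

The generic code never needs to know which spelling a given range chose, which is the property being defended here. Whether a v2 spec should enumerate the permitted spellings is the point under dispute in the replies.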
Nov 05 2021
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Nov 05, 2021 at 11:43:01AM +0000, Atila Neves via Digitalmars-d wrote:
 On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:
[...]
 Yeah, we need to get rid of useless genericity, and also exactly
 what is expected of range operations should be stated clearly and
 unambiguously in the API docs.  The current range API suffers from
 insufficient clarity, so many such cases went "under the radar" and
 inevitably ended up being implemented when some kind soul decided
 that it would be nice to support this or that niche case.
Sometimes genericity is a good thing. Take C++, where range for was originally specified in C++11 such that the begin and end iterators had to be the same type, which on the face of it seems to make sense. But then that was found to be overly constraining, and to make room for ranges, C++17 had to change the definition of the range for loop so that end only had to be comparable to begin and could be a different type.
Genericity is definitely a good thing -- when it doesn't lead to the slippery slope of ever-more-complicated convolutions in the code as a result of trying to cater to every unnatural use case. The whole point of the range abstraction is to *simplify* code; if simplicity and clarity of code is compromised because of genericity, then we have failed. [...]
 Yeah, .empty should return bool, and only bool.  Not
 immutable(bool), not something that alias this to bool, none of that
 sort.
 
 Also, the spec should specify precisely whether .empty must be a
 function (and whether it should be a member function, a free
 function, or both), or it's allowed to be a member variable.
Similarly to what I said above, I don't think the spec should do this at all. Plasticity is what D is good at, and leaving it to "range.empty is a bool" is, IMHO, far better. I *love* not using parens for functions with no args and being able to use a function/variable/enum, then being able to change that and not have to touch the rest of the code at all.
I disagree. The spec *should* explicitly state what .empty (or any other range method/identifier) is allowed to be. If you want more genericity, simply have the spec say ".empty may be either a method or a member field". This may seem trivial, but it's necessary to prevent things like some Phobos code assuming that .empty is always a method, and then failing when somebody passes in a range that has a field instead.

Also, on the user-facing side, it prevents spurious bug reports like "how come my custom-made range with non-copyable .empty masqueraded from a nested struct via alias this doesn't pass isInputRange?", which then prompts some well-meaning soul to implement support for this obscure case, thereby adding all kinds of weird fluff to Phobos that really doesn't belong there. We want to be able to say to such bug reports, "the spec says .empty can only be a method or a bool field, sorry we don't support stuff where .empty is a non-copyable wrapper object that uses alias this to implicitly convert to a value wrapper with an .opCast!bool that returns an immutable(bool) which can then be value-copied onto a bool".

Andrei has said many times that these kinds of obscure cases don't belong in Phobos. If some user wants static arrays to work with ranges, then just write `[]` and be done with it, instead of adding yet another useless feature to Phobos (which inevitably will cause some unexpected poor interaction with another obscure case, and we're stuck in the endless churn of accreting features in Phobos that make it harder to maintain yet do not actually make any progress in improving D code). If somebody wants .empty to be a wrapper struct that uses alias this and .opCast!bool to return an immutable(bool), just have them write a wrapper that uses a function .empty to return a bool.

The fact that user code ended up in such a tangled mess is a sign that something is wrong on *their* side; we should not be promoting bad code practices by supporting such monstrosities in Phobos; we should instead be triggering a compile error so that the user cleans up his act and writes better code.

T

-- Insanity is doing the same thing over and over again and expecting different results.
Nov 09 2021
prev sibling parent Atila Neves <atila.neves gmail.com> writes:
On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu 
wrote:
 On 11/4/21 12:43 AM, Paul Backus wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei 
 Alexandrescu wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 [...]
That was on the table, too, in the form of a raw pointer. I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc). Overall this seems to create more problems than it solves.
I'd be curious to see any examples of such problems you have in mind. As far as I'm aware, no special effort should be required to make this safe, aside from enabling -preview=dip1000 (which, granted, is still a work in progress).
Pointers are problematic because of aliasing and lifetime (what if the pointer survives the data structure it points into).
T* should mean infinite lifetime by default in safe code: where did you get that pointer to begin with? If a struct contains a T* within it, then scoped struct variables solve the lifetime issue that way. Aliasing, however, is a problem we still have. Which is why we can't currently write a safe vector.
 Note that the whole thing still doesn't address unbuffered 
 ranges. There must be a buffer of at least one element 
 somewhere. That's... problematic.
Yeah, I'm still wondering how to fix that.
Nov 05 2021
prev sibling parent reply Dukc <ajieskola gmail.com> writes:
On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu 
wrote:
 I think it can be made to work, but for lvalue ranges only, and 
 it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.
Yeah, I think we should keep the `front`/`popFront`/`empty` scheme. Not because it's necessarily better, but because there's always a high risk for scope creep and second-system effect when doing projects like Phobos v2. Even discarding autodecoding and `isForwardRange` will be a lot of work already, let's not bite more than we can swallow.
Nov 04 2021
parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 4 November 2021 at 10:45:25 UTC, Dukc wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei 
 Alexandrescu wrote:
 I think it can be made to work, but for lvalue ranges only, 
 and it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.
Yeah, I think we should keep the `front`/`popFront`/`empty` scheme. Not because it's necessarily better, but because there's always a high risk for scope creep and second-system effect when doing projects like Phobos v2. Even discarding autodecoding and `isForwardRange` will be a lot of work already, let's not bite more than we can swallow.
I agree that this is definitely not a v2 proposal--more like v3 or v4. But I do think it should be on the roadmap.
Nov 04 2021
parent Dukc <ajieskola gmail.com> writes:
On Thursday, 4 November 2021 at 12:59:52 UTC, Paul Backus wrote:
 I agree that this is definitely not a v2 proposal--more like v3 
 or v4. But I do think it should be on the roadmap.
Ah, that's more reasonable. Not saying I agree but at least I disagree much less.
Nov 04 2021
prev sibling next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Wednesday, 3 November 2021 at 00:18:59 UTC, Paul Backus wrote:
 Having input ranges implement `next` and forward ranges 
 implement `head` and `tail` would also make them easy to 
 distinguish.
There's an easier solution: require all v2 ranges to have the `inputRangeTag`. For forward ranges it must be set to false, but if the tag does not exist at all then it isn't a v2 range. If we go for this, I'd rename the tag to `isReferenceRange` though. It's going to require more manual usage of the `valueRange` wrapper though, as the v1 ranges around aren't going to be v2 ranges as often. What do you say?
Nov 03 2021
parent Paul Backus <snarwin gmail.com> writes:
On Wednesday, 3 November 2021 at 11:54:24 UTC, Dukc wrote:
 There's an easier solution: require all v2 ranges to have the 
 `inputRangeTag`. For forward ranges it must be set to false, 
 but if the tag does not exist at all then it isn't a v2 range.

 If we go for this, I'd rename the tag to `isReferenceRange` 
 though.

 It's going to require more manual usage of the `valueRange` 
 wrapper though, as the v1 ranges around aren't going to be v2 
 ranges as often.

 What do you say?
Sure, that would work. But that still leaves the issue of `const` ranges, which is the other thing `head` and `tail` are meant to address. I think if we are going to make incompatible changes to the range API, we might as well do a proper redesign that fixes all of the known issues at once. And until we do that (maybe in v3?), it is probably better to hold off on incompatible changes.
Nov 03 2021
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-02 20:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 00:01:33 UTC, H. S. Teoh wrote:
 The problem with manually-added tags of this sort is that people 
 forget to do it, and that leads to trouble.  Preferably, it should be 
 something already implicit in the range type itself, that does not 
 require additional effort to tag.

 I'm kinda toying with the idea of struct == forward range, class == 
 input range: the difference is inherent in the type itself and 
 requires no further effort beyond the decision to use a by-value type 
 vs. a by-reference type, which coincides with the decision to make 
 something an input range or a forward range.
Having input ranges implement `next` and forward ranges implement `head` and `tail` would also make them easy to distinguish.
What would be the signature of next? Would forward ranges also implement next? If not, that would mean algorithms for input ranges won't work for forward ranges.
Nov 03 2021
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?
Opposite of a reference range - copying implies `save()`.
Yah, one simple improvement we could make is to assume all forward ranges copy their iteration state when copying the range. Then input ranges do NOT do that, i.e. all copies of an input range refer to the same stream and iterate it together (advancing one advances all). The differentiation can be made with a nested enum tag:

struct MyInputRange
{
    enum inputRangeTag = true;
    ...
}

Client code can inspect R.inputRangeTag to figure whether the range is input (if present) or forward (if missing).
Not sure this is the best idea--it means new-style algorithms will silently treat old-style input ranges as though they were forward ranges, which could lead to incorrect behavior at runtime. If we are going to make incompatible changes to the range API, we should do it in such a way that version mismatches are caught at compile time.
Good point. Maybe have all ranges define that enum with values true and false respectively?
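One way the "every v2 range defines the tag" idea could look is sketched below. The trait names `hasRangeTag`, `isV2ReferenceRange` and `isV2ValueRange` are hypothetical, as are the three example structs. The point of the sketch is only that an untagged v1 range satisfies neither trait, so a version mismatch shows up as a failed constraint at compile time.

```d
import std.range.primitives : isInputRange;

// Hypothetical: true only when R declares a boolean `inputRangeTag`.
enum hasRangeTag(R) = is(typeof(R.inputRangeTag) == bool);

// Hypothetical v2 traits. A range without the tag matches neither.
template isV2ReferenceRange(R)
{
    static if (isInputRange!R && hasRangeTag!R)
        enum isV2ReferenceRange = R.inputRangeTag;
    else
        enum isV2ReferenceRange = false;
}

template isV2ValueRange(R)
{
    static if (isInputRange!R && hasRangeTag!R)
        enum isV2ValueRange = !R.inputRangeTag;
    else
        enum isV2ValueRange = false;
}

// Tagged as input-only: copies are assumed to share iteration state.
struct Stream
{
    enum inputRangeTag = true;
    bool empty;
    int front;
    void popFront() { empty = true; }
}

// Tagged as forward: copying is assumed to save the iteration state.
struct Counter
{
    enum inputRangeTag = false;
    enum bool empty = false;
    int front;
    void popFront() { ++front; }
}

// Old-style range with no tag at all.
struct Legacy
{
    bool empty;
    int front;
    void popFront() { empty = true; }
}

static assert(isV2ReferenceRange!Stream && !isV2ValueRange!Stream);
static assert(isV2ValueRange!Counter && !isV2ReferenceRange!Counter);
static assert(!isV2ReferenceRange!Legacy && !isV2ValueRange!Legacy);
```

The tag is still a promise made by the range author. Nothing here verifies that a type tagged `false` really copies its iteration state.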
Nov 03 2021
next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 3 November 2021 at 15:03:53 UTC, Andrei 
Alexandrescu wrote:
 [snip]

 Good point. Maybe have all ranges define that enum with values 
 true and false respectively?
Maybe I just haven't been following this discussion closely enough, but I'm a little confused by this. We have `__traits(hasMember, T, "inputRangeTag")` after all. There should be default behavior when that tag isn't there and more sophisticated behavior when it is (if it is needed).
Nov 03 2021
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-11-03 11:28, jmh530 wrote:
 On Wednesday, 3 November 2021 at 15:03:53 UTC, Andrei Alexandrescu wrote:
 [snip]

 Good point. Maybe have all ranges define that enum with values true 
 and false respectively?
Maybe I just haven't been following this discussion closely enough, but I'm a little confused by this. We have `__traits(hasMember, T, "inputRangeTag")` after all. There should be default behavior when that tag isn't there and more sophisticated behavior when it is (if it is needed).
Yah that's what I had in mind - hasMember is actually easier. I think the problem is that code that forgets to do the check may easily do the wrong thing.
Nov 03 2021
prev sibling next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/3/21 11:03 AM, Andrei Alexandrescu wrote:
 On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?
Opposite of a reference range - copying implies `save()`.
Yah, one simple improvement we could make is to assume all forward ranges copy their iteration state when copying the range. Then input ranges do NOT do that, i.e. all copies of an input range refer to the same stream and iterate it together (advancing one advances all). The differentiation can be made with a nested enum tag:

struct MyInputRange
{
    enum inputRangeTag = true;
    ...
}

Client code can inspect R.inputRangeTag to figure whether the range is input (if present) or forward (if missing).
Not sure this is the best idea--it means new-style algorithms will silently treat old-style input ranges as though they were forward ranges, which could lead to incorrect behavior at runtime. If we are going to make incompatible changes to the range API, we should do it in such a way that version mismatches are caught at compile time.
Good point. Maybe have all ranges define that enum with values true and false respectively?
Yes, this is what I was trying to point out in my other post. One thing that is possible is to change at least one of the methods (i.e. change the name of `front`, `popFront`, or `empty`), so it is easy to distinguish a v2 range from a v1 range. An enum works too, and I'd support that. For sure, you need an opt-in for forward ranges because input ranges are the most basic type.

Thinking about this some more, maybe an enum is better for another reason. One thing we use introspection for, but which can bite us, is to see if something supports a specific interface. But what happens when what we expect is not what actually happens? The result is usually very confusing messages, or introspection that doesn't result in what we expect it to (i.e. some wrapper of our expected forward range only ends up being an input range).

Making the check for what type of range it is separate from the actual code expectations would not only read more cleanly (and perform better, I think), it would push the error to the code itself. E.g. you define what you think is a random access range, you specify `enum rangeType = RangeType.RandomAccess` as a member, but forget to define `opIndex`. Instead of the range just being inferred as non-random-access in some other part of code, the compiler tells you `error, cannot call opIndex on type MyRange`, which gives you the exact error you need to fix the problem.

-Steve
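The declared-tag approach could be sketched roughly as follows. `RangeType`, the `rangeType` member, the `third` helper and both structs are hypothetical names for illustration. The algorithm's constraint looks only at the declared tag, so a range that promises random access but lacks `opIndex` fails inside the body at `r[2]` rather than being silently downgraded.

```d
// Hypothetical category tag declared by each range.
enum RangeType { Input, Forward, RandomAccess }

// Hypothetical algorithm: the constraint trusts the declared tag and
// does not probe for opIndex; the body simply uses indexing.
auto third(R)(R r)
    if (is(typeof(R.rangeType) == RangeType)
        && R.rangeType == RangeType.RandomAccess)
{
    return r[2];
}

// Declares random access and actually provides opIndex.
struct Good
{
    enum rangeType = RangeType.RandomAccess;
    int[] data;
    int opIndex(size_t i) { return data[i]; }
}

// Declares random access but forgets opIndex.
struct Forgot
{
    enum rangeType = RangeType.RandomAccess;
    int[] data;
}

static assert(__traits(compiles, third(Good.init)));

// The call matches the constraint, then fails at `r[2]` in the body,
// so the compiler error points at the missing indexing operation.
static assert(!__traits(compiles, third(Forgot.init)));
```

Calling `third(Forgot.init)` outside of `__traits(compiles, ...)` would report the failure at the indexing expression, which is the more direct diagnostic argued for above. The exact wording of that message depends on the compiler.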
Nov 03 2021
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/3/21 11:03 AM, Andrei Alexandrescu wrote:
 On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?
Opposite of a reference range - copying implies `save()`.
Yah, one simple improvement we could make is to assume all forward ranges copy their iteration state when copying the range. Then input ranges do NOT do that, i.e. all copies of an input range refer to the same stream and iterate it together (advancing one advances all). The differentiation can be made with a nested enum tag:

struct MyInputRange
{
    enum inputRangeTag = true;
    ...
}

Client code can inspect R.inputRangeTag to figure whether the range is input (if present) or forward (if missing).
Not sure this is the best idea--it means new-style algorithms will silently treat old-style input ranges as though they were forward ranges, which could lead to incorrect behavior at runtime. If we are going to make incompatible changes to the range API, we should do it in such a way that version mismatches are caught at compile time.
Good point. Maybe have all ranges define that enum with values true and false respectively?
Oh, and one more thing, if we are going to do a tag, a UDA is probably a more aesthetic and functional tag. e.g. it doesn't affect `__traits(allMembers, T)`. -Steve
Nov 03 2021
parent Adam D Ruppe <destructionator gmail.com> writes:
On Wednesday, 3 November 2021 at 15:59:40 UTC, Steven 
Schveighoffer wrote:
 Oh, and one more thing, if we are going to do a tag, a UDA is 
 probably a more aesthetic and functional tag. e.g. it doesn't 
 affect `__traits(allMembers, T)`.
Best way is to make the decoration also be a check. A UDA can do it:

---
template InputRange(Ty = void)
{
    static if(is(Ty == void))
        alias T = typeof(this);
    else
        alias T = Ty;

    static import std.range;
    static assert(std.range.isInputRange!T, "more informative message");

    enum isInputRange = true;
}

@InputRange!MyRange
struct MyRange
{
    bool empty;
    int front;
    //void popFront() {}

    //mixin InputRange;
}
---

That silly one works as either a mixin or a UDA, each with stylistic pros and cons, but the functionality benefit of a concrete check of intention and nice error messages before granting you the tag would legitimately be really quite nice.
Nov 03 2021
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/2/21 8:11 AM, Dukc wrote:

 - We also add some other function, or perhaps a flag to aforementioned 
 one, that can convert any v1 input ranges to v2 input range. 
 `valueRange` as default must not accept non-forward ranges, because then 
 it cannot guarantee that the result will be a value range.
How does phobos v2 view a non-class non-forward v1 range? This is the fundamental problem needing solving, because if you make copyability the defining trait, current v1 input-only ranges (e.g. `File.byLine`) are going to be miscategorized. -Steve
Nov 02 2021
parent Dukc <ajieskola gmail.com> writes:
On Wednesday, 3 November 2021 at 02:34:00 UTC, Steven 
Schveighoffer wrote:
 On 11/2/21 8:11 AM, Dukc wrote:

 - We also add some other function, or perhaps a flag to 
 aforementioned one, that can convert any v1 input ranges to v2 
 input range. `valueRange` as default must not accept 
 non-forward ranges, because then it cannot guarantee that the 
 result will be a value range.
How does phobos v2 view a non-class non-forward v1 range? This is the fundamental problem needing solving, because if you make copyability the defining trait, current v1 input-only ranges (e.g. `File.byLine`) are going to be miscategorized. -Steve
They are v2 input ranges. v2 input ranges are not required to be value ranges, but should be if they can (meaning, if their v1 equivalent would be a forward range). The reason I said that `valueRange` by default must not return a reference range is because it'd conflict with the function name.
Nov 03 2021