digitalmars.D - Forward ranges in Phobos v2

Dukc <ajieskola gmail.com> writes:

It seems we're going to do Phobos v2. I don't know whether it 
will be the idea Andrei just published or something else, but 
anyway.

So I think it's time to discuss: do we want to change the 
definition of forward ranges? It seems to be the 
[consensus](https://forum.dlang.org/thread/ztgtmumenampiobbuiwd@forum.dlang.org?page=1) 
that its current API is error-prone to use correctly.

For example, we could define that a range that does not offer 
`save()` is still a forward range if it can be copy constructed, 
and that copy constructors for reference ranges would be 
forbidden. But one big problem with that: classes can not be 
ranges then.

At the very least, I think we must take the stance that all 
Phobos v2 ranges must be save-on-copy where the documentation 
does not explicitly declare them reference ranges.

Ideas?

Nov 01 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Nov 01, 2021 at 01:51:32PM +0000, Dukc via Digitalmars-d wrote:
 It seems we're going to do Phobos v2. I don't know whether it will be
 the idea Andrei just published or something else, but anyway.
 
 So I think it's time to discuss: do we want to change the
 definition of forward ranges? It seems to be the
 [consensus](https://forum.dlang.org/thread/ztgtmumenampiobbuiwd@forum.dlang.org?page=1)
 that its current API is error-prone to use correctly.
 
 For example, we could define that a range that does not offer `save()`
 is still a forward range if it can be copy constructed, and that copy
 constructors for reference ranges would be forbidden. But one big
 problem with that: classes can not be ranges then.
 
 At the very least, I think we must take the stance that all Phobos v2
 ranges must be save-on-copy where the documentation does not
 explicitly declare them reference ranges.

[...]

Based on what Andrei said in the past, there will no longer be .save
(which has always been a point of confusion in the API and interacts
poorly with the type system), but forward ranges will be based on
by-value semantics, and input ranges on by-reference semantics. So if
it's a by-value type, it's a forward range; if it's a by-reference type,
it's an input range.

If you have a struct but it only supports input range semantics, then
you could pass it by ref or pass a pointer to it.

If you have a class but it supports forward range semantics, then wrap
it in a struct with a copy ctor that saves state in the copy ctor.
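
As a minimal sketch of that split (the `Counter` and `Stream` types below are hypothetical, made up purely for illustration, and not part of any Phobos proposal): copying a struct snapshots the iteration state, while copying a class reference aliases it.

```d
// Hypothetical types for illustration only; neither exists in Phobos.
struct Counter            // by-value: a copy gets its own iteration state
{
    int cur, end;
    bool empty() const { return cur >= end; }
    int front() const { return cur; }
    void popFront() { ++cur; }
}

final class Stream        // by-reference: every copy aliases the same state
{
    int cur, end;
    this(int cur, int end) { this.cur = cur; this.end = end; }
    bool empty() const { return cur >= end; }
    int front() const { return cur; }
    void popFront() { ++cur; }
}

unittest
{
    auto a = Counter(0, 3);
    auto b = a;           // independent copy, behaves like a saved range
    a.popFront();
    assert(a.front == 1 && b.front == 0);

    auto s = new Stream(0, 3);
    auto t = s;           // same object, so advancing one advances both
    s.popFront();
    assert(s.front == 1 && t.front == 1);
}
```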


T

-- 
Questions are the beginning of intelligence, but the fear of God is the
beginning of wisdom.

Nov 01 2021

Dukc <ajieskola gmail.com> writes:

On Monday, 1 November 2021 at 14:13:00 UTC, H. S. Teoh wrote:
 Based on what Andrei said in the past, there will no longer be 
 .save (which has always been a point of confusion in the API 
 and interacts poorly with the type system), but forward ranges 
 will be based on by-value semantics, and input ranges on 
 by-reference semantics. So if it's a by-value type, it's a 
 forward range; if it's a by-reference type, it's an input range.

 If you have a struct but it only supports input range 
 semantics, then you could pass it by ref or pass a pointer to 
 it.

 If you have a class but it supports forward range semantics, 
 then wrap it in a struct with a copy ctor that saves state in 
 the copy ctor.


 T

Simple and effective! I think that's what I'm voting for.

Nov 01 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Monday, 1 November 2021 at 14:13:00 UTC, H. S. Teoh wrote:
 If you have a struct but it only supports input range 
 semantics, then you could pass it by ref or pass a pointer to 
 it.

 If you have a class but it supports forward range semantics, 
 then wrap it in a struct with a copy ctor that saves state in 
 the copy ctor.


 T

What about random access range?

If you don't care about the range's implementation, only that it 
is a forward range, how would you express that as a method 
parameter?

Best regards,
Alexandru.

Nov 01 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Nov 01, 2021 at 11:34:20PM +0000, Alexandru Ermicioi via Digitalmars-d
wrote:
 On Monday, 1 November 2021 at 14:13:00 UTC, H. S. Teoh wrote:
 If you have a struct but it only supports input range semantics,
 then you could pass it by ref or pass a pointer to it.
 
 If you have a class but it supports forward range semantics, then
 wrap it in a struct with a copy ctor that saves state in the copy
 ctor.


[...]
 What about random access range?

Presumably, it will be a struct with additional methods needed for
random access (opIndex, et al).


 If you don't care about the range's implementation, only that it is a
 forward range, how would you express that as a method parameter?

[...]

Good question, ask Andrei. ;-)

Presumably, if we standardize on structs/classes, it could be as simple
as:

	auto myFunc(R)(R range) if (is(R == struct)) {
		... // forward range
	}

	auto myFunc(R)(R range) if (is(R == class)) {
		... // input range
	}

But given that Andrei thinks it's a mistake for ranges to be implemented
as classes, I've no idea.


T

-- 
Don't modify spaghetti code unless you can eat the consequences.

Nov 01 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Monday, 1 November 2021 at 23:46:07 UTC, H. S. Teoh wrote:
 Good question, ask Andrei. ;-)

Well, I hope he will check this thread and comment on it.

 Presumably, if we standardize on structs/classes, it could be 
 as simple as:

 	auto myFunc(R)(R range) if (is(R == struct)) {
 		... // forward range
 	}

 	auto myFunc(R)(R range) if (is(R == class)) {
 		... // input range
 	}

 But given that Andrei thinks it's a mistake for ranges to be 
 implemented as classes, I've no idea.

That would work, if you have templated funcs, but what if you 
need it in an interface?

If class based ranges are to be in D language, I doubt it will be 
possible to avoid .save function completely. At least for range 
interfaces, the save of forward range will have to be expressed 
through a method, such as .save.

Nov 01 2021

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 2 November 2021 at 00:05:48 UTC, Alexandru Ermicioi 
wrote:
 On Monday, 1 November 2021 at 23:46:07 UTC, H. S. Teoh wrote:
 Good question, ask Andrei. ;-)

 Well, I hope he will check this thread and comment on it.

 Presumably, if we standardize on structs/classes, it could be 
 as simple as:

 	auto myFunc(R)(R range) if (is(R == struct)) {
 		... // forward range
 	}

 	auto myFunc(R)(R range) if (is(R == class)) {
 		... // input range
 	}

 But given that Andrei thinks it's a mistake for ranges to be 
 implemented as classes, I've no idea.

 That would work, if you have templated funcs, but what if you 
 need it in an interface?

 If class based ranges are to be in D language, I doubt it will 
 be possible to avoid .save function completely. At least for 
 range interfaces, the save of forward range will have to be 
 expressed through a method, such as .save.

You can always wrap a class/interface method in a struct that 
calls .save on copy:


struct ClassRangeWrapper(T)
     if (is(T == class) || is(T == interface))
{
     T payload;
     alias payload this;

     this(ref inout typeof(this) other) inout
     {
         this.payload = other.payload.save;
     }
}

This way, range algorithms don't need to know about .save, so it 
can be removed from the official forward range requirements even 
though it still exists as an implementation detail.

Nov 01 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/1/21 8:13 PM, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 00:05:48 UTC, Alexandru Ermicioi wrote:
 On Monday, 1 November 2021 at 23:46:07 UTC, H. S. Teoh wrote:
 Good question, ask Andrei. ;-)

 Well, I hope he will check this thread and comment on it.

 Presumably, if we standardize on structs/classes, it could be as 
 simple as:

     auto myFunc(R)(R range) if (is(R == struct)) {
         ... // forward range
     }

     auto myFunc(R)(R range) if (is(R == class)) {
         ... // input range
     }

 But given that Andrei thinks it's a mistake for ranges to be 
 implemented as classes, I've no idea.

 That would work, if you have templated funcs, but what if you need it 
 in an interface?

 If class based ranges are to be in D language, I doubt it will be 
 possible to avoid .save function completely. At least for range 
 interfaces, the save of forward range will have to be expressed 
 through a method, such as .save.

 
 You can always wrap a class/interface method in a struct that calls 
 .save on copy:

Exactly. No need to support class ranges - simple wrappers can do 
everything class-like indirection does. Thanks.

Nov 01 2021

Dukc <ajieskola gmail.com> writes:

On Tuesday, 2 November 2021 at 02:45:11 UTC, Andrei Alexandrescu 
wrote:
 Exactly. No need to support class ranges - simple wrappers can 
 do everything class-like indirection does. Thanks.

Trying to write up a plan based on that one, so you can correct 
and/or spot weaknesses

- stuff in `std.v2.range.interfaces` and 
`std.v2.concurrency.Generator` will continue to be ranges from 
Phobos v1 viewpoint but not from Phobos v2 viewpoint.

- We add a function, let's say `std.range.valueRange`, in both 
versions, that will convert any v1 forward range to a value range 
that works in both versions.

- We also add some other function, or perhaps a flag to 
aforementioned one, that can convert any v1 input ranges to v2 
input range. `valueRange` as default must not accept non-forward 
ranges, because then it cannot guarantee that the result will be 
a value range.

- We need some way to prevent Phobos v2 from using v1 reference 
forward ranges accidentally. Making v2 `isInputRange` an 
automatic negative for classes can suffice for now.

- Phobos v2 ranges should still continue to provide the `save` 
method so they can be passed to v1 ranges. We also might provide 
an `assumeValueRange` function that will add the `save` method on 
top of any existing input range, assuming value semantics and 
making it a forward range from v1 perspective.

Nov 02 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Nov 02, 2021 at 12:11:24PM +0000, Dukc via Digitalmars-d wrote:
 On Tuesday, 2 November 2021 at 02:45:11 UTC, Andrei Alexandrescu wrote:
 
 Exactly. No need to support class ranges - simple wrappers can do
 everything class-like indirection does. Thanks.

 
 Trying to write up a plan based on that one, so you can correct and/or
 spot weaknesses
 
 - stuff in `std.v2.range.interfaces` and
   `std.v2.concurrency.Generator` will continue to be ranges from
   Phobos v1 viewpoint but not from Phobos v2 viewpoint.

Why is this necessary?  I thought we're getting rid of
std.range.interfaces.


 - We add a function, let's say `std.range.valueRange`, in both
   versions, that will convert any v1 forward range to a value range
   that works in both versions.

What's a value range?


 - We also add some other function, or perhaps a flag to aforementioned
   one, that can convert any v1 input ranges to v2 input range.
   `valueRange` as default must not accept non-forward ranges, because
   then it cannot guarantee that the result will be a value range.

[...]

Interesting idea. So basically a shim for easy translation of v1-based
code to v2-based code?  That would be nice for gradual migration.  It
would have to exclude certain incompatible things like autodecoded
strings, though. Otherwise it will result in a mess.


 - Phobos v2 ranges should still continue to provide the `save` method
   so they can be passed to v1 ranges.

[...]

I'm not sure this is such a good idea, because v2 ranges may have
fundamental incompatibilities with v1 algorithms, e.g., a v2 string
range (non-autodecoded) being passed to a v1 algorithm (autodecoded)
will probably produce the wrong results, likely silently, which is bad.
Now imagine mixing v1 algorithms and v2 algorithms in the same UFCS
chain (via shims) over a string, and you're in for a heck of a time trying
to debug the resulting mess.

IMO it's better to just keep v1 code distinct from v2 code, and migrate
v1-based code to v2-based code on a case-by-case basis.  In most cases,
you could probably just change `import std` to `import stdv2` and it
should work.  In cases involving e.g. autodecoding you'd add an adapter
or two in your UFCS code, then change to `import stdv2` and that should
fix it.  For the rest of the cases, just leave `import std` as-is, and
existing code should still function as before, with existing semantics,
without any surprise breakages.


T

-- 
That's not a bug; that's a feature!

Nov 02 2021

Adam D Ruppe <destructionator gmail.com> writes:

On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of 
 std.range.interfaces.

It is actually really, really, useful. If phobos didn't offer it, 
someone would reinvent it anyway.

(In fact, there's a lot of cases where using them is more 
efficient than generating more and more code...)

Nov 02 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Nov 02, 2021 at 07:32:56PM +0000, Adam D Ruppe via Digitalmars-d wrote:
 On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of
 std.range.interfaces.

 
 It is actually really, really, useful. If phobos didn't offer it,
 someone would reinvent it anyway.
 
 (In fact, there's a lot of cases where using them is more efficient
 than generating more and more code...)

I find it very useful as well, but according to Andrei, v2 will get rid
of class-based ranges, and if those are wanted we should use a struct
wrapper instead.

In any case, if struct wrappers are the way to go, then we better have a
standard way of constructing them.  It just won't be the same thing as
std.range.interfaces.


T

-- 
They say that "guns don't kill people, people kill people." Well I think the
gun helps. If you just stood there and yelled BANG, I don't think you'd kill
too many people. -- Eddie Izzard, Dressed to Kill

Nov 02 2021

Adam D Ruppe <destructionator gmail.com> writes:

On Tuesday, 2 November 2021 at 19:44:02 UTC, H. S. Teoh wrote:
 I find it very useful as well, but according to Andrei, v2 will 
 get rid of class-based ranges, and if those are wanted we 
 should use a struct wrapper instead.

Well, you'd keep the class for the actual wrapper, then the 
struct is just a thin thing on top of that.

interface IInputRange {
    // yada
}

struct InputRange {
    IInputRange c;
}

So you'd wrap it like that. Then for forward range make the copy 
constructor call the clone/save/deepCopy/whatever method on the 
class.

Then the std.algorithm things just take InputRange. A few details 
to work out but it is a simple enough thing to do.

Nov 02 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-02 15:32, Adam D Ruppe wrote:
 On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of 
 std.range.interfaces.

 
 It is actually really, really, useful. If phobos didn't offer it, 
 someone would reinvent it anyway.
 
 (In fact, there's a lot of cases where using them is more efficient than 
 generating more and more code...)

Yah, polymorphism has its place. The only problem is passing around 
reference ranges. They should have a thin struct wrapper that carries 
the proper copy semantics.

Nov 02 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Nov 02, 2021 at 04:17:06PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 2021-11-02 15:32, Adam D Ruppe wrote:
 On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of
 std.range.interfaces.

 
 It is actually really, really, useful. If phobos didn't offer it,
 someone would reinvent it anyway.
 
 (In fact, there's a lot of cases where using them is more efficient
 than generating more and more code...)

 
 Yah, polymorphism has its place. The only problem is passing around
 reference ranges. They should have a thin struct wrapper that carries
 the proper copy semantics.

Yes, so we need a standard way of constructing such wrappers.  Possibly
an addition to stdv2.range.interfaces?  Or maybe just have the wrapper
constructors return the constructed polymorphic range
pre-(shrink)wrapped. :-)


T

-- 
People who are more than casually interested in computers should have at least
some idea of what the underlying hardware is like. Otherwise the programs they
write will be pretty weird. -- D. Knuth

Nov 02 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Tuesday, 2 November 2021 at 20:17:06 UTC, Andrei Alexandrescu 
wrote:
 On 2021-11-02 15:32, Adam D Ruppe wrote:
 On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 Why is this necessary?  I thought we're getting rid of 
 std.range.interfaces.

 
 It is actually really, really, useful. If phobos didn't offer 
 it, someone would reinvent it anyway.
 
 (In fact, there's a lot of cases where using them is more 
 efficient than generating more and more code...)

 Yah, polymorphism has its place. The only problem is passing 
 around reference ranges. They should have a thin struct wrapper 
 that carries the proper copy semantics.

So, if the forward range interface (from std.range.interfaces) is 
to be kept in Phobos, it should provide a .save method that can 
be used in place of a copy constructor.

Then a single wrapper struct is enough to turn it into a value 
type (i.e. one that behaves the same as a struct forward range): 
the wrapper's copy constructor would call .save. The wrapper 
could also be used as a method parameter type, which forces 
callers to use it rather than occasionally forgetting to wrap 
the range.

Nov 03 2021

Dukc <ajieskola gmail.com> writes:

On Tuesday, 2 November 2021 at 18:09:55 UTC, H. S. Teoh wrote:
 On Tue, Nov 02, 2021 at 12:11:24PM +0000, Dukc via 
 Digitalmars-d wrote:
 On Tuesday, 2 November 2021 at 02:45:11 UTC, Andrei 
 Alexandrescu wrote:
 
 Exactly. No need to support class ranges - simple wrappers 
 can do everything class-like indirection does. Thanks.

 
 Trying to write up a plan based on that one, so you can 
 correct and/or spot weaknesses
 
 - stuff in `std.v2.range.interfaces` and
   `std.v2.concurrency.Generator` will continue to be ranges 
 from
   Phobos v1 viewpoint but not from Phobos v2 viewpoint.

 Why is this necessary?  I thought we're getting rid of 
 std.range.interfaces.

I guess we could write a more advanced alternative, but I'd 
prefer to keep the range interfaces around until someone does, to 
avoid scope creep.

The downside is going to be that Phobos v2 cannot use the 
interfaces directly as they aren't ranges anymore, but that's 
what the `valueRange` and the "other function" I mentioned are 
for.

 What's a value range?

Opposite of a reference range - copying implies `save()`.


 Interesting idea. So basically a shim for easy translation of 
 v1-based code to v2-based code?  That would be nice for gradual 
 migration.  It would have to exclude certain incompatible 
 things like autodecoded strings, though. Otherwise it will 
 result in a mess.

I propose that the shim will autodecode if imported from `v1` (if 
we even add it to `v1`) but not if imported from `v2` - just like 
the rest of the range-accepting functions.

 - Phobos v2 ranges should still continue to provide the `save` 
 method
   so they can be passed to v1 ranges.

 [...]

 I'm not sure this is such a good idea, because v2 ranges may 
 have fundamental incompatibilities with v1 algorithms, e.g., a 
 v2 string range (non-autodecoded) being passed to a v1 
 algorithm (autodecoded) will probably produce the wrong 
 results, likely silently, which is bad.

I agree that it's better to avoid function chains like that if 
easily possible. But the underlying rule is simple and 
unambiguous: a Phobos v1 function will autodecode a character 
array, a v2 function will not. If the character range is anything 
other than a plain array, they behave identically: the decoding 
or lack thereof depends on the range itself.

It's not worth making interoperability difficult just because of 
that, IMO. If users start having problems, they can 
voluntarily avoid mixing the character-handling `v1` and `v2` 
ranges - they still enjoy easier interoperability with other 
ranges.
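
The array half of that rule can be checked against today's Phobos. This is only a sketch: `std.utf.byCodeUnit` is used here as a stand-in for what a non-autodecoding v2 function would see, which is an assumption, since no v2 API exists yet.

```d
import std.range : ElementType, walkLength;
import std.utf : byCodeUnit;

// v1 range primitives present a plain string as a range of dchar.
static assert(is(ElementType!string == dchar));

unittest
{
    string s = "h\u00E9llo";                // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);                  // array length counts code units
    assert(s.walkLength == 5);              // v1 iteration autodecodes to code points
    assert(s.byCodeUnit.walkLength == 6);   // wrapper range: no autodecoding
}
```

Because `byCodeUnit` returns a range that is not a plain array, v1 and (presumably) v2 functions would treat it identically, which is the second half of the rule.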

Nov 02 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-02 17:44, Dukc wrote:
 
 What's a value range?

 
 Opposite of a reference range - copying implies `save()`.

Yah, one simple improvement we could make is to assume all forward 
ranges copy their iteration state when copying the range. Then input 
ranges do NOT do that, i.e. all copies of an input range refer to the 
same stream and iterate it together (advancing one advances all).

The differentiation can be made with a nested enum tag:

struct MyInputRange {
     enum inputRangeTag = true;
     ...
}

Client code can inspect R.inputRangeTag to figure whether the range is 
input (if present) or forward (if missing).
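
A rough sketch of what that inspection could look like, assuming a helper named `isTaggedInputRange` (hypothetical, not an existing Phobos trait) and two toy ranges invented for the example:

```d
// Hypothetical helper: true when the type carries the nested tag.
// Caveat: any type without the tag, including an untagged input range,
// would be classified as a forward range by this check.
enum isTaggedInputRange(R) = __traits(hasMember, R, "inputRangeTag");

struct MyInputRange       // copies share one underlying stream position
{
    enum inputRangeTag = true;
    int* pos;             // all copies advance the same counter
    bool empty() const { return *pos >= 3; }
    int front() const { return *pos; }
    void popFront() { ++*pos; }
}

struct MyForwardRange     // no tag: copying copies the iteration state
{
    int pos;
    bool empty() const { return pos >= 3; }
    int front() const { return pos; }
    void popFront() { ++pos; }
}

static assert( isTaggedInputRange!MyInputRange);
static assert(!isTaggedInputRange!MyForwardRange);
```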

Nov 02 2021

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu 
wrote:
 On 2021-11-02 17:44, Dukc wrote:
 
 What's a value range?

 
 Opposite of a reference range - copying implies `save()`.

 Yah, one simple improvement we could make is to assume all 
 forward ranges copy their iteration state when copying the 
 range. Then input ranges do NOT do that, i.e. all copies of an 
 input range refer to the same stream and iterate it together 
 (advancing one advances all).

 The differentiation can be made with a nested enum tag:

 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }

 Client code can inspect R.inputRangeTag to figure whether the 
 range is input (if present) or forward (if missing).

Not sure this is the best idea--it means new-style algorithms 
will silently treat old-style input ranges as though they were 
forward ranges, which could lead to incorrect behavior at 
runtime. If we are going to make incompatible changes to the 
range API, we should do it in such a way that version mismatches 
are caught at compile time.

Nov 02 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Nov 02, 2021 at 11:07:08PM +0000, Paul Backus via Digitalmars-d wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:

[...]
 The differentiation can be made with a nested enum tag:
 
 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }
 
 Client code can inspect R.inputRangeTag to figure whether the range
 is input (if present) or forward (if missing).

 
 Not sure this is the best idea--it means new-style algorithms will
 silently treat old-style input ranges as though they were forward
 ranges, which could lead to incorrect behavior at runtime. If we are
 going to make incompatible changes to the range API, we should do it
 in such a way that version mismatches are caught at compile time.

The problem with manually-added tags of this sort is that people forget
to do it, and that leads to trouble.  Preferably, it should be something
already implicit in the range type itself, that does not require
additional effort to tag.

I'm kinda toying with the idea of struct == forward range, class ==
input range: the difference is inherent in the type itself and requires
no further effort beyond the decision to use a by-value type vs. a
by-reference type, which coincides with the decision to make something
an input range or a forward range.


T

-- 
Political correctness: socially-sanctioned hypocrisy.

Nov 02 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 3 November 2021 at 00:01:33 UTC, H. S. Teoh wrote:
 The problem with manually-added tags of this sort is that 
 people forget to do it, and that leads to trouble.  Preferably, 
 it should be something already implicit in the range type 
 itself, that does not require additional effort to tag.

 I'm kinda toying with the idea of struct == forward range, 
 class == input range: the difference is inherent in the type 
 itself and requires no further effort beyond the decision to 
 use a by-value type vs. a by-reference type, which coincides 
 with the decision to make something an input range or a forward 
 range.

Having input ranges implement `next` and forward ranges implement 
`head` and `tail` would also make them easy to distinguish.

Nov 02 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Nov 03, 2021 at 12:18:59AM +0000, Paul Backus via Digitalmars-d wrote:
 On Wednesday, 3 November 2021 at 00:01:33 UTC, H. S. Teoh wrote:
 The problem with manually-added tags of this sort is that people
 forget to do it, and that leads to trouble.  Preferably, it should
 be something already implicit in the range type itself, that does
 not require additional effort to tag.
 
 I'm kinda toying with the idea of struct == forward range, class ==
 input range: the difference is inherent in the type itself and
 requires no further effort beyond the decision to use a by-value
 type vs. a by-reference type, which coincides with the decision to
 make something an input range or a forward range.

 
 Having input ranges implement `next` and forward ranges implement
 `head` and `tail` would also make them easy to distinguish.

That would work too, but makes the input range API no longer a subset of
the forward range API.  This would lead to code duplication in
algorithms that only require an input range but could work equally well
with a forward range.


T

-- 
Once bitten, twice cry...

Nov 02 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 3 November 2021 at 00:24:11 UTC, H. S. Teoh wrote:
 On Wed, Nov 03, 2021 at 12:18:59AM +0000, Paul Backus via 
 Digitalmars-d wrote:
 Having input ranges implement `next` and forward ranges 
 implement `head` and `tail` would also make them easy to 
 distinguish.

 That would work too, but makes the input range API no longer a 
 subset of the forward range API.  This would lead to code 
 duplication in algorithms that only require an input range but 
 could work equally well with a forward range.

Not necessarily. It's possible to implement `next` as a UFCS 
function for mutable forward ranges using the `head`/`tail` API:

auto next(R)(ref R r)
     if (isForwardRangeV2!R && isMutable!R)
{
     alias E = ElementType!R;
     if (r.empty)
         return none!E();
     else
     {
         auto result = some(r.head);
         r = r.tail;
         return result;
     }
}
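
For anyone who wants to experiment, here is a self-contained approximation that compiles against today's Phobos. It is a sketch under two assumptions: `std.typecons.Nullable` stands in for the hypothetical `Option` type, and the existing `front`/`popFront` stand in for the proposed `head`/`tail`.

```d
import std.range.primitives : isInputRange, ElementType, empty, front, popFront;
import std.typecons : Nullable;

// Sketch only: Nullable plays the role of Option, and front/popFront
// play the role of head/tail.
Nullable!(ElementType!R) next(R)(ref R r)
    if (isInputRange!R)
{
    if (r.empty)
        return typeof(return).init;          // the "none" case
    auto result = typeof(return)(r.front);   // the "some" case
    r.popFront();
    return result;
}

unittest
{
    int[] a = [1, 2];
    assert(a.next.get == 1);
    assert(a.next.get == 2);
    assert(a.next.isNull);   // exhausted: no element, no exception
}
```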

Nov 02 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-02 20:38, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 00:24:11 UTC, H. S. Teoh wrote:
 On Wed, Nov 03, 2021 at 12:18:59AM +0000, Paul Backus via 
 Digitalmars-d wrote:
 Having input ranges implement `next` and forward ranges implement 
 `head` and `tail` would also make them easy to distinguish.

 That would work too, but makes the input range API no longer a subset 
 of the forward range API.  This would lead to code duplication in 
 algorithms that only require an input range but could work equally 
 well with a forward range.

 
 Not necessarily. It's possible to implement `next` as a UFCS function 
 for mutable forward ranges using the `head`/`tail` API:
 
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }

OK, so the signature of next for all ranges is:

Option!(ElementType!Range) next(Range)(ref Range);

Is that correct?

Nov 03 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 3 November 2021 at 15:40:41 UTC, Andrei 
Alexandrescu wrote:
 On 2021-11-02 20:38, Paul Backus wrote:
 
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }

 OK, so the signature of next for all ranges is:

 Option!(ElementType!R) next(Range)(ref Range);

 Is that correct?

More precisely, to use the Phobos convention: 
`is(ReturnType!((Range r) => r.next) == Option!(ElementType!Range))`.

So, `next` could be a function, a @property, or a member 
variable, and it does not necessarily require an lvalue to call 
(just like `front` today).

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-03 12:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 15:40:41 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 20:38, Paul Backus wrote:
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }

 OK, so the signature of next for all ranges is:

 Option!(ElementType!R) next(Range)(ref Range);

 Is that correct?

 
 More precisely, to use the Phobos convention: `is(ReturnType!((Range r) 
 => r.next) == Option!(ElementType!R))`.
 
 So, `next` could be a function, a @property, or a member variable, and 
 it does not necessarily require an lvalue to call (just like `front` 
 today).

We've considered this way back when. I'm talking like 2006. It was like 
this:

T next(Range)(ref Range r, ref bool done);

The main problem is that iterating such a forward range would entail a 
copy of each element of the range. This is not scalable in general.

This is a showstopper.

Nov 03 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 3 November 2021 at 17:41:10 UTC, Andrei 
Alexandrescu wrote:
 On 2021-11-03 12:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 15:40:41 UTC, Andrei 
 Alexandrescu wrote:
 On 2021-11-02 20:38, Paul Backus wrote:
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }

 OK, so the signature of next for all ranges is:

 Option!(ElementType!R) next(Range)(ref Range);

 Is that correct?

 
 More precisely, to use the Phobos convention: 
 `is(ReturnType!((Range r) => r.next) == 
 Option!(ElementType!R))`.
 
 So, `next` could be a function, a @property, or a member 
 variable, and it does not necessarily require an lvalue to 
 call (just like `front` today).

 We've considered this way back when. I'm talking like 2006. It 
 was like this:

 T next(Range)(ref Range r, ref bool done);

 The main problem is that iterating such a forward range would 
 entail a copy of each element of the range. This is not 
 scalable in general.

 This is a showstopper.

If we want to avoid copying, we can have `next` return a `Ref!T` 
in the case where the forward range has lvalue elements:

struct Ref(T)
{
     T* ptr;

     ref inout(T) deref() inout
     {
         return *ptr;
     }
     alias deref this;
}

I've tested some simple uses of this wrapper on run.dlang.io, and 
it seems like DIP 1000 is good enough to make it work in @safe 
code.

If "returns either `T` or `Ref!T`" sounds like a suspect design 
for an API, consider that it is basically the same thing as an 
`auto ref` return value--just with the distinction between ref 
and non-ref brought inside the type system.
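
To make the copying concern concrete, here is a small sketch that counts element copies. `Ref` is the wrapper from the post; `Tracked`, `sumByValue` and `sumByRef` are hypothetical names invented for this illustration, and the code is not marked @safe, so it says nothing about how well DIP 1000 copes with it.

```d
// Element type that counts how many times it is copied (via postblit).
struct Tracked
{
    int id;
    static int copies;
    this(this) { ++copies; }
}

// The wrapper from the post.
struct Ref(T)
{
    T* ptr;

    ref inout(T) deref() inout
    {
        return *ptr;
    }
    alias deref this;
}

// Visits every element through a by-value copy, which is what a `next`
// returning plain T would force on lvalue ranges.
int sumByValue(Tracked[] items)
{
    int total = 0;
    foreach (i; 0 .. items.length)
    {
        Tracked copy = items[i];   // runs the postblit once per element
        total += copy.id;
    }
    return total;
}

// Visits every element through Ref, so only a pointer is handed around.
int sumByRef(Tracked[] items)
{
    int total = 0;
    foreach (i; 0 .. items.length)
    {
        auto r = Ref!Tracked(&items[i]);   // no element copy
        total += r.id;                      // forwarded through alias this
    }
    return total;
}
```

Resetting `Tracked.copies` before each call and comparing the counters afterwards shows the difference: the by-value walk pays one copy per element, the `Ref` walk should pay none.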

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/3/21 11:25 PM, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 17:41:10 UTC, Andrei Alexandrescu wrote:
 On 2021-11-03 12:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 15:40:41 UTC, Andrei Alexandrescu 
 wrote:
 On 2021-11-02 20:38, Paul Backus wrote:
 auto next(R)(ref R r)
      if (isForwardRangeV2!R && isMutable!R)
 {
      alias E = ElementType!R;
      if (r.empty)
          return none!E();
      else
      {
          auto result = some(r.head);
          r = r.tail;
          return result;
      }
 }

 OK, so the signature of next for all ranges is:

 Option!(ElementType!R) next(Range)(ref Range);

 Is that correct?

 More precisely, to use the Phobos convention: `is(ReturnType!((Range 
 r) => r.next) == Option!(ElementType!R))`.

 So, `next` could be a function, a @property, or a member variable, 
 and it does not necessarily require an lvalue to call (just like 
 `front` today).

 We've considered this way back when. I'm talking like 2006. It was 
 like this:

 T next(Range)(ref Range r, ref bool done);

 The main problem is that iterating such a forward range would entail a 
 copy of each element of the range. This is not scalable in general.

 This is a showstopper.

 
 If we want to avoid copying, we can have `next` return a `Ref!T` in the 
 case where the forward range has lvalue elements:
 
 struct Ref(T)
 {
      T* ptr;
 
      ref inout(T) deref() inout
      {
          return *ptr;
      }
      alias deref this;
 }
 
 I've tested some simple uses of this wrapper on run.dlang.io, and it 
 seems like DIP 1000 is good enough to make it work in @safe code.
 
 If "returns either `T` or `Ref!T`" sounds like a suspect design for an 
 API, consider that it is basically the same thing as an `auto ref` 
 return value--just with the distinction between ref and non-ref brought 
 inside the type system.

That was on the table, too, in the form of a raw pointer.

I think it can be made to work, but for lvalue ranges only, and it will 
be difficult to make safe (scoped etc).

Overall this seems to create more problems than it solves.

Nov 03 2021

Paul Backus <snarwin gmail.com> writes:

On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu 
wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 
 If we want to avoid copying, we can have `next` return a 
 `Ref!T` in the case where the forward range has lvalue 
 elements:
 
 struct Ref(T)
 {
      T* ptr;
 
      ref inout(T) deref() inout
      {
          return *ptr;
      }
      alias deref this;
 }
 
 I've tested some simple uses of this wrapper on run.dlang.io, 
 and it seems like DIP 1000 is good enough to make it work in 
 @safe code.
 
 If "returns either `T` or `Ref!T`" sounds like a suspect 
 design for an API, consider that it is basically the same 
 thing as an `auto ref` return value--just with the distinction 
 between ref and non-ref brought inside the type system.

 That was on the table, too, in the form of a raw pointer.

 I think it can be made to work, but for lvalue ranges only, and 
 it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.

I'd be curious to see any examples of such problems you have in 
mind.

As far as I'm aware, no special effort should be required to make 
this @safe, aside from enabling -preview=dip1000 (which, granted, 
is still a work in progress).

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/4/21 12:43 AM, Paul Backus wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 If we want to avoid copying, we can have `next` return a `Ref!T` in 
 the case where the forward range has lvalue elements:

 struct Ref(T)
 {
      T* ptr;

      ref inout(T) deref() inout
      {
          return *ptr;
      }
      alias deref this;
 }

 I've tested some simple uses of this wrapper on run.dlang.io, and it 
 seems like DIP 1000 is good enough to make it work in @safe code.

 If "returns either `T` or `Ref!T`" sounds like a suspect design for 
 an API, consider that it is basically the same thing as an `auto ref` 
 return value--just with the distinction between ref and non-ref 
 brought inside the type system.

 That was on the table, too, in the form of a raw pointer.

 I think it can be made to work, but for lvalue ranges only, and it 
 will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.

 
 I'd be curious to see any examples of such problems you have in mind.
 
 As far as I'm aware, no special effort should be required to make this 
 @safe, aside from enabling -preview=dip1000 (which, granted, is still a 
 work in progress).

Pointers are problematic because of aliasing and lifetime (what if the 
pointer survives the data structure it points into). So the `Ref` 
struct needs to be qualified appropriately with `scope`. So yes DIP1000 
would need to be tight.

Usability is another matter that hasn't been quite looked at. Once you 
have a scoped pointer wrapper, what can and what can't you do with it 
easily? I'm not very sure.

Alias this is just poorly done. I think we shouldn't base a fundamental 
API on it.

Anyway, I'm cautiously optimistic. At the very least this should be 
explored.

Note that the whole thing still doesn't address unbuffered ranges. There 
must be a buffer of at least one element somewhere. That's... problematic.

Nov 04 2021

Paul Backus <snarwin gmail.com> writes:

On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu 
wrote:
 Usability is another matter that hasn't been quite looked at. 
 Once you have a scoped pointer wrapper, what can and what can't 
 you do with it easily? I'm not very sure.

 Alias this is just poorly done. I think we shouldn't base a 
 fundamental API on it.

Both good points. It will take some experimentation to find out 
where the rough edges of this approach are, and whether they can 
be adequately sanded down.

 Anyway, I'm cautiously optimistic. At the very least this 
 should be explored.

 Note that the whole thing still doesn't address unbuffered 
 ranges. There must be a buffer of at least one element 
 somewhere. That's... problematic.

Unbuffered ranges will return `Option!T` from `next`, rather than 
`Option!(Ref!T)`.

Again, this is the same distinction we already have between 
rvalue `front` and lvalue `front`, so I don't think the 
inconsistency is a problem, as long as we can make `Ref!T` 
function as a subtype of `T` (either via `alias this` or some 
more principled mechanism).

Nov 04 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between rvalue 
 `front` and lvalue `front`

That reminds me, we should drop that like a bad habit too :o).

Currently ranges have all sorts of weird, random genericity. Recalling 
from memory (perhaps/hopefully some of these have been fixed):

- At least at some point `empty` did not have to return bool, just 
something convertible to bool. Like immutable(bool).

- For a while we had a lively discussion about length returning ulong 
instead of size_t (relevant on 32-bit).

- front could return pretty much what it damn well pleased, including 
qualified data, rvalues vs lvalues, noncopyable stuff, etc.

- Thinking how inout interacts with everything ranges is just depressing.

- I seem to recall there was at least one popFront that returned 
something meaningful. (Maybe that's not too disruptive.)

Based on past experience we could and should simplify the range 
interface in places where genericity has little value and the 
implementation effort is high.

Nov 04 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between rvalue
 `front` and lvalue `front`

 
 That reminds me, we should drop that like a bad habit too :o).
 
 Currently ranges have all sorts of weird, random genericity. Recalling
 from memory (perhaps/hopefully some of these have been fixed):

Yeah, we need to get rid of useless genericity, and also exactly what is
expected of range operations should be stated clearly and unambiguously
in the API docs.  The current range API suffers from insufficient
clarity, so many such cases went "under the radar" and inevitably ended
up being implemented when some kind soul decided that it would be nice
to support this or that niche case.


 - At least at some point `empty` did not have to return bool, just
 something convertible to bool. Like immutable(bool).

Yeah, .empty should return bool, and only bool.  Not immutable(bool),
not something that alias this to bool, none of that sort.

Also, the spec should specify precisely whether .empty must be a
function (and whether it should be a member function, a free function,
or both), or it's allowed to be a member variable.  Currently in my own
code I have a few cases where .empty is a variable rather than a
function. It hasn't run into any problems so far, but things like
this must be explicitly stated, otherwise somebody will inevitably write
code that assumes one way or the other, and break things for no good
reason.


 - For a while we had a lively discussion about length returning ulong
 instead of size_t (relevant on 32-bit).

Whichever way we decide, this should be specified clearly and not left
up to interpretation.


 - front could return pretty much what it damn well pleased, including
 qualified data, rvalues vs lvalues, noncopyable stuff, etc.

Yeah, this has been especially troublesome.  I think we should specify
exactly what type(s) and qualifier(s) are permitted to be returned from
.front.

Don't forget transient values returned by .front that are invalidated by
the next call to .popFront (e.g., std.stdio.File.byLine, which reuses
the line buffer).  The range API needs to explicitly state whether
.popFront is allowed to do this, and if it is allowed, range algorithms
that attempt to cache .front past the next invocation to .popFront must
be rewritten.  (This used to be a pretty big problem, but I think we've
fixed most of the cases in Phobos by now. But it still rears its ugly
head every now and then in user code that makes wrong assumptions
about the lifetime of the value returned by .front.)
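
A minimal sketch of the transient-front hazard, for illustration only: `Countdown` is a hypothetical range (valid for start values 0 through 9) that reuses a single buffer for every element, in the spirit of byLine, rather than anything taken from Phobos.

```d
// Hypothetical range that yields "2", "1", "0" but stores every element
// in the same one-character buffer.
struct Countdown
{
    private char[] buffer;
    private int current;

    this(int start)
    {
        buffer = new char[](1);
        current = start;
        fill();
    }

    bool empty() const { return current < 0; }

    // Returns a view of the shared buffer, not a fresh copy.
    const(char)[] front() const { return buffer; }

    void popFront()
    {
        --current;
        if (current >= 0)
            fill();   // overwrites whatever an earlier front returned
    }

    private void fill() { buffer[0] = cast(char)('0' + current); }
}

// Shows the difference between caching the view and caching a copy.
bool cachedViewChanges()
{
    auto r = Countdown(2);
    const(char)[] cached = r.front;   // aliases the internal buffer
    string kept = r.front.idup;       // independent copy
    r.popFront();
    return cached == "1" && kept == "2";
}
```

An algorithm that holds on to `cached` across `popFront` silently sees the new element, which is exactly the kind of behavior a spec would need to either forbid or tell algorithm authors to expect.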


 - Thinking how inout interacts with everything ranges is just
 depressing.

inout is the source of all kinds of nastiness in the language. It's a
cute hack that works for the trivial cases, but once you combine it with
other language features it's a mess. Consider this:

	inout T myFunc(T)(inout T delegate(inout T t) dg, inout T u) {...}

Does inout apply to the return value of dg, dg itself, or both? How does
it interact with the inout on the function's return value?  How exactly
does inout on t interact with the delegate's inout return, and how do
they correlate with the inout of the outer function?  This is just one
of many cases of ambiguity; it's not hard to construct other examples.
In short, it's a mess.

And don't forget that inout behaves like const inside the function body,
but when passed as a template argument triggers a different
instantiation (template bloat).

And trying to work with inout in generic code where you have to deal
with arbitrary incoming type qualifiers is an exercise in pain.

I think we should just flat out *not* support inout in ranges.


 - I seem to recall there was at least one popFront that returned
 something meaningful. (Maybe that's not too disruptive.)

It should be mandated by spec to return void.


 Based on past experience we could and should simplify the range
 interface in places where genericity has little value and the
 implementation effort is high.

+1.

Plus, the *exact* expectations of the various range functions should be
spelled out in clear, unambiguous terms.  Such as ref or non-ref, const
or mutable, function or member variable (or free function), transient
.front or not, copyable or not, what exactly .popFront returns, etc.
There must be no room left for interpretation except where explicitly
allowed.  Leave any small detail unspecified, and we can almost be
guaranteed to be bitten by it later.

Best spell out the exact permitted function signatures and types with
list of allowed qualifiers to leave no room for misinterpretation.


T

-- 
I am not young enough to know everything. -- Oscar Wilde

Nov 04 2021

Atila Neves <atila.neves gmail.com> writes:

On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:
 On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu 
 via Digitalmars-d wrote:
 On 2021-11-04 12:39, Paul Backus wrote:
 Again, this is the same distinction we already have between 
 rvalue `front` and lvalue `front`

 
 That reminds me, we should drop that like a bad habit too :o).
 
 Currently ranges have all sorts of weird, random genericity. 
 Recalling from memory (perhaps/hopefully some of these have 
 been fixed):

 Yeah, we need to get rid of useless genericity, and also 
 exactly what is expected of range operations should be stated 
 clearly and unambiguously in the API docs.  The current range 
 API suffers from insufficient clarity, so many such cases went 
 "under the radar" and inevitably ended up being implemented 
 when some kind soul decided that it would be nice to support 
 this or that niche case.

Sometimes genericity is a good thing. Take C++, where range for 
was originally specified in C++11 such that the begin and end 
iterators had to be the same type, which on the face of it seems 
to make sense. But then that was found to be overly constraining, 
and to pave the way for ranges, C++17 had to change the 
definition of a range for loop so that end only had to be 
comparable to begin and could be a different type.

 - At least at some point `empty` did not have to return bool, 
 just something convertible to bool. Like immutable(bool).

 Yeah, .empty should return bool, and only bool.  Not 
 immutable(bool), not something that alias this to bool, none of 
 that sort.

 Also, the spec should specify precisely whether .empty must be 
 a function (and whether it should be a member function, a free 
 function, or both), or it's allowed to be a member variable.

Similarly to what I said above, I don't think the spec should do 
this at all. Plasticity is what D is good at, and leaving it to 
"range.empty is a bool" is, IMHO, far better. I *love* not using 
parens for functions with no args and being able to use a 
function/variable/enum, then being able to change that and not 
have to touch the rest of the code at all.
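
As a quick illustration of that plasticity, here is a sketch with three hypothetical ranges (`Ones`, `Once` and `UpTo` are invented for this example) whose `empty` is an enum, a member variable and a member function respectively. All three are expected to pass today's `isInputRange`, and the generic `takeSum` helper, also invented here, never has to know which is which.

```d
import std.range.primitives : isInputRange;

// `empty` as a compile-time constant: an infinite range of ones.
struct Ones
{
    enum bool empty = false;
    int front() const { return 1; }
    void popFront() {}
}

// `empty` as a plain member variable: yields a single value.
struct Once
{
    int value;
    bool empty;
    int front() const { return value; }
    void popFront() { empty = true; }
}

// `empty` as a member function: counts from current up to limit.
struct UpTo
{
    int current, limit;
    bool empty() const { return current >= limit; }
    int front() const { return current; }
    void popFront() { ++current; }
}

static assert(isInputRange!Ones);
static assert(isInputRange!Once);
static assert(isInputRange!UpTo);

// Generic code spells `r.empty` the same way in all three cases.
int takeSum(R)(R r, int n)
    if (isInputRange!R)
{
    int total = 0;
    for (; n > 0 && !r.empty; --n, r.popFront())
        total += r.front;
    return total;
}
```

Swapping, say, the member variable in `Once` for a function should not require touching `takeSum` at all.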

Nov 05 2021

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Nov 05, 2021 at 11:43:01AM +0000, Atila Neves via Digitalmars-d wrote:
 On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:

[...]
 Yeah, we need to get rid of useless genericity, and also exactly
 what is expected of range operations should be stated clearly and
 unambiguously in the API docs.  The current range API suffers from
 insufficient clarity, so many such cases went "under the radar" and
 inevitably ended up being implemented when some kind soul decided
 that it would be nice to support this or that niche case.

 
 Sometimes genericity is a good thing. Take C++, where range for was
 originally specified in C++11 such that the begin and end iterators
 had to be the same type, which on the face it seems to makes sense.
 But then that was found out to be overly constraining, and to be able
 to add ranges to C++17 they had to change the definition of a range
 for loop so that end only had to be comparable to begin and could be a
 different type.

Genericity is definitely a good thing -- when it doesn't lead to the
slippery slope of ever-more-complicated convolutions in the code as a
result of trying to cater to every unnatural use case.  The whole point
of the range abstraction is to *simplify* code; if simplicity and
clarity of code is compromised because of genericity, then we have
failed.


[...]
 Yeah, .empty should return bool, and only bool.  Not
 immutable(bool), not something that alias this to bool, none of that
 sort.
 
 Also, the spec should specify precisely whether .empty must be a
 function (and whether it should be a member function, a free
 function, or both), or it's allowed to be a member variable.

 
 Similarly to what I said above, I don't think the spec should do this
 at all. Plasticity is what D is good at, and leaving it to
 "range.empty is a bool" is, IMHO, far better. I *love* not using
 parens for functions with no args and being able to use a
 function/variable/enum, then being able to change that and not have to
 touch the rest of the code at all.

I disagree. The spec *should* explicitly state what .empty (or any other
range method/identifier) is allowed to be.  If you want more genericity,
simply have the spec say ".empty may be either a method or a member
field".

This may seem trivial, but it's necessary to prevent things like some
Phobos code assuming that .empty is always a method, and then it fails
when somebody passes in a range that has a field instead.  Also, on the
user-facing side, it prevents spurious bug reports like "how come my
custom-made range with non-copyable .empty masqueraded from a nested
struct via alias this doesn't pass isInputRange?", which then prompts
some well-meaning soul to implement support for this obscure case,
thereby adding all kinds of weird fluff to Phobos that really doesn't
belong there.

We want to be able to say to such bug reports, "the spec says .empty can
only be a method or a bool field, sorry we don't support stuff where
.empty is a non-copyable wrapper object that uses alias this to
implicitly convert to a value wrapper with an .opCast!bool that returns
an immutable(bool) which can then be value-copied onto a bool".

Andrei has said many times that these kinds of obscure cases don't
belong in Phobos. If some user wants static arrays to work with ranges,
then just write `[]` and be done with it, instead of adding yet another
useless feature to Phobos (which inevitably will cause some unexpected
poor interaction with another obscure case, and we're stuck in the
endless churn of accreting features in Phobos that make it harder to
maintain yet do not actually make any progress in improving D code).
If somebody wants .empty to be a wrapper struct that uses alias this and
.opCast!bool to return an immutable(bool), just have them write a
wrapper that uses a function .empty to return a bool.

The fact that user code ended up in such a tangled mess is a sign that
something is wrong on *their* side; we should not be promoting bad code
practices by supporting such monstrosities in Phobos; we should instead
be triggering a compile error so that the user cleans up his act and
writes better code.


T

-- 
Insanity is doing the same thing over and over again and expecting different
results.

Nov 09 2021

Atila Neves <atila.neves gmail.com> writes:

On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu 
wrote:
 On 11/4/21 12:43 AM, Paul Backus wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei 
 Alexandrescu wrote:
 On 11/3/21 11:25 PM, Paul Backus wrote:
 [...]

 That was on the table, too, in the form of a raw pointer.

 I think it can be made to work, but for lvalue ranges only, 
 and it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.

 
 I'd be curious to see any examples of such problems you have 
 in mind.
 
 As far as I'm aware, no special effort should be required to 
 make this @safe, aside from enabling -preview=dip1000 (which, 
 granted, is still a work in progress).

 Pointers are problematic because of aliasing and lifetime (what 
 if the pointer survives the data structure it points into).

T* should mean infinite lifetime by default in @safe code: where 
did you get that pointer to begin with? If a struct contains a T* 
within it, then scoped struct variables solve the lifetime issue 
that way. Aliasing, however, is a problem we still have. Which is 
why we can't currently write a @safe vector.

 Note that the whole thing still doesn't address unbuffered 
 ranges. There must be a buffer of at least one element 
 somewhere. That's... problematic.

Yeah, I'm still wondering how to fix that.

Nov 05 2021

Dukc <ajieskola gmail.com> writes:

On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu 
wrote:
 I think it can be made to work, but for lvalue ranges only, and 
 it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.

Yeah, I think we should keep the `front`/`popFront`/`empty` 
scheme. Not because it's necessarily better, but because there's 
always a high risk for scope creep and second-system effect when 
doing projects like Phobos v2. Even discarding autodecoding and 
`isForwardRange` will be a lot of work already, so let's not bite 
off more than we can chew.

Nov 04 2021

Paul Backus <snarwin gmail.com> writes:

On Thursday, 4 November 2021 at 10:45:25 UTC, Dukc wrote:
 On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei 
 Alexandrescu wrote:
 I think it can be made to work, but for lvalue ranges only, 
 and it will be difficult to make safe (scoped etc).

 Overall this seems to create more problems than it solves.

 Yeah, I think we should keep the `front`/`popFront`/`empty` 
 scheme. Not because it's necessarily better, but because 
 there's always a high risk for scope creep and second-system 
 effect when doing projects like Phobos v2. Even discarding 
 autodecoding and `isForwardRange` will be a lot of work 
 already, let's not bite more than we can swallow.

I agree that this is definitely not a v2 proposal--more like v3 
or v4. But I do think it should be on the roadmap.

Nov 04 2021

Dukc <ajieskola gmail.com> writes:

On Thursday, 4 November 2021 at 12:59:52 UTC, Paul Backus wrote:
 I agree that this is definitely not a v2 proposal--more like v3 
 or v4. But I do think it should be on the roadmap.

Ah, that's more reasonable. Not saying I agree but at least I 
disagree much less.

Nov 04 2021

Dukc <ajieskola gmail.com> writes:

On Wednesday, 3 November 2021 at 00:18:59 UTC, Paul Backus wrote:
 Having input ranges implement `next` and forward ranges 
 implement `head` and `tail` would also make them easy to 
 distinguish.

There's an easier solution: require all v2 ranges to have the 
`inputRangeTag`. For forward ranges it must be set to false, but 
if the tag does not exist at all then it isn't a v2 range.

If we go for this, I'd rename the tag to `isReferenceRange` 
though.

It's going to require more manual usage of the `valueRange` 
wrapper though, as the v1 ranges around aren't going to be v2 
ranges as often.

What do you say?
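
A rough sketch of how such a mandatory tag could be checked. Only `inputRangeTag` comes from this thread; the trait names `isRangeV2` and `isForwardRangeV2` and the three example types are hypothetical, and the traits deliberately ignore `front`/`popFront`/`empty` to keep the example short.

```d
// A type counts as a v2 range only if it carries a boolean inputRangeTag.
enum isRangeV2(R) = __traits(hasMember, R, "inputRangeTag")
    && is(typeof(R.inputRangeTag) == bool);

// A v2 forward range is a v2 range whose tag is explicitly false.
template isForwardRangeV2(R)
{
    static if (isRangeV2!R)
        enum isForwardRangeV2 = !R.inputRangeTag;
    else
        enum isForwardRangeV2 = false;
}

// v2 input range: all copies share one iteration state.
struct Stream
{
    enum inputRangeTag = true;
}

// v2 forward range: copying the struct saves the position.
struct Slice
{
    enum inputRangeTag = false;
}

// v1 range: no tag at all, so v2 algorithms reject it at compile time
// instead of guessing which kind it is.
struct Legacy
{
}

static assert(isRangeV2!Stream && !isForwardRangeV2!Stream);
static assert(isRangeV2!Slice && isForwardRangeV2!Slice);
static assert(!isRangeV2!Legacy && !isForwardRangeV2!Legacy);
```

With this shape, forgetting the tag makes a type fail the v2 constraints outright rather than being silently treated as a forward range.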

Nov 03 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 3 November 2021 at 11:54:24 UTC, Dukc wrote:
 There's an easier solution: require all v2 ranges to have the 
 `inputRangeTag`. For forward ranges it must be set to false, 
 but if the tag does not exist at all then it isn't a v2 range.

 If we go for this, I'd rename the tag to `isReferenceRange` 
 though.

 It's going to require more manual usage of the `valueRange` 
 wrapper though, as the v1 ranges around aren't going to be v2 
 ranges as often.

 What do you say?

Sure, that would work. But that still leaves the issue of `const` 
ranges, which is the other thing `head` and `tail` are meant to 
address.

I think if we are going to make incompatible changes to the range 
API, we might as well do a proper redesign that fixes all of the 
known issues at once. And until we do that (maybe in v3?), it is 
probably better to hold off on incompatible changes.

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-02 20:18, Paul Backus wrote:
 On Wednesday, 3 November 2021 at 00:01:33 UTC, H. S. Teoh wrote:
 The problem with manually-added tags of this sort is that people 
 forget to do it, and that leads to trouble.  Preferably, it should be 
 something already implicit in the range type itself, that does not 
 require additional effort to tag.

 I'm kinda toying with the idea of struct == forward range, class == 
 input range: the difference is inherent in the type itself and 
 requires no further effort beyond the decision to use a by-value type 
 vs. a by-reference type, which coincides with the decision to make 
 something an input range or a forward range.

 
 Having input ranges implement `next` and forward ranges implement `head` 
 and `tail` would also make them easy to distinguish.

What would be the signature of next?

Would forward ranges also implement next? If not, that would mean 
algorithms for input ranges won't work for forward ranges.

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?

 Opposite of a reference range - copying implies `save()`.

 Yah, one simple improvement we could make is to assume all forward 
 ranges copy their iteration state when copying the range. Then input 
 ranges do NOT do that, i.e. all copies of an input range refer to the 
 same stream and iterate it together (advancing one advances all).

 The differentiation can be made with a nested enum tag:

 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }

 Client code can inspect R.inputRangeTag to figure whether the range is 
 input (if present) or forward (if missing).

 
 Not sure this is the best idea--it means new-style algorithms will 
 silently treat old-style input ranges as though they were forward 
 ranges, which could lead to incorrect behavior at runtime. If we are 
 going to make incompatible changes to the range API, we should do it in 
 such a way that version mismatches are caught at compile time.

Good point. Maybe have all ranges define that enum with values true and 
false respectively?

Nov 03 2021

jmh530 <john.michael.hall gmail.com> writes:

On Wednesday, 3 November 2021 at 15:03:53 UTC, Andrei 
Alexandrescu wrote:
 [snip]

 Good point. Maybe have all ranges define that enum with values 
 true and false respectively?

Maybe I just haven't been following this discussion closely 
enough, but I'm a little confused by this. We have 
`__traits(hasMember, T, "inputRangeTag")` after all. There should 
be default behavior when that tag isn't there and more 
sophisticated behavior when it is (if it is needed).

Nov 03 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 2021-11-03 11:28, jmh530 wrote:
 On Wednesday, 3 November 2021 at 15:03:53 UTC, Andrei Alexandrescu wrote:
 [snip]

 Good point. Maybe have all ranges define that enum with values true 
 and false respectively?

 
 Maybe I just haven't been following this discussion closely enough, but 
 I'm a little confused by this. We have `__traits(hasMember, T, 
 "inputRangeTag")` after all. There should be default behavior when that 
 tag isn't there and more sophisticated behavior when it is (if it is 
 needed).

Yah that's what I had in mind - hasMember is actually easier. I think 
the problem is that code that forgets to do the check may easily do the 
wrong thing.
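
To make the `hasMember` variant concrete, here is a rough sketch. `isTaggedInput`, `sumTwice` and both structs are hypothetical names invented for this example. The two-pass algorithm is only correct because its constraint performs the check; dropping the `if (...)` line would let it compile for `Shared` too and quietly give the wrong answer.

```d
// Tag present means input-only; tag absent is the default (forward).
enum isTaggedInput(R) = __traits(hasMember, R, "inputRangeTag");

struct Numbers               // no tag: treated as forward, copies are independent
{
    int pos;
    bool empty() const { return pos >= 3; }
    int front() const { return pos; }
    void popFront() { ++pos; }
}

struct Shared                // tagged: all copies advance the same cursor
{
    enum inputRangeTag = true;
    int* pos;
    bool empty() const { return *pos >= 3; }
    int front() const { return *pos; }
    void popFront() { ++*pos; }
}

// Walks the range twice, so it needs independent copies and must check the tag.
int sumTwice(R)(R r)
if (!isTaggedInput!R)
{
    int total = 0;
    auto copy = r;
    for (; !copy.empty; copy.popFront()) total += copy.front;
    for (; !r.empty; r.popFront()) total += r.front;
    return total;
}

static assert(sumTwice(Numbers(0)) == 6);                   // (0 + 1 + 2) twice
static assert(!__traits(compiles, sumTwice(Shared(null)))); // rejected by the constraint
```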

Nov 03 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/3/21 11:03 AM, Andrei Alexandrescu wrote:
 On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?

 Opposite of a reference range - copying implies `save()`.

 Yah, one simple improvement we could make is to assume all forward 
 ranges copy their iteration state when copying the range. Then input 
 ranges do NOT do that, i.e. all copies of an input range refer to the 
 same stream and iterate it together (advancing one advances all).

 The differentiation can be made with a nested enum tag:

 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }

 Client code can inspect R.inputRangeTag to figure whether the range 
 is input (if present) or forward (if missing).

 Not sure this is the best idea--it means new-style algorithms will 
 silently treat old-style input ranges as though they were forward 
 ranges, which could lead to incorrect behavior at runtime. If we are 
 going to make incompatible changes to the range API, we should do it 
 in such a way that version mismatches are caught at compile time.

 
 Good point. Maybe have all ranges define that enum with values true and 
 false respectively?

Yes, this is what I was trying to point out in my other post.

One thing that is possible is to change at least one of the methods 
(i.e. change the name of `front`, `popFront`, or `empty`), so it is easy 
to distinguish a v2 range from a v1 range. An enum works too, and I'd 
support that.

For sure, you need an opt-in for forward ranges because input ranges are 
the most basic type.

Thinking about this some more, maybe an enum is better for another 
reason. One thing we use introspection for, and which can bite us, is 
checking whether something supports a specific interface. But what 
happens when what we expect is not what actually happens? The result is 
usually very confusing error messages, or introspection that doesn't 
produce what we expect (e.g. some wrapper of our expected forward range 
ends up being only an input range).

Keeping the check for what type of range it is separate from the actual 
code expectations would not only read more cleanly (and perform better, I 
think), it would push the error to the code itself. E.g. you define what 
you think is a random access range, you specify `enum rangeType = 
RangeType.RandomAccess` as a member, but forget to define `opIndex`. 
Instead of the range just being inferred as non-random-access in some 
other part of code, the compiler tells you `error, cannot call opIndex 
on type MyRange`, which gives you the exact error you need to fix the 
problem.
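
A rough sketch of that idea, assuming a `RangeType` enum with these four members (only `RangeType.RandomAccess` is named above; the other members, `Squares` and `lastElement` are made up for illustration):

```d
enum RangeType { Input, Forward, Bidirectional, RandomAccess }

struct Squares
{
    enum rangeType = RangeType.RandomAccess;   // declared category
    size_t lo, hi;
    bool empty() const { return lo >= hi; }
    size_t front() const { return lo * lo; }
    void popFront() { ++lo; }
    size_t length() const { return hi - lo; }
    size_t opIndex(size_t i) const { return (lo + i) * (lo + i); }
}

// Dispatches on the declared category instead of probing for members.
auto lastElement(R)(R r)
{
    static if (R.rangeType == RangeType.RandomAccess)
        return r[r.length - 1];   // if opIndex were missing, the error would point here
    else
    {
        auto last = r.front;
        for (r.popFront(); !r.empty; r.popFront()) last = r.front;
        return last;
    }
}

static assert(lastElement(Squares(0, 4)) == 9);   // elements are 0, 1, 4, 9
```

If `opIndex` is deleted from `Squares`, `lastElement` stops compiling at the indexing expression rather than silently falling back to the linear branch.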

-Steve

Nov 03 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/3/21 11:03 AM, Andrei Alexandrescu wrote:
 On 2021-11-02 19:07, Paul Backus wrote:
 On Tuesday, 2 November 2021 at 21:58:20 UTC, Andrei Alexandrescu wrote:
 On 2021-11-02 17:44, Dukc wrote:
 What's a value range?

 Opposite of a reference range - copying implies `save()`.

 Yah, one simple improvement we could make is to assume all forward 
 ranges copy their iteration state when copying the range. Then input 
 ranges do NOT do that, i.e. all copies of an input range refer to the 
 same stream and iterate it together (advancing one advances all).

 The differentiation can be made with a nested enum tag:

 struct MyInputRange {
     enum inputRangeTag = true;
     ...
 }

 Client code can inspect R.inputRangeTag to figure whether the range 
 is input (if present) or forward (if missing).

 Not sure this is the best idea--it means new-style algorithms will 
 silently treat old-style input ranges as though they were forward 
 ranges, which could lead to incorrect behavior at runtime. If we are 
 going to make incompatible changes to the range API, we should do it 
 in such a way that version mismatches are caught at compile time.

 
 Good point. Maybe have all ranges define that enum with values true and 
 false respectively?

Oh, and one more thing, if we are going to do a tag, a UDA is probably a 
more aesthetic and functional tag. e.g. it doesn't affect 
`__traits(allMembers, T)`.
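
For comparison, a small sketch of both styles. The marker type `inputRange` and the two structs are hypothetical; `hasUDA` is from `std.traits` and `canFind` from `std.algorithm.searching`.

```d
import std.algorithm.searching : canFind;
import std.traits : hasUDA;

struct inputRange {}         // hypothetical marker type, used only as a UDA

@inputRange
struct Stream                // tagged via UDA
{
    bool empty;
    int front;
    void popFront() {}
}

struct Plain                 // tagged via a nested enum
{
    enum inputRangeTag = true;
    bool empty;
    int front;
    void popFront() {}
}

// The UDA is visible to introspection that asks for it...
static assert(hasUDA!(Stream, inputRange));
static assert(!hasUDA!(Plain, inputRange));

// ...but, unlike the enum tag, it does not show up in the member list.
static assert(![__traits(allMembers, Stream)].canFind("inputRange"));
static assert([__traits(allMembers, Plain)].canFind("inputRangeTag"));
```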

-Steve

Nov 03 2021

Adam D Ruppe <destructionator gmail.com> writes:

On Wednesday, 3 November 2021 at 15:59:40 UTC, Steven 
Schveighoffer wrote:
 Oh, and one more thing, if we are going to do a tag, a UDA is 
 probably a more aesthetic and functional tag. e.g. it doesn't 
 affect `__traits(allMembers, T)`.

Best way is to make the decoration also be a check. A UDA can do 
it:

---
template InputRange(Ty = void) {
         // used as a mixin (no argument) T is the enclosing type;
         // used as a UDA the type is passed explicitly
         static if(is(Ty == void)) alias T = typeof(this);
         else alias T = Ty;
         static import std.range;
         static assert(std.range.isInputRange!T, "more informative message");
         enum isInputRange = true;
}

@InputRange!MyRange
struct MyRange {
         bool empty;
         int front;
         //void popFront() {} // without this the static assert above fires

         //mixin InputRange;

}
---



That silly one works as either a mixin or a UDA each with 
stylistic pros and cons, but the functionality benefit of a 
concrete check of intention and nice error messages before 
granting you the tag would legitimately be really quite nice.

Nov 03 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/2/21 8:11 AM, Dukc wrote:

 - We also add some other function, or perhaps a flag to aforementioned 
 one, that can convert any v1 input ranges to v2 input range. 
 `valueRange` as default must not accept non-forward ranges, because then 
 it cannot guarantee that the result will be a value range.

How does phobos v2 view a non-class non-forward v1 range? This is the 
fundamental problem needing solving, because if you make copyability the 
defining trait, current v1 input-only ranges (e.g. `File.byLine`) are 
going to be miscategorized.

-Steve

Nov 02 2021

Dukc <ajieskola gmail.com> writes:

On Wednesday, 3 November 2021 at 02:34:00 UTC, Steven 
Schveighoffer wrote:
 On 11/2/21 8:11 AM, Dukc wrote:

 - We also add some other function, or perhaps a flag to 
 aforementioned one, that can convert any v1 input ranges to v2 
 input range. `valueRange` as default must not accept 
 non-forward ranges, because then it cannot guarantee that the 
 result will be a value range.

 How does phobos v2 view a non-class non-forward v1 range? This 
 is the fundamental problem needing solving, because if you make 
 copyability the defining trait, current v1 input-only ranges 
 (e.g. `File.byLine`) are going to be miscategorized.

 -Steve

They are v2 input ranges. v2 input ranges are not required to be 
value ranges, but should be if they can (meaning, if their v1 
equivalent would be a forward range).

The reason I said that `valueRange` by default must not return a 
reference range is because it'd conflict with the function name.
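
To illustrate the value/reference distinction used here, a minimal sketch with two made-up ranges (`ValueRange`, `ReferenceRange` and the checking function are hypothetical names). Copying the first yields an independent iterator, as if `save()` had been called; copying the second yields another handle on the same cursor.

```d
struct ValueRange            // all iteration state lives in the struct itself
{
    int pos, end;
    bool empty() const { return pos >= end; }
    int front() const { return pos; }
    void popFront() { ++pos; }
}

struct ReferenceRange        // iteration state sits behind a pointer
{
    int* pos;
    int end;
    bool empty() const { return *pos >= end; }
    int front() const { return *pos; }
    void popFront() { ++*pos; }
}

bool copiesBehaveAsDescribed()
{
    auto a = ValueRange(0, 3);
    auto b = a;              // independent copy
    b.popFront();
    immutable valueOk = a.front == 0 && b.front == 1;

    auto c = ReferenceRange(new int, 3);
    auto d = c;              // shares the cursor with c
    d.popFront();            // advancing one advances both
    immutable refOk = c.front == 1 && d.front == 1;

    return valueOk && refOk;
}

static assert(copiesBehaveAsDescribed());
```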

Nov 03 2021
