digitalmars.D - Issue with forward ranges which are reference types

Jonathan M Davis <jmdavisProg gmx.com> writes:

Sorry that this is long, but it's very important IMHO, and I don't know how to 
make it much shorter and cover what it's supposed to cover. 

Okay. Your typical forward range is either an array or a struct which is a value
type (that is, copying it creates an independent range which points to the 
same elements and is not altered if the original range is altered - the 
elements that it points to aren't copied of course). So, when you want to get 
a copy, it's as easy as

auto rangeCopy = range;

It was previously determined that this would be a problem for ranges which are 
reference types (classes in particular, but it affects structs as well, if 
copying them doesn't create an independent range). So, we added the save 
property.

auto rangeCopy = range.save;

That way, when we need to save the state of a range, we always use save, even 
if that particular range would be copied by a simple assignment. Okay. So far
so good. There is one major problem with the current situation though: the 
behavior of value-type and reference-type forward ranges is very different when 
they are passed to range-based functions.

We use save within a particular algorithm when we know that we need to copy a 
range's state, but simply passing a range to a function may or may not copy a 
range. Take this possible implementation of a drop function, for instance:

R drop(R)(R range, size_t n)
    if(isInputRange!R)
{
    popFrontN(range, n);
    return range;
}

It pops n elements off of the range and returns the range sans those elements. 
In the case of an input range, the original range is n elements shorter - as 
expected. In the case of forward ranges though, it varies. If you pass an 
array or a struct with value semantics to drop, then the original range is not 
altered. Just passing it to drop is equivalent to calling save. However, if 
you use a range which is a reference type, then just like with an input range, 
the original range is altered. So, the behavior of code could change 
drastically depending on whether it's given a range which is a value type or a 
range which is a reference type - even if they're both forward ranges.

auto valRange = "hello world";
assert(equal(drop(valRange, 5), " world"));
assert(equal(valRange, "hello world"));

auto refRange = createRefRange("hello world");
assert(equal(drop(refRange, 5), " world"));
assert(equal(refRange, " world"));
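
A minimal, self-contained sketch of the same contrast, under stated assumptions: ClassRange and dropN are hypothetical names introduced here for illustration (dropN mirrors the drop function above), and int elements stand in for the string. Because an algorithm like equal would itself consume a class-based range, the sketch inspects front rather than comparing whole ranges.

```d
import std.range.primitives;

// Hypothetical reference-type forward range: a class wrapping an int slice.
final class ClassRange
{
    private int[] data;

    this(int[] data) { this.data = data; }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    // save allocates a new object, so the copy advances independently.
    @property ClassRange save() { return new ClassRange(data); }
}

static assert(isForwardRange!ClassRange);

// Same shape as the drop function above, renamed for this sketch.
R dropN(R)(R range, size_t n)
    if (isInputRange!R)
{
    popFrontN(range, n);
    return range;
}

void main()
{
    // Value-type range: passing the slice copies it, so the caller's
    // slice is not advanced.
    int[] valRange = [1, 2, 3, 4, 5];
    assert(dropN(valRange, 2) == [3, 4, 5]);
    assert(valRange == [1, 2, 3, 4, 5]);

    // Reference-type range: the function advances the caller's object.
    auto refRange = new ClassRange([1, 2, 3, 4, 5]);
    auto rest = dropN(refRange, 2);
    assert(rest is refRange);
    assert(refRange.front == 3);

    // Calling save at the call site keeps the original intact.
    auto kept = new ClassRange([1, 2, 3, 4, 5]);
    auto tail = dropN(kept.save, 2);
    assert(tail.front == 3);
    assert(kept.front == 1);
}
```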

So, the question is, should a range-based function have the same behavior for 
all forward ranges regardless of whether they're value types or reference 
types? Or should the caller be aware of whether a range is a value type or a 
reference type and call save if necessary? Or should the caller just always 
call save when passing a forward range to a function?

Option 1: If we make it so that all functions behave identically for all 
forward ranges, then we're going to need something like this at the beginning 
of every range-based function for all ranges passed to that function:

static if(isForwardRange!R) range = range.save;

This is definitely a bit tedious, but it's completely doable. And with some 
proper tests for it, it's easy to catch whether a function actually does this 
like it's supposed to. However, while in most cases, the compiler should be 
able to optimize out this assignment for structs, it can't always do it. IIUC, 
if the struct defines a postblit constructor or if any of its member variables 
define a postblit constructor (or if any of their member variables declare a
postblit constructor, or any of their member variables' member variables...),
then the save call and assignment won't be able to be optimized out, and there 
will be a performance cost. However, it's likely that very few range types 
will have postblit constructors, given the usual simplicity of their design 
and the fact that if they actually need a postblit constructor, they can probably
just forgo it in favor of using the save property for all copying, making
them reference types.
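
As a rough sketch of what option 1 could look like in practice (not how Phobos does it): the function saves any forward range on entry, so value-type and reference-type forward ranges behave the same from the caller's point of view. ClassRange and dropSaved are hypothetical names introduced only for this illustration.

```d
import std.range.primitives;

// Hypothetical reference-type forward range: a class wrapping an int slice.
final class ClassRange
{
    private int[] data;

    this(int[] data) { this.data = data; }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property ClassRange save() { return new ClassRange(data); }
}

// Option 1: save on entry whenever the range supports it.
R dropSaved(R)(R range, size_t n)
    if (isInputRange!R)
{
    // Pure input ranges have no save, so they are still consumed.
    static if (isForwardRange!R)
        range = range.save;

    popFrontN(range, n);
    return range;
}

void main()
{
    // Value-type range: unchanged behavior.
    int[] arr = [1, 2, 3, 4, 5];
    assert(dropSaved(arr, 2) == [3, 4, 5]);
    assert(arr == [1, 2, 3, 4, 5]);

    // Reference-type range: the caller's object is no longer advanced.
    auto refRange = new ClassRange([1, 2, 3, 4, 5]);
    auto rest = dropSaved(refRange, 2);
    assert(rest !is refRange);
    assert(rest.front == 3);
    assert(refRange.front == 1);
}
```

The cost shows up in the class case: every call allocates a new object in save, whether or not the caller needed the original preserved.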

Option 2: On the other hand, we could make it so that the caller just has to 
be aware of whether a range is a value type or a reference type and always call
save when passing a reference type forward range to a function. This avoids 
the potential performance penalty but is very error-prone. Instead of the 
range-based function worrying about it, now _every_ function that calls a 
range-based function must worry about it, and that's quite likely to be error-
prone. It's also likely to be somewhat problematic in generic code (which 
range-based code frequently is), since you can't exactly test for whether a 
range is a reference type or not, forcing you to pretty much call save all of 
the time when passing ranges to functions in generic code.

Option 2 is essentially what we've been doing in Phobos except that we don't 
actually test reference type ranges at all, and the odds are that a lot of 
Phobos doesn't actually handle reference type ranges correctly - meaning that 
even if the caller remembers to call save before passing such a range to a 
Phobos range-based function, there's a good chance that the function won't 
work correctly.

Option 3: Or, we could just say that you should _always_ call save when 
passing a forward range to a function. That way, you avoid the issue of trying 
to figure out whether a range is a reference type or not. It's also less error-
prone in that the fact that you're always doing it makes you less likely to 
forget to do it when you need to. However, you have the irritation of having 
to check for input ranges (since you can't call save on them) and essentially 
end up doing the exact same thing as option 1 except that you're doing it at 
every call point instead of once inside of the function definition. So, you 
really haven't gained much over putting it inside of the function and made it 
more error-prone than option 1 since you have to remember to do it everywhere
that you call a function instead of just the once in its definition.

So, the question is which option is the better one? Or is there another option 
that I haven't thought of? It's quite clear to me that we're going to need to 
add unit tests to Phobos to verify the behavior of Phobos functions when 
dealing with ranges which are reference types, but I think that we need a 
clear strategy on how to deal with the fact that value-type forward ranges are 
automatically saved when they're passed to a function whereas reference-type 
forward ranges are not.

Thoughts?

- Jonathan M Davis

Aug 16 2011

Mehrdad <wfunction hotmail.com> writes:

On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know how to
 make it much shorter and cover what it's supposed to cover.

 Okay. Your typical forward range is either an array a struct which is a value
 type (that is, copying it creates an independent range which points to the
 same elements and is not altered if the original range is altered - the
 elements that it points to aren't copied of course).<snip>
 Thoughts?

 - Jonathan M Davis

Funny, I was also thinking about this recently.

The trouble is that that's not the only issue. There's also the issue 
with polymorphism -- i.e., InputRangeObject is pretty much *useless* 
right now because no function ever checks for it (AFAIK... am I wrong?). 
So if you pass a random-access range object as an InputRange, the callee 
will just assume it's an InputRange and would reject it. So you're 
forced to downcast every time, which is really tedious. Things don't 
"just work" anymore.

Aug 16 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, August 16, 2011 21:17:31 Mehrdad wrote:
 On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know
 how to make it much shorter and cover what it's supposed to cover.
 
 Okay. Your typical forward range is either an array a struct which is a
 value type (that is, copying it creates an independent range which
 points to the same elements and is not altered if the original range is
 altered - the elements that it points to aren't copied of
 course).<snip>
 Thoughts?
 
 - Jonathan M Davis

 
 Funny, I was also thinking about this recently.
 
 The trouble is that that's not the only issue. There's also the issue
 with polymorphism -- i.e., InputRangeObject is pretty much *useless*
 right now because no function ever checks for it (AFAIK... am I wrong?).
 So if you pass a random-access range object as an InputRange, the callee
 will just assume it's an InputRange and would reject it. So you're
 forced to downcast every time, which is really tedious. Things don't
 "just work" anymore.

Phobos' functions pretty much always have template constraints to verify that 
they're given the correct type of range, and if they don't then they're 
supposed to. So, lots of functions have isInputRange!R, and most of the other 
range types are subtypes of input ranges, so checking for them also checks for 
input ranges - e.g. isForwardRange!R.

Polymorphism has _nothing_ to do with the range API though. If you're dealing 
with a range type which is a class or interface, then it must implement all of 
the appropriate functions for the type (or types) of range(s) that it's 
supposed to be. If the calls happen to be polymorphic, that's fine, but the 
range-based functions don't care. Whether a type is a particular type of range 
or not is _entirely_ a matter of its API.

So, I don't quite understand what your issue is here.

- Jonathan M Davis

Aug 16 2011

Jesse Phillips <jessekphillips+d gmail.com> writes:

On Tue, 16 Aug 2011 21:17:31 -0700, Mehrdad wrote:

 On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know
 how to make it much shorter and cover what it's supposed to cover.

 Okay. Your typical forward range is either an array a struct which is a
 value type (that is, copying it creates an independent range which
 points to the same elements and is not altered if the original range is
 altered - the elements that it points to aren't copied of
 course).<snip> Thoughts?

 - Jonathan M Davis

 Funny, I was also thinking about this recently.
 
 The trouble is that that's not the only issue. There's also the issue
 with polymorphism -- i.e., InputRangeObject is pretty much *useless*
 right now because no function ever checks for it (AFAIK... am I wrong?).
 So if you pass a random-access range object as an InputRange, the callee
 will just assume it's an InputRange and would reject it. So you're
 forced to downcast every time, which is really tedious. Things don't
 "just work" anymore.

All of the range functions check for functionality, so if your
random-access range object contains popFront, front, and empty (which it
is required to in order to be a random-access range) then it will be
accepted as an InputRange.

Considering your work I'm sure you know this so I'm probably 
misunderstanding what point you are making?

Aug 16 2011

Mehrdad <wfunction hotmail.com> writes:

On 8/16/2011 9:37 PM, Jesse Phillips wrote:
 On Tue, 16 Aug 2011 21:17:31 -0700, Mehrdad wrote:

 On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know
 how to make it much shorter and cover what it's supposed to cover.

 Okay. Your typical forward range is either an array a struct which is a
 value type (that is, copying it creates an independent range which
 points to the same elements and is not altered if the original range is
 altered - the elements that it points to aren't copied of
 course).<snip>  Thoughts?

 - Jonathan M Davis

 Funny, I was also thinking about this recently.

 The trouble is that that's not the only issue. There's also the issue
 with polymorphism -- i.e., InputRangeObject is pretty much *useless*
 right now because no function ever checks for it (AFAIK... am I wrong?).
 So if you pass a random-access range object as an InputRange, the callee
 will just assume it's an InputRange and would reject it. So you're
 forced to downcast every time, which is really tedious. Things don't
 "just work" anymore.

 All of the range functions check for functionality, so if your random-
 access range object contains, popFront, front, empty (which it is
 required to to be random-access range) then it will be accepted as an
 InputRange.

Right, but the problem is that none of this template business (e.g. 
isInputRange!T, hasLength!T, etc.) works if the input is an Object that 
implements InputRange.

For example, consider this:

     static Object getItems()
     { return inputRangeObject([1, 2]); }

     Object collection = getItems();
     if (collection.empty)  //Whoops...
     {
         ...
     }

The caller has no idea what kind of range is returned by getItems(), but 
he still needs to be able to check whether it's empty.

How can he figure this out? He would be forced to cast (which is by 
itself a pretty bad option), but what can he cast the object to?  
InputRange!Object doesn't work because it could be an InputRange!string 
or something. There's really NO way (that I know of) for the caller to 
test and see if the collection is an input range, unless he knows the 

Java).

Hope that makes sense...

Aug 16 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, August 16, 2011 23:20:14 Mehrdad wrote:
 On 8/16/2011 9:37 PM, Jesse Phillips wrote:
 On Tue, 16 Aug 2011 21:17:31 -0700, Mehrdad wrote:
 On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't
 know
 how to make it much shorter and cover what it's supposed to cover.
 
 Okay. Your typical forward range is either an array a struct which
 is a
 value type (that is, copying it creates an independent range which
 points to the same elements and is not altered if the original range
 is
 altered - the elements that it points to aren't copied of
 course).<snip>  Thoughts?
 
 - Jonathan M Davis

 
 Funny, I was also thinking about this recently.
 
 The trouble is that that's not the only issue. There's also the issue
 with polymorphism -- i.e., InputRangeObject is pretty much *useless*
 right now because no function ever checks for it (AFAIK... am I
 wrong?).
 So if you pass a random-access range object as an InputRange, the
 callee
 will just assume it's an InputRange and would reject it. So you're
 forced to downcast every time, which is really tedious. Things don't
 "just work" anymore.

 
 All of the range functions check for functionality, so if your random-
 access range object contains, popFront, front, empty (which it is
 required to to be random-access range) then it will be accepted as an
 InputRange.

 
 Right, but the problem is that none of this template business (e.g.
 isInputRange!T, hasLength!T, etc.) works if the input is an Object that
 implements InputRange.
 
 For example, consider this:
 
      static Object getItems()
      { return inputRangeObject([1, 2]); }
 
      Object collection = getItems();
      if (collection.empty)  //Whoops...
      {
          ...
      }
 
 The caller has no idea what kind of range is returned by getItems(), but
 he still needs to be able to check whether it's empty.
 
 How can he figure this out? He would be forced to cast (which is by
 itself a pretty bad option), but what can he cast the object to?
 InputRange!Object doesn't work because it could be an InputRange!string
 or something. There's really NO way (that I know of) for the caller to
 test and see if the collection is an input range, unless he knows the

 Java).
 
 Hope that makes sense...

If you're dealing with a reference for a type that implements the functions 
for input range or forward range or whatever, then it's not an issue. It'll 
work with functions that require those range types. If you're dealing with a
reference whose type doesn't implement the appropriate functions, then it isn't
a range, even if the actual type is. You could cast to the actual type and use that, but
that pretty much assumes that you know the actual type - or if you don't you 
end up having to do something like

auto ir = cast(InputRange)obj;

if(ir)
    //call range func
else
   //do whatever you do if you can't call the range func

However, the way that OO _normally_ works is that you use a reference which is the
type that you want to treat the object as. So, all of the code using that 
reference only deals with functionality that that reference type has. You 
don't usually cast it to other types to try and do other stuff. So, if a 
function needs the InputRange class/interface, then that's the reference that 
you use for that particular variable, and then whatever you assign it to (from 
a function parameter or a function's return value or whatever) is a type which derives
from or implements InputRange. And it all works just fine.

It's usually _bad_ OO to be casting between object types. In particular, 
actually using the base Object class is usually a _bad_ idea. Some languages 
do that for their containers because they lack proper templates, but then you 
have to worry about casting the objects to the correct type when you get them 

but when they added generics, they made it so that only the internal 
implementation works that way. The generics take care of keeping track of the 
actual type of the objects in the container and you don't have to cast anymore 
(though the casts still occur underneath the hood). So, that improves the 
situation considerably.

But regardless, good OO design does not usually require casting from a base 
class or interface to a derived class. So, the issue that you're describing 
just doesn't happen in good OO code.

- Jonathan M Davis

Aug 16 2011

Mehrdad <wfunction hotmail.com> writes:

On 8/16/2011 11:41 PM, Jonathan M Davis wrote:
 On Tuesday, August 16, 2011 23:20:14 Mehrdad wrote:
 Right, but the problem is that none of this template business (e.g.
 isInputRange!T, hasLength!T, etc.) works if the input is an Object that
 implements InputRange.

 For example, consider this:

       static Object getItems()
       { return inputRangeObject([1, 2]); }

       Object collection = getItems();
       if (collection.empty)  //Whoops...
       {
           ...
       }

 The caller has no idea what kind of range is returned by getItems(), but
 he still needs to be able to check whether it's empty.

 How can he figure this out? He would be forced to cast (which is by
 itself a pretty bad option), but what can he cast the object to?
 InputRange!Object doesn't work because it could be an InputRange!string
 or something. There's really NO way (that I know of) for the caller to
 test and see if the collection is an input range, unless he knows the

 Java).

 Hope that makes sense...

 If you're dealing with a reference for a type that implements the functions
 for input range or forward range or whatever, then it's not an issue. It'll
 work with functions that require those range types. If you're dealing with a
 reference that doesn't implement the appropriate functions, then it isn't even
 if the actual type does. You could cast to the actual type and use that, but
 that pretty much assumes that you know the actual type - or if you don't you
 end up having to do something like

 auto ir = cast(InputRange)obj;

That doesn't compile. I think you missed the entire point of my comment 
-- you have no idea what it's an input range OF.
Read below.
 if(ir)
      //call range func
 else
     //do whatever you do if you can't call the range func

 However, that OO _normally_ works is that you use a reference which is the
 type that you want to treat the object as. So, all of the code using that
 reference only deals with functionality that that reference type has. You
 don't usually cast it to other types to try and do other stuff. So, if a
 function needs the InputRange class/interface, then that's the reference that
 you use for that particular variable, and then whatever you assign it to (from
 a function parameter or a return function or whatever) is a type which derives
 from or implements InputRange. And it all works just fine.

 It's usually _bad_ OO to be casting between object types. In particular,
 actually using the base Object class is usually a _bad_ idea. Some languages
 do that for their containers because they lack proper templates, but then you
 have to worry about casting the objects to the correct type when you get them

 but when they added generics, they made it so that only the internal
 implementation works that way. The generics take care of keeping track of the
 actual type of the objects in the container and you don't have to cast anymore
 (though the casts still occur underneath the hood). So, that improves the
 situation considerably.

 But regardless, good OO design does not usually require casting from a base
 class or interface to a derived class. So, the issue that you're describing
 just doesn't happen in good OO code.

 - Jonathan M Davis

I think you missed my point.

My point wasn't "What if all you have is an Object reference?", but 
rather "What if you don't know the _kind_ of InputRange(T) an object is?".

i.e. You might know very well that a piece of code returns an 
InputRange(T) where T is _SOME_ subclass of a class you know, but not 
have any idea what T is.

When does that happen? When you're dealing with covariance and 
contravariance. You get back a container of SOME kind of object 
reference, but you don't know what kind of container it is, so there's 
NO way for you to "just cast it" to InputRange(T) because you don't know 
what T would be.


situation by letting you return InputRange!T where T is some BASE class 
of what you have. But you can't do that in D (...AFAIK?) so you're 
forced to return an Object, hence my example.

Of course, your argument is that we need to use The Template Hammer, and 
make the CALLER be a template, so that it can accept anything. The 
problem with that is that now your template leaks, i.e. now you're 
forcing a whole bunch of other code to become templated, when it really  
doesn't need to be. You consider that a good idea, but I think you're 
completely ignoring the fact that the entire concept of a shared library 
is to SHARE code. Once you "templatize" a piece of code and then force 
everything else to follow suit, then you can't share the same code -- 
it's NEW code EVERY time. So unless you also consider shared libraries 
to be an indicator of Bad Coding ("only n00bs don't know EVERYTHING at 
compile time") I'm just confused at how you can think that The Template 
Hammer should be used for every nail. It /just fails/ when you're making 
a shared library.

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 02:20:14 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 8/16/2011 9:37 PM, Jesse Phillips wrote:
 On Tue, 16 Aug 2011 21:17:31 -0700, Mehrdad wrote:

 On 8/16/2011 9:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't  
 know
 how to make it much shorter and cover what it's supposed to cover.

 Okay. Your typical forward range is either an array a struct which is  
 a
 value type (that is, copying it creates an independent range which
 points to the same elements and is not altered if the original range  
 is
 altered - the elements that it points to aren't copied of
 course).<snip>  Thoughts?

 - Jonathan M Davis

 Funny, I was also thinking about this recently.

 The trouble is that that's not the only issue. There's also the issue
 with polymorphism -- i.e., InputRangeObject is pretty much *useless*
 right now because no function ever checks for it (AFAIK... am I  
 wrong?).
 So if you pass a random-access range object as an InputRange, the  
 callee
 will just assume it's an InputRange and would reject it. So you're
 forced to downcast every time, which is really tedious. Things don't
 "just work" anymore.

 All of the range functions check for functionality, so if your random-
 access range object contains, popFront, front, empty (which it is
 required to to be random-access range) then it will be accepted as an
 InputRange.

 Right, but the problem is that none of this template business (e.g.  
 isInputRange!T, hasLength!T, etc.) works if the input is an Object that  
 implements InputRange.

 For example, consider this:

      static Object getItems()
      { return inputRangeObject([1, 2]); }

      Object collection = getItems();
      if (collection.empty)  //Whoops...
      {
          ...
      }

 The caller has no idea what kind of range is returned by getItems(), but  
 he still needs to be able to check whether it's empty.

What you are looking for is dynamic typing.  That is not supported  
directly by Object.  That is, you have to know *statically* (i.e. at  
compile time) that *all* instances returned by getItems have an empty  
property.  Object does not have that property, so you used the wrong  
return type.

 How can he figure this out? He would be forced to cast (which is by  
 itself a pretty bad option), but what can he cast the object to?   
 InputRange!Object doesn't work because it could be an InputRange!string  
 or something. There's really NO way (that I know of) for the caller to  
 test and see if the collection is an input range, unless he knows the  

 Java).

Casting is actually the correct solution.  A cast from a base to a  
derived class is not unsafe as long as you forward the type modifiers  
(like const):

if(auto irange = cast(InputRangeObject)collection)
{
    // now you can use irange
    if(collection.empty) // success!
    {
       ...
    }
}

BTW, since input range is the lowest level range, I'd recommend getItems  
return InputRangeObject instead of Object.

Furthermore, since I'm 100% against class-based ranges, I would recommend  
not using them at all :)  Use a struct instead, or don't use the range  
concept here.

-Steve

Aug 17 2011

Mehrdad <wfunction hotmail.com> writes:

On 8/17/2011 7:14 AM, Steven Schveighoffer wrote:
 Casting is actually the correct solution.

 if(auto irange = cast(InputRangeObject)collection)
 {
    // now you can use irange
    if(collection.empty) // success!
    {
       ...
    }
 } 

The correct solution? It doesn't even compile. (See my last post, which 
was after the one you replied to.)

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 10:56:02 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 8/17/2011 7:14 AM, Steven Schveighoffer wrote:
 Casting is actually the correct solution.

 if(auto irange = cast(InputRangeObject)collection)
 {
    // now you can use irange
    if(collection.empty) // success!
    {
       ...
    }
 }

 The correct solution? It doesn't even compile. (See my last post, which  
 was after the one you replied to.)

Oh, right, InputRangeObject is a template.  Sorry, I forgot about that  
aspect.

So actually, that isn't possible if you are returning Object; you need to  
return the correct InputRange(T) type.

(in this case InputRange!int)
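
A small sketch of that point, assuming the current std.range.interfaces module (InputRange and inputRangeObject): the downcast from Object only succeeds when the element type is spelled out correctly, which is exactly the knowledge the caller in the earlier example lacks. getItems is the hypothetical function from that example.

```d
import std.range.interfaces : InputRange, inputRangeObject;

// Returns the wrapped range typed as a plain Object, as in the
// example earlier in the thread.
Object getItems()
{
    return inputRangeObject([1, 2]);
}

void main()
{
    Object collection = getItems();

    // The element type has to be named; InputRange alone is a template.
    auto irange = cast(InputRange!int) collection;
    assert(irange !is null);
    assert(!irange.empty);
    assert(irange.front == 1);

    // Guessing the wrong element type yields null at run time.
    auto wrong = cast(InputRange!string) collection;
    assert(wrong is null);
}
```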

Another good reason to avoid class-based ranges :)

-Steve

Aug 17 2011

Mehrdad <wfunction hotmail.com> writes:

On 8/17/2011 8:26 AM, Steven Schveighoffer wrote:
 On Wed, 17 Aug 2011 10:56:02 -0400, Mehrdad <wfunction hotmail.com> 
 wrote:

 On 8/17/2011 7:14 AM, Steven Schveighoffer wrote:
 Casting is actually the correct solution.

 if(auto irange = cast(InputRangeObject)collection)
 {
    // now you can use irange
    if(collection.empty) // success!
    {
       ...
    }
 }

 The correct solution? It doesn't even compile. (See my last post, 
 which was after the one you replied to.)

 Oh, right, InputRangeObject is a template.  Sorry, I forgot about that 
 aspect.

 So actually, that isn't possible if you are returning Object, you need 
 to return the correct InputRange(T) type.

 (in this case InputRange!int)

 Another good reason to avoid class-based ranges :)

 -Steve

Er, if they aren't supported then please just remove them altogether... 
hasn't that been the philosophy so far?

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 22:36:14 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 8/17/2011 8:26 AM, Steven Schveighoffer wrote:
 On Wed, 17 Aug 2011 10:56:02 -0400, Mehrdad <wfunction hotmail.com>  
 wrote:

 On 8/17/2011 7:14 AM, Steven Schveighoffer wrote:
 Casting is actually the correct solution.

 if(auto irange = cast(InputRangeObject)collection)
 {
    // now you can use irange
    if(collection.empty) // success!
    {
       ...
    }
 }

 The correct solution? It doesn't even compile. (See my last post,  
 which was after the one you replied to.)

 Oh, right, InputRangeObject is a template.  Sorry, I forgot about that  
 aspect.

 So actually, that isn't possible if you are returning Object, you need  
 to return the correct InputRange(T) type.

 (in this case InputRange!int)

 Another good reason to avoid class-based ranges :)

 -Steve

 Er, if they aren't supported then please just remove them altogether...  
 hasn't that been the philosophy so far?

It's not my call.  My opinion differs from the others, especially Andrei.

-Steve

Aug 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/16/11 11:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know how to
 make it much shorter and cover what it's supposed to cover.

[snip]

Keep things as they are. Algorithms operate on ranges as specified in 
their signatures. If they need to create additional copies thereof, they 
use .save. If client code needs to pass a copy of a range to an 
algorithm, it passes .save.
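
A rough sketch of that rule, not taken from Phobos: the function name `average` is made up for illustration, and `inputRangeObject` is used only as a convenient class-based (reference type) forward range. The algorithm needs two passes, so it calls .save for the first one; the caller calls .save only when it wants to keep its own range.

```d
import std.range;

// Hypothetical two-pass algorithm: counts the elements, then sums them.
// It needs a second traversal, so it takes an independent copy via .save.
double average(R)(R range)
    if (isForwardRange!R)
{
    size_t count;
    for (auto probe = range.save; !probe.empty; probe.popFront())
        ++count;
    if (count == 0)
        return 0;

    double total = 0;
    for (; !range.empty; range.popFront())
        total += range.front;
    return total / count;
}

void main()
{
    // Class-based range: copying the reference shares the iteration state.
    auto consumed = inputRangeObject([1, 2, 3]);
    assert(average(consumed) == 2.0);
    assert(consumed.empty);       // the callee consumed the caller's range

    auto kept = inputRangeObject([1, 2, 3]);
    assert(average(kept.save) == 2.0);
    assert(kept.front == 1);      // the caller passed .save, so nothing changed
}
```

Without the .save inside `average`, the counting loop would exhaust a class-based range and the summing loop would see nothing.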

Andrei

Aug 16 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, August 16, 2011 23:26:37 Andrei Alexandrescu wrote:
 On 8/16/11 11:05 PM, Jonathan M Davis wrote:
 Sorry that this is long, but it's very important IMHO, and I don't know
 how to make it much shorter and cover what it's supposed to cover.

 
 [snip]
 
 Keep things as they are. Algorithms operate on ranges as specified in
 their signatures. If they need to create additional copies thereof, they
 use .save. If client code needs to pass a copy of a range to an
 algorithm, it passes .save.

I expect that the result of that is that reference type ranges aren't going to 
work a lot of the time. Now, how much of that is broken implementations and 
how much of it is actual design issues, I don't know.

It's clear to me however that we need to start having unit tests for reference 
type ranges in Phobos (probably both of the struct and the class variety to be 
on the safe side) to make sure that functions at least work correctly when 
they're passed a reference type range, regardless of whether the range gets 
consumed in the process. I expect that we have quite a few bugs in Phobos 
stemming from the fact that pretty much all of the ranges that we test with 
(and that most people use in general at this point) are value type ranges.

- Jonathan M Davis

Aug 16 2011

Peter Alexander <peter.alexander.au gmail.com> writes:

On 17/08/11 5:05 AM, Jonathan M Davis wrote:
 It was previously determined that this would be a problem for ranges which are
 reference types (classes in particular, but it affects structs as well, if
 copying them doesn't create an independent range). So, we added the save
 property.

 <snip>

 Thoughts?

Apologies for my ignorance, but I haven't really been following all this 
ranges stuff.

I must be missing something, why would you ever expect an algorithm that 
works with value types to work with reference types as well?

Aug 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, August 17, 2011 09:03:52 Peter Alexander wrote:
 On 17/08/11 5:05 AM, Jonathan M Davis wrote:
 It was previously determined that this would be a problem for ranges
 which are reference types (classes in particular, but it affects
 structs as well, if copying them doesn't create an independent range).
 So, we added the save property.
 
 <snip>
 
 Thoughts?

 
 Apologies for my ignorance, but I haven't really been following all this
 ranges stuff.
 
 I must be missing something, why would you ever expect an algorithm that
 works with value types to work with reference types as well?

A range is any type which has the appropriate functions on it. It doesn't 
matter whether it's an array, a struct, or a class. And if it's a struct, it 
could be either a value type or a reference type. So, a range could be either 
a value type or a reference type. In the general case, you can't know without 
reading the code whether a particular range is a value type or a reference 
type (though obviously in the case of classes, you know that it's a reference 
type), and traits can't tell you whether a range is a value type or a 
reference type. So, range-based functions can't assume that a range is a value 
type, and they can't assume that a range is a reference type. This has nothing 
to do with the elements in the range mind you. It's purely a matter of the 
type of the range itself.

So, in order to deal with the issue that

auto rangeCopy = range;

doesn't necessarily copy, save was introduced to make it so that you can 
guarantee that you're getting a copy

auto rangeCopy = range.save;

The issue that I'm bringing up is that you still get different behavior between 
value type and reference type ranges when you pass them to a function. The 
only way to guarantee the same behavior is to either call save before passing 
a range into a function or to call it once it's been passed in.

In any case, essentially what it comes down to is that you have no idea in the 
general case whether a range is a value type or a reference type, and you _have_ 
to code in a manner which works with both or you're going to end up with buggy 
code.
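
To make the difference concrete, here is a minimal sketch. The types `ValueCounter` and `RefCounter` and the function `drain` are invented for this illustration; they are not Phobos code. Both types satisfy isForwardRange, yet the same call leaves one caller's range untouched and the other one empty.

```d
import std.range;

// Value type forward range: copying the struct copies the iteration state.
struct ValueCounter
{
    int current, limit;
    @property bool empty() const { return current >= limit; }
    @property int front() const { return current; }
    void popFront() { ++current; }
    @property ValueCounter save() { return this; }
}

// Reference type forward range: copying the reference shares the state.
final class RefCounter
{
    int current, limit;
    this(int current, int limit) { this.current = current; this.limit = limit; }
    @property bool empty() const { return current >= limit; }
    @property int front() const { return current; }
    void popFront() { ++current; }
    @property RefCounter save() { return new RefCounter(current, limit); }
}

static assert(isForwardRange!ValueCounter);
static assert(isForwardRange!RefCounter);

// Walks its parameter to the end and returns the number of elements seen.
size_t drain(R)(R range)
{
    size_t n;
    for (; !range.empty; range.popFront())
        ++n;
    return n;
}

void main()
{
    auto v = ValueCounter(0, 3);
    drain(v);
    assert(!v.empty);   // the function advanced a copy

    auto r = new RefCounter(0, 3);
    drain(r);
    assert(r.empty);    // the function advanced the caller's range

    auto r2 = new RefCounter(0, 3);
    drain(r2.save);
    assert(!r2.empty);  // .save gives the same outcome as the value type
}
```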

- Jonathan M Davis

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 00:05:54 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 So, the question is, should a range-based function have the same  
 behavior for
 all forward ranges regardless of whether they're value types or reference
 types? Or should the caller be aware of whether a range is a value type  
 or a
 reference type and call save if necessary? Or should the caller just  
 always
 call save when passing a forward range to a function?

Probably not helpful, since the establishment seems to be set in their  
opinions, but I'd recommend saying ranges are always structs, and get rid  
of the save concept, replacing it with an enum solution.  The current save  
regime is a fallacy, because it's not enforced.  It's as bad as c++ const.

At the very least, let's wait until someone actually comes up with a valid  
use case for reference-based forward ranges before changing any code.  So  
far, all I've seen is boilerplate *RangeObject, no real usages.

-Steve

Aug 17 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 17 Aug 2011 10:19:31 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 00:05:54 -0400, Jonathan M Davis
 <jmdavisProg gmx.com> wrote:
 
 So, the question is, should a range-based function have the same
 behavior for
 all forward ranges regardless of whether they're value types or
 reference types? Or should the caller be aware of whether a range is a
 value type or a
 reference type and call save if necessary? Or should the caller just
 always
 call save when passing a forward range to a function?

 Probably not helpful, since the establishment seems to be set in their
 opinions, but I'd recommend saying ranges are always structs, and get
 rid of the save concept, replacing it with an enum solution.  The
 current save regime is a fallacy, because it's not enforced.  It's as
 bad as c++ const.
 
 At the very least, let's wait until someone actually comes up with a
 valid use case for reference-based forward ranges before changing any
 code.  So far, all I've seen is boilerplate *RangeObject, no real
 usages.

As long as most functions in std.algorithm don't take the ranges as ref 
arguments, you need to use a reference-based range whenever you want the 
function to consume the original range.

BTW, this is why I suggested earlier that we add a byRef range.  If you 
absolutely want the function foo() to consume your range, write

   foo(byRef(myRange));

If you absolutely *don't* want the function to consume your range, write

   foo(myRange.save);

If you don't intend to use the range afterwards, and therefore don't care 
whether it is consumed or not, write

  foo(myRange);
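
byRef is only a proposal in this thread. As far as I know, later Phobos releases provide std.range.refRange, which wraps a pointer to a range and serves a similar purpose, so the sketch below uses it as a stand-in. The function `skip` is a made-up substitute for foo().

```d
import std.range;

// Hypothetical stand-in for foo(): pops up to n elements, returns how many.
size_t skip(R)(R range, size_t n)
    if (isInputRange!R)
{
    size_t popped;
    while (popped < n && !range.empty)
    {
        range.popFront();
        ++popped;
    }
    return popped;
}

void main()
{
    int[] myRange = [1, 2, 3, 4];

    // Must not be consumed: pass an explicit copy.
    skip(myRange.save, 2);
    assert(myRange == [1, 2, 3, 4]);

    // Must be consumed: pass a wrapper that forwards to the original.
    skip(refRange(&myRange), 2);
    assert(myRange == [3, 4]);

    // Don't care: pass it as is and don't rely on its state afterwards.
    skip(myRange, 1);
}
```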

-Lars

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 13:15:27 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Wed, 17 Aug 2011 10:19:31 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 00:05:54 -0400, Jonathan M Davis
 <jmdavisProg gmx.com> wrote:

 So, the question is, should a range-based function have the same
 behavior for
 all forward ranges regardless of whether they're value types or
 reference types? Or should the caller be aware of whether a range is a
 value type or a
 reference type and call save if necessary? Or should the caller just
 always
 call save when passing a forward range to a function?

 Probably not helpful, since the establishment seems to be set in their
 opinions, but I'd recommend saying ranges are always structs, and get
 rid of the save concept, replacing it with an enum solution.  The
 current save regime is a fallacy, because it's not enforced.  It's as
 bad as c++ const.

 At the very least, let's wait until someone actually comes up with a
 valid use case for reference-based forward ranges before changing any
 code.  So far, all I've seen is boilerplate *RangeObject, no real
 usages.

 As long as most functions in std.algorithm don't take the ranges as ref
 arguments, you need to use a reference-based range whenever you want the
 function to consume the original range.

 BTW, this is why I suggested earlier that we add a byRef range.  If you
 absolutely want the function foo() to consume your range, write

    foo(byRef(myRange));

Do you have a real example besides foo which makes sense on both byRef and  
by value ranges?

I think it's rather important for the function implementation to know  
what's happening with its range while using it.

-Steve

Aug 17 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 17 Aug 2011 14:15:52 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 13:15:27 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 On Wed, 17 Aug 2011 10:19:31 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 00:05:54 -0400, Jonathan M Davis
 <jmdavisProg gmx.com> wrote:

 So, the question is, should a range-based function have the same
 behavior for
 all forward ranges regardless of whether they're value types or
 reference types? Or should the caller be aware of whether a range is
 a value type or a
 reference type and call save if necessary? Or should the caller just
 always
 call save when passing a forward range to a function?

 Probably not helpful, since the establishment seems to be set in their
 opinions, but I'd recommend saying ranges are always structs, and get
 rid of the save concept, replacing it with an enum solution.  The
 current save regime is a fallacy, because it's not enforced.  It's as
 bad as c++ const.

 At the very least, let's wait until someone actually comes up with a
 valid use case for reference-based forward ranges before changing any
 code.  So far, all I've seen is boilerplate *RangeObject, no real
 usages.

 As long as most functions in std.algorithm don't take the ranges as ref
 arguments, you need to use a reference-based range whenever you want
 the function to consume the original range.

 BTW, this is why I suggested earlier that we add a byRef range.  If you
 absolutely want the function foo() to consume your range, write

    foo(byRef(myRange));

 
 Do you have a real example besides foo which makes sense on both byRef
 and by value ranges?

Well, I did try my hand at writing a parser for a wiki-style markup 
language a while ago, which got its input from an input range.

It would look at the front of the range, determine what kind of element 
was there (paragraph, heading, bullet list, etc.), and pass the range on 
to a specialised function for dealing with that kind of element 
(parseHeading(), etc.).

Of course, those functions had to consume the original range, otherwise 
the same element would be repeated over and over again.  For simple 
cases, this was only a matter of parseWhatever() taking the range by ref, 
and everything would work nicely.  Sometimes, however, the range would be 
wrapped by another range (such as Take or Until).  If I wanted these to 
keep consuming the original range, I had to wrap it with byRef().

This happened often enough, and became annoying enough, that I ended up 
using InputRange objects everywhere instead.
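
A small sketch of the wrapping problem, under some assumptions: `Lines` and `parseBlock` are invented for illustration and are not the original parser, and std.range.refRange (which, to my knowledge, was added to Phobos after this thread) stands in for byRef(). `Lines` is deliberately an input-only struct with no slicing, so take() really builds a Take wrapper around it.

```d
import std.range;

// Hypothetical line source: a plain struct, input range only.
struct Lines
{
    string[] data;
    @property bool empty() const { return data.length == 0; }
    @property string front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical sub-parser: reads at most n lines through a Take wrapper.
string[] parseBlock(R)(R lines, size_t n)
{
    string[] result;
    foreach (line; take(lines, n))
        result ~= line;
    return result;
}

void main()
{
    auto input = Lines(["= Title", "text", "more text"]);

    // By value: Take advances its own copy, so the heading stays at the front
    // and the next parsing step would see it again.
    assert(parseBlock(input, 1) == ["= Title"]);
    assert(input.front == "= Title");

    // Through refRange: Take pops the caller's range.
    assert(parseBlock(refRange(&input), 1) == ["= Title"]);
    assert(input.front == "text");
}
```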

-Lars

Aug 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Aug 2011 14:53:53 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Wed, 17 Aug 2011 14:15:52 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 13:15:27 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:

 On Wed, 17 Aug 2011 10:19:31 -0400, Steven Schveighoffer wrote:

 On Wed, 17 Aug 2011 00:05:54 -0400, Jonathan M Davis
 <jmdavisProg gmx.com> wrote:

 So, the question is, should a range-based function have the same
 behavior for
 all forward ranges regardless of whether they're value types or
 reference types? Or should the caller be aware of whether a range is
 a value type or a
 reference type and call save if necessary? Or should the caller just
 always
 call save when passing a forward range to a function?

 Probably not helpful, since the establishment seems to be set in their
 opinions, but I'd recommend saying ranges are always structs, and get
 rid of the save concept, replacing it with an enum solution.  The
 current save regime is a fallacy, because it's not enforced.  It's as
 bad as c++ const.

 At the very least, let's wait until someone actually comes up with a
 valid use case for reference-based forward ranges before changing any
 code.  So far, all I've seen is boilerplate *RangeObject, no real
 usages.

 As long as most functions in std.algorithm don't take the ranges as ref
 arguments, you need to use a reference-based range whenever you want
 the function to consume the original range.

 BTW, this is why I suggested earlier that we add a byRef range.  If you
 absolutely want the function foo() to consume your range, write

    foo(byRef(myRange));

 Do you have a real example besides foo which makes sense on both byRef
 and by value ranges?

 Well, I did try my hand at writing a parser for a wiki-style markup
 language a while ago, which got its input from an input range.

 It would look at the front of the range, determine what kind of element
 was there (paragraph, heading, bullet list, etc.), and pass the range on
 to a specialised function for dealing with that kind of element
 (parseHeading(), etc.).

 Of course, those functions had to consume the original range, otherwise
 the same element would be repeated over and over again.  For simple
 cases, this was only a matter of parseWhatever() taking the range by ref,
 and everything would work nicely.  Sometimes, however, the range would be
 wrapped by another range (such as Take or Until).  If I wanted these to
 keep consuming the original range, I had to wrap it with byRef().

 This happened often enough, and became annoying enough, that I ended up
 using InputRange objects everywhere instead.

The problem here seems to be that an input range is used as the base of a  
forward range.  A forward range is much different than an input range, in  
that an input range destroys the data as it iterates, whereas the forward  
range does not.

I would say that anything that is forward range or above should never be a  
reference type, but anything that is strictly an input range *should*  
actually be a reference type (hey, I switched opinions!).  The issue is  
that all forward ranges are input ranges.

Note that while I asked for a real example of an *algorithm*, you gave me  
an example of a *type* that doesn't forward the desired behavior.  I see  
the point however, and I think a different style of thinking is needed.   
That is, accepting a ref range is not guaranteed to make any range into a  
destructive input range, you do need a byRef range.

In any case, I think I found a more straightforward example of something  
that can accept either: walkLength.  walkLength doesn't care whether the  
data gets destroyed or not, it's just counting stuff.  So there is  
legitimate reason for something to accept both an input and forward+ range.

So what I think we need is one more isXRange to determine "is this an  
input range and *only* an input range?"  That is, is this a *destructive*  
input range.  In the current implementation, this would mean  
isInputRange!R && !isForwardRange!R
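
Spelled out as a sketch, with the trait name `isInputOnlyRange` and the type `OneShot` made up for illustration (neither exists in Phobos as far as I know):

```d
import std.range;

// Hypothetical trait: true for ranges that are input ranges and nothing more.
enum isInputOnlyRange(R) = isInputRange!R && !isForwardRange!R;

// An input-only range: it has no save, so it cannot be copied for a second pass.
struct OneShot
{
    int remaining;
    @property bool empty() const { return remaining == 0; }
    @property int front() const { return remaining; }
    void popFront() { --remaining; }
}

static assert(isInputOnlyRange!OneShot);
static assert(!isInputOnlyRange!(int[]));   // arrays are forward ranges and more

void main()
{
    // walkLength accepts both kinds; it only counts elements.
    assert(walkLength(OneShot(3)) == 3);
    assert(walkLength([10, 20, 30]) == 3);
}
```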

I still dislike save and how useless it is, though.

-Steve

Aug 17 2011
