digitalmars.D - Move semantics for D

reply Benjamin Thaut <code benjamin-thaut.de> writes:
Move semantics in C++0x are quite nice for optimization purposes. 
Thinking about it, it should be fairly easy to implement move semantics 
in D as structs don't have identity. Therefore a move constructor would 
not be required. You can already move value types for example within an 
array just by plain moving the data of the value around. With a little 
new keyword 'mov' or 'move' it would also be possible to move value 
types into and out of functions, something like this:

mov Range findNext(mov Range r)
{
   //do stuff here
}

With something like this it would not be necessary to copy the range 
twice during the call of this function; the compiler could just plain 
copy the data and reinitialize the origin in the case of the argument.
In the case of the return value, only the copy would be necessary, as the 
data goes out of scope anyway.

The only question would be if this causes any problems with out contracts.

The pre-C++0x move trick, reserving a bit in the value representation 
for marking that the next copy is a move, is unfortunately not possible 
in D because D does not have a copy constructor.

I, for example, have a range that iterates over an octree and thus needs to 
internally track which nodes it has already visited and which ones are still 
left. This is done with a stack container, which needs to be copied 
every time the range is copied, causing quite some overhead.

Kind Regards
Benjamin Thaut
Jul 12 2012
next sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 06:50:02 UTC, Benjamin Thaut wrote:
 With a little new keyword 'mov' or 'move' it would also be possible to
 move value types into and out of functions, something like this:

 mov Range findNext(mov Range r)
 {
   //do stuff here
 }
I'm pretty sure D already has it: value types are "moved" in and out of functions implicitly, when possible, without any special semantics.

For "return by value", the value is simply "blit copied" (memcpy) when returning a local variable. Neither the source's destructor nor the target's postblit constructor is called. Just a binary bit copy.

For passing in by value, the compiler will do the same trick "if and when" it can detect you are never going to use said value again. If you are, you can always force a move using an explicit std.algorithm.move:

"myRange = findNext(move(myRange));"

You get pretty much the same effect as in C++11, except:
a) the compiler will eagerly move stuff for you
b) you don't even have to define fun(T&&) (Whew)

Finally, if you want to shift the responsibility for the move from the caller to the callee, I *guess* you can always just pass by ref and do stuff there:

Range findNext(ref Range r)
{
   Range r2;
   move(r, r2); //move r into r2
   //do stuff here
   return r2; //move return
}

Note that while it might seem useless to move r into r2 (since you already have r by reference), you still have to move into a local temporary so that the compiler can "move" r2 out of findNext.

The above function will make 0 calls to this(this) and 0 calls to ~this. There will be about two copies of this.init.

...of course, at this point, one could wonder why:
a) you don't just take a ref, and return void?
b) you don't use the safer, just as efficient "myRange = findNext(move(myRange));"
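A minimal sketch of the difference between the plain call and the explicit move, assuming a hypothetical Range struct whose postblit deep-copies its state and counts its own calls (the struct, its fields and the counter are illustrative, not taken from the thread). The counts are printed rather than stated here, since they may depend on the compiler version.

```d
import std.algorithm : move;
import std.stdio : writeln;

// Hypothetical range-like struct that counts how often it is copied.
struct Range
{
    int[] stack;            // stands in for the expensive tracking state
    static int postblits;   // number of this(this) calls so far

    this(this)
    {
        ++postblits;
        stack = stack.dup;  // deep copy, the cost we would like to avoid
    }
}

// Takes and returns the range by value.
Range findNext(Range r)
{
    // do stuff here
    return r;
}

void main()
{
    auto myRange = Range([1, 2, 3]);

    Range.postblits = 0;
    auto copied = findNext(myRange);       // lvalue argument: copied in
    writeln("plain call: ", Range.postblits, " postblit call(s)");

    Range.postblits = 0;
    myRange = findNext(move(myRange));     // explicit move at the call site
    writeln("with move:  ", Range.postblits, " postblit call(s)");
}
```

Comparing the two printed counts on your own compiler is probably the easiest way to check how much of the copying the explicit move actually removes.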
Jul 13 2012
next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Friday, 13 July 2012 at 08:06:08 UTC, monarch_dodra wrote:
 Move semantics in C++0x are quite nice for optimization 
 purposes.
That's not why they were introduced. They were introduced for making things like unique_ptr possible.
Jul 13 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 13.07.2012 10:19, Mehrdad wrote:
 That's not why they were introduced. They were introduced for making things like unique_ptr possible.
Well, that's your point of view. I'm coming from the gaming industry, and everyone I have talked to so far is happy because they can use it to optimize better.

Kind Regards
Benjamin Thaut
Jul 13 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 10:45:03 Benjamin Thaut wrote:
 Well, that's your point of view. I'm coming from the gaming industry, and everyone I have talked to so far is happy because they can use it to optimize better.
I expect that the reality of the matter is that it was introduced for both.

- Jonathan M Davis
Jul 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/13/12 5:11 AM, Jonathan M Davis wrote:
 I expect that the reality of the matter is that it was introduced for both.
Three. These two plus perfect forwarding. Which is imperfect :o).

Andrei
Jul 13 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Friday, 13 July 2012 at 11:04:00 UTC, Andrei Alexandrescu 
wrote:
 Three. These two plus perfect forwarding.
Yup, that too.
 Which is imperfect :o).
Why/how?
Jul 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/13/12 9:05 PM, Mehrdad wrote:
 Why/how?
https://www.facebook.com/video/video.php?v=10151094464083109

Andrei
Jul 13 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Saturday, 14 July 2012 at 03:16:20 UTC, Andrei Alexandrescu 
wrote:
 https://www.facebook.com/video/video.php?v=10151094464083109

 Andrei
Interesting, I'd never even thought about those cases. That definitely _does_ make it less than perfect... but then again, I never really would have expected them to work anyway. (0/NULL? That's why they have nullptr! or I never would've expected bitfields to work either, etc.) Still though, I feel calling it "imperfect" is kinda strong. It doesn't have any _problems_ per se, just shortcomings.
Jul 13 2012
parent Ali Çehreli <acehreli yahoo.com> writes:
On 07/13/2012 09:45 PM, Mehrdad wrote:
 Still though, I feel calling it "imperfect" is kinda strong. It doesn't
 have any _problems_ per se, just shortcomings.
If I remember correctly, Scott calls it imperfect as well. :) (Have you watched both parts of the presentation? I think it was closer to the end.)

The hoops that one needs to go through to make the compiler agree with what is needed were simply scary for me. For example, the way enable_if is used is so indirect: enabling or disabling a template implementation by using a strange return type is simply WAT! :) In that sense, the feature is not perfect.

Ali
Jul 14 2012
prev sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 13.07.2012 10:06, monarch_dodra wrote:
 For passing in by value, the compiler will do the same trick "if and when" it can detect you are never going to use said value again. If you are, you can always force a move using an explicit std.algorithm.move: "myRange = findNext(move(myRange));"
I didn't know about the compiler already moving structs as return values. But not having to do the move for an argument manually would be nice, especially as ref does not work with non-lvalues.

Kind Regards
Benjamin Thaut
Jul 13 2012
prev sibling next sibling parent reply travert phare.normalesup.org (Christophe Travert) writes:
Benjamin Thaut, in message (digitalmars.D:172207), wrote:
 mov Range findNext(mov Range r)
 {
    //do stuff here
 }
 
 With something like this it would not be neccessary to copy the range 
 twice during the call of this function, the compiler could just plain 
 copy the data and reinitialize the origin in case of the argument.
 In case of the return value to only copying would be neccessary as the 
 data goes out of scope anyway.
If Range is an rvalue, it will be moved, not copied. If it's an lvalue, your operation is dangerous, and does not bring you much more than using ref (it may be faster to copy the range than to take the reference, but that's an optimiser issue).

auto ref seems to be the solution.
 I for example have a range that iterates over a octree and thus needs to 
 internally track which nodes it already visited and which ones are still 
 left. This is done with a stack container. That needs to be copied 
 everytime the range is copied, which causes quite some overhead.
I would share the tracking data between several instances of the range, making bitwise copy suitable. Tracking data would be duplicated only on a call to save or opSlice(). You'd hit the issue of foreach not calling save when it should, but opSlice would solve this, and you could still overload opApply if you want to be sure.

--
Christophe
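One way to read the suggestion above, as a rough sketch only: the range holds a reference to its tracking state, so a plain copy is a cheap bitwise copy that shares the state, and only save duplicates it. NodeRange, State and the int "nodes" are hypothetical stand-ins for the octree range and its stack container.

```d
import std.stdio : writeln;

// Hypothetical tracking state, held by reference so copies can share it.
final class State
{
    int[] stack;   // stands in for the stack of nodes still to visit
}

// Hypothetical range: copying the struct copies only the reference.
struct NodeRange
{
    private State state;

    this(int[] nodes)
    {
        state = new State;
        state.stack = nodes.dup;
    }

    @property bool empty() const { return state.stack.length == 0; }
    @property int front() const { return state.stack[$ - 1]; }
    void popFront() { state.stack = state.stack[0 .. $ - 1]; }

    // Only save pays for duplicating the tracking data.
    @property NodeRange save() { return NodeRange(state.stack); }
}

void main()
{
    auto a = NodeRange([1, 2, 3]);
    auto b = a;        // bitwise copy: cheap, shares the tracking state with a
    auto c = a.save;   // explicit save: duplicates the tracking state
    b.popFront();
    writeln(a.front);  // 2, a follows b because the state is shared
    writeln(c.front);  // 3, the saved copy is unaffected
}
```

The trade-off is that such a range behaves like a reference type, so any code that relies on an implicit copy being independent has to call save.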
Jul 13 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 13.07.2012 10:59, Christophe Travert wrote:
I would share the tracking data between several instance of the range, making bitwise copy suitable. Tracking data would be duplicated only on call to save or opSlice(). You'd hit the issue of foreach not calling save when it should, but opSlice would solve this, and you could still overload opApply if you want to be sure.
I couldn't find anything in the documentation about foreach calling save or opSlice(). So I assume foreach calls opSlice if available?

foreach(el; range) { ... }

translates to:

for(auto r = range[]; !r.empty(); r.popFront())
{
   auto el = r.front();
   ...
}

Kind Regards
Benjamin Thaut
Jul 13 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 11:09:04 Benjamin Thaut wrote:
 I couldn't find anything in the documentation about foreach calling save
 or opSlice(). So I assume foreach calls opSlice if available?
If you have

foreach(e; obj)
{
    ...
}

and obj is a range, then it becomes

for(auto __range = obj; !__range.empty; __range.popFront())
{
    auto e = __range.front;
    ...
}

If obj is not a range, but it does declare opSlice, then it's sliced, so you get something like

for(auto __range = obj[]; !__range.empty; __range.popFront())
{
    auto e = __range.front;
    ...
}

There's also opApply to consider though, and it probably gets precedence over opSlice if a type defines both. I don't know exactly how all of that's laid out right now though. However, there's a recent thread discussing it ("opApply not called for foeach(container)"), so you can read that, and I think that they give better details on precedence. I haven't read it in great detail though, so I don't know the details other than the fact that it sounds like the precedence rules for foreach probably need to be ironed out better.

- Jonathan M Davis
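To make the lowering above concrete, here is a small sketch assuming a hypothetical struct range named Counter. It compares foreach with the hand-written for loop and shows that, for a struct range, the loop works on a copy and leaves the original untouched.

```d
import std.stdio : writeln;

// Hypothetical minimal input range over current .. end; a value type.
struct Counter
{
    int current;
    int end;

    @property bool empty() const { return current >= end; }
    @property int front() const { return current; }
    void popFront() { ++current; }
}

// Sums the elements with the hand-written loop that foreach is
// described as turning into when obj is a range.
int sumLowered(Counter obj)
{
    int total = 0;
    for (auto r = obj; !r.empty; r.popFront())
    {
        auto e = r.front;
        total += e;
    }
    return total;
}

void main()
{
    auto c = Counter(0, 4);

    int viaForeach = 0;
    foreach (e; c)
        viaForeach += e;

    writeln(viaForeach == sumLowered(c)); // true, both sum 0 + 1 + 2 + 3
    writeln(c.current);                   // 0, foreach advanced a copy, not c
}
```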
Jul 13 2012
prev sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 09:09:03 UTC, Benjamin Thaut wrote:
 I couldn't find anything in the documentation about foreach calling save or opSlice(). So I assume foreach calls opSlice if available?

 foreach(el; range) { ... }

 translates to:

 for(auto r = range[]; !r.empty(); r.popFront()) { auto el = r.front(); ... }
That depends on whether you are asking about "what the compiler does", "what the compiler should do", or "what the documentation says". There is a discussion about it here:

http://forum.dlang.org/thread/rxpbtrawpjzvdfuuwmwp forum.dlang.org

I think that in the case of your example, if "range" fulfills the requirements for (at least) an input range, then "save" *should* be called instead of "opSlice". I *think* this is what the compiler does, but I'm not 100% sure.
Jul 13 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 11:23:41 monarch_dodra wrote:
 I think that in the case of your example, if "range" fulfills the
 requirements for (at least) an input range, then "save" *should*
 be called instead of "opSlice". I *think* this is what the
 compiler does, but I'm not 100% sure.
The compiler never calls save. It will call opSlice on non-ranges, but it'll never call save on anything. It doesn't even know that save exists. It's unnecessary for ranges which aren't reference types, and you already have to worry about calling save on ranges that could be reference types any time that you pass them to a function that you don't want to consume them, so it's not really a big deal to have to call save with foreach for such ranges. Most range-based functions don't even use foreach much anyway.

- Jonathan M Davis
Jul 13 2012
parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 09:47:26 UTC, Jonathan M Davis wrote:
 The compiler never calls save. It will call opSlice on non-ranges, but it'll never call save on anything. It doesn't even know that save exists.
What exactly are the semantics of save? The reference in std.range isn't very clear. It would appear that it is only its *existence* that is useful, in that it promotes a range from input to forward. However, how is it different from a simple assignment?

Also, if you are supposed to call save before a foreach "consumes" your range, then why does foreach even bother making a copy of the range before iterating on it? Isn't this behavior promoting dangerous usage for ranges where save is simply "{return this;}", but bites you in the ass the day you use a range with a specific save?

I'm confused...
Jul 13 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 12:24:45 monarch_dodra wrote:
 What exactly are the semantics of save? The reference in std.range isn't very clear.
save exists so that classes can be forward ranges. Arrays and most structs are copied when you assign them to another variable or pass them to a function, but classes aren't. So, save was introduced to make it so that there is a way to explicitly copy a range. It becomes useful even for structs and arrays in that it documents that you're copying it, but it's outright necessary for classes. Unfortunately, since arrays and structs are by far the most common range types, save doesn't get used anywhere near as much as it should be.

Basically, you use save if you want to guarantee that the range is copied, and you don't use save if you don't care. You can also look at this:

http://stackoverflow.com/questions/11190204/how-do-you-use-ranges-in-

- Jonathan M Davis
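A short sketch of why save matters for class ranges, assuming a hypothetical class named Countdown (not from the thread). Passing the object itself shares its state with the callee, while passing a.save hands over an independent copy.

```d
import std.range.primitives : isForwardRange;
import std.stdio : writeln;

// Hypothetical forward range implemented as a class (a reference type).
class Countdown
{
    private int n;

    this(int n) { this.n = n; }

    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    // save returns an independent copy of the iteration state.
    @property Countdown save() { return new Countdown(n); }
}

static assert(isForwardRange!Countdown);

// Counts the remaining elements, consuming whatever object it is given.
int countRemaining(Countdown r)
{
    int count = 0;
    for (; !r.empty; r.popFront())
        ++count;
    return count;
}

void main()
{
    auto a = new Countdown(3);
    writeln(countRemaining(a.save)); // 3, the function consumed a copy
    writeln(a.empty);                // false, a is untouched
    writeln(countRemaining(a));      // 3, the function consumed a itself
    writeln(a.empty);                // true
}
```

With a struct holding the same int, the second call would also have left a untouched, which is the value type versus reference type difference discussed in this thread.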
Jul 13 2012
next sibling parent "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 10:33:13 UTC, Jonathan M Davis wrote:
 Basically, you use save if you want to guarantee that the range is copied, and you don't use save if you don't care.
I guess that "save" makes sense in that context. Thank you.

However, foreach is starting to look very dangerous to me: isn't the fact that it (potentially) calls opSlice, or makes a copy of your input, just asking for potential problems? Or is this a "struct vs class" issue that I do not yet fully grasp?

Shouldn't we instead enforce an _explicit_ opSlice/save? E.g.:

auto SomeContainer = ... ;
auto SomeRange = SomeContainer[ ... ];
foreach(v; SomeContainer[]) ... ; //Fine, iterate on a new range, and consume that
foreach(v; SomeRange.save) ... ; //Fine, iterate on a copy of the range, and consume that
foreach(v; SomeContainer) ... ; //Fine, I will NOT call opSlice, and _consume_ your container
foreach(v; SomeRange) ... ; //Fine, I will NOT copy, and _consume_ your range

...

I suppose the "recommendation" is to use the above form, and that is what I will be doing from now on. But I still feel that the call to opSlice/copy is really just a trap disguised as a safety net...
Jul 13 2012
prev sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 10:33:13 UTC, Jonathan M Davis wrote:
 On Friday, July 13, 2012 12:24:45 monarch_dodra wrote:
 On Friday, 13 July 2012 at 09:47:26 UTC, Jonathan M Davis 
 wrote:
 On Friday, July 13, 2012 11:23:41 monarch_dodra wrote:
 I think that in the case of your example, if "range" 
 fulfills
 the
 requirements for (at least) an input range, then "save"
 *should*
 be called instead of "opSlice". I'm *think* this is what the
 compiler does, but I'm not 100% sure.
The compiler never calls save. It will call opSlice on non-ranges, but it'll never call save on anything. It doesn't even know that save exists. It's unnecessary for ranges which aren't reference types, and you already have to worry about calling save on ranges that could be reference types any time that you pass them to a function that you don't want to consume it, so it's not really a big deal to have to call save with foreach for such ranges. Most range-based functions don't even use foreach much anyway. - Jonathan M Davis
What exactly are the semantics of save? The reference in std.range isn't very clear. It would appear it is only useful it its *existence* that promotes a range from input to forward. However, how is it different from a simple assignment? Also, if you are supposed to call save before a foreach "consumes" your range, then why does foreach even bother making a copy of the range before iterating on it? Isn't this behavior promoting dangerous usage for ranges where save is simply "{return this;}", but bites you in the ass the day you use a range with a specific save? I'm confused...
save exists so that classes can be forward ranges. Arrays and most structs are copied when you assign them to another variable or pass them to a function, but classes aren't. So, save was introduced to make it so that there is a way to explicitly copy a range. It becomes useful even for structs and arrays in that it documents that you're copying it, but it's outright necessary for classes. Unfortunately, since arrays and structs are by far the most common range types, save doesn't get used anywhere near as much as it should be. Basically, you use save if you want to guarantee that the range is copied, and you don't use save if you don't care.

- Jonathan M Davis
Thanks a lot for the explanation. It makes a lot of sense.

However, foreach is starting to look very dangerous to me: Isn't the fact that it (potentially) calls opSlice, or makes a copy of your input, just asking for potential problems? Or is this more of a "struct vs class" issue that I do not yet fully grasp?

Shouldn't D _enforce_ an _explicit_ opSlice/save? Eg:

auto SomeContainer = ... ;
auto SomeRange = SomeContainer[ ... ];
foreach(v; SomeContainer[]) ... ; //Fine, iterate on a new range, and consume that
foreach(v; SomeRange.save) ... ; //Fine, iterate on a copy of the range, and consume that
foreach(v; SomeContainer) ... ; //Fine, I will NOT call opSlice, and _consume_ your container
foreach(v; SomeRange) ... ; //Fine, I will NOT copy, and _consume_ your range

On a side note, it would also make foreach's behavior clearer...

...

I suppose the "recommendation" is to use the above form, and that is what I will be doing from now on. But I still feel that the internal call to opSlice/copy is really just a trap disguised as a safety net...
Jul 13 2012
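A small sketch of why save matters for class-based ranges, as explained above. `Countdown` and `drain` are hypothetical names invented for this illustration, not Phobos symbols. Assigning a class range copies only the reference, so consuming the alias consumes the original, whereas save produces an independent copy.

```d
import std.range; // isForwardRange, plus empty/popFront for built-in arrays

// Hypothetical class-based forward range that counts n, n-1, ..., 1.
final class Countdown
{
    private int n;

    this(int n) { this.n = n; }

    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    // For a class, save has to allocate a genuinely new object.
    @property Countdown save() { return new Countdown(n); }
}

static assert(isForwardRange!Countdown);

// Hypothetical helper: consumes the range it is handed and counts its elements.
size_t drain(R)(R r)
{
    size_t count;
    for (; !r.empty; r.popFront())
        ++count;
    return count;
}

void main()
{
    auto a = new Countdown(3);
    auto same = a;                 // copies the reference, not the range state
    assert(drain(same) == 3);
    assert(a.empty);               // a was consumed through the alias

    auto b = new Countdown(3);
    assert(drain(b.save) == 3);    // explicit copy
    assert(!b.empty);              // b is untouched

    int[] arr = [1, 2, 3];
    assert(drain(arr) == 3);       // arrays are copied by value when passed
    assert(arr.length == 3);
}
```

The array case at the end is the one that hides missing save calls: it works either way, so only a test with a reference-type range exposes the bug.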
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 14:11:44 monarch_dodra wrote:
 But I still feel that the internal call to opSlice/copy is really
 just a trap disguised as a safety net...
Yeah. Kind of. The thing is that there's no way around the fact that structs and arrays are effectively value types as far as ranges go, whereas classes are reference types. If you want them to be copied consistently, you need to use save. If you don't care whether a copy is made or not, then you don't worry about it.

But foreach has to be treated specially with ranges anyway, because of strings.

foreach(e; str) {}

is going to iterate over char, not dchar, whereas strings are ranges of dchar. So, if you iterate over a range generically, you need to do

foreach(ElementType!R e; range) {}

So, you have to be careful with foreach already, and having it _not_ save but having it copy the range automatically (and therefore implicitly saving for non-reference type forward ranges) is completely consistent with how it works with passing ranges to functions, so in that sense, it really isn't all that bad. If anything, things are _more_ consistent that way.

The only way to really solve the save problem would be to disallow forward ranges which were reference types (which you could mostly do by disallowing classes, but I don't think that there's any real way to statically check whether a struct is a reference type or not). Then save wouldn't be needed, and ranges would all function the same as long as no one made structs which were reference types into ranges. But that would also cut off some useful idioms (sometimes, you _want_ a range to be consumed by a function rather than being implicitly copied - which is why I added std.range.RefRange for 2.060).

So, I don't know of any way to really fix the problem. You need save to make it possible to copy reference types, and there's no way to reasonably avoid having to deal with both value type and reference type ranges. So, we just deal with it.

Unfortunately, far too often, the result of that is that value type ranges work properly and reference type ranges don't, since range-based functions are usually tested with just value type ranges, but improved testing fixes that, and that's not all that hard to do.

- Jonathan M Davis
Jul 13 2012
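A short sketch of the string caveat described above. The helper names `countCodeUnits` and `countElements` are made up for this example. It assumes Phobos' range primitives for strings, where `ElementType!string` is dchar, and uses a string containing one non-ASCII character so the two loops give different counts.

```d
import std.range; // ElementType

// Plain foreach over a string: the loop variable is inferred as char,
// so this counts UTF-8 code units.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (c; s)
        ++n;
    return n;
}

// Generic form: spelling out ElementType!R makes foreach yield what the
// range primitives yield (dchar for strings).
size_t countElements(R)(R r)
{
    size_t n;
    foreach (ElementType!R e; r)
        ++n;
    return n;
}

static assert(is(ElementType!string == dchar));

void main()
{
    string s = "h\u00E9llo";           // U+00E9 takes two UTF-8 code units
    assert(countCodeUnits(s) == 6);
    assert(countElements(s) == 5);

    int[] a = [1, 2, 3];
    assert(countElements(a) == 3);     // no difference for non-string ranges
}
```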
parent "monarch_dodra" <monarch_dodra gmail.com> writes:
On Friday, 13 July 2012 at 22:27:01 UTC, Jonathan M Davis wrote:
 On Friday, July 13, 2012 14:11:44 monarch_dodra wrote:
 But I still feel that the internal call to opSlice/copy is 
 really
 just a trap disguised as a safety net...
Very good points. I appreciate your answer. I guess I should consider the "copy" the standard byproduct of "passing to foreach" => I should _also_ remember to save before passing a range to a function.
Jul 14 2012
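As a sketch of the "copy as a by-product of passing" point, the following contrasts a value-type range passed to a function with the same range wrapped by std.range.refRange (the RefRange mentioned earlier in the thread, available from 2.060). The `drain` helper is hypothetical and exists only to consume its argument.

```d
import std.range; // refRange, plus empty/popFront for built-in arrays

// Hypothetical helper: consumes the range it is handed and counts its elements.
size_t drain(R)(R r)
{
    size_t count;
    for (; !r.empty; r.popFront())
        ++count;
    return count;
}

void main()
{
    int[] arr = [1, 2, 3, 4];

    // Passed by value: the function works on its own copy of the slice.
    assert(drain(arr) == 4);
    assert(arr.length == 4);

    // Wrapped in a RefRange: popFront acts on arr itself through the pointer.
    assert(drain(refRange(&arr)) == 4);
    assert(arr.empty);
}
```

The wrapper is useful when consumption by the callee is the intent; for everything else, calling save before passing a possibly reference-type range remains the safe habit.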
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 13, 2012 08:50:03 Benjamin Thaut wrote:
 Move semantics in C++0x are quite nice for optimization purposes.
 Thinking about it, it should be fairly easy to implement move semantics
 in D as structs don't have identity. Therefor a move constructor would
 not be required. You can already move value types for example within an
 array just by plain moving the data of the value around.
http://stackoverflow.com/questions/4200190/does-d-have-something-akin-to-c0xs-move-semantics
http://stackoverflow.com/questions/6884996/questions-about-postblit-and-move-semantics

- Jonathan M Davis
Jul 13 2012