digitalmars.D.learn - Struct that destroys its original handle on copy-by-value

reply Joseph Rushton Wakeling via Digitalmars-d-learn writes:
Hello all,

A design question that came up during the hackathon held during the last Berlin 
D Meetup.

I was trying to come up with a range that can be copied by value, but when this 
is done, destroys the original handle.  The idea would be behaviour something 
like this:

     auto originalRange = myWeirdRange(whatever);
     originalRange.take(10).writeln;
     assert(originalRange.empty,
            "The original handle points to an empty range after copy-by-value.");

A very minimal prototype (missing out the details of front and popFront as 
irrelevant) would be something like:

     struct MyWeirdRange (R)
         if (isInputRange!R)
     {
       private:
         R* input_;   // Assumed to never be empty, FWIW.  I'm missing out
                      // a template check on that for brevity's sake.

       public:
         this (return ref R input)
         {
             /* return ref should guarantee the pointer
              * is safe for the lifetime of the struct,
              * right ... ?
              */
             this.input_ = &input;
         }

         bool empty () @property
         {
             return this.input_ is null;
         }

         auto front () @property { ... }

         void popFront () { ... }

         void opAssign (ref typeof(this) that)
         {
             /* copy the internal pointer, then
              * set that of the original to null
              */
             this.input_ = that.input_;
             that.input_ = null;
         }
     }

Basically, this is a range that would actively enforce the principle that its 
use is a one-shot.  You copy it by value (whether by direct assignment or by 
passing it to another function), you leave the original handle an empty range.

However, the above doesn't work; even in the event of a direct assignment, i.e.

     newRange = originalRange;

... the opAssign is never called.  I presume this is because of the 
ref-parameter input, but it's not obvious to me according to the description 
here why this should be: http://dlang.org/operatoroverloading.html#assignment 
Can anyone clarify what's going on here?

Anyway, my main question is: is this design idea even feasible in principle, or 
just a bad idea from the get-go?  And if feasible -- how would I go about it?

Thanks & best wishes,

     -- Joe
Jul 26 2015
next sibling parent reply "Martijn Pot" <martijnpot52 gmail.com> writes:
On Sunday, 26 July 2015 at 11:30:16 UTC, Joseph Rushton Wakeling 
wrote:
 Hello all,

 A design question that came up during the hackathon held during 
 the last Berlin D Meetup.

 [...]
Sounds like unique_ptr (so UniqueRange might be a nice name). Maybe you can get some ideas from that.
Jul 26 2015
parent reply Joseph Rushton Wakeling via Digitalmars-d-learn writes:
On 26/07/15 13:45, Martijn Pot via Digitalmars-d-learn wrote:
 Sounds like unique_ptr (so UniqueRange might be a nice name). Maybe you can get
 some ideas from that.
There is already a Unique in std.typecons.  However, I'm not sure that it's doing what I require.  Example:

     Unique!Random rng = new Random(unpredictableSeed);
     rng.take(10).writeln;

... will fail with an error:

     Error: struct std.typecons.Unique!(MersenneTwisterEngine!(uint, 32LU, 624LU, 397LU, 31LU, 2567483615u, 11LU, 7LU, 2636928640u, 15LU, 4022730752u, 18LU)).Unique is not copyable because it is annotated with @disable

My aim by contrast is to _allow_ that kind of use, but render the original handle empty when it's done.
Jul 26 2015
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sunday, 26 July 2015 at 12:16:30 UTC, Joseph Rushton Wakeling 
wrote:
 My aim by contrast is to _allow_ that kind of use, but render 
 the original handle empty when it's done.
I don't think D offers any way to do that. With the disabled postblit, you can force people into a method you write that returns a new copy and clears the original, but that won't just work with assignment. The ref assign might not be forbidden by the written doc but I'm guessing that is just an oversight - struct assignment in D never clears the original, it is always a simple copy (perhaps plus other code)....
Jul 29 2015
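
The "disabled postblit plus explicit method" approach described above could look roughly like the following. This is only a sketch under stated assumptions: `OneShot` and `release` are hypothetical names chosen for illustration, not anything from Phobos, and the raw pointer stands in for whatever state the real range would hold.

```d
struct OneShot
{
    int* p;                 // stand-in for the real range state

    @disable this(this);    // forbid implicit copies

    // Hypothetical helper: hand the state to a new instance
    // and leave this one empty.
    OneShot release()
    {
        auto tmp = p;
        p = null;
        return OneShot(tmp);    // an rvalue, so no copy is needed
    }

    bool empty() const @property
    {
        return p is null;
    }
}

void main()
{
    int x = 42;
    auto a = OneShot(&x);
    auto b = a.release();       // explicit, destructive hand-over

    assert(a.empty);
    assert(!b.empty);

    // A plain copy is rejected at compile time.
    static assert(!__traits(compiles, { OneShot t; OneShot c = t; }));
}
```

As noted above, the cost is that the hand-over has to be requested explicitly; passing such a value to a function by plain copy no longer compiles.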
next sibling parent "Vlad Levenfeld" <vlevenfeld gmail.com> writes:
On Wednesday, 29 July 2015 at 19:10:36 UTC, Adam D. Ruppe wrote:
 On Sunday, 26 July 2015 at 12:16:30 UTC, Joseph Rushton 
 Wakeling wrote:
 My aim by contrast is to _allow_ that kind of use, but render 
 the original handle empty when it's done.
 I don't think D offers any way to do that. With the disabled
 postblit, you can force people into a method you write that
 returns a new copy and clears the original, but that won't just
 work with assignment. The ref assign might not be forbidden by
 the written doc but I'm guessing that is just an oversight -
 struct assignment in D never clears the original, it is always
 a simple copy (perhaps plus other code)....
Slightly OT, but this is something that would be possible with a copy constructor, I think?
Jul 29 2015
prev sibling parent "Marc Schütz" <schuetzm gmx.net> writes:
On Wednesday, 29 July 2015 at 19:10:36 UTC, Adam D. Ruppe wrote:
 On Sunday, 26 July 2015 at 12:16:30 UTC, Joseph Rushton 
 Wakeling wrote:
 My aim by contrast is to _allow_ that kind of use, but render 
 the original handle empty when it's done.
 I don't think D offers any way to do that. With the disabled
 postblit, you can force people into a method you write that
 returns a new copy and clears the original, but that won't just
 work with assignment. The ref assign might not be forbidden by
 the written doc but I'm guessing that is just an oversight -
 struct assignment in D never clears the original, it is always
 a simple copy (perhaps plus other code)....
Hmm... are you implying that `opAssign(ref T other)` should be disallowed? Why? I find this the obvious way to implement move semantics, and it is also what std.algorithm.move does:
Jul 30 2015
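
To make the distinction between assignment and initialization concrete, here is a minimal sketch, assuming a hypothetical `Handle` struct that is not from the thread. An `opAssign` taking its argument by `ref` is only consulted when assigning to an already-constructed object; initializing a new variable from an existing one (and likewise passing by value) is a copy and never goes through `opAssign`.

```d
struct Handle
{
    int* p;

    // Move-like assignment: take over the pointer, null the source.
    void opAssign(ref Handle that)
    {
        this.p = that.p;
        that.p = null;
    }
}

void main()
{
    int x = 1;
    auto a = Handle(&x);

    Handle b;
    b = a;              // assignment to an existing object: opAssign runs
    assert(a.p is null);
    assert(b.p is &x);

    auto c = b;         // initialization: a plain copy, opAssign not involved
    assert(b.p is &x);  // the source is left untouched
    assert(c.p is &x);
}
```

So even where the destructive `opAssign` fires, it cannot cover the case that matters for ranges, namely handing the value to a function such as `take`.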
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Sunday, 26 July 2015 at 12:16:30 UTC, Joseph Rushton Wakeling 
wrote:
 Example:

     Unique!Random rng = new Random(unpredictableSeed);
     rng.take(10).writeln;
 My aim by contrast is to _allow_ that kind of use, but render 
 the original handle empty when it's done.
`take` stores the range, you can try to use some sort of a weak reference.
Jul 31 2015
parent Joseph Rushton Wakeling via Digitalmars-d-learn writes:
On 31/07/15 13:40, Kagamin via Digitalmars-d-learn wrote:
 On Sunday, 26 July 2015 at 12:16:30 UTC, Joseph Rushton Wakeling wrote:
 Example:

     Unique!Random rng = new Random(unpredictableSeed);
     rng.take(10).writeln;
 My aim by contrast is to _allow_ that kind of use, but render the original
 handle empty when it's done.
`take` stores the range, you can try to use some sort of a weak reference.
Yea, but that's not what I'm trying to achieve. I know how I can pass something to `take` so as to e.g. obtain reference semantics or whatever; what I'm trying to achieve is a range that _doesn't rely on the user knowing the right way to handle it_. I'll expand on this more responding to Ali, so as to clarify the context of what I'm aiming for and why.
Aug 01 2015
prev sibling next sibling parent "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Sunday, 26 July 2015 at 11:30:16 UTC, Joseph Rushton Wakeling 
wrote:
 Hello all,

 A design question that came up during the hackathon held during 
 the last Berlin D Meetup.

 [...]
Ping on the above -- nobody has any insight...?
Jul 29 2015
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via Digitalmars-d-learn 
wrote:

 is this design idea even feasible in principle, or just a bad
 idea from the get-go?
As I understand it, it is against one of the fundamental D principles: structs are value types where any copy can be used in place of any other.

I expect there are examples where even Phobos violates it, but the struct documentation still says so: "A struct is defined to not have an identity; that is, the implementation is free to make bit copies of the struct as convenient."

    http://dlang.org/struct.html
 And if feasible -- how would I go about it?
Disallowing automatic copying and providing a function comes to mind.

Ali
Jul 31 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 31 July 2015 at 17:21:40 UTC, Ali Çehreli wrote:
 On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via 
 Digitalmars-d-learn wrote:

 is this design idea even feasible in principle, or just a bad
 idea from the get-go?
As I understand it, it is against one of fundamental D principles: structs are value types where any copy can be used in place of any other. I expect there are examples where even Phobos violates it but the struct documentation still says so: "A struct is defined to not have an identity; that is, the implementation is free to make bit copies of the struct as convenient."
I believe this restriction should be removed. Considering that classes have inherent reference + heap semantics (and you can only bail out of that with hacks), saying that a struct can't be anything but a data bag makes it impossible to implement a whole class of useful designs. The fact that Phobos has to violate it itself to get stuff done is quite telling.
Jul 31 2015
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 07/31/2015 11:01 AM, Dicebot wrote:> On Friday, 31 July 2015 at 
17:21:40 UTC, Ali Çehreli wrote:
 On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via
 Digitalmars-d-learn wrote:

 is this design idea even feasible in principle, or just a bad
 idea from the get-go?
As I understand it, it is against one of fundamental D principles: structs are value types where any copy can be used in place of any
other.
 I expect there are examples where even Phobos violates it but the
 struct documentation still says so: "A struct is defined to not have
 an identity; that is, the implementation is free to make bit copies of
 the struct as convenient."
I believe this restriction should be banned. Considering classes have inherent reference + heap semantics (and you can only bail out of that with hacks) saying struct can't be anything but data bags makes impossible to implement whole class of useful designs. The fact that Phobos has to violate it itself to get stuff done is quite telling.
To make sure, I didn't mean that I know of structs in Phobos that behave that way. Although, it would be interesting to identify them. :)

Ali
Jul 31 2015
parent "Dicebot" <public dicebot.lv> writes:
On Friday, 31 July 2015 at 18:13:04 UTC, Ali Çehreli wrote:
 To make sure, I didn't mean that I know of structs in Phobos 
 that behave that way. Although, it would be interesting to 
 identify them. :)

 Ali
Things like Unique, Scoped, RefCounted - pretty much everything which wraps reference type with additional semantics.
Jul 31 2015
prev sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Fri, Jul 31, 2015 at 06:01:44PM +0000, Dicebot via Digitalmars-d-learn wrote:
 On Friday, 31 July 2015 at 17:21:40 UTC, Ali Çehreli wrote:
On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via Digitalmars-d-learn
wrote:

 is this design idea even feasible in principle, or just a bad
 idea from the get-go?
As I understand it, it is against one of fundamental D principles: structs are value types where any copy can be used in place of any other. I expect there are examples where even Phobos violates it but the struct documentation still says so: "A struct is defined to not have an identity; that is, the implementation is free to make bit copies of the struct as convenient."
I believe this restriction should be banned. Considering classes have inherent reference + heap semantics (and you can only bail out of that with hacks) saying struct can't be anything but data bags makes impossible to implement whole class of useful designs. The fact that Phobos has to violate it itself to get stuff done is quite telling.
On the other hand, though, structs that are *not* mere data bags usually get into the dark area of compiler bugs and other unexpected side-effects or unclear areas of the language, often with questionable or inconsistent behaviour. My suspicion is that these problems aren't just down to compiler quality or the language spec being incomplete, but are caused by this underlying dichotomy between structs being defined to be purely data, and structs being used for things that are more than just pure data.

One example of this is how long it took for std.stdio.File to get to its present form. It started out as a "simple" wrapper around the C library FILE*, passed through stages of being "pure data" yet showing buggy corner cases, then stepped into dtor territory with its associated unexpected behaviours, and is now a wrapper around a ref-counted type (and IIRC, it may still show buggy / odd behaviour when you stick it inside a container like an array). Along the way, we had issues with dtors, postblit, and a whole slew of other things, almost all of which can be traced back to the dichotomy between a struct wanting to be "just a value" yet at the same time wanting / needing to be more than that.

It seems that what the language (originally) defines structs to be is not entirely consistent with how they have come to be used (which also entailed later extensions to the struct definition), and this has been a source of problems.

T

-- 
Bare foot: (n.) A device for locating thumb tacks on the floor.
Jul 31 2015
parent "Dicebot" <public dicebot.lv> writes:
On Friday, 31 July 2015 at 18:23:39 UTC, H. S. Teoh wrote:
 It seems that what the language (originally) defines structs to 
 be, is not entirely consistent with how it has come to be used 
 (which also entailed later extensions to the struct 
 definition), and this has been a source of problems.
Yes, and it wasn't because people started to misuse structs - it was because certain designs would be simply impossible otherwise. The very language design assumption that structs should always be dumb copyable values was not practical. It may come from times when GC, heaps and classes were considered "good enough for everyone" and RAII was intended to be completely replaced by scope(exit). Which didn't work.
Jul 31 2015
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/31/15 1:21 PM, Ali Çehreli wrote:

 Disallowing automatic copying and providing a function comes to mind.
Isn't that what std.algorithm.move is for?

-Steve
Jul 31 2015
parent Ali Çehreli <acehreli yahoo.com> writes:
On 07/31/2015 12:18 PM, Steven Schveighoffer wrote:
 On 7/31/15 1:21 PM, Ali Çehreli wrote:

 Disallowing automatic copying and providing a function comes to mind.
Isn't that what std.algorithm.move is for? -Steve
Sounds great and I like it! :)

Ali
Jul 31 2015
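
For reference, a minimal sketch of what `std.algorithm.mutation.move` does with a non-copyable struct. The `Res` type is a hypothetical stand-in. The empty destructor is there on purpose: as documented, `move` only resets the source to its `.init` value for structs that define a destructor or postblit, and otherwise leaves the source unchanged.

```d
import std.algorithm.mutation : move;

struct Res
{
    int* p;

    @disable this(this);    // no implicit copies

    // Having a destructor is what makes move() reset the source.
    ~this() {}
}

void main()
{
    int x = 5;
    auto a = Res(&x);

    auto b = move(a);       // contents go to b, a becomes Res.init

    assert(a.p is null);
    assert(b.p is &x);
}
```

This still relies on the caller remembering to write `move`, which is the objection raised elsewhere in the thread.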
prev sibling parent reply Joseph Rushton Wakeling via Digitalmars-d-learn writes:
On 31/07/15 19:21, Ali Çehreli via Digitalmars-d-learn wrote:
 On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via Digitalmars-d-learn wrote:

  > is this design idea even feasible in principle, or just a bad
  > idea from the get-go?

 As I understand it, it is against one of fundamental D principles: structs are
 value types where any copy can be used in place of any other.

 I expect there are examples where even Phobos violates it but the struct
 documentation still says so: "A struct is defined to not have an identity; that
 is, the implementation is free to make bit copies of the struct as convenient."

    http://dlang.org/struct.html
That really feels very bad for the problem domain I have in mind -- random number generation. No implementation should be free to make copies of a random number generator "as convenient", that should be 100% in the hands of the programmer!
  > And if feasible -- how would I go about it?

 Disallowing automatic copying and providing a function comes to mind.
Yes, I considered that, but I don't think it really delivers what's needed :-(

Let me give a concrete example of why I was thinking in this direction. Consider RandomSample in std.random. This is a struct (a value type, instantiated on the stack). However, it also wraps a random number generator. It needs to be consumed once and once only, because otherwise there will be unintended statistical correlations in the program. Copy-by-value leads to a situation where you can accidentally consume the same sequence twice (or possibly, only _part_ of the sequence).

Now, indeed, one way is to just @disable this(this) which prevents copy-by-value. But then you can't do something natural and desirable like:

     iota(100).randomSample(10, gen).take(5).writeln;

... because you would no longer be able to pass the RandomSample instance into `take`.

On the other hand, what you want to disallow is this:

     auto sample = iota(100).randomSample(10, gen);

     sample.take(5).writeln;
     sample.take(5).writeln;   // statistical correlations result,
                               // probably unwanted

The first situation is still possible, and the second disallowed (or at least, guarded against), _if_ a copy-by-value is finalized by tweaking the source to render it an empty range.

I would happily hear alternative solutions to the problem, but that's why I was interested in a struct with the properties I outlined in my original post.
Aug 01 2015
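
The double-consumption hazard described above can be reproduced with a bare generator rather than RandomSample. This is a small sketch, with the seed 42 chosen arbitrarily: because `Random` is a struct, each call to `take` works on its own copy, so the original never advances and both calls yield the same numbers.

```d
import std.array : array;
import std.random : Random;
import std.range : take;

void main()
{
    auto gen = Random(42);              // arbitrary fixed seed

    auto first  = gen.take(5).array;    // take() receives a copy of gen
    auto second = gen.take(5).array;    // gen itself was never advanced

    // Both draws come from the same generator state.
    assert(first == second);
}
```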
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Saturday, 1 August 2015 at 12:10:43 UTC, Joseph Rushton 
Wakeling wrote:
 On 31/07/15 19:21, Ali Çehreli via Digitalmars-d-learn wrote:
 On 07/26/2015 04:29 AM, Joseph Rushton Wakeling via 
 Digitalmars-d-learn wrote:

  > is this design idea even feasible in principle, or just a 
 bad
  > idea from the get-go?

 As I understand it, it is against one of fundamental D 
 principles: structs are
 value types where any copy can be used in place of any other.

 I expect there are examples where even Phobos violates it but 
 the struct
 documentation still says so: "A struct is defined to not have 
 an identity; that
 is, the implementation is free to make bit copies of the 
 struct as convenient."

    http://dlang.org/struct.html
That really feels very bad for the problem domain I have in mind -- random number generation. No implementation should be free to make copies of a random number generator "as convenient", that should be 100% in the hands of the programmer!
  > And if feasible -- how would I go about it?

 Disallowing automatic copying and providing a function comes 
 to mind.
 [...]
Naïve compromise solution?

     struct S(bool noCopy = true)
     {
         //replace with real state
         int state = 0;

         static if(noCopy)
             @disable this(this);

         @property auto copyable()
         {
             //did the move manually because I got
             //weird results std.algorithm.move
             auto ret = cast(S!false)this;
             this.state = this.state.init;
             return ret;
         }
     }

     auto s(int state)
     {
         return S!()(state);
     }

     void main()
     {
         import std.stdio, std.algorithm;
         auto s = s(42);
         auto s1 = s.move;
         assert(s == S!().init);
         s1.copyable.writeln;
         assert(s1 == S!().init);
     }

Then at least the simplest mistakes are avoided. Also, it means people are more likely to read important docs i.e. "Why do I have to call this copyable thing? Oh, I see, I'll be careful."

I'm not sure how good an idea it is to totally enforce a range to be non-copyable, even if you could deal with the function call chain problem. Even in totally save-aware code, there can still be valid assignment of a range type. I'm pretty sure a lot of phobos ranges/algorithms would be unusable.
Aug 01 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Saturday, 1 August 2015 at 17:50:28 UTC, John Colvin wrote:
 I'm not sure how good an idea it is to totally enforce a range 
 to be non-copyable, even if you could deal with the function 
 call chain problem. Even in totally save-aware code, there can 
 still be valid assignment of a range type. I'm pretty sure a 
 lot of phobos ranges/algorithms would be unusable.
This is exactly why I originally proposed a design with destructive copy to Joe - that would work with any algorithm expecting implicit pass by value, but prevent actual double usage. Sadly, this does not seem to be implementable in D in any reasonable way.
Aug 01 2015
parent Joseph Rushton Wakeling via Digitalmars-d-learn writes:
On 02/08/15 03:38, Dicebot via Digitalmars-d-learn wrote:
 On Saturday, 1 August 2015 at 17:50:28 UTC, John Colvin wrote:
 I'm not sure how good an idea it is to totally enforce a range to be
 non-copyable, even if you could deal with the function call chain problem.
 Even in totally save-aware code, there can still be valid assignment of a
 range type. I'm pretty sure a lot of phobos ranges/algorithms would be
unusable.
This is exactly why I proposed to Joe design with destructive copy originally - that would work with any algorithms expecting implicit pass by value but prevent from actual double usage. Sadly, this does not seem to be implementable in D in any reasonable way.
Yup. This work is follow-up on a really creative bunch of suggestions Dicebot made to me on our flight back from DConf, and which we followed up on at the recent Berlin meetup. The design principle of "destructive copy" is great -- it really cuts through a bunch of potential nastinesses around random number generation -- but it really doesn't look like it's straightforwardly possible :-(
Aug 02 2015
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Saturday, 1 August 2015 at 11:47:29 UTC, Joseph Rushton 
Wakeling wrote:
 Yea, but that's not what I'm trying to achieve.  I know how I 
 can pass something to `take` so as to e.g. obtain reference 
 semantics or whatever; what I'm trying to achieve is a range 
 that _doesn't rely on the user knowing the right way to handle 
 it_.
Wrapping a reference to a stack-allocated struct is also unsafe, so there is no way for the user not to know about it.

On Saturday, 1 August 2015 at 12:10:43 UTC, Joseph Rushton Wakeling wrote:
 On the other hand, what you want to disallow is this:

    auto sample = iota(100).randomSample(10, gen);

    sample.take(5).writeln;
    sample.take(5).writeln;   // statistical correlations result,
                              // probably unwanted
Try

     auto sample = iota(100).randomSample(10, &gen);
Aug 03 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 3 August 2015 at 08:54:32 UTC, Kagamin wrote:
 On Saturday, 1 August 2015 at 11:47:29 UTC, Joseph Rushton 
 Wakeling wrote:
 Yea, but that's not what I'm trying to achieve.  I know how I 
 can pass something to `take` so as to e.g. obtain reference 
 semantics or whatever; what I'm trying to achieve is a range 
 that _doesn't rely on the user knowing the right way to handle 
 it_.
Wrapping a reference to a stack-allocated struct is also unsafe, so no way for the user to not know about it.
It is now verified as safe by `return ref`.
Aug 03 2015
parent "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Monday, 3 August 2015 at 09:01:51 UTC, Dicebot wrote:
 It is now verified as safe by `return ref`.
Yes, until you pointed this out to me I'd been convinced that classes were the way forward for RNGs. I think that `return ref` is going to be a _very_ powerful tool for facilitating stack-allocated RNG functionality.
Aug 03 2015