digitalmars.D - proposal for general dup function

reply "Dan" <dbdavidson yahoo.com> writes:
Phobos can and should have a general dup function, capable of 
duping (i.e. recursive deep copy) structs without requiring any 
effort from struct developers. This can be done to cover the vast 
majority of object copy issues for structs and would have these 
benefits:

- no need to write (and potentially mess up) a dup function for 
your own structs
- ability to dup structs that others have written for which you 
don't have direct read access to all the fields
- as your structs grow less chance of bugs (the general dup would 
recognize and incorporate new fields)
- it addresses the inability to copy const and immutable 
reference structs (which Walter and Andrei may be looking to 
address with copy constructor feature)
- it would add more formality to the dup convention

Issues not covered:

- custom memory management by struct, some structs want to do low 
level stuff themselves. This is ok, though, as the struct 
developer can write his own dup and that will be honored by the 
general dup when composition requires it
- some resources are based on handles that can lead to 
indirection. For example, deep copy of file handles could cause 
unintended sharing. In these cases the back door is for the 
developer to write a custom dup or disable dup.
- classes not covered. Maybe the approach could be extended to 
support classes - but it is much more complicated.

Sample implementation:

I've written one called gdup just to distinguish it from dup. It 
is a function and a property, so the usage should feel natural. 
An example usage is shown below. Implementation and tests are 
located at:

https://github.com/patefacio/d-help/blob/master/d-help/opmix/mix.d
https://github.com/patefacio/d-help/blob/master/d-help/opmix/d_test/gdup_suite.d

A pdf writeup exists in section "dup and Global Dup" at:
https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf?raw=true

The specific functions related to gdup are:

   @property auto gdup(T)(const ref T t)
   void gdup(T1, T2)(ref T1 t1, const ref T2 t2)
   ref T opDupPreferred(T, F)(ref T target, const ref F src)
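
To make the idea concrete, here is a minimal sketch of how a general dup could be built on compile-time introspection. It is not the gdup implementation from the repository above: the names deepCopyInto and deepDup are made up for illustration, and the sketch assumes only structs, dynamic arrays and types without mutable indirection. Pointers, associative arrays, classes and const/immutable fields are deliberately left out.

```d
import std.traits : Unqual, isDynamicArray;

// Hypothetical helper, loosely shaped like the two-argument gdup(t1, t2).
void deepCopyInto(Dst, Src)(ref Dst dst, const ref Src src)
{
    static if (is(const(Src) : Dst))
    {
        // No mutable indirection (int, string, plain-data structs):
        // plain assignment cannot create mutable aliasing.
        dst = src;
    }
    else static if (is(Dst == struct))
    {
        // Recurse field by field; fields added later are picked up
        // automatically because the loop runs over tupleof.
        static foreach (i; 0 .. Dst.tupleof.length)
            deepCopyInto(dst.tupleof[i], src.tupleof[i]);
    }
    else static if (isDynamicArray!Dst)
    {
        // Allocate fresh storage, then copy element by element.
        dst = new Dst(src.length);
        foreach (i, ref elem; dst)
            deepCopyInto(elem, src[i]);
    }
    else
        static assert(false, "deepCopyInto: unsupported type " ~ Dst.stringof);
}

// Hypothetical one-argument form, usable as c.deepDup via UFCS.
Unqual!T deepDup(T)(const ref T src) if (is(T == struct))
{
    Unqual!T result;
    deepCopyInto(result, src);
    return result;
}

struct A { char[] c; }
struct B { A a; }
struct C { B b; }

unittest
{
    const(C) c = C(B(A(['a'])));
    C c2 = c.deepDup;                      // const source, mutable copy
    assert(c2.b.a.c == "a");
    assert(c2.b.a.c.ptr !is c.b.a.c.ptr);  // storage is not shared
    c2.b.a.c[0] = 'z';
    assert(c.b.a.c[0] == 'a');             // original is untouched
}
```

The destination type drives the recursion, so no cast is needed to get from const(C) to C. The check can be run with dmd -unittest -main.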

Some sample questions to the news group where this feature in the 
standard library would solve a user problem:

http://forum.dlang.org/thread/mailman.1946.1352987649.5162.digitalmars-d-learn@puremagic.com?page=3
http://forum.dlang.org/thread/pzuparprsetydynbcuce@forum.dlang.org
http://forum.dlang.org/thread/urvdcpflzajhpackmxyz@forum.dlang.org


Thanks
Dan

   static struct A {
     char[] c;
   }
   static struct B {
     A a;
   }
   static struct C {
     B b;
   }
   void main() {
     const(C) c = C(B(A(['a'])));
     C c2 = c.gdup;
   }
Dec 09 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-09 15:45, Dan wrote:
 Phobos can and should have a general dup function, capable of duping
 (i.e. recursive deep copy) structs without requiring any effort from
 struct developers.
 [snip]

I think much of this functionality could be shared with serialization. A few questions and comments.

* Are array slices properly handled?
* I think there needs to be a way to explicitly say that a given field or a whole struct shouldn't be duped
* All the public strings used for mixins should be templates

Orange serialization library, supports classes as well:

https://github.com/jacob-carlborg/orange

--
/Jacob Carlborg
Dec 09 2012
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Sunday, 9 December 2012 at 16:26:12 UTC, Jacob Carlborg wrote:
 On 2012-12-09 15:45, Dan wrote:
 Phobos can and should have a general dup function, capable of 
 duping
 (i.e. recursive deep copy) structs without requiring any 
 effort from
 struct developers.
 [snip]
 I think much of this functionality could be shared with serialization. A few questions and comments.

I am talking about a much smaller scope (just a few functions - 200 lines of code tops) - but there are similarities.
 * Are array slices properly handled
Both array slices and associative arrays are properly handled. Let me know if you find otherwise.
 * I think there need to be a way to explicitly say that a given 
 field and a whole struct shouldn't be duped
I definitely see need for that in serialization. Not sure about a generalized dup function, though. I have a similar function for deeply comparing instances and I think this should always hold:

   assert(typesDeepEqual(t, t.gdup))

If you skipped fields it would not.
 * All the public strings used for mixins should be templates
The mixins in the code are for higher-up functionality, some of it on top of the dup. A general dup needs no mixin. The mixin(PostBlit) is there if you want to provide a dup for your struct so that assignments in generic code get deep copy semantics with opAssign and copy construction. However, if template mixins are preferred to string mixins, I suppose that is a good idea for that code and I'll check it out.

I have refactored the dup into its own module, so there is no need for mixin:
https://github.com/patefacio/d-help/blob/master/d-help/opmix/dup.d
 Orange serialization library, supports classes as well:

 https://github.com/jacob-carlborg/orange
Dec 10 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-12-10 13:56, Dan wrote:

 However, if template mixins are preferred to
 string mixins I suppose that is a good idea for that code and I'll check
 it out.
Yes, templates are always preferred. One should try to avoid putting code in string literals as much as possible.

--
/Jacob Carlborg
Dec 10 2012
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 12/09/2012 03:45 PM, Dan wrote:
 Phobos can and should have a general dup function, capable of duping (i.e.
 recursive deep copy) structs without requiring any effort from struct
 developers. This can be done to cover the vast majority of object copy issues
 for structs
We already had some discussion about your gdup on d-learn, but just for clarity -- it's possible to do a gidup as well as gdup, correct?
Dec 10 2012
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Monday, 10 December 2012 at 15:36:44 UTC, Joseph Rushton 
Wakeling wrote:
 On 12/09/2012 03:45 PM, Dan wrote:
 Phobos can and should have a general dup function, capable of 
 duping (i.e.
 recursive deep copy) structs without requiring any effort from 
 struct
 developers. This can be done to cover the vast majority of 
 object copy issues
 for structs
We already had some discussion about your gdup on d-learn, but just for clarity -- it's possible to do a gidup as well as gdup, correct?
I think so. Here is a claim I think is true: gdup must do a full deep copy to be safe and to guarantee transitive const/immutable. If it is implemented such that there are no casts, then the compiler does its job and ensures everything is good. If there are casts, they need to be deemed safe - I have one cast to work around issues in associative array iteration. Assuming that is fine, I claim that gidup or igdup or whatever can just be:

   @property auto gidup(T)(const ref T t) {
     immutable(T) result = cast(immutable)(t.gdup);
     return result;
   }

   auto another = c.gidup;
   pragma(msg, typeof(another));
   assertNotEquals(another.b.a.c.ptr, c.b.a.c.ptr);
   assert(0==typesDeepCmp(another,c));

That is, since gdup does a full deep copy, casting the result to immutable is fine as there is no aliasing.

Thanks,
Dan
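
The argument that a cast to immutable is safe when nothing else can reach the copied data can be sketched without gdup at all. The version below is hand-written for a single struct, and immutableCopy is a made-up name rather than anything from the linked code. It assumes the only indirection in the struct is one char[] field.

```d
struct A { char[] c; }

// Hypothetical stand-in for gidup, written by hand for one type.
immutable(A) immutableCopy(const ref A a)
{
    // .dup allocates fresh storage, so `copy` holds the only reference to it.
    A copy = A(a.c.dup);
    // No mutable alias to the new data exists, so this cast cannot let
    // anyone observe "immutable" data changing later.
    return cast(immutable) copy;
}

unittest
{
    auto a = A(['x', 'y']);
    auto ia = a.immutableCopy;
    static assert(is(typeof(ia) == immutable(A)));
    a.c[0] = '!';           // mutate the original...
    assert(ia.c == "xy");   // ...the immutable copy does not change
}
```

If the copy shared any storage with the source, the last assertion is exactly what could fail, which is why the cast is only sound on top of a copy that is deep all the way down.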
Dec 10 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-10 17:06, Dan wrote:

 I think so. Here is a claim I think is true: gdup must do full deep copy
 to be safe and guarantee the transitive const/immutable. If it is
 implemented such that there are no casts, then the compiler does its job
 and ensures everything is good. If there are casts they need to be
 deemed safe - I have one cast to work around issues in associative array
 iteration.
I'm pretty sure it can't be done. For classes one needs to bypass the constructor. The constructor is the only place where you can initialize const/immutable fields. For a class instance one would need to cast it to a ubyte pointer (or similar) and then set the const/immutable fields that way.

I think it can be done safely, but it is not something the compiler can guarantee.

--
/Jacob Carlborg
Dec 10 2012
next sibling parent reply "Dan" <dbdavidson yahoo.com> writes:
On Monday, 10 December 2012 at 18:55:09 UTC, Jacob Carlborg wrote:

 I'm pretty sure it can't be done. For classes one need to 
 bypass the constructor. The constructor is the only place where 
 you can initialize const/immutable fields. For class instance 
 one would need to cast it to a ubyte pointer (or similar) and 
 then set the const/immutable fields that way.

 I think it can be done safely, but not something the compiler 
 can guarantee.
Only talking about structs here. Classes were listed under issues not covered.

Thanks
Dan
Dec 10 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-10 20:07, Dan wrote:

 Only talking about structs here. classes were listed under issues not
 covered.
You might have the same problem with structs. That is, if it's possible to have const/immutable files which are not initialized in the declaration.

--
/Jacob Carlborg
Dec 10 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-10 20:37, Jacob Carlborg wrote:

 You might have the same problem with structs. That is, if it's possible
 to have const/immutable files which are not initialized in the declaration.
That should have been "fields".

--
/Jacob Carlborg
Dec 10 2012
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Monday, 10 December 2012 at 19:40:52 UTC, Jacob Carlborg wrote:
 On 2012-12-10 20:37, Jacob Carlborg wrote:

 You might have the same problem with structs. That is, if it's 
 possible
 to have const/immutable files which are not initialized in the 
 declaration.
That should have been "fields".
You are correct. const or immutable fields won't work. But then, if you had a field that was const or immutable you might as well make it static. Any problems with gdup in that case would be the same you would see if you wanted to implement your own.
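
A small sketch of the limitation conceded here, using a made-up Record type: a copier that fills in an existing instance field by field cannot write to an immutable field, and whole-struct assignment is rejected as well. The hand-written dupRecord at the end is only an illustration of building a fresh value at initialization time, where such fields can still be set.

```d
struct Record
{
    immutable int id;   // can only be set when the value is created
    int[] data;
}

// Assigning to the immutable field of an existing instance is rejected...
static assert(!__traits(compiles, { Record r; r.id = 1; }));
// ...and so is assigning one whole Record to another.
static assert(!__traits(compiles, { Record a, b; a = b; }));

// Hypothetical hand-written copy: construct a new value instead of
// assigning into one, duplicating the array so nothing is shared.
Record dupRecord(const ref Record src)
{
    return Record(src.id, src.data.dup);
}

unittest
{
    auto r = Record(7, [1, 2, 3]);
    auto r2 = dupRecord(r);
    r2.data[0] = 99;
    assert(r2.id == 7);
    assert(r.data[0] == 1);   // the original array is untouched
}
```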
Dec 10 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-12-10 20:55, Dan wrote:

 You are correct. const or immutable fields won't work. But then, if you
 had a field that was const or immutable you might as well make it
 static. Any problems with gdup in that case would be the same you would
 see if you wanted to implement your own.
As I wrote in a previous post, it's possible to implement. I've done it in my serialization library.

--
/Jacob Carlborg
Dec 10 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/10/12 2:07 PM, Dan wrote:
 On Monday, 10 December 2012 at 18:55:09 UTC, Jacob Carlborg wrote:

 I'm pretty sure it can't be done. For classes one need to bypass the
 constructor. The constructor is the only place where you can
 initialize const/immutable fields. For class instance one would need
 to cast it to a ubyte pointer (or similar) and then set the
 const/immutable fields that way.

 I think it can be done safely, but not something the compiler can
 guarantee.
 Only talking about structs here. classes were listed under issues not covered.

 Thanks
 Dan

There will be the same problems with structs containing pointers.

Andrei
Dec 10 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/10/12 1:55 PM, Jacob Carlborg wrote:
 On 2012-12-10 17:06, Dan wrote:

 I think so. Here is a claim I think is true: gdup must do full deep copy
 to be safe and guarantee the transitive const/immutable. If it is
 implemented such that there are no casts, then the compiler does its job
 and ensures everything is good. If there are casts they need to be
 deemed safe - I have one cast to work around issues in associative array
 iteration.
I'm pretty sure it can't be done.
Just like Dan I thought it can be done but actually ownership is impossible to establish in general. Consider:

class List(T) {
     List next;
     T payload;
     ...
}

versus

class Window {
     Window parent;
     ...
}

It's pretty obvious from the name that duplicating a List would entail duplicating its tail, whereas duplicating a Window would entail just copying the reference to the same parent. However, at the type level there's no distinction between the two members.

Now, user-defined attributes could change the playfield radically here. Consider we define things like @owned and @foreign that would inform the gdup or deepdup function appropriately:

class List(T) {
     @owned List next;
     T payload;
     ...
}

versus

class Window {
     @foreign Window parent;
     ...
}

Now a generic deep duplication function has enough information to duplicate things appropriately.

Andrei
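
A rough sketch of how such attributes could steer a duplication function, using user-defined attributes and std.traits.hasUDA. Everything here is an assumption made for illustration: owned and foreign are plain marker structs, attrDup is a made-up name, and the classes are assumed to be default-constructible with no const/immutable fields, which sidesteps the constructor problem raised earlier in the thread.

```d
import std.traits : hasUDA;

// Hypothetical marker attributes, named after the ones suggested above.
struct owned {}
struct foreign {}

class List(T)
{
    @owned List next;       // the tail belongs to this node: duplicate it
    T payload;
}

class Window
{
    string title;
    @foreign Window parent; // shared with others: copy the reference only
}

// Hypothetical attribute-driven duplication for classes.
T attrDup(T)(T src) if (is(T == class))
{
    if (src is null)
        return null;
    auto copy = new T;
    static foreach (i; 0 .. T.tupleof.length)
    {
        static if (hasUDA!(T.tupleof[i], owned))
            copy.tupleof[i] = attrDup(src.tupleof[i]);   // deep
        else
        {
            // Class references must state their ownership explicitly.
            static assert(!is(typeof(T.tupleof[i]) == class)
                    || hasUDA!(T.tupleof[i], foreign),
                    "class reference fields must be @owned or @foreign");
            copy.tupleof[i] = src.tupleof[i];            // shallow
        }
    }
    return copy;
}

unittest
{
    auto list = new List!int;
    list.payload = 1;
    list.next = new List!int;
    list.next.payload = 2;
    auto list2 = attrDup(list);
    assert(list2.next !is list.next && list2.next.payload == 2);

    auto desktop = new Window;
    auto w = new Window;
    w.parent = desktop;
    auto w2 = attrDup(w);
    assert(w2 !is w && w2.parent is desktop);
}
```

The two fields have the same shape at the type level; only the attribute distinguishes the owned tail from the shared parent.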
Dec 10 2012
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 10 December 2012 at 19:46:18 UTC, Andrei Alexandrescu 
wrote:
 On 12/10/12 1:55 PM, Jacob Carlborg wrote:
 On 2012-12-10 17:06, Dan wrote:

 I think so. Here is a claim I think is true: gdup must do 
 full deep copy
 to be safe and guarantee the transitive const/immutable. If 
 it is
 implemented such that there are no casts, then the compiler 
 does its job
 and ensures everything is good. If there are casts they need 
 to be
 deemed safe - I have one cast to work around issues in 
 associative array
 iteration.
I'm pretty sure it can't be done.
 Just like Dan I thought it can be done but actually ownership is impossible to establish in general. Consider:

 class List(T) {
      List next;
      T payload;
      ...
 }

 versus

 class Window {
      Window parent;
      ...
 }

 It's pretty obvious from the name that duplicating a List would entail duplicating its tail, whereas duplicating a Window would entail just copying the reference to the same parent. However, at the type level there's no distinction between the two members.

Unless the tail is immutable, but in general yes.
 Now, user-defined attributes could change the playfield 
 radically here. Consider we define things like @owned and 
 @foreign that would inform the gdup or deepdup function 
 appropriately:

 class List(T) {
      @owned List next;
      T payload;
      ...
 }

 versus

 class Window {
      @foreign Window parent;
      ...
 }

 Now a generic deep duplication function has enough information 
 to duplicate things appropriately.

That is indeed a perfect use case for attributes.
Dec 10 2012
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-10 20:46, Andrei Alexandrescu wrote:

 Just like Dan I thought it can be done but actually ownership is
 impossible to establish in general. Consider:

 class List(T) {
      List next;
      T payload;
      ...
 }

 versus

 class Window {
      Window parent;
      ...
 }

 It's pretty obvious from the name that duplicating a List would entail
 duplicating its tail, whereas duplicating a Window would entail just
 copying the reference to the same parent. However, at the type level
 there's no distinction between the two members.
I'm not sure, why wouldn't you want to copy the parent window as well?

--
/Jacob Carlborg
Dec 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/10/12 2:54 PM, Jacob Carlborg wrote:
 On 2012-12-10 20:46, Andrei Alexandrescu wrote:

 Just like Dan I thought it can be done but actually ownership is
 impossible to establish in general. Consider:

 class List(T) {
 List next;
 T payload;
 ...
 }

 versus

 class Window {
 Window parent;
 ...
 }

 It's pretty obvious from the name that duplicating a List would entail
 duplicating its tail, whereas duplicating a Window would entail just
 copying the reference to the same parent. However, at the type level
 there's no distinction between the two members.
I'm not sure, why wouldn't you want to copy the parent window as well?
You want to create a new window with the same parent. At the top level there's one desktop window, and probably having two would be odd.

Andrei
Dec 10 2012
next sibling parent "Dan" <dbdavidson yahoo.com> writes:
On Monday, 10 December 2012 at 20:10:25 UTC, Andrei Alexandrescu 
wrote:

 You want to create a new window with the same parent. At the 
 top level there's one desktop window, and probably having two 
 would be odd.
Ok - so I'm only talking about structs. What you say is what you want here and it makes sense. But that is not what I'm proposing. I'm just talking about deep copy of structs that is always deep copy - period (bearophile called it transitive copy). By design that means there is no aliasing at all. So in this case the parent window would be deep copied. This can be done without any change to the struct (cycles when pointers are used are one design issue that would need to be addressed). This is also why gdup can be used to copy any immutable(T) into T (assuming is(T==struct)) and get around the "how do I copy const/immutable instances" we see so often. Below illustrates what would happen and you would not want it for Window parent/child relationships - but you see the consistency. For what you describe you want reference semantics and gdup is not needed.

Given this - I think the original claim still holds. I claim that gidup or igdup or whatever can just be:

   @property auto gidup(T)(const ref T t) {
     immutable(T) result = cast(immutable)(t.gdup);
     return result;
   }

I am not saying you can or should gdup any struct instance willy nilly, just that the cast to immutable here is safe because all fields are deep copied recursively.

------------------- output -------
Window("c1", 5, 5, 7FFF5BF3F4F0)
Window("c1", 5, 5, 7FAC74004FE0)
-------------------
   import std.stdio;
   import opmix.mix;

   struct Window {
     string name;
     int x, y;
     Window *parent;
   }

   void main() {
     auto window = Window("root", 1, 1);
     auto child = Window("c1", 5, 5, &window);
     auto c2 = child.gdup;
     writeln(child);
     writeln(c2);
   }
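
The "always deep, no aliasing" behaviour for pointers can be sketched independently of opmix. The helper below is a made-up deepCopyInto, not the gdup code. It assumes acyclic data and handles only structs, pointers and types without mutable indirection; a cycle such as a window that is its own ancestor would recurse forever, which is the design issue mentioned above.

```d
struct Window
{
    string name;
    int x, y;
    Window* parent;
}

// Hypothetical helper: structs, pointers, and alias-safe types only.
void deepCopyInto(Dst, Src)(ref Dst dst, const ref Src src)
{
    static if (is(const(Src) : Dst))
    {
        // int, string, ...: no mutable indirection, plain copy is safe.
        dst = src;
    }
    else static if (is(Dst == P*, P))
    {
        if (src is null)
        {
            dst = null;
            return;
        }
        // Allocate a new pointee and copy into it; never share the target.
        dst = new P;
        deepCopyInto(*dst, *src);
    }
    else static if (is(Dst == struct))
    {
        static foreach (i; 0 .. Dst.tupleof.length)
            deepCopyInto(dst.tupleof[i], src.tupleof[i]);
    }
    else
        static assert(false, "deepCopyInto: unsupported type " ~ Dst.stringof);
}

unittest
{
    auto root = Window("root", 1, 1);
    auto child = Window("c1", 5, 5, &root);

    Window copy;
    deepCopyInto(copy, child);

    assert(copy.parent !is child.parent);   // the parent was duplicated too
    assert(copy.parent.name == "root");
    copy.parent.x = 42;
    assert(root.x == 1);                    // no aliasing with the original
}
```

The differing parent addresses in the output above correspond to the first assertion here.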
Dec 10 2012
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-12-10 21:10, Andrei Alexandrescu wrote:

 You want to create a new window with the same parent. At the top level
 there's one desktop window, and probably having two would be odd.
There are many applications that support several top level windows. These are mostly document based.

--
/Jacob Carlborg
Dec 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/10/12 4:20 PM, Jacob Carlborg wrote:
 On 2012-12-10 21:10, Andrei Alexandrescu wrote:

 You want to create a new window with the same parent. At the top level
 there's one desktop window, and probably having two would be odd.
There are many applications that support several top level windows. These are mostly document based.
Are we really up for continuing to debate this?

Andrei
Dec 10 2012
next sibling parent "Dan" <dbdavidson yahoo.com> writes:
On Monday, 10 December 2012 at 21:24:09 UTC, Andrei Alexandrescu 
wrote:
 On 12/10/12 4:20 PM, Jacob Carlborg wrote:
 On 2012-12-10 21:10, Andrei Alexandrescu wrote:

 You want to create a new window with the same parent. At the 
 top level
 there's one desktop window, and probably having two would be 
 odd.
There are many applications that support several top level windows. These are mostly document based.
 Are we really up for continuing to debate this?

 Andrei

I'm not up for this debate about Windows - just the original about gdup :-)

It is not an attempt to automatically determine reference or value semantics, even though your ideas on @owned and @foreign make sense and are a fine use of attributes. It is about a single phobos/library [g]dup function for structs that provides recursive deep copy without the need for the struct designer to do anything extra. I think the benefits pointed out in the original post make sense and are worth pursuing.

I have a feeling Walter would disagree - simply because apparently he is not a fan of postblits, which serve a similar purpose. Further, from another thread he is not a fan of deep copy semantics in general, as he said there is almost never a need (which I would love to hear more commentary on). Yet TDPL shows a prime example of the need in 7.1.3 (Widget). Plus we get questions all the time on how to cross from the const/immutable world to the mutable - which gdup provides.

Thanks
Dan
Dec 10 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-12-10 22:24, Andrei Alexandrescu wrote:

 Are we really up for continuing to debate this?
No, not really. It's kind of a useless discussion. End.

--
/Jacob Carlborg
Dec 10 2012