digitalmars.D - deepCopy

"Denis Koroskin" <2korden gmail.com> writes:
Okay, we've finished what we started with Aleksey today, so I decided to  
share it with you. This is a rough cut, it lacks comments, but it is  
already usable.

deepCopy is a function that makes a deep copy of your object, and  
everything it points to. As simple as that.

Essentially it is a binary serializer except that it doesn't store  
pointers as offsets (although it is capable of doing that, too - change a  
single line - so serialize and deepCopy share 99% of the code).

deepCopy and serialize are similar in design yet different in usage:
serialized data are usually transmitted to another application, and one must
worry about different endianness, pointer size etc. In contrast,
deepCopy is to be used within the same address space and is free from such
issues.
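
To illustrate the difference between storing pointers and storing offsets, here is a minimal sketch. It is not taken from the library: Node, StoredNode, toOffsets and toPointers are hypothetical names, and element indices stand in for byte offsets. It assumes every non-null link points into the same array.

```d
/// In-memory form: links are real pointers, valid only in this address space.
struct Node
{
    int value;
    Node* next;
}

/// Position-independent form: the link is stored as index + 1 (0 means null).
/// Hypothetical layout, used only for this illustration.
struct StoredNode
{
    int value;
    size_t nextIndex;
}

/// Serialize-style step: replace each pointer by an offset into the block.
/// Assumes every non-null `next` points at an element of `nodes`.
StoredNode[] toOffsets(const(Node)[] nodes)
{
    auto stored = new StoredNode[nodes.length];
    foreach (i, ref n; nodes)
    {
        stored[i].value = n.value;
        stored[i].nextIndex = n.next is null ? 0 : (n.next - nodes.ptr) + 1;
    }
    return stored;
}

/// deepCopy-style step: rebuild real pointers that refer to the new block.
Node[] toPointers(const(StoredNode)[] stored)
{
    auto nodes = new Node[stored.length];
    foreach (i, ref s; stored)
    {
        nodes[i].value = s.value;
        nodes[i].next = s.nextIndex == 0 ? null : &nodes[s.nextIndex - 1];
    }
    return nodes;
}

unittest
{
    auto list = new Node[3];
    list[0] = Node(10, &list[1]);
    list[1] = Node(20, &list[2]);
    list[2] = Node(30, null);

    auto copy = toPointers(toOffsets(list));

    assert(copy[0].next is &copy[1]);      // links point into the copy
    assert(copy[0].next !is &list[1]);     // no alias to the original
    assert(copy[0].next.next.value == 30);
}
```

The offset form can be written to a file or socket; the pointer form is only meaningful inside the process that built it, which is the case deepCopy targets.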

deepCopy is useful for making sure there are no aliases to your data left.
This can be used to create a safe immutable copy of your objects, or to
avoid memory leaks.

We've written deepCopy to solve memory leaks in ddmd the following way:
a) hook all the memory allocations
b) run code, produce result
c) make a deep copy of the result
d) release all the allocated memory

Since you are deallocating all the memory at once, you can use faster
allocation methods, e.g. preallocate a block of memory and simply advance a
pointer, or use dsimcha's tempAlloc
(http://dsource.org/projects/scrapple/browser/trunk/tempAlloc). BTW, I
remember a discussion about integrating it into druntime; did it go
anywhere since then?
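
The "preallocate and advance a pointer" idea, combined with steps b) to d) above, might look roughly like the sketch below. Region and squaresViaRegion are hypothetical names, not part of ddmd or tempAlloc. Step a) (hooking allocations) is not shown; the code calls the region explicitly, and .dup stands in for deepCopy because the result here is a flat array.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical bump ("region") allocator: one preallocated block,
/// allocation only advances an offset, and everything is freed at once.
struct Region
{
    ubyte* base;
    size_t capacity;
    size_t used;

    this(size_t bytes)
    {
        base = cast(ubyte*) malloc(bytes);
        assert(base !is null, "out of memory");
        capacity = bytes;
    }

    /// Returns `n` bytes at an offset rounded up to a multiple of 16,
    /// or null when the region is exhausted.
    void* allocate(size_t n)
    {
        immutable start = (used + 15) & ~cast(size_t) 15;
        if (start > capacity || n > capacity - start)
            return null;
        used = start + n;
        return base + start;
    }

    /// Step d): release all the allocated memory in one call.
    void releaseAll()
    {
        free(base);
        base = null;
        capacity = 0;
        used = 0;
    }
}

int[] squaresViaRegion(size_t count)
{
    auto region = Region(1024);
    scope (exit) region.releaseAll();   // step d), runs after the copy below

    // Step b): run code, using only region memory for temporaries.
    auto tmp = cast(int*) region.allocate(count * int.sizeof);
    assert(tmp !is null, "region too small");
    foreach (i; 0 .. count)
        tmp[i] = cast(int)(i * i);

    // Step c): copy the result out of the region before it is released.
    return tmp[0 .. count].dup;
}

unittest
{
    assert(squaresViaRegion(4) == [0, 1, 4, 9]);
}
```

Individual frees are never needed, so the allocator keeps no per-block bookkeeping; the price is that nothing inside the region may be referenced after releaseAll.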

deepCopy stores all the data sequentially, so it should reduce memory
fragmentation and should be more cache-friendly. As a downside, the whole
block will only be released once the last reference to it expires.

If your struct has pointers you can manually specify if that pointer is a  
pointer to one element (default), many or none (excluding it from being  
copied). Exclusion works with references, too:

class Foo : ISerializeable
{
     mixin Serializeable;

     // optional, only needed for precise serialization control
     void describe(SerializeInfo* info)
     {
         info.setLength(buffer, length);
         info.exclude(cachedValue);
     }

     ubyte* buffer;
     size_t length;
     Object cachedValue;
}

I hope someone will find it useful, the code with tests is located here:
http://bitbucket.org/korDen/serialize/src/tip/

Suggestions are welcome!
Oct 18 2010
Jacob Carlborg <doob me.com> writes:
What types does this support, all types? Does it support array slices?

--
/Jacob Carlborg
Oct 19 2010
"Denis Koroskin" <2korden gmail.com> writes:
On Tue, 19 Oct 2010 12:37:35 +0400, Jacob Carlborg <doob me.com> wrote:

What types does this support, all types? Does it support array slices?

Classes, structs, built-in types, arrays, slices - just about anything. For classes to work you need to implement the ISerializable interface using mixin Serializable; so that it can serialize through a base class pointer. Now that I think about it, it doesn't support built-in associative arrays yet. I forgot about that one (it might be tricky to serialize it into a sequential memory block). Take a look at the tests (there are many of them, some of them are very tricky), and give it a try.
Oct 19 2010
Jacob Carlborg <doob me.com> writes:
On 2010-10-19 17:52, Denis Koroskin wrote:
Classes, structs, built-in types, arrays, slices - just about anything. For classes to work you need to implement the ISerializable interface using mixin Serializable; so that it can serialize through a base class pointer. Now that I think about it, it doesn't support built-in associative arrays yet. I forgot about that one (it might be tricky to serialize it into a sequential memory block). Take a look at the tests (there are many of them, some of them are very tricky), and give it a try.

I'm kind of amazed how short the code is and that you managed to support delegates.

A suggestion: support classes without the mixin if they're not serialized through a base class reference.

I might have found a bug:

struct Foo
{
     int[] arr;
}

public void test0()
{
     Foo src;
     src.arr = [0, 1, 2, 3, 4, 5];

     auto dst = deepCopy(src);

     assert(src.arr is dst.arr); // passes, but should fail ?
}

The assert passes but should fail, otherwise it's not a deep copy.

--
/Jacob Carlborg
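
For reference, `is` on slices is true only when both refer to the same memory (same pointer and length), which is why the check above detects aliasing. The sketch below shows the expected behaviour with a hand-written copy; manualDeepCopy is a hypothetical stand-in for this one struct, not the library's deepCopy.

```d
struct Foo
{
    int[] arr;
}

/// Hand-written deep copy for this one struct (illustration only).
Foo manualDeepCopy(Foo src)
{
    Foo dst;
    dst.arr = src.arr.dup; // new array with the same contents
    return dst;
}

unittest
{
    Foo src;
    src.arr = [0, 1, 2, 3, 4, 5];

    Foo shallow = src;                // copies the slice (pointer + length) only
    assert(shallow.arr is src.arr);   // same memory: an alias

    Foo deep = manualDeepCopy(src);
    assert(deep.arr == src.arr);      // equal contents
    assert(deep.arr !is src.arr);     // different memory

    deep.arr[0] = 42;
    assert(src.arr[0] == 0);          // the original is unaffected
}
```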
Oct 19 2010