
digitalmars.D - Order of destruction when garbage collection kicks in

reply "Henning Pohl" <henning still-hidden.de> writes:
In fact there is no order of destruction. And this is one of the 
most annoying D problems I recently had to deal with. Look at 
this example: http://dpaste.dzfl.pl/f3f860b0. This time, it 
segfaulted. Next time it may (in theory) not, because the dtor of 
a is called before the one of b. A holds a reference to a B. In 
the destructor of A I expect b either to be null or a valid 
instance of B (which has not been destroyed yet). You get a kind 
of undefined behavior instead. This is IMO a huge drawback 
towards reference counting with strong/weak references.

Is there right now any way to arrange things?
Apr 09 2013
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/9/13, Henning Pohl <henning still-hidden.de> wrote:
 In the destructor of A I expect b either to be null or a valid
 instance of B (which has not been destroyed yet).
Hmm yeah I hoped to see it be null too.
 This is IMO a huge drawback
 towards reference counting with strong/weak references.
I don't think D classes were ever meant to be used with reference counting. Structs yes, but classes are GC's territory.
Apr 09 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 09 Apr 2013 15:55:09 -0400, Henning Pohl <henning still-hidden.de>  
wrote:

 Is there right now any way to arrange things?
No. You are not allowed to access any GC-managed resources inside a destructor.

And it won't be null, because that would be extremely costly. Consider if 300 objects had a pointer to a block of memory. Now those 300 objects all go away. By your expectation, if the targeted block of memory was collected, the GC would have to spend time going through all those 300 pointers, setting them to null. When they are about to all be destroyed. 99.99% of the time, that would be wasted, as those 300 objects may not even have destructors that care.

A destructor is ONLY for destroying non-GC managed resources, nothing else. Like an OS file handle for instance.

-Steve
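
To make the rule concrete, here is a minimal sketch of a class whose destructor touches only a non-GC resource. NativeBuffer, its fields and the live counter are hypothetical names made up for illustration, and malloc'd memory merely stands in for something like an OS file handle.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical wrapper, not taken from the thread.
class NativeBuffer
{
    static int live;        // number of C allocations not yet released
    private void* raw;      // C heap memory, not managed by the GC
    private int[] scratch;  // GC-managed, off limits inside ~this()

    this(size_t size)
    {
        raw = malloc(size);
        if (raw !is null)
            ++live;
        scratch = new int[](8);
    }

    ~this()
    {
        // Fine: raw belongs to the C heap, so it is still valid here
        // no matter what the GC has already collected.
        if (raw !is null)
        {
            free(raw);
            raw = null;
            --live;
        }
        // Not fine: reading or writing scratch. When the GC runs this
        // destructor, that array may already have been collected.
    }
}
```

Calling destroy(obj) runs the destructor deterministically, which is one way to avoid waiting for a collection.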
Apr 09 2013
parent reply "Henning Pohl" <henning still-hidden.de> writes:
On Tuesday, 9 April 2013 at 20:46:12 UTC, Steven Schveighoffer 
wrote:
 No.  You are not allowed to access any GC-managed resources 
 inside a destructor.
Okay, I finally found it: http://dlang.org/class.html#destructors. But it's not listed here: http://dlang.org/garbage.html.
 And it won't be null, because that would be extremely costly.  
 Consider if 300 objects had a pointer to a block of memory.  
 Now those 300 objects all go away.  By your expectation, if the 
 targeted block of memory was collected, the GC would have to 
 spend time going through all those 300 pointers, setting them 
 to null.  When they are about to all be destroyed.  99.99% of 
 the time, that would be wasted, as those 300 objects may not 
 even have destructors that care.
I thought about the case when the user sets the reference to null before the destructor was even called.
 A destructor is ONLY for destroying non-GC managed resources, 
 nothing else.  Like an OS file handle for instance.
Imagine this case:

// External C library.
extern(C)
{
    alias void* a_handle;
    alias void* b_handle;

    a_handle create_a();
    void destroy_a(a_handle);

    b_handle create_b(a_handle);
    void destroy_b(b_handle);
}

class A
{
    this()
    {
        // An a_handle owns multiple b_handles.
        handle = create_a();
    }

    ~this()
    {
        // Destroys all bs connected with this a.
        destroy_a(handle);
    }

    private a_handle handle;
}

class B
{
    this(A a)
    {
        this.a = a;
        // Creates a b_handle connected to the given a_handle.
        handle = create_b(a.handle);
    }

    ~this()
    {
        // a already destroyed -> segfault
        // a still alive -> works
        destroy_b(handle);
    }

    private A a;
    private b_handle handle;
}

Any instance of B always needs to be destructed _before_ the A it's connected to. How do you express this in D?
Apr 10 2013
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 10 Apr 2013 10:48:30 +0100, Henning Pohl <henning still-hidden.de>  
wrote:

Imagine this case: ... Any instance of B always needs to be destructed _before_ the A it's connected to. How do you express this in D?
// External C library.
extern(C)
{
    alias void* a_handle;
    alias void* b_handle;

    a_handle create_a();
    void destroy_a(a_handle);

    b_handle create_b(a_handle);
    void destroy_b(b_handle);
}

class A
{
    this()
    {
        // An a_handle owns multiple b_handles.
        handle = create_a();
    }

    ~this()
    {
        Dispose(false); // ** new **
    }

    private a_handle handle;

    // ** new **
    private bool _disposed;

    public void Dispose()
    {
        Dispose(true);
    }

    private void Dispose(bool disposing)
    {
        if (!_disposed)
        {
            _disposed = true;
            if (disposing)
            {
                // Destroys all bs connected with this a.
                destroy_a(handle);
            }
        }
    }
}

class B
{
    this(A a)
    {
        this.a = a;
        // Creates a b_handle connected to the given a_handle.
        handle = create_b(a.handle);
    }

    ~this()
    {
        Dispose(false); // ** new **
    }

    private A a;
    private b_handle handle;

    // ** new **
    private bool _disposed;

    public void Dispose()
    {
        Dispose(true);
    }

    private void Dispose(bool disposing)
    {
        if (!_disposed)
        {
            _disposed = true;
            if (disposing)
            {
                destroy_b(handle);
                a.Dispose(true); // or maybe not depending on how many b handles there are?
            }
        }
    }
}

Yes, your code will leak library handles/resources if you forget to manually call Dispose() on (at least) the B above (assuming it calls Dispose on its A).

It's a pain and not perfect, but I don't think it can be - in D - without compiler/GC support for a disposal pattern like this [...] of the application though, it causes objects to survive a GC collection, and they in turn keep other objects alive, so there is a very real issue of objects living longer and resource use being higher. Add to that, that objects can be resurrected after the first collection, before the GC finalizer calls Dispose, and you have all sorts of tricky edge cases and timing windows. So, it's not without costs.

But, in D, you can do the above and ensure you call Dispose manually, and at worst it's the same as manual memory management in C/C++ for a smaller sub-set of your code, making it more manageable, but perhaps easier to forget and get wrong.

R
Apr 10 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 10 Apr 2013 11:08:22 +0100, Regan Heath <regan netmail.co.nz>  
wrote:

Actually I got my example a bit wrong. You can at least destroy A's library handle in Dispose(bool), whether it is called with true or false. But you cannot call a.Dispose() from B's Dispose(false) and, from what you're saying, you cannot free B's library handle there either.

R
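
For reference, a rough sketch of what that corrected version might look like. The four library functions are replaced by malloc-based stand-ins so the code compiles on its own; the liveA/liveB counters, the isDisposed accessor and the check of a.isDisposed in B are assumptions added for illustration, not part of the original example. Unlike the real library, the stand-in destroy_a does not release the b handles connected to it.

```d
import core.stdc.stdlib : free, malloc;

alias a_handle = void*;
alias b_handle = void*;

// Stand-ins for the external C library (assumption: the real functions
// would be declared extern(C) and linked in instead).
int liveA, liveB;
a_handle create_a()         { ++liveA; return malloc(1); }
void destroy_a(a_handle h)  { --liveA; free(h); }
b_handle create_b(a_handle) { ++liveB; return malloc(1); }
void destroy_b(b_handle h)  { --liveB; free(h); }

class A
{
    private a_handle handle;
    private bool disposed;

    this()  { handle = create_a(); }
    ~this() { Dispose(false); }

    void Dispose() { Dispose(true); }
    bool isDisposed() const { return disposed; }

    private void Dispose(bool disposing)
    {
        if (disposed) return;
        disposed = true;
        // The a handle depends on no other GC object, so it can be
        // released on both the explicit and the finalizer path.
        destroy_a(handle);
    }
}

class B
{
    private A a;
    private b_handle handle;
    private bool disposed;

    this(A a) { this.a = a; handle = create_b(a.handle); }
    ~this()   { Dispose(false); }

    void Dispose() { Dispose(true); }

    private void Dispose(bool disposing)
    {
        if (disposed) return;
        disposed = true;
        // Only the explicit path may look at a. On the finalizer path
        // a may already be gone, and destroy_a has then released the
        // b handles anyway.
        if (disposing && !a.isDisposed)
            destroy_b(handle);
    }
}
```

With this shape the caller still has to dispose each B before its A; the flag only turns a wrong order into a skipped destroy_b rather than a crash.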
Apr 10 2013
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Apr 2013 05:48:30 -0400, Henning Pohl <henning still-hidden.de>  
wrote:

 Any instance of B always needs to be destructed _before_ the A it's  
 connected to. How do you express this in D?
You must not rely on the GC if you need ordered destruction.

Even though destructors CAN destroy other resources, it's almost always better to use manual destruction. Consider that you almost certainly have more memory available than limited resources (such as file handles). If you rely on the GC to destroy that resource (and the GC is not guaranteed to always destroy it), then it's possible you run out of file handles before the memory that owns it is collected.

I recommend you introduce destruction methods to close the handles manually outside the GC. Then if you happen to forget to close those manually, as a fail-safe have A destroy its a_handle (which would destroy all the b_handles, right?) in the destructor.

A simple example to explain this is a buffered stream. Imagine you have a class which contains a GC-managed array as a buffer, and the OS file handle. Upon destruction, it would be best if the buffered stream wrote whatever buffered data was left into the file handle, but since the GC may have destroyed the buffer, we can't do that. So you have to put a warning in the docs not to rely on the GC to destroy the stream.

It really requires a change in philosophy -- do not let the GC clean up non-memory resources. Use manual methods to do that.

-Steve
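
The buffered stream case could be sketched roughly like this. BufferedOut and its members are hypothetical names chosen for illustration, and a C FILE* stands in for the OS file handle. The manual close() flushes and closes; the destructor only closes, because the buffer may no longer exist when the GC runs it.

```d
import core.stdc.stdio : FILE, fclose, fwrite, tmpfile;

// Hypothetical buffered stream, assuming a C FILE* as the OS resource.
class BufferedOut
{
    private FILE* fp;       // non-GC resource
    private char[] buffer;  // GC-managed buffer

    this(FILE* fp) { this.fp = fp; }

    bool isOpen() const { return fp !is null; }
    size_t pending() const { return buffer.length; }

    void put(const(char)[] text) { buffer ~= text; }

    // Manual destruction: safe to use the buffer, since the object is
    // still reachable when user code calls this.
    void close()
    {
        if (fp is null) return;
        if (buffer.length)
            fwrite(buffer.ptr, 1, buffer.length, fp);
        buffer = null;
        fclose(fp);
        fp = null;
    }

    // Fail-safe only: no flush here, because the GC may already have
    // collected buffer. Any unwritten data is lost.
    ~this()
    {
        if (fp !is null)
        {
            fclose(fp);
            fp = null;
        }
    }
}

void example()
{
    auto s = new BufferedOut(tmpfile());
    scope (exit) s.close();  // runs on every exit path of this function
    s.put("hello");
}
```

scope (exit) is one way to make the manual call hard to forget within a single function; for longer-lived objects the owner has to call close() itself.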
Apr 11 2013
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 09 Apr 2013 20:55:09 +0100, Henning Pohl <henning still-hidden.de>  
wrote:

 Is there right now any way to arrange things?
http://msdn.microsoft.com/en-gb/library/fs2xkftw.aspx

You can implement this in D, albeit without the hook into the GC to "prevent finalization", because that concept does not exist. In D you would:

- Implement a Dispose method on your object which calls a protected/private Dispose(bool) with true.
- Have your object's destructor call Dispose(bool) with false.
- In Dispose(true), clean up managed (GC) resources and objects etc., in addition to the unmanaged ones.
- In Dispose(false), only clean up unmanaged resources (OS handles etc).
- In Dispose(bool), set a disposed flag to true to prevent further disposal.

In your user code you would call Dispose when you were done with an object. This would trigger Dispose(true) and clean up other GC resources. If not called, the destructor will call Dispose(false).

It's not perfect, but it's organised and safe.

R
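
A minimal sketch of that pattern in D, under some assumptions: Resource, child and isDisposed are hypothetical names, and a malloc'd byte stands in for an unmanaged OS handle.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical class illustrating the Dispose pattern described above.
class Resource
{
    private void* handle;    // unmanaged: stands in for an OS handle
    private Resource child;  // managed: another GC object this one owns
    private bool disposed;

    this(Resource child = null)
    {
        handle = malloc(1);
        this.child = child;
    }

    // Called by user code when it is done with the object.
    void Dispose() { Dispose(true); }

    // Called by the GC (or destroy) if the user forgot.
    ~this() { Dispose(false); }

    bool isDisposed() const { return disposed; }

    private void Dispose(bool disposing)
    {
        if (disposed) return;  // the flag prevents double disposal
        disposed = true;

        // Managed objects are only touched on the explicit path,
        // where they are guaranteed to still be alive.
        if (disposing && child !is null)
            child.Dispose();

        // Unmanaged resources are released on both paths.
        free(handle);
        handle = null;
    }
}
```

Because D has no way to tell the GC to skip finalization, the destructor still runs after an explicit Dispose(); the disposed flag is what makes that second call harmless.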
Apr 10 2013