digitalmars.D - Did I find the death tractor?

pragma <pragma_member pathlink.com> writes:
Humorous subjects aside, Manfred's post got me thinking about how such a trivial
behavior (valid references in destructors) is broken in D.  After composing a
few test cases, I pinned down the problem to the garbage collector itself.

In the file std/phobos/internal/gc/gcx.d, deep in the heart of the GC, lies the
problem.  Here's how I read lines 1648 - 1653, which run after the GC has
gathered all the unmarked pages following a sweep.

# // pseudocode
# foreach(page; unmarkedPages){
#    void* ptr = getObjectForPage(page); // line 16256
#    finalizer(ptr); // line 1631
#    freePage(page); // line 1636 (roughly, there's a loopy bit afterwards that
#                    // does some more 'freeing')
# }

See the problem yet?  It's deleting objects *before* it's done with finalizing
the entire set of swept objects.  This is disastrous since entire mobs of
mutually referencing objects could easily be a part of this sweep!  It should
be changed to do the following:

# // bugfixed pseudocode
# foreach(page; unmarkedPages){
#    void* ptr = getObjectForPage(page);
#    finalizer(ptr);
# }
# foreach(page; unmarkedPages){
#    freePage(page);
# }

This would ensure that object cycles, networks and just about any relationship
tree of collected objects gets to work with valid references before being
destroyed.  All at the expense of one extra loop through the swept pages.

(To the curious: The definition for free(), on line 365, is a cleaner example of
the above, as it's the backing definition for the explicit "delete" operator.
It's more straightforward since it's not directly involved with mark/sweep.)

So can anyone out there verify this?  I want to post this on .bugs but I also
want to be certain that this is the problem.

- EricAnderton at yahoo
Aug 02 2005
"Walter" <newshound digitalmars.com> writes:
"pragma" <pragma_member pathlink.com> wrote in message
news:dcp79u$of$1 digitaldaemon.com...
 See the problem yet?  It's deleting objects *before* it's done with finalizing
 the entire set of swept objects.  This is disastrous since entire mobs of
 mutually referencing objects could easily be a part of this sweep!  It should
 be changed to do the following:

This won't solve the issue. Freeing the page doesn't render the objects invalid. What renders the objects invalid is that after the finalizer is run for an object instance, the instance's vptr entry is set to null. This prevents the object instance from ever calling another virtual function.

The only way to really solve it is to step through each object, doing a depth-first finalize on each referenced instance. This, of course, leads to the cycle problem.
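
To make the point concrete, here is a toy model in D of the proposed two-pass sweep. It is only a sketch under assumptions: `Node`, `vptrValid`, `runFinalizer`, and `finalizeAll` are hypothetical names that do not exist in gcx.d, and `vptrValid` merely stands in for the vptr being non-null. Even when nothing is freed until every finalizer has run, the second object in a cycle still finds its peer already finalized.

```d
import std.stdio : writeln;

// Toy stand-in for a collected object (hypothetical, not a gcx.d type).
struct Node
{
    string name;
    Node* peer;            // reference to another garbage object
    bool vptrValid = true; // models "vptr is still non-null"
}

// Models running one finalizer: it checks whether its peer is still
// usable, then the object itself is invalidated, as described above.
bool runFinalizer(Node* n)
{
    immutable peerUsable = n.peer !is null && n.peer.vptrValid;
    n.vptrValid = false;
    return peerUsable;
}

// Pass 1 of the proposed fix: finalize everything and free nothing.
// Returns, per object, whether its finalizer saw a usable peer.
bool[] finalizeAll(Node*[] garbage)
{
    bool[] sawUsablePeer;
    foreach (n; garbage)
        sawUsablePeer ~= runFinalizer(n);
    return sawUsablePeer;
}

void main()
{
    auto a = new Node("a");
    auto b = new Node("b");
    a.peer = b;
    b.peer = a; // a two-object cycle

    writeln(finalizeAll([a, b])); // prints [true, false]
}
```

Whichever object of the pair is finalized first can still use the other; the one finalized second cannot, regardless of when the pages are freed.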
Aug 02 2005
zwang <nehzgnaw gmail.com> writes:
Walter wrote:
 "pragma" <pragma_member pathlink.com> wrote in message
 news:dcp79u$of$1 digitaldaemon.com...
 
See the problem yet?  It's deleting objects *before* it's done with finalizing
the entire set of swept objects.  This is disastrous since entire mobs of
mutually referencing objects could easily be a part of this sweep!  It should
be changed to do the following:

This won't solve the issue. Freeing the page doesn't render the objects invalid. What renders the objects invalid is that after the finalizer is run for an object instance, the instance's vptr entry is set to null. This prevents the object instance from ever calling another virtual function. The only way to really solve it is to step through each object, doing a depth-first finalize on each referenced instance. This, of course, leads to the cycle problem.

As I understand it, D developers should
    * avoid relying on destructors to free up resources
or  * use the auto keyword
or  * manually manage memory by tracking instance references

Am I missing anything?
Aug 02 2005
AJG <AJG_member pathlink.com> writes:
 This won't solve the issue. Freeing the page doesn't render the objects
 invalid. What renders the objects invalid is that after the finalizer is run
 for an object instance, the instance's vptr entry is set to null. This
 prevents the object instance from ever calling another virtual function.
 
 The only way to really solve it is to step through each object, doing a
 depth-first finalize on each referenced instance. This, of course, leads to
 the cycle problem.
 
 

 As I understand it, D developers should
     * avoid relying on destructors to free up resources
 or  * use the auto keyword
 or  * manually manage memory by tracking instance references

 Am I missing anything?

Nope. You've described the problem quite well. --AJG.
Aug 02 2005
"Ben Hinkle" <ben.hinkle gmail.com> writes:
"zwang" <nehzgnaw gmail.com> wrote in message 
news:dcp8ps$244$1 digitaldaemon.com...
 Walter wrote:
 "pragma" <pragma_member pathlink.com> wrote in message
 news:dcp79u$of$1 digitaldaemon.com...

See the problem yet?  It's deleting objects *before* it's done with finalizing
the entire set of swept objects.  This is disastrous since entire mobs of
mutually referencing objects could easily be a part of this sweep!  It should
be changed to do the following:

This won't solve the issue. Freeing the page doesn't render the objects invalid. What renders the objects invalid is that after the finalizer is run for an object instance, the instance's vptr entry is set to null. This prevents the object instance from ever calling another virtual function. The only way to really solve it is to step through each object, doing a depth-first finalize on each referenced instance. This, of course, leads to the cycle problem.

 As I understand it, D developers should
     * avoid relying on destructors to free up resources

Destructors are exactly for freeing up resources - but they can't be GC-managed resources. They must be "external" resources like malloc'ed memory or system resources like a Win32 HANDLE.

 or  * use the auto keyword
 or  * manually manage memory by tracking instance references

 Am I missing anything? 

Aug 02 2005
"Uwe Salomon" <post uwesalomon.de> writes:
 As I understand it, D developers should
     * avoid relying on destructors to free up resources

destructors are exactly for freeing up resources - but they can't be GC-managed resources. They must be "external" resources like malloc'ed memory or system resources like a Win32 HANDLE.

That means that all GC-managed resources have to be written with exactly this problem in mind: "Any class that uses me cannot call finalizers on destruction!" I think a lot of the recent problems with this behaviour originate from imperfectly written classes (for example AJG said that he uses classes from the DDBI library - perhaps some things need to be rewritten there to make them fully usable?)

By the way, another solution to your problem: Create a set (a simple array would do it, or an int[SomeClass] map, or - of course :) - the Indigo Set) with all database objects you are using. When they are created by the managing class, they are added to the set, and in the destructor they are finalized and removed from the set (they are still accessible because the global set has a pointer on it).

But you really should use the delete option, because if it is so crucial that your resources get freed, it would perhaps be a good idea to not let that be up to the GC? If one would do that with file handles, one could simply run out of handles before the GC does the first collection.

It is funny that using delete to explicitly free objects is a solution somehow regarded as dowdy, considering that a lot of people find the whole GCing in D fishy enough to not even use the language...

Ciao
uwe
Aug 02 2005
"Ben Hinkle" <ben.hinkle gmail.com> writes:
"Uwe Salomon" <post uwesalomon.de> wrote in message 
news:op.suw2bpi36yjbe6 sandmann.maerchenwald.net...
 As I understand it, D developers should
     * avoid relying on destructors to free up resources

destructors are exactly for freeing up resources - but they can't be GC-managed resources. They must be "external" resources like malloc'ed memory or system resources like a Win32 HANDLE.

 That means that all GC-managed resources have to be written with exactly this problem in mind: "Any class that uses me cannot call finalizers on destruction!" I think a lot of the recent problems with this behaviour originate from imperfectly written classes (for example AJG said that he uses classes from the DDBI library - perhaps some things need to be rewritten there to make them fully usable?)

Agreed. If the class owning the external resource doesn't have a dtor then you have to always be sure to close the resource by hand. I still argue, though, that relying on dtors to close a resource is a back-stop even if one were available. Users should always be encouraged to close their resources explicitly since 1) the GC calls the dtor at an unknown time in the future after the object becomes garbage and 2) the GC may never call the dtor in some rare situations.
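
A minimal sketch of that advice in D, assuming a hypothetical `TempFile` wrapper around a C `FILE*`. The class and the `useAndClose` helper are illustrative only, and `core.stdc.stdio` is the current druntime module name, not necessarily what a 2005 compiler shipped. The caller closes the handle deterministically with `scope (exit)`, and the destructor acts only as a back-stop.

```d
import core.stdc.stdio : FILE, fclose, tmpfile;

// Hypothetical wrapper around an external (non-GC) resource.
class TempFile
{
    private FILE* handle;

    this()
    {
        handle = tmpfile(); // may be null if the C runtime cannot create one
    }

    // Explicit release; final so the destructor never dispatches to an
    // override, and safe to call more than once.
    final void close()
    {
        if (handle !is null)
        {
            fclose(handle);
            handle = null;
        }
    }

    bool isOpen() const
    {
        return handle !is null;
    }

    // Back-stop only: runs at some unknown time, or possibly never.
    ~this()
    {
        close();
    }
}

// Returns whether the handle is still open after the working scope ends.
bool useAndClose()
{
    auto f = new TempFile;
    {
        scope (exit) f.close(); // deterministic release on scope exit
        // ... work with the file here ...
    }
    return f.isOpen;
}

void main()
{
    assert(!useAndClose());
}
```

The destructor touches only the C handle, never another GC-allocated object, so it stays valid whatever order the collector picks.
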
 By the way, another solution to your problem: Create a set (a simple array 
 would do it, or an int[SomeClass] map, or - of course :) - the Indigo Set) 
 with all database objects you are using. When they are created by the 
 managing class, they are added to the set, and in the destructor they are 
 finalized and removed from the set (they are still accessible because the 
 global set has a pointer on it).

The downside is that the global set would presumably need to be synchronized.
 But you really should use the delete option, because if it is so crucial 
 that your resources get freed, it would perhaps be a good idea to not let 
 that be up to the GC? If one would do that with file handles, one could 
 simply run out of handles before the GC does the first collection.

Agreed. A program shouldn't start failing because the user added more memory to their system.
 It is funny that using delete to explicitly free objects is a solution 
 somehow regarded as dowdy, considering that a lot of people find the 
 whole GCing in D fishy enough to not even use the language...

There are people that still avoid a language because of GC? Their loss :-)
 Ciao
 uwe 

Aug 03 2005
"Walter" <newshound digitalmars.com> writes:
"Ben Hinkle" <ben.hinkle gmail.com> wrote in message
news:dcpcef$4f6$1 digitaldaemon.com...
 As I understand it, D developers should
     * avoid relying on destructors to free up resources

destructors are exactly for freeing up resources - but they can't be GC-managed resources. They must be "external" resources like malloc'ed memory or system resources like a Win32 HANDLE.

Right. Which means that finalizers *can* free up other object instances, as long as those instances were not allocated with the GC.
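
As a rough illustration in D, the hypothetical `Owner` class below frees a `Payload` instance that was allocated with C `malloc` rather than with the GC, so its destructor never dereferences GC-managed memory. `Owner`, `Payload`, `release`, and `holdsPayload` are made-up names, and the `core.stdc.stdlib` import assumes a current druntime.

```d
import core.stdc.stdlib : free, malloc;

// Plain data living outside the GC heap (hypothetical type).
struct Payload
{
    int value;
}

class Owner
{
    private Payload* payload; // malloc'ed, so the GC never finalizes it

    this(int value)
    {
        payload = cast(Payload*) malloc(Payload.sizeof);
        if (payload !is null)
            payload.value = value;
    }

    // Frees the external instance; final and idempotent so it is safe
    // to call both by hand and from the destructor.
    final void release()
    {
        if (payload !is null)
        {
            free(payload);
            payload = null;
        }
    }

    bool holdsPayload() const
    {
        return payload !is null;
    }

    ~this()
    {
        release(); // valid in any finalization order: no GC references used
    }
}

void main()
{
    auto owner = new Owner(42);
    assert(owner.holdsPayload);
    owner.release();
    assert(!owner.holdsPayload);
}
```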
Aug 03 2005
pragma <pragma_member pathlink.com> writes:
In article <dcp7tc$1d5$1 digitaldaemon.com>, Walter says...
"pragma" <pragma_member pathlink.com> wrote in message
news:dcp79u$of$1 digitaldaemon.com...
 See the problem yet?  It's deleting objects *before* it's done with finalizing
 the entire set of swept objects.  This is disastrous since entire mobs of
 mutually referencing objects could easily be a part of this sweep!  It should
 be changed to do the following:

This won't solve the issue.

Walter, thank you for replying to my post. I had a feeling it couldn't be that easy.

 Freeing the page doesn't render the objects
 invalid. What renders the objects invalid is that after the finalizer is run
 for an object instance, the instance's vptr entry is set to null. This
 prevents the object instance from ever calling another virtual function.

I see, so no matter what, we can only assume that references used in destructors are invalid since the order of destruction is unknown. Either we clobber objects as we go (what we're doing now) or run the risk of a cycle calling methods on an already finalized object. I think I understand now.

 The only way to really solve it is to step through each object, doing a
 depth-first finalize on each referenced instance. This, of course, leads to
 the cycle problem.

Right. What is the correct way to finalize a cycle, or is there one at all?

- EricAnderton at yahoo
Aug 02 2005
"Walter" <newshound digitalmars.com> writes:
"pragma" <pragma_member pathlink.com> wrote in message
news:dcpbqj$40j$1 digitaldaemon.com...
 Right.  What is the correct way to finalize a cycle, or is there one at all?

The gc will finalize cycles just fine. It's just that it doesn't guarantee any order to doing it. Therefore, finalizers should not rely on any of their gc-allocated references to be valid.
Aug 03 2005
Burton Radons <burton-radons smocky.com> writes:
Walter wrote:

 "pragma" <pragma_member pathlink.com> wrote in message
 news:dcp79u$of$1 digitaldaemon.com...
 
See the problem yet?  It's deleting objects *before* it's done with finalizing
the entire set of swept objects.  This is disastrous since entire mobs of
mutually referencing objects could easily be a part of this sweep!  It should
be changed to do the following:

This won't solve the issue. Freeing the page doesn't render the objects invalid. What renders the objects invalid is that after the finalizer is run for an object instance, the instance's vptr entry is set to null. This prevents the object instance from ever calling another virtual function.

Use the synchronisation slot; after all, if the object is dead, it must be disused. If the synchronisation slot is being used when delete is called on it, the user is trying to tricks you and should be told to go to hells.

It should be said here - and I don't know why you don't say this - that if you were presented with a patch for the GC which fixed this issue cleanly and efficiently, you would almost certainly put it in the next release. However, as something to spend your own time working on it's not in any way a priority.
 The only way to really solve it is to step through each object, doing a
 depth-first finalize on each referenced instance. This, of course, leads to
 the cycle problem.

I don't think anyone's argued for ordered destruction, merely dependable object references during a single collection pass. It's okay to say that a destructor needs to reset their resources and ensure that they don't break anything by trying to access them illegally later; they should be doing both things already in case initialisation failed. It's not okay to say that you can legally access any other object in your destructor, but doing so will cause a SIGSEGV.
Aug 02 2005