digitalmars.D - Destructors and Deterministic Memory Management

dsimcha <dsimcha yahoo.com> writes:
Two closely related topics here:

1.  It is often nice to be able to create a single object that works with both
GC and deterministic memory management.  The idea is that, if delete is called
manually, all sub-objects would be freed deterministically, but the object
could still safely be GC'd.  Since the destructor called by the GC can't
reference sub-objects, would it be feasible to have two destructors for each
class, one that is called when delete is invoked manually and another that is
called by the GC?

2.  One possible solution is to allocate the sub-objects whose lifetimes can't
exceed that of the main object on the C heap, and put std.c.stdlib.free()
calls in the destructor.  However, according to the spec, the GC is not
guaranteed to call the destructor for all unreferenced objects.  Under what
circumstances would the d'tor not get called, leading to a memory leak if this
strategy was used?
May 03 2009
Georg Wrede <georg.wrede iki.fi> writes:
dsimcha wrote:
 Two closely related topics here:
 
 1.  It is often nice to be able to create a single object that works with both
 GC and deterministic memory management.  The idea is that, if delete is called
 manually, all sub-objects would be freed deterministically, but the object
 could still safely be GC'd.  Since the destructor called by the GC can't
 reference sub-objects, would it be feasible to have two destructors for each
 class, one that is called when delete is invoked manually and another that is
 called by the GC?

This one you already can do. Just write a myDestructor() that does whatever you need, and then call it when necessary.
May 03 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Georg Wrede wrote:
 This one you already can do. Just write a myDestructor() that does
 whatever you need, and then call it when necessary.

Actually, a pattern I stole from C# was dispose. I have a module lying around somewhere that defines a Disposable interface with a void dispose(); method. There's also a mixin that implements much of the logic and boilerplate for you.

It also defines void dispose(T)(T) and void destroy(T)(ref T); destroy works like delete except that it will call dispose on the passed value if it can find it. Then I just use destroy everywhere instead of delete, unless it's for a scoped instance, in which case I use dispose.

  -- Daniel
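
A minimal sketch of how such a dispose/destroy pair might look. This is not Daniel's module: the names LogFile and disposeAndDestroy are hypothetical, and the built-in destroy() of current D is assumed as a stand-in for the delete statement used in 2009-era D.

```d
interface Disposable
{
    void dispose();
}

// Hypothetical resource-owning class, used only for illustration.
class LogFile : Disposable
{
    static int disposed;   // counts dispose() calls so the effect is observable

    void dispose()
    {
        // Deterministic cleanup (closing handles and the like) would go here.
        ++disposed;
    }
}

// Hypothetical helper: call dispose() if the object supports it, then
// finalize the object and null the caller's reference.
void disposeAndDestroy(T)(ref T obj)
    if (is(T == class) || is(T == interface))
{
    if (obj is null)
        return;

    // Going through Object keeps this a plain dynamic cast for any T.
    if (auto d = cast(Disposable) cast(Object) obj)
        d.dispose();

    destroy(obj);   // assumption: stand-in for `delete obj`; runs ~this()
    obj = null;
}

unittest
{
    auto f = new LogFile;
    disposeAndDestroy(f);
    assert(f is null);
    assert(LogFile.disposed == 1);
}
```

Taking the argument by ref lets the helper null the caller's variable, which is roughly what delete did to its operand.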
May 03 2009
Sean Kelly <sean invisibleduck.org> writes:
dsimcha wrote:
 Two closely related topics here:
 
 1.  It is often nice to be able to create a single object that works with both
 GC and deterministic memory management.  The idea is that, if delete is called
 manually, all sub-objects would be freed deterministically, but the object
 could still safely be GC'd.  Since the destructor called by the GC can't
 reference sub-objects, would it be feasible to have two destructors for each
 class, one that is called when delete is invoked manually and another that is
 called by the GC?

You can do this today with both Druntime on D 2.0 and Tango on D 1.0, though it isn't the most performant approach. The code would look something like this:

import core.runtime;

interface Disposable
{
    void dispose();
}

bool handler( Object o )
{
    auto d = cast(Disposable) o;

    if( d !is null )
    {
        d.dispose();
        return false;
    }
    return true;
}

static this()
{
    Runtime.collectHandler = &handler;
}

If you return false from your collectHandler then the runtime won't call the object's dtor.
May 04 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Sean Kelly wrote:
 ...
 
 import core.runtime;
 
 interface Disposable
 {
     void dispose();
 }
 
 bool handler( Object o )
 {
     auto d = cast(Disposable) o;
 
     if( d !is null )
     {
         d.dispose();
         return false;
     }
     return true;
 }
 
 static this()
 {
     Runtime.collectHandler = &handler;
 }
 
 If you return false from your collectHandler then the runtime won't call
 the object's dtor.

:O ?cookie SeanK -- Daniel
May 05 2009
Nick B <nick.barbalich gmail.com> writes:
Sean Kelly wrote:
 You can do this today with both Druntime on D 2.0 and Tango on D 1.0,
 though it isn't the most performant approach.
 ...
 If you return false from your collectHandler then the runtime won't call
 the object's dtor.

I raised the enhancement request below:

http://d.puremagic.com/issues/show_bug.cgi?id=2757

Would this sample code solve these resource management / deterministic memory management issues?
May 05 2009
Georg Wrede <georg.wrede iki.fi> writes:
Sean Kelly wrote:
 You can do this today with both Druntime on D 2.0 and Tango on D 1.0,
 though it isn't the most performant approach.
 ...
 If you return false from your collectHandler then the runtime won't call
 the object's dtor.

Err, if one has an object that needs deterministic destruction, then one calls it (with say myDelete()) to have it release its resources. Until then, of course, it shouldn't get collected. But in any case it won't, because we obviously have a handler to it because we still haven't decided to destruct it. So I assume there's something I don't understand here.

After we're done with destructing (as in having called myDelete()), then we "delete the object", or let it go out of scope. Then it becomes eligible for disposal. So, what's the relevance of Disposable here?
May 05 2009
Georg Wrede <georg.wrede iki.fi> writes:
Georg Wrede wrote:
 Err, if one has an object that needs deterministic destruction, then one
 calls it (with say myDelete()) to have it release its resources. Until
 then, of course, it shouldn't get collected. But in any case it won't,
 because we obviously have a handler to it because we still haven't
 decided to destruct it. So I assume there's something I don't understand
 here.

s/handler/reference/
 After we're done with destructing (as in having called myDelete()), then 
 we "delete the object", or let it go out of scope. Then it becomes 
 eligible for disposal. So, what's the relevance of Disposable here?
 

May 05 2009
Sean Kelly <sean invisibleduck.org> writes:
Georg Wrede wrote:
 Err, if one has an object that needs deterministic destruction, then one
 calls it (with say myDelete()) to have it release its resources. Until
 then, of course, it shouldn't get collected. But in any case it won't,
 because we obviously have a handler to it because we still haven't
 decided to destruct it. So I assume there's something I don't understand
 here.

 After we're done with destructing (as in having called myDelete()), then
 we "delete the object", or let it go out of scope. Then it becomes
 eligible for disposal. So, what's the relevance of Disposable here?

The collectHandler will only be called if an object is collected by the GC, not if it's explicitly deleted. So you could write the dtor in a way that assumes all referenced subobjects are still valid and use dispose for cleanup of garbage collected instances only. To me that sounded fairly close to what dsimcha was asking for.
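
The split described above can be sketched as follows. The Session class and its counter are hypothetical, destroy() is assumed as the modern stand-in for delete, and the handler is the one from the earlier post: the destructor serves the explicit path, dispose() serves the GC path.

```d
import core.runtime : Runtime;

interface Disposable
{
    void dispose();
}

// Hypothetical class with a GC-managed sub-object.
class Session : Disposable
{
    static int explicitCleanups;   // for demonstration only
    private int[] scratch;         // GC-managed; may already be gone in the GC path

    this()
    {
        scratch = new int[](16);
    }

    // Deterministic path: reached via explicit destruction, so `scratch`
    // is still valid and may be touched.
    ~this()
    {
        scratch[] = 0;
        ++explicitCleanups;
    }

    // GC path: called by the handler instead of ~this(); must not touch `scratch`.
    void dispose()
    {
        // Release only non-GC resources here.
    }
}

bool handler(Object o)
{
    if (auto d = cast(Disposable) o)
    {
        d.dispose();
        return false;   // tell the runtime to skip ~this()
    }
    return true;
}

shared static this()
{
    Runtime.collectHandler = &handler;
}

unittest
{
    auto s = new Session;
    destroy(s);   // assumption: modern stand-in for `delete s`
    assert(Session.explicitCleanups == 1);
}
```

Only the explicit path is exercised in the unittest, because when (or whether) the GC collects a given instance cannot be forced reliably.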
May 05 2009
Georg Wrede <georg.wrede iki.fi> writes:
Sean Kelly wrote:
 The collectHandler will only be called if an object is collected by the
 GC, not if it's explicitly deleted. So you could write the dtor in a way
 that assumes all referenced subobjects are still valid and use dispose
 for cleanup of garbage collected instances only. To me that sounded
 fairly close to what dsimcha was asking for.

Dsimcha wrote "Since the destructor called by the GC can't reference sub-objects", I got into thinking that we'd then need a myDestructor.

But

   delete myobject;

calls ~this() in myobject, as does the GC, as does program exit.

I also tested, and the referenced other objects did get deleted. No problem. That implies releasing other resources works, by simply having such release code in ~this() for the object.

I found no difference in calling delete or letting the GC do it. So, originally dsimcha's problem was imagined?
May 06 2009
Sean Kelly <sean invisibleduck.org> writes:
Georg Wrede wrote:
 
 Dsimcha wrote "Since the destructor called by the GC can't reference 
 sub-objects", I got into thinking that we'd then need a myDestructor.
 
 But
 
    delete myobject;
 
 calls ~this() in myobject, as does the GC, as does program exit.
 
 I also tested, and the referenced other objects did get deleted. No 
 problem. That implies releasing other resources works, by simply having 
 such release code in ~this() for the object.
 
 I found no difference in calling delete or letting the GC do it. So, 
 originally dsimcha's problem was imagined?

If an object might possibly be finalized by the GC rather than deleted explicitly, then its dtor can't reference subobjects (because the GC doesn't guarantee any particular finalization order). These subobjects will be finalized by the GC anyway when it detects that they're no longer referenced, but sometimes it's nice to do something with these objects when you know they're still valid. My example provided a way for a different routine to be called for deterministic vs. non-deterministic finalization to allow for this.
May 06 2009
Georg Wrede <georg.wrede iki.fi> writes:
Sean Kelly wrote:
 If an object might possibly be finalized by the GC rather than deleted
 explicitly, then its dtor can't reference subobjects (because the GC
 doesn't guarantee any particular finalization order). These subobjects
 will be finalized by the GC anyway when it detects that they're no longer
 referenced, but sometimes it's nice to do something with these objects
 when you know they're still valid. My example provided a way for a
 different routine to be called for deterministic vs. non-deterministic
 finalization to allow for this.

I've always found that sentence a bit murky in the docs. Thinking more, isn't this what happens:

class A {
   A b = new A;
}
void main() { A c = new A;}

When c gets collected, it is not "guaranteed" that b gets destroyed first.

Fine. But suppose A has a destructor that says delete b. Wouldn't that guarantee that b gets destroyed before c? And if so, shouldn't the sentence in the docs be changed somehow so it doesn't send folks on "a reverse goose chase, meaning running scared of the geese"?
May 06 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Georg Wrede wrote:
 ...
 
 I've always found that sentence a bit murky in the docs. Thinking more,
 isn't this what happens:
 
 class A {
    A b = new A;
 }
 void main() { A c = new A;}
 
 When c gets collected, it is not "guaranteed" that b gets destroyed
 first.

No, it is not guaranteed that c gets destroyed first.
 Fine. But suppose A has a destructor that says delete b. Wouldn't
 that guarantee that b gets destroyed before c?

No. GC does a collect and marks both c and b for collection. It blows up b. It goes to blow up c and notices it has a dtor. It calls the dtor. Dtor attempts to delete b. But b is a pointer into a chunk of memory that no longer exists. b's dtor explodes, killing several nearby pedestrians and one dog.

When your class' dtor is called, you CANNOT say whether any of the references into GC-controlled memory you hold are still valid.

  -- Daniel
May 06 2009
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Daniel Keep wrote:
 When your class' dtor is called, you CANNOT say whether any of the
 references into GC-controlled memory you hold are still valid.

You forgot to add: unless you know for a *fact* they're referenced from a GC root, for example from a global variable (directly or indirectly).
May 06 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Frits van Bommel wrote:
 Daniel Keep wrote:
 When your class' dtor is called, you CANNOT say whether any of the
 references into GC-controlled memory you hold are still valid.

 You forgot to add: unless you know for a *fact* they're referenced from
 a GC root, for example from a global variable (directly or indirectly).

Or there's an integer somewhere that LOOKS like a pointer to it ... I was talking from the perspective of NOT having any information outside of the object itself. :P -- Daniel
May 06 2009
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Daniel Keep wrote:
 Or there's an integer somewhere that LOOKS like a pointer to it ... I was
 talking from the perspective of NOT having any information outside of the
 object itself. :P

No, you can't assume this one. For all you know, your program might one day be compiled with a (semi-)precise GC :).
May 06 2009
Georg Wrede <georg.wrede iki.fi> writes:
Frits van Bommel wrote:
 No, you can't assume this one. For all you know, your program might one
 day be compiled with a (semi-)precise GC :).

I could read the sources till I become an expert on this, but maybe it's more efficient if somebody thoroughly explains to us what is really going on when we have an object that has pointers to other instances, and it's time to collect it.

Using only two objects, adam and eve. Initially I have a reference to adam, and adam has a reference to eve. And what I'd of course wish is that eve be destructed before adam, or at least that both destructors would be run, no matter what. Let's say each has a file to close.

So, does /anybody/ know this so well that the explanation ends up being clear, concise, and unambiguous?
May 07 2009
Leandro Lucarella <llucax gmail.com> writes:
Georg Wrede, on May 7 at 15:53, you wrote:
 I could read the sources till I become an expert on this, but maybe it's
 more efficient if somebody thoroughly explains to us what is really
 going on when we have an object that has pointers to other instances,
 and it's time to collect it.

 Using only two objects, adam and eve. Initially I have a reference to
 adam, and adam has a reference to eve. And what I'd of course wish is
 that eve be destructed before adam, or at least that both destructors
 would be run, no matter what. Let's say each has a file to close.

 So, does /anybody/ know this so well that the explanation ends up being
 clear, concise, and unambiguous?

If you want to understand how the current GC works, I'd recommend reading this series of posts:
http://proj.llucax.com.ar/blog/dgc/blog/tag/understanding%20the%20current%20gc
(from the bottom up, in chronological order, of course)

If you want to know why you don't have ordering guarantees, it's because when the "garbage" is swept, you don't do it by following the connectivity graph (as you do when you mark the memory). You can't do it even if you want to, because you don't have roots to the garbage (that's why it's garbage in the first place =). And if you can manage to follow the "old" connectivity graph for some magical reason, you still have problems with cycles. What if eve has a reference to adam too? What do you destroy first? Houston, we have a problem =)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
In 30 years Argentina is going to be one big supermarket with 15 shopping carts, because that is how many people will be able to buy anything.
    -- Sidharta Wiki
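
For the adam and eve case with "a file to close", one way to avoid depending on finalization order is sketched below, under assumptions: the Person class and its counters are hypothetical, and each destructor releases only the C-level handle its own instance holds, never following the reference to the other object. As noted at the start of the thread, the spec still does not promise that the GC runs every destructor.

```d
import core.stdc.stdio : FILE, fclose, tmpfile;

// Hypothetical class: each instance owns one C-level file handle.
class Person
{
    static int opened;      // counters for demonstration only
    static int closed;

    Person partner;         // GC-managed reference; never touched in ~this()
    private FILE* file;     // C resource, invisible to the GC

    this()
    {
        file = tmpfile();   // may be null if no temporary file can be created
        if (file !is null)
            ++opened;
    }

    ~this()
    {
        // Only this instance's own non-GC resource is released, so the
        // order in which adam and eve are finalized does not matter.
        if (file !is null)
        {
            fclose(file);
            file = null;
            ++closed;
        }
    }
}

unittest
{
    auto adam = new Person;
    auto eve = new Person;
    adam.partner = eve;
    eve.partner = adam;     // a cycle: no "correct" destruction order exists

    destroy(eve);           // either order works
    destroy(adam);
    assert(Person.closed == Person.opened);
}
```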
May 07 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Leandro Lucarella (llucax gmail.com)'s article
 (from the bottom up, in chronological order, of course)
 If you want to know why you don't have ordering guarantees, it's because
 when the "garbage" is swept, you don't do it by following the connectivity
 graph (as you do when you mark the memory). You can't do it even if you
 want to, because you don't have roots to the garbage (that's why it's
 garbage in the first place =). And if you can manage to follow the "old"
 connectivity graph for some magical reason, you still have problems with
 cycles. What if eve has a reference to adam too? What do you destroy
 first? Houston, we have a problem =)

True, but I wish the GC would allow referencing subobjects for the following very important but very restricted case, just to get around false pointer issues. It seems to work in practice anyhow, so all that would have to happen is for it to be "officially" sanctioned, so that it's not labeled as undefined behavior and thus arbitrarily dangerous:

1.  You're only referencing sub-objects to explicitly delete them.
2.  These sub-objects are guaranteed to have no more real references and *should* be freed, but may have false pointers.  If they're large enough, they will have false pointers with high probability.
3.  These sub-objects contain no finalizers of their own, so the finalizer can't get run twice.  Calling delete just frees memory.

Example:

class Foo {
    // hugeArray either never escapes or we assume that the lifetime
    // of any escapes is less than the lifetime of the class instance.
    uint[] hugeArray;

    this() {
        hugeArray = new uint[50_000_000];
    }

    ~this() {
        // There are no more real references to hugeArray, since
        // it never escapes this class instance, but it is likely
        // to have false references because it's so huge.
        // Tell the GC that it is ok to delete it anyhow.
        delete hugeArray;
    }
}
May 07 2009
Leandro Lucarella <llucax gmail.com> writes:
dsimcha, on May 7 at 16:57, you wrote:
 True, but I wish the GC would allow referencing subobjects for the
 following very important but very restricted case, just to get around
 false pointer issues.
 ...

What's wrong with explicit memory management (malloc/free) in that particular case?

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
now self-employed, concerned (but powerless), an empowered and informed member of society (pragmatism not idealism), will not cry in public, less chance of illness, tires that grip in the wet (shot of baby strapped in back seat),
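
A sketch of what the malloc/free variant of the Foo example might look like, under assumptions: the length field and the data() accessor are hypothetical additions, and core.stdc.stdlib is used as the current home of the C bindings that the thread refers to as std.c.stdlib. Because the buffer lives on the C heap, the GC neither scans nor frees it, so false pointers cannot keep it alive; the open question from the first post remains, since the buffer leaks if the destructor is never run.

```d
import core.stdc.stdlib : free, malloc;

class Foo
{
    // Lives on the C heap, outside the GC's control.
    private uint* hugeArray;
    private size_t length;

    this(size_t n)
    {
        hugeArray = cast(uint*) malloc(n * uint.sizeof);
        if (hugeArray is null)
            throw new Exception("malloc failed");
        length = n;
    }

    ~this()
    {
        // Fine even when the GC runs this: the pointer is not GC-managed.
        free(hugeArray);
        hugeArray = null;
        length = 0;
    }

    // Slice view for convenient access; must not outlive the instance.
    uint[] data()
    {
        return hugeArray[0 .. length];
    }
}

unittest
{
    auto foo = new Foo(1_000);   // the thread's example used 50_000_000
    foo.data()[0] = 42;
    assert(foo.data().length == 1_000);
    assert(foo.data()[0] == 42);
    destroy(foo);                // assumption: modern stand-in for `delete foo`
}
```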
May 07 2009