
digitalmars.D - auto classes and finalizers

reply Sean Kelly <sean f4.ca> writes:
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 	
       of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called

Given that D can have lexical destruction of objects that weren't 
explicitly designed for it, ie.

     class C {}
     auto C c = new C();

Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.


Sean
Apr 05 2006
next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e10pk7$2khb$1 digitaldaemon.com>, Sean Kelly says...
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 	
       of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or finalizer) should be illegal for a GC type - it should only be allowed for a class declared as 'auto'. If you need dtor-like behaviour, you should not be using GC, and the compiler should tell you so.

I posted this opinion some weeks back in a similar discussion here, expecting to be chased out of town with pitchforks, but the response was very positive. Nobody could think of any counterexamples, at any rate.

cheers
Mike
Apr 05 2006
parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e10pk7$2khb$1 digitaldaemon.com>, Sean Kelly says...
 
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

    - a type can have a destructor and/or a finalizer
    - the destructor is called upon a) explicit delete or b) at end 	
      of scope for auto objects
    - the finalizer is called if allocated on the gc heap and the
      destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or finalizer) should be illegal for a GC type - it should only be allowed for a class declared as 'auto'. If you need dtor-like behaviour, you should not be using GC, and the compiler should tell you so.

I posted this opinion some weeks back in a similar discussion here, expecting to be chased out of town with pitchforks, but the response was very positive. Nobody could think of any counterexamples, at any rate.

cheers
Mike
Mike;

Instead of making the dtor illegal for GC types, why not remove the 'auto' keyword from this realm altogether, and just use the existence of a dtor as the class RAII indicator? Thus, any class with a dtor is automatically RAII. When the dtor is actually invoked, all relevant GC allocations should still be intact; yes?

What to do about those classes that need a dtor-like construct, but cannot be deemed RAII? Be explicit about closing them, using the close() or dispose() approach.

Thoughts?

- Kris
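For illustration, a sketch of how that rule might read in code (hypothetical classes, under the proposed semantics rather than current D behaviour):

     class File
     {
         private int handle;
         ~this() { /* release the handle */ }   // dtor present => class is RAII
     }

     class ConnectionCache
     {
         // no dtor, so ordinary GC lifetime; cleanup is explicit
         void close() { /* release the scarce resource */ }
     }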
Apr 06 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e12fva$29gr$1 digitaldaemon.com>, kris says...
Mike;

Instead of making the dtor illegal for GC types, why not remove the 
'auto' keyword from this realm altogether, and just use the existence of 
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
What to do about those classes that need a dtor-like construct, but 
cannot be deemed RAII? Be explicit about closing them, using the close() 
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either. The "dispose" (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.

cheers
Mike
Apr 06 2006
next sibling parent Georg Wrede <georg.wrede nospam.org> writes:
Mike Capp wrote:
 kris says...
 
 Instead of making the dtor illegal for GC types, why not remove the
 'auto' keyword from this realm altogether, and just use the
 existence of a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
FWIW, I fully agree.
 What to do about those classes that need a dtor-like construct, but
 cannot be deemed RAII? Be explicit about closing them, using the
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either. The "dispose" (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.
Yes, and thinking of a class that needs "destructing", which then may happen much later (at GC time), or never at all -- is just insanity.
Apr 06 2006
prev sibling parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e12fva$29gr$1 digitaldaemon.com>, kris says...
 
Mike;

Instead of making the dtor illegal for GC types, why not remove the 
'auto' keyword from this realm altogether, and just use the existence of 
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
Yes, that is true.
What to do about those classes that need a dtor-like construct, but 
cannot be deemed RAII? Be explicit about closing them, using the close() 
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet which the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example. Others might include network handshaking at termination, and so on. Such things are often wrapped via a class, with the expectation that said class can encapsulate the cleanup process. Their scope (or life expectancy) is often intended to span a considerable period of time.

In some cases it might be possible to arrange the code such that these entities are actually scoped on the stack (for RAII purposes), where the enclosing function doesn't exit until termination time. However, others often have a life expectancy based upon "activity" ~ a classic example might be cached database resources, where the life expectancy of the object has nothing to do with scope per se, but is instead often based upon a period of dormancy or inactivity.
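As a sketch of that activity-based lifetime (names hypothetical; the point is that cleanup hangs off a dormancy check, not a lexical scope):

     class PooledConnection
     {
         long lastUsed;                       // time of last activity
         void close() { /* release the database handle */ }
     }

     class ConnectionPool
     {
         PooledConnection[] idle;

         // called periodically: close connections dormant for too long
         void sweep(long now, long maxIdle)
         {
             foreach (c; idle)
                 if (now - c.lastUsed > maxIdle)
                     c.close();
         }
     }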
 The "dispose"
 (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the
extreme.
This is the option left open after the discovery that dtor() is pretty much worthless. I agree that a better solution is needed.
Apr 06 2006
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 06 Apr 2006 11:48:07 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 What to do about those classes that need a dtor-like construct, but  
 cannot be deemed RAII? Be explicit about closing them, using the  
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example.
I did, and I also suggested some solutions:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462
 - reference counting.
 - a new 'shared' keyword.

The idea in that thread (it isn't really a new idea) is essentially what Kris said above:
 Instead of making the dtor illegal for GC types, why not remove the  
 'auto' keyword from this realm altogether, and just use the existence  
 of a dtor as the class RAII indicator?
In my case: removal of 'auto' from object instance declarations, but requiring it on class definitions when a dtor is present, plus requiring it on classes containing classes which are 'auto'. After all, a dtor indicates some (non-memory) cleanup needs to be done, making it RAII by definition, no? And any class containing a reference that needs cleanup will itself need cleanup, right?

I think we need to try and come up with some examples of where it can't work, and/or decide what the limitations are and whether they're an inappropriate cost to pay for what I think could be quite a safe system to write RAII in.

You said:
   The trouble is that this wouldn't make the RAII behaviour apparent to  
 somebody reading the code. They'd have to go and look at the class  
 definition. I'm happy to do a little extra typing for the sake of code  
 clarity here, in the same way that having "in" and "ref" arguments  
 marked as such by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost, much like it does for 'out' etc. function parameters.

Regan
Apr 06 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Where to attach this post... aah well, this seems as good a spot as any,
I guess...

I won't pretend I'm an expert in these things, but it seems to me that
adding reference counting to D's wide range of memory management options
would solve most of these problems, yes?

The main case for keeping dtors with GCed objects is that sometimes you
have an object that needs to be cleaned up in some fashion, but which
isn't (or can't easily be) tied to a particular stack frame.  If you
made this class reference counted, then it would be cleaned up the
second the last reference goes out of scope.

The common drawback is the argument that you then have to watch out for
cycles, but Python seems to be coping fine--it has a generational cycle
checker as far as I understand it, and I've seen papers for creating
thread-safe generational checkers so that wouldn't need to be a problem.

I think having lazy GC, RAII, manual memory management and ref. counting
would cover just about everything you could possibly want to do.

Plus, it'd be a great gloating point: "D: memory management YOUR way!"

	-- Daniel

P.S.  I beg forgiveness if I've oversimplified this.
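As a sketch of what Daniel describes (manual reference counting layered on D classes; the names are hypothetical, and `delete this` assumes no other references remain):

     class RefCounted
     {
         private int refs = 1;   // the creator holds the first reference

         void acquire() { ++refs; }

         void release()
         {
             if (--refs == 0)
                 delete this;    // the dtor then runs deterministically
         }

         ~this() { /* release the non-memory resource */ }
     }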

Regan Heath wrote:
 On Thu, 06 Apr 2006 11:48:07 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 What to do about those classes that need a dtor-like construct, but
 cannot be deemed RAII? Be explicit about closing them, using the
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example.
I did, and I also suggested some solutions:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462
 - reference counting.
 - a new 'shared' keyword.

The idea in that thread (it isn't really a new idea) is essentially what Kris said above:
 Instead of making the dtor illegal for GC types, why not remove the
 'auto' keyword from this realm altogether, and just use the
 existence of a dtor as the class RAII indicator?
In my case: removal of 'auto' from object instance declarations, but requiring it on class definitions when a dtor is present, plus requiring it on classes containing classes which are 'auto'. After all, a dtor indicates some (non-memory) cleanup needs to be done, making it RAII by definition, no? And any class containing a reference that needs cleanup will itself need cleanup, right?

I think we need to try and come up with some examples of where it can't work, and/or decide what the limitations are and whether they're an inappropriate cost to pay for what I think could be quite a safe system to write RAII in.

You said:
   The trouble is that this wouldn't make the RAII behaviour apparent
 to somebody reading the code. They'd have to go and look at the class
 definition. I'm happy to do a little extra typing for the sake of code
 clarity here, in the same way that having "in" and "ref" arguments
 marked as such by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost, much like it does for 'out' etc. function parameters.

Regan
-- v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
Apr 17 2006
prev sibling parent reply kris <foo bar.com> writes:
I thought it worthwhile to review the dtor behaviour and view the 
concerns from a different direction:

dtor 'state' valid:
- explicit invocation via delete keyword
- explicit invocation via raii

dtor state 'unspecified':
- implicitly called when no more references are held to the object
- implicitly called when a program terminates


Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
Let's also assume there are dtors which expect to "clean up", and which 
will fail when the dtor state is 'unspecified'.

What happens when a programmer forgets to explicitly delete such an 
object? Well, the program is highly likely to fail (or be in an 
inconsistent state) after the GC collects said object. This might be 
before or during program termination.

How does one ensure this cannot occur? One obvious method would be for 
the GC to /not/ invoke any dtor by default. While the GC would still 
collect, such a change would ensure it cannot be the cause of a failing 
program (it would also make the GC a little faster, but that's probably 
beside the point).

Assuming that were the case, we're left with only the two cases where 
cleanup is explicit and the dtor state is 'valid': via the delete 
keyword, and via raii (both of which apply the same functionality).

This would tend to relieve the need for an explicit dispose() pattern, 
since the dtor is now the equivalent?

What about implicit cleanup? In this scenario, it doesn't happen. If you 
don't explicitly (via delete or via raii) delete an object, the dtor is 
not invoked. This applies the notion that it's better to have a leak 
than a dead program. The leak is a bug to be resolved.

What would be really nice is a tool to tell us about such leaks. It 
should be possible for the GC (when configured to do so) to identify 
collected objects which have a non-default dtor. In other words, the GC 
can probably tell if a custom dtor is present (it has a different 
address than a default dtor?). If the GC finds one of these during a 
normal collection cycle, and is about to collect it, it might raise a 
runtime error to indicate the leak instance?
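As a sketch of that check (D's ClassInfo does record the dtor, though the reporting hook here is hypothetical):

     // inside the collector, when it is about to reclaim an object
     void reclaim(Object obj)
     {
         if (obj.classinfo.destructor !is null)   // custom dtor present
             reportLeak(obj.classinfo.name);      // hypothetical runtime error
         // ... free the memory without invoking the dtor ...
     }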

Anyway ~ to summarize, this would have the following effect:

1) no more bogus crashes due to dtors being invoked in an invalid state
2) no need for the dispose() pattern
3) normal collection does not invoke dtors, making it a little faster
4) there's a possibility of a tool to identify and capture leaking 
resources. Something which would be handy anyway.


For the sake of example: "unscoped" resources, such as connection-pools, 
would operate per normal in this scenario: the pool elements should be 
deleted explicitly by the hosting pool (or be treated as leaks, if they 
have a custom dtor). The pool itself would have to be deleted explicitly 
also ~ as is currently the case today ~ which can optionally be handled 
via a module-dtor.
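For example, a sketch of the module-dtor arrangement (ConnectionPool is a hypothetical class that deletes its own elements):

     ConnectionPool pool;

     static this()    // module ctor
     {
         pool = new ConnectionPool;
     }

     static ~this()   // module dtor: runs at program exit
     {
         delete pool; // dtor runs deterministically
     }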

Thoughts?
Apr 09 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 I thought it worthwhile to review the dtor behaviour and view the 
 concerns from a different direction:
 
 dtor 'state' valid:
 - explicit invocation via delete keyword
 - explicit invocation via raii
 
 dtor state 'unspecified':
 - implicitly called when no more references are held to the object
 - implicitly called when a program terminates
 
 
 Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
 Let's also assume there are dtors which expect to "clean up", and which 
 will fail when the dtor state is 'unspecified'.
 
 What happens when a programmer forgets to explicitly delete such an 
 object? Well, the program is highly likely to fail (or be in an 
 inconsistent state) after the GC collects said object. This might be 
 before or during program termination.
 
 How does one ensure this cannot occur? One obvious method would be for 
 the GC to /not/ invoke any dtor by default. While the GC would still 
 collect, such a change would ensure it cannot be the cause of a failing 
 program (it would also make the GC a little faster, but that's probably 
 beside the point).
 
 Assuming that were the case, we're left with only the two cases where 
 cleanup is explicit and the dtor state is 'valid': via the delete 
 keyword, and via raii (both of which apply the same functionality).
 
 This would tend to relieve the need for an explicit dispose() pattern, 
 since the dtor is now the equivalent?
 
 What about implicit cleanup? In this scenario, it doesn't happen. If you 
 don't explicitly (via delete or via raii) delete an object, the dtor is 
 not invoked. This applies the notion that it's better to have a leak 
 than a dead program. The leak is a bug to be resolved.
 
 What would be really nice is a tool to tell us about such leaks. It 
 should be possible for the GC (when configured to do so) to identify 
 collected objects which have a non-default dtor. In other words, the GC 
 can probably tell if a custom dtor is present (it has a different 
 address than a default dtor?). If the GC finds one of these during a 
 normal collection cycle, and is about to collect it, it might raise a 
 runtime error to indicate the leak instance?
 
 Anyway ~ to summarize, this would have the following effect:
 
 1) no more bogus crashes due to dtors being invoked in an invalid state
 2) no need for the dispose() pattern
 3) normal collection does not invoke dtors, making it a little faster
 4) there's a possibility of a tool to identify and capture leaking 
 resources. Something which would be handy anyway.
 
 
 For the sake of example: "unscoped" resources, such as connection-pools, 
 would operate per normal in this scenario: the pool elements should be 
 deleted explicitly by the hosting pool (or be treated as leaks, if they 
 have a custom dtor). The pool itself would have to be deleted explicitly 
 also ~ as is currently the case today ~ which can optionally be handled 
 via a module-dtor.
 
 Thoughts?
All of those pros you mention are valid. But you'd have one serious con:

* Any class which required cleanup would have to be manually memory managed.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 I thought it worthwhile to review the dtor behaviour and view the 
 concerns from a different direction:

 dtor 'state' valid:
 - explicit invocation via delete keyword
 - explicit invocation via raii

 dtor state 'unspecified':
 - implicitly called when no more references are held to the object
 - implicitly called when a program terminates


 Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
 Let's also assume there are dtors which expect to "clean up", and 
 which will fail when the dtor state is 'unspecified'.

 What happens when a programmer forgets to explicitly delete such an 
 object? Well, the program is highly likely to fail (or be in an 
 inconsistent state) after the GC collects said object. This might be 
 before or during program termination.

 How does one ensure this cannot occur? One obvious method would be for 
 the GC to /not/ invoke any dtor by default. While the GC would still 
 collect, such a change would ensure it cannot be the cause of a 
 failing program (it would also make the GC a little faster, but that's 
 probably beside the point).

 Assuming that were the case, we're left with only the two cases where 
 cleanup is explicit and the dtor state is 'valid': via the delete 
 keyword, and via raii (both of which apply the same functionality).

 This would tend to relieve the need for an explicit dispose() pattern, 
 since the dtor is now the equivalent?

 What about implicit cleanup? In this scenario, it doesn't happen. If 
 you don't explicitly (via delete or via raii) delete an object, the 
 dtor is not invoked. This applies the notion that it's better to have 
 a leak than a dead program. The leak is a bug to be resolved.

 What would be really nice is a tool to tell us about such leaks. It 
 should be possible for the GC (when configured to do so) to identify 
 collected objects which have a non-default dtor. In other words, the 
 GC can probably tell if a custom dtor is present (it has a different 
 address than a default dtor?). If the GC finds one of these during a 
 normal collection cycle, and is about to collect it, it might raise a 
 runtime error to indicate the leak instance?

 Anyway ~ to summarize, this would have the following effect:

 1) no more bogus crashes due to dtors being invoked in an invalid state
 2) no need for the dispose() pattern
 3) normal collection does not invoke dtors, making it a little faster
 4) there's a possibility of a tool to identify and capture leaking 
 resources. Something which would be handy anyway.


 For the sake of example: "unscoped" resources, such as 
 connection-pools, would operate per normal in this scenario: the pool 
 elements should be deleted explicitly by the hosting pool (or be 
 treated as leaks, if they have a custom dtor). The pool itself would 
 have to be deleted explicitly also ~ as is currently the case today ~ 
 which can optionally be handled via a module-dtor.

 Thoughts?
All of those pros you mention are valid. But you'd have one serious con: * Any class which required cleanup would have to be manually memory managed.
Thanks.

First, let's change the verbiage of "valid" and "unspecified" to be "deterministic" and "non-deterministic" respectively (per Don C). This makes it clear that a dtor invoked /lazily/ by the GC will be invoked in a non-deterministic state (how the GC works today). This non-deterministic state means that it's very likely any or all gc-managed references held purely by a class instance will already be collected when the relevant dtor is invoked.

The other aspect to consider is the timeliness of cleanup. Mike suggests that classes that actually have something to clean up should do so in a timely manner, and that the indicator for this is the presence of a dtor.

To get to your assertion: under the suggested model, any class with resources that need to be released should either be 'delete'd at some appropriate point, or have raii applied to it. Classes with dtors that are not cleaned up in this manner can be treated as "leaks" (and can be identified at runtime).

Thus, the term "manually memory managed" is not as clear as it might be: raii can be used to clean up, scope(exit) can be used to clean up, and an explicit 'delete' can be used to clean up. There's no malloc() or anything like that involved.

The truly serious problem with a 'lazy' cleanup is that the dtor will wind up invoked with non-deterministic state (typically leading to a serious error). The other concern with lazy cleanup is what Mike addresses (if the resource needs cleaning up, it should be done in a timely manner ~ not at some arbitrary point in the future).

What would be an example of a class requiring cleanup, which should be performed lazily? I can't think of a reasonable one off-hand, but let's take an example anyway:

Suppose I have a class that holds a file-handle. This handle should be released when the class is no longer in use. Luckily, the file-handle does not itself need to be GC-managed (it can be held by the class as an integer). This provides us with two choices ~ release the handle in a timely fashion, or release it at some undetermined point in the future (when the class is collected). We're lucky to have a choice here; it's actually something of a special case. The model suggested follows Mike's proposal that the file-handle should actually be released as soon as reasonably possible. RAII can be used to ensure that happens automagically.

What happens if said class is not raii, and is not hit with a 'delete'? The suggested model can easily identify that class instance as a "leak" when collected by the GC, and report it as such. That is: instead of the GC-collector invoking the dtor with a non-deterministic state, it instead identifies a leaking resource.

As far as automatic cleanup goes, I think D is already well armed via raii and the scope() idiom. Adopting an attitude of cleaning up resources in a timely manner will surely only be of benefit in the long run?

Another approach here is to allow the collector to invoke the dtor (as it does today), and somehow ensure that its state is fully deterministic (which is not done today). I suspect that would be notably more expensive and/or difficult to achieve? However, that also does not address Mike's concern about timely cleanup, which I think is a valid concern. Thus, I really like the simplicity of the model as described above. It also has the added bonus of eliminating the need for a redundant dispose() pattern, and makes the GC a little faster :-)

- Kris
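To make the file-handle example concrete, a sketch under the proposed model (File is hypothetical; the handle is a plain int, so the dtor touches no GC-managed state):

     class File
     {
         private int handle;                  // OS handle, not GC-managed

         this(char[] path) { /* open the file, store the handle */ }

         ~this() { /* close the handle */ }
     }

     void process()
     {
         auto File f = new File("data.txt");  // raii declaration
         // ... use f ...
     }                                        // dtor runs here, deterministically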
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 Bruno Medeiros wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Just one addendum: I was just pointing out that con; I wasn't saying it was, or was not, a bad idea overall.
 
 First, let's change the verbiage of "valid" and "unspecified" to be 
 "deterministic" and "non-deterministic" respectively (per Don C).
 
Let's not. *g* See my reply to the Don.
 
 To get to your assertion: under the suggested model, any class with 
 resources that need to be released should either be 'delete'd at some 
 appropriate point, or have raii applied to it. Classes with dtors that 
 are not cleaned up in this manner can be treated as "leaks" (and can be 
 identified at runtime).
 
 Thus, the term "manually memory managed" is not as clear as it might be: 
 raii can be used to clean up, and scope(exit) can be used to cleanup. An 
 explicit 'delete' can be used to cleanup. There's no malloc() or 
 anything like that invoved.
 
Those are all manual memory management (even if auto and scope() are much better than plain malloc/free). [Note: RAII's auto = scope(exit)] You would have automatic leak/failure detection, true.
 The truly serious problem with a 'lazy' cleanup is that the dtor will 
 wind up invoked with non-deterministic state (typically leading to a 
 serious error). The other concern with lazy cleanup is what Mike 
 addresses (if the resource needs cleaning up, it should be done in a 
 timely manner ~ not at some arbitrary point in the future).
 
The state is *undefined*; it is neither "non-deterministic" nor "deterministic". This is the kind of terminology blur-up that I was leery of. :P

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Just one addendum: I was just pointing out that con, I wasn't saying it was or was not, a bad idea overall.
 First, let's change the verbiage of "valid" and "unspecified" to be 
 "deterministic" and "non-deterministic" respectively (per Don C).
Let's not. *g* See my reply to the Don.
heheh :)
 
 To get to your assertion: under the suggested model, any class with 
 resources that need to be released should either be 'delete'd at some 
 appropriate point, or have raii applied to it. Classes with dtors that 
 are not cleaned up in this manner can be treated as "leaks" (and can 
 be identified at runtime).

 Thus, the term "manually memory managed" is not as clear as it might 
 be: raii can be used to clean up, and scope(exit) can be used to 
 cleanup. An explicit 'delete' can be used to cleanup. There's no 
 malloc() or anything like that involved.
Those are all manual memory management. (Even if auto and scope() are much better than plain malloc/free). [Note: RAII's auto = scope(exit)] You would have an automatic leak/failure detection, true.
 The truly serious problem with a 'lazy' cleanup is that the dtor will 
 wind up invoked with non-deterministic state (typically leading to a 
 serious error). The other concern with lazy cleanup is what Mike 
 addresses (if the resource needs cleaning up, it should be done in a 
 timely manner ~ not at some arbitrary point in the future).
The state is *undefined*, it is not "non-deterministic" nor "deterministic". This is the kind of terminology blur up that I was leery of. :P
:-D Terminology aside; with the current implementation, invocation of dtors during a collection often causes serious problems. That's why we see the use of close/dispose patterns in D. It would be great to avoid both of those things :p
Apr 13 2006
prev sibling next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to clean up, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Apr 10 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Bruno Medeiros wrote:
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event.

So far as non-deterministic cleanup via dtors is concerned, I think it's mostly implemented as a fail-safe. And it may be more correct to signal an error if such an object is encountered via a GC run than to simply clean it up silently, as a careful programmer might consider this a resource leak.

Sean
Apr 10 2006
next sibling parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event.
Aye
 So far as 
 non-deterministic cleanup via dtors is concerned, I think it's mostly 
 implemented as a fail-safe.  And it may be more correct to signal an 
 error if such an object is encountered via a GC run than to simply clean 
 it up silently, as a careful programmer might consider this a resource 
 leak.
Yes; that's how I feel about it also. Especially when the "silent" cleanup leads to SegFaults and such. Intended as a fail-safe, but actually a failure-causation ;-)
Apr 10 2006
prev sibling parent Georg Wrede <georg.wrede nospam.org> writes:
Sean Kelly wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event. So far as non-deterministic cleanup via dtors is concerned, I think it's mostly implemented as a fail-safe. And it may be more correct to signal an error if such an object is encountered via a GC run than to simply clean it up silently, as a careful programmer might consider this a resource leak.
Writing this kind of code demands that the programmer keeps (in his mind) a clear picture of _who_ owns the instance. Getting that unclear is a sure recipe for disaster.
Apr 10 2006
prev sibling parent Georg Wrede <georg.wrede nospam.org> writes:
kris wrote:
 Bruno Medeiros wrote:
 
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Got another idea.

It seems to me that this discussion is pretty abstract. Normally, half the participants would be talking about Apples and the other about Oranges, without either noticing. But in this D newsgroup, I believe the state of knowledge is high enough for such not to happen.

However, half of the _audience_ may not be that clear on the fact that both apples and oranges belong to the Class Magnoliopsida, and one of them to the Order Rosales and the other to Sapindales. But which? (And I certainly admit I belong to this Audience here.)

To serve and accommodate all, and to even possibly start to get potentially worthwhile commentary from a larger group of eyes, I suggest we try to construct the simplest Structure of Instances needed to display _all_ of the discussed woes. As a first draft (and not even remotely pretending it is adequate), I cast the following:

VIEW THIS IN MONOSPACE FONT
===========================

 code                     heap

 iRa -----------------> alpha ---> beta
                           ^        /
                            \      /
                             \    /
                              \  V
                              gamma
                              ^  ^
                             /    \
                            /      \
                           /        \
                          V          V
 iRb ----------------> delta <--> epsilon

(Oh, iR stands for Instance Reference, just to not get involved with the types or classes:

     SomeClass iRx = new SomeClass(); // Create a reference to an instance.

)

So, the upper half makes a singly linked list and the lower half makes a doubly linked list, and then there are two references (or D variables) pointing to the Alpha and Delta instances.

Can this structure demonstrate _all_ of the problems we're currently discussing, or should it be more complicated?
Apr 10 2006
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros  
<brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory  
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal.

Regan
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Regan Heath wrote:
 On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros 
 <brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal. Regan
Kris clearly mentioned that a class with a dtor (i.e. a class needing cleanup) being collected by the GC would be an abnormal situation (which might, or might not, be detected by the runtime).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 Regan Heath wrote:
 On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros 
 <brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal.
Kris clearly mentioned that a class with a dtor (i.e. a class needing cleanup) being collected by the GC would be an abnormal situation. (which could, or not, be detected by the runtime.)
The version of Ares released yesterday has code in place to do this. For now, you'll have to alter the finalizer if you want to do something special (dmdrt/memory.d:cr_finalize), but eventually it will probably call an onFinalizeError function in the standard library that can be hooked in a similar manner to onAssertError. The error will be signaled when the GC collects an object that has a dtor. Default behavior will likely be to ignore it and move on.

Sean
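By analogy with onAssertError, such a hook might look roughly like this (the handler type and signature are assumptions for illustration, not the actual Ares API):

     // hypothetical hook, patterned after onAssertError
     alias void function(ClassInfo info) FinalizeErrorHandler;
     FinalizeErrorHandler onFinalizeError;

     // sketch of the GC's finalize step for an object that still has a dtor
     void signalFinalizeError(Object obj)
     {
         if (onFinalizeError !is null)
             onFinalizeError(obj.classinfo);
         // default: ignore it and move on
     }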
Apr 13 2006
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end of 
 scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Apr 05 2006
parent reply Sean Kelly <sean f4.ca> writes:
Jarrett Billingsley wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end of 
 scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC-managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically. I think having both may be too confusing to be worthwhile, but it would allow for things like this:

     class LinkedList
     {
         Node top;

         ~this()   // called deterministically
         {
             for( Node n = top; n; )
             {
                 Node t = n.next;
                 delete n;
                 n = t;
             }
             finalize();
         }

         void finalize()   // called by GC
         {
             // nodes may have already been destroyed,
             // so leave them alone, but special
             // resources could be reclaimed
         }
     }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.

Sean
Apr 05 2006
next sibling parent reply kris <foo bar.com> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC-managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically. I think having both may be too confusing to be worthwhile, but it would allow for things like this:

     class LinkedList
     {
         Node top;

         ~this()   // called deterministically
         {
             for( Node n = top; n; )
             {
                 Node t = n.next;
                 delete n;
                 n = t;
             }
             finalize();
         }

         void finalize()   // called by GC
         {
             // nodes may have already been destroyed,
             // so leave them alone, but special
             // resources could be reclaimed
         }
     }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.
Yes, it is. The "death tractors" (dtors in D) are notably less than useful right now. Any dependencies are likely in an unknown state (as you note), and then, dtors are not invoked when the program exits. From what I recall, dtors are not even invoked when you "delete" an object? It's actually quite hard to nail down when they /are/ invoked :)

Regardless; any "special resources" one would, somewhat naturally, wish to clean up via dtors have to be explicitly managed via other means. This usually means a global application-list of "special stuff", which does not seem to jive with OOP very well?

On the face of it, it shouldn't be hard for the GC to invoke dtors in such a manner whereby dependencies are preserved ~ that would at least help. But then, the whole notion is somewhat worthless (in D) when it's implemented as a non-deterministic activity.

Given all that, the finalizer behaviour mentioned above sounds rather like the current death-tractor behaviour?
Apr 05 2006
next sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
 right now. Any dependencies are likely in an unknown state (as you note), 
 and then, dtors are not invoked when the program exits. From
 what I recall, dtors are not even invoked when you "delete" an object? 
 It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.

In fact, the dtors are also called on program exit - as long as they're not in some kind of array. I don't know if that's a bug, or by design, or a foggy area of the spec, or a combination of all of the above.
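A sketch of that pattern (names are illustrative):

     Object[] specialStuff;   // registered in creation order

     void cleanupAtExit()
     {
         // delete in reverse order, so dependents go before their dependencies
         for (int i = cast(int) specialStuff.length - 1; i >= 0; --i)
             delete specialStuff[i];
     }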
 Regardless; any "special resources" one would, somewhat naturally, wish
 to cleanup via dtors have to be explicitly managed via other means. This
 usually means a global application-list of "special stuff", which does not 
 seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that although it's useful, _the GC can't be trusted_. Unless a custom GC is written for every program and every possible arrangement of data, it can't know in what order to call dtors/finalizers and whatnot. So I do end up keeping lists of all types of objects that I want to be called deterministically, and delete them on program exit. I just leave the simple / common stuff (throwaway class instances, string crap) to the GC. That just makes me feel a lot better and safer.

In addition, I usually don't assume that any references a class holds are valid in the dtor. I leave the cleanup of other objects (like in Sean's example) to the other objects' dtors.
 On the face of it, it shouldn't be hard for the GC to invoke dtors in 
 such a manner whereby dependencies are preserved ~ that would at least 
 help. But then, the whole notion is somewhat worthless (in D) when it's 
 implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all class instances linearly and deleting everything, just keep running GC passes until the regular GC pass has no effect, and brute force the rest. In this way, my method of "not deleting other objects in dtors" would delete the instance of the LinkedList on the first pass, and then all the Nodes on the second, since they are now orphaned.
Apr 05 2006
parent reply kris <foo bar.com> writes:
Jarrett Billingsley wrote:
 "kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 
Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
I ended up using my own 'finalizer' since, back in the day, delete didn't invoke the dtor. It does now, so that's something. Objects that refer to anything external should probably have a close() method anyway ~ which gets us back to what Mike had noted.
 
 In fact, the dtors are also called on program exit - as long as they're not 
 in some kind of array.  I don't know if that's a bug, or by design, or a 
 foggy area of the spec, or a combination of all of the above.
Interesting. It does appear to do that now, whereas in the past it didn't. I remember a post from someone complaining that it took 5 minutes for his program to exit because the GC was run to completion on all 10,000,000,000 objects he had (or something like that). The "fix" for that appeared to be "just don't cleanup on exit", which then sidestepped all dtors. It seems something changed along the way, since dtors do indeed get invoked at program termination for a simple test program (not if an exception is thrown, though). My bad.

Does this happen consistently, then? I mean, are dtors invoked on all remaining Objects during exit? At all times? Is that even a good idea?
 
 
Regardless; any "special resources" one would, somewhat naturally, wish
to cleanup via dtors have to be explicitly managed via other means. This
usually means a global application-list of "special stuff", which does not 
seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that although it's useful, _the GC can't be trusted_. Unless a custom GC is written for every program and every possible arrangement of data, it can't know in what order to call dtors/finalizers and whatnot. So I do end up keeping lists of all types of objects that I want to be called deterministically, and delete them on program exit. I just leave the simple / common stuff (throwaway class instances, string crap) to the GC. That just makes me feel a lot better and safer.
The GC is supposed to be your friend :) That doesn't mean it should know about your design but, there again, it shouldn't abort it either. That implies any additional GC references held by a dtor Object really should be valid whenever that dtor is invoked. The fact that they're not relegates dtors to having insignificant value ~ which somehow doesn't seem right. Frankly, I don't clearly understand why they're in D at all ~ too little consistency.
 
 In addition, I usually don't assume that any references a class holds are 
 valid in the dtor.  I leave the cleanup of other objects (like in Sean's 
 example) to the other objects' dtors.
 
 
On the face of it, it shouldn't be hard for the GC to invoke dtors in 
such a manner whereby dependencies are preserved ~ that would at least 
help. But then, the whole notion is somewhat worthless (in D) when it's 
implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all class instances linearly and deleting everything, just keep running GC passes until the regular GC pass has no effect, and brute force the rest. In this way, my method of "not deleting other objects in dtors" would delete the instance of the LinkedList on the first pass, and then all the Nodes on the second, since they are now orphaned.
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
Apr 05 2006
next sibling parent reply Dave <Dave_member pathlink.com> writes:
In article <e11ki9$rtq$1 digitaldaemon.com>, kris says...
Jarrett Billingsley wrote:
 "kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 
Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
Ok, so for non-auto death tractors (that name is great):

a) non-auto D class dtors are actually what are called finalizers everywhere else, except when delete is explicitly called.

b) although dtors are eventually all called, it is non-deterministic unless the class is auto, or delete is used explicitly.

c) unless dtors are called deterministically, they could often be considered worthless since, w/ a GC handling memory, the primary reason for dtors is to release other expensive external resources.

d) there is (a lot of) overhead involved with 'dtors for every class'.

e) All this has been a major sticking-point of other languages and runtimes: they use finalizers instead of dtors (they also have Dispose, but that needs to be called explicitly), and using(...) takes the place of auto/delete. IIRC, exactly when these finalizers are called is always non-deterministic and not even guaranteed unless an explicit "full collect" is done, and a big part of this is precisely because it's so expensive.

Although I program in those languages day to day, because of this, I don't rely on anything that is going on behind the scenes, as I've always ended up explicitly "finalizing" things myself rather than relying on the GC or the using(...) statement. If you've done a lot of DB work in .NET (for example), then you'll know that doing this is sometimes as bothersome as malloc/free or new/delete (and thank God for .NET's try/finally). That is a major reason I think finalizers are useless unless they're always deterministic.

From some tests I've done in the past and recently duplicated in http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting to set a finalizer is damned expensive, and a lot of that expense is because setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past, if the finalizer overhead is removed, the current GC can actually run as fast for smallish class objects over several collections as new/delete for C++ classes or malloc/free for C structs. There is not only an expense involved in setting the finalizer, but the way it works in the current D GC is that there is overhead involved in every collection checking for finalizers, even for non-class objects. It looks to me like if all the non-deterministic finalization cruft could be removed from the GC, the *current* GC may actually be a little faster than malloc/free for class objects (at least moderately sized ones).

Long and short of it is I like Mike's ideas regarding allowing dtors only for auto classes. That way, the GC wouldn't have to deal with finalizers at all, or at least not during non-deterministic collections. It would also still allow D to claim RAII, because 'auto' classes are something new for D compared to most other languages.

It may be that taking care of the finalizer overhead issue is a must if D GCs are ever to perform as well as other languages for class objects. Kind of ironic; the goals of D are to be as powerful as C++ yet make compilers relatively easy to develop - but a side effect of those two is that really good GCs may be harder to develop than the compilers <g>

- Dave
Apr 05 2006
next sibling parent kris <foo bar.com> writes:
Dave wrote:
 In article <e11ki9$rtq$1 digitaldaemon.com>, kris says...
 
Jarrett Billingsley wrote:

"kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...


Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
Ok, so for non-auto death tractors (that name is great):

a) non-auto D class dtors are actually what are called finalizers everywhere else, except when delete is explicitly called.

b) although dtors are eventually all called, it is non-deterministic unless the class is auto, or delete is used explicitly.

c) unless dtors are called deterministically, they could often be considered worthless since, w/ a GC handling memory, the primary reason for dtors is to release other expensive external resources.

d) there is (a lot of) overhead involved with 'dtors for every class'.

e) All this has been a major sticking-point of other languages and runtimes: they use finalizers instead of dtors (they also have Dispose, but that needs to be called explicitly), and using(...) takes the place of auto/delete. IIRC, exactly when these finalizers are called is always non-deterministic and not even guaranteed unless an explicit "full collect" is done, and a big part of this is precisely because it's so expensive.

Although I program in those languages day to day, because of this, I don't rely on anything that is going on behind the scenes, as I've always ended up explicitly "finalizing" things myself rather than relying on the GC or the using(...) statement. If you've done a lot of DB work in .NET (for example), then you'll know that doing this is sometimes as bothersome as malloc/free or new/delete (and thank God for .NET's try/finally). That is a major reason I think finalizers are useless unless they're always deterministic.

From some tests I've done in the past and recently duplicated in http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting to set a finalizer is damned expensive, and a lot of that expense is because setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past, if the finalizer overhead is removed, the current GC can actually run as fast for smallish class objects over several collections as new/delete for C++ classes or malloc/free for C structs. There is not only an expense involved in setting the finalizer, but the way it works in the current D GC is that there is overhead involved in every collection checking for finalizers, even for non-class objects. It looks to me like if all the non-deterministic finalization cruft could be removed from the GC, the *current* GC may actually be a little faster than malloc/free for class objects (at least moderately sized ones).

Long and short of it is I like Mike's ideas regarding allowing dtors only for auto classes. That way, the GC wouldn't have to deal with finalizers at all, or at least not during non-deterministic collections. It would also still allow D to claim RAII, because 'auto' classes are something new for D compared to most other languages.

It may be that taking care of the finalizer overhead issue is a must if D GCs are ever to perform as well as other languages for class objects. Kind of ironic; the goals of D are to be as powerful as C++ yet make compilers relatively easy to develop - but a side effect of those two is that really good GCs may be harder to develop than the compilers <g>

- Dave
I could buy that too, if the darned "auto" keyword weren't so overloaded :-P [snip]
Apr 05 2006
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Dave" <Dave_member pathlink.com> wrote in message 
news:e11vjk$1fou$1 digitaldaemon.com...
 Long and short of it is I like Mike's ideas regarding allowing dtors for 
 only
 auto classes. In that way, the GC wouldn't have to deal with finalizers at 
 all,
 or at least during non-deterministic collections. It would also still 
 allow D to
 claim RAII because 'auto' classes are something new for D compared to most 
 other
 languages.
Hmm. 'auto' works well and good for classes whose references are local 
variables, but .. what about objects whose lifetimes aren't determined by 
the return of a function?

I.e. the Node class is used only in LinkedList. When a LinkedList is killed, 
all its Nodes must die as well. Since the Node references are kept in the 
LinkedList and not as local variables, there's no way to specify 'auto' for 
them.

Then you start getting into a catch-22. Okay, so you need to delete all 
those child Nodes in the dtor of LinkedList, meaning that LinkedList has to 
be made auto so it can have a dtor. But what if a linked list reference has 
to exist at global level, or in a struct? There is no function return to 
determine when to delete the list. So you have to make LinkedList non-auto, 
but then that means that you can't delete all those child nodes, since you 
don't have a dtor / finalizer, etc..

I think RAII is nice, but it doesn't seem to fix everything. Unless, of 
course, it were extended to deal with these odd cases.
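[A minimal sketch of that catch-22, assuming the dtor-only-for-auto rule:]

    class Node { Node next; Object data; }

    // wants a dtor to delete its Nodes, so it would have to be 'auto'...
    class LinkedList
    {
        Node head;

        ~this()
        {
            for (Node n = head; n; )
            {
                Node t = n.next;
                delete n;
                n = t;
            }
        }
    }

    // ...but an 'auto' class could not live here:
    LinkedList globalList;  // lifetime not tied to any function scope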
Apr 05 2006
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 6 Apr 2006 00:08:43 -0400, Jarrett Billingsley <kb3ctd2 yahoo.com>  
wrote:
 "Dave" <Dave_member pathlink.com> wrote in message
 news:e11vjk$1fou$1 digitaldaemon.com...
 Long and short of it is I like Mike's ideas regarding allowing dtors for
 only
 auto classes. In that way, the GC wouldn't have to deal with finalizers  
 at
 all,
 or at least during non-deterministic collections. It would also still
 allow D to
 claim RAII because 'auto' classes are something new for D compared to  
 most
 other
 languages.
Hmm. 'auto' works well and good for classes whose references are local variables, but .. what about objects whose lifetimes aren't determined by the return of a function? I.e. the Node class is used only in LinkedList. When a LinkedList is killed, all its Nodes must die as well.
Assuming the nodes contain reference(s) to resources (other than memory) 
that need to be released, right? You don't need to delete them to free 
memory; the GC should free them eventually. The same is true for any 
non-auto object which contains a sub-object that has a reference to a 
resource which must be released deterministically.

Isn't the solution therefore to make every object containing an 'auto' 
object 'auto' as well? How about this:

1) If a class has a dtor it must be auto, eg.

    class A { ~this() {} }        //error A must be auto
    auto class A { ~this() {} }   //ok

2) If a class contains a reference to an auto class, it must also be auto, 
eg.

    class B { A a; }                             //error A is auto, B must be auto too
    auto class B { A a; ~this() { delete a; } }  //ok

2a) If that class does not have a dtor it is an error.

2b) If that dtor does not delete the 'a' reference it is an error.

Speculative: Can the compiler in fact auto-generate a dtor for this class? 
One that deletes all auto references. Can it append (not prepend) that 
auto-generated dtor to any user supplied one?

3) Remove the other 'auto' class syntax, i.e.

    class A {}
    auto A a = new A();

It's either a class with resources that need to be freed, or it's not. Is 
there any need for a middle ground? (this also removes the double use of 
auto, that'll make some people happy)

Pros:
1. no more weird crashes in dtors where people reference things which are 
gone.
2. compiler finds/corrects most reference leaks automatically.
3. no more double use of 'auto'.

Cons:
1. less flexible?

I can already think of a situation where this might be too inflexible. What 
happens if you want to share an object between multiple objects? For 
example:

    auto class DatabaseConnection {}

a singleton-style shared connection to a database. You have several classes 
which share that connection, i.e.

    class UserQuery { DatabaseConnection c; }

Using the rules above, these classes would either be illegal, or get a dtor 
which auto-deletes the DatabaseConnection. The solution? Perhaps it's 
reference counting in the DatabaseConnection? Perhaps it's a new syntax to 
mark something 'shared', preventing the compiler auto-deleting it, eg.

    class UserQuery { shared DatabaseConnection c; }

Perhaps this cure is worse than the disease? Thoughts?

Regan
Apr 05 2006
prev sibling parent reply kris <foo bar.com> writes:
Jarrett Billingsley wrote:
 "Dave" <Dave_member pathlink.com> wrote in message 
 news:e11vjk$1fou$1 digitaldaemon.com...
 
Long and short of it is I like Mike's ideas regarding allowing dtors for 
only
auto classes. In that way, the GC wouldn't have to deal with finalizers at 
all,
or at least during non-deterministic collections. It would also still 
allow D to
claim RAII because 'auto' classes are something new for D compared to most 
other
languages.
Hmm. 'auto' works well and good for classes whose references are local variables, but .. what about objects whose lifetimes aren't determined by the return of a function? I.e. the Node class is used only in LinkedList. When a LinkedList is killed, all its Nodes must die as well. Since the Node references are kept in the LinkedList and not as local variables, there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example. [snip]
Apr 05 2006
parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList is 
 killed, all its Nodes must die as well.  Since the Node references are 
 kept in the LinkedList and not as local variables, there's no way to 
 specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-) Sean
Apr 06 2006
parent reply kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Jarrett Billingsley wrote:

 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList is 
 killed, all its Nodes must die as well.  Since the Node references 
 are kept in the LinkedList and not as local variables, there's no way 
 to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-) Sean
<g> Touché !
Apr 06 2006
parent reply Georg Wrede <georg.wrede nospam.org> writes:
kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList 
 is killed, all its Nodes must die as well.  Since the Node 
 references are kept in the LinkedList and not as local variables, 
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their 
responsibility to fix the pointers of the previous and/or the next item, to 
"bypass" that item??????

The mere fact that no "outside" references exist to a particular item in a 
linked list does _not_ make this item eligible for GC. Not in the current 
implementation, and I dare say, in no future implementation ever.

In other words, it is _guaranteed_ that _all_ items in a linked list are 
valid. This could be called a "linked-list-invariant". :-)
Apr 06 2006
parent reply Lars Ivar Igesund <larsivar igesund.net> writes:
Georg Wrede wrote:

 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are
 local variables, but .. what about objects whose lifetimes aren't
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList
 is killed, all its Nodes must die as well.  Since the Node
 references are kept in the LinkedList and not as local variables,
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
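[A minimal sketch of such a cycle, with a made-up Node class:]

    class Node { Node next; }

    void orphanCycle()
    {
        Node a = new Node;
        Node b = new Node;
        a.next = b;
        b.next = a;
        // on return, no root reaches either node, yet each is still
        // "pointed to" by the other ~ internal references alone don't
        // keep the pair reachable from the GC's point of view
    }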
Apr 06 2006
next sibling parent Georg Wrede <georg nospam.org> writes:
Lars Ivar Igesund wrote:
 Georg Wrede wrote:
kris wrote:
Sean Kelly wrote:
kris wrote:
Jarrett Billingsley wrote:

 Hmm.  'auto' works well and good for classes whose references are
 local variables, but .. what about objects whose lifetimes aren't
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList
 is killed, all its Nodes must die as well.  Since the Node
 references are kept in the LinkedList and not as local variables,
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
If the linked list is circular, and at the same time there's no reference to 
this list from any GC-examined area, then I'd consider this a Programmer 
Fault. Any set of "items", none of which is referenced from a "roots" area, 
is IMHO eligible for deletion, whether this set is circular or not.

In other words, we should not strive to make the GC "too smart" for its own 
good. Either we see to it that items not wished for deletion are pointed to, 
or we accept that non-pointed-to items are considered passé.
Apr 06 2006
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Lars Ivar Igesund wrote:
 Georg Wrede wrote:
 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 
 Hmm.  'auto' works well and good for classes whose
 references are local variables, but .. what about objects
 whose lifetimes aren't determined by the return of a
 function?
 
 I.e. the Node class is used only in LinkedList.  When a
 LinkedList is killed, all its Nodes must die as well.
 Since the Node references are kept in the LinkedList and
 not as local variables, there's no way to specify 'auto'
 for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
The mere existence of a circular list that is not pointed-to from the 
outside is a programmer error. Unless one explicitly wants it to be 
collected. But even then it's a programmer error if the items need 
destructing, since the collection may or may not happen "ever".

So, in practice, whenever one wants to store items that need destructors in 
a linked list, the list itself should be encapsulated in a class that can 
guarantee the timely destruction of the items, as opposed to merely 
abandoning them.
Apr 06 2006
parent Lars Ivar Igesund <larsivar igesund.net> writes:
Georg Wrede wrote:

 Lars Ivar Igesund wrote:
 Georg Wrede wrote:
 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 
 Hmm.  'auto' works well and good for classes whose
 references are local variables, but .. what about objects
 whose lifetimes aren't determined by the return of a
 function?
 
 I.e. the Node class is used only in LinkedList.  When a
 LinkedList is killed, all its Nodes must die as well.
 Since the Node references are kept in the LinkedList and
 not as local variables, there's no way to specify 'auto'
 for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
The mere existence of a circular list that is not pointed-to from the outside, is a programmer error. Unless one explicitly wants it to be collected. But even then it's a programmer error if the items need destructing, since the collection may or may not happen "ever".
Maybe it is a programmer's error, but at the same time a programmer expects 
a GC to collect memory that is no longer referenced by the program. Also, 
the list might be generated by a complex enough program to actually make it 
difficult to see that it is a circular linked list. Depending on the GC, it 
might or might not be able to reclaim this memory (or call the 
destructors/finalizers of the objects in the list), because it no longer 
explicitly knows about it.
Apr 06 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Jarrett Billingsley wrote:
 
 Interesting. It does appear to do that now, whereas in the past it 
 didn't. I remember a post from someone complaining that it took 5 
 minutes for his program to exit because the GC was run to completion on 
 all 10,000,000,000 objects he had (or something like that). The "fix" 
 for that appeared to be "just don't cleanup on exit", which then 
 sidestepped all dtors. It seems something changed along the way, since 
 dtors do indeed get invoked at program termination for a simple test 
 program (not if an exception is thrown, though). My bad.
 
 Does this happen consistently, then? I mean, are dtors invoked on all 
 remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in gc_term. There are alternatives that might work nearly as well (ie. the techniques you've used in the past) if shutdown time is an issue.
 That doesn't mean it should know about your design but, there again, it 
 shouldn't abort it either. That implies any additional GC references 
 held by a dtor Object really should be valid whenever that dtor is 
 invoked. The fact that they're not relegates dtors to having 
 insignificant value ~ which somehow doesn't seem right. Frankly, I don't 
 clearly understand why they're in D at all ~ too little consistency.
Because not having them tends to inspire people to invent their own, like the dispose() convention in Java. Having language support is preferable, even if the functionality isn't terrific.
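[The hand-rolled convention in question looks roughly like this in D ~ names 
illustrative:]

    interface Disposable
    {
        void dispose();  // deterministic cleanup, by convention only
    }

    void example(Disposable d)
    {
        try
        {
            // ... use d ...
        }
        finally
        {
            d.dispose();  // nothing but discipline makes this happen
        }
    }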
 Yeah, I was thinking about that, maybe instead of just looping through 
 all class instances linearly and deleting everything, just keep 
 running GC passes until the regular GC pass has no effect, and brute 
 force the rest. In this way, my method of "not deleting other objects 
 in dtors" would delete the instance of the LinkedList on the first 
 pass, and then all the Nodes on the second, since they are now orphaned. 
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC would ultimately just have to pick a place to start. Also, I think disentangling a complex web of references could be somewhat time intensive, and collection runs are already too slow :-) Sean
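[Aside: for the global-scope cleanup kris mentions, module dtors already 
offer one deterministic hook, since they currently run before the final 
collection ~ a minimal sketch, with gList made up:]

    private LinkedList gList;

    static this()  { gList = new LinkedList; }  // module ctor
    static ~this() { delete gList; }            // module dtor: runs at
                                                // program termination,
                                                // before the GC's final sweep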
Apr 05 2006
parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Jarrett Billingsley wrote:

 Interesting. It does appear to do that now, whereas in the past it 
 didn't. I remember a post from someone complaining that it took 5 
 minutes for his program to exit because the GC was run to completion 
 on all 10,000,000,000 objects he had (or something like that). The 
 "fix" for that appeared to be "just don't cleanup on exit", which then 
 sidestepped all dtors. It seems something changed along the way, since 
 dtors do indeed get invoked at program termination for a simple test 
 program (not if an exception is thrown, though). My bad.

 Does this happen consistently, then? I mean, are dtors invoked on all 
 remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in gc_term. There are alternatives that might work nearly as well (ie. the techniques you've used in the past) if shutdown time is an issue.
 That doesn't mean it should know about your design but, there again, 
 it shouldn't abort it either. That implies any additional GC 
 references held by a dtor Object really should be valid whenever that 
 dtor is invoked. The fact that they're not relegates dtors to having 
 insignificant value ~ which somehow doesn't seem right. Frankly, I 
 don't clearly understand why they're in D at all ~ too little 
 consistency.
Because not having them tends to inspire people to invent their own, like the dispose() convention in Java. Having language support is preferable, even if the functionality isn't terrific.
Right :) That's why what Mike suggests makes sense to me ~ only have dtor support for those classes that can actually take advantage of it, and have that enforced by the compiler. If, for example, one could also instantiate RAII classes at the global scope, then that would take care of loose ends too. If that also makes the GC execute faster, then so much the better.
 Yeah, I was thinking about that, maybe instead of just looping 
 through all class instances linearly and deleting everything, just 
 keep running GC passes until the regular GC pass has no effect, and 
 brute force the rest. In this way, my method of "not deleting other 
 objects in dtors" would delete the instance of the LinkedList on the 
 first pass, and then all the Nodes on the second, since they are now 
 orphaned. 
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC would ultimately just have to pick a place to start. Also, I think disentangling a complex web of references could be somewhat time intensive, and collection runs are already too slow :-)
I'd assumed it already followed a dependency tree to figure out the collectable allocations? But even so, it's probably better to not do any of that at all (and do what Mike suggests instead).
Apr 05 2006
prev sibling parent Sean Kelly <sean f4.ca> writes:
kris wrote:
 
 Yes, it is. The "death tractors" (dtors in D) are notably less than 
 useful right now. Any dependencies are likely in an unknown state (as 
 you note), and then, dtors are not invoked when the program exits. From 
 what I recall, dtors are not even invoked when you "delete" an object? 
 It's actually quite hard to nail down when they /are/ invoked :)
I think dtors are called whenever an object is destroyed, be it via delete or by the GC. And the GC should perform a complete clean-up on app termination. I believe this is the current behavior in both Phobos and Ares (look at internal/gc/gc.d:gc_term() in Phobos and dmdrt/memory.d:gc_term() in Ares for the shutdown cleanup code).
 Given all that, the finalizer behaviour mentioned above sounds rather 
 like the current death-tractor behaviour?
It is exactly. The dtor behavior has simply changed to be suitable for a more effective clean-up whenever the object is destroyed deterministically (ie. via delete or as an auto object). I suppose an alternative would be to pass a state flag to the dtor to indicate the manner of disposal? I really can't think of a means of implementing this that is as elegant as D deserves. Sean
Apr 05 2006
prev sibling next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:e11ca0$ht2$1 digitaldaemon.com...
 Since finalizers are called when the GC destroys an object, they are very 
 limited in what they can do.  They can't assume any GC managed object they 
 have a reference to is valid, etc.  By contrast, destructors can make this 
 assumption, because the object is being destroyed deterministically.  I 
 think having both may be too confusing to be worthwhile, but it would 
 allow for things like this:

 The argument against finalizers, as Mike mentioned, is that you typically 
 want to reclaim such special resources deterministically, so letting the 
 GC take care of this 'someday' is of questionable utility.
Thank you for that clear, concise, and un-condescending reply :)
Apr 05 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very 
limited in what they can do. They can't assume any GC managed object they 
have a reference to is valid, etc. By contrast, destructors can make this 
assumption, because the object is being destroyed deterministically. I think 
having both may be too confusing to be worthwhile, but it would allow for 
things like this:

    class LinkedList {
        ~this() {
            // called deterministically
            for( Node n = top; n; ) {
                Node t = n.next;
                delete n;
                n = t;
            }
            finalize();
        }

        void finalize() {
            // called by GC
            // nodes may have already been destroyed
            // so leave them alone, but special
            // resources could be reclaimed
        }
    }

The argument against finalizers, as Mike mentioned, is that you typically 
want to reclaim such special resources deterministically, so letting the GC 
take care of this 'someday' is of questionable utility.

Sean
Ok, I think we can tackle this problem in a better way.

So far, people have been thinking about the fact that when destructors are 
called in a GC cycle, they are called with finalizer semantics (i.e., you 
don't know if the member references are valid or not, thus you can't use 
them). This is a problem when, in a destructor, one would like to destroy 
component objects (as the Nodes of the LinkedList example). Some ideas were 
discussed here, but I didn't think any were fruitful. Like:

* Forcing all classes with destructors to be auto classes -> doesn't add any 
usefulness, instead just nuisances.

* Making the GC destroy objects in an order that makes member references 
valid -> has a high performance cost and/or is probably just not possible 
(circular references?).

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an 
object, then the member references are not valid and cannot be referenced, 
*but they can be deleted*. Each will be deleted iff it has not been deleted 
already.

I think this can be done without significant overhead. At the end of a GC 
cycle, the GC already has a list of all objects that are to be deleted. 
Thus, on the release phase, it could be modified to keep a flag indicating 
whether each object has already been deleted or not. Then, when LinkedList 
deletes a Node, the delete is only performed if the object has not already 
been deleted.

Still, while the previous idea might be good, it's not optimal, because we 
are not clearly perceiving the problem/issue at hand. What we *really* want 
is to directly couple the lifecycle of a component (member) object with its 
composite (owner) object. A Node of a LinkedList has the same lifecycle as 
its LinkedList, so a Node shouldn't even be an independent 
garbage-collection-managed element.

What we want is an allocator that allocates memory that is not to be claimed 
by the GC (but which is to be scanned by the GC). Its behavior is exactly 
like the allocator of http://www.digitalmars.com/d/memory.html#newdelete 
but it should come with the language and be available for all types. With 
usage like:

    class LinkedList {
        ...
        Add(Object obj) {
            Node node = mnew Node(blabla);
            ...
        }

Thus, when the destructor is called upon a LinkedList, either explicitly or 
by the GC, the Node references will always be valid. One has to be careful 
now, as mnew'ed objects are effectively under manual memory management, and 
so every mnew must have a corresponding delete, lest there be dangling 
pointers or memory leaks. Nonetheless it seems to be the only sane solution 
to this problem.

Another interesting addition is to extend the concept of auto to class 
members. Just as currently auto couples the lifecycle of a variable to the 
enclosing function, an auto class member would couple the lifecycle of a 
member to its owner object. It would get deleted implicitly when the owner 
object got deleted. Here is another (made up) example:

    class SomeUIWidget {
        auto Color fgcolor;
        auto Color bgcolor;
        auto Size size;
        auto Image image;
        ...

The auto members would then have to be initialized in a constructor or 
something (the exact restrictions might vary, such as being final or not).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
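[Aside: the memory.html page cited above already shows how to approximate 
this per class today ~ malloc'd memory that the GC scans but never reclaims. 
A rough sketch along those lines (not the proposed mnew; exact Phobos names 
may differ):]

    import std.c.stdlib;
    import std.gc;
    import std.outofmemory;

    class Node
    {
        new(size_t sz)
        {
            void* p = std.c.stdlib.malloc(sz);
            if (!p)
                throw new OutOfMemoryException();
            std.gc.addRange(p, p + sz);  // scanned by the GC, never freed by it
            return p;
        }

        delete(void* p)
        {
            if (p)
            {
                std.gc.removeRange(p);
                std.c.stdlib.free(p);
            }
        }

        Node next;
        Object data;
    }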
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 
 Jarrett Billingsley wrote:

 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically.

[snip]
Ok, I think we can tackle this problem in a better way.

[snip]

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an object, then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already.

[snip]

What we want is an allocator that allocates memory that is not to be claimed by the GC (but which is to be scanned by the GC). Its behavior is exactly like the allocator of http://www.digitalmars.com/d/memory.html#newdelete but it should come with the language and be available for all types.

[snip]
Regardless of how it's implemented, what's needed is a bit of consistency. 
Currently, dtors are invoked with two entirely different world-states: with 
valid state, and with unspecified state. What makes this generally 
unworkable is the fact that (a) the difference in state is often critical to 
the operation of the dtor, and (b) there's no clean way to tell the 
difference.

I use a bit of a hack to distinguish between the two: a common module has a 
global variable set to true when the enclosing module-dtor is invoked. This 
obviously depends upon module-dtors being first (which they currently are, 
but that is not in the spec). Most of you will probably be going "eww" at 
this point, but it's the only way I found to make dtors consistent and thus 
usable. Further, this is only workable if the dtor() itself can be abandoned 
when in state (b) above; prohibiting the use of dtors for a whole class of 
cleanup concerns, and forcing one to defer to the dispose() or close() 
pattern ~~ some say anti-pattern.

As I understand it, the two states correspond to (1) an explicit 'delete' of 
the object, which includes "auto" usage; and (2) implicit cleanup via the 
GC.

The suggestion to restrict dtors to 'auto' classes is a means to limit them 
to deterministic destruction (but it doesn't address the notion of object 
lifetimes that are not related to scope ~ such as time-based). That would 
need to be addressed somehow?

Turning to your suggestions ~ the 'marking' of references such that they can 
be "deleted" multiple times is perhaps questionable, partly because it 
appears to be specific to the GC implementation? I imagine an incremental 
collector would have problems with this approach, even if it were workable 
with a "stop the world" collector? I don't know for sure, but suspect 
there'd be issues there somewhere.

Whatever the resolution, consistency should be the order of the day.

- Kris
Apr 09 2006
next sibling parent kris <foo bar.com> writes:
kris wrote:
 I use a bit of a hack to distinguish between the two: a common module 
 has a global variable set to true when the enclosing module-dtor is 
 invoked. This obviously depends upon module-dtors being first (which 
 they currently are, but that is not in the spec). Most of you will 
 probably be going "eww" at this point, but it's the only way I found to 
 make dtors consistent and thus usable. Further, this is only workable if 
 the dtor() itself can be abandoned when in state (b) above; prohibiting 
 the use of dtors for a whole class of cleanup concerns, and forcing one 
 to defer to the dispose() or close() pattern ~~ some say anti-pattern.
After reading, that paragraph does not reflect the status-quo at all ...

First, it should have said "used" instead of "use" (past-tense ~ this is not 
applied any more, since dtors have all but been abandoned). Second, the 
identification of "state" was limited to program termination only ~ the 
classes in question were actually collected only at that point. Third, the 
cleanup did not rely on GC managed memory. All in all, that paragraph is 
pretty darned misleading ~~ my bad :-(

The take-home message is that I did not find a general mechanism to 
distinguish between valid-state and unspecified-state for a dtor ~ the 
oft-crucial inconsistency remains in its fully-fledged guise.

The other issue is that I clearly should avoid posting whilst hallucinating. 
Sorry;
Apr 09 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 
 
 Regardless of how it's implemented, what's needed is a bit of 
 consistency. Currently, dtors are invoked with two entirely different 
 world-states: with valid state, and with unspecified state. What makes 
 this generally unworkable is the fact that (a) the difference in state 
 is often critical to the operation of the dtor, and (b) there's no clean 
 way to tell the difference.
 
Hum, from what you said follows a rather trivial alternative solution to the 
problem: have the destructor receive an implicit parameter/variable that 
indicates whether it was called explicitly or as a finalizer (i.e., in a GC 
run). (This would be similar in semantics to Sean's suggestion of separating 
the destruction and finalize methods.)

    class LinkedList {
        ~this() {   // called manually/explicitly and automatically
            if(explicit) {
                for( Node n = top; n; ) {
                    Node t = n.next;
                    delete n;
                    n = t;
                }
            }
            // ... finalize here
        }
        ...

Would this be acceptable? How would this compare to other suggestions? I can 
think of a few things to say about it versus my earlier suggestion.
 
 As I understand it, the two states correspond to (1) an explicit 
 'delete' of the object, which includes "auto" usage; and (2) implicit 
 cleanup via the GC.
 
 The suggestion to restrict dtors to 'auto' classes is a means to limit 
 them to deterministic destruction (but it doesn't address the notion 
 of object lifetimes that are not related to scope ~ such as time-based). 
 That would need to be addressed somehow?
 
 Turning to your suggestions ~ the 'marking' of references such that they 
 can be "deleted" multiple times is perhaps questionable, partly because 
 it appears to be specific to the GC implementation? I imagine an 
 incremental collector would have problems with this approach, even if it 
 were workable with a "stop the world" collector? I don't know for sure, 
 but suspect there'd be issues there somewhere.
 
It works for a stop-the-world collector, I'm sure. As for an incremental collector, hum... well, it works if the collector guarantees the following:

* The collector determines a set S of objects to be reclaimed, and no object in S is referenced outside of S.
 Whatever the resolution, consistency should be the order of the day.
 
 - Kris
Manual and automatic memory management are two very different paradigms that are likely impossible or impractical to make "consistent" or to reconcile, at least in the way you are implying. The "only auto classes have destructors" suggestion only makes it "consistent" because it limits the usage of the class to only one paradigm (manual management).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
parent kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 Regardless of how it's implemented, what's needed is a bit of 
 consistency. Currently, dtors are invoked with two entirely different 
 world-states: with valid state, and with unspecified state. What makes 
 this generally unworkable is the fact that (a) the difference in state 
 is often critical to the operation of the dtor, and (b) there's no 
 clean way to tell the difference.
Hum, from what you said follows a rather trivial alternative solution to the problem: have the destructor receive an implicit parameter/variable that indicates whether it was called explicitly or as a finalizer (i.e., in a GC run).

[snip]

Would this be acceptable? How would this compare to other suggestions?
Perhaps it would be better as an optional parameter? This certainly would 
allow for lazy dtors that don't need timely cleanup. Although I can't think 
of any reasonable examples to illustrate with.

However, it clearly exposes the "uneasy" status that a dtor might find 
itself in. For that reason it seems a bit like a hack on top of a queasy 
problem (to me). In cases like these I tend to think it's better to start 
off constrained and deterministic (remove those 'lazy' non-deterministic 
dtor invocations), and then optionally open things up as is deemed 
necessary, or when a resolution to the non-determinism is found.
Apr 10 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's behavior 
 is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:
 
   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }
 
 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. One 
 has to be careful now, as mnew'ed object are effectively under manual 
 memory management, and so every mnew must have a corresponding delete, 
 lest there be dangling pointer ou memory leaks. Nonetheless it seems to 
 be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done 
now without the addition of new keywords by adding two new GC functions: 
release and reclaim (bad names, but they're all I could think of). 'release' 
would tell the GC not to automatically finalize or delete the memory block, 
as you've suggested above, and 'reclaim' would transfer ownership back to 
the GC. It's more error prone than I'd like, but also perhaps the most 
reasonable.

A possible alternative would be for the GC to perform its cleanup in two 
stages. The first sweep runs all finalizers on orphaned objects, and the 
second releases the memory. Thus in Eric's example on d.D.learn, he would be 
able to legally iterate across his AA and close all HANDLEs, because the 
memory would still be valid at that stage.

Assuming there aren't any problems with this latter idea, I think it should 
be implemented as standard behavior for the GC, and the former idea should 
be provided as an option. Thus the user would have complete manual control 
available when needed, but more foolproof basic behavior for simpler 
situations.
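[A sketch of the two-stage idea in pseudocode ~ all names hypothetical:]

    // stage 1: run every finalizer while all orphaned memory is still
    // intact, so a finalizer may safely traverse other orphans
    // (e.g. walk an AA and close each HANDLE)
    foreach (obj; unreachableObjects)
        runFinalizer(obj);

    // stage 2: only now release the memory
    foreach (obj; unreachableObjects)
        releaseMemory(obj);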
 Another interesting addition, is to extend the concept of auto to class 
 members. Just as currently auto couples the lifecycle of a variable to 
 the enclosing function, an auto class member would couple the lifecycle 
 of its member to it's owner object. It would get deleted implicitly when 
 then owner object got deleted. Here is another (made up) example:
 
   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...
 
 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out. Sean
Apr 09 2006
next sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless it 
 seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable. A possible alternative would be for the GC to peform its cleanup in two stages. The first sweep runs all finalizers on orphaned objects, and the second releases the memory. Thus in Eric's example on d.D.learn, he would be able legally iterate across his AA and close all HANDLEs because the memory would still be valid at that stage. Assuming there aren't any problems with this latter idea, I think it should be implemented as standard behavior for the GC, and the former idea should be provided as an option. Thus the user would have complete manual control available when needed, but more foolproof basic behavior for simpler situations.
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get deleted 
 implicitly when then owner object got deleted. Here is another (made 
 up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or 
 not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
If the above case was written as:

    class SomeUIWidget {
        Color fgcolor;
        Color bgcolor;
        Size size;
        Image image;
        ...

and the class didn't have an explicit destructor, then the only "damage" at 
GC (or otherwise destruction) time would be that a couple of Color 
instances, a Size instance and an Image instance would be "left over" after 
that particular GC run.

Big deal? At the next GC run (unless they'd be pointed-to by other things), 
they'd get deleted too. No major flood of tears here.

Somehow I fear folks are making this a way too complicated thing.
Apr 09 2006
parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Georg Wrede wrote:
 Sean Kelly wrote:
 Bruno Medeiros wrote:

 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless 
 it seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable. A possible alternative would be for the GC to peform its cleanup in two stages. The first sweep runs all finalizers on orphaned objects, and the second releases the memory. Thus in Eric's example on d.D.learn, he would be able legally iterate across his AA and close all HANDLEs because the memory would still be valid at that stage. Assuming there aren't any problems with this latter idea, I think it should be implemented as standard behavior for the GC, and the former idea should be provided as an option. Thus the user would have complete manual control available when needed, but more foolproof basic behavior for simpler situations.
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get 
 deleted implicitly when then owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
If the above case was written as [SomeUIWidget without the auto members] and the class didn't have an explicit destructor, then the only "damage" at GC (or otherwise destruction) time would be that a couple of Color instances, a Size instance and an Image instance would be "left over" after that particular GC run. Big deal? At the next GC run (unless they'd be pointed-to by other things), they'd get deleted too.

[snip]
Actually, with any decent GC, all of those objects will be reclaimed on the 
first GC run (and DMD does that). So you are correct that there is no 
difference when running the GC on that object. But you miss the point.

The point (of my suggestions) was to be able to have a destruction system 
that would work "correctly/extensively" both when called by a GC cycle, and 
when called explicitly (outside of a GC cycle). By "correctly/extensively" I 
mean that the destructor would be able in both cases to ensure the 
destruction of its owned resources.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless it 
 seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable.
Hum, indeed.
 A possible alternative would be for the GC to peform its cleanup in two 
 stages.  The first sweep runs all finalizers on orphaned objects, and 
 the second releases the memory.  Thus in Eric's example on d.D.learn, he 
 would be able legally iterate across his AA and close all HANDLEs 
 because the memory would still be valid at that stage.
 
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects, that are not referenced by anyone?
 Assuming there aren't any problems with this latter idea, I think it 
 should be implemented as standard behavior for the GC, and the former 
 idea should be provided as an option.  Thus the user would have complete 
 manual control available when needed, but more foolproof basic behavior 
 for simpler situations.
 
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get deleted 
 implicitly when then owner object got deleted. Here is another (made 
 up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or 
 not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out. Sean
Hum, true, it would need some additional bookkeeping; I didn't realize that 
immediately. Semantics like those I mentioned in my previous post would 
suffice:

"When a destructor is called upon an object during a GC (i.e., "as a 
finalizer"), then the member references are not valid and cannot be 
referenced, *but they can be deleted*. Each will be deleted iff it has not 
been deleted already in the reclaiming phase."

I don't think your algorithm (having a hidden pointer) would be necessary 
(or even feasible), and the one I mentioned before would suffice.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
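[A sketch of the bookkeeping this implies, as hypothetical GC internals:]

    // each block slated for reclamation carries a 'finalized' flag
    void deleteDuringCollection(Block* b)
    {
        if (b.finalized)
            return;         // deleted already this cycle: no-op
        b.finalized = true;
        runDtor(b.object);  // memory itself is freed only after the
    }                       // whole release phase completes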
Apr 10 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:

 A possible alternative would be for the GC to perform its cleanup in 
 two stages.  The first sweep runs all finalizers on orphaned objects, 
 and the second releases the memory.  Thus in Eric's example on 
 d.D.learn, he would be able to legally iterate across his AA and close 
 all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects that are not referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion could be used for more complex cases.
 Another interesting addition is to extend the concept of auto to 
 class members. Just as auto currently couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to its owner object. It would get 
 deleted implicitly when the owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized in a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize that immediately. Semantics like those that I mentioned in my previous post would suffice: "When a destructor is called upon an object during a GC (i.e., "as a finalizer"), then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already in the reclaiming phase." I don't think your algorithm (having a hidden pointer) would be necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run? In that case, the GC may encounter the member objects before the owner object. I suppose bookkeeping at the member level may not be necessary, but it may result in an extra scan through the list of objects to be finalized to determine who owns what. Sean
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 Sean Kelly wrote:

 A possible alternative would be for the GC to perform its cleanup in 
 two stages.  The first sweep runs all finalizers on orphaned objects, 
 and the second releases the memory.  Thus in Eric's example on 
 d.D.learn, he would be able to legally iterate across his AA and close 
 all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects that are not referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion could be used for more complex cases.
That way, you have the guarantee that all references are valid, but some instances would have their destructors called multiple times. That's likely a behavior that isn't acceptable in some cases.
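
For instance (a sketch): if a LinkedList's dtor deletes its Nodes, and the GC then runs the finalizer sweep over every object in S anyway, each Node's dtor runs twice:

  class Node {
    Node next;
    ~this() { /* runs once via the explicit delete, then again in the sweep */ }
  }

  class LinkedList {
    Node head;
    ~this() {
      for( Node n = head; n; ) {
        Node t = n.next;
        delete n;   // first dtor call; the GC's own sweep makes the second
        n = t;
      }
    }
  }
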
 Another interesting addition is to extend the concept of auto to 
 class members. Just as auto currently couples the lifecycle of a 
 variable to the enclosing function, an auto class member would 
 couple the lifecycle of its member to its owner object. It would 
 get deleted implicitly when the owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized in a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize that immediately. Semantics like those that I mentioned in my previous post would suffice: "When a destructor is called upon an object during a GC (i.e., "as a finalizer"), then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already in the reclaiming phase." I don't think your algorithm (having a hidden pointer) would be necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run? In that case, the GC may encounter the member objects before the owner object. I suppose bookkeeping at the member level may not be necessary, but it may result in an extra scan through the list of objects to be finalized to determine who owns what. Sean
The bookkeeping is done by the GC and memory pool manager. A scan through the list of objects to be finalized is necessary, but it won't be an _extra_ scan. Let me try to explain this way:

*** The current GC algorithm: ***

delete obj:

   m = getMemManagerHandle(obj);
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     delete m;
   }

*** The extended GC algorithm: ***

delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }

And there we go. No increase in algorithmic complexity. There is only an increase in the Memory Manager record size (we need a flag for m.isDeleted, and we need it only during a GC run). The reason we don't freeMemory(m) right after delete m; is that we need the bookkeeping of m.isDeleted until the end of the GC run. The reason we have m.isGarbage is to allow the deletion of objects not in S during the GC run (it is an optimization of doing "S.contains(m)").

Hope I don't have a bug up there :P

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent reply pragma <pragma_member pathlink.com> writes:
In article <e1m6mo$19d$1 digitaldaemon.com>, Bruno Medeiros says...
*** The extended GC algorithm: ***

delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }
Something like this will help *part* of the problem.  By delaying the freeing of referenced memory, dynamically allocated primitives (like arrays) will continue to function inside of class destructors.

However, this does not help with references to objects and structs, as they may still be placed in an invalid state by their own destructors.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ writefln("resource: %d",a.resource); }
/**/ }

Depending on the ordering in S, the program will output either "resource: 42" or "resource: 0".  The problem only gets worse for object cycles.  I'm not saying it won't work, but it just moves the wrinkle into a different area to be stomped out.

Now, one way to improve this is if there were a standard method on objects that can be checked in situations like these.  That way you'd know if another object is finalized, or in the process of being finalized.
   foreach(m in S) {
     m.isFinalized = true;
     delete m;
   }
Now this doesn't make life any easier, but it does make things deterministic.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }

(another option would be something like gc.isFinalized(a), should the footprint of Object be an issue)

Now B outputs nothing if A is finalized.  That seems like a win, but what if B really needed that value before A went away?  In such a case, you're back to square one: you can't depend on the state of another referenced object within a dtor, valid reference or otherwise.

- EricAnderton at yahoo
Apr 13 2006
parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
pragma wrote:
 In article <e1m6mo$19d$1 digitaldaemon.com>, Bruno Medeiros says...
 *** The extended GC algorithm: ***

 delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

 GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }
Something like this will help *part* of the problem.  By delaying the freeing of referenced memory, dynamically allocated primitives (like arrays) will continue to function inside of class destructors.

However, this does not help with references to objects and structs, as they may still be placed in an invalid state by their own destructors.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ writefln("resource: %d",a.resource); }
/**/ }

Depending on the ordering in S, the program will output either "resource: 42" or "resource: 0".  The problem only gets worse for object cycles.  I'm not saying it won't work, but it just moves the wrinkle into a different area to be stomped out.
True, I forgot to mention that. The order of destruction is undefined, so it will only work with objects where that order doesn't matter. (that should be the case with most)
 Now, one way to improve this is if there were a standard method on objects that
 can be checked in situations like these.  That way you'd know if another object
 is finalized, or in the process of being finalized.
 
   foreach(m in S) {
     m.isFinalized = true;
     delete m;
   }
Now this doesn't make life any easier, but it does make things deterministic.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }

(another option would be something like gc.isFinalized(a), should the footprint of Object be an issue)

Now B outputs nothing if A is finalized.  That seems like a win, but what if B really needed that value before A went away?  In such a case, you're back to square one: you can't depend on the state of another referenced object within a dtor, valid reference or otherwise.

- EricAnderton at yahoo
Exactly, you can't really solve the order/state problem with this. I think the only way to do it is to manually memory-manage the member objects (with a construct such as mnew or otherwise).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 14 2006
prev sibling parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 Bruno Medeiros wrote:
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). Its 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed objects are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointers or memory leaks. Nonetheless 
 it seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable.
Hum, indeed.
Then again, with a proper allocator (mnew) there is room for more optimization. I doubt one would want (or that it would be good) to change the management ownership of an instance during its lifetime. Rather, it should be set right from the start (when allocated).

Also, I've realized just now that with templates one can get a pretty close solution, with something like:

  mnew!(Foobar)

The shortcoming is that you won't be able to use non-default constructors in that call.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
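
As a sketch of such a template (untested; it leans on std.gc.addRange/removeRange in the way the memory.html example does, and note that this crude version runs no constructor at all, which is even more restrictive than merely losing non-default constructors):

  import std.c.stdlib;
  import std.gc;

  T mnew(T)() {
    size_t sz = T.classinfo.init.length;
    void* p = std.c.stdlib.malloc(sz);
    if (p is null)
      throw new Exception("mnew: out of memory");
    std.gc.addRange(p, cast(byte*)p + sz);         // scanned, but never collected
    (cast(byte*)p)[0 .. sz] = T.classinfo.init[];  // blit the default instance state
    return cast(T)p;
  }

A matching mdelete would have to run the destructor, call removeRange, and free the block; that part is omitted here.
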
Apr 14 2006
prev sibling next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
Some ideas were discussed here, but I didn't think any were fruitful. Like:
  *Forcing all classes with destructors to be auto classes -> doesn't 
add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Apr 09 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Mike Capp wrote:
 In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
 Some ideas were discussed here, but I didn't think any were fruitful. Like:
  *Forcing all classes with destructors to be auto classes -> doesn't 
 add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Protection attributes and casts add usefulness (not gonna detail why). Forcing all classes with destructors to be auto classes, on the other hand, severely limits the usage of such classes. An auto class cannot be a global, a static, a field, or an inout or out parameter. It must be bound to a function, and *cannot be a part of another data structure*. This latter restriction, as is, is unacceptable, no?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1dak2$21d9$1 digitaldaemon.com>, Bruno Medeiros says...
Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from misusing something. Same with auto and dtors. If a class needs a dtor, leaving it to the GC qualifies as misuse in my view.
Forcing all classes with destructors to be auto classes, on the other 
hand, severely limits the usage of such classes. An auto class cannot 
be a global, a static, a field, or an inout or out parameter. It must be bound to 
a function, and *cannot be a part of another data structure*. This 
latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion, and I think the dtors-for-autos-only restriction would quickly force this problem out into the open. It may be that we're agreeing on the destination and only differing on how to get there. cheers Mike
Apr 10 2006
parent Don Clugston <dac nospam.com.au> writes:
Mike Capp wrote:
 In article <e1dak2$21d9$1 digitaldaemon.com>, Bruno Medeiros says...
 Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from misusing something. Same with auto and dtors. If a class needs a dtor, leaving it to the GC qualifies as misuse in my view.
 Forcing all classes with destructors to be auto classes, on the other 
 hand, severely limits the usage of such classes. An auto class cannot 
 be a global, a static, a field, or an inout or out parameter. It must be bound to 
 a function, and *cannot be a part of another data structure*. This 
 latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion, and I think the dtors-for-autos-only restriction would quickly force this problem out into the open.
I suspect that if finalisers were abolished, those other restrictions would be MUCH easier to lift. They probably exist mainly because of the complexity of the interactions with the GC.
 
 It may be that we're agreeing on the destination and only differing on how to
 get there.
 
 cheers
 Mike
 
 
Apr 10 2006
prev sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 10 Apr 2006 11:05:00 +0100, Bruno Medeiros  
<brunodomedeirosATgmail SPAM.com> wrote:
 Mike Capp wrote:
 In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
 Some ideas were discussed here, but I didn't think any were fruitful.  
 Like:
  *Forcing all classes with destructors to be auto classes -> doesn't  
 add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Protection attributes and casts add usefulness (not gonna detail why). Forcing all classes with destructors to be auto classes, on the other hand, severely limits the usage of such classes. An auto class cannot be a global, a static, a field, or an inout or out parameter. It must be bound to a function, and *cannot be a part of another data structure*. This latter restriction, as is, is unacceptable, no?
The suggestion I made assumed we could remove these restrictions. I'm not sure whether that's true or not; if it is impossible, can someone explain to me why? I would be curious to know. It seems that if C++ can have classes at module/file scope that have destructors, why can't D? I have a feeling it has something to do with how Walter has implemented it.. but that _could_ change, if the reasons were strong enough, right?

If we assume (for the purposes of exploring the solution) that the restrictions can be removed, doesn't this idea make a lot of sense?

1. any class/module with a dtor must be 'auto'.
2. any class/module containing a reference to an 'auto' class/module must be 'auto'.
3. the 'auto' keyword as used here: "auto Foo f = new Foo();" is not required; remove it.
4. a 'shared' keyword is used to indicate a shared 'auto' resource.

Rationale: if a class has cleanup to do, it must be done all the time, not just sometimes and not selectively*. Therefore any class with cleanup to do is 'auto', and any class containing an 'auto' member also has cleanup to do, thus must be 'auto'.

(*) the exception to this rule is a member reference to a shared resource, thus the 'shared' keyword.

The compiler can auto-generate dtors for classes containing 'auto' members, eg.

  auto class File {
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  auto class Foo {
    File f;
    // auto-generated dtor:
    ~this() { delete f; }
  }

If the user supplies a dtor, the compiler can simply append its auto-dtor to the end of that (I don't think deleting a reference twice is a problem). In this way 'auto' propagates itself as required. (In fact, if you think about it, the keyword 'auto' isn't really even required. It can be removed and the behaviour outlined above can simply be implemented.)

The 'shared' keyword would prevent the automatic dtor from calling delete on the shared reference. If that reference was the only 'auto' member, it would therefore prevent the class from being 'auto' itself. The user would have to manage the shared resource manually, or rather, can rely on it being deleted by the (one and only) non-shared reference to it, eg.

  [file.d]
  File a = new File("a.txt");

  class Foo {
    shared File foo;
    this(File f) { foo = f; }
  }

  void main() {
    Foo f = new Foo(a);
  }

The class 'Foo' is not auto: it has no dtor, the compiler does not generate one, and its shared reference 'foo' is never deleted. The module-level reference 'a' is auto; an auto-generated module-level dtor will delete it.

The classes affected by this idea are few, I'd say less than 20% (even with 'auto' propagating up the class tree); the rest will have no dtor and will simply be collected as normal by the GC, no dtor calls required.

As far as I can see there are no restrictions of use for this idea. Classes will be the same as they are today, only they'll have deterministic destruction where required. Assuming, of course, it can actually be implemented.

Regan
Apr 10 2006
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 
 Jarrett Billingsley wrote:

 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do.  They can't assume any GC managed object they have a reference to is valid, etc.  By contrast, destructors can make this assumption, because the object is being destroyed deterministically.  I think having both may be too confusing to be worthwhile, but it would allow for things like this:

  class LinkedList {
    ~this() { // called deterministically
      for( Node n = top; n; ) {
        Node t = n.next;
        delete n;
        n = t;
      }
      finalize();
    }

    void finalize() { // called by GC
      // nodes may have already been destroyed
      // so leave them alone, but special
      // resources could be reclaimed
    }
  }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.

Sean
Ok, I think we can tackle this problem in a better way. So far, people have been thinking about the fact that when destructors are called in a GC cycle, they are called with finalizer semantics (i.e., you don't know if the member references are valid or not, thus you can't use them). This is a problem when, in a destructor, one would like to destroy component objects (as the Nodes of the LinkedList example).

Some ideas were discussed here, but I didn't think any were fruitful. Like:

 *Forcing all classes with destructors to be auto classes -> doesn't add any usefulness, instead just nuisances.

 *Making the GC destroy objects in an order that makes member references valid -> has a high performance cost and/or is probably just not possible (circular references?).

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an object, then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already.

I think this can be done without significant overhead. At the end of a GC cycle, the GC already has a list of all objects that are to be deleted. Thus, on the release phase, it could be modified to have a flag indicating whether the object was already deleted or not. Thus, when LinkedList deletes a Node, the delete is only made if the Node has not already been deleted.
If an instance is deleted by the GC, the pointers that it may have to other instances (of the same class, or of other classes) vanish. All of those other instances may or may not have other pointers pointing to them. So, deleting (or destructing) a particular instance should not in any way "cascade" to those other instances. On the next run, the GC _may_ notice that those other instances are not pointed to by anything anymore, and then it may delete/destruct them.

---

So much for "regular" instance deletion. Then we have the case where the instance "owns" some scarce resource (a file handle, a port, or some such). Such instances should be destructed in a _timely_ fashion _only_, right?

In other words, instances that need explicit destruction should be destructed _at_the_moment_ they become obsolete -- and not "mañana".

It is conceivable that the "regular" instances do not have explicit destructors (after all, their memory footprint would just be released to the free pool), whereas the "resource owning" instances really do need an explicit destructor.

Thus, the existence of an explicit destructor should be a sign that makes [us, Walter, the compiler, anybody] understand that such an instance _needs_ to be destructed _right_away_.

This makes one think of "auto". Now, there have been several comments like /auto can't work/ because we don't know the scope of the instance. That is just BS. Every piece of source code should be written "hierarchically" (that is, not the entire program as "one function"). When one refactors the goings-on in the program to short procedures, then it all of a sudden is not too difficult to use "auto" to manage the lifetime of instances.
 Still, while the previous idea might be good, it's not optimal, 
 because we are not clearly apperceiving the problem/issue at hand. What 
 we *really* want is to directly couple the lifecycle of a component 
 (member) object with its composite (owner) object. A Node of a 
 LinkedList has the same lifecycle as its LinkedList, so Node shouldn't 
 even be an independently Garbage-Collection-managed element.
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). Its behavior 
 is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:
 
   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }
 
 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. One 
 has to be careful now, as mnew'ed objects are effectively under manual 
 memory management, and so every mnew must have a corresponding delete, 
 lest there be dangling pointers or memory leaks. Nonetheless it seems to 
 be the only sane solution to this problem.
 
 
 Another interesting addition is to extend the concept of auto to class 
 members. Just as auto currently couples the lifecycle of a variable to 
 the enclosing function, an auto class member would couple the lifecycle 
 of its member to its owner object. It would get deleted implicitly when 
 the owner object got deleted. Here is another (made up) example:
 
   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...
 
 The auto members would then have to be initialized in a constructor or 
 something (the exact restrictions might vary, such as being final or not).
 
 
Apr 09 2006
parent reply kris <foo bar.com> writes:
Georg Wrede wrote:
[snip]

 So much for "regular" instance deletion. Then, we have the case where 
 the instance "owns" some scarce resource (a file handle, a port, or some 
 such). Such instances should be destructed in a _timely_ fashion _only_, 
 right?
 
 In other words, instances that need explicit destruction, should be 
 destructed _at_the_moment_ they become obsolete -- and not "mañana".
 
 It is conceivable that the "regular" instances do not have explicit 
 destructors (after all, their memory footprint would just be released to 
 the free pool), wherease the "resource owning" instances really do need 
 an explicit destructor.
 
 Thus, the existence of an explicit destructor should be a sign that 
 makes [us, Walter, the compiler, anybody] understand that such an 
 instance _needs_ to be destructed _right_away_.
 
 This makes one think of "auto". Now, there have been several comments 
 like /auto can't work/ because we don't know the scope of the instance. 
 That is just BS. Every piece of source code should be written 
 "hierarchically" (that is, not the entire program as "one function"). 
 When one refactors the goings-on in the program to short procedures, 
 then it all of a sudden is not too difficult to use "auto" to manage the 
 lifetime of instances.
That was all sounding reasonable up until this point :)

I think we can safely put aside the entire-program-as-one-function as unrealistic. Given that, and assuming the existence of a dtor implies "auto" (and thus raii), how does one manage a "pool" of resources? For example, how about a pool of DB connections? Let's assume that they need to be correctly closed at some point, and that the pool is likely to expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive with scoped raii? Assuming it doesn't, then one would presumably revert to a manual dispose() pattern with such things?
Apr 09 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existence of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope.

cheers
Mike
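
Something like this, roughly (a sketch only; the names, the Connection class holding the actual DB handle, and the growth policy are all assumed, and it glosses over the current auto-class restrictions discussed elsewhere in this thread):

  auto class Connection {
    ~this() { /* close the real DB handle here */ }
  }

  auto class ConnectionPool {
    private Connection[] idle;

    Connection acquire() {
      if (idle.length == 0)
        return new Connection();        // grow the pool on demand
      Connection c = idle[idle.length - 1];
      idle.length = idle.length - 1;
      return c;
    }

    void surrender(Connection c) {
      idle ~= c;                        // back into the pool; nothing is closed
    }

    ~this() {                           // end of main(): really close them all
      foreach (c; idle)
        delete c;
    }
  }

  auto class ConnectionUsage {
    private ConnectionPool pool;
    Connection conn;

    this(ConnectionPool p) { pool = p; conn = p.acquire(); }
    ~this() { pool.surrender(conn); }   // borrowed, so just hand it back
  }

Note that ConnectionUsage's dtor does not close anything; only ConnectionPool's dtor (or its culling logic) actually deletes connections.
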
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
 
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existence of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumeably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks!

So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).

What I'm getting at here is a potential complexity in the implementation of pool-style designs. Perhaps not a big deal, but something to be learned anyway? And it retains a need for the dispose() pattern?

I /think/ I prefer the simplicity of removing dtor invocation from the GC instead (see post "GC and dtors ~ a different approach?"). How about you?
Apr 09 2006
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function as  
 unrealistic. Given that, and assuming the existence of a dtor implies  
 "auto" (and thus raii), how does one manage a "pool" of resources? For  
 example, how about a pool of DB connections? Let's assume that they  
 need to be correctly closed at some point, and that the pool is likely  
 to expand and contract based upon demand over time ...

 So the question is how do those connections, and the pool itself, jive  
 with scoped raii? Assuming it doesn't, then one would presumably  
 revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Apr 09 2006
parent reply kris <foo bar.com> writes:
Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 
 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function 
 as  unrealistic. Given that, and assuming the existence of a dtor 
 implies  "auto" (and thus raii), how does one manage a "pool" of 
 resources? For  example, how about a pool of DB connections? Let's 
 assume that they  need to be correctly closed at some point, and 
 that the pool is likely  to expand and contract based upon demand 
 over time ...

 So the question is how do those connections, and the pool itself, 
 jive  with scoped raii? Assuming it doesn't, then one would 
 presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
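
In outline, the change amounts to no more than this (pseudocode in the style of the algorithms earlier in the thread):

  GC (normal collection):

    GC determines a set S of instances to be reclaimed (garbage);
    foreach(m in S) {
      freeMemory(m);   // note: no dtor call here
    }

  delete obj / raii end-of-scope:

    obj.~this();       // dtors run only on these deterministic paths
    freeMemory(obj);
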
Apr 09 2006
next sibling parent reply Dave <Dave_member pathlink.com> writes:
In article <e1cfpo$100u$1 digitaldaemon.com>, kris says...
Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 
 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function 
as  unrealistic. Given that, and assuming the existence of a dtor 
 implies  "auto" (and thus raii), how does one manage a "pool" of 
 resources? For  example, how about a pool of DB connections? Let's 
 assume that they  need to be correctly closed at some point, and 
 that the pool is likely  to expand and contract based upon demand 
 over time ...

 So the question is how do those connections, and the pool itself, 
 jive  with scoped raii? Assuming it doesn't, then one would 
presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of people defining ~this() and it (inadvertently) never getting called, even at program exit?

Hmmm, if that's so, I'd add one thing -- how about something like a "fullCollect(bool finalize = false)" that would be called with 'true' at the end of dmain(), and could be explicitly called by the programmer? That could run into the problem of dtors invoked in an invalid state, but at least then it would still be deterministic (either the program ending normally or the programmer calling fullCollect(true)).

BTW - I must have missed it, but what would be an example of a dtor called in an invalid state?

Thanks,

- Dave
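
As a sketch, the shape of that hook (hypothetical; today's std.gc.fullCollect takes no arguments):

  // hypothetical addition to std.gc:
  void fullCollect(bool finalize = false);

  // run automatically at the end of dmain(), or explicitly by the programmer:
  std.gc.fullCollect(true);   // collect *and* run the remaining dtors
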
Apr 09 2006
parent kris <foo bar.com> writes:
Dave wrote:
 In article <e1cfpo$100u$1 digitaldaemon.com>, kris says...
 
Regan Heath wrote:

On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:


Mike Capp wrote:


In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...


I think we can safely put aside the entire-program-as-one-function 
as  unrealistic. Given that, and assuming the existence of a dtor 
implies  "auto" (and thus raii), how does one manage a "pool" of 
resources? For  example, how about a pool of DB connections? Let's 
assume that they  need to be correctly closed at some point, and 
that the pool is likely  to expand and contract based upon demand 
over time ...

So the question is how do those connections, and the pool itself, 
jive  with scoped raii? Assuming it doesn't, then one would 
presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of people defining ~this() and it (inadvertently) never gets called, even at program exit? Hmmm if that's so, I'd add one thing -- how about something like a "fullCollect(bool finalize = false)" that would be called with 'true' at the end of dmain(), and could be explicitly called by the programmer? That could run into the problem of dtors invoked in an invalid state, but at least then it would still be deterministic (either the program ending normally or the programmer calling fullCollect(true)). BTW - I must have missed it, but what would be an example of a dtor called in an invalid state? Thanks, - Dave
See post entitled "GC & dtors ~ a different approach" at 6:17pm ?
Apr 09 2006
prev sibling next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:

 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function  
 as  unrealistic. Given that, and assuming the existance of a dtor  
 implies  "auto" (and thus raii), how does one manage a "pool" of  
 resources? For  example, how about a pool of DB connections? Let's  
 assume that they  need to be correctly closed at some point, and  
 that the pool is likely  to expand and contract based upon demand  
 over time ...

 So the question is how do those connections, and the pool itself,  
 jive  with scoped raii? Assuming it doesn't, then one would  
 presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc., and if it takes adding a keyword and/or new compiler behaviour, it's a small price to pay IMO.
 only the GC is modified to remove the dtor call during a normal  
 collection cycle. Invoking delete or raii just works as always, yet the  
 invalid dtor state is eliminated. It also eliminates the need for a  
 dispose() pattern, which would be nice ;-)
At least this idea stops people doing things they shouldn't in dtors.

What I think we need to do is come up with several concrete use-cases (actual code) which use resources that need to be released, and explore how each suggestion would affect that code. For example, I'm still not convinced the linked-list use-case mentioned here several times requires any explicit cleanup code; isn't it all just memory to be freed by the GC? Can someone post a code example and explain why it does, please.

It seems to me that as modules already have ctors/dtors, my suggestion can simply treat a module like a class, i.e. automatically adding a dtor (or appending to an existing dtor) which deletes the (non-shared) auto class instances at module level.

Regan
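
For instance, treating the module like a class might mean the compiler appending something like this (a sketch; File is the auto class from the earlier examples, and the dtor shown is the one the compiler would generate):

  // file.d
  File a;                        // module-level 'auto' instance

  static this()  { a = new File("a.txt"); }

  // compiler-appended module dtor (module ctors/dtors already exist):
  static ~this() { delete a; }   // non-shared auto instances deleted here
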
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Regan Heath wrote:
 On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 On the other hand, all these concerns would melt away if the GC were  
 changed to not invoke the dtor (see related post). The beauty of that  
 approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc and if it takes adding a keyword and/or new compiler behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a different approach" ? I just hacked up the collector in Ares to do what is described in that post. The quick hack doesn't do the leak-detection part, but the rest of it works fine (there may well be cases I've overlooked but the obvious ones, 'delete' and raii, now invoke the dtor whereas normal collection does not).
Apr 09 2006
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 21:18:45 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:
 On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 On the other hand, all these concerns would melt away if the GC were   
 changed to not invoke the dtor (see related post). The beauty of that   
 approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc and if it takes adding a keyword and/or new compiler behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a different approach" ?
I have. Here is what you say WRT leaks:
 What about implicit cleanup? In this scenario, it doesn't happen. If you  
 don't explicitly (via delete or via raii) delete an object, the dtor is  
 not invoked. This applies the notion that it's better to have a leak  
 than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution.

In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).

Regan
Apr 09 2006
parent reply kris <foo bar.com> writes:
Regan Heath wrote:

 I have. Here is what you say WRT leaks:
 
 What about implicit cleanup? In this scenario, it doesn't happen. If 
 you  don't explicitly (via delete or via raii) delete an object, the 
 dtor is  not invoked. This applies the notion that it's better to have 
 a leak  than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution. In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).
I thought the idea was that classes with dtors are /intended/ to be explicitly cleaned up? That, implicit cleanup of resources (manana, some time) was actually a negative aspect? At least, that's what Mike was suggesting, and it seemed like a really good idea. Along those lines, what I was suggesting is to enable dtors for explicit cleanup only. Plus an optional runtime leak detector. I guess I like the simplicity of that. What you suggest seems workable too, but perhaps a little more involved?
Apr 09 2006
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 22:21:39 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:

 I have. Here is what you say WRT leaks:

 What about implicit cleanup? In this scenario, it doesn't happen. If  
 you  don't explicitly (via delete or via raii) delete an >object, the  
 dtor is  not invoked. This applies the notion that it's better to have  
 a leak  than a dead program. The leak is a bug >to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution. In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).
I thought the idea was that classes with dtors are /intended/ to be explicitly cleaned up?
Not my idea ;) I think any given resource has a correct time/place for cleanup; we just need a way to specify that, ideally one that does so while avoiding as much human error as possible (AKA resource leaks).
 That, implicit cleanup of resources (manana, some time) was actually a  
 negative aspect? At least, that's what Mike was suggesting, and it  
 seemed like a really good idea.
It's certainly a simple solution to the problem, and it may be that it's also the best; more use-cases will convince me (at least) one way or the other.
 Along those lines, what I was suggesting is to enable dtors for explicit  
 cleanup only. Plus an optional runtime leak detector. I guess I like the  
 simplicity of that. What you suggest seems workable too, but perhaps a  
 little more involved?
It's certainly more involved. It can't be done without changes to the compiler, but once those are in place it can guarantee resources are cleaned up and it can guarantee no leaks occur (assuming I'm not missing something obvious). The price paid for that is some flexibility (perhaps, perhaps not - I want more use-cases to try it with), but I reckon the price is worth the benefit.

Regan
Apr 09 2006
prev sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...
I thought the idea was that classes with dtors are /intended/ to be 
explicitly cleaned up? That, implicit cleanup of resources (manana, some 
time) was actually a negative aspect? At least, that's what Mike was 
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches. cheers Mike
Apr 10 2006
parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...
 
I thought the idea was that classes with dtors are /intended/ to be 
explicitly cleaned up? That, implicit cleanup of resources (manana, some 
time) was actually a negative aspect? At least, that's what Mike was 
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
Apr 10 2006
parent reply Don Clugston <dac nospam.com.au> writes:
kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what Mike 
 was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
Apr 10 2006
next sibling parent Mike Capp <mike.capp gmail.com> writes:
In article <e1dfmc$29r2$1 digitaldaemon.com>, Don Clugston says...
kris wrote:
 Mike Capp wrote:
 Um... can we avoid using "implicit" and "explicit" in this context? 
Yeah, I see the murk. What would you prefer to call them?
deterministic and non-deterministic.
Yes. Which pretty much correspond to "important" and "don't care". cheers Mike
Apr 10 2006
prev sibling next sibling parent kris <foo bar.com> writes:
Don Clugston wrote:
 kris wrote:
 
 Mike Capp wrote:

 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
Thank you;
Apr 10 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Don Clugston wrote:
 kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because *currently* explicit destruction is deterministic, and implicit destruction is non-deterministic), the fact of whether the destructor was called deterministically or non-deterministically is not in itself relevant to this issue. What is relevant is the state of the object to be destroyed (in defined or undefined state). Nor is implicit destruction/collection inherently non-deterministic and vice-versa. (even if systems that operated this way would be impractical) So far, I'm keeping the terms "implicit" and "explicit", as they seem adequate to me and I don't find at all that RAII collection is "implicit" or "without writing any code". -- Bruno Medeiros - CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Bruno Medeiros wrote:
 Don Clugston wrote:
 kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because *currently* explicit destruction is deterministic, and implicit destruction is non-deterministic), the fact of whether the destructor was called deterministically or non-deterministically is not in itself relevant to this issue.
I'm not sure that this is correct, see below.
 What is relevant is the state of the object to 
 be destroyed (in defined or undefined state).
 Nor is implicit destruction/collection inherently non-deterministic and 
 vice-versa.  (even if systems that operated this way would be unpractical)
Yes, you're right, a finaliser could be invoked immediately whenever the last reference goes out of scope. But I think (not sure) that the issues with finalisers would disappear if they were deterministic in this manner. At least, I'm confident that non-deterministic scope-based destructors would suffer from the same problems that finalisers do.
 So far, I'm keeping the terms "implicit" and "explicit", as they seems 
 adequate to me and I don't find at all that RAII collection is 
 "implicit" or "without writing any code".
However, RAII has been contrasted with "explicit" memory management for a very long time. "Explicit" has a firmly established meaning of 'new' and 'delete'; it's very confusing to use these terms to mean something entirely different. (If, however, the distinction is between "gc" and "non-gc", let's call a spade a spade.) On this topic -- there's an interesting thread on comp.c++ by Andrei Alexandrescu about gc and RAII. Among other things, he argues that finalisers are a flawed concept that shouldn't be included. (BTW, he seems to be getting *very* interested in D -- he now has a link to the D spec on his website, for example -- so his opinions are worth examining).
Apr 18 2006
parent reply Sean Kelly <sean f4.ca> writes:
Don Clugston wrote:
 
 On this topic -- there's an interesting thread on comp.c++ by Andrei 
 Alexandrescu about gc and RAII. Among other things, he argues that 
 finalisers are a flawed concept that shouldn't be included. (BTW, he 
 seems to be getting *very* interested in D -- he now has a link to the D 
 spec on his website, for example -- so his opinions are worth examining).
This seems in line with some of the other ideas discussed in this thread, and with what I'm trying out with this latest release of Ares. The idea is that the runtime code will be aware of how an object is being destroyed, be it by the GC or by some other means. Currently, that's as far as it goes unless you want to modify the finalizer function and rebuild the runtime, but the next release will include a hookable callback in the standard library similar to onAssertError. This will allow the user to decide upon which behavior is most appropriate, and to do so on a per-class basis as I am planning to pass either the original class pointer or simply a ClassInfo object. For a debug build it may be appropriate to report the error and terminate (say via an assert) while some release applications may want to be a bit more lenient. This does impose a restriction on standard library code however, as it must behave as if non-deterministic finalization is always illegal. This isn't terribly difficult to accomplish, but it's something to be aware of. Sean
Apr 18 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
the next release [of Ares] will include a 
hookable callback in the standard library similar to onAssertError. 
This will allow the user to decide upon which behavior is most 
appropriate, and to do so on a per-class basis as I am planning to pass 
either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way? cheers Mike
Apr 18 2006
parent reply Sean Kelly <sean f4.ca> writes:
Mike Capp wrote:
 In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
 the next release [of Ares] will include a 
 hookable callback in the standard library similar to onAssertError. 
 This will allow the user to decide upon which behavior is most 
 appropriate, and to do so on a per-class basis as I am planning to pass 
 either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way?
I'm not entirely sure it would be possible to catch every instance of this at compile-time. That aside, I very much want to avoid anything requiring compiler changes unless Walter is the one to implement them, and really to avoid any fundamental changes in application behavior without Walter's approval. This is one reason I've chosen to add this feature via a hookable callback that defaults to existing behavior (ie. to ignore the problem and continue). The other being that I'm not convinced such errors always warrant termination, particularly for release builds. To clarify, I've added two callbacks and a user-callable function to my local build:

     void setCollectHandler( collectHandlerType h );
     extern (C) void onCollectResource( ClassInfo info );

onCollectResource is called whenever the GC collects an object that has a dtor, and if no user-supplied handler is provided then the call is a no-op. I may yet replace the ClassInfo object with an Object reference, but haven't decided whether doing so offers much over the current version.

     extern (C) void onFinalizeError( ClassInfo c, Exception e );

onFinalizeError is called whenever an Exception is thrown from an object dtor and will effectively terminate the application with a message. This is accomplished by wrapping the passed exception in a new system-level exception object and re-throwing. Things get a bit weird if e is an OutOfMemoryException, but that's a possibility I'm ignoring for now. Sean
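As a rough usage sketch (assuming collectHandlerType is something along the lines of 'void function( ClassInfo )' -- the exact alias is still up in the air):

     extern (C) int printf( char*, ... );

     void myCollectHandler( ClassInfo info )
     {
         // report the class whose dtor is about to run
         // non-deterministically; a debug build might assert instead
         printf( "collecting %.*s with an unrun dtor\n", info.name );
     }

     static this()
     {
         setCollectHandler( &myCollectHandler );
     }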
Apr 18 2006
parent Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 Mike Capp wrote:
 In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
 the next release [of Ares] will include a hookable callback in the 
 standard library similar to onAssertError. This will allow the user 
 to decide upon which behavior is most appropriate, and to do so on a 
 per-class basis as I am planning to pass either the original class 
 pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way?
As per Kris' suggestion, the (future) behavior of onCollectResource in Ares has changed slightly. The call now has the following format:

     extern (C) bool onCollectResource( Object obj );

Default behavior is as before--to silently clean up the object and continue. However, if the user has supplied a cleanup handler and it returns 'false' then the object's dtors will not be called. Instead, the user code is expected to have cleaned things up another way. Thus the user has a selection of options to choose from, in order of complexity:

     * Report the error and continue, returning 'true'.
     * Report the error and terminate the application.
     * Clean up the object's resources by some other means and return
       'false'.

The final option is to allow the user to write dtors that always assume referenced objects are valid while allowing execution to continue if such objects are encountered by the garbage collector (currently, dereferencing a GCed object in a dtor may cause an access violation if the referenced object has already been cleaned up). I'll admit that this last option provides a lot more rope than seems prudent, but it also makes for some interesting possibilities and I'm curious to see how things work out :-) Sean
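By way of example, a handler exercising the third option might look something like this ('MyConnection' and its close() are invented for the sake of the sketch; registration is via setCollectHandler as before):

     class MyConnection
     {
         void close() { /* release the underlying handle */ }
     }

     bool myCollectHandler( Object obj )
     {
         if ( auto c = cast(MyConnection) obj )
         {
             c.close();      // release the resource ourselves ...
             return false;   // ... and tell the GC to skip the dtors
         }
         return true;        // everything else: clean up as usual
     }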
Apr 18 2006
prev sibling parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Regan Heath wrote:
 
 What I think we need to do is come up with several concrete use-cases 
 (actual code) which use resources which need to be released and explore 
 how each suggestion would affect that code, for example I'm still not 
 convinced the linklist use-case mentioned here several times requires any 
 explicit cleanup code, isn't it all just memory to be freed by the GC? 
 Can someone post a code example and explain why it does please.
 
 
 Regan
See my reply to Georg: news://news.digitalmars.com:119/e1dg8t$2akn$2 digitaldaemon.com -- Bruno Medeiros - CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 
 On the other hand, all these concerns would melt away if the GC were 
 changed to not invoke the dtor (see related post). The beauty of that 
 approach is that there's no additional keywords or compiler behaviour; 
 only the GC is modified to remove the dtor call during a normal 
 collection cycle. Invoking delete or raii just works as always, yet the 
 invalid dtor state is eliminated. It also eliminates the need for a 
 dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've not tried it) as follows:

     Object o = new MyObject;
     gc_setFinalizer( o, null );

Sean
Apr 10 2006
parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 On the other hand, all these concerns would melt away if the GC were 
 changed to not invoke the dtor (see related post). The beauty of that 
 approach is that there's no additional keywords or compiler behaviour; 
 only the GC is modified to remove the dtor call during a normal 
 collection cycle. Invoking delete or raii just works as always, yet 
 the invalid dtor state is eliminated. It also eliminates the need for 
 a dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've not tried it) as follows:

     Object o = new MyObject;
     gc_setFinalizer( o, null );
Nearly, but not quite the same. This certainly disables the dtor for the given object, but if you forget to do it, your dtor will be called with an 'unspecified' (what Don called non-deterministic) state. Plus, there's no option for capturing leaks.

I believe it's far better to stop the GC from invoking the dtor in those cases where the state is unspecified: the system would become fully deterministic, the need for a dispose() pattern goes away ('delete'/raii takes over), expensive resources that should be released quickly are always treated in that manner (consistently) or treated as leaks otherwise, and the GC runs a little faster.

There's the edge-case whereby someone wants a dtor to be invoked lazily by the collector, at some point in the future. That puts us back into the non-deterministic dtor state, and is a model that Mike was suggesting should be removed anyway (because classes that need to release something should do so as quickly as possible). I fully agree with Mike on this aspect, but wonder whether a simple implementation might suffice instead (GC change only)?

Essentially what I'm suggesting is adding this to the documentation: "a class dtor is invoked via the use of 'delete' or raii only. This guarantees that (a) classes holding external or otherwise "expensive" resources will release them in a timely manner, (b) the dtor will be invoked with a fully deterministic state ~ all memory references held by a class instance will be valid when the dtor is invoked, and (c) there's no need for redundant cleanup-patterns such as dispose()"

- Kris
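To make that documentation entry concrete, a small sketch (names mine) of the only two paths that would run a dtor under this rule:

     class File
     {
         ~this() { /* close the OS handle */ }
     }

     void example()
     {
         auto File f = new File;   // raii: dtor runs at end of scope

         File g = new File;
         delete g;                 // explicit: dtor runs right here

         File leaked = new File;   // neither: the GC reclaims the
     }                             // memory, but the dtor never runs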
Apr 10 2006
prev sibling parent Dave <Dave_member pathlink.com> writes:
In article <e1c9tc$p14$1 digitaldaemon.com>, Mike Capp says...
In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existance of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumeably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
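Something along these lines (a rough sketch only -- all names invented, error handling elided, and the factory shape bent around the language limitations mentioned above):

     class Connection
     {
         void close() { /* release the DB handle */ }
     }

     class ConnectionPool
     {
         private Connection[] idle;

         Connection acquire()
         {
             if ( idle.length )
             {
                 Connection c = idle[idle.length - 1];
                 idle.length = idle.length - 1;
                 return c;              // reuse a pooled connection
             }
             return new Connection;     // or expand the pool on demand
         }

         void release( Connection c )
         {
             idle ~= c;                 // back into the pool
         }

         ~this()
         {
             foreach ( c; idle )
                 c.close();             // pool owns final cleanup
         }
     }

     auto class ConnectionUsage         // 'auto' forces scoped use
     {
         Connection conn;
         private ConnectionPool pool;

         this( ConnectionPool p ) { pool = p; conn = p.acquire(); }
         ~this() { pool.release( conn ); }   // returned at scope exit
     }

     void query( ConnectionPool pool )
     {
         auto ConnectionUsage u = new ConnectionUsage( pool );
         // ... use u.conn ...
     }   // connection goes back to the pool here

     void main()
     {
         auto ConnectionPool pool = new ConnectionPool;  // raii in main
         query( pool );
     }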
That's a missing part of the puzzle - up until now IMO the changes to the compiler would have been minimal to support "only autos can have dtors". Now it would require another change to the language, in that 'auto' is not currently allowed for module scope classes. To support that, I guess there would have to be code inserted along the lines of module static dtors for auto class objects declared at module scope (except it would have to also check that each class had been actually instantiated, obviously). Then I guess there would be good potential for circular reference problems like there is for module ctors and dtors with imported modules. So the compiler would then have to insert runtime checks like it does for module ctors now, which makes things yet more complicated.
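For the sake of argument, the lowering might look roughly like this (MyPool is invented; this is just the module static ctor/dtor machinery as it exists today):

     class MyPool { ~this() { /* cleanup */ } }

     // What a module-scope 'auto' might lower to: an ordinary module
     // static ctor/dtor pair, subject to the same circular-import checks.
     private MyPool pool;

     static this()  { pool = new MyPool; }
     static ~this() { delete pool; }   // runs only if the ctor ran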
Apr 09 2006