
digitalmars.D - auto classes and finalizers

reply Sean Kelly <sean f4.ca> writes:
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 	
       of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called

Given that D can have lexical destruction of objects that weren't 
explicitly designed for it, ie.

     class C {}
     auto C c = new C();

Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.


Sean
Apr 05 2006
next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e10pk7$2khb$1 digitaldaemon.com>, Sean Kelly says...
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 	
       of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or finalizer) should be illegal for a GC type - it should only be allowed for a class declared as 'auto'. If you need dtor-like behaviour, you should not be using GC, and the compiler should tell you so.

I posted this opinion some weeks back in a similar discussion here, expecting to be chased out of town with pitchforks, but the response was very positive. Nobody could think of any counterexamples, at any rate.

cheers
Mike
Apr 05 2006
parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e10pk7$2khb$1 digitaldaemon.com>, Sean Kelly says...
 
I've been following a thread on GC in c.l.c++.m and something Herb 
posted about C++/CLI today got me thinking:

    - a type can have a destructor and/or a finalizer
    - the destructor is called upon a) explicit delete or b) at end 	
      of scope for auto objects
    - the finalizer is called if allocated on the gc heap and the
      destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above?  This 
would allow objects to explicitly delete all their contained data in 
instances where they are being used as auto objects, rather than always 
relying on the GC for this purpose.  I'll admit I don't particularly 
like the idea of separate finalize() and ~this() methods, but it seems 
an attractive enough feature that something along these lines may be 
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or finalizer) should be illegal for a GC type - it should only be allowed for a class declared as 'auto'. If you need dtor-like behaviour, you should not be using GC, and the compiler should tell you so.

I posted this opinion some weeks back in a similar discussion here, expecting to be chased out of town with pitchforks, but the response was very positive. Nobody could think of any counterexamples, at any rate.

cheers
Mike
Mike;

Instead of making the dtor illegal for GC types, why not remove the 'auto' keyword from this realm altogether, and just use the existence of a dtor as the class RAII indicator? Thus, any class with a dtor is automatically RAII. When the dtor is actually invoked, all relevant GC allocations should still be intact; yes?

What to do about those classes that need a dtor-like construct, but cannot be deemed RAII? Be explicit about closing them, using the close() or dispose() approach.

Thoughts?

- Kris
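For illustration, a sketch of how that rule might read in code (hypothetical classes, under the proposed semantics rather than current D behaviour):

     class File
     {
         private int handle;
         ~this() { /* release the handle */ }   // dtor present => class is RAII
     }

     class ConnectionCache
     {
         // no dtor, so ordinary GC lifetime; cleanup is explicit
         void close() { /* release the scarce resource */ }
     }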
Apr 06 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e12fva$29gr$1 digitaldaemon.com>, kris says...
Mike;

Instead of making the dtor illegal for GC types, why not remove the 
'auto' keyword from this realm altogether, and just use the existence of 
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
What to do about those classes that need a dtor-like construct, but 
cannot be deemed RAII? Be explicit about closing them, using the close() 
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either. The "dispose" (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.

cheers
Mike
Apr 06 2006
next sibling parent Georg Wrede <georg.wrede nospam.org> writes:
Mike Capp wrote:
 kris says...
 
 Instead of making the dtor illegal for GC types, why not remove the
 'auto' keyword from this realm altogether, and just use the
 existence of a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
FWIW, I fully agree.
 What to do about those classes that need a dtor-like construct, but
 cannot be deemed RAII? Be explicit about closing them, using the
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either. The "dispose" (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.
Yes, and thinking of a class that needs "destructing", which then may happen much later (at GC time), or never at all -- is just insanity.
Apr 06 2006
prev sibling parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e12fva$29gr$1 digitaldaemon.com>, kris says...
 
Mike;

Instead of making the dtor illegal for GC types, why not remove the 
'auto' keyword from this realm altogether, and just use the existence of 
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody reading the code. They'd have to go and look at the class definition. I'm happy to do a little extra typing for the sake of code clarity here, in the same way that having "in" and "ref" arguments marked as such by calls as well as decls was a nice touch.
Yes, that is true.
What to do about those classes that need a dtor-like construct, but 
cannot be deemed RAII? Be explicit about closing them, using the close() 
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet which the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example. Others might include network handshaking at termination, and so on. Such things are often wrapped via a class, with the expectation that said class can encapsulate the cleanup process. Their scope (or life expectancy) is often intended to span a considerable period of time.

In some cases it might be possible to arrange the code such that these entities are actually scoped on the stack (for RAII purposes), where the enclosing function doesn't exit until termination time. However, others often have a life expectancy based upon "activity" ~ a classic example might be cached database resources, where the life expectancy of the object has nothing to do with scope per se, but is instead often based upon a period of dormancy or inactivity.
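As a sketch of that activity-based lifetime (names hypothetical; the point is that cleanup hangs off a dormancy check, not a lexical scope):

     class PooledConnection
     {
         long lastUsed;                       // time of last activity
         void close() { /* release the database handle */ }
     }

     class ConnectionPool
     {
         PooledConnection[] idle;

         // called periodically: close connections dormant for too long
         void sweep(long now, long maxIdle)
         {
             foreach (c; idle)
                 if (now - c.lastUsed > maxIdle)
                     c.close();
         }
     }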
 The "dispose"
 (anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the
extreme.
This is the option left open after the discovery that dtor() is pretty much worthless. I agree that a better solution is needed.
Apr 06 2006
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 06 Apr 2006 11:48:07 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 What to do about those classes that need a dtor-like construct, but  
 cannot be deemed RAII? Be explicit about closing them, using the  
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example.
I did, and I also suggested some solutions:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462
 - reference counting.
 - a new 'shared' keyword.

The idea in that thread (it isn't really a new idea) is essentially what Kris said above:
 Instead of making the dtor illegal for GC types, why not remove the  
 'auto' keyword from this realm altogether, and just use the existence  
 of a dtor as the class RAII indicator?
In my case: removal of 'auto' from object instance declarations, but requiring it on class definitions when a dtor is present, plus requiring it on classes containing classes which are 'auto'. After all, a dtor indicates some (non-memory) cleanup needs to be done, making it RAII by definition, no? And any class containing a reference that needs cleanup will itself need cleanup, right?

I think we need to try and come up with some examples of where it can't work, and/or decide what the limitations are and whether they're an inappropriate cost to pay for what I think could be quite a safe system to write RAII in.

You said:
   The trouble is that this wouldn't make the RAII behaviour apparent to  
 somebody reading the code. They'd have to go and look at the class  
 definition. I'm happy to do a little extra typing for the sake of code  
 clarity here, in the same way that having "in" and "ref" arguments  
 marked as such by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost, much like it does for 'out' etc. function parameters.

Regan
Apr 06 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Where to attach this post... aah well, this seems as good a spot as any,
I guess...

I won't pretend I'm an expert in these things, but it seems to me that
adding reference counting to D's wide range of memory management options
would solve most of these problems, yes?

The main case for keeping dtors with GCed objects is that sometimes you
have an object that needs to be cleaned up in some fashion, but which
isn't (or can't easily be) tied to a particular stack frame.  If you
made this class reference counted, then it would be cleaned up the
second the last reference goes out of scope.

The common drawback is the argument that you then have to watch out for
cycles, but Python seems to be coping fine--it has a generational cycle
checker as far as I understand it, and I've seen papers for creating
thread-safe generational checkers so that wouldn't need to be a problem.

I think having lazy GC, RAII, manual memory management and ref. counting
would cover just about everything you could possibly want to do.

Plus, it'd be a great gloating point: "D: memory management YOUR way!"

	-- Daniel

P.S.  I beg forgiveness if I've oversimplified this.
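As a sketch of what Daniel describes (manual reference counting layered on D classes; the names are hypothetical, and `delete this` assumes no other references remain):

     class RefCounted
     {
         private int refs = 1;   // the creator holds the first reference

         void acquire() { ++refs; }

         void release()
         {
             if (--refs == 0)
                 delete this;    // the dtor then runs deterministically
         }

         ~this() { /* release the non-memory resource */ }
     }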

Regan Heath wrote:
 On Thu, 06 Apr 2006 11:48:07 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 What to do about those classes that need a dtor-like construct, but
 cannot be deemed RAII? Be explicit about closing them, using the
 close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program, yet the OS cannot clean up by default. This includes external hardware which should be reset or otherwise released and, more commonly, various types of scant resources used for purposes of optimization ~ Regan noted database resources, which are a good example.
I did, and I also suggested some solutions:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462
 - reference counting.
 - a new 'shared' keyword.

The idea in that thread (it isn't really a new idea) is essentially what Kris said above:
 Instead of making the dtor illegal for GC types, why not remove the
 'auto' keyword from this realm altogether, and just use the
 existence of a dtor as the class RAII indicator?
In my case: removal of 'auto' from object instance declarations, but requiring it on class definitions when a dtor is present, plus requiring it on classes containing classes which are 'auto'. After all, a dtor indicates some (non-memory) cleanup needs to be done, making it RAII by definition, no? And any class containing a reference that needs cleanup will itself need cleanup, right?

I think we need to try and come up with some examples of where it can't work, and/or decide what the limitations are and whether they're an inappropriate cost to pay for what I think could be quite a safe system to write RAII in.

You said:
   The trouble is that this wouldn't make the RAII behaviour apparent
 to somebody reading the code. They'd have to go and look at the class
 definition. I'm happy to do a little extra typing for the sake of code
 clarity here, in the same way that having "in" and "ref" arguments
 marked as such by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost, much like it does for 'out' etc. function parameters.

Regan
-- v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
Apr 17 2006
prev sibling parent reply kris <foo bar.com> writes:
I thought it worthwhile to review the dtor behaviour and view the 
concerns from a different direction:

dtor 'state' valid:
- explicit invocation via delete keyword
- explicit invocation via raii

dtor state 'unspecified':
- implicitly called when no more references are held to the object
- implicitly called when a program terminates


Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
Let's also assume there are dtors which expect to "clean up", and which 
will fail when the dtor state is 'unspecified'.

What happens when a programmer forgets to explicitly delete such an 
object? Well, the program is highly likely to fail (or be in an 
inconsistent state) after the GC collects said object. This might be 
before or during program termination.

How does one ensure this cannot occur? One obvious method would be for 
the GC to /not/ invoke any dtor by default. While the GC would still 
collect, such a change would ensure it cannot be the cause of a failing 
program (it would also make the GC a little faster, but that's probably 
beside the point).

Assuming that were the case, we're left with only the two cases where 
cleanup is explicit and the dtor state is 'valid': via the delete 
keyword, and via raii (both of which apply the same functionality).

This would tend to relieve the need for an explicit dispose() pattern, 
since the dtor is now the equivalent?

What about implicit cleanup? In this scenario, it doesn't happen. If you 
don't explicitly (via delete or via raii) delete an object, the dtor is 
not invoked. This applies the notion that it's better to have a leak 
than a dead program. The leak is a bug to be resolved.

What would be really nice is a tool to tell us about such leaks. It 
should be possible for the GC (when configured to do so) to identify 
collected objects which have a non-default dtor. In other words, the GC 
can probably tell if a custom dtor is present (it has a different 
address than a default dtor?). If the GC finds one of these during a 
normal collection cycle, and is about to collect it, it might raise a 
runtime error to indicate the leak instance?
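As a sketch of that check (D's ClassInfo does record the dtor, though the reporting hook here is hypothetical):

     // inside the collector, when it is about to reclaim an object
     void reclaim(Object obj)
     {
         if (obj.classinfo.destructor !is null)   // custom dtor present
             reportLeak(obj.classinfo.name);      // hypothetical runtime error
         // ... free the memory without invoking the dtor ...
     }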

Anyway ~ to summarize, this would have the following effect:

1) no more bogus crashes due to dtors being invoked in an invalid state
2) no need for the dispose() pattern
3) normal collection does not invoke dtors, making it a little faster
4) there's a possibility of a tool to identify and capture leaking 
resources. Something which would be handy anyway.


For the sake of example: "unscoped" resources, such as connection-pools, 
would operate per normal in this scenario: the pool elements should be 
deleted explicitly by the hosting pool (or be treated as leaks, if they 
have a custom dtor). The pool itself would have to be deleted explicitly 
also ~ as is currently the case today ~ which can optionally be handled 
via a module-dtor.
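For example, a sketch of the module-dtor arrangement (ConnectionPool is a hypothetical class that deletes its own elements):

     ConnectionPool pool;

     static this()    // module ctor
     {
         pool = new ConnectionPool;
     }

     static ~this()   // module dtor: runs at program exit
     {
         delete pool; // dtor runs deterministically
     }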

Thoughts?
Apr 09 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 I thought it worthwhile to review the dtor behaviour and view the 
 concerns from a different direction:
 
 dtor 'state' valid:
 - explicit invocation via delete keyword
 - explicit invocation via raii
 
 dtor state 'unspecified':
 - implicitly called when no more references are held to the object
 - implicitly called when a program terminates
 
 
 Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
 Let's also assume there are dtors which expect to "clean up", and which 
 will fail when the dtor state is 'unspecified'.
 
 What happens when a programmer forgets to explicitly delete such an 
 object? Well, the program is highly likely to fail (or be in an 
 inconsistent state) after the GC collects said object. This might be 
 before or during program termination.
 
 How does one ensure this cannot occur? One obvious method would be for 
 the GC to /not/ invoke any dtor by default. While the GC would still 
 collect, such a change would ensure it cannot be the cause of a failing 
 program (it would also make the GC a little faster, but that's probably 
 beside the point).
 
 Assuming that were the case, we're left with only the two cases where 
 cleanup is explicit and the dtor state is 'valid': via the delete 
 keyword, and via raii (both of which apply the same functionality).
 
 This would tend to relieve the need for an explicit dispose() pattern, 
 since the dtor is now the equivalent?
 
 What about implicit cleanup? In this scenario, it doesn't happen. If you 
 don't explicitly (via delete or via raii) delete an object, the dtor is 
 not invoked. This applies the notion that it's better to have a leak 
 than a dead program. The leak is a bug to be resolved.
 
 What would be really nice is a tool to tell us about such leaks. It 
 should be possible for the GC (when configured to do so) to identify 
 collected objects which have a non-default dtor. In other words, the GC 
 can probably tell if a custom dtor is present (it has a different 
 address than a default dtor?). If the GC finds one of these during a 
 normal collection cycle, and is about to collect it, it might raise a 
 runtime error to indicate the leak instance?
 
 Anyway ~ to summarize, this would have the following effect:
 
 1) no more bogus crashes due to dtors being invoked in an invalid state
 2) no need for the dispose() pattern
 3) normal collection does not invoke dtors, making it a little faster
 4) there's a possibility of a tool to identify and capture leaking 
 resources. Something which would be handy anyway.
 
 
 For the sake of example: "unscoped" resources, such as connection-pools, 
 would operate per normal in this scenario: the pool elements should be 
 deleted explicitly by the hosting pool (or be treated as leaks, if they 
 have a custom dtor). The pool itself would have to be deleted explicitly 
 also ~ as is currently the case today ~ which can optionally be handled 
 via a module-dtor.
 
 Thoughts?
All of those pros you mention are valid. But you'd have one serious con:

* Any class which required cleanup would have to be manually memory managed.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 I thought it worthwhile to review the dtor behaviour and view the 
 concerns from a different direction:

 dtor 'state' valid:
 - explicit invocation via delete keyword
 - explicit invocation via raii

 dtor state 'unspecified':
 - implicitly called when no more references are held to the object
 - implicitly called when a program terminates


 Just for fun, let's assume the 'unspecified' issue cannot be resolved. 
 Let's also assume there are dtors which expect to "clean up", and 
 which will fail when the dtor state is 'unspecified'.

 What happens when a programmer forgets to explicitly delete such an 
 object? Well, the program is highly likely to fail (or be in an 
 inconsistent state) after the GC collects said object. This might be 
 before or during program termination.

 How does one ensure this cannot occur? One obvious method would be for 
 the GC to /not/ invoke any dtor by default. While the GC would still 
 collect, such a change would ensure it cannot be the cause of a 
 failing program (it would also make the GC a little faster, but that's 
 probably beside the point).

 Assuming that were the case, we're left with only the two cases where 
 cleanup is explicit and the dtor state is 'valid': via the delete 
 keyword, and via raii (both of which apply the same functionality).

 This would tend to relieve the need for an explicit dispose() pattern, 
 since the dtor is now the equivalent?

 What about implicit cleanup? In this scenario, it doesn't happen. If 
 you don't explicitly (via delete or via raii) delete an object, the 
 dtor is not invoked. This applies the notion that it's better to have 
 a leak than a dead program. The leak is a bug to be resolved.

 What would be really nice is a tool to tell us about such leaks. It 
 should be possible for the GC (when configured to do so) to identify 
 collected objects which have a non-default dtor. In other words, the 
 GC can probably tell if a custom dtor is present (it has a different 
 address than a default dtor?). If the GC finds one of these during a 
 normal collection cycle, and is about to collect it, it might raise a 
 runtime error to indicate the leak instance?

 Anyway ~ to summarize, this would have the following effect:

 1) no more bogus crashes due to dtors being invoked in an invalid state
 2) no need for the dispose() pattern
 3) normal collection does not invoke dtors, making it a little faster
 4) there's a possibility of a tool to identify and capture leaking 
 resources. Something which would be handy anyway.


 For the sake of example: "unscoped" resources, such as 
 connection-pools, would operate per normal in this scenario: the pool 
 elements should be deleted explicitly by the hosting pool (or be 
 treated as leaks, if they have a custom dtor). The pool itself would 
 have to be deleted explicitly also ~ as is currently the case today ~ 
 which can optionally be handled via a module-dtor.

 Thoughts?
All of those pros you mention are valid. But you'd have one serious con: * Any class which required cleanup would have to be manually memory managed.
Thanks.

First, let's change the verbiage of "valid" and "unspecified" to be "deterministic" and "non-deterministic" respectively (per Don C). This makes it clear that a dtor invoked /lazily/ by the GC will be invoked in a non-deterministic state (how the GC works today). This non-deterministic state means that it's very likely any or all gc-managed references held purely by a class instance will already be collected when the relevant dtor is invoked.

The other aspect to consider is the timeliness of cleanup. Mike suggests that classes that actually have something to clean up should do so in a timely manner, and that the indicator for this is the presence of a dtor.

To get to your assertion: under the suggested model, any class with resources that need to be released should either be 'delete'd at some appropriate point, or have raii applied to it. Classes with dtors that are not cleaned up in this manner can be treated as "leaks" (and can be identified at runtime).

Thus, the term "manually memory managed" is not as clear as it might be: raii can be used to clean up, scope(exit) can be used to clean up, and an explicit 'delete' can be used to clean up. There's no malloc() or anything like that involved.

The truly serious problem with a 'lazy' cleanup is that the dtor will wind up invoked with non-deterministic state (typically leading to a serious error). The other concern with lazy cleanup is what Mike addresses (if the resource needs cleaning up, it should be done in a timely manner ~ not at some arbitrary point in the future).

What would be an example of a class requiring cleanup, which should be performed lazily? I can't think of a reasonable one off-hand, but let's take an example anyway:

Suppose I have a class that holds a file-handle. This handle should be released when the class is no longer in use. Luckily, the file-handle does not itself need to be GC-managed (it can be held by the class as an integer). This provides us with two choices ~ release the handle in a timely fashion, or release it at some undetermined point in the future (when the class is collected). We're lucky to have a choice here; it's actually something of a special case. The model suggested follows Mike's proposal that the file-handle should actually be released as soon as reasonably possible. RAII can be used to ensure that happens automagically.

What happens if said class is not raii, and is not hit with a 'delete'? The suggested model can easily identify that class instance as a "leak" when collected by the GC, and report it as such. That is: instead of the GC-collector invoking the dtor with a non-deterministic state, it instead identifies a leaking resource.

As far as automatic cleanup goes, I think D is already well armed via raii and the scope() idiom. Adopting an attitude of cleaning up resources in a timely manner will surely only be of benefit in the long run?

Another approach here is to allow the collector to invoke the dtor (as it does today), and somehow ensure that its state is fully deterministic (which is not done today). I suspect that would be notably more expensive and/or difficult to achieve? However, that also does not address Mike's concern about timely cleanup, which I think is a valid concern. Thus, I really like the simplicity of the model as described above. It also has the added bonus of eliminating the need for a redundant dispose() pattern, and makes the GC a little faster :-)

- Kris
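To make the file-handle example concrete, a sketch under the proposed model (File is hypothetical; the handle is a plain int, so the dtor touches no GC-managed state):

     class File
     {
         private int handle;                  // OS handle, not GC-managed

         this(char[] path) { /* open the file, store the handle */ }

         ~this() { /* close the handle */ }
     }

     void process()
     {
         auto File f = new File("data.txt");  // raii declaration
         // ... use f ...
     }                                        // dtor runs here, deterministically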
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 Bruno Medeiros wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Just one addendum: I was just pointing out that con; I wasn't saying it was, or was not, a bad idea overall.
 
 First, let's change the verbiage of "valid" and "unspecified" to be 
 "deterministic" and "non-deterministic" respectively (per Don C).
 
Let's not. *g* See my reply to the Don.
 
 To get to your assertion: under the suggested model, any class with 
 resources that need to be released should either be 'delete'd at some 
 appropriate point, or have raii applied to it. Classes with dtors that 
 are not cleaned up in this manner can be treated as "leaks" (and can be 
 identified at runtime).
 
 Thus, the term "manually memory managed" is not as clear as it might be: 
 raii can be used to clean up, and scope(exit) can be used to cleanup. An 
 explicit 'delete' can be used to cleanup. There's no malloc() or 
 anything like that invoved.
 
Those are all manual memory management (even if auto and scope() are much better than plain malloc/free). [Note: RAII's auto = scope(exit)] You would have automatic leak/failure detection, true.
 The truly serious problem with a 'lazy' cleanup is that the dtor will 
 wind up invoked with non-deterministic state (typically leading to a 
 serious error). The other concern with lazy cleanup is what Mike 
 addresses (if the resource needs cleaning up, it should be done in a 
 timely manner ~ not at some arbitrary point in the future).
 
The state is *undefined*; it is neither "non-deterministic" nor "deterministic". This is the kind of terminology blur-up that I was leery of. :P

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Just one addendum: I was just pointing out that con, I wasn't saying it was or was not, a bad idea overall.
 First, let's change the verbiage of "valid" and "unspecified" to be 
 "deterministic" and "non-deterministic" respectively (per Don C).
Let's not. *g* See my reply to the Don.
heheh :)
 
 To get to your assertion: under the suggested model, any class with 
 resources that need to be released should either be 'delete'd at some 
 appropriate point, or have raii applied to it. Classes with dtors that 
 are not cleaned up in this manner can be treated as "leaks" (and can 
 be identified at runtime).

 Thus, the term "manually memory managed" is not as clear as it might 
 be: raii can be used to clean up, and scope(exit) can be used to 
 cleanup. An explicit 'delete' can be used to cleanup. There's no 
 malloc() or anything like that involved.
Those are all manual memory management. (Even if auto and scope() are much better than plain malloc/free). [Note: RAII's auto = scope(exit)] You would have an automatic leak/failure detection, true.
 The truly serious problem with a 'lazy' cleanup is that the dtor will 
 wind up invoked with non-deterministic state (typically leading to a 
 serious error). The other concern with lazy cleanup is what Mike 
 addresses (if the resource needs cleaning up, it should be done in a 
 timely manner ~ not at some arbitrary point in the future).
The state is *undefined*, it is not "non-deterministic" nor "deterministic". This is the kind of terminology blur up that I was leery of. :P
:-D Terminology aside; with the current implementation, invocation of dtors during a collection often causes serious problems. That's why we see the use of close/dispose patterns in D. It would be great to avoid both of those things :p
Apr 13 2006
prev sibling next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to clean up, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Apr 10 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Bruno Medeiros wrote:
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event.

So far as non-deterministic cleanup via dtors is concerned, I think it's mostly implemented as a fail-safe. And it may be more correct to signal an error if such an object is encountered via a GC run than to simply clean it up silently, as a careful programmer might consider this a resource leak.

Sean
Apr 10 2006
next sibling parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event.
Aye
 So far as 
 non-deterministic cleanup via dtors is concerned, I think it's mostly 
 implemented as a fail-safe.  And it may be more correct to signal an 
 error if such an object is encountered via a GC run than to simply clean 
 it up silently, as a careful programmer might consider this a resource 
 leak.
Yes; that's how I feel about it also. Especially when the "silent" cleanup leads to SegFaults and such. Intended as a fail-safe, but actually a failure-causation ;-)
Apr 10 2006
prev sibling parent Georg Wrede <georg.wrede nospam.org> writes:
Sean Kelly wrote:
 kris wrote:
 
 Bruno Medeiros wrote:

 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object isn't bound to a specific owner or scope--consider connection objects for a server app. However, in most cases it's possible (and correct) to delegate cleanup responsibility to a specific manager object or to link it to the occurrence of some specific event. So far as non-deterministic cleanup via dtors is concerned, I think it's mostly implemented as a fail-safe. And it may be more correct to signal an error if such an object is encountered via a GC run than to simply clean it up silently, as a careful programmer might consider this a resource leak.
Writing this kind of code demands that the programmer keeps (in his mind) a clear picture of _who_ owns the instance. Getting that unclear is a sure recipe for disaster.
Apr 10 2006
prev sibling parent Georg Wrede <georg.wrede nospam.org> writes:
kris wrote:
 Bruno Medeiros wrote:
 
 kris wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Can anyone come up with some examples whereby a class needs to cleanup, and also /needs/ to be collected lazily? In other words, where raii or delete could not be applied appropriately?
Got another idea.

It seems to me that this discussion is pretty abstract. Normally, half the participants would be talking about Apples and the other about Oranges, without either noticing. But in this D newsgroup, I believe the state of knowledge is high enough for such not to happen.

However, half of the _audience_ may not be that clear on the fact that both apples and oranges belong to the Class Magnoliopsida, and one of them to the Order Rosales and the other to Sapindales. But which? (And I certainly admit I belong to this Audience here.)

To serve and accommodate all, and to even possibly start to get potentially worthwhile commentary from a larger group of eyes, I suggest we try to construct the simplest Structure of Instances needed to display _all_ of the discussed woes. As a first draft (and not even remotely pretending it is adequate), I cast the following:

VIEW THIS IN MONOSPACE FONT
===========================

 code                     heap

 iRa -----------------> alpha ---> beta
                           ^        /
                            \      /
                             \    /
                              \  V
                              gamma
                              ^  ^
                             /    \
                            /      \
                           /        \
                          V          V
 iRb ----------------> delta <--> epsilon

(Oh, iR stands for Instance Reference, just to not get involved with the types or classes:

     SomeClass iRx = new SomeClass(); // Create a reference to an instance.

)

So, the upper half makes a singly linked list and the lower half makes a doubly linked list, and then there are two references (or D variables) pointing to the Alpha and Delta instances.

Can this structure demonstrate _all_ of the problems we're currently discussing, or should it be more complicated?
Apr 10 2006
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros  
<brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory  
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal.

Regan
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Regan Heath wrote:
 On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros 
 <brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal. Regan
Kris clearly mentioned that a class with a dtor (i.e. a class needing cleanup) being collected by the GC would be an abnormal situation (which might, or might not, be detected by the runtime).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 Regan Heath wrote:
 On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros 
 <brunodomedeirosATgmail SPAM.com> wrote:
 All of those pros you mention are valid. But you'd have one serious con:
 * Any class which required cleanup would have to be manually memory 
 managed.
Not memory managed, surely.. the memory will still be collected by the GC, all that changes is that the dtor is not invoked when that happens.. or at least that is how I understood Kris's proposal.
Kris clearly mentioned that a class with a dtor (i.e. a class needing cleanup) being collected by the GC would be an abnormal situation. (which could, or not, be detected by the runtime.)
The version of Ares released yesterday has code in place to do this. For now, you'll have to alter the finalizer if you want to do something special (dmdrt/memory.d:cr_finalize), but eventually it will probably call an onFinalizeError function in the standard library that can be hooked in a similar manner to onAssertError. The error will be signaled when the GC collects an object that has a dtor. Default behavior will likely be to ignore it and move on.

Sean
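By analogy with onAssertError, such a hook might look roughly like this (the handler type and signature are assumptions for illustration, not the actual Ares API):

     // hypothetical hook, patterned after onAssertError
     alias void function(ClassInfo info) FinalizeErrorHandler;
     FinalizeErrorHandler onFinalizeError;

     // sketch of the GC's finalize step for an object that still has a dtor
     void signalFinalizeError(Object obj)
     {
         if (onFinalizeError !is null)
             onFinalizeError(obj.classinfo);
         // default: ignore it and move on
     }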
Apr 13 2006
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end of 
 scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Apr 05 2006
parent reply Sean Kelly <sean f4.ca> writes:
Jarrett Billingsley wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end of 
 scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC-managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically. I think having both may be too confusing to be worthwhile, but it would allow for things like this:

     class LinkedList
     {
         Node top;

         ~this()   // called deterministically
         {
             for( Node n = top; n; )
             {
                 Node t = n.next;
                 delete n;
                 n = t;
             }
             finalize();
         }

         void finalize()   // called by GC
         {
             // nodes may have already been destroyed,
             // so leave them alone, but special
             // resources could be reclaimed
         }
     }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.

Sean
Apr 05 2006
next sibling parent reply kris <foo bar.com> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC-managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically. I think having both may be too confusing to be worthwhile, but it would allow for things like this:

     class LinkedList
     {
         Node top;

         ~this()   // called deterministically
         {
             for( Node n = top; n; )
             {
                 Node t = n.next;
                 delete n;
                 n = t;
             }
             finalize();
         }

         void finalize()   // called by GC
         {
             // nodes may have already been destroyed,
             // so leave them alone, but special
             // resources could be reclaimed
         }
     }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.
Yes, it is. The "death tractors" (dtors in D) are notably less than useful right now. Any dependencies are likely in an unknown state (as you note), and then, dtors are not invoked when the program exits. From what I recall, dtors are not even invoked when you "delete" an object? It's actually quite hard to nail down when they /are/ invoked :)

Regardless; any "special resources" one would, somewhat naturally, wish to clean up via dtors have to be explicitly managed via other means. This usually means a global application-list of "special stuff", which does not seem to jive with OOP very well?

On the face of it, it shouldn't be hard for the GC to invoke dtors in such a manner whereby dependencies are preserved ~ that would at least help. But then, the whole notion is somewhat worthless (in D) when it's implemented as a non-deterministic activity.

Given all that, the finalizer behaviour mentioned above sounds rather like the current death-tractor behaviour?
Apr 05 2006
next sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
 right now. Any dependencies are likely in an unknown state (as you note), 
 and then, dtors are not invoked when the program exits. From
 what I recall, dtors are not even invoked when you "delete" an object? 
 It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.

In fact, the dtors are also called on program exit - as long as they're not in some kind of array. I don't know if that's a bug, or by design, or a foggy area of the spec, or a combination of all of the above.
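A sketch of that pattern (names are illustrative):

     Object[] specialStuff;   // registered in creation order

     void cleanupAtExit()
     {
         // delete in reverse order, so dependents go before their dependencies
         for (int i = cast(int) specialStuff.length - 1; i >= 0; --i)
             delete specialStuff[i];
     }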
 Regardless; any "special resources" one would, somewhat naturally, wish
 to cleanup via dtors have to be explicitly managed via other means. This
 usually means a global application-list of "special stuff", which does not 
 seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that although it's useful, _the GC can't be trusted_. Unless a custom GC is written for every program and every possible arrangement of data, it can't know in what order to call dtors/finalizers and whatnot. So I do end up keeping lists of all types of objects that I want to be called deterministically, and delete them on program exit. I just leave the simple / common stuff (throwaway class instances, string crap) to the GC. That just makes me feel a lot better and safer.

In addition, I usually don't assume that any references a class holds are valid in the dtor. I leave the cleanup of other objects (like in Sean's example) to the other objects' dtors.
 On the face of it, it shouldn't be hard for the GC to invoke dtors in 
 such a manner whereby dependencies are preserved ~ that would at least 
 help. But then, the whole notion is somewhat worthless (in D) when it's 
 implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all class instances linearly and deleting everything, just keep running GC passes until the regular GC pass has no effect, and brute force the rest. In this way, my method of "not deleting other objects in dtors" would delete the instance of the LinkedList on the first pass, and then all the Nodes on the second, since they are now orphaned.
Apr 05 2006
parent reply kris <foo bar.com> writes:
Jarrett Billingsley wrote:
 "kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 
Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
I ended up using my own 'finalizer' since, back in the day, delete didn't invoke the dtor. It does now, so that's something. Objects that refer to anything external should probably have a close() method anyway ~ which gets us back to what Mike had noted.
 
 In fact, the dtors are also called on program exit - as long as they're not 
 in some kind of array.  I don't know if that's a bug, or by design, or a 
 foggy area of the spec, or a combination of all of the above.
Interesting. It does appear to do that now, whereas in the past it didn't. I remember a post from someone complaining that it took 5 minutes for his program to exit because the GC was run to completion on all 10,000,000,000 objects he had (or something like that). The "fix" for that appeared to be "just don't cleanup on exit", which then sidestepped all dtors. It seems something changed along the way, since dtors do indeed get invoked at program termination for a simple test program (not if an exception is thrown, though). My bad.

Does this happen consistently, then? I mean, are dtors invoked on all remaining Objects during exit? At all times? Is that even a good idea?
 
 
Regardless; any "special resources" one would, somewhat naturally, wish
to cleanup via dtors have to be explicitly managed via other means. This
usually means a global application-list of "special stuff", which does not 
seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that although it's useful, _the GC can't be trusted_. Unless a custom GC is written for every program and every possible arrangement of data, it can't know in what order to call dtors/finalizers and whatnot. So I do end up keeping lists of all types of objects that I want to be called deterministically, and delete them on program exit. I just leave the simple / common stuff (throwaway class instances, string crap) to the GC. That just makes me feel a lot better and safer.
The GC is supposed to be your friend :) That doesn't mean it should know about your design but, there again, it shouldn't abort it either. That implies any additional GC references held by a dtor Object really should be valid whenever that dtor is invoked. The fact that they're not relegates dtors to having insignificant value ~ which somehow doesn't seem right. Frankly, I don't clearly understand why they're in D at all ~ too little consistency.
 
 In addition, I usually don't assume that any references a class holds are 
 valid in the dtor.  I leave the cleanup of other objects (like in Sean's 
 example) to the other objects' dtors.
 
 
On the face of it, it shouldn't be hard for the GC to invoke dtors in 
such a manner whereby dependencies are preserved ~ that would at least 
help. But then, the whole notion is somewhat worthless (in D) when it's 
implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all class instances linearly and deleting everything, just keep running GC passes until the regular GC pass has no effect, and brute force the rest. In this way, my method of "not deleting other objects in dtors" would delete the instance of the LinkedList on the first pass, and then all the Nodes on the second, since they are now orphaned.
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
Apr 05 2006
next sibling parent reply Dave <Dave_member pathlink.com> writes:
In article <e11ki9$rtq$1 digitaldaemon.com>, kris says...
Jarrett Billingsley wrote:
 "kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...
 
Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
Ok, so for non-auto death tractors (that name is great):

a) non-auto D class dtors are actually what are called finalizers everywhere else, except when delete is explicitly called.

b) although dtors are eventually all called, it is non-deterministic unless the class is auto, or delete is used explicitly.

c) unless dtors are called deterministically, they could often be considered worthless since, w/ a GC handling memory, the primary reason for dtors is to release other expensive external resources.

d) there is (a lot of) overhead involved with 'dtors for every class'.

e) All this has been a major sticking-point of other languages and runtimes: they use finalizers instead of dtors (they also have Dispose, but that needs to be called explicitly), and using(...) takes the place of auto/delete. IIRC, exactly when these finalizers are called is always non-deterministic and not even guaranteed unless an explicit "full collect" is done, and a big part of this is precisely because it's so expensive.

Although I program in those languages day to day, because of this, I don't rely on anything that is going on behind the scenes, as I've always ended up explicitly "finalizing" things myself rather than relying on the GC or the using(...) statement. If you've done a lot of DB work in .NET (for example), then you'll know that doing this is sometimes as bothersome as malloc/free or new/delete (and thank God for .NET's try/finally). That is a major reason I think finalizers are useless unless they're always deterministic.

From some tests I've done in the past and recently duplicated in http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting to set a finalizer is damned expensive, and a lot of that expense is because setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past, if the finalizer overhead is removed, the current GC can actually run as fast for smallish class objects over several collections as new/delete for C++ classes or malloc/free for C structs. There is not only an expense involved in setting the finalizer, but the way it works in the current D GC is that there is overhead involved in every collection checking for finalizers, even for non-class objects. It looks to me like if all the non-deterministic finalization cruft could be removed from the GC, the *current* GC may actually be a little faster than malloc/free for class objects (at least moderately sized ones).

Long and short of it is I like Mike's ideas regarding allowing dtors only for auto classes. That way, the GC wouldn't have to deal with finalizers at all, or at least not during non-deterministic collections. It would also still allow D to claim RAII, because 'auto' classes are something new for D compared to most other languages.

It may be that taking care of the finalizer overhead issue is a must if D GCs are ever to perform as well as other languages for class objects. Kind of ironic; the goals of D are to be as powerful as C++ yet make compilers relatively easy to develop - but a side effect of those two is that really good GCs may be harder to develop than the compilers <g>

- Dave
Apr 05 2006
next sibling parent kris <foo bar.com> writes:
Dave wrote:
 In article <e11ki9$rtq$1 digitaldaemon.com>, kris says...
 
Jarrett Billingsley wrote:

"kris" <foo bar.com> wrote in message news:e11fds$m0m$1 digitaldaemon.com...


Yes, it is. The "death tractors" (dtors in D) are notably less than useful 
right now. Any dependencies are likely in an unknown state (as you note), 
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object? 
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic "list of special stuff" that you mention - you just 'delete' them all, perhaps in a certain order.
Ok, so for non-auto death tractors (that name is great):

a) non-auto D class dtors are actually what are called finalizers everywhere else, except when delete is explicitly called.

b) although dtors are eventually all called, it is non-deterministic unless the class is auto, or delete is used explicitly.

c) unless dtors are called deterministically, they could often be considered worthless since, w/ a GC handling memory, the primary reason for dtors is to release other expensive external resources.

d) there is (a lot of) overhead involved with 'dtors for every class'.

e) All this has been a major sticking-point of other languages and runtimes: they use finalizers instead of dtors (they also have Dispose, but that needs to be called explicitly), and using(...) takes the place of auto/delete. IIRC, exactly when these finalizers are called is always non-deterministic and not even guaranteed unless an explicit "full collect" is done, and a big part of this is precisely because it's so expensive.

Although I program in those languages day to day, because of this, I don't rely on anything that is going on behind the scenes, as I've always ended up explicitly "finalizing" things myself rather than relying on the GC or the using(...) statement. If you've done a lot of DB work in .NET (for example), then you'll know that doing this is sometimes as bothersome as malloc/free or new/delete (and thank God for .NET's try/finally). That is a major reason I think finalizers are useless unless they're always deterministic.

From some tests I've done in the past and recently duplicated in http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting to set a finalizer is damned expensive, and a lot of that expense is because setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past, if the finalizer overhead is removed, the current GC can actually run as fast for smallish class objects over several collections as new/delete for C++ classes or malloc/free for C structs. There is not only an expense involved in setting the finalizer, but the way it works in the current D GC is that there is overhead involved in every collection checking for finalizers, even for non-class objects. It looks to me like if all the non-deterministic finalization cruft could be removed from the GC, the *current* GC may actually be a little faster than malloc/free for class objects (at least moderately sized ones).

Long and short of it is I like Mike's ideas regarding allowing dtors only for auto classes. That way, the GC wouldn't have to deal with finalizers at all, or at least not during non-deterministic collections. It would also still allow D to claim RAII, because 'auto' classes are something new for D compared to most other languages.

It may be that taking care of the finalizer overhead issue is a must if D GCs are ever to perform as well as other languages for class objects. Kind of ironic; the goals of D are to be as powerful as C++ yet make compilers relatively easy to develop - but a side effect of those two is that really good GCs may be harder to develop than the compilers <g>

- Dave
I could buy that too, if the darned "auto" keyword weren't so overloaded :-P [snip]
Apr 05 2006
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Dave" <Dave_member pathlink.com> wrote in message 
news:e11vjk$1fou$1 digitaldaemon.com...
 Long and short of it is I like Mike's ideas regarding allowing dtors for 
 only
 auto classes. In that way, the GC wouldn't have to deal with finalizers at 
 all,
 or at least during non-deterministic collections. It would also still 
 allow D to
 claim RAII because 'auto' classes are something new for D compared to most 
 other
 languages.
Hmm. 'auto' works well and good for classes whose references are local 
variables, but .. what about objects whose lifetimes aren't determined by 
the return of a function?

I.e. the Node class is used only in LinkedList. When a LinkedList is killed, 
all its Nodes must die as well. Since the Node references are kept in the 
LinkedList and not as local variables, there's no way to specify 'auto' for 
them.

Then you start getting into a catch-22. Okay, so you need to delete all 
those child Nodes in the dtor of LinkedList, meaning that LinkedList has to 
be made auto so it can have a dtor. But what if a linked list reference has 
to exist at global level, or in a struct? There is no function return to 
determine when to delete the list. So you have to make LinkedList non-auto, 
but then that means that you can't delete all those child nodes, since you 
don't have a dtor / finalizer, etc..

I think RAII is nice, but it doesn't seem to fix everything. Unless, of 
course, it were extended to deal with these odd cases.
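[A minimal sketch of that catch-22, assuming the dtor-only-for-auto rule:]

    class Node { Node next; Object data; }

    // wants a dtor to delete its Nodes, so it would have to be 'auto'...
    class LinkedList
    {
        Node head;

        ~this()
        {
            for (Node n = head; n; )
            {
                Node t = n.next;
                delete n;
                n = t;
            }
        }
    }

    // ...but an 'auto' class could not live here:
    LinkedList globalList;  // lifetime not tied to any function scope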
Apr 05 2006
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 6 Apr 2006 00:08:43 -0400, Jarrett Billingsley <kb3ctd2 yahoo.com>  
wrote:
 "Dave" <Dave_member pathlink.com> wrote in message
 news:e11vjk$1fou$1 digitaldaemon.com...
 Long and short of it is I like Mike's ideas regarding allowing dtors for
 only
 auto classes. In that way, the GC wouldn't have to deal with finalizers  
 at
 all,
 or at least during non-deterministic collections. It would also still
 allow D to
 claim RAII because 'auto' classes are something new for D compared to  
 most
 other
 languages.
Hmm. 'auto' works well and good for classes whose references are local variables, but .. what about objects whose lifetimes aren't determined by the return of a function? I.e. the Node class is used only in LinkedList. When a LinkedList is killed, all its Nodes must die as well.
Assuming the nodes contain reference(s) to resources (other than memory) 
that need to be released, right? You don't need to delete them to free 
memory; the GC should free them eventually. The same is true for any 
non-auto object which contains a sub-object that has a reference to a 
resource which must be released deterministically.

Isn't the solution therefore to make every object containing an 'auto' 
object 'auto' as well? How about this:

1) If a class has a dtor it must be auto, eg.

    class A { ~this() {} }        //error A must be auto
    auto class A { ~this() {} }   //ok

2) If a class contains a reference to an auto class, it must also be auto, 
eg.

    class B { A a; }                             //error A is auto, B must be auto too
    auto class B { A a; ~this() { delete a; } }  //ok

2a) If that class does not have a dtor it is an error.

2b) If that dtor does not delete the 'a' reference it is an error.

Speculative: Can the compiler in fact auto-generate a dtor for this class? 
One that deletes all auto references. Can it append (not prepend) that 
auto-generated dtor to any user supplied one?

3) Remove the other 'auto' class syntax, i.e.

    class A {}
    auto A a = new A();

It's either a class with resources that need to be freed, or it's not. Is 
there any need for a middle ground? (this also removes the double use of 
auto, that'll make some people happy)

Pros:
1. no more weird crashes in dtors where people reference things which are 
gone.
2. compiler finds/corrects most reference leaks automatically.
3. no more double use of 'auto'.

Cons:
1. less flexible?

I can already think of a situation where this might be too inflexible. What 
happens if you want to share an object between multiple objects? For 
example:

    auto class DatabaseConnection {}

a singleton-style shared connection to a database. You have several classes 
which share that connection, i.e.

    class UserQuery { DatabaseConnection c; }

Using the rules above, these classes would either be illegal, or get a dtor 
which auto-deletes the DatabaseConnection. The solution? Perhaps it's 
reference counting in the DatabaseConnection? Perhaps it's a new syntax to 
mark something 'shared', preventing the compiler auto-deleting it, eg.

    class UserQuery { shared DatabaseConnection c; }

Perhaps this cure is worse than the disease? Thoughts?

Regan
Apr 05 2006
prev sibling parent reply kris <foo bar.com> writes:
Jarrett Billingsley wrote:
 "Dave" <Dave_member pathlink.com> wrote in message 
 news:e11vjk$1fou$1 digitaldaemon.com...
 
Long and short of it is I like Mike's ideas regarding allowing dtors for 
only
auto classes. In that way, the GC wouldn't have to deal with finalizers at 
all,
or at least during non-deterministic collections. It would also still 
allow D to
claim RAII because 'auto' classes are something new for D compared to most 
other
languages.
Hmm. 'auto' works well and good for classes whose references are local variables, but .. what about objects whose lifetimes aren't determined by the return of a function? I.e. the Node class is used only in LinkedList. When a LinkedList is killed, all its Nodes must die as well. Since the Node references are kept in the LinkedList and not as local variables, there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example. [snip]
Apr 05 2006
parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList is 
 killed, all its Nodes must die as well.  Since the Node references are 
 kept in the LinkedList and not as local variables, there's no way to 
 specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-) Sean
Apr 06 2006
parent reply kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Jarrett Billingsley wrote:

 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList is 
 killed, all its Nodes must die as well.  Since the Node references 
 are kept in the LinkedList and not as local variables, there's no way 
 to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-) Sean
<g> Touché !
Apr 06 2006
parent reply Georg Wrede <georg.wrede nospam.org> writes:
kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are 
 local variables, but .. what about objects whose lifetimes aren't 
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList 
 is killed, all its Nodes must die as well.  Since the Node 
 references are kept in the LinkedList and not as local variables, 
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their 
responsibility to fix the pointers of the previous and/or the next item, to 
"bypass" that item??????

The mere fact that no "outside" references exist to a particular item in a 
linked list does _not_ make this item eligible for GC. Not in the current 
implementation, and I dare say, in no future implementation ever.

In other words, it is _guaranteed_ that _all_ items in a linked list are 
valid. This could be called a "linked-list-invariant". :-)
Apr 06 2006
parent reply Lars Ivar Igesund <larsivar igesund.net> writes:
Georg Wrede wrote:

 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 Hmm.  'auto' works well and good for classes whose references are
 local variables, but .. what about objects whose lifetimes aren't
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList
 is killed, all its Nodes must die as well.  Since the Node
 references are kept in the LinkedList and not as local variables,
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
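[A minimal sketch of such a cycle, with a made-up Node class:]

    class Node { Node next; }

    void orphanCycle()
    {
        Node a = new Node;
        Node b = new Node;
        a.next = b;
        b.next = a;
        // on return, no root reaches either node, yet each is still
        // "pointed to" by the other ~ internal references alone don't
        // keep the pair reachable from the GC's point of view
    }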
Apr 06 2006
next sibling parent Georg Wrede <georg nospam.org> writes:
Lars Ivar Igesund wrote:
 Georg Wrede wrote:
kris wrote:
Sean Kelly wrote:
kris wrote:
Jarrett Billingsley wrote:

 Hmm.  'auto' works well and good for classes whose references are
 local variables, but .. what about objects whose lifetimes aren't
 determined by the return of a function?

 I.e. the Node class is used only in LinkedList.  When a LinkedList
 is killed, all its Nodes must die as well.  Since the Node
 references are kept in the LinkedList and not as local variables,
 there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
If the linked list is circular, and at the same time there's no reference to 
this list from any GC-examined area, then I'd consider this a Programmer 
Fault. Any set of "items", none of which is referenced from a "roots" area, 
is IMHO eligible for deletion, whether this set is circular or not.

In other words, we should not strive to make the GC "too smart" for its own 
good. Either we see to it that items not wished for deletion are pointed to, 
or we accept that non-pointed-to items are considered passé.
Apr 06 2006
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Lars Ivar Igesund wrote:
 Georg Wrede wrote:
 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 
 Hmm.  'auto' works well and good for classes whose
 references are local variables, but .. what about objects
 whose lifetimes aren't determined by the return of a
 function?
 
 I.e. the Node class is used only in LinkedList.  When a
 LinkedList is killed, all its Nodes must die as well.
 Since the Node references are kept in the LinkedList and
 not as local variables, there's no way to specify 'auto'
 for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
The mere existence of a circular list that is not pointed-to from the 
outside is a programmer error. Unless one explicitly wants it to be 
collected. But even then it's a programmer error if the items need 
destructing, since the collection may or may not happen "ever".

So, in practice, whenever one wants to store items that need destructors in 
a linked list, the list itself should be encapsulated in a class that can 
guarantee the timely destruction of the items, as opposed to merely 
abandoning them.
Apr 06 2006
parent Lars Ivar Igesund <larsivar igesund.net> writes:
Georg Wrede wrote:

 Lars Ivar Igesund wrote:
 Georg Wrede wrote:
 kris wrote:
 Sean Kelly wrote:
 kris wrote:
 Jarrett Billingsley wrote:
 
 Hmm.  'auto' works well and good for classes whose
 references are local variables, but .. what about objects
 whose lifetimes aren't determined by the return of a
 function?
 
 I.e. the Node class is used only in LinkedList.  When a
 LinkedList is killed, all its Nodes must die as well.
 Since the Node references are kept in the LinkedList and
 not as local variables, there's no way to specify 'auto'
 for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they are also managed by the GC :) So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché !
Hey, hey, hey... If anybody deletes stuff from a linked list, isn't it their responsibility to fix the pointers of the previous and/or the next item, to "bypass" that item?????? The mere fact that no "outside" references exist to a particular item in a linked list does _not_ make this item eligible for GC. Not in the current implementation, and I dare say, in no future implementation ever. In other words, it is _guaranteed_ that _all_ items in a linked list are valid.
Not if the linked list is circular (such that all items are linked to), but 
disjoint from the roots kept by the GC. This memory will be lost to a 
conservative GC, but can be detected by some of the other GC types around.
The mere existence of a circular list that is not pointed-to from the outside, is a programmer error. Unless one explicitly wants it to be collected. But even then it's a programmer error if the items need destructing, since the collection may or may not happen "ever".
Maybe it is a programmer's error, but at the same time a programmer expects 
a GC to collect memory that is no longer referenced by the program. Also, 
the list might be generated by a complex enough program to actually make it 
difficult to see that it is a circular linked list. Depending on the GC, it 
might or might not be able to reclaim this memory (or call the 
destructors/finalizers of the objects in the list), because it no longer 
explicitly knows about it.
Apr 06 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Jarrett Billingsley wrote:
 
 Interesting. It does appear to do that now, whereas in the past it 
 didn't. I remember a post from someone complaining that it took 5 
 minutes for his program to exit because the GC was run to completion on 
 all 10,000,000,000 objects he had (or something like that). The "fix" 
 for that appeared to be "just don't cleanup on exit", which then 
 sidestepped all dtors. It seems something changed along the way, since 
 dtors do indeed get invoked at program termination for a simple test 
 program (not if an exception is thrown, though). My bad.
 
 Does this happen consistently, then? I mean, are dtors invoked on all 
 remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in gc_term. There are alternatives that might work nearly as well (ie. the techniques you've used in the past) if shutdown time is an issue.
 That doesn't mean it should know about your design but, there again, it 
 shouldn't abort it either. That implies any additional GC references 
 held by a dtor Object really should be valid whenever that dtor is 
 invoked. The fact that they're not relegates dtors to having 
 insignificant value ~ which somehow doesn't seem right. Frankly, I don't 
 clearly understand why they're in D at all ~ too little consistency.
Because not having them tends to inspire people to invent their own, like the dispose() convention in Java. Having language support is preferable, even if the functionality isn't terrific.
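[The hand-rolled convention in question looks roughly like this in D ~ names 
illustrative:]

    interface Disposable
    {
        void dispose();  // deterministic cleanup, by convention only
    }

    void example(Disposable d)
    {
        try
        {
            // ... use d ...
        }
        finally
        {
            d.dispose();  // nothing but discipline makes this happen
        }
    }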
 Yeah, I was thinking about that, maybe instead of just looping through 
 all class instances linearly and deleting everything, just keep 
 running GC passes until the regular GC pass has no effect, and brute 
 force the rest. In this way, my method of "not deleting other objects 
 in dtors" would delete the instance of the LinkedList on the first 
 pass, and then all the Nodes on the second, since they are now orphaned. 
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC would ultimately just have to pick a place to start. Also, I think disentangling a complex web of references could be somewhat time intensive, and collection runs are already too slow :-) Sean
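[Aside: for the global-scope cleanup kris mentions, module dtors already 
offer one deterministic hook, since they currently run before the final 
collection ~ a minimal sketch, with gList made up:]

    private LinkedList gList;

    static this()  { gList = new LinkedList; }  // module ctor
    static ~this() { delete gList; }            // module dtor: runs at
                                                // program termination,
                                                // before the GC's final sweep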
Apr 05 2006
parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Jarrett Billingsley wrote:

 Interesting. It does appear to do that now, whereas in the past it 
 didn't. I remember a post from someone complaining that it took 5 
 minutes for his program to exit because the GC was run to completion 
 on all 10,000,000,000 objects he had (or something like that). The 
 "fix" for that appeared to be "just don't cleanup on exit", which then 
 sidestepped all dtors. It seems something changed along the way, since 
 dtors do indeed get invoked at program termination for a simple test 
 program (not if an exception is thrown, though). My bad.

 Does this happen consistently, then? I mean, are dtors invoked on all 
 remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in gc_term. There are alternatives that might work nearly as well (ie. the techniques you've used in the past) if shutdown time is an issue.
 That doesn't mean it should know about your design but, there again, 
 it shouldn't abort it either. That implies any additional GC 
 references held by a dtor Object really should be valid whenever that 
 dtor is invoked. The fact that they're not relegates dtors to having 
 insignificant value ~ which somehow doesn't seem right. Frankly, I 
 don't clearly understand why they're in D at all ~ too little 
 consistency.
Because not having them tends to inspire people to invent their own, like the dispose() convention in Java. Having language support is preferable, even if the functionality isn't terrific.
Right :) That's why what Mike suggests makes sense to me ~ only have dtor support for those classes that can actually take advantage of it, and have that enforced by the compiler. If, for example, one could also instantiate RAII classes at the global scope, then that would take care of loose ends too. If that also makes the GC execute faster, then so much the better.
 Yeah, I was thinking about that, maybe instead of just looping 
 through all class instances linearly and deleting everything, just 
 keep running GC passes until the regular GC pass has no effect, and 
 brute force the rest. In this way, my method of "not deleting other 
 objects in dtors" would delete the instance of the LinkedList on the 
 first pass, and then all the Nodes on the second, since they are now 
 orphaned. 
Yep.

If references were still valid for dtors, and dtors were invoked in a deterministic manner, perhaps all we'd need is something similar to "scope(exit)", but referring to a global scope instead? Should the memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC would ultimately just have to pick a place to start. Also, I think disentangling a complex web of references could be somewhat time intensive, and collection runs are already too slow :-)
I'd assumed it already followed a dependency tree to figure out the collectable allocations? But even so, it's probably better to not do any of that at all (and do what Mike suggests instead).
Apr 05 2006
prev sibling parent Sean Kelly <sean f4.ca> writes:
kris wrote:
 
 Yes, it is. The "death tractors" (dtors in D) are notably less than 
 useful right now. Any dependencies are likely in an unknown state (as 
 you note), and then, dtors are not invoked when the program exits. From 
 what I recall, dtors are not even invoked when you "delete" an object? 
 It's actually quite hard to nail down when they /are/ invoked :)
I think dtors are called whenever an object is destroyed, be it via delete or by the GC. And the GC should perform a complete clean-up on app termination. I believe this is the current behavior in both Phobos and Ares (look at internal/gc/gc.d:gc_term() in Phobos and dmdrt/memory.d:gc_term() in Ares for the shutdown cleanup code).
 Given all that, the finalizer behaviour mentioned above sounds rather 
 like the current death-tractor behaviour?
It is exactly. The dtor behavior has simply changed to be suitable for a more effective clean-up whenever the object is destroyed deterministically (ie. via delete or as an auto object). I suppose an alternative would be to pass a state flag to the dtor to indicate the manner of disposal? I really can't think of a means of implementing this that is as elegant as D deserves. Sean
Apr 05 2006
prev sibling next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:e11ca0$ht2$1 digitaldaemon.com...
 Since finalizers are called when the GC destroys an object, they are very 
 limited in what they can do.  They can't assume any GC managed object they 
 have a reference to is valid, etc.  By contrast, destructors can make this 
 assumption, because the object is being destroyed deterministically.  I 
 think having both may be too confusing to be worthwhile, but it would 
 allow for things like this:

 The argument against finalizers, as Mike mentioned, is that you typically 
 want to reclaim such special resources deterministically, so letting the 
 GC take care of this 'someday' is of questionable utility.
Thank you for that clear, concise, and un-condescending reply :)
Apr 05 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...
     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very 
limited in what they can do. They can't assume any GC managed object they 
have a reference to is valid, etc. By contrast, destructors can make this 
assumption, because the object is being destroyed deterministically. I think 
having both may be too confusing to be worthwhile, but it would allow for 
things like this:

    class LinkedList {
        ~this() {
            // called deterministically
            for( Node n = top; n; ) {
                Node t = n.next;
                delete n;
                n = t;
            }
            finalize();
        }

        void finalize() {
            // called by GC
            // nodes may have already been destroyed
            // so leave them alone, but special
            // resources could be reclaimed
        }
    }

The argument against finalizers, as Mike mentioned, is that you typically 
want to reclaim such special resources deterministically, so letting the GC 
take care of this 'someday' is of questionable utility.

Sean
Ok, I think we can tackle this problem in a better way.

So far, people have been thinking about the fact that when destructors are 
called in a GC cycle, they are called with finalizer semantics (i.e., you 
don't know if the member references are valid or not, thus you can't use 
them). This is a problem when, in a destructor, one would like to destroy 
component objects (as the Nodes of the LinkedList example). Some ideas were 
discussed here, but I didn't think any were fruitful. Like:

* Forcing all classes with destructors to be auto classes -> doesn't add any 
usefulness, instead just nuisances.

* Making the GC destroy objects in an order that makes member references 
valid -> has a high performance cost and/or is probably just not possible 
(circular references?).

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an 
object, then the member references are not valid and cannot be referenced, 
*but they can be deleted*. Each will be deleted iff it has not been deleted 
already.

I think this can be done without significant overhead. At the end of a GC 
cycle, the GC already has a list of all objects that are to be deleted. 
Thus, on the release phase, it could be modified to keep a flag indicating 
whether each object has already been deleted or not. Then, when LinkedList 
deletes a Node, the delete is only performed if the object has not already 
been deleted.

Still, while the previous idea might be good, it's not optimal, because we 
are not clearly perceiving the problem/issue at hand. What we *really* want 
is to directly couple the lifecycle of a component (member) object with its 
composite (owner) object. A Node of a LinkedList has the same lifecycle as 
its LinkedList, so a Node shouldn't even be an independent 
garbage-collection-managed element.

What we want is an allocator that allocates memory that is not to be claimed 
by the GC (but which is to be scanned by the GC). Its behavior is exactly 
like the allocator of http://www.digitalmars.com/d/memory.html#newdelete 
but it should come with the language and be available for all types. With 
usage like:

    class LinkedList {
        ...
        Add(Object obj) {
            Node node = mnew Node(blabla);
            ...
        }

Thus, when the destructor is called upon a LinkedList, either explicitly or 
by the GC, the Node references will always be valid. One has to be careful 
now, as mnew'ed objects are effectively under manual memory management, and 
so every mnew must have a corresponding delete, lest there be dangling 
pointers or memory leaks. Nonetheless it seems to be the only sane solution 
to this problem.

Another interesting addition is to extend the concept of auto to class 
members. Just as currently auto couples the lifecycle of a variable to the 
enclosing function, an auto class member would couple the lifecycle of a 
member to its owner object. It would get deleted implicitly when the owner 
object got deleted. Here is another (made up) example:

    class SomeUIWidget {
        auto Color fgcolor;
        auto Color bgcolor;
        auto Size size;
        auto Image image;
        ...

The auto members would then have to be initialized in a constructor or 
something (the exact restrictions might vary, such as being final or not).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
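[Aside: the memory.html page cited above already shows how to approximate 
this per class today ~ malloc'd memory that the GC scans but never reclaims. 
A rough sketch along those lines (not the proposed mnew; exact Phobos names 
may differ):]

    import std.c.stdlib;
    import std.gc;
    import std.outofmemory;

    class Node
    {
        new(size_t sz)
        {
            void* p = std.c.stdlib.malloc(sz);
            if (!p)
                throw new OutOfMemoryException();
            std.gc.addRange(p, p + sz);  // scanned by the GC, never freed by it
            return p;
        }

        delete(void* p)
        {
            if (p)
            {
                std.gc.removeRange(p);
                std.c.stdlib.free(p);
            }
        }

        Node next;
        Object data;
    }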
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 
 Jarrett Billingsley wrote:

 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do. They can't assume any GC managed object they have a reference to is valid, etc. By contrast, destructors can make this assumption, because the object is being destroyed deterministically.

[snip]
Ok, I think we can tackle this problem in a better way.

[snip]

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an object, then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already.

[snip]

What we want is an allocator that allocates memory that is not to be claimed by the GC (but which is to be scanned by the GC). Its behavior is exactly like the allocator of http://www.digitalmars.com/d/memory.html#newdelete but it should come with the language and be available for all types.

[snip]
Regardless of how it's implemented, what's needed is a bit of consistency. 
Currently, dtors are invoked with two entirely different world-states: with 
valid state, and with unspecified state. What makes this generally 
unworkable is the fact that (a) the difference in state is often critical to 
the operation of the dtor, and (b) there's no clean way to tell the 
difference.

I use a bit of a hack to distinguish between the two: a common module has a 
global variable set to true when the enclosing module-dtor is invoked. This 
obviously depends upon module-dtors being first (which they currently are, 
but that is not in the spec). Most of you will probably be going "eww" at 
this point, but it's the only way I found to make dtors consistent and thus 
usable. Further, this is only workable if the dtor() itself can be abandoned 
when in state (b) above; prohibiting the use of dtors for a whole class of 
cleanup concerns, and forcing one to defer to the dispose() or close() 
pattern ~~ some say anti-pattern.

As I understand it, the two states correspond to (1) an explicit 'delete' of 
the object, which includes "auto" usage; and (2) implicit cleanup via the 
GC.

The suggestion to restrict dtors to 'auto' classes is a means to limit them 
to deterministic destruction (but it doesn't address the notion of object 
lifetimes that are not related to scope ~ such as time-based). That would 
need to be addressed somehow?

Turning to your suggestions ~ the 'marking' of references such that they can 
be "deleted" multiple times is perhaps questionable, partly because it 
appears to be specific to the GC implementation? I imagine an incremental 
collector would have problems with this approach, even if it were workable 
with a "stop the world" collector? I don't know for sure, but suspect 
there'd be issues there somewhere.

Whatever the resolution, consistency should be the order of the day.

- Kris
Apr 09 2006
next sibling parent kris <foo bar.com> writes:
kris wrote:
 I use a bit of a hack to distinguish between the two: a common module 
 has a global variable set to true when the enclosing module-dtor is 
 invoked. This obviously depends upon module-dtors being first (which 
 they currently are, but that is not in the spec). Most of you will 
 probably be going "eww" at this point, but it's the only way I found to 
 make dtors consistent and thus usable. Further, this is only workable if 
 the dtor() itself can be abandoned when in state (b) above; prohibiting 
 the use of dtors for a whole class of cleanup concerns, and forcing one 
 to defer to the dispose() or close() pattern ~~ some say anti-pattern.
After reading, that paragraph does not reflect the status-quo at all ...

First, it should have said "used" instead of "use" (past-tense ~ this is not 
applied any more, since dtors have all but been abandoned). Second, the 
identification of "state" was limited to program termination only ~ the 
classes in question were actually collected only at that point. Third, the 
cleanup did not rely on GC managed memory. All in all, that paragraph is 
pretty darned misleading ~~ my bad :-(

The take-home message is that I did not find a general mechanism to 
distinguish between valid-state and unspecified-state for a dtor ~ the 
oft-crucial inconsistency remains in its fully-fledged guise.

The other issue is that I clearly should avoid posting whilst hallucinating. 
Sorry;
Apr 09 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
kris wrote:
 
 
 Regardless of how it's implemented, what's needed is a bit of 
 consistency. Currently, dtors are invoked with two entirely different 
 world-states: with valid state, and with unspecified state. What makes 
 this generally unworkable is the fact that (a) the difference in state 
 is often critical to the operation of the dtor, and (b) there's no clean 
 way to tell the difference.
 
Hum, from what you said follows a rather trivial alternative solution to the 
problem: have the destructor receive an implicit parameter/variable that 
indicates whether it was called explicitly or as a finalizer (i.e., in a GC 
run). (This would be similar in semantics to Sean's suggestion of separating 
the destruction and finalize methods.)

    class LinkedList {
        ~this() {   // called manually/explicitly and automatically
            if(explicit) {
                for( Node n = top; n; ) {
                    Node t = n.next;
                    delete n;
                    n = t;
                }
            }
            // ... finalize here
        }
        ...

Would this be acceptable? How would this compare to other suggestions? I can 
think of a few things to say about it versus my earlier suggestion.
 
 As I understand it, the two states correspond to (1) an explicit 
 'delete' of the object, which includes "auto" usage; and (2) implicit 
 cleanup via the GC.
 
 The suggestion to restrict dtors to 'auto' classes is a means to limit 
 them to deterministic destruction (but it doesn't address the notion 
 of object lifetimes that are not related to scope ~ such as time-based). 
 That would need to be addressed somehow?
 
 Turning to your suggestions ~ the 'marking' of references such that they 
 can be "deleted" multiple times is perhaps questionable, partly because 
 it appears to be specific to the GC implementation? I imagine an 
 incremental collector would have problems with this approach, even if it 
 were workable with a "stop the world" collector? I don't know for sure, 
 but suspect there'd be issues there somewhere.
 
It works for a stop-the-world collector, I'm sure. As for an incremental collector, hum... well, it works if the collector guarantees the following:

* The collector determines a set S of objects to be reclaimed, and no object in S is referenced outside of S.
 Whatever the resolution, consistency should be the order of the day.
 
 - Kris
Manual and automatic memory management are two very different paradigms that are likely impossible or impractical to make "consistent" or to reconcile, at least in the way you are implying. The "only auto classes have destructors" suggestion only makes it "consistent" because it limits the usage of the class to only one paradigm (manual management).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
parent kris <foo bar.com> writes:
Bruno Medeiros wrote:
 kris wrote:
 
 Regardless of how it's implemented, what's needed is a bit of 
 consistency. Currently, dtors are invoked with two entirely different 
 world-states: with valid state, and with unspecified state. What makes 
 this generally unworkable is the fact that (a) the difference in state 
 is often critical to the operation of the dtor, and (b) there's no 
 clean way to tell the difference.
Hum, from what you said follows a rather trivial alternative solution to the problem: have the destructor receive an implicit parameter/variable that indicates whether it was called explicitly or as a finalizer (i.e., in a GC run).

[snip]

Would this be acceptable? How would this compare to other suggestions?
Perhaps it would be better as an optional parameter? This certainly would 
allow for lazy dtors that don't need timely cleanup. Although I can't think 
of any reasonable examples to illustrate with.

However, it clearly exposes the "uneasy" status that a dtor might find 
itself in. For that reason it seems a bit like a hack on top of a queasy 
problem (to me). In cases like these I tend to think it's better to start 
off constrained and deterministic (remove those 'lazy' non-deterministic 
dtor invocations), and then optionally open things up as is deemed 
necessary, or when a resolution to the non-determinism is found.
Apr 10 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's behavior 
 is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:
 
   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }
 
 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. One 
 has to be careful now, as mnew'ed object are effectively under manual 
 memory management, and so every mnew must have a corresponding delete, 
 lest there be dangling pointer ou memory leaks. Nonetheless it seems to 
 be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done 
now without the addition of new keywords by adding two new GC functions: 
release and reclaim (bad names, but they're all I could think of). 'release' 
would tell the GC not to automatically finalize or delete the memory block, 
as you've suggested above, and 'reclaim' would transfer ownership back to 
the GC. It's more error prone than I'd like, but also perhaps the most 
reasonable.

A possible alternative would be for the GC to perform its cleanup in two 
stages. The first sweep runs all finalizers on orphaned objects, and the 
second releases the memory. Thus in Eric's example on d.D.learn, he would be 
able to legally iterate across his AA and close all HANDLEs, because the 
memory would still be valid at that stage.

Assuming there aren't any problems with this latter idea, I think it should 
be implemented as standard behavior for the GC, and the former idea should 
be provided as an option. Thus the user would have complete manual control 
available when needed, but more foolproof basic behavior for simpler 
situations.
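[A sketch of the two-stage idea in pseudocode ~ all names hypothetical:]

    // stage 1: run every finalizer while all orphaned memory is still
    // intact, so a finalizer may safely traverse other orphans
    // (e.g. walk an AA and close each HANDLE)
    foreach (obj; unreachableObjects)
        runFinalizer(obj);

    // stage 2: only now release the memory
    foreach (obj; unreachableObjects)
        releaseMemory(obj);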
 Another interesting addition, is to extend the concept of auto to class 
 members. Just as currently auto couples the lifecycle of a variable to 
 the enclosing function, an auto class member would couple the lifecycle 
 of its member to it's owner object. It would get deleted implicitly when 
 then owner object got deleted. Here is another (made up) example:
 
   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...
 
 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out. Sean
Apr 09 2006
next sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless it 
 seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable. A possible alternative would be for the GC to peform its cleanup in two stages. The first sweep runs all finalizers on orphaned objects, and the second releases the memory. Thus in Eric's example on d.D.learn, he would be able legally iterate across his AA and close all HANDLEs because the memory would still be valid at that stage. Assuming there aren't any problems with this latter idea, I think it should be implemented as standard behavior for the GC, and the former idea should be provided as an option. Thus the user would have complete manual control available when needed, but more foolproof basic behavior for simpler situations.
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get deleted 
 implicitly when then owner object got deleted. Here is another (made 
 up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or 
 not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
If the above case was written as:

    class SomeUIWidget {
        Color fgcolor;
        Color bgcolor;
        Size size;
        Image image;
        ...

and the class didn't have an explicit destructor, then the only "damage" at 
GC (or otherwise destruction) time would be that a couple of Color 
instances, a Size instance and an Image instance would be "left over" after 
that particular GC run.

Big deal? At the next GC run (unless they'd be pointed-to by other things), 
they'd get deleted too. No major flood of tears here.

Somehow I fear folks are making this a way too complicated thing.
Apr 09 2006
parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Georg Wrede wrote:
 Sean Kelly wrote:
 Bruno Medeiros wrote:

 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless 
 it seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable. A possible alternative would be for the GC to peform its cleanup in two stages. The first sweep runs all finalizers on orphaned objects, and the second releases the memory. Thus in Eric's example on d.D.learn, he would be able legally iterate across his AA and close all HANDLEs because the memory would still be valid at that stage. Assuming there aren't any problems with this latter idea, I think it should be implemented as standard behavior for the GC, and the former idea should be provided as an option. Thus the user would have complete manual control available when needed, but more foolproof basic behavior for simpler situations.
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get 
 deleted implicitly when then owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
If the above case was written as [SomeUIWidget without the auto members] and the class didn't have an explicit destructor, then the only "damage" at GC (or otherwise destruction) time would be that a couple of Color instances, a Size instance and an Image instance would be "left over" after that particular GC run. Big deal? At the next GC run (unless they'd be pointed-to by other things), they'd get deleted too.

[snip]
Actually, with any decent GC, all of those objects will be reclaimed on the 
first GC run (and DMD does that). So you are correct that there is no 
difference when running the GC on that object. But you miss the point.

The point (of my suggestions) was to be able to have a destruction system 
that would work "correctly/extensively" both when called by a GC cycle, and 
when called explicitly (outside of a GC cycle). By "correctly/extensively" I 
mean that the destructor would be able in both cases to ensure the 
destruction of its owned resources.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). It's 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed object are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointer ou memory leaks. Nonetheless it 
 seems to be only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable.
Hum, indeed.
 A possible alternative would be for the GC to peform its cleanup in two 
 stages.  The first sweep runs all finalizers on orphaned objects, and 
 the second releases the memory.  Thus in Eric's example on d.D.learn, he 
 would be able legally iterate across his AA and close all HANDLEs 
 because the memory would still be valid at that stage.
 
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects, that are not referenced by anyone?
 Assuming there aren't any problems with this latter idea, I think it 
 should be implemented as standard behavior for the GC, and the former 
 idea should be provided as an option.  Thus the user would have complete 
 manual control available when needed, but more foolproof basic behavior 
 for simpler situations.
 
 Another interesting addition, is to extend the concept of auto to 
 class members. Just as currently auto couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to it's owner object. It would get deleted 
 implicitly when then owner object got deleted. Here is another (made 
 up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized on a constructor or 
 something (the exact restrictions might vary, such as being final or 
 not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out. Sean
Hum, true, it would need some additional bookkeeping; I didn't realize that 
immediately. Semantics like those I mentioned in my previous post would 
suffice:

"When a destructor is called upon an object during a GC (i.e., "as a 
finalizer"), then the member references are not valid and cannot be 
referenced, *but they can be deleted*. Each will be deleted iff it has not 
been deleted already in the reclaiming phase."

I don't think your algorithm (having a hidden pointer) would be necessary 
(or even feasible), and the one I mentioned before would suffice.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
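[A sketch of the bookkeeping this implies, as hypothetical GC internals:]

    // each block slated for reclamation carries a 'finalized' flag
    void deleteDuringCollection(Block* b)
    {
        if (b.finalized)
            return;         // deleted already this cycle: no-op
        b.finalized = true;
        runDtor(b.object);  // memory itself is freed only after the
    }                       // whole release phase completes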
Apr 10 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:

 A possible alternative would be for the GC to perform its cleanup in 
 two stages.  The first sweep runs all finalizers on orphaned objects, 
 and the second releases the memory.  Thus in Eric's example on 
 d.D.learn, he would be able to legally iterate across his AA and close 
 all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects that are not referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion could be used for more complex cases.
 Another interesting addition is to extend the concept of auto to 
 class members. Just as auto currently couples the lifecycle of a 
 variable to the enclosing function, an auto class member would couple 
 the lifecycle of its member to its owner object. It would get 
 deleted implicitly when the owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized in a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize that immediately. Semantics like those that I mentioned in my previous post would suffice: "When a destructor is called upon an object during a GC (i.e., "as a finalizer"), then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already in the reclaiming phase." I don't think your algorithm (having a hidden pointer) would be necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run? In that case, the GC may encounter the member objects before the owner object. I suppose bookkeeping at the member level may not be necessary, but it may result in an extra scan through the list of objects to be finalized to determine who owns what. Sean
Apr 10 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 Sean Kelly wrote:

 A possible alternative would be for the GC to perform its cleanup in 
 two stages.  The first sweep runs all finalizers on orphaned objects, 
 and the second releases the memory.  Thus in Eric's example on 
 d.D.learn, he would be able to legally iterate across his AA and close 
 all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by the GC on that cycle? Or just the subset of those objects that are not referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion could be used for more complex cases.
That way, you have the guarantee that all references are valid, but some instances would have their destructors called multiple times. That's likely a behavior that isn't acceptable in some cases.
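
For instance (a sketch): if a LinkedList's dtor deletes its Nodes, and the GC then runs the finalizer sweep over every object in S anyway, each Node's dtor runs twice:

  class Node {
    Node next;
    ~this() { /* runs once via the explicit delete, then again in the sweep */ }
  }

  class LinkedList {
    Node head;
    ~this() {
      for( Node n = head; n; ) {
        Node t = n.next;
        delete n;   // first dtor call; the GC's own sweep makes the second
        n = t;
      }
    }
  }
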
 Another interesting addition is to extend the concept of auto to 
 class members. Just as auto currently couples the lifecycle of a 
 variable to the enclosing function, an auto class member would 
 couple the lifecycle of its member to its owner object. It would 
 get deleted implicitly when the owner object got deleted. Here is 
 another (made up) example:

   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...

 The auto members would then have to be initialized in a constructor 
 or something (the exact restrictions might vary, such as being final 
 or not).
I like this idea as well, though it may require some additional bookkeeping to accomplish. For example, a GC scan may encounter the members before the owner, so each member may need to contain a hidden pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize that immediately. Semantics like those that I mentioned in my previous post would suffice: "When a destructor is called upon an object during a GC (i.e., "as a finalizer"), then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already in the reclaiming phase." I don't think your algorithm (having a hidden pointer) would be necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run? In that case, the GC may encounter the member objects before the owner object. I suppose bookkeeping at the member level may not be necessary, but it may result in an extra scan through the list of objects to be finalized to determine who owns what. Sean
The bookkeeping is done by the GC and memory pool manager. A scan through the list of objects to be finalized is necessary, but it won't be an _extra_ scan. Let me try to explain this way:

*** The current GC algorithm: ***

delete obj:

   m = getMemManagerHandle(obj);
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     delete m;
   }

*** The extended GC algorithm: ***

delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }

And there we go. No increase in algorithmic complexity. There is only an increase in the Memory Manager record size (we need a flag for m.isDeleted, and we need it only during a GC run). The reason we don't freeMemory(m) right after delete m; is that we need the bookkeeping of m.isDeleted until the end of the GC run. The reason we have m.isGarbage is to allow the deletion of objects not in S during the GC run (it is an optimization of doing "S.contains(m)").

Hope I don't have a bug up there :P

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent reply pragma <pragma_member pathlink.com> writes:
In article <e1m6mo$19d$1 digitaldaemon.com>, Bruno Medeiros says...
*** The extended GC algorithm: ***

delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }
Something like this will help *part* of the problem.  By delaying the freeing of referenced memory, dynamically allocated primitives (like arrays) will continue to function inside of class destructors.

However, this does not help with references to objects and structs, as they may still be placed in an invalid state by their own destructors.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ writefln("resource: %d",a.resource); }
/**/ }

Depending on the ordering in S, the program will output either "resource: 42" or "resource: 0".  The problem only gets worse for object cycles.  I'm not saying it won't work, but it just moves the wrinkle into a different area to be stomped out.

Now, one way to improve this is if there were a standard method on objects that can be checked in situations like these.  That way you'd know if another object is finalized, or in the process of being finalized.
   foreach(m in S) {
     m.isFinalized = true;
     delete m;
   }
Now this doesn't make life any easier, but it does make things deterministic.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }

(another option would be something like gc.isFinalized(a), should the footprint of Object be an issue)

Now B outputs nothing if A is finalized.  That seems like a win, but what if B really needed that value before A went away?  In such a case, you're back to square one: you can't depend on the state of another referenced object within a dtor, valid reference or otherwise.

- EricAnderton at yahoo
Apr 13 2006
parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
pragma wrote:
 In article <e1m6mo$19d$1 digitaldaemon.com>, Bruno Medeiros says...
 *** The extended GC algorithm: ***

 delete:

   m = getMemManagerHandle(obj);
   if(m.isDeleted)
     return;
   m.isDeleted = true; // mark it, so a second delete is a no-op
   if(m.isObjectInstance)
     m.obj.destroy(); // calls ~this()
   if(!m.isGarbage) // If it is not in S
     freeMemory(m);

 GC:

   GC determines a set S of instances to be reclaimed (garbage);
   foreach(m in S) {
     m.isGarbage = true;
   }
   foreach(m in S) {
     delete m;
   }
   foreach(m in S) {
     freeMemory(m);
   }
Something like this will help *part* of the problem.  By delaying the freeing of referenced memory, dynamically allocated primitives (like arrays) will continue to function inside of class destructors.

However, this does not help with references to objects and structs, as they may still be placed in an invalid state by their own destructors.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ writefln("resource: %d",a.resource); }
/**/ }

Depending on the ordering in S, the program will output either "resource: 42" or "resource: 0".  The problem only gets worse for object cycles.  I'm not saying it won't work, but it just moves the wrinkle into a different area to be stomped out.
True, I forgot to mention that. The order of destruction is undefined, so it will only work with objects where that order doesn't matter. (that should be the case with most)
 Now, one way to improve this is if there were a standard method on objects that
 can be checked in situations like these.  That way you'd know if another object
 is finalized, or in the process of being finalized.
 
   foreach(m in S) {
     m.isFinalized = true;
     delete m;
   }
Now this doesn't make life any easier, but it does make things deterministic.

/**/ class A{
/**/   public uint resource;
/**/   public this(){ resource = 42; }
/**/   public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/   public A a;
/**/   public this(){ a = new A(); }
/**/   public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }

(another option would be something like gc.isFinalized(a), should the footprint of Object be an issue)

Now B outputs nothing if A is finalized.  That seems like a win, but what if B really needed that value before A went away?  In such a case, you're back to square one: you can't depend on the state of another referenced object within a dtor, valid reference or otherwise.

- EricAnderton at yahoo
Exactly, you can't really solve the order/state problem with this. I think the only way to do it is to manually memory-manage the member objects (with a construct such as mnew or otherwise).

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 14 2006
prev sibling parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 Bruno Medeiros wrote:
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). Its 
 behavior is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:

   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }

 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. 
 One has to be careful now, as mnew'ed objects are effectively under 
 manual memory management, and so every mnew must have a corresponding 
 delete, lest there be dangling pointers or memory leaks. Nonetheless 
 it seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be done now without the addition of new keywords by adding two new GC functions: release and reclaim (bad names, but they're all I could think of). 'release' would tell the GC not to automatically finalize or delete the memory block, as you've suggested above, and 'reclaim' would transfer ownership back to the GC. It's more error prone than I'd like, but also perhaps the most reasonable.
Hum, indeed.
Then again, with a proper allocator (mnew) there is room for more optimization. I doubt one would want (or that it would be good) to change the management ownership of an instance during its lifetime. Rather, it should be set right from the start (when allocated).

Also, I've realized just now that with templates one can get a pretty close solution, with something like:

  mnew!(Foobar)

The shortcoming is that you won't be able to use non-default constructors in that call.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
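
As a sketch of such a template (untested; it leans on std.gc.addRange/removeRange in the way the memory.html example does, and note that this crude version runs no constructor at all, which is even more restrictive than merely losing non-default constructors):

  import std.c.stdlib;
  import std.gc;

  T mnew(T)() {
    size_t sz = T.classinfo.init.length;
    void* p = std.c.stdlib.malloc(sz);
    if (p is null)
      throw new Exception("mnew: out of memory");
    std.gc.addRange(p, cast(byte*)p + sz);         // scanned, but never collected
    (cast(byte*)p)[0 .. sz] = T.classinfo.init[];  // blit the default instance state
    return cast(T)p;
  }

A matching mdelete would have to run the destructor, call removeRange, and free the block; that part is omitted here.
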
Apr 14 2006
prev sibling next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
Some ideas were discussed here, but I didn't think any were fruitful. Like:
  *Forcing all classes with destructors to be auto classes -> doesn't 
add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Apr 09 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Mike Capp wrote:
 In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
 Some ideas were discussed here, but I didn't think any were fruitful. Like:
  *Forcing all classes with destructors to be auto classes -> doesn't 
 add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Protection attributes and casts add usefulness (not gonna detail why). Forcing all classes with destructors to be auto classes, on the other hand, severely limits the usage of such classes. An auto class cannot be a global, a static, a field, or an inout or out parameter. It must be bound to a function, and *cannot be a part of another data structure*. This latter restriction, as is, is unacceptable, no?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1dak2$21d9$1 digitaldaemon.com>, Bruno Medeiros says...
Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from misusing something. Same with auto and dtors. If a class needs a dtor, leaving it to the GC qualifies as misuse in my view.
Forcing all classes with destructors to be auto classes, on the other 
hand, severely limits the usage of such classes. An auto class cannot 
be a global, a static, a field, or an inout or out parameter. It must be bound to 
a function, and *cannot be a part of another data structure*. This 
latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion, and I think the dtors-for-autos-only restriction would quickly force this problem out into the open. It may be that we're agreeing on the destination and only differing on how to get there. cheers Mike
Apr 10 2006
parent Don Clugston <dac nospam.com.au> writes:
Mike Capp wrote:
 In article <e1dak2$21d9$1 digitaldaemon.com>, Bruno Medeiros says...
 Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from misusing something. Same with auto and dtors. If a class needs a dtor, leaving it to the GC qualifies as misuse in my view.
 Forcing all classes with destructors to be auto classes, on the other 
 hand, severely limits the usage of such classes. An auto class cannot 
 be a global, a static, a field, or an inout or out parameter. It must be bound to 
 a function, and *cannot be a part of another data structure*. This 
 latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion, and I think the dtors-for-autos-only restriction would quickly force this problem out into the open.
I suspect that if finalisers were abolished, those other restrictions would be MUCH easier to lift. They probably exist mainly because of the complexity of the interactions with the GC.
 
 It may be that we're agreeing on the destination and only differing on how to
 get there.
 
 cheers
 Mike
 
 
Apr 10 2006
prev sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 10 Apr 2006 11:05:00 +0100, Bruno Medeiros  
<brunodomedeirosATgmail SPAM.com> wrote:
 Mike Capp wrote:
 In article <e1bj4r$1gt$1 digitaldaemon.com>, Bruno Medeiros says...
 Some ideas were discussed here, but I didn't think any were fruitful.  
 Like:
  *Forcing all classes with destructors to be auto classes -> doesn't  
 add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do they add? Or requiring a cast to assign from one type to another - sheer nuisance! cheers Mike
Protection attributes and casts add usefulness (not gonna detail why). Forcing all classes with destructors to be auto classes, on the other hand, severely limits the usage of such classes. An auto class cannot be a global, a static, a field, or an inout or out parameter. It must be bound to a function, and *cannot be a part of another data structure*. This latter restriction, as is, is unacceptable, no?
The suggestion I made assumed we could remove these restrictions. I'm not sure whether that's true or not; if it is impossible, can someone explain to me why? I would be curious to know. It seems that if C++ can have classes at module/file scope that have destructors, why can't D? I have a feeling it has something to do with how Walter has implemented it.. but that _could_ change, if the reasons were strong enough, right?

If we assume (for the purposes of exploring the solution) that the restrictions can be removed, doesn't this idea make a lot of sense?

1. any class/module with a dtor must be 'auto'.
2. any class/module containing a reference to an 'auto' class/module must be 'auto'.
3. the 'auto' keyword as used here: "auto Foo f = new Foo();" is not required; remove it.
4. a 'shared' keyword is used to indicate a shared 'auto' resource.

Rationale: if a class has cleanup to do, it must be done all the time, not just sometimes and not selectively*. Therefore any class with cleanup to do is 'auto', and any class containing an 'auto' member also has cleanup to do, thus must be 'auto'.

(*) the exception to this rule is a member reference to a shared resource, thus the 'shared' keyword.

The compiler can auto-generate dtors for classes containing 'auto' members, eg.

  auto class File {
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  auto class Foo {
    File f;
    // auto-generated dtor:
    ~this() { delete f; }
  }

If the user supplies a dtor, the compiler can simply append its auto-dtor to the end of that (I don't think deleting a reference twice is a problem). In this way 'auto' propagates itself as required. (In fact, if you think about it, the keyword 'auto' isn't really even required. It can be removed and the behaviour outlined above can simply be implemented.)

The 'shared' keyword would prevent the automatic dtor from calling delete on the shared reference. If that reference was the only 'auto' member, it would therefore prevent the class from being 'auto' itself. The user would have to manage the shared resource manually, or rather, can rely on it being deleted by the (one and only) non-shared reference to it, eg.

  [file.d]
  File a = new File("a.txt");

  class Foo {
    shared File foo;
    this(File f) { foo = f; }
  }

  void main() {
    Foo f = new Foo(a);
  }

The class 'Foo' is not auto: it has no dtor, the compiler does not generate one, and its shared reference 'foo' is never deleted. The module-level reference 'a' is auto; an auto-generated module-level dtor will delete it.

The classes affected by this idea are few, I'd say less than 20% (even with 'auto' propagating up the class tree); the rest will have no dtor and will simply be collected as normal by the GC, no dtor calls required.

As far as I can see there are no restrictions of use for this idea. Classes will be the same as they are today, only they'll have deterministic destruction where required. Assuming, of course, it can actually be implemented.

Regan
Apr 10 2006
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Bruno Medeiros wrote:
 Sean Kelly wrote:
 
 Jarrett Billingsley wrote:

 "Sean Kelly" <sean f4.ca> wrote in message 
 news:e10pk7$2khb$1 digitaldaemon.com...

     - a type can have a destructor and/or a finalizer
     - the destructor is called upon a) explicit delete or b) at end 
 of scope for auto objects
     - the finalizer is called if allocated on the gc heap and the
       destructor has not been called
Would you mind explaining why exactly there needs to be a difference between destructors and finalizers? I've been following all the arguments about this heap vs. auto classes and dtors vs. finalizers, and I still can't figure out why destructors _can't be the finalizers_. Do finalizers do something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are very limited in what they can do.  They can't assume any GC managed object they have a reference to is valid, etc.  By contrast, destructors can make this assumption, because the object is being destroyed deterministically.  I think having both may be too confusing to be worthwhile, but it would allow for things like this:

  class LinkedList {
    ~this() { // called deterministically
      for( Node n = top; n; ) {
        Node t = n.next;
        delete n;
        n = t;
      }
      finalize();
    }

    void finalize() { // called by GC
      // nodes may have already been destroyed
      // so leave them alone, but special
      // resources could be reclaimed
    }
  }

The argument against finalizers, as Mike mentioned, is that you typically want to reclaim such special resources deterministically, so letting the GC take care of this 'someday' is of questionable utility.

Sean
Ok, I think we can tackle this problem in a better way. So far, people have been thinking about the fact that when destructors are called in a GC cycle, they are called with finalizer semantics (i.e., you don't know if the member references are valid or not, thus you can't use them). This is a problem when, in a destructor, one would like to destroy component objects (as the Nodes of the LinkedList example).

Some ideas were discussed here, but I didn't think any were fruitful. Like:

 *Forcing all classes with destructors to be auto classes -> doesn't add any usefulness, instead just nuisances.

 *Making the GC destroy objects in an order that makes member references valid -> has a high performance cost and/or is probably just not possible (circular references?).

Perhaps another way would be to have the following behavior:

- When a destructor is called during a GC (i.e., "as a finalizer") for an object, then the member references are not valid and cannot be referenced, *but they can be deleted*. Each will be deleted iff it has not been deleted already.

I think this can be done without significant overhead. At the end of a GC cycle, the GC already has a list of all objects that are to be deleted. Thus, on the release phase, it could be modified to have a flag indicating whether the object was already deleted or not. Thus, when LinkedList deletes a Node, the delete is only made if the Node has not already been deleted.
If an instance is deleted by the GC, the pointers that it may have to other instances (of the same class, or of other classes) vanish. All of those other instances may or may not have other pointers pointing to them. So, deleting (or destructing) a particular instance should not in any way "cascade" to those other instances. On the next run, the GC _may_ notice that those other instances are not pointed to by anything anymore, and then it may delete/destruct them.

---

So much for "regular" instance deletion. Then we have the case where the instance "owns" some scarce resource (a file handle, a port, or some such). Such instances should be destructed in a _timely_ fashion _only_, right?

In other words, instances that need explicit destruction should be destructed _at_the_moment_ they become obsolete -- and not "mañana".

It is conceivable that the "regular" instances do not have explicit destructors (after all, their memory footprint would just be released to the free pool), whereas the "resource owning" instances really do need an explicit destructor.

Thus, the existence of an explicit destructor should be a sign that makes [us, Walter, the compiler, anybody] understand that such an instance _needs_ to be destructed _right_away_.

This makes one think of "auto". Now, there have been several comments like /auto can't work/ because we don't know the scope of the instance. That is just BS. Every piece of source code should be written "hierarchically" (that is, not the entire program as "one function"). When one refactors the goings-on in the program to short procedures, then it all of a sudden is not too difficult to use "auto" to manage the lifetime of instances.
 Still, while the previous idea might be good, it's not optimal, 
 because we are not clearly apperceiving the problem/issue at hand. What 
 we *really* want is to directly couple the lifecycle of a component 
 (member) object with its composite (owner) object. A Node of a 
 LinkedList has the same lifecycle as its LinkedList, so Node shouldn't 
 even be an independently Garbage-Collection-managed element.
 
 What we want is an allocator that allocates memory that is not to be 
 claimed by the GC (but which is to be scanned by the GC). Its behavior 
 is exactly like the allocator of 
 http://www.digitalmars.com/d/memory.html#newdelete but it should come 
 with the language and be available for all types. With usage like:
 
   class LinkedList {
     ...
     Add(Object obj) {
       Node node = mnew Node(blabla);
       ...
     }
 
 Thus, when the destructor is called upon a LinkedList, either 
 explicitly, or by the GC, the Node references will always be valid. One 
 has to be careful now, as mnew'ed objects are effectively under manual 
 memory management, and so every mnew must have a corresponding delete, 
 lest there be dangling pointers or memory leaks. Nonetheless it seems to 
 be the only sane solution to this problem.
 
 
 Another interesting addition is to extend the concept of auto to class 
 members. Just as auto currently couples the lifecycle of a variable to 
 the enclosing function, an auto class member would couple the lifecycle 
 of its member to its owner object. It would get deleted implicitly when 
 the owner object got deleted. Here is another (made up) example:
 
   class SomeUIWidget {
     auto Color fgcolor;
     auto Color bgcolor;
     auto Size size;
     auto Image image;
     ...
 
 The auto members would then have to be initialized in a constructor or 
 something (the exact restrictions might vary, such as being final or not).
 
 
Apr 09 2006
parent reply kris <foo bar.com> writes:
Georg Wrede wrote:
[snip]

 So much for "regular" instance deletion. Then, we have the case where 
 the instance "owns" some scarce resource (a file handle, a port, or some 
 such). Such instances should be destructed in a _timely_ fashion _only_, 
 right?
 
 In other words, instances that need explicit destruction, should be 
 destructed _at_the_moment_ they become obsolete -- and not "mañana".
 
 It is conceivable that the "regular" instances do not have explicit 
 destructors (after all, their memory footprint would just be released to 
 the free pool), wherease the "resource owning" instances really do need 
 an explicit destructor.
 
 Thus, the existence of an explicit destructor should be a sign that 
 makes [us, Walter, the compiler, anybody] understand that such an 
 instance _needs_ to be destructed _right_away_.
 
 This makes one think of "auto". Now, there have been several comments 
 like /auto can't work/ because we don't know the scope of the instance. 
 That is just BS. Every piece of source code should be written 
 "hierarchically" (that is, not the entire program as "one function"). 
 When one refactors the goings-on in the program to short procedures, 
 then it all of a sudden is not too difficult to use "auto" to manage the 
 lifetime of instances.
That was all sounding reasonable up until this point :)

I think we can safely put aside the entire-program-as-one-function as unrealistic. Given that, and assuming the existence of a dtor implies "auto" (and thus raii), how does one manage a "pool" of resources? For example, how about a pool of DB connections? Let's assume that they need to be correctly closed at some point, and that the pool is likely to expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive with scoped raii? Assuming it doesn't, then one would presumably revert to a manual dispose() pattern with such things?
Apr 09 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existence of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope.

cheers
Mike
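
Something like this, roughly (a sketch only; the names, the Connection class holding the actual DB handle, and the growth policy are all assumed, and it glosses over the current auto-class restrictions discussed elsewhere in this thread):

  auto class Connection {
    ~this() { /* close the real DB handle here */ }
  }

  auto class ConnectionPool {
    private Connection[] idle;

    Connection acquire() {
      if (idle.length == 0)
        return new Connection();        // grow the pool on demand
      Connection c = idle[idle.length - 1];
      idle.length = idle.length - 1;
      return c;
    }

    void surrender(Connection c) {
      idle ~= c;                        // back into the pool; nothing is closed
    }

    ~this() {                           // end of main(): really close them all
      foreach (c; idle)
        delete c;
    }
  }

  auto class ConnectionUsage {
    private ConnectionPool pool;
    Connection conn;

    this(ConnectionPool p) { pool = p; conn = p.acquire(); }
    ~this() { pool.surrender(conn); }   // borrowed, so just hand it back
  }

Note that ConnectionUsage's dtor does not close anything; only ConnectionPool's dtor (or its culling logic) actually deletes connections.
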
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
 
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existence of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumeably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks!

So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).

What I'm getting at here is a potential complexity in the implementation of pool-style designs. Perhaps not a big deal, but something to be learned anyway? And it retains a need for the dispose() pattern?

I /think/ I prefer the simplicity of removing dtor invocation from the GC instead (see post "GC and dtors ~ a different approach?"). How about you?
Apr 09 2006
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 Mike Capp wrote:
 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function as  
 unrealistic. Given that, and assuming the existence of a dtor implies  
 "auto" (and thus raii), how does one manage a "pool" of resources? For  
 example, how about a pool of DB connections? Let's assume that they  
 need to be correctly closed at some point, and that the pool is likely  
 to expand and contract based upon demand over time ...

 So the question is how do those connections, and the pool itself, jive  
 with scoped raii? Assuming it doesn't, then one would presumably  
 revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Apr 09 2006
parent reply kris <foo bar.com> writes:
Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 
 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function 
 as  unrealistic. Given that, and assuming the existence of a dtor 
 implies  "auto" (and thus raii), how does one manage a "pool" of 
 resources? For  example, how about a pool of DB connections? Let's 
 assume that they  need to be correctly closed at some point, and 
 that the pool is likely  to expand and contract based upon demand 
 over time ...

 So the question is how do those connections, and the pool itself, 
 jive  with scoped raii? Assuming it doesn't, then one would 
 presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
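
In outline, the change amounts to no more than this (pseudocode in the style of the algorithms earlier in the thread):

  GC (normal collection):

    GC determines a set S of instances to be reclaimed (garbage);
    foreach(m in S) {
      freeMemory(m);   // note: no dtor call here
    }

  delete obj / raii end-of-scope:

    obj.~this();       // dtors run only on these deterministic paths
    freeMemory(obj);
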
Apr 09 2006
next sibling parent reply Dave <Dave_member pathlink.com> writes:
In article <e1cfpo$100u$1 digitaldaemon.com>, kris says...
Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:
 
 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function 
as  unrealistic. Given that, and assuming the existence of a dtor 
 implies  "auto" (and thus raii), how does one manage a "pool" of 
 resources? For  example, how about a pool of DB connections? Let's 
 assume that they  need to be correctly closed at some point, and 
 that the pool is likely  to expand and contract based upon demand 
 over time ...

 So the question is how do those connections, and the pool itself, 
 jive  with scoped raii? Assuming it doesn't, then one would 
presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of people defining ~this() and it (inadvertently) never getting called, even at program exit?

Hmmm, if that's so, I'd add one thing -- how about something like a "fullCollect(bool finalize = false)" that would be called with 'true' at the end of dmain(), and could be explicitly called by the programmer? That could run into the problem of dtors invoked in an invalid state, but at least then it would still be deterministic (either the program ending normally or the programmer calling fullCollect(true)).

BTW - I must have missed it, but what would be an example of a dtor called in an invalid state?

Thanks,

- Dave
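
As a sketch, the shape of that hook (hypothetical; today's std.gc.fullCollect takes no arguments):

  // hypothetical addition to std.gc:
  void fullCollect(bool finalize = false);

  // run automatically at the end of dmain(), or explicitly by the programmer:
  std.gc.fullCollect(true);   // collect *and* run the remaining dtors
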
Apr 09 2006
parent kris <foo bar.com> writes:
Dave wrote:
 In article <e1cfpo$100u$1 digitaldaemon.com>, kris says...
 
Regan Heath wrote:

On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:


Mike Capp wrote:


In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...


I think we can safely put aside the entire-program-as-one-function 
as  unrealistic. Given that, and assuming the existence of a dtor 
implies  "auto" (and thus raii), how does one manage a "pool" of 
resources? For  example, how about a pool of DB connections? Let's 
assume that they  need to be correctly closed at some point, and 
that the pool is likely  to expand and contract based upon demand 
over time ...

So the question is how do those connections, and the pool itself, 
jive  with scoped raii? Assuming it doesn't, then one would 
presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour; only the GC is modified to remove the dtor call during a normal collection cycle. Invoking delete or raii just works as always, yet the invalid dtor state is eliminated. It also eliminates the need for a dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of people defining ~this() and it (inadvertently) never gets called, even at program exit? Hmmm if that's so, I'd add one thing -- how about something like a "fullCollect(bool finalize = false)" that would be called with 'true' at the end of dmain(), and could be explicitly called by the programmer? That could run into the problem of dtors invoked in an invalid state, but at least then it would still be deterministic (either the program ending normally or the programmer calling fullCollect(true)). BTW - I must have missed it, but what would be an example of a dtor called in an invalid state? Thanks, - Dave
See post entitled "GC & dtors ~ a different approach" at 6:17pm ?
Apr 09 2006
prev sibling next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:
 On Sun, 09 Apr 2006 18:34:47 -0700, kris <foo bar.com> wrote:

 Mike Capp wrote:

 In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...

 I think we can safely put aside the entire-program-as-one-function  
 as  unrealistic. Given that, and assuming the existance of a dtor  
 implies  "auto" (and thus raii), how does one manage a "pool" of  
 resources? For  example, how about a pool of DB connections? Let's  
 assume that they  need to be correctly closed at some point, and  
 that the pool is likely  to expand and contract based upon demand  
 over time ...

 So the question is how do those connections, and the pool itself,  
 jive  with scoped raii? Assuming it doesn't, then one would  
 presumably  revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
Thanks! So, when culling the pool (say, on a timeout basis) the cleanup-code for the held resource is not held within the "borrowed" dtor, but in a dispose() method? Otherwise, said dtor would imply raii for the borrowed connection, which would be bogus behaviour for a class instance that is being held onto by the pool? In other words: you'd want to avoid deleting (via raii) the connection object, so you'd have to be careful to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

  auto class Connection {   // auto required in order to have a dtor
    HANDLE h;
    ~this() { CloseHandle(h); }
  }

  class ConnectionUsage {
    shared Connection c;
  }

ConnectionUsage is not required to be 'auto' because it has no 'auto' class members which are not 'shared' resources.

Alternately, you implement reference counting for the Connection class, remove 'shared', and add 'auto' to ConnectionUsage.

Regan
Yes ~ that's true. On the other hand, all these concerns would melt away if the GC were changed to not invoke the dtor (see related post). The beauty of that approach is that there are no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc., and if it takes adding a keyword and/or new compiler behaviour, it's a small price to pay IMO.
 only the GC is modified to remove the dtor call during a normal  
 collection cycle. Invoking delete or raii just works as always, yet the  
 invalid dtor state is eliminated. It also eliminates the need for a  
 dispose() pattern, which would be nice ;-)
At least this idea stops people doing things they shouldn't in dtors.

What I think we need to do is come up with several concrete use-cases (actual code) which use resources that need to be released, and explore how each suggestion would affect that code. For example, I'm still not convinced the linked-list use-case mentioned here several times requires any explicit cleanup code; isn't it all just memory to be freed by the GC? Can someone post a code example and explain why it does, please.

It seems to me that as modules already have ctors/dtors, my suggestion can simply treat a module like a class, i.e. automatically adding a dtor (or appending to an existing dtor) which deletes the (non-shared) auto class instances at module level.

Regan
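
For instance, treating the module like a class might mean the compiler appending something like this (a sketch; File is the auto class from the earlier examples, and the dtor shown is the one the compiler would generate):

  // file.d
  File a;                        // module-level 'auto' instance

  static this()  { a = new File("a.txt"); }

  // compiler-appended module dtor (module ctors/dtors already exist):
  static ~this() { delete a; }   // non-shared auto instances deleted here
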
Apr 09 2006
next sibling parent reply kris <foo bar.com> writes:
Regan Heath wrote:
 On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 On the other hand, all these concerns would melt away if the GC were  
 changed to not invoke the dtor (see related post). The beauty of that  
 approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc and if it takes adding a keyword and/or new compiler behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a different approach" ? I just hacked up the collector in Ares to do what is described in that post. The quick hack doesn't do the leak-detection part, but the rest of it works fine (there may well be cases I've overlooked but the obvious ones, 'delete' and raii, now invoke the dtor whereas normal collection does not).
Apr 09 2006
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 21:18:45 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:
 On Sun, 09 Apr 2006 19:27:09 -0700, kris <foo bar.com> wrote:
 On the other hand, all these concerns would melt away if the GC were   
 changed to not invoke the dtor (see related post). The beauty of that   
 approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks. I'd like to think we can come up with a solution which prevents them, or at least makes them less likely. It would be a big step up over C++ etc and if it takes adding a keyword and/or new compiler behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a different approach" ?
I have. Here is what you say WRT leaks:
 What about implicit cleanup? In this scenario, it doesn't happen. If you  
 don't explicitly (via delete or via raii) delete an object, the dtor is  
 not invoked. This applies the notion that it's better to have a leak  
 than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution.

In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).

Regan
Apr 09 2006
parent reply kris <foo bar.com> writes:
Regan Heath wrote:

 I have. Here is what you say WRT leaks:
 
 What about implicit cleanup? In this scenario, it doesn't happen. If 
 you  don't explicitly (via delete or via raii) delete an object, the 
 dtor is  not invoked. This applies the notion that it's better to have 
 a leak  than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution. In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).
I thought the idea was that classes with dtors are /intended/ to be explicitly cleaned up? That, implicit cleanup of resources (manana, some time) was actually a negative aspect? At least, that's what Mike was suggesting, and it seemed like a really good idea. Along those lines, what I was suggesting is to enable dtors for explicit cleanup only. Plus an optional runtime leak detector. I guess I like the simplicity of that. What you suggest seems workable too, but perhaps a little more involved?
Apr 09 2006
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 09 Apr 2006 22:21:39 -0700, kris <foo bar.com> wrote:
 Regan Heath wrote:

 I have. Here is what you say WRT leaks:

 What about implicit cleanup? In this scenario, it doesn't happen. If  
 you  don't explicitly (via delete or via raii) delete an >object, the  
 dtor is  not invoked. This applies the notion that it's better to have  
 a leak  than a dead program. The leak is a bug >to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as required, dtors are added, and delete is called automatically where required, resulting in no leaks. The best part is that the compiler enforces that by default, and you have to opt out with 'shared' to introduce a leak. So, assuming it's workable (Walter's call) and it's not too inflexible, I think it's a better solution. In short, I would rather not have to explicitly manage the resources if at all possible (and I still hope it might be).
I thought the idea was that classes with dtors are /intended/ to be explicitly cleaned up?
Not my idea ;) I think any given resource has a correct time/place for cleanup; we just need a way to specify that, ideally one that does so while avoiding as much human error as possible (AKA resource leaks).
 That, implicit cleanup of resources (manana, some time) was actually a  
 negative aspect? At least, that's what Mike was suggesting, and it  
 seemed like a really good idea.
It's certainly a simple solution to the problem, and it may be that it's also the best; more use-cases will convince me (at least) one way or the other.
 Along those lines, what I was suggesting is to enable dtors for explicit  
 cleanup only. Plus an optional runtime leak detector. I guess I like the  
 simplicity of that. What you suggest seems workable too, but perhaps a  
 little more involved?
It's certainly more involved. It can't be done without changes to the compiler, but once those are in place it can guarantee resources are cleaned up and it can guarantee no leaks occur (assuming I'm not missing something obvious). The price paid for that is some flexibility (perhaps, perhaps not - I want more use-cases to try it with), but I reckon the price is worth the benefit.

Regan
Apr 09 2006
prev sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...
I thought the idea was that classes with dtors are /intended/ to be 
explicitly cleaned up? That, implicit cleanup of resources (manana, some 
time) was actually a negative aspect? At least, that's what Mike was 
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches. cheers Mike
Apr 10 2006
parent reply kris <foo bar.com> writes:
Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...
 
I thought the idea was that classes with dtors are /intended/ to be 
explicitly cleaned up? That, implicit cleanup of resources (manana, some 
time) was actually a negative aspect? At least, that's what Mike was 
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
Apr 10 2006
parent reply Don Clugston <dac nospam.com.au> writes:
kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what Mike 
 was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
Apr 10 2006
next sibling parent Mike Capp <mike.capp gmail.com> writes:
In article <e1dfmc$29r2$1 digitaldaemon.com>, Don Clugston says...
kris wrote:
 Mike Capp wrote:
 Um... can we avoid using "implicit" and "explicit" in this context? 
Yeah, I see the murk. What would you prefer to call them?
deterministic and non-deterministic.
Yes. Which pretty much correspond to "important" and "don't care". cheers Mike
Apr 10 2006
prev sibling next sibling parent kris <foo bar.com> writes:
Don Clugston wrote:
 kris wrote:
 
 Mike Capp wrote:

 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
Thank you;
Apr 10 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Don Clugston wrote:
 kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because *currently* explicit destruction is deterministic, and implicit destruction is non-deterministic), the fact of whether the destructor was called deterministically or non-deterministically is not in itself relevant to this issue. What is relevant is the state of the object to be destroyed (in defined or undefined state). Nor is implicit destruction/collection inherently non-deterministic and vice-versa. (even if systems that operated this way would be impractical) So far, I'm keeping the terms "implicit" and "explicit", as they seem adequate to me and I don't find at all that RAII collection is "implicit" or "without writing any code". -- Bruno Medeiros - CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 13 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Bruno Medeiros wrote:
 Don Clugston wrote:
 kris wrote:
 Mike Capp wrote:
 In article <e1cq0t$1fm5$1 digitaldaemon.com>, kris says...

 I thought the idea was that classes with dtors are /intended/ to be 
 explicitly cleaned up? That, implicit cleanup of resources (manana, 
 some time) was actually a negative aspect? At least, that's what 
 Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit" to me means "without writing any code", which covers both RAII and GC cleanup (if you're lucky). "Explicit" to me means manual calls to dtors or dispose(), which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The distinction being made there was whether the dtor was initiated via delete/auto, versus normal collection by the GC (where the latter was referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because *currently* explicit destruction is deterministic, and implicit destruction is non-deterministic), the fact of whether the destructor was called deterministically or non-deterministically is not in itself relevant to this issue.
I'm not sure that this is correct, see below.
 What is relevant is the state of the object to 
 be destroyed (in defined or undefined state).
 Nor is implicit destruction/collection inherently non-deterministic and 
 vice-versa.  (even if systems that operated this way would be unpractical)
Yes, you're right, a finaliser could be invoked immediately whenever the last reference goes out of scope. But I think (not sure) that the issues with finalisers would disappear if they were deterministic in this manner. At least, I'm confident that non-deterministic scope-based destructors would suffer from the same problems that finalisers do.
 So far, I'm keeping the terms "implicit" and "explicit", as they seems 
 adequate to me and I don't find at all that RAII collection is 
 "implicit" or "without writing any code".
However, RAII has been contrasted with "explicit" memory management for a very long time. "Explicit" has a firmly established meaning of 'new' and 'delete'; it's very confusing to use these terms to mean something entirely different. (If, however, the distinction is between "gc" and "non-gc", let's call a spade a spade.) On this topic -- there's an interesting thread on comp.c++ by Andrei Alexandrescu about gc and RAII. Among other things, he argues that finalisers are a flawed concept that shouldn't be included. (BTW, he seems to be getting *very* interested in D -- he now has a link to the D spec on his website, for example -- so his opinions are worth examining).
Apr 18 2006
parent reply Sean Kelly <sean f4.ca> writes:
Don Clugston wrote:
 
 On this topic -- there's an interesting thread on comp.c++ by Andrei 
 Alexandrescu about gc and RAII. Among other things, he argues that 
 finalisers are a flawed concept that shouldn't be included. (BTW, he 
 seems to be getting *very* interested in D -- he now has a link to the D 
 spec on his website, for example -- so his opinions are worth examining).
This seems in line with some of the other ideas discussed in this thread, and with what I'm trying out with this latest release of Ares. The idea is that the runtime code will be aware of how an object is being destroyed, be it by the GC or by some other means. Currently, that's as far as it goes unless you want to modify the finalizer function and rebuild the runtime, but the next release will include a hookable callback in the standard library similar to onAssertError. This will allow the user to decide upon which behavior is most appropriate, and to do so on a per-class basis as I am planning to pass either the original class pointer or simply a ClassInfo object. For a debug build it may be appropriate to report the error and terminate (say via an assert) while some release applications may want to be a bit more lenient. This does impose a restriction on standard library code however, as it must behave as if non-deterministic finalization is always illegal. This isn't terribly difficult to accomplish, but it's something to be aware of. Sean
Apr 18 2006
parent reply Mike Capp <mike.capp gmail.com> writes:
In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
the next release [of Ares] will include a 
hookable callback in the standard library similar to onAssertError. 
This will allow the user to decide upon which behavior is most 
appropriate, and to do so on a per-class basis as I am planning to pass 
either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way? cheers Mike
Apr 18 2006
parent reply Sean Kelly <sean f4.ca> writes:
Mike Capp wrote:
 In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
 the next release [of Ares] will include a 
 hookable callback in the standard library similar to onAssertError. 
 This will allow the user to decide upon which behavior is most 
 appropriate, and to do so on a per-class basis as I am planning to pass 
 either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way?
I'm not entirely sure it would be possible to catch every instance of this at compile-time. That aside, I very much want to avoid anything requiring compiler changes unless Walter is the one to implement them, and really to avoid any fundamental changes in application behavior without Walter's approval. This is one reason I've chosen to add this feature via a hookable callback that defaults to existing behavior (ie. to ignore the problem and continue). The other being that I'm not convinced such errors always warrant termination, particularly for release builds. To clarify, I've added two callbacks and a user-callable function to my local build:

     void setCollectHandler( collectHandlerType h );
     extern (C) void onCollectResource( ClassInfo info );

onCollectResource is called whenever the GC collects an object that has a dtor, and if no user-supplied handler is provided then the call is a no-op. I may yet replace the ClassInfo object with an Object reference, but haven't decided whether doing so offers much over the current version.

     extern (C) void onFinalizeError( ClassInfo c, Exception e );

onFinalizeError is called whenever an Exception is thrown from an object dtor and will effectively terminate the application with a message. This is accomplished by wrapping the passed exception in a new system-level exception object and re-throwing. Things get a bit weird if e is an OutOfMemoryException, but that's a possibility I'm ignoring for now. Sean
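As a rough usage sketch (assuming collectHandlerType is something along the lines of 'void function( ClassInfo )' -- the exact alias is still up in the air):

     extern (C) int printf( char*, ... );

     void myCollectHandler( ClassInfo info )
     {
         // report the class whose dtor is about to run
         // non-deterministically; a debug build might assert instead
         printf( "collecting %.*s with an unrun dtor\n", info.name );
     }

     static this()
     {
         setCollectHandler( &myCollectHandler );
     }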
Apr 18 2006
parent Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 Mike Capp wrote:
 In article <e2378s$gpn$1 digitaldaemon.com>, Sean Kelly says...
 the next release [of Ares] will include a hookable callback in the 
 standard library similar to onAssertError. This will allow the user 
 to decide upon which behavior is most appropriate, and to do so on a 
 per-class basis as I am planning to pass either the original class 
 pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there any benefit to catching this error at runtime rather than compile time? Or is it just that it's easier to try out this way?
As per Kris' suggestion, the (future) behavior of onCollectResource in Ares has changed slightly. The call now has the following format:

     extern (C) bool onCollectResource( Object obj );

Default behavior is as before--to silently clean up the object and continue. However, if the user has supplied a cleanup handler and it returns 'false' then the object's dtors will not be called. Instead, the user code is expected to have cleaned things up another way. Thus the user has a selection of options to choose from, in order of complexity:

     * Report the error and continue, returning 'true'.
     * Report the error and terminate the application.
     * Clean up the object's resources by some other means and return
       'false'.

The final option is to allow the user to write dtors that always assume referenced objects are valid while allowing execution to continue if such objects are encountered by the garbage collector (currently, dereferencing a GCed object in a dtor may cause an access violation if the referenced object has already been cleaned up). I'll admit that this last option provides a lot more rope than seems prudent, but it also makes for some interesting possibilities and I'm curious to see how things work out :-) Sean
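By way of example, a handler exercising the third option might look something like this ('MyConnection' and its close() are invented for the sake of the sketch; registration is via setCollectHandler as before):

     class MyConnection
     {
         void close() { /* release the underlying handle */ }
     }

     bool myCollectHandler( Object obj )
     {
         if ( auto c = cast(MyConnection) obj )
         {
             c.close();      // release the resource ourselves ...
             return false;   // ... and tell the GC to skip the dtors
         }
         return true;        // everything else: clean up as usual
     }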
Apr 18 2006
prev sibling parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Regan Heath wrote:
 
 What I think we need to do is come up with several concrete use-cases 
 (actual code) which use resources which need to be released and explore 
 how each suggestion would affect that code, for example I'm still not 
 convinced the linklist use-case mentioned here several times requires any 
 explicit cleanup code, isn't it all just memory to be freed by the GC? 
 Can someone post a code example and explain why it does please.
 
 
 Regan
See my reply to Georg: news://news.digitalmars.com:119/e1dg8t$2akn$2 digitaldaemon.com -- Bruno Medeiros - CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 
 On the other hand, all these concerns would melt away if the GC were 
 changed to not invoke the dtor (see related post). The beauty of that 
 approach is that there's no additional keywords or compiler behaviour; 
 only the GC is modified to remove the dtor call during a normal 
 collection cycle. Invoking delete or raii just works as always, yet the 
 invalid dtor state is eliminated. It also eliminates the need for a 
 dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've not tried it) as follows:

     Object o = new MyObject;
     gc_setFinalizer( o, null );

Sean
Apr 10 2006
parent kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 On the other hand, all these concerns would melt away if the GC were 
 changed to not invoke the dtor (see related post). The beauty of that 
 approach is that there's no additional keywords or compiler behaviour; 
 only the GC is modified to remove the dtor call during a normal 
 collection cycle. Invoking delete or raii just works as always, yet 
 the invalid dtor state is eliminated. It also eliminates the need for 
 a dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've not tried it) as follows:

     Object o = new MyObject;
     gc_setFinalizer( o, null );
Nearly, but not quite the same. This certainly disables the dtor for the given object, but if you forget to do it, your dtor will be called with an 'unspecified' (what Don called non-deterministic) state. Plus, there's no option for capturing leaks.

I believe it's far better to stop the GC from invoking the dtor in those cases where the state is unspecified: the system would become fully deterministic, the need for a dispose() pattern goes away ('delete'/raii takes over), expensive resources that should be released quickly are always treated in that manner (consistently) or treated as leaks otherwise, and the GC runs a little faster.

There's the edge-case whereby someone wants a dtor to be invoked lazily by the collector, at some point in the future. That puts us back into the non-deterministic dtor state, and is a model that Mike was suggesting should be removed anyway (because classes that need to release something should do so as quickly as possible). I fully agree with Mike on this aspect, but wonder whether a simple implementation might suffice instead (GC change only)?

Essentially what I'm suggesting is adding this to the documentation: "a class dtor is invoked via the use of 'delete' or raii only. This guarantees that (a) classes holding external or otherwise "expensive" resources will release them in a timely manner, (b) the dtor will be invoked with a fully deterministic state ~ all memory references held by a class instance will be valid when the dtor is invoked, and (c) there's no need for redundant cleanup-patterns such as dispose()"

- Kris
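To make that documentation entry concrete, a small sketch (names mine) of the only two paths that would run a dtor under this rule:

     class File
     {
         ~this() { /* close the OS handle */ }
     }

     void example()
     {
         auto File f = new File;   // raii: dtor runs at end of scope

         File g = new File;
         delete g;                 // explicit: dtor runs right here

         File leaked = new File;   // neither: the GC reclaims the
     }                             // memory, but the dtor never runs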
Apr 10 2006
prev sibling parent Dave <Dave_member pathlink.com> writes:
In article <e1c9tc$p14$1 digitaldaemon.com>, Mike Capp says...
In article <e1c6vl$moj$1 digitaldaemon.com>, kris says...
I think we can safely put aside the entire-program-as-one-function as 
unrealistic. Given that, and assuming the existance of a dtor implies 
"auto" (and thus raii), how does one manage a "pool" of resources? For 
example, how about a pool of DB connections? Let's assume that they need 
to be correctly closed at some point, and that the pool is likely to 
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive 
with scoped raii? Assuming it doesn't, then one would presumeably revert 
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(), and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts as a factory for ConnectionUsage instances (modulo language limitations) and adds to the pool as needed; ConnectionUsage just "borrows" an instance from the pool for the duration of its scope. cheers Mike
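Something along these lines (a rough sketch only -- all names invented, error handling elided, and the factory shape bent around the language limitations mentioned above):

     class Connection
     {
         void close() { /* release the DB handle */ }
     }

     class ConnectionPool
     {
         private Connection[] idle;

         Connection acquire()
         {
             if ( idle.length )
             {
                 Connection c = idle[idle.length - 1];
                 idle.length = idle.length - 1;
                 return c;              // reuse a pooled connection
             }
             return new Connection;     // or expand the pool on demand
         }

         void release( Connection c )
         {
             idle ~= c;                 // back into the pool
         }

         ~this()
         {
             foreach ( c; idle )
                 c.close();             // pool owns final cleanup
         }
     }

     auto class ConnectionUsage         // 'auto' forces scoped use
     {
         Connection conn;
         private ConnectionPool pool;

         this( ConnectionPool p ) { pool = p; conn = p.acquire(); }
         ~this() { pool.release( conn ); }   // returned at scope exit
     }

     void query( ConnectionPool pool )
     {
         auto ConnectionUsage u = new ConnectionUsage( pool );
         // ... use u.conn ...
     }   // connection goes back to the pool here

     void main()
     {
         auto ConnectionPool pool = new ConnectionPool;  // raii in main
         query( pool );
     }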
That's a missing part of the puzzle - up until now IMO the changes to the compiler would have been minimal to support "only autos can have dtors". Now it would require another change to the language, in that 'auto' is not currently allowed for module scope classes. To support that, I guess there would have to be code inserted along the lines of module static dtors for auto class objects declared at module scope (except it would have to also check that each class had been actually instantiated, obviously). Then I guess there would be good potential for circular reference problems like there is for module ctors and dtors with imported modules. So the compiler would then have to insert runtime checks like it does for module ctors now, which makes things yet more complicated.
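For the sake of argument, the lowering might look roughly like this (MyPool is invented; this is just the module static ctor/dtor machinery as it exists today):

     class MyPool { ~this() { /* cleanup */ } }

     // What a module-scope 'auto' might lower to: an ordinary module
     // static ctor/dtor pair, subject to the same circular-import checks.
     private MyPool pool;

     static this()  { pool = new MyPool; }
     static ~this() { delete pool; }   // runs only if the ctor ran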
Apr 09 2006