www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - GC BUG: Referenced object destroyed before released

reply "Koroskin Denis" <2korden gmail.com> writes:
Hello, community!
Consider the following code:

// For those who prefer alternative reality ;)
version(Tango) extern(C) void printf(char[] s, ...);

class Resource
{
     private char[] data;
     private bool released;

     this()
     {
         this.data =3D null;
         this.released =3D false;
     }

     ~this()
     {
         printf("Resource destroyed\n");
     }

     void release()
     {
         released =3D true;
     }
}

class Owner
{
     private Resource res;

     this(Resource res) {
         this.res =3D res;
     }

     ~this()
     {
         res.release();
         printf("Owner destroyed\n");
     }
}

int main()
{
     Resource res =3D new Resource();
     Owner owner =3D new Owner(res);

     printf("Main exit\n");
     return 0;
}

In this code we don't explicitly destroy anything, so GC chooses the ord=
er  =

of object destruction. And it chooses wrong.

"Phobos hack" shows the following stack trace upon exit:

Main exit
Resource destroyed
Unhandled win32 exception!
Error: Access Violation (object.Exception)
backtrace:
  00402087 void crash.Owner._dtor() (+f) crash.d:35
  0040262a _d_callfinalizer (+46)
  004025dd void std.gc.new_finalizer(void*, bool) (+9)
  004066c3 void gcx.GC.fullCollectNoStack() (+3b)
  0040226f gc_term (+b)
  00415b6d mainCRTStartup (+a9)
  7c816d4f ???

or without it:

Main exit
Resource destroyed
Error: Access Violation

or with Tango:

object.Exception: Access Violation
tango.core.Exception.FinalizeException: An exception was thrown while  =

finalizing an instance of class crash.Resource

object.Exception: Access Violation

Was there any bug report filled on this bug? It's very annoying, since i=
t  =

makes your program not reliable in runtime. At least, if it was a  =

compilation bug, and I managed to bypass it, I would be assured, that th=
e  =

resulted program whould behave predictable. Yet, it is not...
Mar 16 2008
next sibling parent reply "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sun, 16 Mar 2008 15:45:14 +0200, Koroskin Denis <2korden gmail.com> wrote:

 In this code we don't explicitly destroy anything, so GC chooses the order
 of object destruction. And it chooses wrong.

Quoting from http://www.digitalmars.com/d/1.0/class.html#destructors :
 Furthermore, the order in which the garbage collector calls destructors for
unreference objects is not specified.

So, it's not a bug :) You can't rely on the order of which objects will be destroyed. -- Best regards, Vladimir mailto:thecybershadow gmail.com
Mar 16 2008
next sibling parent reply "Koroskin Denis" <2korden gmail.com> writes:
On Sun, 16 Mar 2008 17:15:36 +0300, Vladimir Panteleev  
<thecybershadow gmail.com> wrote:

 Quoting from http://www.digitalmars.com/d/1.0/class.html#destructors :

 Furthermore, the order in which the garbage collector calls destructors  
 for unreference objects is not specified.

So, it's not a bug :) You can't rely on the order of which objects will be destroyed.

Yes:
 order in which the garbage collector calls destructors for  
 /unreferenced/ objects is not specified.

However, in this particular example Resource _is_ a referenced object, unlike Owner, which is not. In any case, is this code wrong? If not, why does it cause acess violation?
Mar 16 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Koroskin Denis wrote:
 On Sun, 16 Mar 2008 17:15:36 +0300, Vladimir Panteleev 
 <thecybershadow gmail.com> wrote:
 
 Quoting from http://www.digitalmars.com/d/1.0/class.html#destructors :

 Furthermore, the order in which the garbage collector calls 
 destructors for unreference objects is not specified.

So, it's not a bug :) You can't rely on the order of which objects will be destroyed.

Yes:
 order in which the garbage collector calls destructors for 
 /unreferenced/ objects is not specified.

However, in this particular example Resource _is_ a referenced object, unlike Owner, which is not. In any case, is this code wrong? If not, why does it cause acess violation?

Resource is not referenced, from the garbage collector's point of view. An object is referenced if there is a path to it from the GC's roots (the stack, registers, and anything else you care to add with std.gc.addRoot or some such). When main exits, there's no reference to Owner, which means there is no way to get to Resource, so it's not referenced. Getting around this would not be extremely difficult, I think. You could just call all destructors and then collect the memory, which might leave some things in a bad state. Still, if you were careful, you could reference other objects in destructors without segfaulting. Or, you could build a reference graph from the memory you're collecting, then call destructors based on that graph. This would be pretty slow; I'd support it as the standard, but people would certainly want a way to disable it, since it's unnecessary in many situations. If you want deterministic destruction, you can do something like this: interface IFinalizable { void finalize(); } class MyClass : IFinalizable { // You would use a mixin for this... IFinalizable[] finalizeParents, finalizeChildren; public void finalize() { if (finalized) return; finalized = true; foreach (o; finalizeParents) { o.finalize; } destroyThis(); foreach (o; finalizeChildren) { o.finalize; } } // And for this. ~this() { finalize(); } // And this would be defined per-class, of course. private void destroyThis() { // Do whatever you need to do. // It's safe to use any IFinalizable, but not anything else. } } The runtime for this is quadratic in the worst case, plus it adds 16 bytes to each object. On the other hand, memory is cheap these days, and in the average case, the runtime's going to be linear. If you have a cycle of IFinalizables, you get nondeterministic destruction, but no infinite loops, which is the best you could ask for.
Mar 16 2008
parent Christopher Wright <dhasenan gmail.com> writes:
Vladimir Panteleev wrote:
 On Sun, 16 Mar 2008 17:58:40 +0200, Christopher Wright <dhasenan gmail.com>
wrote:
 
 Getting around this would not be extremely difficult, I think. 

You could also write custom class allocators and use non-managed memory for manual memory management: http://www.digitalmars.com/d/1.0/class.html#allocators

Then your destructors must delete all their owned children and free that memory. Fortunately, opDelete is inherited, so they just have to delete their children.
Mar 17 2008
prev sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sun, 16 Mar 2008 17:58:40 +0200, Christopher Wright <dhasenan gmail.com>
wrote:

 Getting around this would not be extremely difficult, I think. 

You could also write custom class allocators and use non-managed memory for manual memory management: http://www.digitalmars.com/d/1.0/class.html#allocators -- Best regards, Vladimir mailto:thecybershadow gmail.com
Mar 16 2008
prev sibling parent reply Graham St Jack <grahams acres.com.au> writes:
On Sun, 16 Mar 2008 16:45:14 +0300, Koroskin Denis wrote:

 Hello, community!
 Consider the following code:
 
 // For those who prefer alternative reality ;) version(Tango) extern(C)
 void printf(char[] s, ...);
 
 class Resource
 {
      private char[] data;
      private bool released;
 
      this()
      {
          this.data = null;
          this.released = false;
      }
 
      ~this()
      {
          printf("Resource destroyed\n");
      }
 
      void release()
      {
          released = true;
      }
 }
 
 class Owner
 {
      private Resource res;
 
      this(Resource res) {
          this.res = res;
      }
 
      ~this()
      {
          res.release();
          printf("Owner destroyed\n");
      }
 }
 
 int main()
 {
      Resource res = new Resource();
      Owner owner = new Owner(res);
 
      printf("Main exit\n");
      return 0;
 }
 
 In this code we don't explicitly destroy anything, so GC chooses the
 order of object destruction. And it chooses wrong.
 
 "Phobos hack" shows the following stack trace upon exit:
 
 Main exit
 Resource destroyed
 Unhandled win32 exception!
 Error: Access Violation (object.Exception) backtrace:
   00402087 void crash.Owner._dtor() (+f) crash.d:35 0040262a
   _d_callfinalizer (+46)
   004025dd void std.gc.new_finalizer(void*, bool) (+9) 004066c3 void
   gcx.GC.fullCollectNoStack() (+3b) 0040226f gc_term (+b)
   00415b6d mainCRTStartup (+a9)
   7c816d4f ???
 
 or without it:
 
 Main exit
 Resource destroyed
 Error: Access Violation
 
 or with Tango:
 
 object.Exception: Access Violation
 tango.core.Exception.FinalizeException: An exception was thrown while
 finalizing an instance of class crash.Resource
 
 object.Exception: Access Violation
 
 Was there any bug report filled on this bug? It's very annoying, since
 it makes your program not reliable in runtime. At least, if it was a
 compilation bug, and I managed to bypass it, I would be assured, that
 the resulted program whould behave predictable. Yet, it is not...

I agree that this is a problem. Who cares if the two objects in question aren't referenced and more - their destructors should be called in the right order if it is possible to do so. I guess the work-around (and maybe permanent) is to not reference any garbage-collected objects in your destructors.
Mar 16 2008
next sibling parent reply "Koroskin Denis" <2korden gmail.com> writes:
On Mon, 17 Mar 2008 01:28:04 +0300, Graham St Jack <grahams acres.com.au>  
wrote:
 I agree that this is a problem. Who cares if the two objects in question
 aren't referenced and more - their destructors should be called in the
 right order if it is possible to do so.

Yep.
 I guess the work-around (and maybe permanent) is to not reference any
 garbage-collected objects in your destructors.

Nope. This is against RAII pattern.
Mar 16 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Koroskin Denis" in message
 On Mon, 17 Mar 2008 01:28:04 +0300, Graham St Jack  wrote:
 I agree that this is a problem. Who cares if the two objects in question
 aren't referenced and more - their destructors should be called in the
 right order if it is possible to do so.

Yep.
 I guess the work-around (and maybe permanent) is to not reference any
 garbage-collected objects in your destructors.

Nope. This is against RAII pattern.

The issue is with the usage of the destructor to clean up memory. Do not worry about cleaning up memory in destructors, that is what we have the garbage collector for. In your example, all the members of Resource are GC allocated and so will be cleaned up when Resource is cleaned up. You shouldn't care what the order is. If you have an OS resource, such as an open file, that should be cleaned up in the destructor of the object which owns the OS handle. This is what destructors are for. The only missing piece is if you wanted to have an array of OS resources. In order to make sure the elements in this array are cleaned up properly, you would need to allocate the array using an alternate memory allocator that does not get cleaned up automatically, or else use wrapper classes to store the OS resources. A way to fix this would be to always deallocate arrays last in the GC, and allow using the array elements (not objects pointed to by those elements) in the destructors. -Steve
Mar 17 2008
prev sibling next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Graham St Jack wrote:

 I guess the work-around (and maybe permanent) is to not reference any 
 garbage-collected objects in your destructors. 

Err, that's more or less what the documentation says. """ When the garbage collector calls a destructor for an object of a class that has members that are references to garbage collected objects, those references are no longer valid. This means that destructors cannot reference sub objects. """ -- http://www.digitalmars.com/d/1.0/class.html#destructors Maybe it would be more precise to say "This means destructors cannot reference garbage collected objects". --bb
Mar 16 2008
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Graham St Jack wrote:
 Another question along these lines - I am having trouble with a multi-
 threaded program (linux, gdc, phobos). It works fine until main() 
 returns, and then I get a SIGUSR1 that causes the program to crash. Any 
 ideas what might be happening?
 
 My investigations so far show that SIGUSR1 and SIGUSR2 are used in 
 thread.d as part of the implementation of pause() and resume(). I was 
 under the impression that the thread that is interrupted to handle a 
 signal is chosen at random, and therefore that it is a bad idea to try to 
 use signals in a threaded program, so I am confused.
 
 Also, the garbage collector seems to kill off all my threads without 
 waiting for their run methods to return. I assume this means that the 
 threads aren't being treated as root objects by the garbage collector.

When main() exits, the GC starts a collection. In order to do so safely it pauses all other threads with Thread.pauseAll, which (in the Linux version) sends SIGUSR1 to all other threads to tell them to pause. (Note that this doesn't kill them, even though the function that does this is called pthread_kill) It's usually a good idea to just tell your debugger to ignore SIGUSR1 and SIGUSR2 (the latter is used to resume afterwards). To do this in GDB, use "handle SIGUSR1 noprint nostop SIGUSR2 noprint nostop". If you're just wondering why all your threads get killed when main() ends I can tell you right now: this is how it works :P. You should see the same in a C or C++ program.
 Does all this mean that I have to explicitly take action so that the run 
 methods will return, then wait() on them before returning from main()?

As you said, call Thread.wait() on all other threads before exiting main() to wait for them to end. Or just switch to Tango which does this automatically (add Tangobos if you need Phobos compatibility).
Mar 17 2008
prev sibling next sibling parent reply "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Mon, 17 Mar 2008 00:28:04 +0200, Graham St Jack <grahams acres.com.au> wrote:

 I agree that this is a problem. Who cares if the two objects in question
 aren't referenced and more - their destructors should be called in the
 right order if it is possible to do so.

This is impossible without an exact garbage collector. Also: circular references. -- Best regards, Vladimir mailto:thecybershadow gmail.com
Mar 16 2008
parent "Koroskin Denis" <2korden+dmd gmail.com> writes:
On Mon, 17 Mar 2008 06:06:49 +0300, Vladimir Panteleev  
<thecybershadow gmail.com> wrote:

 On Mon, 17 Mar 2008 00:28:04 +0200, Graham St Jack  
 <grahams acres.com.au> wrote:

 I agree that this is a problem. Who cares if the two objects in question
 aren't referenced and more - their destructors should be called in the
 right order if it is possible to do so.

This is impossible without an exact garbage collector. Also: circular references.

Indeed, it's hard to implement if possible. And it is a design problem if not possible. But if it is so bad, then we need a safe pattern that would avoid this problem. And maybe build it into language somehow??? Disabling access to other objects in a destructor is really a bad option... I would also be happy to hear Walter's comment on this.
Mar 17 2008
prev sibling next sibling parent Graham St Jack <Graham.StJack internode.on.net> writes:
On Mon, 17 Mar 2008 09:45:30 +0900, Bill Baxter wrote:

 Graham St Jack wrote:
 
 I guess the work-around (and maybe permanent) is to not reference any
 garbage-collected objects in your destructors.

Err, that's more or less what the documentation says. """ When the garbage collector calls a destructor for an object of a class that has members that are references to garbage collected objects, those references are no longer valid. This means that destructors cannot reference sub objects. """ -- http://www.digitalmars.com/d/1.0/class.html#destructors Maybe it would be more precise to say "This means destructors cannot reference garbage collected objects". --bb

Thanks for the clarification - it clears it up nicely for me. Another question along these lines - I am having trouble with a multi- threaded program (linux, gdc, phobos). It works fine until main() returns, and then I get a SIGUSR1 that causes the program to crash. Any ideas what might be happening? My investigations so far show that SIGUSR1 and SIGUSR2 are used in thread.d as part of the implementation of pause() and resume(). I was under the impression that the thread that is interrupted to handle a signal is chosen at random, and therefore that it is a bad idea to try to use signals in a threaded program, so I am confused. Also, the garbage collector seems to kill off all my threads without waiting for their run methods to return. I assume this means that the threads aren't being treated as root objects by the garbage collector. Does all this mean that I have to explicitly take action so that the run methods will return, then wait() on them before returning from main()?
Mar 16 2008
prev sibling next sibling parent Graham St Jack <grahams acres.com.au> writes:
On Mon, 17 Mar 2008 10:16:57 +0100, Frits van Bommel wrote:

 Graham St Jack wrote:
 Another question along these lines - I am having trouble with a multi-
 threaded program (linux, gdc, phobos). It works fine until main()
 returns, and then I get a SIGUSR1 that causes the program to crash. Any
 ideas what might be happening?
 
 My investigations so far show that SIGUSR1 and SIGUSR2 are used in
 thread.d as part of the implementation of pause() and resume(). I was
 under the impression that the thread that is interrupted to handle a
 signal is chosen at random, and therefore that it is a bad idea to try
 to use signals in a threaded program, so I am confused.
 
 Also, the garbage collector seems to kill off all my threads without
 waiting for their run methods to return. I assume this means that the
 threads aren't being treated as root objects by the garbage collector.

When main() exits, the GC starts a collection. In order to do so safely it pauses all other threads with Thread.pauseAll, which (in the Linux version) sends SIGUSR1 to all other threads to tell them to pause. (Note that this doesn't kill them, even though the function that does this is called pthread_kill) It's usually a good idea to just tell your debugger to ignore SIGUSR1 and SIGUSR2 (the latter is used to resume afterwards). To do this in GDB, use "handle SIGUSR1 noprint nostop SIGUSR2 noprint nostop". If you're just wondering why all your threads get killed when main() ends I can tell you right now: this is how it works :P. You should see the same in a C or C++ program.
 Does all this mean that I have to explicitly take action so that the
 run methods will return, then wait() on them before returning from
 main()?

As you said, call Thread.wait() on all other threads before exiting main() to wait for them to end. Or just switch to Tango which does this automatically (add Tangobos if you need Phobos compatibility).

Thanks for the information - it has saved me a lot of wasted time.
Mar 17 2008
prev sibling next sibling parent Graham St Jack <grahams acres.com.au> writes:
On Mon, 17 Mar 2008 12:34:06 -0400, Steven Schveighoffer wrote:

 "Koroskin Denis" in message
 On Mon, 17 Mar 2008 01:28:04 +0300, Graham St Jack  wrote:
 I agree that this is a problem. Who cares if the two objects in
 question aren't referenced and more - their destructors should be
 called in the right order if it is possible to do so.

Yep.
 I guess the work-around (and maybe permanent) is to not reference any
 garbage-collected objects in your destructors.


The issue is with the usage of the destructor to clean up memory. Do not worry about cleaning up memory in destructors, that is what we have the garbage collector for. In your example, all the members of Resource are GC allocated and so will be cleaned up when Resource is cleaned up. You shouldn't care what the order is. If you have an OS resource, such as an open file, that should be cleaned up in the destructor of the object which owns the OS handle. This is what destructors are for. The only missing piece is if you wanted to have an array of OS resources. In order to make sure the elements in this array are cleaned up properly, you would need to allocate the array using an alternate memory allocator that does not get cleaned up automatically, or else use wrapper classes to store the OS resources. A way to fix this would be to always deallocate arrays last in the GC, and allow using the array elements (not objects pointed to by those elements) in the destructors. -Steve

I get it now. Thanks for the insights. I am too used to manual memory- management in C++ - using a garbage collector is wonderfully easy by comparison, but there are a few tricks to learn even so.
Mar 17 2008
prev sibling parent Graham St Jack <grahams acres.com.au> writes:
On Mon, 17 Mar 2008 12:27:15 +0000, Graham St Jack wrote:

 On Mon, 17 Mar 2008 10:16:57 +0100, Frits van Bommel wrote:
 
 Graham St Jack wrote:
 Another question along these lines - I am having trouble with a multi-
 threaded program (linux, gdc, phobos). It works fine until main()
 returns, and then I get a SIGUSR1 that causes the program to crash.
 Any ideas what might be happening?
 
 My investigations so far show that SIGUSR1 and SIGUSR2 are used in
 thread.d as part of the implementation of pause() and resume(). I was
 under the impression that the thread that is interrupted to handle a
 signal is chosen at random, and therefore that it is a bad idea to try
 to use signals in a threaded program, so I am confused.
 
 Also, the garbage collector seems to kill off all my threads without
 waiting for their run methods to return. I assume this means that the
 threads aren't being treated as root objects by the garbage collector.

When main() exits, the GC starts a collection. In order to do so safely it pauses all other threads with Thread.pauseAll, which (in the Linux version) sends SIGUSR1 to all other threads to tell them to pause. (Note that this doesn't kill them, even though the function that does this is called pthread_kill) It's usually a good idea to just tell your debugger to ignore SIGUSR1 and SIGUSR2 (the latter is used to resume afterwards). To do this in GDB, use "handle SIGUSR1 noprint nostop SIGUSR2 noprint nostop". If you're just wondering why all your threads get killed when main() ends I can tell you right now: this is how it works :P. You should see the same in a C or C++ program.
 Does all this mean that I have to explicitly take action so that the
 run methods will return, then wait() on them before returning from
 main()?

As you said, call Thread.wait() on all other threads before exiting main() to wait for them to end. Or just switch to Tango which does this automatically (add Tangobos if you need Phobos compatibility).

Thanks for the information - it has saved me a lot of wasted time.

Just to let you know, I fixed my code along these lines and it works just fine. Thanks for the leg up.
Mar 17 2008