digitalmars.D - Heisenbug involving Destructors & GC - Help Needed

"Maxime Chevalier-Boisvert" <maximechevalierb gmail.com> writes:
I seem to have run into a heisenbug involving destructors and the 
GC. I'm kind of stuck at this point and need help tracking down 
the issue.

I put the broken code in a branch called heisenbug on github:
https://github.com/higgsjs/Higgs/tree/heisenbug

The problem manifests itself on runs of `make test` (my 
unittests), but only some of the time. I wrote a script to run 
`make test` repeatedly to try and find a solution:
https://github.com/higgsjs/Higgs/blob/heisenbug/source/repeatmaketest.py

The problem usually manifests itself after 5 to 15 runs on my 
machine. I get a segmentation fault, not always in the same 
place. The randomness seems to stem from address space 
randomization.

It seems the issue is caused by my freeing/reinitializing the VM 
during unit tests. More specifically, commenting out this line 
makes the problem go away:
https://github.com/higgsjs/Higgs/blob/heisenbug/source/runtime/vm.d#L741

Higgs can run all of my benchmarks without ever failing, but 
restarting the VM during `make test` seems to be causing this 
problem to happen. It's not impossible that there could be 
another underlying issue, such as the JITted code I generate 
corrupting some memory location, but it would seem that if this 
were the case, the issue would likely show up outside of unit 
tests. Any help would be appreciated.
Jun 26 2015
"Maxime Chevalier-Boisvert" <maximechevalierb gmail.com> writes:
On Friday, 26 June 2015 at 18:27:34 UTC, Maxime 
Chevalier-Boisvert wrote:
 I seem to have run into a heisenbug involving destructors and 
 the GC. I'm kind of stuck at this point and need help tracking 
 down the issue.

 [...]
I should add that I'm running Ubuntu 12.04, 64-bit, and using DMD 2.067.1
Jun 26 2015
Etienne Cimon <etcimon gmail.com> writes:
On 2015-06-26 14:27, Maxime Chevalier-Boisvert wrote:
 I seem to have run into a heisenbug involving destructors and the GC.
 I'm kind of stuck at this point and need help tracking down the issue.

 I put the broken code in a branch called heisenbug on github:
 https://github.com/higgsjs/Higgs/tree/heisenbug

 The problem manifests itself on runs of `make test` (my unittests), but
 only some of the time. I wrote a script to run `make test` repeatedly to
 try and find a solution:
 https://github.com/higgsjs/Higgs/blob/heisenbug/source/repeatmaketest.py

 The problem usually manifests itself after 5 to 15 runs on my machine. I
 get a segmentation fault, not always in the same place. The randomness
 seems to stem from address space randomization.

 It seems the issue is caused by my freeing/reinitializing the VM during
 unit tests. More specifically, commenting out this line makes the
 problem go away:
 https://github.com/higgsjs/Higgs/blob/heisenbug/source/runtime/vm.d#L741

 Higgs can run all of my benchmarks without ever failing, but restarting
 the VM during `make test` seems to be causing this problem to happen.
 It's not impossible that there could be another underlying issue, such
 as the JITted code I generate corrupting some memory location, but it
 would seem that if this were the case, the issue would likely show up
 outside of unit tests. Any help would be appreciated.
This might come as a surprise to you as much as it did to me at the time, but when you have GCRoot* root; where GCRoot is a struct, if you destroy(root), you're setting your local pointer to null. You're not actually calling the destructor on the struct.

Also, I would avoid throwing of any type in a destructor.

https://github.com/higgsjs/Higgs/blob/0b48477120c4acce46a01b05a1d4b035aa432550/source/jit/codeblock.d#L157
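The difference described above can be checked with a minimal sketch. The `Root` struct and the two helper functions are hypothetical stand-ins for illustration, not code from Higgs. It assumes the druntime behavior discussed in this thread: `destroy` applied to a pointer only resets the pointer, while `destroy` applied to the dereferenced pointer runs the struct destructor.

```d
import std.stdio;

// Hypothetical stand-in for a struct such as GCRoot.
struct Root
{
    static int dtorCalls;          // counts destructor runs
    ~this() { ++dtorCalls; }
}

// destroy on the pointer itself: only the pointer is reset to null.
int viaPointer()
{
    immutable before = Root.dtorCalls;
    Root* p = new Root;
    destroy(p);
    assert(p is null);
    return Root.dtorCalls - before;
}

// destroy on the pointee: the struct destructor runs.
int viaPointee()
{
    immutable before = Root.dtorCalls;
    Root* p = new Root;
    destroy(*p);
    p = null;                      // drop the reference explicitly
    return Root.dtorCalls - before;
}

void main()
{
    writeln("destroy(p):  ", viaPointer(), " destructor call(s)");
    writeln("destroy(*p): ", viaPointee(), " destructor call(s)");
}
```

Note that `destroy(*p)` leaves the memory allocated and resets the pointee to `Root.init`; it only finalizes the object.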
Jun 26 2015
"rsw0x" <anonymous anonymous.com> writes:
On Saturday, 27 June 2015 at 02:53:42 UTC, Etienne Cimon wrote:
 On 2015-06-26 14:27, Maxime Chevalier-Boisvert wrote:

 This might come as a surprise to you as much as it did to me at 
 the time, but when you have GCRoot* root; where GCRoot is a 
 struct, if you destroy(root), you're setting your local pointer 
 to null. You're not actually calling the destructor on the 
 struct.

 Also, I would avoid throwing of any type in a destructor.

 https://github.com/higgsjs/Higgs/blob/0b48477120c4acce46a01b05a1d4b035aa432550/source/jit/codeblock.d#L157
calling destroy on a pointer should either be fixed or be an error, that should not be allowed to happen.
Jun 26 2015
"Brian Schott" <briancschott gmail.com> writes:
On Saturday, 27 June 2015 at 03:16:35 UTC, rsw0x wrote:
 calling destroy on a pointer should either be fixed or be an 
 error, that should not be allowed to happen.
Completely agreed. Calling destroy() on a pointer has been incorrect EVERY time I've seen it done.
Jun 26 2015
"Temtaime" <temtaime gmail.com> writes:
Disagree. Destroy on a pointer calls dtor of a struct.
Why it should be an error ?
Jun 26 2015
Timon Gehr <timon.gehr gmx.ch> writes:
On 06/27/2015 05:44 AM, Temtaime wrote:
 Disagree. Destroy on a pointer calls dtor of a struct.
 Why it should be an error ?
import std.stdio;

bool destroyed = false;

struct S
{
    ~this() { destroyed = true; }
}

void main()
{
    auto s = new S;
    destroy(s);
    writeln(destroyed); // false
}
Jun 26 2015
Etienne Cimon <etcimon gmail.com> writes:
On 2015-06-26 23:44, Temtaime wrote:
 Disagree. Destroy on a pointer calls dtor of a struct.
 Why it should be an error ?
Exactly what I assumed too. Can you imagine all of the random errors that stem from such a basic, low-level assumption? I carry a scar for every day I've spent in the debugger, and this bug has given me its load of torments. Of course, I'm the only one to blame, I didn't know:

import std.stdio;

void main()
{
    struct A
    {
        ~this() { writeln("Dtor"); }
    }
    A* a = new A;
    destroy(a);
    writeln("Done");
}

output:
Done
Jun 26 2015
"Brian Schott" <briancschott gmail.com> writes:
On Saturday, 27 June 2015 at 03:44:54 UTC, Temtaime wrote:
 Disagree. Destroy on a pointer calls dtor of a struct.
This is false. The fact that you think that this is true is the reason that we want it to be an error.

import std.stdio : writeln;
import core.stdc.stdlib : malloc;

private struct S
{
    ~this() { writeln("Destructor call"); }
}

void main()
{
    S* s1 = cast(S*) malloc(S.sizeof);
    destroy(s1);
    S* s2 = cast(S*) malloc(S.sizeof);
    typeid(S).destroy(s2);
}

Run this program. You'll only see one destructor call.
Jun 26 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/26/15 11:38 PM, Brian Schott wrote:
 On Saturday, 27 June 2015 at 03:16:35 UTC, rsw0x wrote:
 calling destroy on a pointer should either be fixed or be an error,
 that should not be allowed to happen.
 Completely agreed. Calling destroy() on a pointer has been incorrect
 EVERY time I've seen it done.
I'd say if you want to see cases where it was done correctly, change the behavior and watch the complaints come in ;)

Wouldn't destroying the pointer target make an assumption about ownership? Consider a struct that contains a pointer:

struct S
{
    T *x;
}

destroy's algorithm for this struct is to call destroy on all of the members, then set to S.init. But what if S is just *referencing* that T, and doesn't *own* it? Isn't this the wrong thing to do?

I agree it's inconsistent with class references. How to make it consistent isn't as easy to decide. For me, it's the class destruction which is incorrect. Consider that destroy is consistent with scope destruction for every type except for classes:

class C {}
struct S {}

{
    C c = new C;
    S s;
    S *s2 = new S;
} // calls s.dtor, does not call c.dtor or s2.dtor

And what about arrays? Should calling destroy on an array call destroy on all of the elements (it currently doesn't)?

If I were to design destroy from scratch, I'd make destroy, and destroyRef (which destroys referenced data via pointer or class).

-Steve
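A rough sketch of what such a split could look like for struct pointers follows. `destroyRef` here is purely hypothetical: it is not part of druntime, the name is borrowed from the suggestion above, and this version covers only pointers to structs, not class references. Calling it is an explicit statement by the caller that the pointer owns its target.

```d
import std.stdio;

struct S
{
    static int dtorCalls;          // counts destructor runs
    ~this() { ++dtorCalls; }
}

// Hypothetical helper, not in druntime: destroys the struct that a pointer
// refers to, then nulls the pointer. Only call it on pointers that own
// their target.
void destroyRef(T)(ref T* p) if (is(T == struct))
{
    if (p is null) return;
    destroy(*p);                   // runs T's destructor on the pointee
    p = null;
}

void main()
{
    S* owned = new S;
    destroyRef(owned);
    assert(owned is null);         // the pointer was reset
    assert(S.dtorCalls == 1);      // and the destructor ran once
    writeln("ok");
}
```

Keeping plain `destroy` for the pointer itself and a separate function for the referenced data leaves the ownership decision with the caller, which is the concern raised above.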
Jun 30 2015
"Etienne" <etcimon gmail.com> writes:
On Tuesday, 30 June 2015 at 13:01:46 UTC, Steven Schveighoffer 
wrote:
 On 6/26/15 11:38 PM, Brian Schott wrote:
 On Saturday, 27 June 2015 at 03:16:35 UTC, rsw0x wrote:
 calling destroy on a pointer should either be fixed or be an 
 error,
 that should not be allowed to happen.
 [...]
I don't think there's a problem with destroy in the first place. The problem is that it's being advertised as calling the destructors:

http://dlang.org/library/object/type_info.destroy.html
http://dlang.org/library/object/destroy.html

The function would need a documentation section itself, since it's being advertised as a replacement for `delete` and expected to do the same wherever there's something to read about it.
Jun 30 2015
"Etienne" <etcimon gmail.com> writes:
On Tuesday, 30 June 2015 at 14:04:45 UTC, Etienne wrote:
 On Tuesday, 30 June 2015 at 13:01:46 UTC, Steven Schveighoffer 
 wrote:
 [...]
 I don't think there's a problem with destroy in the first place. The problem is that it's being advertised as calling the destructors:

 http://dlang.org/library/object/type_info.destroy.html
 http://dlang.org/library/object/destroy.html

 The function would need a documentation section itself, since it's being advertised as a replacement for `delete` and expected to do the same wherever there's something to read about it.
We probably just need a new page in the `D Reference` about Lifetime, documenting the general behavior and good practices of anything that manages lifetime, e.g. malloc, free, delete, destroy, new, pointers, classes, base types, etc.
Jun 30 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/30/15 10:04 AM, Etienne wrote:

 I don't think there's a problem with destroy in the first place. The
 problem is that it's being advertised as calling the destructors:
It does for value types and for class references.

-Steve
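The two cases named here can be checked with a short sketch. The type names and counters are made up for illustration; the sketch only assumes that `destroy` runs the destructor when given a struct value or a class reference.

```d
import std.stdio;

int structDtors, classDtors;       // illustrative counters

struct S { ~this() { ++structDtors; } }
class C { ~this() { ++classDtors; } }

// destroy on a struct value runs its destructor immediately.
int destroyStructValue()
{
    immutable before = structDtors;
    int calls;
    {
        S s;
        destroy(s);
        calls = structDtors - before;
    } // s leaves scope here and its destructor runs again, on S.init
    return calls;
}

// destroy on a class reference finalizes the referenced object.
int destroyClassRef()
{
    immutable before = classDtors;
    C c = new C;
    destroy(c);
    return classDtors - before;
}

void main()
{
    writeln("struct value:    ", destroyStructValue(), " destructor call(s)");
    writeln("class reference: ", destroyClassRef(), " destructor call(s)");
}
```

A pointer to a struct is the case that falls outside both categories, which is the one this thread is about.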
Jun 30 2015
"Etienne" <etcimon gmail.com> writes:
On Tuesday, 30 June 2015 at 17:10:41 UTC, Steven Schveighoffer 
wrote:
 On 6/30/15 10:04 AM, Etienne wrote:

 I don't think there's a problem with destroy in the first 
 place. The
 problem is that it's being advertised as calling the 
 destructors:
 It does for value types and for class references.

 -Steve
Yeah, well, the types that don't get finalized by it need to be listed somewhere.
Jun 30 2015
"Kagamin" <spam here.lot> writes:
On Friday, 26 June 2015 at 18:27:34 UTC, Maxime 
Chevalier-Boisvert wrote:
 I seem to have run into a heisenbug involving destructors and 
 the GC. I'm kind of stuck at this point and need help tracking 
 down the issue.
BTW, how is GC performance doing? Etienne is experimenting with a thread-local GC: http://forum.dlang.org/post/nfuzudyoatryapcwxquu forum.dlang.org which also proved to be faster. But only if a single-threaded GC is good enough for you, which it normally should be, given the single-threaded model of JavaScript.
Jun 27 2015
"Etienne" <etcimon gmail.com> writes:
On Saturday, 27 June 2015 at 12:28:26 UTC, Kagamin wrote:
 On Friday, 26 June 2015 at 18:27:34 UTC, Maxime 
 Chevalier-Boisvert wrote:
 I seem to have run into a heisenbug involving destructors and 
 the GC. I'm kind of stuck at this point and need help tracking 
 down the issue.
 BTW, how is GC performance doing? Etienne is experimenting with a thread-local GC: http://forum.dlang.org/post/nfuzudyoatryapcwxquu forum.dlang.org which also proved to be faster. But only if a single-threaded GC is good enough for you, which it normally should be, given the single-threaded model of JavaScript.
The new keyword can leverage the shared keyword to place certain objects in a shared GC, which can coexist with the local GC without fundamental changes. I.e. std.concurrency already requires that you pass shared objects, but casting won't cut it anymore; you'll actually have to duplicate the object on the shared GC with .sdup or create it there with new shared.

I don't need a shared GC right now, so I'm not pushing too much for it. I'm satisfied with keeping a pointer to the object alive in the thread where it was created, and that's all a thread-local GC really requires for concurrency to work.
Jun 27 2015