digitalmars.D.learn - Some questions about GC

Roland Hadinger <rolandh.dlangforum maildrop.cc> writes:
These questions probably need some context: I'm working on an 
interpreter that will manage memory via reference counted struct 
types. To deal with the problem of strong reference cycles 
retaining memory indefinitely, weak references or recursive 
teardowns have to be used where appropriate.

To help detect memory leaks from within the interpreter, I'd also 
like to employ core.memory.GC in the following fashion:

* keep core.memory.GC off (disabled) by default, but nonetheless 
allocate objects from GC memory
* provide a function that can find (and reclaim) retained 
unreachable object graphs that contain strong reference cycles
* the main purpose of this function is to find and report such 
instances, not to reclaim memory. Retained graphs should be 
reported as warnings on stderr, so that the program can be fixed 
manually, e.g. by weakening some refs in the proper places
* the function will rely on GC.collect to find unreachable objects
* the function will *always* be called implicitly when a program 
terminates
* the function should also be explicitly callable from any point 
within a program.

Now my questions:

Is it safe to assume that a call to GC.collect will be handled 
synchronously (and won't return early)?

Is there a way to ensure that a collection will never run unless 
GC.collect is called explicitly (even in out-of-memory situations)?

Is it possible and is it OK to print to stderr while the GC is 
collecting (e.g. from @nogc code, using functions from 
core.stdc.stdio)?

Could I implement my function by introducing a shared global flag 
which is set prior to calling GC.collect and reset afterwards, so 
that any destructor can determine whether it has been invoked by a 
"flagged" call to GC.collect and act accordingly?

Alternatively: do I need to implement such a flag, or is there 
already a way in which a destructor can determine whether it has 
been invoked by the GC?

Thanks for any help!
Oct 18 2019
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, October 18, 2019 10:54:55 AM MDT Roland Hadinger via Digitalmars-d-learn wrote:
 These questions probably need some context: I'm working on an
 interpreter that will manage memory via reference counted struct
 types. To deal with the problem of strong reference cycles
 retaining memory indefinitely, weak references or recursive
 teardowns have to be used where appropriate.

 To help detect memory leaks from within the interpreter, I'd also
 like to employ core.memory.GC in the following fashion:

 * keep core.memory.GC off (disabled) by default, but nonetheless
 allocate objects from GC memory
 * provide a function that can find (and reclaim) retained
 unreachable object graphs that contain strong reference cycles
 * the main purpose of this function is to find and report such
 instances, not to reclaim memory. Retained graphs should be
 reported as warnings on stderr, so that the program can be fixed
 manually, e.g. by weakening some refs in the proper places
 * the function will rely on GC.collect to find unreachable objects
 * the function will *always* be called implicitly when a program
 terminates
 * the function should also be explicitly callable from any point
 within a program.

 Now my questions:

 Is it safe to assume that a call to GC.collect will be handled
 synchronously (and won't return early)?
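D's GC is a stop-the-world GC. Every thread managed by the GC is stopped when a thread runs a collection. The thread that calls GC.collect does the collection work inside that call, so with the default GC the call should not return until the collection has finished.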
 Is there a way to ensure that a collection will never run unless
 GC.collect is called explicitly (even in out-of-memory situations)?
The GC only runs a collection in two cases: when you explicitly tell it to, or when you allocate memory using the GC and it determines that it should run a collection. Disabling the GC normally prevents automatic collections, though per the documentation, a collection may still run if the GC actually runs out of memory. I had thought that disabling prevented collections completely, but that's not what the documentation says. I don't know what the current implementation does.
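As a rough illustration of that behaviour, here is a minimal sketch that disables automatic collections and then triggers one explicitly. The helper `collectAndMeasure` and the struct `Usage` are hypothetical names made up for this example; `GC.disable`, `GC.enable`, `GC.collect` and `GC.stats` are the core.memory calls it relies on. It does not show the out-of-memory case.

```d
import core.memory : GC;
import core.stdc.stdio : fprintf, stderr;

/// Hypothetical result type: GC memory in use before and after a collection.
struct Usage
{
    size_t before;
    size_t after;
}

/// Hypothetical helper: runs one explicit collection and measures its effect.
Usage collectAndMeasure()
{
    immutable before = GC.stats().usedSize;
    GC.collect();   // explicit request; not suppressed by GC.disable
    immutable after = GC.stats().usedSize;
    return Usage(before, after);
}

void main()
{
    GC.disable();               // suppress automatic collections
    scope (exit) GC.enable();   // calls to disable/enable must be balanced

    foreach (i; 0 .. 1_000)
    {
        auto garbage = new ubyte[](1024);   // unreachable after each iteration
    }

    auto u = collectAndMeasure();
    fprintf(stderr, "GC memory in use: %zu bytes before, %zu bytes after\n",
            u.before, u.after);
}
```

How much memory the explicit collection actually recovers depends on the GC implementation and on what the conservative scan happens to find on the stack, so the numbers will vary from run to run.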
 Is it possible and is it OK to print to stderr while the GC is
 collecting (e.g. from @nogc code, using functions from
 core.stdc.stdio)?
No code in any thread managed by the GC runs while a collection is in progress, unless the collection itself triggers it (e.g. a finalizer being called on an object that's being collected). Even a finalizer isn't supposed to access GC-allocated objects, because the GC might have already destroyed them (e.g. in the case of a cycle).

If you want code to run at the same time as a GC collection, it has to be in a thread that is not attached to the GC, and at that point you shouldn't be accessing _anything_ that's managed by the GC unless you have a guarantee that what you're accessing won't be collected. Even then, you shouldn't be mutating any of it.

Also, @nogc doesn't say anything about whether the code accesses GC-allocated objects. It just means that the code isn't allowed to call most GC functions, which usually means that it doesn't allocate anything using the GC and doesn't risk triggering a collection. So a function being @nogc doesn't necessarily mean that it's safe to run from a thread that isn't managed by the GC while a collection is running.
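To make the point about @nogc concrete, here is a small sketch. The functions `sum` and `warn` are hypothetical names invented for this example. Both are @nogc, yet `sum` happily reads memory that was allocated by the GC, and `warn` prints to stderr through core.stdc.stdio without allocating.

```d
import core.stdc.stdio : fprintf, stderr;

/// @nogc only forbids GC allocation inside this function. The slice it
/// receives may still point into GC-managed memory.
int sum(const(int)[] values) @nogc nothrow
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

/// Printing through the C runtime needs no GC allocation, so it is @nogc too.
void warn(const(char)* message) @nogc nothrow
{
    fprintf(stderr, "warning: %s\n", message);
}

void main()
{
    auto data = new int[](3);   // GC-allocated array, created outside @nogc code
    data[] = 2;                 // every element is now 2
    warn("summing a GC-allocated array from @nogc code");
    assert(sum(data) == 6);
}
```

So the attribute tells you what a function will not do to the GC. It says nothing about whether the data the function touches is still alive.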
 Could I implement my function by introducing a shared global flag
 which is set prior to calling GC.collect and reset afterwards, so
 that any destructor can determine whether it has been invoked by a
 "flagged" call to GC.collect and act accordingly?
You should be able to do that, but then the destructor can't be pure (though as I understand it, there's currently a compiler bug with pure destructors anyway which causes them to not be called). Also, when a destructor is run as a finalizer, it shouldn't be accessing any other GC-allocated objects, because the GC might have actually destroyed them already at that point. Finalizers really aren't supposed to do much of anything other than managing what lives in an object directly or managing non-GC-allocated resources.

Regardless, anything that really should be operating as a destructor rather than a finalizer has to live on the stack, since finalizers won't be run until a collection occurs. If you're explicitly running them yourself via your own reference counting, then you don't have that problem, but if there's any chance that a destructor is going to be run as a finalizer by the GC, then you have to write your destructors / finalizers with the idea that that could happen.
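For what it's worth, the flag idea could look roughly like the sketch below. The names `leakCheckActive`, `Node`, `makeCycle` and `reportLeaks` are all hypothetical. The sketch assumes the finalizer only reads the flag and prints its own address, and that it never follows `next`, since that object may already have been destroyed.

```d
import core.atomic : atomicLoad, atomicStore;
import core.memory : GC;
import core.stdc.stdio : fprintf, stderr;

/// Hypothetical flag: true only while reportLeaks() is running a collection.
shared bool leakCheckActive;

final class Node
{
    Node next;   // strong reference; may form a cycle

    ~this()
    {
        // Touch only the flag and this object's own address, never `next`.
        if (atomicLoad(leakCheckActive))
            fprintf(stderr, "warning: Node %p was reclaimed by the leak check\n",
                    cast(void*) this);
    }
}

/// Builds a two-node cycle that becomes unreachable once this returns.
void makeCycle()
{
    auto a = new Node;
    auto b = new Node;
    a.next = b;
    b.next = a;
}

/// Hypothetical leak check: finalizers that run during this call see the flag.
void reportLeaks()
{
    atomicStore(leakCheckActive, true);
    scope (exit) atomicStore(leakCheckActive, false);
    GC.collect();
}

void main()
{
    GC.disable();               // suppress automatic collections
    scope (exit) GC.enable();

    makeCycle();
    reportLeaks();
}
```

Whether the two nodes are actually finalized during that call is not guaranteed: the GC scans conservatively, so a stale pointer left on the stack or in a register can keep them alive. Finalizers that run later, e.g. at program shutdown, would see the flag cleared and stay silent.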
 Alternatively: do I need to implement such a flag, or is there
 already a way in which a destructor can determine whether it has
 been invoked by the GC?

 Thanks for any help!
Honestly, the way things are set up, destructors aren't supposed to know or care about whether they're being run by the GC as a finalizer. So, the GC isn't going to provide that kind of functionality. What you're looking to do is pretty much a giant hack from the perspective of the GC and likely to be pretty dangerous to attempt.

I suspect that what would make a lot more sense would be to create a custom build of druntime to run which specifically printed out what wasn't freed when the program shut down rather than trying to hack around how the GC works. Alternatively, you could just ditch the GC entirely and then use valgrind to see what didn't get freed to catch cycles (or other screw-ups that resulted in memory not being freed). Having the GC take care of cycles for you isn't necessarily a problem, but having the GC report on what's alive or not is tricky business, particularly since it's supposed to keep anything that the program still has access to alive.

Another thing to consider is that some language features outright require the GC (e.g. closures and anything with dynamic arrays involving allocation), and if you truly don't want to use the GC for that stuff, it's probably going to be easier to require that your program not use the GC at all than to try to have it just manage cycles.

Regardless, if you really want to go forward with something like you're proposing here, you'll probably need to get answers from one of the few GC experts around here.

- Jonathan M Davis
Oct 18 2019