digitalmars.D - RFC: Pinning interface for the GC

reply Alex Rønne Petersen <alex lycus.org> writes:
Hi,

With precise garbage collection coming up, and most likely compacting 
garbage collection in the future, I think it's time we start thinking 
about an API to pin garbage collector-managed objects.

A typical approach that people use to 'pin' objects today is to allocate 
a chunk of memory from the C heap, add it as a root [range], and store a 
reference in it. That, or just global variables.
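
For readers unfamiliar with the workaround, it could look roughly like the sketch below. It relies on `GC.addRange` and `GC.removeRange` from core.memory and on `malloc`/`free` from the C standard library; the helper names `holdViaRange` and `releaseRange` are hypothetical and only serve the illustration.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: keeps `obj` alive by storing a reference to it in a
/// C-heap slot that the GC is told to scan. Returns the slot, or null if
/// the allocation failed.
void** holdViaRange(void* obj) nothrow
{
    auto slot = cast(void**) malloc((void*).sizeof);
    if (slot is null)
        return null;
    *slot = obj;                        // the reference that keeps obj reachable
    GC.addRange(slot, (void*).sizeof);  // the GC now scans this C-heap slot
    return slot;
}

/// Hypothetical helper: undoes holdViaRange. The object becomes collectable
/// again unless something else still references it.
void releaseRange(void** slot) nothrow
{
    if (slot is null)
        return;
    GC.removeRange(slot);
    free(slot);
}
```

Note that this only keeps the object reachable; nothing here tells the collector that the object must stay at its current address.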

This is kind of terrible because adding the chunk of memory as a root 
forces the GC to actually scan it, which is unnecessary when what you 
really want is to pin the object in place and tell the GC "I know what 
I'm doing, don't touch this".

I propose the following functions in core.memory.GC:

     static bool pin(const(void)* p) nothrow;
     static bool unpin(const(void)* p) nothrow;

The pin function shall pin the object pointed to by p in place such that 
it is not allowed to be moved nor collected until unpinned. The function 
shall return true if the object was successfully pinned or false if the 
object was already pinned or didn't belong to the garbage collector in 
the first place.

The unpin function shall unpin the object pointed to by p such that it 
is once again eligible for moving and collection as usual. The function 
shall return true if the object was successfully unpinned or false if 
the object was not pinned or didn't belong to the garbage collector in 
the first place.
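
The return-value contract described above can be modelled with a small table. This is only a sketch of the bookkeeping, not a collector change: `PinTable` is a hypothetical type, it does nothing to stop a collector from moving or freeing memory, and it drops the `nothrow` of the proposed signatures for simplicity. It assumes `GC.addrOf` returns null for pointers that are not GC memory.

```d
import core.memory : GC;

/// Hypothetical bookkeeping that reproduces the proposed return values.
struct PinTable
{
    private bool[const(void)*] pinned; // keyed by block base address

    /// true if newly pinned; false if already pinned or not GC memory.
    bool pin(const(void)* p)
    {
        auto base = GC.addrOf(p);      // null if p is not GC memory
        if (base is null || (base in pinned) !is null)
            return false;
        pinned[base] = true;
        return true;
    }

    /// true if unpinned; false if it was not pinned or not GC memory.
    bool unpin(const(void)* p)
    {
        auto base = GC.addrOf(p);
        if (base is null)
            return false;
        return pinned.remove(base);    // remove reports whether the key existed
    }
}
```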

Destroy!

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne 
Petersen wrote:
 This is kind of terrible because adding the chunk of memory as 
 a root forces the GC to actually scan it, which is unnecessary 
 when what you really want is to pin the object in place and 
 tell the GC "I know what I'm doing, don't touch this".

If pointers in pinned objects make their targets live, there would be no difference to simply adding the object as a root. So in your proposal, pinned objects are implicitly marked live if they aren't reachable from any of the roots, but any other objects reachable only from a pinned object but not from a root would be collected – correct?

David
Oct 13 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 13-10-2012 21:24, David Nadlinger wrote:
 On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne Petersen wrote:
 This is kind of terrible because adding the chunk of memory as a root
 forces the GC to actually scan it, which is unnecessary when what you
 really want is to pin the object in place and tell the GC "I know what
 I'm doing, don't touch this".

 If pointers in pinned objects make their targets live, there would be
 no difference to simply adding the object as a root. So in your
 proposal, pinned objects are implicitly marked live if they aren't
 reachable from any of the roots, but any other objects reachable only
 from a pinned object but not from a root would be collected – correct?

 David

There is a difference: Adding the object itself as a root does not actually guarantee that the object *itself* might not be collected. At least, this is how I have to assume things work given that this is not guaranteed here: http://dlang.org/phobos/core_memory.html#addRoot

As for your question: Not quite. A pinned object that points to any other unpinned objects will implicitly keep those alive. This is at least how I would expect it to work, following the principle of least surprise.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
next sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 13-10-2012 21:51, David Nadlinger wrote:
 On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
 As for your question: Not quite. A pinned object that points to any
 other unpinned objects will implicitly keep those alive. This is at
 least how I would expect it to work, following the principle of least
 surprise.

 But then the GC _does_ have to scan those objects to be able to mark
 the whole graph as live, no? Wasn't it this what you were referring to
 as "kind of terrible" in your first post?

Ah, I could have been clearer here. The problem with using roots is two-fold:

1) It adds unnecessary work for the marking phase.
2) It forces scanning of 'pinned' objects to be imprecise.

(1) is not so much of a problem (it only happens if you have root ranges with null pointers and so on), but (2) can be.

Another problem that would pop up if we made scanning of roots precise is that, then, the stored reference could be moved (as you also pointed out).

I guess there is also the issue of adding roots being a relatively expensive operation - it has to go through a mutex-guarded function whereas pinning objects can be made lock-free (at least on some architectures).

I think the problem boils down to using roots for something they're not meant to be used for, semantically.

 But yes, for a moving GC, a way to pin objects would have to be added,
 and lots of code using GC.add*() for interfacing with C would have to be
 changed – or we make those functions actually pin the objects for
 backwards compatibility and add a new set of functions which really just
 add something as a root.

 David

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
prev sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 13-10-2012 22:00, David Nadlinger wrote:
 On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
 There is a difference: Adding the object itself as a root does not
 actually guarantee that the object *itself* might not be collected. At
 least, this is how I have to assume things work given that this is not
 guaranteed here: http://dlang.org/phobos/core_memory.html#addRoot

 Actually, it does: the internal array of added roots is simply
 considered an additional range to be scanned by the GC implementation.
 The docs should probably be clarified in this regard.

 David

That's good to know. I'm not convinced that this should be defined behavior though. It encourages semantically incorrect use of the API (see my other reply).

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne 
Petersen wrote:
 As for your question: Not quite. A pinned object that points to 
 any other unpinned objects will implicitly keep those alive. 
 This is at least how I would expect it to work, following the 
 principle of least surprise.

But then the GC _does_ have to scan those objects to be able to mark the whole graph as live, no? Wasn't it this what you were referring to as "kind of terrible" in your first post?

But yes, for a moving GC, a way to pin objects would have to be added, and lots of code using GC.add*() for interfacing with C would have to be changed – or we make those functions actually pin the objects for backwards compatibility and add a new set of functions which really just add something as a root.

David
Oct 13 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne 
Petersen wrote:
 There is a difference: Adding the object itself as a root does 
 not actually guarantee that the object *itself* might not be 
 collected. At least, this is how I have to assume things work 
 given that this is not guaranteed here: 
 http://dlang.org/phobos/core_memory.html#addRoot

Actually, it does: the internal array of added roots is simply considered an additional range to be scanned by the GC implementation. The docs should probably be clarified in this regard.

David
Oct 13 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 13 October 2012 at 20:00:54 UTC, David Nadlinger 
wrote:
 Actually, it does: the internal array of added roots is simply 
 considered an additional range to be scanned by the GC 
 implementation. The docs should probably be clarified in this 
 regard.

https://github.com/D-Programming-Language/druntime/pull/322

David
Oct 13 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 13 October 2012 at 20:12:04 UTC, Alex Rønne 
Petersen wrote:
 2) It forces scanning of 'pinned' objects to be imprecise.

I'm not so sure about that. In the comment section you added to Git core.memory, you wrote »Roots are always scanned conservatively. Roots include […] memory locations added through the GC.addRoot and GC.addRange functions.«.

But this statement is problematic, since addRange() adds a »memory location« consisting of root pointers, whereas addRoot() adds a single (rvalue) root pointer. Thus, depending on which case you consider, »memory location« would refer to different levels of indirection.

As far as I can see, adding objects you want to »pin« as roots would only force them to be scanned imprecisely if you'd force the entire GC memory block referred to by addRoot() or, respectively, all the GC blocks referred to by the range added using addRange() to be scanned conservatively. But why would this be necessary?

David
Oct 13 2012
prev sibling next sibling parent "dsimcha" <dsimcha yahoo.com> writes:
We already have a NO_MOVE attribute that can be set or unset.  
What's wrong with that?

http://dlang.org/phobos/core_memory.html#NO_MOVE
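
A minimal sketch of using that attribute, assuming the block-attribute calls in core.memory (`GC.setAttr`, `GC.clrAttr`, `GC.BlkAttr.NO_MOVE`) and `GC.addrOf` to find the block's base address. The helper names `setNoMove` and `clearNoMove` are hypothetical. Whether a given collector records or honors the flag may depend on the implementation, so the sketch does not read the attribute back.

```d
import core.memory : GC;

/// Hypothetical helper: requests NO_MOVE on the block containing p.
/// Returns false if p does not point into GC memory.
bool setNoMove(void* p) nothrow
{
    auto base = GC.addrOf(p);  // null if p is not GC memory
    if (base is null)
        return false;
    GC.setAttr(base, GC.BlkAttr.NO_MOVE);
    return true;
}

/// Hypothetical helper: clears NO_MOVE on the block containing p.
/// Returns false if p does not point into GC memory.
bool clearNoMove(void* p) nothrow
{
    auto base = GC.addrOf(p);
    if (base is null)
        return false;
    GC.clrAttr(base, GC.BlkAttr.NO_MOVE);
    return true;
}
```

The attribute is a single flag per block: it says nothing about keeping the block alive, and it cannot record how many callers asked for it.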

Oct 13 2012
prev sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 10/13/2012 8:58 PM, Alex Rønne Petersen wrote:
 Hi,

 With precise garbage collection coming up, and most likely compacting
 garbage collection in the future, I think it's time we start thinking
 about an API to pin garbage collector-managed objects.

 A typical approach that people use to 'pin' objects today is to allocate
 a chunk of memory from the C heap, add it as a root [range], and store a
 reference in it. That, or just global variables.

I guess people don't think about pinning because the GC is not moving objects ;-)

As of today, you usually add a root to a garbage collected memory object to keep it in memory (and it can be scanned precisely), but you add a range to a memory chunk not managed by the garbage collector for scanning (you can't pass type info for this, so it is scanned conservatively). So this discussion is about addRoot/removeRoot, not addRange/removeRange (at least in the terminology of the current gc implementation).

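
The addRoot side of that distinction might be sketched as follows, using `GC.addRoot` and `GC.removeRoot` from core.memory. The helper names `retainForC` and `releaseFromC` are hypothetical, and the foreign code that would receive the pointer is left out.

```d
import core.memory : GC;

/// Hypothetical helper: registers a GC object as a root before its address
/// is handed to code the collector cannot see, and returns that address.
void* retainForC(Object obj) nothrow
{
    auto p = cast(void*) obj;
    GC.addRoot(p);   // obj stays alive even if no D reference remains
    return p;
}

/// Hypothetical helper: call once the foreign code no longer holds p.
void releaseFromC(void* p) nothrow
{
    GC.removeRoot(p);
}
```

Here the root is the GC object itself, whereas addRange registers a region of non-GC memory that may contain pointers to GC objects.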
 This is kind of terrible because adding the chunk of memory as a root
 forces the GC to actually scan it, which is unnecessary when what you
 really want is to pin the object in place and tell the GC "I know what
 I'm doing, don't touch this".

 I propose the following functions in core.memory.GC:

      static bool pin(const(void)* p) nothrow;
      static bool unpin(const(void)* p) nothrow;

 The pin function shall pin the object pointed to by p in place such that
 it is not allowed to be moved nor collected until unpinned. The function
 shall return true if the object was successfully pinned or false if the
 object was already pinned or didn't belong to the garbage collector in
 the first place.

 The unpin function shall unpin the object pointed to by p such that it
 is once again eligible for moving and collection as usual. The function
 shall return true if the object was successfully unpinned or false if
 the object was not pinned or didn't belong to the garbage collector in
 the first place.

 Destroy!

Your proposal splits the addRoot-functionality of holding a reference, scanning and moving a garbage-collected object into two functions addRoot/pin. For a non-moving garbage collector, this is not really an issue.

Adding a root means that there is a reference to the object somewhere outside the reach of the garbage collector, so moving it would make that reference invalid. Not scanning the root object could cause referenced memory chunks to be collected, making the root object invalid. So I think the addRoot functionality should not change; tearing it apart could create invalid references.

The pin/unpin functions do make sense for a moving garbage collector, but your motivation above is misleading. It is not related to adding roots or ranges, though I guess using roots is the safer way. Well, thinking about it again, what's the use case of pinning without an external reference and keeping the object alive, just as addRoot would do?

Another question, which also affects addRoot/addRange: should there be a pin/unpin counter, so that an object that is pinned twice also needs to be unpinned twice to be movable again? This cannot be implemented by the simple NO_MOVE flag mentioned by David. Roots and ranges implement it, but in a slightly inefficient way because the memory is scanned multiple times then.
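
The counter idea can be illustrated with a small sketch. `PinCounter` is a hypothetical type that only shows the counting semantics (pinned twice, unpinned twice); it is not tied to any collector and does not itself prevent moving or collection.

```d
/// Hypothetical pin counter: an address stays pinned until it has been
/// unpinned as many times as it was pinned.
struct PinCounter
{
    private uint[const(void)*] counts;

    /// Increments the pin count for p and returns the new count.
    uint pin(const(void)* p)
    {
        if (auto c = p in counts)
            return ++(*c);
        counts[p] = 1;
        return 1;
    }

    /// Decrements the pin count for p. Returns false if p was not pinned.
    bool unpin(const(void)* p)
    {
        auto c = p in counts;
        if (c is null)
            return false;
        if (--(*c) == 0)
            counts.remove(p);  // last unpin: p is movable again
        return true;
    }

    /// True while at least one pin on p is outstanding.
    bool isPinned(const(void)* p) const
    {
        return (p in counts) !is null;
    }
}
```

A single flag such as NO_MOVE cannot express this, because the first unpin would clear it regardless of how many pins were taken.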
Oct 14 2012