www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - GC behavior and Allocators

reply "Brian Schott" <briancschott gmail.com> writes:
Experience has shown that using allocators can drastically 
improve the execution time of D programs. It has also shown that 
the biggest issue with allocators is that unless you are very 
careful, the GC will start freeing your live memory.

I think that we need to define the behavior of 
core.memory.GC.addRange more precisely. Consider the following 
pseudo-code:

auto a = allocate(aSize);
GC.addRange(a, aSize);
auto a = reallocate(a, aSize * 2);
GC.addRange(a, aSize * 2);

What does this do? Well, if the proxy pointer inside of the GC is 
null, it creates two entries. If it's not null and the default GC 
implementation is being used, it results in one entry being 
present. This single entry will only contain the smaller size, 
leaving the GC free to collect memory pointed to by the second 
half of your array.

How about removing? Again, it depends. If the default GC is being 
used there can only be one entry per pointer, so removeRange does 
exactly what you expect. If the proxy pointer is null, 
gc_removeRange falls back to an implementation that scans through 
its ranges array and overwrites the first entry containing a 
matching pointer with the last entry in the array, and then 
decrementing the array length. This reordering means if you have 
multiple entries for various pointers, you don't know which one 
is being removed when you call removeRange. This is fine if you 
like to practice luck-oriented programming.

What about calling removeRange and then addRange immediately? 
This will only work in single-threaded code. In a multi-threaded 
program the GC can run between the calls to removeRange and 
addRange.

I think that the only sane way to solve this is to define in the 
specs for core.memory that GC.addRange will only ever store one 
entry per pointer, and that the length will be the value of "sz" 
from the most recent call to addRange.

Thoughts?
Jul 03 2014
next sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:
 I think that the only sane way to solve this is to define in 
 the specs for core.memory that GC.addRange will only ever store 
 one entry per pointer, and that the length will be the value of 
 "sz" from the most recent call to addRange.

 Thoughts?

Easy to change the behaviour of the 2.066 GC, just force update instead of ignoring duplicates: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/util/container/treap.d#L138 I feel there are instances where core.memory is underspecified. When I was implementing the update for addRange from array to treap it was considered "undefined behaviour" to addRange twice with the same base pointer without first removeRange'ing the second. That's why the code currently ignores dups. Boehm GC on the other hand uses an array similar to the previous version of the D2 GC, except that it has code for properly merging duplicates (this has performance implications though.) Please keep sharing your insight into practical use of these APIs. :)
Jul 03 2014
prev sibling next sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:
 I think that the only sane way to solve this is to define in 
 the specs for core.memory that GC.addRange will only ever store 
 one entry per pointer, and that the length will be the value of 
 "sz" from the most recent call to addRange.

 Thoughts?

I just thought a little more about this and you will always have a race. Consider this code: auto a = malloc(aSize); GC.addRange(a, aSize); auto b = realloc(a, aSize * 2); If realloc moves the data (a != b) and the GC runs before you can fix up the ranges you can have invalid access during collection.
Jul 04 2014
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
If you reallocate doubling the size, it's likely such reallocs 
always move, so they should be equivalent to malloc+free, so your 
code can be

mem2 = alloc(sz*2);
mem2[] = mem1[];
addRange(mem2);
removeRange(mem1);
free(mem1);

if not, you need to lock the GC so that it won't interfere during 
realloc.
Jul 04 2014
prev sibling next sibling parent "Chris Cain" <zshazz gmail.com> writes:
On Friday, 4 July 2014 at 10:36:03 UTC, safety0ff wrote:
 I just thought a little more about this and you will always 
 have a race.

 Consider this code:
 auto a = malloc(aSize);
 GC.addRange(a, aSize);
 auto b = realloc(a, aSize * 2);

 If realloc moves the data (a != b) and the GC runs before you 
 can fix up the ranges you can have invalid access during 
 collection.

http://dlang.org/phobos/core_memory.html#.GC.enable If GC.enable and GC.disable truly disallowed GC running (or alternative `GC.hard_disable`/`GC.hard_enable` existed that guaranteed such) then you could use that to make sure that the GC didn't collect in the middle of a pair of those calls. On Friday, 4 July 2014 at 11:54:07 UTC, Kagamin wrote:
 If you reallocate doubling the size, it's likely such reallocs 
 always move, so they should be equivalent to malloc+free, so 
 your code can be

 mem2 = alloc(sz*2);
 mem2[] = mem1[];
 addRange(mem2);
 removeRange(mem1);
 free(mem1);

 if not, you need to lock the GC so that it won't interfere 
 during realloc.

Is there a way to lock the GC currently?
Jul 04 2014
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 Is there a way to lock the GC currently?

There are critical regions in core.thread. While in such a region, your thread will never be suspended, effectively also precluding the GC from running. They are a rather dangerous tool though, use them only if absolutely required and on small, controlled pieces of code. David
Jul 04 2014
prev sibling next sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 If GC.enable and GC.disable truly disallowed GC running (or 
 alternative `GC.hard_disable`/`GC.hard_enable` existed that 
 guaranteed such) then you could use that to make sure that the 
 GC didn't collect in the middle of a pair of those calls.

Right, this should be sufficient in the current implementation. On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:
 There are critical regions in core.thread. While in such a 
 region, your thread will never be suspended, effectively also 
 precluding the GC from running. They are a rather dangerous 
 tool though, use them only if absolutely required and on small, 
 controlled pieces of code.

I think this will deadlock on the GC mutex.
Jul 04 2014
prev sibling next sibling parent "Brian Schott" <briancschott gmail.com> writes:
On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:
 On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 Is there a way to lock the GC currently?

There are critical regions in core.thread. While in such a region, your thread will never be suspended, effectively also precluding the GC from running. They are a rather dangerous tool though, use them only if absolutely required and on small, controlled pieces of code. David

So something like this would work? void* GCAwareRealloc(void* ptr, size_t size) { import core.thread; void* r; thread_enterCriticalRegion(); scope (exit) thread_exitCriticalRegion(); r = realloc(ptr, size); // Assuming that addRange performs an update of size GC.addRange(r, size); return r; }
Jul 05 2014
prev sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Sunday, 6 July 2014 at 04:58:42 UTC, Brian Schott wrote:
 So something like this would work?

 void* GCAwareRealloc(void* ptr, size_t size)
 {
   import core.thread;
   void* r;
   thread_enterCriticalRegion();
   scope (exit) thread_exitCriticalRegion();
   r = realloc(ptr, size);
   // Assuming that addRange performs an update of size
   GC.addRange(r, size);
   return r;
 }

No, it can cause a deadlock: Thread 1: thread_enterCriticalRegion(); Thread 2: Takes GC lock. Thread 1: Tries to take GC lock to add range, but it has to wait. Thread 2: triggers a collection -> thread_suspendAll(). Thread 2 waits on thread 1 to exit critical section, thread 1 waits on thread 2 to release GC lock. So the best we can do at the moment is: void* GCAwareRealloc(void* ptr, size_t size) { void* r; GC.disable; scope (exit) GC.enable; GC.removeRange(ptr); r = realloc(ptr, size); GC.addRange(r, size); return r; } With your proposed enhancement we get: void* GCAwareRealloc(void* ptr, size_t size) { void* r; GC.disable; scope (exit) GC.enable; r = realloc(ptr, size); if (r != ptr) GC.removeRange(ptr); GC.addRange(r, size); // if r == ptr, only updates size return r; } Which executes a little faster because we sometimes do 1 less GC range operation (plus 1 less GC lock.)
Jul 06 2014