digitalmars.D - GC behavior and Allocators

Brian Schott (36/36) Jul 03 2014 Experience has shown that using allocators can drastically

safety0ff (14/19) Jul 03 2014 Easy to change the behaviour of the 2.066 GC, just force update
safety0ff (9/14) Jul 04 2014 I just thought a little more about this and you will always have

Chris Cain (8/27) Jul 04 2014 http://dlang.org/phobos/core_memory.html#.GC.enable

David Nadlinger (7/8) Jul 04 2014 There are critical regions in core.thread. While in such a

Brian Schott (13/21) Jul 05 2014 So something like this would work?

safety0ff (33/45) Jul 06 2014 No, it can cause a deadlock:

safety0ff (4/13) Jul 04 2014 Right, this should be sufficient in the current implementation.

Kagamin (10/10) Jul 04 2014 If you reallocate doubling the size, it's likely such reallocs

"Brian Schott" <briancschott gmail.com> writes:

Experience has shown that using allocators can drastically 
improve the execution time of D programs. It has also shown that 
the biggest issue with allocators is that unless you are very 
careful, the GC will start freeing your live memory.

I think that we need to define the behavior of 
core.memory.GC.addRange more precisely. Consider the following 
pseudo-code:

auto a = allocate(aSize);
GC.addRange(a, aSize);
auto a = reallocate(a, aSize * 2);
GC.addRange(a, aSize * 2);

What does this do? Well, if the proxy pointer inside of the GC is 
null, it creates two entries. If it's not null and the default GC 
implementation is being used, it results in one entry being 
present. This single entry will only contain the smaller size, 
leaving the GC free to collect memory pointed to by the second 
half of your array.

How about removing? Again, it depends. If the default GC is being 
used there can only be one entry per pointer, so removeRange does 
exactly what you expect. If the proxy pointer is null, 
gc_removeRange falls back to an implementation that scans through 
its ranges array and overwrites the first entry containing a 
matching pointer with the last entry in the array, and then 
decrementing the array length. This reordering means if you have 
multiple entries for various pointers, you don't know which one 
is being removed when you call removeRange. This is fine if you 
like to practice luck-oriented programming.

What about calling removeRange and then addRange immediately? 
This will only work in single-threaded code. In a multi-threaded 
program the GC can run between the calls to removeRange and 
addRange.

I think that the only sane way to solve this is to define in the 
specs for core.memory that GC.addRange will only ever store one 
entry per pointer, and that the length will be the value of "sz" 
from the most recent call to addRange.

Thoughts?

Jul 03 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:
 I think that the only sane way to solve this is to define in 
 the specs for core.memory that GC.addRange will only ever store 
 one entry per pointer, and that the length will be the value of 
 "sz" from the most recent call to addRange.

 Thoughts?

Easy to change the behaviour of the 2.066 GC, just force update 
instead of ignoring duplicates:
https://github.com/D-Programming-Language/druntime/blob/master/src/rt/util/container/treap.d#L138

I feel there are instances where core.memory is underspecified.
When I was implementing the update for addRange from array to 
treap it was considered "undefined behaviour" to addRange twice 
with the same base pointer without first removeRange'ing the 
second. That's why the code currently ignores dups.

Boehm GC on the other hand uses an array similar to the previous 
version of the D2 GC, except that it has code for properly 
merging duplicates (this has performance implications though.)

Please keep sharing your insight into practical use of these 
APIs. :)

Jul 03 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:
 I think that the only sane way to solve this is to define in 
 the specs for core.memory that GC.addRange will only ever store 
 one entry per pointer, and that the length will be the value of 
 "sz" from the most recent call to addRange.

 Thoughts?

I just thought a little more about this and you will always have 
a race.

Consider this code:
auto a = malloc(aSize);
GC.addRange(a, aSize);
auto b = realloc(a, aSize * 2);

If realloc moves the data (a != b) and the GC runs before you can 
fix up the ranges you can have invalid access during collection.

Jul 04 2014

"Chris Cain" <zshazz gmail.com> writes:

On Friday, 4 July 2014 at 10:36:03 UTC, safety0ff wrote:
 I just thought a little more about this and you will always 
 have a race.

 Consider this code:
 auto a = malloc(aSize);
 GC.addRange(a, aSize);
 auto b = realloc(a, aSize * 2);

 If realloc moves the data (a != b) and the GC runs before you 
 can fix up the ranges you can have invalid access during 
 collection.



If GC.enable and GC.disable truly disallowed GC running (or 
alternative `GC.hard_disable`/`GC.hard_enable` existed that 
guaranteed such) then you could use that to make sure that the GC 
didn't collect in the middle of a pair of those calls.

On Friday, 4 July 2014 at 11:54:07 UTC, Kagamin wrote:
 If you reallocate doubling the size, it's likely such reallocs 
 always move, so they should be equivalent to malloc+free, so 
 your code can be

 mem2 = alloc(sz*2);
 mem2[] = mem1[];
 addRange(mem2);
 removeRange(mem1);
 free(mem1);

 if not, you need to lock the GC so that it won't interfere 
 during realloc.

Is there a way to lock the GC currently?

Jul 04 2014

"David Nadlinger" <code klickverbot.at> writes:

On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 Is there a way to lock the GC currently?

There are critical regions in core.thread. While in such a 
region, your thread will never be suspended, effectively also 
precluding the GC from running. They are a rather dangerous tool 
though, use them only if absolutely required and on small, 
controlled pieces of code.

David

Jul 04 2014

"Brian Schott" <briancschott gmail.com> writes:

On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:
 On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 Is there a way to lock the GC currently?

 There are critical regions in core.thread. While in such a 
 region, your thread will never be suspended, effectively also 
 precluding the GC from running. They are a rather dangerous 
 tool though, use them only if absolutely required and on small, 
 controlled pieces of code.

 David

So something like this would work?

void* GCAwareRealloc(void* ptr, size_t size)
{
   import core.thread;
   void* r;
   thread_enterCriticalRegion();
   scope (exit) thread_exitCriticalRegion();
   r = realloc(ptr, size);
   // Assuming that addRange performs an update of size
   GC.addRange(r, size);
   return r;
}

Jul 05 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Sunday, 6 July 2014 at 04:58:42 UTC, Brian Schott wrote:
 So something like this would work?

 void* GCAwareRealloc(void* ptr, size_t size)
 {
   import core.thread;
   void* r;
   thread_enterCriticalRegion();
   scope (exit) thread_exitCriticalRegion();
   r = realloc(ptr, size);
   // Assuming that addRange performs an update of size
   GC.addRange(r, size);
   return r;
 }

No, it can cause a deadlock:

Thread 1: thread_enterCriticalRegion();
Thread 2: Takes GC lock.
Thread 1: Tries to take GC lock to add range, but it has to wait.
Thread 2: triggers a collection -> thread_suspendAll().

Thread 2 waits on thread 1 to exit critical section, thread 1 
waits on thread 2 to release GC lock.

So the best we can do at the moment is:
void* GCAwareRealloc(void* ptr, size_t size)
{
   void* r;
   GC.disable;
   scope (exit) GC.enable;
   GC.removeRange(ptr);
   r = realloc(ptr, size);
   GC.addRange(r, size);
   return r;
}

With your proposed enhancement we get:
void* GCAwareRealloc(void* ptr, size_t size)
{
   void* r;
   GC.disable;
   scope (exit) GC.enable;
   r = realloc(ptr, size);
   if (r != ptr)
     GC.removeRange(ptr);
   GC.addRange(r, size); // if r == ptr, only updates size
   return r;
}

Which executes a little faster because we sometimes do 1 less GC 
range operation (plus 1 less GC lock.)

Jul 06 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:
 If GC.enable and GC.disable truly disallowed GC running (or 
 alternative `GC.hard_disable`/`GC.hard_enable` existed that 
 guaranteed such) then you could use that to make sure that the 
 GC didn't collect in the middle of a pair of those calls.

Right, this should be sufficient in the current implementation.

On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:
 There are critical regions in core.thread. While in such a 
 region, your thread will never be suspended, effectively also 
 precluding the GC from running. They are a rather dangerous 
 tool though, use them only if absolutely required and on small, 
 controlled pieces of code.

I think this will deadlock on the GC mutex.

Jul 04 2014

"Kagamin" <spam here.lot> writes:

If you reallocate doubling the size, it's likely such reallocs 
always move, so they should be equivalent to malloc+free, so your 
code can be

mem2 = alloc(sz*2);
mem2[] = mem1[];
addRange(mem2);
removeRange(mem1);
free(mem1);

if not, you need to lock the GC so that it won't interfere during 
realloc.

Jul 04 2014

D Programming

C/C++ Programming

Other

digitalmars.D - GC behavior and Allocators