digitalmars.D.learn - Regarding the more precise GC

"bearophile" <bearophileHUGS lycos.com> writes:
In the main D newsgroup I have seen the two recent threads 
regarding a more precise GC in D. I have two questions about 
it that seem a better fit for D.learn.

1) I have not fully understood the performance and memory 
implications of the more precise GC. In the thread I think I've 
seen a memory overhead in every allocated struct. Is it possible 
to disable this bookkeeping/overhead for smaller programs that 
enjoy/need a GC but probably don't need a precise GC because they 
run for no more than 30 seconds? Many of my small D programs 
are command-line utilities with a short run-time, but they often 
need a GC.


2) If I have a tagged union, like:


static bool isPointer;

union Foo {
     size_t count;
     int* ptr;
}

At every point of my program I know whether a Foo holds a 
pointer or a size_t, according to the value of isPointer. (A 
similar case is when I use a single tag bit inside a size_t to 
denote whether it's a pointer or an integral value. The D docs 
say that with the current conservative GC design such single-bit 
pointer tagging is not allowed if the pointer is to GC-managed 
memory, but the union is allowed. Maybe with the more precise GC 
even pointer tagging becomes possible.)

Is it possible to tell the precise GC, every time it performs a 
collection, whether a Foo contains a pointer to follow or just an 
integral value to ignore? I think it needs to be a function 
pointer or some kind of callback.

Bye and thank you,
bearophile
Apr 21 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 22.04.2012 4:28, bearophile wrote:
 In the main D newsgroup I have seen the two recent threads regarding a
 more precise GC in D. I have two questions about it that seem a better
 fit for D.learn.

 1) I have not fully understood the performance and memory implications
 of the more precise GC. In the thread I think I've seen a memory
 overhead in every allocated struct. Is it possible to disable this
 bookkeeping/overhead for smaller programs that enjoy/need a GC but
 probably don't need a precise GC because they run for no more than
 30 seconds? Many of my small D programs are command-line utilities with
 a short run-time, but they often need a GC.
AFAICT the overhead is per user-defined type. In typical cases, types
with no indirections, or structs/classes shorter than 32*size_t.sizeof
bytes (64*size_t.sizeof on 64-bit), need only one word of pointer
bitmask per struct/class. That assumes a simple bitmask is used where
applicable; there are better and more compact ways still. Most likely
there is also another pointer per type for a hook in TypeInfo inserted
by the compiler. So, on average it's about
2 * num_of_abstractions * pointer.sizeof + small_extra.
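
To make the "one word per type" estimate concrete, here is a minimal sketch of how such a bitmask could be computed at compile time. It is not the proposed GC's actual implementation: pointerBitmap, Node and Plain are hypothetical names made up for illustration, and the helper only looks at top-level raw pointer fields of a struct (no nested structs, arrays, slices or class references).

```d
import std.stdio : writefln;
import std.traits : isPointer;

// Hypothetical helper (not a druntime API): one bit per size_t-sized word
// of struct T, set when a raw pointer field starts at that word.
size_t pointerBitmap(T)()
{
    // A single-word mask covers 32 words on 32-bit and 64 words on 64-bit.
    static assert(T.sizeof <= size_t.sizeof * size_t.sizeof * 8,
                  "type too large for a single-word bitmap");
    size_t mask = 0;
    foreach (i, F; typeof(T.tupleof))
    {
        // Only top-level raw pointers are considered in this sketch.
        static if (isPointer!F)
            mask |= cast(size_t) 1 << (T.tupleof[i].offsetof / size_t.sizeof);
    }
    return mask;
}

struct Node
{
    size_t count; // word 0: plain integer
    Node* next;   // word 1: pointer
    int* data;    // word 2: pointer
}

struct Plain
{
    int a;
    int b;
}

// Evaluated at compile time: the cost is one word per type, not per instance.
enum nodeMask = pointerBitmap!Node();
static assert(nodeMask == 0b110);
static assert(pointerBitmap!Plain() == 0);

void main()
{
    writefln("Node mask:  %b", nodeMask);
    writefln("Plain mask: %b", pointerBitmap!Plain());
}
```

Since the mask is a compile-time constant of the type, a short-lived program would pay for it once per type rather than once per allocation, which is the point of the estimate above.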
 2) If I have a tagged union, like:


 static bool isPointer;

 union Foo {
 size_t count;
 int* ptr;
 }

 At every point of my program I know whether a Foo holds a pointer or a
 size_t, according to the value of isPointer. (A similar case is when I
 use a single tag bit inside a size_t to denote whether it's a pointer
 or an integral value. The D docs say that with the current conservative
 GC design such single-bit pointer tagging is not allowed if the pointer
 is to GC-managed memory, but the union is allowed. Maybe with the more
 precise GC even pointer tagging becomes possible.)

 Is it possible to tell the precise GC, every time it performs a
 collection, whether a Foo contains a pointer to follow or just an
 integral value to ignore? I think it needs to be a function pointer or
 some kind of callback.
It may be that the cost of calling per-instance callbacks is just too
high, so being conservative with them is preferable. But otherwise I
think the currently proposed scheme allows it (as long as the collector
supports this; the meta info has to be in sync with the GC). Namely, a
type provides a callback that returns a bitmask when the GC calls it
with the type's memory block as a parameter. Yet I believe the GC's
per-type callbacks are presently assumed to be agnostic to the actual
content of the memory block. It can be extended from here if deemed
useful.
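
A rough sketch of what such a content-aware callback might look like for a tagged union follows. Everything here is an assumption for illustration: the PointerMaskFn signature, the gcPointerMask name and the Tagged struct are hypothetical, the tag is stored inside the instance (rather than in a static variable) so the callback can decide from the block alone, and no existing GC calls anything like this.

```d
// Hypothetical hook signature: given the start of an instance, return a
// bitmask with one bit per word that currently holds a GC pointer.
alias PointerMaskFn = size_t function(const(void)* block);

struct Tagged
{
    bool isPtr; // the tag, kept inside the instance
    union
    {
        size_t count;
        int* ptr;
    }

    // Hypothetical per-type callback a collector could invoke per instance.
    static size_t gcPointerMask(const(void)* block)
    {
        auto self = cast(const(Tagged)*) block;
        // Index of the word that the union occupies inside Tagged.
        enum word = Tagged.ptr.offsetof / size_t.sizeof;
        return self.isPtr ? (cast(size_t) 1 << word) : 0;
    }
}

void main()
{
    PointerMaskFn hook = &Tagged.gcPointerMask;
    int value = 42;

    Tagged t;
    t.isPtr = false;
    t.count = 7;
    assert(hook(&t) == 0);    // integral payload: nothing to follow

    t.isPtr = true;
    t.ptr = &value;
    assert(hook(&t) == 0b10); // pointer payload lives in word 1
}
```

The trade-off mentioned above shows up directly: the collector would have to make one indirect call per instance instead of reading one constant mask per type.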
 Bye and thank you,
 bearophile
-- Dmitry Olshansky
Apr 22 2012