digitalmars.D - GC Idea: "explicit types" (Repost)

(I proposed this about a year ago, and thought it might be relevant to
what Andrei is pursuing.)

Before I get into my proposal, I want to vote for stack maps to be added to
D.  IMO, stack maps are the next logical step to making the GC faster.  They
don't require a fundamental shift in the library like a moving GC would.
Once stack maps are added, the following proposal could then be considered
to gain additional GC performance.

I'm not stuck on terminology here, so if you don't like the term "explicit"
because it's too overloaded, that's fine with me.  Pick another term.  The
concept is what's important.  This proposal is about getting GC and explicit
memory management to play well together.  The idea is to give the compiler
information that allows the GC to scan less data, and hence perform better.
Let's start with a class that uses explicit memory management.

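import std.c.stdlib;  // C malloc/free

class Foo
{
public:
    // Class allocator: instances come from the C heap, not the GC heap.
    new(size_t sz) { return std.c.stdlib.malloc(sz); }
    // Class deallocator: returns the instance's memory to the C heap.
    delete(void* p) { std.c.stdlib.free(p); }
}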

This works fine, but doesn't tell the compiler whether data referenced by
Foo is allocated on the GC heap or not.  If we preceded the class with some
kind of qualifier, like "explicit", this would indicate to the compiler that
data referenced by Foo is not allocated on the GC heap.  Note: this constraint
can't be enforced by the compiler, but could be enforced via run-time debug
assertions.

// "explicit" is the proposed qualifier; it is not existing D syntax.
explicit class Foo
{
public:
    new(size_t sz) { return std.c.stdlib.malloc(sz); }
    delete(void* p) { std.c.stdlib.free(p); }
}

A problem here arises because even though Foo is allocated on the malloc
heap, it could contain references, pointers, or arrays that touch the GC
heap.  Thus, making Foo "explicit" also denotes that any reference, pointer
or array contained by Foo is also explicit, and therefore does not refer to
data on the GC heap.  Interestingly, this means that "explicit" would have
to be transitive, like D's const.
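
For comparison, here is a small sketch of the transitivity that D's const
already has, which is the behaviour "explicit" would need to mirror.  The
names Inner and Outer are made up for illustration; nothing here uses the
proposed qualifier, only existing const rules.

```d
struct Inner { int* p; }
struct Outer { Inner* inner; }

void main()
{
    int x = 1;
    auto inner = Inner(&x);

    // Only the outermost object is declared const...
    const Outer o = Outer(&inner);

    // ...yet everything reachable through it is const as well.
    static assert(is(typeof(o.inner) == const(Inner*)));
    static assert(is(typeof(o.inner.p) == const(int*)));

    // Writing through the nested pointer is rejected at compile time.
    static assert(!__traits(compiles, *o.inner.p = 2));
}
```

By analogy, a pointer reached through an "explicit" object would itself be
treated as explicit, however many levels down it sits.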

Thus, for the explicit qualifier to be useful, it must be applicable to a
struct, class, pointer, reference, or array type.  However, it doesn't make
sense to apply it to primitive or POD types.  If you follow my logic, you
can see what explicit types do: they inform the compiler that no GC heap
data will be referenced, so that the compiler can exclude explicit types
from GC scanning.  Further, the use of explicit can be enforced via
run-time debug assertions.  Note that there are a few implementation details
that I'm ignoring for now for simplicity's sake.
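
One way the run-time debug assertion might look, sketched with the existing
core.memory API rather than the proposed qualifier.  It assumes GC.addrOf
returns null for any pointer outside GC-managed memory; Node, makeNode and
pointsIntoGCHeap are hypothetical names, and Node merely stands in for a
type that would be marked "explicit".

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: true if p points into memory managed by the GC.
bool pointsIntoGCHeap(const(void)* p) nothrow
{
    return p !is null && GC.addrOf(cast(void*) p) !is null;
}

// Stand-in for an "explicit" type: lives on the malloc heap, and its
// pointer field must never refer to GC-heap data.
struct Node
{
    int value;
    Node* next;

    void setNext(Node* n)
    {
        // Debug-only check; asserts are compiled out with -release.
        assert(!pointsIntoGCHeap(n),
               "explicit type must not reference the GC heap");
        next = n;
    }
}

// Allocate a Node with malloc, outside the GC heap.
Node* makeNode(int value)
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null, "malloc failed");
    *n = Node(value, null);
    return n;
}

void main()
{
    auto a = makeNode(1);
    auto b = makeNode(2);
    a.setNext(b);          // fine: b is on the malloc heap

    auto g = new Node;     // GC allocation
    assert(pointsIntoGCHeap(g));
    assert(!pointsIntoGCHeap(b));
    // a.setNext(g);       // would trip the assertion in a debug build

    free(b);
    free(a);
}
```

With a real qualifier the compiler would insert such checks itself; here
they have to be written by hand in every setter.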

-Craig 
Apr 26 2009