digitalmars.D - Designing with the GC out of mind

"JS" <js.mdnq gmail.com> writes:
I would like to remove the GC dependence from the class objects 
I am designing and, at some point, remove the GC completely if 
possible (assuming D itself eventually becomes GC agnostic).

1. Is there a way to determine what the GC is doing? I would like 
to know just how much my projects depend on the GC (both in D 
internals and in my own code).

2. I would like all classes to implement an (almost final) 
interface that allows for custom allocation schemes. I suppose I 
can modify the sources, but is there an easier way? I don't want 
to have to implement the interface for every class I create, nor 
explicitly specify the inheritance.

The interface here basically has a final New method and a virtual 
ClassSize method that returns the size of the class (which I think 
is required over sizeof, classInstanceSize, etc. to prevent 
slicing). ClassSize, while virtual, is basically the same for 
every class (in fact it just returns a constant that is 
determined at compile time), so I don't want the user to have to 
write that code for each class.
Aug 06 2013
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 6 August 2013 at 07:19:40 UTC, JS wrote:
 1. Is there a way to determine what the GC is doing?

It is probably easiest to just run it in a debugger, and set a breakpoint on gc_malloc and gc_qalloc. When it breaks, you can see where it is, then continue to carry on.

You can also tell where the gc runs by carefully reviewing the code:

1) Any instance of "new", obviously.
2) Any use of array ~ array. Binary ~ on other types doesn't necessarily run the gc, since it could be overloaded, but generally does. (If you want to avoid gc, a ~ b is something I'd avoid entirely.)
3) Some uses of a ~= b. This doesn't necessarily allocate, it might be appending to reserved space or be overloaded. If a is a built in array though and you're looking at the code, assume it allocates and do something else.
4) Passing a nested delegate or lambda to another function, sometimes. It won't if a) the receiving function marks the delegate argument as scope, or b) the function doesn't reference any variables. But generally if you see a non-scope delegate, assume it allocates.
5) Any array literal that isn't static immutable allocates. Hopefully this will be fixed soon, but right now they all do... even if the array literal is in an enum!

ummm... I think I might be forgetting something, but I'm pretty sure these are all the major ones. Anything that allocates might call the gc collect.

So to sum up: search your code for "new", "~", calling functions that receive "delegate", and "[a,b,c...]" and you'll have an idea as to where the gc might run in your code.
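
As a rough illustration of the constructs listed above, here is a minimal sketch. The function names are made up for this example, and whether a particular line really allocates depends on the compiler and druntime version, so treat the comments as the expected behaviour rather than a guarantee.

```d
import std.stdio : writeln;

// 2) Concatenating built-in arrays produces a new GC-allocated array.
int[] concat(int[] a, int[] b)
{
    return a ~ b;
}

// 3) Appending may reuse reserved capacity or reallocate on the GC heap.
void append(ref int[] a, int value)
{
    a ~= value;
}

// 4) The parameter is not `scope`, so a caller that passes a delegate
//    referencing its locals has to assume a GC-allocated closure.
int callUnscoped(int delegate() dg)
{
    return dg();
}

// 4a) A `scope` delegate parameter promises the delegate does not escape,
//     so the caller does not need a heap closure for it.
int callScoped(scope int delegate() dg)
{
    return dg();
}

void main()
{
    auto obj = new Object;              // 1) `new`
    auto joined = concat([1, 2], [3]);  // 5) the array literals allocate as well
    append(joined, 4);
    int local = 5;
    writeln(joined, " ", callUnscoped(() => local), " ", callScoped(() => local));
}
```

Compilers released after this thread also offer a `-vgc` switch on dmd that lists GC allocation sites at compile time, which may be easier than searching by hand if your toolchain supports it.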
 I suppose I can modify the sources, but is there an easier way?

What about writing your own external alloc function like std.conv.emplace? Then you'd do it just at the creation site without modifying the class.
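
A minimal sketch of such an external allocation function, assuming plain malloc/free as the backing allocator. The names `create` and `dispose` are hypothetical helpers for this example, not something from Phobos; only `emplace`, `destroy`, and `__traits(classInstanceSize, ...)` are real library or language features.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// Hypothetical helper: construct a class instance in malloc'ed memory.
T create(T, Args...)(auto ref Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* raw = malloc(size);
    assert(raw !is null, "malloc failed");
    // emplace copies the class initializer into the chunk and runs the constructor.
    return emplace!T(raw[0 .. size], args);
}

// Hypothetical counterpart: run the destructor, then release the memory.
void dispose(T)(T obj) if (is(T == class))
{
    destroy(obj);
    free(cast(void*) obj);
}

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

void main()
{
    auto p = create!Point(3, 4);   // no change to Point itself is needed
    assert(p.x == 3 && p.y == 4);
    dispose(p);
}
```

One caveat with this approach: if the object stores references to GC-managed memory, the malloc'ed block would also have to be registered with the collector (for example via GC.addRange), otherwise the GC cannot see those references.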
 (which I think is required over sizeof, classInstanceSize, 
 etc... to prevent slicing).

Check out typeid(MyClass).init as well...
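
To illustrate the slicing concern, here is a small sketch comparing the static size with the size obtained through the runtime type. It assumes a newer druntime, where the property is spelled `initializer`; at the time of this thread it was `init`. The classes and the `dynamicSize` helper are invented for the example.

```d
class Base { int a; }
class Derived : Base { long[4] extra; }

// Size of the object actually referenced, looked up through its runtime type.
size_t dynamicSize(Object o)
{
    return typeid(o).initializer.length;
}

void main()
{
    Base b = new Derived;
    // The static type only knows about Base...
    assert(__traits(classInstanceSize, Base) < __traits(classInstanceSize, Derived));
    // ...while typeid on the reference reports the size of Derived.
    assert(dynamicSize(b) == __traits(classInstanceSize, Derived));
}
```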
Aug 06 2013
"QAston" <qaston gmail.com> writes:
On Tuesday, 6 August 2013 at 12:41:26 UTC, Adam D. Ruppe wrote:
 On Tuesday, 6 August 2013 at 07:19:40 UTC, JS wrote:
 1. Is there a way to determine what the GC is doing?


 You can also tell where the gc runs by carefully reviewing the 
 code:

There's a complete list at http://dlang.org/garbage.html, in the paragraph "D Operations That Involve the Garbage Collector".
Aug 06 2013
"JS" <js.mdnq gmail.com> writes:
On Tuesday, 6 August 2013 at 12:41:26 UTC, Adam D. Ruppe wrote:
 On Tuesday, 6 August 2013 at 07:19:40 UTC, JS wrote:
 1. Is there a way to determine what the GC is doing?

Thanks. It looks like the bulk of it can be avoided, although arrays would need custom array types. I'd still like some easy way to hook into the GC to monitor what is going on, because ultimately I'd like to disable the GC, and then I must guarantee that memory is not leaking.
 I suppose I can modify the sources, but is there an easier way?

What about writing your own external alloc function like std.conv.emplace? Then you'd do it just at the creation site without modifying the class.

Well, I am using an interface to allow for various strategies that can also change dynamically, e.g.,

    interface iNew(T) { final static T New(Args...)(Strategy, Args args) { .. } }

All of my objects will inherit this interface. The only issue is that they must do so explicitly, and I see no need to have to add the inheritance manually.

I also worry about size allocation, but it seems the typeid you mention should solve this (to be honest, I don't know whether it will completely solve the issue or how performant it is). There is TypeInfo, but again, I don't know how performant it is. Having every object implement a Size that returns its size would work fine though, e.g.,

    interface iNew(T) { @property size_t Size(); final static T New(Args...)(Strategy, Args args) { .. } }

But it would be nice if Size were implemented automatically in each class (since all it does is return a constant). The overhead is then just a function call rather than some reflection type of stuff. Having iNew implicitly inherited and Size() implicitly generated is what I'm after. I'll check out the alternatives to Size() and see if they are satisfactory.
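
One way to get close to an automatically generated Size is a template mixin. This is only a sketch under assumptions: `Sized` and `ImplementSize` are hypothetical names, it leaves out the New/Strategy part entirely, and it still needs one mixin line per class, so it does not give the fully implicit behaviour asked for here.

```d
interface Sized
{
    @property size_t size();
}

// Hypothetical helper: mixing this into a class generates the override,
// which just returns that class's compile-time instance size.
mixin template ImplementSize()
{
    override @property size_t size()
    {
        return __traits(classInstanceSize, typeof(this));
    }
}

class Base : Sized
{
    int a;
    mixin ImplementSize;
}

class Derived : Base
{
    long[4] extra;
    mixin ImplementSize;
}

void main()
{
    Base b = new Derived;
    // The virtual call reports the size of the dynamic type, so no slicing.
    assert(b.size == __traits(classInstanceSize, Derived));
}
```

The cost per query is one virtual call, which matches the "just a function call" overhead described above. A class that forgets the mixin silently inherits its parent's size, which is the main weakness of this design.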
Aug 06 2013
"bearophile" <bearophileHUGS lycos.com> writes:
QAston:

 There's complete list at http://dlang.org/garbage.html - 
 paragraph: "D Operations That Involve the Garbage Collector"

I think that list is not fully up to date; this item was recently improved a little:
 Array literals (except when used to initialize static data)

Now if you initialize a fixed-size array from an array literal, no allocation happens for the literal itself, even if the array is not static.

Bye,
bearophile
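
A small sketch of that difference. It uses the `@nogc` attribute, which was added to the language after this thread, purely as a compile-time check that the fixed-size case does not touch the GC; the function names are invented for the example.

```d
// Compiles under @nogc: the literal initializes a fixed-size array in place.
@nogc int sumFixed()
{
    int[3] fixed = [1, 2, 3];
    int total = 0;
    foreach (v; fixed)
        total += v;
    return total;
}

// Not @nogc: the same literal used as a dynamic array is allocated by the GC.
int[] makeDynamic()
{
    return [1, 2, 3];
}

void main()
{
    assert(sumFixed() == 6);
    assert(makeDynamic().length == 3);
}
```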
Aug 06 2013
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 6 August 2013 at 13:43:01 UTC, JS wrote:
 I'd still like some easy way to hook into the GC to monitor 
 what is going on though because ultimately I'd like to disable 
 the GC but then must guarantee that memory is not leaking.

Using a debugger is the easiest way. You can also hack on druntime (there are various ways to do this, including one option that doesn't actually edit druntime, linked to earlier in this thread), and there's a gc proxy in the code, but I don't know how to use it.
 seems the typeid you mention should solve this(but to be 
 honest, I don't know if it will completely solve the issue or 
 how performant it is).

typeid just fetches the TypeInfo, so they're the same thing. I'm not sure how fast it is to fetch that info, but that's what the gc does, so you're at least no worse off.
Aug 06 2013
"JS" <js.mdnq gmail.com> writes:
On Tuesday, 6 August 2013 at 14:27:32 UTC, Adam D. Ruppe wrote:
 On Tuesday, 6 August 2013 at 13:43:01 UTC, JS wrote:
 I'd still like some easy way to hook into the GC to monitor 
 what is going on though because ultimately I'd like to disable 
 the GC but then must guarantee that memory is not leaking.

Using a debugger is the easiest way. You can also hack on druntime (there are various ways to do this, including one option that doesn't actually edit druntime, linked to earlier in this thread), and there's a gc proxy in the code, but I don't know how to use it.

The issue is that I may want to use the GC at runtime for various tasks but disable cleanup until an appropriate time. Also, without some automated way to see what the GC is doing, it would be a nightmare to have to debug it every time some library function is added just to see whether it uses the GC (e.g., file functions). It would be easier and more informative to hook into the GC and then report statistics at the end of the program. I could then see how beneficial my allocation strategies are.
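
A minimal sketch of both ideas, postponing collections and reporting how much was allocated. `GC.disable`, `GC.enable`, and `GC.collect` are in core.memory; `GC.stats()` and its `allocatedInCurrentThread` field are an assumption about your druntime, since they only exist in releases newer than the one discussed in this thread. The helper `gcBytesAllocatedBy` is a made-up name.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical helper: bytes the current thread allocated from the GC
// while `work` ran. The delegate is `scope`, so passing it does not
// itself require a heap closure.
ulong gcBytesAllocatedBy(scope void delegate() work)
{
    immutable before = GC.stats().allocatedInCurrentThread;
    work();
    return GC.stats().allocatedInCurrentThread - before;
}

void main()
{
    GC.disable();   // allocations still succeed; automatic collections are suppressed
    int[] data;
    immutable used = gcBytesAllocatedBy({ data = new int[](1000); });
    writefln("allocated %s bytes from the GC", used);

    GC.enable();
    GC.collect();   // clean up at a time of our choosing
}
```

Note that the documentation for GC.disable allows the runtime to collect anyway when it considers this necessary, for example when memory runs out, so this postpones cleanup rather than forbidding it.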
 seems the typeid you mention should solve this(but to be 
 honest, I don't know if it will completely solve the issue or 
 how performant it is).

typeid just fetches the TypeInfo, so they're the same thing. I'm not sure how fast it is to fetch that info, but that's what the gc does, so you're at least no worse off.

I guess I'll have to do some sleuthing at some point to see what exactly is happening...
Aug 06 2013
"Dicebot" <public dicebot.lv> writes:
On Tuesday, 6 August 2013 at 15:20:47 UTC, JS wrote:
 ...

I don't think you can do it without building a custom druntime with your own GC hooks.
Aug 06 2013
"anonymous" <anonymous example.com> writes:
On Tuesday, 6 August 2013 at 14:27:32 UTC, Adam D. Ruppe wrote:
 and there's a gc proxy in the code but idk how to use it.

I have this little hack that uses the proxy to disable GC allocations at runtime. It's not well tested, so it probably has some issues, but maybe it's a start for someone. See the unittest for how to use it.

CODE:

import std.exception: assertThrown;
import std.traits: ParameterTypeTuple, ReturnType;
static import core.memory;

/*NOTE: Deriving from Throwable instead of Exception, because with
Exception catch(GCUsedWhenForbidden) doesn't work (dmd 2.063). */
class GCUsedWhenForbidden : Throwable
{
    this(string gcProcName, string file = __FILE__, size_t line = __LINE__)
    {
        super(gcProcName, file, line);
    }
}

struct GC
{
    static auto opDispatch(string n, A ...)(A args)
    {
        mixin("return core.memory.GC." ~ n ~ "(args);");
    }

    private static Proxy proxy;

    static this()
    {
        foreach(name; __traits(allMembers, Proxy))
            makeDie!name();
    }

    private shared static uint forbidCount = 0;

    static void forbidUse()
    {
        if(forbidCount++ == 0)
            actuallyForbidUse;
    }

    private static void actuallyForbidUse()
    {
        makeNop!"gc_addRange"();
        gc_setProxy(&proxy);
        makeDie!"gc_addRange"();
    }

    static void allowUse()
    {
        if(--forbidCount == 0)
            actuallyAllowUse;
    }

    private static void actuallyAllowUse()
    {
        makeNop!"gc_removeRange"();
        gc_clrProxy();
        makeDie!"gc_removeRange"();
    }

    private static auto proxyMember(string name)()
    {
        mixin("return &proxy." ~ name ~ ";");
    }

    private static makeNop(string name)()
    {
        *proxyMember!name = &nop!(typeof(*proxyMember!name));
    }

    private static makeDie(string name)()
    {
        *proxyMember!name = &die!(typeof(*proxyMember!name), name);
    }
}

unittest
{
    int[] h = new int[1];
    GC.forbidUse; // 1
    assertThrown!GCUsedWhenForbidden(new int[1]);
    GC.forbidUse; // 2
    assertThrown!GCUsedWhenForbidden(new int[1]);
    GC.allowUse; // 1
    assertThrown!GCUsedWhenForbidden(new int[1]);
    GC.allowUse; // 0
    h = new int[1]; // now, GC can be used again
}

private extern(C) void nop(T)(ParameterTypeTuple!T) {}

private extern(C) ReturnType!T die(T, string name)(ParameterTypeTuple!T)
{
    GC.actuallyAllowUse;
    scope(exit) GC.actuallyForbidUse;
    throw new GCUsedWhenForbidden(name);
}

//NOTE: copied from druntime/src/gc/gc.d
private struct BlkInfo
{
    void* base;
    size_t size;
    uint attr;
}

//NOTE: copied from druntime/src/gc/proxy.d
private struct Proxy
{
    extern(C)
    {
        void function() gc_enable;
        void function() gc_disable;
        void function() gc_collect;
        void function() gc_minimize;
        uint function(void*) gc_getAttr;
        uint function(void*, uint) gc_setAttr;
        uint function(void*, uint) gc_clrAttr;
        void* function(size_t, uint) gc_malloc;
        BlkInfo function(size_t, uint) gc_qalloc;
        void* function(size_t, uint) gc_calloc;
        void* function(void*, size_t, uint ba) gc_realloc;
        size_t function(void*, size_t, size_t) gc_extend;
        size_t function(size_t) gc_reserve;
        void function(void*) gc_free;
        void* function(void*) gc_addrOf;
        size_t function(void*) gc_sizeOf;
        BlkInfo function(void*) gc_query;
        void function(void*) gc_addRoot;
        void function(void*, size_t) gc_addRange;
        void function(void*) gc_removeRoot;
        void function(void*) gc_removeRange;
    }
}

//NOTE: copied from druntime/src/gc/proxy.d
private extern(C) void gc_setProxy(Proxy*);
private extern(C) void gc_clrProxy();
Aug 06 2013
Martin Drasar <drasar ics.muni.cz> writes:
On 6.8.2013 14:58, QAston wrote:
 On Tuesday, 6 August 2013 at 12:41:26 UTC, Adam D. Ruppe wrote:
 On Tuesday, 6 August 2013 at 07:19:40 UTC, JS wrote:
 1. Is there a way to determine what the GC is doing?


 You can also tell where the gc runs by carefully reviewing the code:

There's a complete list at http://dlang.org/garbage.html, in the paragraph "D Operations That Involve the Garbage Collector".

Also, as Manu and others pointed out, Phobos tends to allocate a lot. There is also some code from Adam D. Ruppe that you might find useful: http://forum.dlang.org/thread/fbjeivugntvudgopyfll forum.dlang.org

Martin
Aug 06 2013