
digitalmars.D - The GC (again)

JG <someone somewhere.com> writes:
Hi,

Having been hanging around these forums for a few years now (and 
writing software in D during that time) I have noticed that there 
are quite often disagreements about the GC.

I was thinking about this and wondered if some of these problems 
could possibly be eliminated by making the GC more configurable. 
For instance, one might envisage a nice API that lets one do 
things like:
(a) specify exactly when the GC should do a stop-the-world mark 
and sweep (rather than whenever it feels like it while allocating 
memory), perhaps through an API more pleasant than the current 
enable and disable mechanism;
(b) specify that the next allocations should be done into a 
preallocated block (which it could be asked to drop later, 
without marking and sweeping).

I guess my real question is whether there is some way the GC could 
be modified so that on the one hand it works exactly as it does 
now (if no options are set) and on the other hand can be 
configured so that it is close enough to manual memory allocation 
that, say, someone building a game wouldn't find it gets in their 
way.

The main point would be to allow someone to fully use D (with 
some extra house-keeping they have to do) but avoid unpredictable 
GC pauses.

I understand that this wouldn't satisfy everyone, but perhaps it 
would be more feasible than some of the more drastic proposals 
that get thrown around from time to time.

If this is something that makes sense and would be useful I would 
be willing to try and build it.

[In case it is of interest to anyone, personally I like having a 
GC, although there are a few instances where it would be nice to 
have a bit more flexibility.]
Nov 20 2021
Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Saturday, 20 November 2021 at 13:48:44 UTC, JG wrote:
 [...]
(a) could be a welcome addition, or rather an augmentation of the existing implementation, i.e. guaranteeing that GC.collect() actually does a full collection, and also making the specification for disable/enable stricter.

(b) has some problems: what is the "next" allocation? There are quite a few things besides `new` that may allocate. Also, on what thread? Asking to drop the block is also not really feasible, as the only way the GC could do so safely would be to ensure there are no more pointers into that block stored anywhere, which requires scanning. If you don't need @safe, however, it's trivial to plop your own arena on a GC-allocated block through e.g. `std.experimental.allocator`, or even your own. And you can already reserve memory for the GC heap beforehand with `GC.reserve()`.

If you're making a game or anything that requires a consistently high framerate, the problem is mostly not in predicting the pauses; you can deal with that (though still, the current documented requirements on `disable` do leave some questions). It's the pause itself, which with the current implementation is obnoxiously long. If you have natural lulls in framerate, you may be able to afford it. If not, you pretty much wouldn't be using the GC anyway, either at all or in select threads, but that means you're missing out on a few language features.
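
For illustration, here is a minimal sketch of the "your own arena on a GC-allocated block" idea. The `Arena` struct and its `makeArray`/`reset` members are hypothetical names invented for this example; only `GC.malloc` and `GC.BlkAttr.NO_SCAN` come from `core.memory`. It is unsafe by design: nothing stops you from using a slice after `reset`.

```d
import core.memory : GC;

/// Hypothetical bump-pointer arena living inside one GC-allocated block.
struct Arena
{
    private ubyte[] store; // the backing block, owned by the GC
    private size_t used;   // bytes handed out so far

    this(size_t capacity)
    {
        // NO_SCAN: the GC will not look for pointers inside this block,
        // so only plain data (no GC references) may be stored in it.
        store = (cast(ubyte*) GC.malloc(capacity, GC.BlkAttr.NO_SCAN))[0 .. capacity];
    }

    /// Returns an uninitialized slice of n elements, or null if it does not fit.
    T[] makeArray(T)(size_t n) @system
    {
        enum a = T.alignof;
        immutable start = (used + a - 1) & ~(a - 1); // round up to T's alignment
        immutable bytes = n * T.sizeof;
        if (start + bytes > store.length)
            return null;
        used = start + bytes;
        return (cast(T*)(store.ptr + start))[0 .. n];
    }

    /// Drops every allocation at once; no marking or sweeping is involved.
    void reset() { used = 0; }
}

void main()
{
    auto arena = Arena(1024);
    auto xs = arena.makeArray!int(100);
    xs[] = 7;
    assert(xs.length == 100 && xs[99] == 7);
    assert(arena.makeArray!double(1000) is null); // 8000 bytes do not fit
    arena.reset();                                 // xs is dangling from here on
}
```

The block itself is still reclaimed by the GC once the `Arena` is unreachable; the arena only avoids per-allocation GC work inside it.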
Nov 20 2021
Commander Zot <no no.no> writes:
On Saturday, 20 November 2021 at 13:48:44 UTC, JG wrote:
 [...]
If you want manual memory management you can do that today. If you don't want the GC to stop the world you can create a thread not registered with the GC today. Yes, you lose access to some language features that require automatic memory management, but that is not that big of a deal when other languages like C/C++ don't even have those in the first place.

I'm all for improving the GC and providing more options, but IMHO there are more important things where your help could have a bigger impact on D usability.
Nov 20 2021
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Nov 20, 2021 at 01:48:44PM +0000, JG via Digitalmars-d wrote:
 Hi,
 
 Having been hanging around these forums for a few years now (and
 writing software in D during that time) I have noticed that there are
 quite often disagreements about the GC.
 
 I was thinking about this and wondered if some of these problems could
 possibly be eliminated by making the GC more configurable. For
 instance, one might envisage a nice API that lets one do things like:
 (a) specify exactly when the GC should do a stop-the-world mark and
 sweep (rather than whenever it feels like it while allocating memory),
 perhaps through an API more pleasant than the current enable and
 disable mechanism;
What's wrong with the current mechanism? In applications where I need better control over collections, I've used GC.disable and GC.collect quite effectively.
 (b) specify that the next allocations should be done into a
 preallocated block (which it could be asked to drop later, without
 marking and sweeping).
My impression was that this was one of the goals of std.experimental.allocator.
 I guess my real question is whether there is some way the GC could be
 modified so that on the one hand it works exactly as it does now (if
 no options are set) and on the other hand can be configured so that it
 is close enough to manual memory allocation that, say, someone building
 a game wouldn't find it gets in their way.
Just slap @nogc on main() (or whatever function is the entry point into your critical section where you don't want any GC activity) and you can already do this today.
 The main point would be to allow someone to fully use D (with some
 extra house-keeping they have to do) but avoid unpredictable GC
 pauses.
In one of my projects I call GC.disable at the start, and have a manual iteration counter that invokes GC.collect at strategic points. It works pretty well. In this particular context the reason I wanted to do this was to control the frequency of GC collections, because I found that collections ran too often. So I ran collections on my own schedule instead of the default. The same strategy could also be used to run collections only at specific points in your code, so that pauses are predictable. [...]
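
A minimal sketch of that pattern, for illustration only: `runFrames`, `frames`, and `collectEvery` are hypothetical names, and the loop body stands in for real work. Note that the `core.memory` documentation still allows a collection while disabled if the runtime deems it necessary, e.g. when out of memory.

```d
import core.memory : GC;

/// Runs `frames` iterations and collects only every `collectEvery` iterations.
/// Returns how many manual collections were triggered.
size_t runFrames(size_t frames, size_t collectEvery)
{
    GC.disable();             // suppress automatic collections
    scope (exit) GC.enable(); // restore the default behaviour on the way out

    size_t collections;
    foreach (frame; 1 .. frames + 1)
    {
        // Allocations still come from the GC heap while it is disabled.
        auto scratch = new int[](64);
        scratch[0] = cast(int) frame;

        if (frame % collectEvery == 0)
        {
            GC.collect();     // the pause happens here, at a point we chose
            ++collections;
        }
    }
    return collections;
}

void main()
{
    assert(runFrames(10_000, 1_000) == 10);
}
```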
 [In case it is of interest to anyone, personally I like having a GC,
 although there are a few instances where it would be nice to have a
 bit more flexibility.]
Me too. And so far I've found that calling GC.disable and running GC.collect on my own schedule has worked well enough for me. (Note that GC.disable merely disables automatic collection; allocations will still go through the GC and collections will still run when you manually invoke GC.collect.)

T

-- 
An imaginary friend squared is a real enemy.
Nov 20 2021
zjh <fqbqrr 163.com> writes:
On Saturday, 20 November 2021 at 17:00:05 UTC, H. S. Teoh wrote:

 Me too.  And so far I've found that calling GC.disable and 
 running GC.collect on my own schedule has worked well enough 
 for me.
Then the number of users in other languages increased.
Nov 20 2021
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Saturday, 20 November 2021 at 17:00:05 UTC, H. S. Teoh wrote:
 [...]
I get the feeling that maybe what's missing is just confidence in the system. Is there good documentation about this, for example? Some walkthrough? Videos with evidence that the GC is not touching @nogc code, etc.?

I have never experienced any problems with the GC, but we of course have to listen to the voice of those who are worried for whatever reason.

Personally I would love to see some videos on GC best practices/patterns ☀️
Nov 21 2021
bauss <jj_1337 live.dk> writes:
On Sunday, 21 November 2021 at 09:29:10 UTC, Imperatorn wrote:
 I get the feeling that maybe what's missing is just confidence 
 in the system. Are there good documentation about this for 
 example? Some walk through? Videos with evidence that the GC is 
 not touching @nogc etc?
The GC will never touch @nogc code because @nogc makes the compiler verify that no operation in that scope allocates using the GC. This means the GC has no idea that memory allocated in the scope actually exists.

The GC doesn't work by scanning memory and figuring out whether a block of memory is GC allocated or not. The GC works by having a set of nodes that it scans through and checks whether they have any references pointing to them. All nodes of course refer to GC allocations. E.g. when you allocate memory for a class, a node for said class will be added to the GC and scanned.
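
A minimal sketch of the compile-time check, for illustration: `sumSquares` is a hypothetical helper, and the `static assert` asks the compiler itself whether a GC allocation would be accepted inside a @nogc function literal.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: sums i*i for i in [0, n) using only malloc'd memory.
@nogc nothrow
int sumSquares(size_t n)
{
    // auto buf = new int[](n); // rejected by the compiler: `new` uses the GC
    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null)
        return -1;
    scope (exit) free(p);

    int total = 0;
    foreach (i; 0 .. n)
    {
        p[i] = cast(int)(i * i);
        total += p[i];
    }
    return total;
}

void main()
{
    // The compiler refuses a GC allocation inside @nogc code.
    static assert(!__traits(compiles, () @nogc { auto a = new int[](4); }));

    assert(sumSquares(4) == 14); // 0 + 1 + 4 + 9
}
```

This only shows that @nogc code cannot allocate from the GC; it says nothing about whether a collection triggered by another thread can pause the thread running it.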
Nov 22 2021
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 23 November 2021 at 07:14:17 UTC, bauss wrote:
 The GC doesn't work by scanning memory and figuring out whether 
 a block of memory is GC allocated or not.
I understand what you mean, but it has to, since D does not distinguish between GC pointers and non-GC pointers. It only distinguishes between GC and non-GC memory (address ranges).
 The GC works by having a set of nodes that it scans through and 
 checks whether they have any references pointing to them.
More the opposite. It traverses the connected live graph by following outgoing pointers from nodes that are known to be live. If you have many nodes with pointers then that takes a lot of time. With the current setup you might have to traverse a large subgraph that has no GC nodes in it. If something is reachable from one of the GC roots it will have to scan it, just in case there might be some GC node deep down in the subgraph. This is easy to mess up in larger programs, obviously.

`@nogc` is just an escape hatch, it does not help with modelling issues or memory management issues.
Nov 23 2021
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Tuesday, 23 November 2021 at 07:14:17 UTC, bauss wrote:
 The GC will never touch @nogc because @nogc will have the compiler verify that any operation you do in such scope doesn't allocate using the GC. [...]
**We** know that the GC doesn't do that. But can we *prove* it to those in doubt? I get the feeling that some ppl still believe the GC sneaks in some backdoor and says hello
Nov 23 2021
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 23 November 2021 at 13:46:28 UTC, Imperatorn wrote:
 I get the feeling that some ppl still believe the GC sneaks in 
 some backdoor and says hello
Well, it "does", unless you only use the GC for initialization, in which case you are good. The GC also has to scan non-GC memory if that memory can point back to GC-memory. If you want to avoid this, you basically have to reason about lifetimes with no compiler support at all. At that point manual memory management with RAII/RC becomes less brittle and the GC becomes more of a liability than an advantage.
Nov 23 2021
FeepingCreature <feepingcreature gmail.com> writes:
On Tuesday, 23 November 2021 at 14:02:34 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 23 November 2021 at 13:46:28 UTC, Imperatorn wrote:
 I get the feeling that some ppl still believe the GC sneaks in 
 some backdoor and says hello
The GC also has to scan non-GC memory if that memory can point back to GC-memory.
To be fair, it only does that if you explicitly register the non-GC memory with `GC.addRoot`. If you don't do this, GC memory can indeed be freed prematurely while still being referenced from non-GC memory.
Nov 24 2021
FeepingCreature <feepingcreature gmail.com> writes:
On Wednesday, 24 November 2021 at 09:29:07 UTC, FeepingCreature 
wrote:
 To be fair, it only does that if you explicitly register the 
 non-GC memory with `GC.addRoot`. If you don't do this, GC 
 memory can indeed be freed prematurely while still being 
 referenced from non-GC memory.
`GC.addRange`, sorry.
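
A minimal sketch of registering non-GC memory, for illustration: `Payload` and `readAfterCollect` are hypothetical names, while `GC.addRange` and `GC.removeRange` are the `core.memory` calls referred to above.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

class Payload { int value = 42; }

/// Stores a GC object's only reference in C-heap memory and reads it back
/// after a collection.
int readAfterCollect()
{
    // One pointer-sized slot on the C heap; the GC does not know about it yet.
    auto slot = cast(Payload*) calloc(1, (void*).sizeof);
    assert(slot !is null, "calloc failed");

    // Ask the GC to scan this range for pointers into the GC heap.
    GC.addRange(slot, (void*).sizeof);
    scope (exit)
    {
        GC.removeRange(slot); // unregister before releasing the memory
        free(slot);
    }

    *slot = new Payload; // the reference is stored in non-GC memory
    GC.collect();        // the object is kept alive via the registered range
    return (*slot).value;
}

void main()
{
    assert(readAfterCollect() == 42);
}
```

Without the `addRange` call the collector is free to reclaim the object, although a stale reference left on the stack or in a register can hide the problem in a small test like this one.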
Nov 24 2021
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 24 November 2021 at 09:29:07 UTC, FeepingCreature 
wrote:
 To be fair, it only does that if you explicitly register the 
 non-GC memory with `GC.addRoot`. If you don't do this, GC 
 memory can indeed be freed prematurely while still being 
 referenced from non-GC memory.
Yes, this is the issue. You either have to make the GC scan non-GC memory or you have to do lifetimes "in your head". Neither are good options for large interactive programs. It can work for a tiny game though.
Nov 24 2021
Guillaume Piolat <first.last gmail.com> writes:
On Saturday, 20 November 2021 at 13:48:44 UTC, JG wrote:
 (a) specify exactly when the GC should do a stop the world mark 
 and sweep (not when it feels like it when it needs to allocate 
 memory and perhaps through an API more pleasant than the 
 current, enable and disable mechanism);
GC.enable / GC.disable and manually provoking collections is only a quick fix.

The Right Way(tm) to go about it imho is to just have a smaller GC heap; that will reduce the maximum pause time (aka scanning) so that whatever happens your real-time system doesn't get encumbered by the GC.

How to get a small GC heap? With -profile=gc, pools, custom allocators, non-scannable memory etc.

GC is actually pay as you go: the larger the heap, the more you pay. If you use a very small amount of GC, disabling it or going -betterC will yield gains too small to measure, apart from memory usage.

Of course GC still doesn't work for fast callbacks that don't have the leisure to hold even a mutex; for those, using a deregistered thread and @nogc is necessary.
Nov 23 2021