digitalmars.D - I don't get the GC. (heapSizeFactor followup)

reply FeepingCreature <feepingcreature gmail.com> writes:
A follow-up to 
https://forum.dlang.org/thread/befrzndhowlwnvlqcoxx forum.dlang.org

I don't get it.

Okay, this is how I understand the conservative GC to work:

- do allocations
- allocations increment `usedSmallPages`/`usedLargePages`
- at some time, we cross 
`smallCollectThreshold`/`largeCollectThreshold`
   - then we do a fullcollect, i.e. mark, sweep
   - then set the new threshold to `usedSmallPages * 
heapSizeFactor`, which defaults to 2
     - plus some smooth-decay magic that means threshold doesn't 
go down super fast
   - unlock and resume

Rinse and repeat.
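
To make the threshold step above concrete, here is a minimal sketch of the update as described in the list. It is a toy model and not druntime's actual code: the function name `nextThreshold`, the decay constant `alpha`, and the page counts are all assumptions chosen for illustration.

```d
import std.algorithm.comparison : max;
import std.stdio : writefln;

/// Hypothetical model of the threshold update: it jumps up immediately
/// when the target grows, and only decays slowly when the target shrinks.
double nextThreshold(double oldThreshold, double usedPages, double heapSizeFactor)
{
    enum alpha = 1.0 / 6; // assumed decay rate, for illustration only
    immutable target = usedPages * heapSizeFactor;
    immutable decayed = oldThreshold + (target - oldThreshold) * alpha;
    return max(target, decayed);
}

void main()
{
    double threshold = 0;
    // Made-up used-page counts, as measured right after each collection.
    foreach (used; [1000.0, 4000.0, 1500.0, 1500.0, 1500.0])
    {
        threshold = nextThreshold(threshold, used, 2.0);
        writefln("used after sweep: %6.0f pages, next threshold: %8.1f pages",
                 used, threshold);
    }
}
```

In this model a single collection that happens to see a lot of live memory raises the threshold at once, and it takes several later collections for the threshold to come back down.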

We have a service. It's pretty `std.json` intensive, it handles a 
lot of networking on startup, and it has about 30 threads. When 
we start it up with everything at default settings, it uses 3.3GB.

This would indicate, given `heapSizeFactor=2` by default, that 
the high-water mark of used pages is 1.6GB.

We've tried setting `heapSizeFactor` to 0.25. This is not quite 
equivalent to running the GC on every allocation 
(`smallCollectThreshold < usedSmallPages`), but AIUI it first 
runs the GC every time it would allocate a new pool. It's pretty 
aggressive; something like 70% of our startup performance goes to 
GC. But, after startup, the service sits at RSS 961MB.
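
For anyone who wants to reproduce this: as far as I know the option can be passed on the command line as `--DRT-gcopt=heapSizeFactor:0.25`, through the `DRT_GCOPT` environment variable, or embedded in the binary through `rt_options`. A minimal sketch of the embedded variant follows. The value 0.25 is the one from this post; the `main` body is only a placeholder.

```d
import core.memory : GC;
import std.stdio : writefln;

// Embeds the GC option in the executable, so no command-line flag is needed.
extern(C) __gshared string[] rt_options = ["gcopt=heapSizeFactor:0.25"];

void main()
{
    // Placeholder workload: just print the GC's own view of the heap.
    const s = GC.stats();
    writefln("usedSize %s bytes, freeSize %s bytes", s.usedSize, s.freeSize);
}
```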

Here's the confusing part. RSS 961MB is an upper limit on the 
live memory. If this is correct, the `smallCollectThreshold` 
after startup should be at most 1.6GB? Right? So the untuned GC 
should not let the process grow above 1.6GB? Right? But it's 
3.3GB, more than twice that.

I mean, wrong, because we might get unlucky and run our GC when 
we have a lot of temporary memory live. But we'd need to have 
twice as much temporary memory truly live during startup as we do 
after startup, right? And if I watch the RSS during startup with 
`heapSizeFactor=0.25`, I don't see it above 961MB *ever*. And it 
can't be that net queues run emptier during startup with 
`heapSizeFactor=0.25` than without, because the process is a lot 
slower with `heapSizeFactor=0.25` than without! It should clear 
queues faster with it off!

So what is the GC doing?!
Jan 16 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Right goal, wrong questions.

The process memory consumption may not be what the GC is consuming. You 
need to measure that first before questioning if the GC is the one doing 
it wrong.

For instance a badly acting buddy allocator in malloc could double 
memory like you're seeing. So swapping malloc may give the results you 
desire. But first rule out the GC.
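
A rough sketch of what "measure that first" could look like: print the GC's own numbers from `core.memory.GC.stats` next to the process RSS. Reading `/proc/self/status` assumes Linux, and the helper name `residentSetLine` is made up for this example.

```d
import core.memory : GC;
import std.algorithm.searching : startsWith;
import std.stdio : File, writefln;
import std.string : strip;

/// Returns the VmRSS value from /proc/self/status (Linux only),
/// or "n/a" if the file cannot be read or has no such line.
string residentSetLine()
{
    try
    {
        foreach (line; File("/proc/self/status").byLineCopy)
            if (line.startsWith("VmRSS:"))
                return line["VmRSS:".length .. $].strip;
    }
    catch (Exception e) {}
    return "n/a";
}

void main()
{
    enum MB = 1024.0 * 1024.0;
    const stats = GC.stats();
    writefln("GC used: %.1f MB, GC free: %.1f MB, process RSS: %s",
             stats.usedSize / MB, stats.freeSize / MB, residentSetLine());
}
```

If GC used plus GC free is far below the RSS, the difference lives outside the GC heap (malloc, thread stacks, mapped files) and tuning the GC will not touch it.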
Jan 16 2023
parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Monday, 16 January 2023 at 23:50:35 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Right goal, wrong questions.

 The process memory consumption may not be what the GC is 
 consuming. You need to measure that first before questioning if 
 the GC is the one doing it wrong.

 For instance a badly acting buddy allocator in malloc could 
 double memory like you're seeing. So swapping malloc may give 
 the results you desire. But first rule out the GC.

Okay, queried GC stats instead of looking at RSS. Now I randomly
get 2.6GB after start without `heapSizeFactor`, whatever, random
variation. But the interesting thing is:

- `heapSizeFactor=0.25`: usedSize 704MB freeSize 202MB
- `heapSizeFactor=2` (default): usedSize 1568MB freeSize 810MB

So why is twice as much GC memory *reachable* without
`heapSizeFactor`?
Jan 16 2023
parent FeepingCreature <feepingcreature gmail.com> writes:
On Tuesday, 17 January 2023 at 06:42:33 UTC, FeepingCreature 
wrote:
 Okay, queried GC stats instead of looking at RSS. Now I 
 randomly get 2.6GB after start without `heapSizeFactor`, 
 whatever, random variation. But the interesting thing is:

 - `heapSizeFactor=0.25`: usedSize 704MB freeSize 202MB
 - `heapSizeFactor=2` (default): usedSize 1568MB freeSize 810MB

 So why is twice as much GC memory *reachable* without 
 `heapSizeFactor`?

Okay hang on no. If I actually call `GC.collect` before
measuring, I do get the proper usedSize 652MB freeSize 2302MB.

So the GC insists that it *transitively* needed 3GB?

It seems the GC's claim is "because your program ran faster, it
did something that ballooned its used memory to 3GB before it
sank back down." ... Right? But that's impossible. This service
is entirely network triggered. It can't use *more* live memory by
running faster.

Could the GC's used-memory estimate have gotten messed up
somehow/somewhere? I don't see where that could be in the code
though.
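
For illustration, a small sketch of why the numbers differ depending on whether `GC.collect` runs first: `usedSize` read without a preceding collection can still include garbage that has not been swept yet. The `churn` and `report` helpers and the allocation sizes are made up for this example, and since the collector is conservative the exact figures will vary from run to run.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Prints the GC's current used/free sizes in MB under the given label.
void report(string label)
{
    enum MB = 1024.0 * 1024.0;
    const s = GC.stats();
    writefln("%-16s usedSize %8.1f MB  freeSize %8.1f MB",
             label, s.usedSize / MB, s.freeSize / MB);
}

/// Allocates short-lived garbage; nothing stays referenced afterwards.
void churn()
{
    foreach (i; 0 .. 2_000)
    {
        auto tmp = new ubyte[](64 * 1024);
        tmp[0] = cast(ubyte) i;
    }
}

void main()
{
    churn();
    report("before collect");
    GC.collect();
    report("after collect");
}
```

Only the "after collect" line says anything about live memory; the sum of usedSize and freeSize shows how large the heap had to grow at its peak.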
Jan 16 2023