www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.internals - Concurrent No-Pause GC Option (idea)

reply jordan4ibanez <jordan4ibanez002 gmail.com> writes:
I had an idea on the GC for CPUs that have more than one core. So 
basically all of them.

So I learned on the Discord that the current GC runs parallel if 
this machine setup is detected, but it still pauses everything to 
make sure nothing is changed, thanks to adr.

Now I had the idea that perhaps this could be fixed. It is simple 
in idea, but I am sure that I am about to find out is is 
horrifically complex in implementation.

So the current assumption is that the memory addresses for 
objects do not become moved around in the heap and it grows and 
shrinks as needed. So this opens a new opportunity to create a 
concurrent no-pause garbage collector. My literal idea was this:

```
So like, when an object, let's say, in the main thread has it's 
destructor called, or the thread notices that it no longer talks 
to this piece of data ever again, basically tell that external gc 
thread "hey this piece of data and all of it's children are done" 
and just keep on going doing it's execution routine
```

So the main thread creates an object like so:

```d
Object data = new Object();
```

This is great, that means that ``data`` has it's own little house 
on the heap and it will never move from this spot until it "has 
to move out".

So we make it move out:

```d
Object data = new Object();
data = new Object();
```

Well there we go, ``data`` is now a new ``data``. But what 
happened to the old ``data``? Well with the concurrent GC the 
main thread has silently told the GC to collect all of ``data``'s 
data.

The same thing happens if it goes out of scope:

```d
{
     Object data = new Object();
     // data destroy is sent to concurrent gc to free space
}
```

The same thing will happen if the piece of data is in an array, 
or if it's held in another object. (In theory) But this was just 
an extremely simple example. Ideally the pointer table in the 
thread will never be able to see the data location of the object 
that is destroyed even if it still exists in memory.

The concurrent gc would simply always be listening and the only 
micro pauses I could see would be from having to add those memory 
addresses to the gc's stack. Perhaps that could be avoided by 
having a concurrent linked list somehow.

This is coming from someone new to D and I am still learning much 
about it and trying to help build the ecosystem and languages as 
much as I can. I would love to know your thoughts.
Aug 20 2022
next sibling parent frame <frame86 live.com> writes:
On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:

 This is coming from someone new to D and I am still learning 
 much about it and trying to help build the ecosystem and 
 languages as much as I can. I would love to know your thoughts.
You probably thinking of RC: (Ref(erence) Counted) GC, where memory can be collected if the last reference is gone. The drawbacks of this approach are that you will need more memory for each allocated variable since you will need to track the references. And to increase or decrease those reference counts has a runtime impact whereever a reference occurs: eg. pointer to `data` is passed to another object and the original pointer is going out of scope. Or just a common slice (`[]`) as seen in `string` that is handled from function to function maybe even recursively. The current GC doesn't do this, keeping the impact low. It only needs to stop the threads if it need to collect in a cycle which doesn't always happen and you can even help organize the GC's cleanup work by manually calling `GC.collect()` when it's only suitable for your program. If you are interested in this stuff, I suggest you to dig in the General forum section where cons & pros of other GC implementations and their side effects are heavily discussed :D
Aug 21 2022
prev sibling parent reply Sergey <kornburn yandex.ru> writes:
On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:
 I had an idea on the GC for CPUs that have more than one core. 
 So basically all of them.

 So I learned on the Discord that the current GC runs parallel 
 if this machine setup is detected, but it still pauses 
 everything to make sure nothing is changed, thanks to adr.
Isn’t it the same as already proposed earlier https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/
Aug 21 2022
parent reply frame <frame86 live.com> writes:
On Sunday, 21 August 2022 at 21:42:20 UTC, Sergey wrote:
 On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:
 I had an idea on the GC for CPUs that have more than one core. 
 So basically all of them.

 So I learned on the Discord that the current GC runs parallel 
 if this machine setup is detected, but it still pauses 
 everything to make sure nothing is changed, thanks to adr.
Isn’t it the same as already proposed earlier https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/
The mentioned fork variant would be near the same as the current GC, it tracks nothing. It suggests to scan in the forked process and then tell the mutator thread parent what blocks are free again. I don't know why a fork is necessary. That can be done in a normal background thread also? And it doesn't eliminate the problem that data can be used in another thread after the scanner marked it as free. Actually there is already a fork based implementation merged in the runtime as an alternative to parallel marking, but it still stops other threads, of course.
Aug 22 2022
parent reply Krzysztof =?UTF-8?B?SmFqZcWbbmljYQ==?= <krzysztof.jajesnica gmail.com> writes:
On Monday, 22 August 2022 at 18:00:42 UTC, frame wrote:
 I don't know why a fork is necessary. That can be done in a 
 normal background thread also? And it doesn't eliminate the 
 problem that data can be used in another thread after the 
 scanner marked it as free.
AFAIK forking does indeed eliminate the problem of another thread using the data after scanner marks it as free. Forked process does not share address space with the parent process (it gets a copy of the parent's address space), so any memory accesses done in the application process after the `fork()` will not influence the scanning process. If the scanner process marks a block as free, then that means it was definitely unreachable in the application process prior to the `fork()`, so the application process has no way to access the block (apart from cases where you hide the pointer from the GC on purpose, but that's a problem for almost all GC implementations, not just the `fork()` one)
Aug 25 2022
parent frame <frame86 live.com> writes:
On Thursday, 25 August 2022 at 19:06:15 UTC, Krzysztof Jajeśnica 
wrote:
 On Monday, 22 August 2022 at 18:00:42 UTC, frame wrote:
 AFAIK forking does indeed eliminate the problem of another 
 thread using the data after scanner marks it as free.

 Forked process does not share address space with the parent 
 process (it gets a copy of the parent's address space), so any 
 memory accesses done in the application process after the 
 `fork()` will not influence the scanning process. If the 
 scanner process marks a block as free, then that means it was 
 definitely unreachable in the application process prior to the 
 `fork()`, so the application process has no way to access the 
 block (apart from cases where you hide the pointer from the GC 
 on purpose, but that's a problem for almost all GC 
 implementations, not just the `fork()` one)
I think it does share unless a write operation is performed on parent or child side, then both get a fresh mutable copy. So the child needs to commit IDs(?) indexes(?) whatever back to the parent, and the parent can map it to its memory block [or something like that]. That is, of course, way better than a background thread as it creates some isolated snapshot. But consider the parent would call `GC.free()` manually while the child is running and the child thinks the block can be freed and the parent continued and just used the block right again (don't know if the GC actually does this, but I think so). The parent must ignore the supplied ID in this case or would collect used memory. The GC needs to work differently for this.
Aug 26 2022