digitalmars.D - Fiber local GC

Ola Fosheim Grøstad writes:
Pony has a fiber-local GC, which means collection can happen when 
the fiber is inactive, or be skipped altogether if the fiber is 
short-lived.

Go is currently exploring a Transaction Oriented GC addition to 
the concurrent GC it already has:

https://docs.google.com/document/d/1gCsFxXamW8RRvOe5hECz98Ftk-tcRRJcDFANj2VwCB0/edit

It takes the same viewpoint. A goroutine (fiber) that is 
short-lived (like an HTTP request handler) can release everything 
in one sweep without collection.

I think this viewpoint is much more efficient and promising than 
D's thread-local viewpoint.

What D needs is a type qualifier that keeps data "fiber local" 
and possibly a transition mechanism like Pony has for detecting 
objects that should be allocated on a global heap (or less 
efficiently, "pin objects" that are exported outside the fiber).
Jun 25 2016
qznc <qznc web.de> writes:
On Saturday, 25 June 2016 at 10:33:00 UTC, Ola Fosheim Grøstad 
wrote:
 Pony has a fiber local GC, which means collection can happen 
 when the fiber is inactive  or even altogether skip collection 
 if the fiber is short-lived.

 Go is currently exploring a Transaction Oriented GC addition to 
 the concurrent GC it already has:

 https://docs.google.com/document/d/1gCsFxXamW8RRvOe5hECz98Ftk-tcRRJcDFANj2VwCB0/edit

 It takes the same viewpoint. A go-routine (fiber) that is 
 short-lived (like a HTTP request handler) can release 
 everything in one swipe without collection.

 I think this viewpoint is much more efficient and promising 
 than D's thread-local viewpoint.

 What D needs is a type qualifier that keeps data "fiber local" 
 and possibly a transition mechanism like Pony has for detecting 
 objects that should be allocated on a global heap (or less 
 efficiently, "pin objects" that are exported outside the fiber).
D does not even have thread-local GC. Since fibers are bound to a thread, a thread-local GC would help as well. The hard part is how to make it safe.
Jun 25 2016
Ola Fosheim Grøstad writes:
On Saturday, 25 June 2016 at 10:49:43 UTC, qznc wrote:
 Since fibers are bound to a thread, a thread-local GC would 
 help as well. The hard part is how to make it safe.
Yes, but a thread is usually long-lived, so you don't get the free-all-no-collection-needed speedup.

I don't think it is so hard to make it safe, but we need to get rid of the idea that it is inconvenient to use a more complex type system for pointers. I don't really see why that is a big issue, as type erasure before code-gen would prevent bloat.

I think it is neither easy nor hard to make it safe. It is doable if we make the right trade-offs. But more advanced typing of pointers is most likely needed.
Jun 25 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 25/06/2016 10:33 PM, Ola Fosheim Grøstad wrote:
 Pony has a fiber local GC, which means collection can happen when the
 fiber is inactive  or even altogether skip collection if the fiber is
 short-lived.

 Go is currently exploring a Transaction Oriented GC addition to the
 concurrent GC it already has:

 https://docs.google.com/document/d/1gCsFxXamW8RRvOe5hECz98Ftk-tcRRJcDFANj2VwCB0/edit


 It takes the same viewpoint. A go-routine (fiber) that is short-lived
 (like a HTTP request handler) can release everything in one swipe
 without collection.

 I think this viewpoint is much more efficient and promising than D's
 thread-local viewpoint.

 What D needs is a type qualifier that keeps data "fiber local" and
 possibly a transition mechanism like Pony has for detecting objects that
 should be allocated on a global heap (or less efficiently, "pin objects"
 that are exported outside the fiber).
No need for a new keyword.

What we could do is use a type like Vibe.d's TaskLocal to limit instances to the thread, and register a context via a template for the struct describing the context.

struct MyContext {
    int x;
    string text;
}

TaskLocal!MyContext context;

shared static this() {
    registerFiberLocal(&context);
}

...

void registerFiberLocal(T)(T* inst) if (is(T == struct)) {
    ...
}

Hmm, okay, maybe not a good idea. As a library, this won't work.

What we want is to be able to explicitly say what the context is for a function's code and declare no globals.

@noglobals
struct MyContext {
    @disable
    this(this);

    void func() {

    }
}

Function calls outside of the struct are ok for: @system, pure and @noglobals.
Making the struct pure wouldn't work, since there goes @safe and @system calls.

MyContext.func would implicitly have @noglobals applied to it.

Now, preferably we would have new hooks added so they could be in the context. Specifically, if they are there, they are used instead of the globals for e.g. the GC. That way things like allocators could be used instead. Given that the context should be heap/stack allocated and never copied, it would mean that once the fiber dies, all memory it uses is deallocated or returned to the pool.

Now, I did mention hooks: there is no way to do that without having a bunch of context pointers somewhere saying which function is currently set to the hook. If done via a stack, it would give a history for unsetting the current context, allowing recursion.

The underlying fiber provider would have to inform the switch on e.g. yield. But that isn't hard to do.

Looking back, perhaps we can allow globals, but we'd need to instead have a hook for when assigning to a global / passing to anything not pure that is heap based. That would allow "moving" memory ownership from the fiber to the process GC.

TLDR: I think we can do this with just a @noglobals attribute, but done properly it looks like we need proper hooks in place, which are a lot harder.
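The "stack of context pointers" idea could look roughly like the sketch below. Everything here is hypothetical: `AllocHook`, `pushHook`, `popHook` and `contextAllocate` are made-up names, not druntime hooks, and the sketch assumes a single fiber per thread. A fiber scheduler would additionally have to swap the stack on every yield, as the post notes.

```d
/// Hypothetical allocation hook: any callable that returns `size` bytes.
alias AllocHook = void[] delegate(size_t size);

/// Stack of active hooks. Module-level variables are thread-local in D,
/// so each thread gets its own stack.
private AllocHook[] hookStack;

/// Enters a context: its hook is used until the matching popHook.
void pushHook(AllocHook hook)
{
    hookStack ~= hook;
}

/// Leaves the innermost context and restores the previous one.
void popHook()
{
    assert(hookStack.length > 0, "popHook without matching pushHook");
    hookStack = hookStack[0 .. $ - 1];
}

/// Routes an allocation to the innermost context, else to the GC.
void[] contextAllocate(size_t size)
{
    if (hookStack.length > 0)
        return hookStack[$ - 1](size);
    return new ubyte[size];
}

void main()
{
    size_t fromContext;   // how many allocations the context served
    ubyte[256] arena;     // stand-in for a fiber-owned memory block
    size_t used;

    pushHook((size_t size) {
        ++fromContext;
        auto block = arena[used .. used + size];
        used += size;
        return cast(void[]) block;
    });
    contextAllocate(16);  // served from the arena
    popHook();
    contextAllocate(16);  // falls back to the GC

    assert(fromContext == 1);
}
```

Because the hooks live on a stack, nested contexts unwind in order, which is the "history" the post asks for.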
Jun 25 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 25/06/2016 11:00 PM, rikki cattermole wrote:
... snip ...

 What we want is to be able to explicitly say what the context is for a
 functions code and declare no globals.

 @noglobals
 struct MyContext {

     @disable
     this(this);

     void func() {

     }
 }

 Function calls outside of the struct are ok for: @system, pure and
 @noglobals.
 Making the struct pure wouldn't work since there goes @safe and @system
 calls.

 MyContext.func would implicitly have @noglobals applied to it.

 Now preferably we would have new hooks added so they could be in the
 context. Specifically if they are there they are used instead of the
 globals for e.g. GC. That way things like allocators could be used
 instead. Given that the context should be heap/stack allocated and never
 copied it would mean that once the fiber dies, all memory it uses is
 deallocated or returned to the pool.

 Now I did say about hooks, there are no ways to do that without having a
 bunch of context pointers somewhere saying which function is currently
 set to the hook. If done via a stack it would given history for
 unsetting the current context allowing recursive.

 The underlying fiber provider would have to inform the switch on e.g.
 yield. But that isn't hard to do.

 Looking back perhaps we can allow globals, but we'd need to instead have
 a hook for when assigning to a global / passing to anything not pure
 that is heap based. That would allow "moving" memory ownership from the
 fiber to the process GC.


 TLDR: I think we can do this with just a @noglobals attribute, but done
 properly it looks like we need proper hooks in place which are a lot
 harder.
I've thought about this further. The only hook function we don't currently have is related to global + TLS assignment for memory.
We can get away with e.g. new overriding and ~ via the GC proxy (actually a fairly decent way to go about it).

Since this assignment hook is potentially quite expensive, it should definitely only be active under whatever attribute we use.
Jun 25 2016
Ola Fosheim Grøstad writes:
On Saturday, 25 June 2016 at 11:37:44 UTC, rikki cattermole wrote:
 I've thought about this further, the only hook function we 
 don't currently have is related to global + TLS assignment for 
 memory.
 We can get away with e.g. new overriding and ~ via the GC proxy 
 (actually a fairly decent way to go about it).

 Since this assignment hook is potentially quite expensive, it 
 definitely should be only if used under whatever attribute we 
 use.
I think the basic idea would be that the fiber heap is presumed to work like a region allocator that is never collected, except when you run low on memory, then you scan only the fiber-local memory (and since it is local you could possibly also compact).

So:

1. References to the fiber heap can only be made from the same fiber heap or the fiber stack.

2. You can hand out borrowed fiber-references to the fiber heap when calling non-fiber functions, but fiber-references can never be turned into non-fiber references.

3. Aggregates (structs/classes/arrays) that can contain fiber-references are tainted as fiber-local and cannot leave the fiber.

Then you need a mechanism that transitively converts fiber-local data into non-fiber-local data. This can be done as either:

1. Deep copy.

2. Traverse-and-pin-memory-as-exported, then reinterpret cast into non-fiber-local types.

3. Optimization: a priori detected as non-fiber-local using static analysis and allocated in a separate heap and then reinterpret casted into non-fiber-local before being exported outside the fiber.

Memory outside the fiber can use regular reference-counting.

Of course, this could also be generalized to something that would not only work for fibers, but also for stack-less contexts such as a facade to a global graph that is only collectible at specific points in the code. So you collect only where there are no external references to internal nodes (or use reference-counted pinning for exports).

A fiber-local heap would be a special case where the fiber-stack is part of the fiber-local heap.

I think this might work. The assumption is that GC is most useful in a singular execution context and that more manual management (like reference counting) is acceptable between execution contexts.
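As a rough illustration of the region behaviour plus export option 1 (deep copy), here is a minimal sketch. `FiberRegion`, `makeArray`, `releaseAll` and `bytesUsed` are hypothetical names, not druntime or Phobos APIs, and the low-memory scan/compact step is left out. None of the reference rules above are enforced by the type system here; that would need the pointer qualifiers discussed in this thread.

```d
/// Hypothetical fiber-local region: bump allocation, wholesale release.
struct FiberRegion
{
    private ubyte[] buffer;   // backing storage owned by one fiber
    private size_t used;      // bytes handed out so far

    this(size_t capacity)
    {
        buffer = new ubyte[capacity];
    }

    /// Carves out room for `count` values of T, or returns null when the
    /// region is exhausted (where a real design might scan or compact).
    /// Assumes the buffer itself is at least 16-byte aligned.
    T[] makeArray(T)(size_t count)
    {
        enum size_t alignment = 16;
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        immutable bytes = count * T.sizeof;
        if (start > buffer.length || bytes > buffer.length - start)
            return null;
        used = start + bytes;
        return cast(T[]) buffer[start .. used];
    }

    /// Drops every allocation at once; no tracing, no per-object work.
    void releaseAll()
    {
        used = 0;
    }

    size_t bytesUsed() const
    {
        return used;
    }
}

void main()
{
    auto region = FiberRegion(1024);

    // Fiber-local data: must not outlive the region.
    int[] local = region.makeArray!int(4);
    local[] = [1, 2, 3, 4];

    // Export by deep copy: the copy lives on the ordinary GC heap.
    int[] exported = local.dup;

    region.releaseAll();   // what a short-lived fiber would do on exit
    assert(exported == [1, 2, 3, 4]);
    assert(region.bytesUsed == 0);
}
```

For flat data `.dup` is already a deep copy; aggregates containing fiber-references would need a transitive copy, which is where options 2 and 3 become attractive.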
Jun 25 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 25/06/2016 11:55 PM, Ola Fosheim Grøstad wrote:
 On Saturday, 25 June 2016 at 11:37:44 UTC, rikki cattermole wrote:
 I've thought about this further, the only hook function we don't
 currently have is related to global + TLS assignment for memory.
 We can get away with e.g. new overriding and ~ via the GC proxy
 (actually a fairly decent way to go about it).

 Since this assignment hook is potentially quite expensive, it
 definitely should be only if used under whatever attribute we use.
 I think the basic idea would be that the fiber heap is presumed to work
 like a region allocator that is never collected, except when you run low
 on memory, then you scan only the fiber-local memory (and since it is
 local you could possibly also compact).

 So:

 1. References to the fiber heap can only be made from the same fiber
 heap or the fiber stack.

 2. You can hand out borrowed fiber-references to the fiber heap when
 calling non-fiber functions, but fiber-references can never be turned
 into non-fiber references.
This worries me.

1. This adds ref counting, or some other form of restriction that has not been spelled out.
2. It removes the ability to assign to globals, which limits usage.
 3. Aggregates (structs/classes/arrays) that can contain fiber-references
 are tainted as fiber-local and cannot leave the fiber.

 Then you need a mechanism that transitively converts fiber-local data
 into non-fiber-local data. This can be done as either:

 1. Deep copy.

 2. Traverse-and-pin-memory-as-exported, then reinterpret cast into
 non-fiber-local types.

 3. Optimization: a priori detected as non-fiber-local using static
 analysis and allocated in a separate heap and then reinterpret casted
 into non-fiber-local before being exported outside the fiber.

 Memory outside the fiber can use regular reference-counting.

 Of course, this could also be generalized to something that would not
 only work for fibers, but also for stack-less contexts such as a facade
 to a global graph that is only collectible at specific points in the
 code. So you collect only where there are no external references to
 internal nodes (or use reference-counted pinning for exports).

 A fiber-local heap would be a special case where the fiber-stack is part
 of the fiber-local heap.

 I think this might work. The assumption is that GC is most useful in a
 singular execution context and that more manual management (like
 reference counting) is acceptable between execution contexts.
Jun 25 2016
Ola Fosheim Grøstad writes:
On Saturday, 25 June 2016 at 12:01:25 UTC, rikki cattermole wrote:
 2. You can hand out borrowed fiber-references to the fiber 
 heap when
 calling non-fiber functions, but fiber-references can never be 
 turned
 into non-fiber references.
 This worries me.

 1. This adds ref counting or some other form of restriction which 
 has not been declared as to what it is.
You can get quite far using modern type systems and static analysis. You don't need to reference count unless you export the object outside the fiber. Or rather, the reference count is only increased outside the fiber; if the object is not referenced outside the fiber, the count stays at 0 (conceptually).
 2. Removes the ability to assign to globals limiting usage.
You can add weak references. Pin the object as being weakly referenced. When the weak reference is accessed (using RAII) and the object is alive, a weak-reference counter is increased; when the access is over (the RAII guard goes out of scope), the counter is decreased. If the object no longer exists, you get either null or an exception.
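A minimal sketch of that access pattern, under stated assumptions: `WeakSlot`, `Access` and `tryRelease` are hypothetical names, the code is single-threaded (a cross-thread version would need atomic counters), and a dead object yields null rather than an exception.

```d
/// Hypothetical control block shared by the owning fiber and outsiders.
final class WeakSlot(T)
{
    T* target;          // null once the owner has released the object
    size_t accessors;   // number of live Access guards

    /// Owner side: release only while nobody is accessing the object.
    bool tryRelease()
    {
        if (accessors != 0)
            return false;
        target = null;
        return true;
    }
}

/// RAII guard: counts itself in on construction, out on destruction.
struct Access(T)
{
    private WeakSlot!T slot;   // stays null if the object was already gone

    @disable this(this);       // a guard must not be copied

    this(WeakSlot!T s)
    {
        if (s.target !is null)
        {
            slot = s;
            ++s.accessors;
        }
    }

    ~this()
    {
        if (slot !is null)
            --slot.accessors;
    }

    /// Null when the object no longer exists.
    T* get()
    {
        return slot is null ? null : slot.target;
    }
}

void main()
{
    auto slot = new WeakSlot!int;
    slot.target = new int(42);

    {
        auto access = Access!int(slot);
        assert(*access.get() == 42);
        assert(!slot.tryRelease());   // pinned while the guard is alive
    }

    assert(slot.tryRelease());        // guard gone, owner may release
    auto late = Access!int(slot);
    assert(late.get() is null);       // object no longer exists
}
```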
Jun 25 2016
Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 06/25/2016 12:33 PM, Ola Fosheim Grøstad wrote:
 
 It takes the same viewpoint. A go-routine (fiber) that is short-lived
 (like a HTTP request handler) can release everything in one swipe
 without collection.
It's as simple as this: don't allocate on a per-request basis if you want a fast program. It is also trivial to attach a std.allocator to your custom Fiber. Also, Fibers should be pooled and reused to avoid the setup cost (I just measured 1.5µs on top of the 28ns to simply reset a fiber).

-Martin
Jun 27 2016
deadalnix <deadalnix gmail.com> writes:
On Monday, 27 June 2016 at 22:51:05 UTC, Martin Nowak wrote:
 On 06/25/2016 12:33 PM, Ola Fosheim Grøstad wrote:
 
 It takes the same viewpoint. A go-routine (fiber) that is 
 short-lived
 (like a HTTP request handler) can release everything in one 
 swipe
 without collection.
 Simple as don't allocate on a per-request basis if you want a 
 fast program. It also trivial to attach an std.allocator to 
 your custom Fiber. Also Fibers should be pooled and reused to 
 avoid the setup cost (just measured 1.5µs on top of the 28ns to 
 simply reset a fiber).

 -Martin
Yeah, but you can blast the whole heap in between reuse, and that is great :)
Jun 27 2016
Martin Nowak <code dawg.eu> writes:
On Monday, 27 June 2016 at 22:55:39 UTC, deadalnix wrote:
 Yeah, but you can blast the whole heap in between reuse, and 
 that is great :)
Right, and you can already reset a fiber local allocator today, just put a reference to it in your Fiber subclass.
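Roughly what that could look like, as a sketch rather than a recommended design: `RequestFiber`, `recycle` and `currentRequestFiber` are hypothetical names, and a hand-rolled bump region stands in for a real allocator building block. Only `Fiber`, `call`, `reset`, `state` and `Fiber.getThis` come from `core.thread`.

```d
import core.thread : Fiber;

/// Hypothetical fiber type that carries its own scratch memory.
class RequestFiber : Fiber
{
    ubyte[] scratch;   // per-fiber region storage, allocated once
    size_t used;       // bump offset into `scratch`

    this(void delegate() handler, size_t scratchSize = 64 * 1024)
    {
        super(handler);
        scratch = new ubyte[scratchSize];
    }

    /// Bump-allocates from the fiber's region; null when it is full.
    void[] allocate(size_t size)
    {
        if (size > scratch.length - used)
            return null;
        auto block = scratch[used .. used + size];
        used += size;
        return block;
    }

    /// Prepares the fiber for the next request: drops all region
    /// memory in O(1) and rewinds the fiber to its entry point.
    void recycle()
    {
        assert(state == Fiber.State.TERM, "recycle only after the handler returned");
        used = 0;
        reset();
    }
}

/// Convenience accessor for code running inside a RequestFiber.
RequestFiber currentRequestFiber()
{
    return cast(RequestFiber) Fiber.getThis();
}

void main()
{
    size_t served;
    auto fiber = new RequestFiber({
        auto self = currentRequestFiber();
        auto buf = cast(char[]) self.allocate(5);
        buf[] = "hello";
        ++served;
    });

    foreach (_; 0 .. 3)
    {
        fiber.call();      // run one "request" to completion
        fiber.recycle();   // blast the region, reuse the fiber
    }
    assert(served == 3 && fiber.used == 0);
}
```

Nothing stops a handler from leaking a region slice to a global here, which is the safety gap the rest of the thread is about.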
Jun 27 2016
Ola Fosheim Grøstad writes:
On Monday, 27 June 2016 at 22:51:05 UTC, Martin Nowak wrote:
 Simple as don't allocate on a per-request basis if you want a 
 fast program. It also trivial to attach an std.allocator to 
 your custom Fiber. Also Fibers should be pooled and reused to 
 avoid the setup cost (just measured 1.5µs on top of the 28ns to 
 simply reset a fiber).
Are you arguing in favour of having a fiber-local GC or just having a regular allocator?

In a real cloud setup you'll have to deal with running on instances with only 128MB of RAM, so you need to collect occasionally for heavy requests. If you don't, the entire instance is shut down.
Jun 27 2016