www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Eclipse OMR project provides a reusable Garbage Collector

reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
Hi

Not sure if I am repeating information already known here - if so 
I do apologise!

IBM has open sourced the J9 Garbage Collector as part of Eclipse 
OMR project. Perhaps D could use this. It is claimed that the GC 
implementation is relatively straight forward to reuse.

https://github.com/eclipse/omr

Regards
Dibyendu
Dec 02 2016
next sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Saturday, 3 December 2016 at 02:23:13 UTC, Dibyendu Majumdar 
wrote:
 IBM has open sourced the J9 Garbage Collector as part of 
 Eclipse OMR project. Perhaps D could use this. It is claimed 
 that the GC implementation is relatively straight forward to 
 reuse.

 https://github.com/eclipse/omr
The documentation is a bit sparse but here is some additional info: http://www.slideshare.net/craiglehmann/the-omr-gc-talk-ruby-kaigi-2015
Dec 02 2016
prev sibling next sibling parent reply Chris Wright <dhasenan gmail.com> writes:
On Sat, 03 Dec 2016 02:23:13 +0000, Dibyendu Majumdar wrote:
 IBM has open sourced the J9 Garbage Collector as part of Eclipse OMR
 project. Perhaps D could use this. It is claimed that the GC
 implementation is relatively straight forward to reuse.
That's awesome! Unfortunately, it looks like incorporating it would take a fair bit of effort, and upstreaming it will not happen. I'm going to take a deeper look this weekend, but it's not looking good so far. Druntime is under the Boost public license and will remain under it. OMR is dual licensed Apache2 / Eclipse Public License. Druntime's GC system is pluggable, but the GC instance is set in a private variable in the `gc.proxy` module. You need to modify the runtime to add another GC implementation -- it's not something an application can just insert, and it's not something you can override just by linking in another library. (As far as I can determine. Corrections welcome.) And that makes the licensing problem much harder to work around. That said, let's look at the API: omrobjectptr_t OMR_GC_Allocate(OMR_VMThread * omrVMThread, uintptr_t allocationCategory, uintptr_t size, uintptr_t allocateFlags); omrobjectptr_t OMR_GC_AllocateNoGC(OMR_VMThread * omrVMThread, uintptr_t allocationCategory, uintptr_t size, uintptr_t allocateFlagss); omr_error_t OMR_GC_SystemCollect(OMR_VMThread* omrVMThread, uint32_t gcCode); First parameter is a pointer to an OMR thread. I'd have to rewrite core.thread to use OMR's thread system, or maybe I could fake OMR threads with enough effort. That's another barrier. The `allocationCategory` element is basically an object that describes the object model and lets OMR access type information. Looks like it expects us to be able to get the size of an object from a pointer to it. D's runtime *does* let you get this information for heap-allocated objects, but for arrays and heap-allocated structs, it does so by accessing the GC's internal data structures...yeah. In D, you can malloc memory or memory map a file and put pointers in there. You manage this with the runtime by calling `GC.addRoot`. There doesn't appear to be an equivalent in OMR's GC. (Hard to tell, since it clocks in at 60KLOC.) I *think* the AllocateNoGC function allocates but guarantees it won't collect memory. Not entirely certain. D's current runtime rather requires that.
Dec 02 2016
parent reply Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Saturday, 3 December 2016 at 06:26:22 UTC, Chris Wright wrote:
 On Sat, 03 Dec 2016 02:23:13 +0000, Dibyendu Majumdar wrote:
 IBM has open sourced the J9 Garbage Collector as part of 
 Eclipse OMR project. Perhaps D could use this. It is claimed 
 that the GC implementation is relatively straight forward to 
 reuse.
That's awesome! Unfortunately, it looks like incorporating it would take a fair bit of effort, and upstreaming it will not happen. I'm going to take a deeper look this weekend, but it's not looking good so far.
I would imagine that some effort is necessary ... but the payoff is huge isn't it ? To have a robust GC would change the game for D.
 That said, let's look at the API:

 omrobjectptr_t OMR_GC_Allocate(OMR_VMThread * omrVMThread, 
 uintptr_t allocationCategory, uintptr_t size, uintptr_t 
 allocateFlags);

 omrobjectptr_t OMR_GC_AllocateNoGC(OMR_VMThread * omrVMThread, 
 uintptr_t allocationCategory, uintptr_t size, uintptr_t 
 allocateFlagss);

 omr_error_t OMR_GC_SystemCollect(OMR_VMThread* omrVMThread, 
 uint32_t gcCode);

 First parameter is a pointer to an OMR thread. I'd have to 
 rewrite core.thread to use OMR's thread system, or maybe I 
 could fake OMR threads with enough effort. That's another 
 barrier.
I think all GCs need to interact with threads so there has to be some linkage between the thread sub-system and GC. I saw a slide deck or some docs that explained the requirements to use OMR GC. Unfortunately I did not save a link and cannot locate this now. I will post a link if I find it again. It said that the OMR expects objects to be allocated on 8-byte aligned memory and it requires one GC byte - which is assumed to be at the start of the allocated object but can be changed if required. Regards Dibyendu
Dec 03 2016
next sibling parent Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Saturday, 3 December 2016 at 10:29:02 UTC, Dibyendu Majumdar 
wrote:
 That said, let's look at the API:

 omrobjectptr_t OMR_GC_Allocate(OMR_VMThread * omrVMThread, 
 uintptr_t allocationCategory, uintptr_t size, uintptr_t 
 allocateFlags);

 omrobjectptr_t OMR_GC_AllocateNoGC(OMR_VMThread * omrVMThread, 
 uintptr_t allocationCategory, uintptr_t size, uintptr_t 
 allocateFlagss);

 omr_error_t OMR_GC_SystemCollect(OMR_VMThread* omrVMThread, 
 uint32_t gcCode);

 First parameter is a pointer to an OMR thread. I'd have to 
 rewrite core.thread to use OMR's thread system, or maybe I 
 could fake OMR threads with enough effort. That's another 
 barrier.
I think all GCs need to interact with threads so there has to be some linkage between the thread sub-system and GC. I saw a slide deck or some docs that explained the requirements to use OMR GC. Unfortunately I did not save a link and cannot locate this now. I will post a link if I find it again. It said that the OMR expects objects to be allocated on 8-byte aligned memory and it requires one GC byte - which is assumed to be at the start of the allocated object but can be changed if required.
Here is the video that describes how to use the GC. Says initial integration should take <100 lines of code! GC details are about 30 minutes into the talk. https://developer.ibm.com/open/videos/eclipse-omr-tech-talk/
Dec 03 2016
prev sibling parent reply Chris Wright <dhasenan gmail.com> writes:
On Sat, 03 Dec 2016 10:29:02 +0000, Dibyendu Majumdar wrote:
 I would imagine that some effort is necessary ... but the payoff is huge
 isn't it ? To have a robust GC would change the game for D.
It's possible that ldc and gdc will switch their runtimes to use OMR, but dmd will not. I might be able to provide an alternate runtime with dub. Will have to investigate. If that works, we might end up splitting the runtime in two -- a lot of the runtime won't need to change, and I'd rather not have the maintenance overhead for typeinfo and unicode handling just to provide a better GC.
 I think all GCs need to interact with threads so there has to be some
 linkage between the thread sub-system and GC.
True. If this project were designed to be used a la carte, it would have a pluggable mechanism for identifying thread stacks and pausing threads. That would make it much easier to port a runtime to OMR, but it would make OMR more complex -- and it's already 775kLOC. It adds more work to my plate, but I can deal.
 I saw a slide deck or some docs that explained the requirements to use
 OMR GC. Unfortunately I did not save a link and cannot locate this now.
 I will post a link if I find it again.
Thanks! Watching it now.
 It said that the OMR expects objects to be allocated on 8-byte aligned
 memory and it requires one GC byte - which is assumed to be at the start
 of the allocated object but can be changed if required.
One byte for the GC, or one byte of overhead for the glue layer to use? Assuming that all structs are padded to word boundaries, one byte would allow a 256-word struct. I can create a struct with 10,000 fields, and each can be more than one word.
Dec 03 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Sat, 03 Dec 2016 15:21:02 +0000, Chris Wright wrote:

 On Sat, 03 Dec 2016 10:29:02 +0000, Dibyendu Majumdar wrote:
 I would imagine that some effort is necessary ... but the payoff is
 huge isn't it ? To have a robust GC would change the game for D.
 It said that the OMR expects objects to be allocated on 8-byte aligned
 memory and it requires one GC byte - which is assumed to be at the
 start of the allocated object but can be changed if required.
One byte for the GC, or one byte of overhead for the glue layer to use? Assuming that all structs are padded to word boundaries, one byte would allow a 256-word struct. I can create a struct with 10,000 fields, and each can be more than one word.
Also, OMR assumes there are no unaligned pointers to GC memory. So if I wrote the following (perfectly valid) D code, it would crash: class Foo { bool[4] bools; } auto a = &new Foo().bools[1];
Dec 03 2016
parent Chris Wright <dhasenan gmail.com> writes:
On Sat, 03 Dec 2016 15:45:00 +0000, Chris Wright wrote:
 Also, OMR assumes there are no unaligned pointers to GC memory.
Also no pointers-to-members, but I have a way around both these problems. Obvious in hindsight. The memory overhead is O(#allocations). The time overhead is O(lg #allocations) per pointer reference per scan. So we won't see numbers as impressive as Java's, unfortunately -- not unless we forbid pointers-to-members and rework array slicing. The details: When I allocate a thing, I insert an entry for the thing into an ordered set of allocated ranges. The entry will be a (begin, end) pointer pair for the allocation. I can get this info because allocations tell me how much to allocate, and OMR's GC gives me a pointer. When inserting that allocated range into the set, I will have to erase any overlapping elements. This shouldn't be difficult -- as long as the no-overlapping-ranges invariant was maintained before this call, I only have to look one element to the left, and it's quick to determine how far to the right I need to look. When scanning an object, we find the pointers in that object as we currently do. Then, instead of enqueueing those pointers to be scanned, we look them up in the allocation set and enqueue the allocation's `begin` pointer. Pretty much what we do today. It's just replacing a little less of the GC than we would prefer.
Dec 03 2016
prev sibling parent Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Saturday, 3 December 2016 at 02:23:13 UTC, Dibyendu Majumdar 
wrote:
 IBM has open sourced the J9 Garbage Collector as part of 
 Eclipse OMR project. Perhaps D could use this. It is claimed 
 that the GC implementation is relatively straight forward to 
 reuse.

 https://github.com/eclipse/omr
BTW the .Net Core CLR also appears to be reusable - and maybe even easier to integrate. There is a standalone sample - and the integration interface seems well defined. https://github.com/dotnet/coreclr/tree/master/src/gc
Dec 03 2016