
digitalmars.D - Migrating an existing more modern GC to D's gc.d

reply Per Nordlöw <per.nordlow gmail.com> writes:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?

Which kinds of GC's would be of interest?

Which attempts have been made already?
Apr 09 2018
next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?
Considering no one has done it, very.
 Which kinds of GC's would be of interest?
There have been threads about this. I'd do a search for "precise GC" in general.
 Which attempts have been made already?
https://github.com/dlang/druntime/pull/1603
Apr 09 2018
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Monday, 9 April 2018 at 18:39:11 UTC, Jack Stouffer wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?
Considering no one has done it, very.
What's the reason for this being so hard? Is it a programming model that is too loose, one that enables (and has enabled) too much bit-fiddling with pointers (and class references)?
Apr 09 2018
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?
Which one? None of the even remotely advanced GCs are pluggable. Most, in addition to being hardwired to a runtime/VM codebase, also rely on things like:
- a particular object layout, as in an object header (Java, Dart and many JavaScript engines certainly do this)
- safe points and custom stack maps
- some use tagged pointers and forbid explicit pointer arithmetic
- most heavily rely on GC pointers not being mixed with non-GC pointers
- generational ones need write barriers (pieces of code that guard each assignment of a reference)
- most concurrent ones use read barriers as well
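To make the write-barrier bullet concrete, here is a rough sketch in D of what such a barrier does; nothing below exists in druntime, and all the names are invented for illustration:

---
// A generational collector needs the compiler to emit a call like this on
// every pointer store into the heap, so it can later find old-generation
// slots that point into the young generation.
struct RememberedSet
{
    void*[] dirtySlots;                       // slots recorded by the barrier

    void record(void** slot) { dirtySlots ~= cast(void*) slot; }
}

RememberedSet rememberedSet;                  // thread-local, just for the sketch

// What the compiler would expand `obj.field = value;` into:
void writeBarrier(void** slot, void* newValue)
{
    rememberedSet.record(slot);               // real barriers filter by generation/card
    *slot = newValue;
}

void main()
{
    static struct Node { Node* next; }
    auto a = new Node;
    auto b = new Node;
    writeBarrier(cast(void**) &a.next, cast(void*) b); // stands in for `a.next = b;`
    assert(a.next is b);
}
---

Since D code can store pointers through raw void* and freely mix GC and non-GC memory, there is no single place to hang such a barrier today, which is exactly why these designs don't port over directly.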
 Which kinds of GC's would be of interest?
I believe we can get away with parallel mark-sweep + snapshot-based concurrency. It has some limitations but in D land with GC not being the single source of memory it should work fine.
 Which attempts have been made already?
I still think that a mostly precise Immix-style GC would also work; it won't be a 1:1 porting job though. Many things to figure out.
Apr 09 2018
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Apr 09, 2018 at 07:43:00PM +0000, Dmitry Olshansky via Digitalmars-d
wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
[...]
 Which kinds of GC's would be of interest?
I believe we can get away with parallel mark-sweep + snapshot-based concurrency. It has some limitations but in D land with GC not being the single source of memory it should work fine.
 Which attempts have been made already?
I still think that mostly precise Immix style GC would also work, it won’t be 1:1 porting job though. Many things to figure out.
Last I remembered, you were working on a GC prototype for D? Any news on that, or have you basically given it up?

T

-- 
Life is complex. It consists of real and imaginary parts. -- YHL
Apr 09 2018
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
 On Mon, Apr 09, 2018 at 07:43:00PM +0000, Dmitry Olshansky via 
 Digitalmars-d wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
[...]
 Which kinds of GC's would be of interest?
I believe we can get away with parallel mark-sweep + snapshot-based concurrency. It has some limitations but in D land with GC not being the single source of memory it should work fine.
 Which attempts have been made already?
I still think that mostly precise Immix style GC would also work, it won’t be 1:1 porting job though. Many things to figure out.
Last I remembered, you were working on a GC prototype for D?
Still there, but my spare time is super limited lately, the other project preempted that for the moment.
 Any news on that, or have you basically given it up?
Might try to hack to the finish line in one good night, it was pretty close to complete. Debugging would be fun though ;) Will likely try to complete it at DConf hackathon, I’d be glad should anyone want to help.
 T
Apr 09 2018
parent reply David Bennett <davidbennett bravevision.com> writes:
On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky wrote:
 On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
 Last I remembered, you were working on a GC prototype for D?
Still there, but my spare time is super limited lately, the other project preempted that for the moment.
 Any news on that, or have you basically given it up?
Might try to hack to the finish line in one good night, it was pretty close to complete. Debugging would be fun though ;)
I was thinking about messing with the GC in my free time just yesterday... how hard would it be:

Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the type or its members are not shared or __gshared. Then we could store BlkAttr.THREAD_LOCAL memory in different pages (per thread) without having to take a mutex. (If we need to get a new page from the global pool, we take a mutex for that.)

If that's possible, we could also Just(TM) scan the current thread's stack and mark/sweep only those pages (without a stop-the-world).

And when a thread ends we could give the pages to the global pool without a mark/sweep.

The idea is that it works like it does currently unless something is invisible to other threads. Or am I missing something obvious? (quite likely)
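Roughly, the allocation side of that idea might look like the sketch below. BlkAttr.THREAD_LOCAL does not exist in druntime today; the bit value, the allocate helper and the top-level-only qualifier check are all invented for illustration:

---
import core.memory : GC;

enum uint THREAD_LOCAL_BIT = 1 << 16;   // stand-in for the proposed BlkAttr.THREAD_LOCAL
                                        // (today's GC simply ignores unknown attribute bits)

T* allocate(T)()
{
    uint attrs = 0;
    // If the type is not itself shared or immutable, tag the block so a
    // per-thread collector could keep it in thread-local pages. A real
    // implementation would also have to inspect all members transitively.
    static if (!is(T == shared) && !is(T == immutable))
        attrs |= THREAD_LOCAL_BIT;
    return cast(T*) GC.calloc(T.sizeof, attrs);
}

void main()
{
    static struct Node { int value; Node* next; }
    auto n = allocate!Node();           // would land in this thread's pages
    n.value = 42;
    assert(n.value == 42);
}
---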
Apr 09 2018
next sibling parent David Bennett <davidbennett bravevision.com> writes:
On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
 I was thinking about messing with the GC in my free time just 
 yesterday... how hard would it be:

 [snip]

 The idea is it works like it does currently unless something is 
 invisible to other threads, Or am i missing something obvious? 
 (quite likely)
Forgot to mention that a non-thread-local mark/sweep would still scan all thread stacks and pages like it does currently, as a thread-local could hold a pointer to the global data (i.e. a copy of a __gshared void*).

The only way I can think of to break this idea is using cast() or sending something to a C function that then adds pointers to the thread-local stuff into global data...
Apr 09 2018
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
 On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky 
 wrote:
 On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
 Last I remembered, you were working on a GC prototype for D?
Still there, but my spare time is super limited lately, the other project preempted that for the moment.
 Any news on that, or have you basically given it up?
Might try to hack to the finish line in one good night, it was pretty close to complete. Debugging would be fun though ;)
 I was thinking about messing with the GC in my free time just 
 yesterday... how hard would it be:
 
 Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the 
 type or its members are not shared or __gshared. Then we could 
 store BlkAttr.THREAD_LOCAL memory in different pages (per thread) 
 without having to take a mutex. (If we need to get a new page from 
 the global pool, we take a mutex for that.)
You've left out immutable: thread-local data is often cast to immutable, sometimes by the compiler. See assumeUnique and its ilk in Phobos. Same with shared - it's still often the case that you allocate thread-local data and then cast it to shared.

Lastly, thanks to the zero type safety of delegates it's trivial to share a single GC-backed stack with multiple threads. So what you deemed thread-local might be used in another thread, transitively so.

D is thread-local except when it's not.
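For a concrete example of the first point, this is ordinary, idiomatic D (assumeUnique is real Phobos; makeTable is just a name for the example), and it is exactly the kind of code that defeats allocation-time thread-locality:

---
import std.exception : assumeUnique;

immutable(int)[] makeTable()
{
    auto buf = new int[](4);            // allocated as ordinary thread-local data
    foreach (i, ref e; buf)
        e = cast(int)(i * i);
    return assumeUnique(buf);           // now immutable, hence implicitly shareable
}

void main()
{
    immutable table = makeTable();
    assert(table[3] == 9);
}
---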
 If that's possible, we could also Just(TM) scan the current 
 thread's stack and mark/sweep only those pages (without a 
 stop-the-world).
That is indeed something we should at some point have. It needs cooperation from the language, such as explicit functions for shared<->local conversions that the runtime is aware of.
 And when a thread ends we could give the pages to the global 
 pool without a mark/sweep.

 The idea is that it works like it does currently unless something 
 is invisible to other threads. Or am I missing something obvious? 
 (quite likely)
Indeed - while per-thread GC would work in principle, there are ugly details that will in general make it crash and burn on most non-trivial programs.
Apr 09 2018
parent reply David Bennett <davidbennett bravevision.com> writes:
On Tuesday, 10 April 2018 at 06:43:28 UTC, Dmitry Olshansky wrote:
 On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
 I was thinking about messing with the GC in my free time just 
 yesterday... how hard would it be:

 [snip]
 You've left out immutable: thread-local data is often cast to 
 immutable, sometimes by the compiler. See assumeUnique and its ilk 
 in Phobos. Same with shared - it's still often the case that you 
 allocate thread-local data and then cast it to shared.
People cast from thread-local to shared? ...okay, that's no fun... :(

I can understand the other way; that's why I was leaning on the conservative side and putting more stuff in the global pools.
 Lastly, thanks to the zero type safety of delegates it's trivial 
 to share a single GC-backed stack with multiple threads. So what 
 you deemed thread-local might be used in another thread, 
 transitively so.
Oh thats a good point I didn't think of!
 D is thread-local except when it’s not.

 If thats possible we could also Just(TM) scan the current 
 thread stack and mark/sweep only those pages. (without a stop 
 the world)
 That is indeed something we should at some point have. It needs cooperation from the language, such as explicit functions for shared<->local conversions that the runtime is aware of.
So the language could (in theory) inject a __move_to_global(ref local, ref global) when casting to shared and the GC would need to update all the references in the local pages to point to the new global address?
 And when a thread ends we could give the pages to the global 
 pool without a mark/sweep.

 The idea is it works like it does currently unless something 
 is invisible to other threads, Or am i missing something 
 obvious? (quite likely)
Indeed there are ugly details that while would allow per thread GC in principle will in general crash and burn on most non-trivial programs.
Okay, thanks for the points - they were very clear, so I assume you have spent a lot more brain power on this than I have.
Apr 10 2018
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 10 April 2018 at 07:22:14 UTC, David Bennett wrote:
 On Tuesday, 10 April 2018 at 06:43:28 UTC, Dmitry Olshansky 
 wrote:
 On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
 I was thinking about messing with the GC in my free time just 
 yesterday... how hard would it be:

 [snip]
 You've left out immutable: thread-local data is often cast to 
 immutable, sometimes by the compiler. See assumeUnique and its ilk 
 in Phobos. Same with shared - it's still often the case that you 
 allocate thread-local data and then cast it to shared.
 People cast from thread-local to shared? ...okay, that's no fun... :(
 
 I can understand the other way; that's why I was leaning on the 
 conservative side and putting more stuff in the global pools.
Well you might want to build something as thread-local and then publish as shared.
 That is indeed something we should at some point have. Needs 
 cooperation from the language such as explicit functions for 
 shared<->local conversions that run-time is aware of.
So the language could (in theory) inject a __move_to_global(ref local, ref global) when casting to shared and the GC would need to update all the references in the local pages to point to the new global address?
I think it could be __to_shared(ptr, length), to let the GC know that the block should be added to a global set of sorts. That will foobar the GC design quite a bit, but to have per-thread GCs I'd take that risk.

But then, keeping in mind the transitive nature of shared.... Maybe not ;)

Maybe it should work the other way around - keep everything in the global pool, and have per-thread ref-sets of some form. Tricky anyway.
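As a sketch of that direction (nothing here exists in druntime; gcPromoteToShared stands in for the hypothetical __to_shared, and publish shows where the compiler would insert the call):

---
// In the imagined design the compiler rewrites every cast(shared) so the
// runtime learns that the block, and everything reachable from it, has
// escaped its owning thread.
void gcPromoteToShared(void* ptr, size_t length)
{
    // Real version: mark the owning pool/page as globally visible, or queue
    // the block for promotion to a global pool; it must also walk what the
    // block points to, because shared is transitive.
}

shared(T)* publish(T)(T* local)
{
    gcPromoteToShared(cast(void*) local, T.sizeof); // compiler-inserted call
    return cast(shared(T)*) local;                  // the cast itself is unchanged
}

void main()
{
    auto p = new int;
    *p = 123;
    shared(int)* sp = publish(p);
    assert(*cast(int*) sp == 123);
}
---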
Apr 11 2018
parent David Bennett <davidbennett bravevision.com> writes:
On Wednesday, 11 April 2018 at 19:38:59 UTC, Dmitry Olshansky 
wrote:
 On Tuesday, 10 April 2018 at 07:22:14 UTC, David Bennett wrote:
 People cast from thread local to shared? ...okay thats no 
 fun...  :(

 I can understand the other way, thats why i was leaning on the 
 conservative side and putting more stuff in the global pools.
Well you might want to build something as thread-local and then publish as shared.
Yeah, I can see that if you're trying to share types like classes, shared would get in the way quite quickly.
 I think it could be __to_shared(ptr, length) to let GC know 
 that block should be added to global set of sorts. That will 
 foobar the GC design quite a bit but to have per thread GCs I’d 
 take that risk.
Yeah, I had this idea also: the runtime gets a hook on cast(shared), and the GC then just sets a flag so that part of memory will never be freed during a thread-local mark/sweep. No move needed.
 But then keeping in mind transitive nature of shared.... Maybe 
 not ;)
Yeah, shared is quite locked down, so there should be fewer ways people could foil my plans.

It's __gshared that I'm worried about now, i.e. if you had a class (stored in the global pool) that you then assigned a local class to one of its members. When a thread-local mark/sweep happened, it wouldn't see the ref in the global pool and the member might get freed...

---
class A{}

class B{
    __gshared A a;
    this(A a){
        this.a = a;
    }
}

void main()
{
    A a = new A();
    B b = new B(a);
}
---

Currently my idea of storing classes with __gshared members would put B in the global pool, but there's no cast, so A would not be hooked with __to_shared(). I guess the compiler could in theory inject the same __to_shared() in this case also, but it would be a lot harder and would probably be a mess as there's no cast to hook.

So maybe with __gshared it should be on the thread-local pool but marked as global... but you might be able to mix shared and __gshared in a way that wouldn't work.
 Maybe it should work the other way around - keep all in global 
 pool, and have per-thread ref-sets of some form. Tricky anyway.
Would be worth some thought; I'll keep it in mind.

For now, I'm seeing if I can just make it so each thread has its own Bin list. This way the data is stored such that the thread-local stuff is generally packed closer together and there's a higher chance of having a whole free page after a global mark/sweep.

Is there a good benchmark for the GC I can run to see if I'm actually improving things?
Apr 12 2018
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, April 10, 2018 06:10:10 David Bennett via Digitalmars-d wrote:
 On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky wrote:
 On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
 Last I remembered, you were working on a GC prototype for D?
Still there, but my spare time is super limited lately, the other project preempted that for the moment.
 Any news on that, or have you basically given it up?
Might try to hack to the finish line in one good night, it was pretty close to complete. Debugging would be fun though ;)
 I was thinking about messing with the GC in my free time just 
 yesterday... how hard would it be:
 
 Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the 
 type or its members are not shared or __gshared. Then we could 
 store BlkAttr.THREAD_LOCAL memory in different pages (per thread) 
 without having to take a mutex. (If we need to get a new page from 
 the global pool, we take a mutex for that.)
 
 If that's possible, we could also Just(TM) scan the current 
 thread's stack and mark/sweep only those pages (without a 
 stop-the-world).
 
 And when a thread ends we could give the pages to the global pool 
 without a mark/sweep.
 
 The idea is that it works like it does currently unless something 
 is invisible to other threads. Or am I missing something obvious? 
 (quite likely)
As it stands, it's impossible to have thread-local memory pools. It's quite legal to construct an object as shared or thread-local and cast it to the other. In fact, it's _highly_ likely that that's how any shared object of any complexity is going to be constructed. Similarly, it's extremely common to allocate an object as mutable and then cast it to immutable (either using assumeUnique or by using a pure function where the compiler does the cast implicitly for you if it can guarantee that the return value is unique), and immutable objects are implicitly shared.

At minimum, there would have to be runtime hooks to do something like move an object between pools when it is cast to shared or immutable (or back) in order to ensure that an object was in the right pool, but if that requires copying the object rather than just moving the memory block, then it can't be done, because every pointer or reference pointing to that object would have to be rewritten (which isn't supported by the language).

Also, it would be a disaster for shared, because the typical way to use shared is to protect the shared object with a mutex, cast away shared so that it can be operated on as thread-local within that section of code, and then before the mutex is released, all thread-local references then need to be gone. e.g.

    synchronized(mutex)
    {
        auto threadLocal = cast(MyType)mySharedObject;

        // do something with threadLocal...

        // threadLocal leaves scope and is gone without being cast back
    }

    // all references to the shared object should now be shared

You really _don't_ want the shared object to move between pools because of that cast (since it would hurt performance), and in such a situation, you don't usually cast back to shared. Rather, you have a shared reference, cast it to get a thread-local reference, and then let the thread-local reference leave scope. So, the same object temporarily has both a thread-local and a shared reference to it, and if it were moved to the thread-local pool with the cast, it would never be moved back when the thread-local references left scope and the mutex was released.

Having synchronized classes as described in TDPL would make the above code cleaner in the cases where a synchronized class would work, but the basic concept is the same. It would still be doing a cast underneath the hood, and it would still have the same problems. It just wouldn't involve explicit casting. shared's design inherently requires casting away shared, so it just plain isn't going to play well with anything that doesn't play well with such casts - such as having thread-local heaps.

Also, IIRC, at one point, Daniel Murphy explained to me some problem with classes with regards to the virtual table or the TypeInfo that inherently wouldn't work with trying to move it between threads. Unfortunately, I don't remember the details now, but I do remember that there's _something_ there that wouldn't work with thread-local heaps. And if anyone were to seriously try it, I expect that he could probably come up with the reasons again.

Regardless, I think that it's clear that in order to do anything with thread-local pools, we'd have to lock down the type system even further to disallow casts to or from shared or immutable, and that would really be a big problem given the inherent restrictions on those types and how shared is intended to be used.
So, while it's a common idea as to how the GC could be improved, and it would be great if we could do it, I think that it goes right along with all of the other ideas that require stuff like read and write barriers everywhere and thus will never be in D's GC.

- Jonathan M Davis
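For concreteness, here is a minimal, self-contained version of that locking pattern; Counter, mtx and increment are invented names for the example:

---
import core.sync.mutex : Mutex;

class Counter
{
    int n;
    void bump() { ++n; }
}

__gshared Mutex mtx;
shared Counter counter;

shared static this()
{
    mtx = new Mutex;
    counter = cast(shared) new Counter;
}

void increment()
{
    synchronized (mtx)
    {
        auto local = cast(Counter) counter; // cast away shared while the lock is held
        local.bump();                       // operate on it as thread-local data
    } // `local` goes out of scope here; only the shared reference remains
}

void main()
{
    increment();
    assert((cast(Counter) counter).n == 1);
}
---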
Apr 09 2018
next sibling parent reply David Bennett <davidbennett bravevision.com> writes:
On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
 As it stands, it's impossible to have thread-local memory 
 pools. It's quite legal to construct an object as shared or 
 thread-local and cast it to the other. In fact, it's _highly_ 
 likely that that's how any shared object of any complexity is 
 going to be constructed. Similarly, it's extremely common to 
 allocate an object as mutable and then cast it to immutable 
 (either using assumeUnique or by using a pure function where 
 the compiler does the cast implicitly for you if it can 
 guarantee that the return value is unique), and immutable 
 objects are implicitly shared.
(Honest question:) Do people really cast from local to shared/immutable and expect it to work? (Whenever I cast something more complex than a size_t, I almost expect it to blow up... or break sometime in the future.)

That said, I can understand building a shared object from parts of local data... though I try to keep my thread barriers as thin as possible myself. (Meaning I tend to copy stuff to the shared side and have as few shareds as possible.)
 At minimum, there would have to be runtime hooks to do 
 something like move an object between pools when it is cast to 
 shared or immutable (or back) in order to ensure that an object 
 was in the right pool, but if that requires copying the object 
 rather than just moving the memory block, then it can't be 
 done, because every pointer or reference pointing to that 
 object would have to be rewritten (which isn't supported by the 
 language).
A hook for a local-to-shared cast(shared) could work... but it would require a DIP, I guess. I was hoping to make a more incremental improvement to the GC.
 Also, it would be a disaster for shared, because the typical 
 way to use shared is to protect the shared object with a mutex, 
 cast away shared so that it can be operated on as thread-local 
 within that section of code, and then before the mutex is 
 released, all thread-local references then need to be gone. e.g.


 synchronized(mutex)
 {
     auto threadLocal = cast(MyType)mySharedObject;

     // do something with threadLocal...

     // threadLocal leaves scope and is gone without being cast 
 back
 }

 // all references to the shared object should now be shared
Yeah, that's why I was still scanning all thread stacks and pages when marking global data. So shared -> local is a no-op, but the other way needs thought.
 You really _don't_ want the shared object to move between pools 
 because of that cast (since it would hurt performance), and in 
 such a situation, you don't usually cast back to shared. Rather, 
 you have a shared reference, cast it to get a thread-local 
 reference, and then let the thread-local reference leave scope. 
 So, the same object temporarily has both a thread-local and a 
 shared reference to it, and if it were moved to the thread-local 
 pool with the cast, it would never be moved back when the 
 thread-local references left scope and the mutex was released.

 Having synchronized classes as described in TDPL would make the 
 above code cleaner in the cases where a synchronized class 
 would work, but the basic concept is the same. It would still 
 be doing a cast underneath the hood, and it would still have 
 the same problems. It just wouldn't involve explicit casting. 
 shared's design inherently requires casting away shared, so it 
 just plain isn't going to play well with anything that doesn't 
 play well with such casts - such as having thread-local heaps.
I would think a shared class would never be marked as THREAD_LOCAL, as it has a shared member.
 Also, IIRC, at one point, Daniel Murphy explained to me some 
 problem with classes with regards to the virtual table or the 
 TypeInfo that inherently wouldn't work with trying to move it 
 between threads. Unfortunately, I don't remember the details 
 now, but I do remember that there's _something_ there that 
 wouldn't work with thread-local heaps. And if anyone were to 
 seriously try it, I expect that he could probably come up with 
 the reasons again.

 Regardless, I think that it's clear that in order to do 
 anything with thread-local pools, we'd have to lock down the 
 type system even further to disallow casts to or from shared or 
 immutable, and that would really be a big problem given the 
 inherent restrictions on those types and how shared is intended 
 to be used. So, while it's a common idea as to how the GC could 
 be improved, and it would be great if we could do it, I think 
 that it goes right along with all of the other ideas that 
 require stuff like read and write barriers everywhere and thus 
 will never be in D's GC.

 - Jonathan M Davis
Yeah I thought it would have issues, thanks for your feedback! I'll see if I can come up with a better idea that doesn't break as much stuff.
Apr 10 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, April 10, 2018 07:55:00 David Bennett via Digitalmars-d wrote:
 On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
 As it stands, it's impossible to have thread-local memory
 pools. It's quite legal to construct an object as shared or
 thread-local and cast it to the other. In fact, it's _highly_
 likely that that's how any shared object of any complexity is
 going to be constructed. Similarly, it's extremely common to
 allocate an object as mutable and then cast it to immutable
 (either using assumeUnique or by using a pure function where
 the compiler does the cast implicitly for you if it can
 guarantee that the return value is unique), and immutable
 objects are implicitly shared.
 (Honest question:) Do people really cast from local to 
 shared/immutable and expect it to work? (Whenever I cast something 
 more complex than a size_t, I almost expect it to blow up... or 
 break sometime in the future.)
Yes. They expect it to work, and as the language is currently designed, it works perfectly well. In fact, it's even built into the language. e.g.

    int[] foo() pure
    {
        return [1, 2, 3, 4];
    }

    void main()
    {
        immutable arr = foo();
    }

compiles thanks to the fact that the compiler can guarantee from the signature of foo that its return value is unique.

We also have std.exception.assumeUnique (which is just a cast to immutable) as a way to document that you're guaranteeing that a reference to an object is unique and therefore can be safely cast to immutable.
 That said, I can understanding building a shared object from
 parts of local data... though I try to keep my thread barriers as
 thin as possible myself. (meaning I tend to copy stuff to the
 shared and have as few shared's as possible)
Because of how restrictive shared and immutable are, you frequently have to build them from thread-local, mutable data. And while it's preferable to have as little in your program be shared as possible and to favor solutions such as doing message passing with std.concurrency, there are situations where you pretty much need to have complex shared objects. And since D is a systems language, we're a lot more restricted in the assumptions that we can make in comparison to a language such as Java.

- Jonathan M Davis
Apr 10 2018
parent reply David Bennett <davidbennett bravevision.com> writes:
On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:
 Yes. They expect it to work, and as the language is currently 
 designed, it works perfectly well. In fact, it's even built 
 into the language. e.g.

     int[] foo() pure
     {
         return [1, 2, 3, 4];
     }

     void main()
     {
         immutable arr = foo();
     }

 compiles thanks to the fact that the compiler can guarantee 
 from the signature of foo that its return value is unique.
Oh is that run at runtime? I thought D was just smart and did it using CTFE.
 We also have std.exception.assumeUnique (which is just a cast 
 to immutable) as a way to document that you're guaranteeing 
 that a reference to an object is unique and therefore can be 
 safely cast to immutable.
Can't say I've used std.exception.assumeUnique, but I guess other people have a use for it as it exists. Would be nice if you could inject type-checking information at compile time without affecting the storage class. But that's a bit OT now.
 Because of how restrictive shared and immutable are, you 
 frequently have to build them from thread-local, mutable data. 
 And while it's preferable to have as little in your program be 
 shared as possible and to favor solutions such as doing message 
 passing with std.concurrency, there are situations where you 
 pretty much need to have complex shared objects. And since D is 
 a systems language, we're a lot more restricted in the 
 assumptions that we can make in comparison to a language such 

Yeah, I agree that any solution should keep in mind that D is a systems language and should allow you to do stuff when you need to.

Oh, I just had a much simpler idea that shouldn't have any issues; I'll see if that makes the GC faster at allocating. (Everything else stays the same.)
Apr 10 2018
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, April 10, 2018 08:37:47 David Bennett via Digitalmars-d wrote:
 On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:
 Yes. They expect it to work, and as the language is currently
 designed, it works perfectly well. In fact, it's even built
 into the language. e.g.

     int[] foo() pure
     {
         return [1, 2, 3, 4];
     }

     void main()
     {
         immutable arr = foo();
     }

 compiles thanks to the fact that the compiler can guarantee
 from the signature of foo that its return value is unique.
Oh is that run at runtime? I thought D was just smart and did it using CTFE.
CTFE only ever happens when it must happen. The compiler never does it as an optimization. So, if you did enum arr = foo(); or static arr = foo(); then it would use CTFE, because an enum's value must be known at compile time, and if a static variable is directly initialized instead of initialized via a static constructor, its value must be known at compile time. But if you're initializing a variable whose value does not need to be known at compile time, then no CTFE occurs.

It would be a serious rabbit hole for the compiler to attempt CTFE when it wasn't told to, particularly since it can't look at a function and know whether it's going to work with CTFE or not. It has to actually call it with a specific set of arguments to find out (and depending on what the function does, it might even work with CTFE with some arguments and not with others - e.g. if a particular branch of an if statement works with CTFE while another does an operation that doesn't work with CTFE).

- Jonathan M Davis
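To make the distinction concrete, here is the rule in code (standard D; foo is the same function quoted earlier in the thread):

---
int[] foo() pure
{
    return [1, 2, 3, 4];
}

enum ctfeArr = foo();               // CTFE: an enum needs a compile-time value
static immutable staticArr = foo(); // CTFE: a directly initialized static variable

void main()
{
    immutable runtimeArr = foo();   // no CTFE: an ordinary call at run time
    assert(runtimeArr == ctfeArr && runtimeArr == staticArr);
}
---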
Apr 10 2018
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10.04.2018 10:56, Jonathan M Davis wrote:
 CTFE only ever happens when it must happen. The compiler never does it as an
 optimization.
The frontend doesn't. The backend might.
Apr 13 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:
 On 10.04.2018 10:56, Jonathan M Davis wrote:
 CTFE only ever happens when it must happen. The compiler never does it
 as an optimization.
The frontend doesn't. The backend might.
The optimizer may do constant folding or inline the code so far that it just gives the result, but it doesn't do actual CTFE. That's all in the frontend. - Jonathan M Davis
Apr 13 2018
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 13.04.2018 23:40, Jonathan M Davis wrote:
 On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:
 On 10.04.2018 10:56, Jonathan M Davis wrote:
 CTFE only ever happens when it must happen. The compiler never does it
 as an optimization.
The frontend doesn't. The backend might.
The optimizer may do constant folding or inline the code so far that it just gives the result, but it doesn't do actual CTFE. That's all in the frontend. - Jonathan M Davis
CTFE just stands for "compile-time function evaluation". Claiming that the compiler never does this as an optimization is a bit misleading, but fine.
Apr 13 2018
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Apr 14, 2018 at 01:40:58AM +0200, Timon Gehr via Digitalmars-d wrote:
 On 13.04.2018 23:40, Jonathan M Davis wrote:
 On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:
 On 10.04.2018 10:56, Jonathan M Davis wrote:
 CTFE only ever happens when it must happen. The compiler never
 does it as an optimization.
The frontend doesn't. The backend might.
The optimizer may do constant folding or inline the code so far that it just gives the result, but it doesn't do actual CTFE. That's all in the frontend. - Jonathan M Davis
CTFE just stands for "compile-time function evaluation". Claiming that the compiler never does this as an optimization is a bit misleading, but fine.
CTFE, as currently implemented in the compiler front-end (i.e., common across dmd, gdc, ldc), is actually only invoked when a value is *required* at compile-time, e.g., as a template argument or enum. While the CTFE code did grow out of the constant-folding code, the two are actually distinct, and the front-end never calls CTFE when performing constant-folding (even though CTFE could be construed to be a souped-up form of constant-folding). This is a rather fine distinction in the current implementation that I wasn't aware of until recently.

Certain backends, like ldc's, may also perform their own "compile-time function evaluation", e.g., on the LLVM IR, as part of their optimization pass. For example, the LDC optimizer can literally execute LLVM IR at compile-time and replace an entire function-call tree with a single instruction that loads the computed value as a literal. This has nothing to do with CTFE (as we know it in D) per se, but is a feature of the LDC optimizer.

T

-- 
It only takes one twig to burn down a forest.
Apr 13 2018
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/10/18 4:37 AM, David Bennett wrote:
 On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:
 Yes. They expect it to work, and as the language is currently 
 designed, it works perfectly well. In fact, it's even built into the 
 language. e.g.

     int[] foo() pure
     {
         return [1, 2, 3, 4];
     }

     void main()
     {
         immutable arr = foo();
     }

 compiles thanks to the fact that the compiler can guarantee from the 
 signature of foo that its return value is unique.
Oh is that run at runtime? I thought D was just smart and did it using CTFE.
Well, D could be smart enough and call a runtime function that says it's moving data from thread-local to shared (or vice versa).
 
 We also have std.exception.assumeUnique (which is just a cast to 
 immutable) as a way to document that you're guaranteeing that a 
 reference to an object is unique and therefore can be safely cast to 
 immutable.
 Can't say I've used std.exception.assumeUnique, but I guess other 
 people have a use for it as it exists. Would be nice if you could 
 inject type-checking information at compile time without affecting 
 the storage class. But that's a bit OT now.
assumeUnique is a library function; it could be instrumented to do the right thing. I think it's possible to do this in D, but you need language support.

-Steve
Apr 10 2018
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2018-04-10 08:47, Jonathan M Davis wrote:

 Regardless, I think that it's clear that in order to do anything with
 thread-local pools, we'd have to lock down the type system even further to
 disallow casts to or from shared or immutable, and that would really be a
 big problem given the inherent restrictions on those types and how shared is
 intended to be used.
Apple's GC for Objective-C (before it had ARC) was using thread-local pools. I wonder how they managed to do that in a language that doesn't have a type system that differentiates between TLS and shared memory.

-- 
/Jacob Carlborg
Apr 10 2018
parent Paulo Pinto <pjmlp progtools.org> writes:
On Tuesday, 10 April 2018 at 18:31:28 UTC, Jacob Carlborg wrote:
 On 2018-04-10 08:47, Jonathan M Davis wrote:

 Regardless, I think that it's clear that in order to do 
 anything with
 thread-local pools, we'd have to lock down the type system 
 even further to
 disallow casts to or from shared or immutable, and that would 
 really be a
 big problem given the inherent restrictions on those types and 
 how shared is
 intended to be used.
Apple's GC for Objective-C (before it had ARC) was using thread-local pools. I wonder how they manged to do that in a language that doesn't have a type system that differentiates between TLS and shared memory.
They were doing it quite badly.

One of the reasons that always gets lost when discussing the merits of ARC over GC in Objective-C is that Apple never managed to make the GC work without issues given its underlying C semantics. So naturally, having the compiler do what developers were already doing by hand with Framework-derived classes was a safer way than ensuring Objective-C's GC would never crash.

Apple used to have a GC caveats document that was long since taken down from their site. This is one of the few surviving ones:

https://developer.apple.com/library/content/releasenotes/Cocoa/RN-ObjectiveC/#//apple_ref/doc/uid/TP40004309-CH1-DontLinkElementID_1
Apr 11 2018
prev sibling parent Ikeran <dhasenan ikeran.org> writes:
On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
 As it stands, it's impossible to have thread-local memory 
 pools. It's quite legal to construct an object as shared or 
 thread-local and cast it to the other. In fact, it's _highly_ 
 likely that that's how any shared object of any complexity is 
 going to be constructed. Similarly, it's extremely common to 
 allocate an object as mutable and then cast it to immutable 
 (either using assumeUnique or by using a pure function where 
 the compiler does the cast implicitly for you if it can 
 guarantee that the return value is unique), and immutable 
 objects are implicitly shared. At minimum, there would have to 
 be runtime hooks to do something like move an object between 
 pools when it is cast to shared or immutable (or back) in order 
 to ensure that an object was in the right pool, but if that 
 requires copying the object rather than just moving the memory 
 block, then it can't be done, because every pointer or 
 reference pointing to that object would have to be rewritten 
 (which isn't supported by the language).
It's a bit easier than that. When you cast something to shared or immutable, or allocate it as shared or immutable, you pin the object on the local heap. When the thread-local collector runs, it won't collect that object, since another thread might know about it. Then, when you run the global collector, it will determine which shared objects are still reachable and unpin things as appropriate. That unpinning process requires a way to look up the owning thread for a piece of memory, which can be done in logarithmic time relative to the number of contiguous segments of address space.

Casting away from shared would not call any runtime functions; even if it were guaranteed that the cast were done on the allocating thread, it's likely that there exists another reference to the item in another thread.

This would discourage the use of immutable, since it wouldn't benefit from thread-local heaps.
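A rough sketch of that pinning scheme, with invented names (nothing here exists in druntime); pinForSharing is the hook the runtime would call whenever a block is allocated as, or cast to, shared or immutable:

---
bool[void*] localPins;   // per-thread pin set (module-level variables are TLS in D)

void pinForSharing(void* block)
{
    // The thread-local collector treats pinned blocks as uncollectable; only
    // the global collector may unpin them once it proves they are unreachable
    // from every thread.
    localPins[block] = true;
}

shared(T)* shareOut(T)(T* p)
{
    pinForSharing(cast(void*) p);
    return cast(shared(T)*) p;
}

void main()
{
    auto x = new int;
    *x = 7;
    auto sx = shareOut(x);                 // x is now pinned on this thread's heap
    assert(*cast(int*) sx == 7);
    assert((cast(void*) x in localPins) !is null);
}
---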
Apr 11 2018
prev sibling parent reply Ikeran <dhasenan ikeran.org> writes:
On Monday, 9 April 2018 at 19:43:00 UTC, Dmitry Olshansky wrote:
 None of of even close to advanced GCs are pluggable
Eclipse OMR contains a pluggable GC, and it's used in OpenJ9, which claims to be an enterprise-grade JVM.
Apr 09 2018
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 10 April 2018 at 03:59:33 UTC, Ikeran wrote:
 On Monday, 9 April 2018 at 19:43:00 UTC, Dmitry Olshansky wrote:
 None of of even close to advanced GCs are pluggable
Eclipse OMR contains a pluggable GC, and it's used in OpenJ9,
Or rather, Eclipse OMR is a toolkit for runtimes/VMs, and the GC plugs into that. I encourage you to try to implement D-like semantics with this runtime; you'll see just how pluggable it is.
 which claims to be an enterprise-grade JVM.
I once used OpenJ9, which was IBM J9 I think, right before it was open-sourced. It was about 2x slower than HotSpot; I didn't dig too deep into the precise reason. The fact that it was on Power8 was especially surprising; I thought IBM would take advantage of their own hardware.
Apr 09 2018
prev sibling next sibling parent reply Ali <fakeemail example.com> writes:
On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?

 Which kinds of GC's would be of interest?

 Which attempts have been made already?
I think the priority is not having pluggable GCs, or a better GC, but to fully support @nogc and deterministic and manual memory management, which as I understood it is on the roadmap.
Apr 09 2018
parent reply Nordlöw <per.nordlow gmail.com> writes:
On Monday, 9 April 2018 at 20:20:39 UTC, Ali wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?

 Which kinds of GC's would be of interest?

 Which attempts have been made already?
I think the priority is not having pluggable GC's, or a better GC, but to fully support nogc and deterministic and manual memory management which as I understood is on the roadmap
Solely through allocators, or will the GC adapt in some way?
Apr 09 2018
next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, April 09, 2018 23:21:23 Nordlöw via Digitalmars-d wrote:
 On Monday, 9 April 2018 at 20:20:39 UTC, Ali wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern
 GC-implementation into D's?

 Which kinds of GC's would be of interest?

 Which attempts have been made already?
I think the priority is not having pluggable GC's, or a better GC, but to fully support nogc and deterministic and manual memory management which as I understood is on the roadmap
Through allocators solely or will the GC adapt in some way?
I don't think that there are any plans to fundamentally change how the GC works from the language perspective. The implementation may be improved or replaced, but the GC isn't going anywhere, and any code that uses the GC should continue to be able to do so as it has. Certainly, we're not getting rid of or marginalizing the GC. We just want to make sure that code doesn't use the GC when it doesn't need to or doesn't seriously benefit from using the GC. More of Phobos should be @nogc than is currently, but it's never going to be the case that all of Phobos is @nogc. There are real benefits to using the GC, and we don't want to throw that away. We just don't want to rely on it when it doesn't make sense.

There has been some discussion of adding some sort of RC capabilities to the language with the idea that a type could be designed to be RC-ed that way, but I don't think that the details have been sorted out yet, and I'm not sure that it's even clear whether that's going to involve anything other than GC-allocated memory (e.g. if the GC is used, then it can take care of circular references, whereas if it isn't, then we have to get into weak references and all of the complications that go with that). I believe that Walter started looking into it, but I don't know how far he got before he got sidetracked.

In particular, as I understand it, Walter's work with scope and DIP 1000 was primarily motivated by whatever he was trying to do with RC, because without something like DIP 1000, it becomes much harder (if not impossible) to do RC in a fully safe manner. So, whatever we end up seeing with regards to RC support in the language is going to have to wait until DIP 1000 has been fully sorted out, which will probably be a while.

Also, any work that's done to improve the GC at this point isn't something that's going to be done by Walter. So, improvement to the GC is the sort of thing that's likely to happen in parallel to any language improvements like adding better RC support.

- Jonathan M Davis
Apr 09 2018
prev sibling parent Ali <fakeemail example.com> writes:
On Monday, 9 April 2018 at 23:21:23 UTC, Nordlöw wrote:
 Through allocators solely or will the GC adapt in some way?
Here is the relevant line from the vision document:

"@nogc: Use of D without a garbage collector, most likely by using reference counting and related methods (Unique/Weak references) for reclamation of resources. This task is made challenging by the safety requirement. We believe we have an attack in the upcoming allocators/collections combos."

And the link to the vision document: https://wiki.dlang.org/Vision/2018H1

In general, I do recommend you read the document carefully; it is important to note both what is in it and what is not in it. Obviously, there is no mention of working on the GC. There is also no direct mention of changing or modifying Phobos.

Also, it might be important to read the vision document in order: priority number 1 is "1. Lock down the language definition". This very much aligns with many comments I've seen here from Andrei and Walter that they are more interested in seeing the existing features used, rather than adding new features. The vision document doesn't seem to introduce any new features, mostly improvements to existing features, or making existing features more usable.
Apr 09 2018
prev sibling parent reply Chris <wendlec tcd.ie> writes:
On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?

 Which kinds of GC's would be of interest?

 Which attempts have been made already?
IBM has open sourced its JVM: https://www.eclipse.org/openj9/ They claim they have good GCs. So maybe someone knowledgeable wants to have a look at it.
May 24 2018
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/24/18 8:35 AM, Chris wrote:
 On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
 How difficult would it be to migrate an existing modern 
 GC-implementation into D's?

 Which kinds of GC's would be of interest?

 Which attempts have been made already?
IBM has open sourced its JVM: https://www.eclipse.org/openj9/ They claim they have good GCs. So maybe someone knowledgeable wants to have a look at it.
It's GPL, Apache, or EPL. I'm not sure about EPL, but I know that the former two are not convertible to Boost, so we couldn't accept a port from there.

Really though, the issues with D's GC are partly attributable to the language itself rather than to the GC design. Certain aspects of the language preclude certain GCs. Java as a language is much more conducive to more advanced GC designs.

-Steve
May 24 2018
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 24 May 2018 at 13:13:03 UTC, Steven Schveighoffer 
wrote:
 Really though, the issues with D's GC are partly attributable to 
 the language itself rather than to the GC design. Certain aspects 
 of the language preclude certain GCs. Java as a language is much 
 more conducive to more advanced GC designs.
I'm hoping for a tough long-term deprecation process that alleviates these issues, even though they will cause big breakage. I believe it will be worth it.
May 24 2018