www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.announce - D Concurrent GC

reply Leandro Lucarella <luca llucax.com.ar> writes:
Not quite ready for prime-time yet, but I think it's in a stage when it
could be interesting anyway (at least for developers or people that want
to experiment):
http://llucax.com.ar/blog/blog/post/-1a4bdfba


If you like history, you can read all the related posts in chronological
order here:
http://llucax.com.ar/blog/blog/tag/dgc?sort=+date

If you only care about concurrency, you probably want to read just this:
http://llucax.com.ar/blog/blog/tag/cdgc?sort=+date

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Desde chiquito quería ser doctor
Pero después me enfermé y me hice músico
Sep 09 2010
next sibling parent reply Bernard Helyer <b.helyer gmail.com> writes:
Very nice. I've been reading your posts on this with interest.

How much work would be involved in porting this to druntime?
Sep 09 2010
parent reply Leandro Lucarella <luca llucax.com.ar> writes:
Bernard Helyer, el 10 de septiembre a las 04:49 me escribiste:
 Very nice. I've been reading your posts on this with interest.
 
 How much work would be involved in porting this to druntime?
Is hard to tell since I didn't followed druntime development very closely lately, but both are based on the same code, so probably not too much work should be involved. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- In a world without fences, who need gates?
Sep 10 2010
parent reply Leandro Lucarella <luca llucax.com.ar> writes:
Content-Disposition: inline

Leandro Lucarella, el 10 de septiembre a las 09:26 me escribiste:
 Bernard Helyer, el 10 de septiembre a las 04:49 me escribiste:
 Very nice. I've been reading your posts on this with interest.
 
 How much work would be involved in porting this to druntime?
Is hard to tell since I didn't followed druntime development very closely lately, but both are based on the same code, so probably not too much work should be involved.
BTW, I just saw there were quite a few replies to this threads while my internet connection was down for a couple of weeks (just after I posted this), that I didn't received. I'm sorry about the silence, I'll try to reply the things that might still be of interest, and I'll take this opportunity to say that CDGC is "finished", see the blog post for (a few) more details: http://llucax.com.ar/blog/blog/post/-7454ab1d Now the replies. David Simcha asked:
 Been meaning to ask, what does this GC do with regard to precise
 scanning?  Is it substantially less conservative than the current D2
 GC?
Yes, if the compiler support is there. It uses the patch to DMD posted by wm4 to scan the heap precisely. See bug 3463 for details: http://d.puremagic.com/issues/show_bug.cgi?id=3463 Andrei Alexandrescu said:
 There's a discussion on digitalmars.D about a D program being slower
 than the equivalent Python script because of the GC. Would be a good
 test bed!

 http://rounin.livejournal.com/21815.html
I'll try it when I have some time. I wouldn't expect much improvement because it only uses AAs that doesn't get scanned precisely yet (IIRC from wm4 patch), but there are a few optimizations that could help with dynamic arrays (which are also used frequently). We'll see :) OTOH, Dil does a lot of string processing using too AAs and dynamic arrays (AFAIK) and the improvement is huge using CGGC :). As a sample, attached are the results for a Dil run generating all the Tango docs, using 1 CPU (using more CPUs is a little faster), for the Tango basic GC (TBGC) and several configurations of the CDGC. The file named "time" measures the total run time (i.e. the throughput), the one named "stw" measures the maximum stop-the-world pause (i.e. the time no thread can run) and the file named "pause" measures the maximum time the GC will actually block the program (i.e. the maximum real latency). All in seconds, for 50 runs (the run time) and 20 runs (the pause time), minimum is in black, maximum in white, and the mean is centered between 2 std deviations (in grey). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Hey you, out there on your own Sitting naked by the phone Would you touch me?
Oct 07 2010
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 08 Oct 2010 04:25:30 +0400, Leandro Lucarella <luca llucax.com.ar>  
wrote:

 Leandro Lucarella, el 10 de septiembre a las 09:26 me escribiste:
 Bernard Helyer, el 10 de septiembre a las 04:49 me escribiste:
 Very nice. I've been reading your posts on this with interest.

 How much work would be involved in porting this to druntime?
Is hard to tell since I didn't followed druntime development very closely lately, but both are based on the same code, so probably not too much work should be involved.
BTW, I just saw there were quite a few replies to this threads while my internet connection was down for a couple of weeks (just after I posted this), that I didn't received. I'm sorry about the silence, I'll try to reply the things that might still be of interest, and I'll take this opportunity to say that CDGC is "finished", see the blog post for (a few) more details: http://llucax.com.ar/blog/blog/post/-7454ab1d Now the replies. David Simcha asked:
 Been meaning to ask, what does this GC do with regard to precise
 scanning?  Is it substantially less conservative than the current D2
 GC?
Yes, if the compiler support is there. It uses the patch to DMD posted by wm4 to scan the heap precisely. See bug 3463 for details: http://d.puremagic.com/issues/show_bug.cgi?id=3463 Andrei Alexandrescu said:
 There's a discussion on digitalmars.D about a D program being slower
 than the equivalent Python script because of the GC. Would be a good
 test bed!

 http://rounin.livejournal.com/21815.html
I'll try it when I have some time. I wouldn't expect much improvement because it only uses AAs that doesn't get scanned precisely yet (IIRC from wm4 patch), but there are a few optimizations that could help with dynamic arrays (which are also used frequently). We'll see :) OTOH, Dil does a lot of string processing using too AAs and dynamic arrays (AFAIK) and the improvement is huge using CGGC :). As a sample, attached are the results for a Dil run generating all the Tango docs, using 1 CPU (using more CPUs is a little faster), for the Tango basic GC (TBGC) and several configurations of the CDGC. The file named "time" measures the total run time (i.e. the throughput), the one named "stw" measures the maximum stop-the-world pause (i.e. the time no thread can run) and the file named "pause" measures the maximum time the GC will actually block the program (i.e. the maximum real latency). All in seconds, for 50 runs (the run time) and 20 runs (the pause time), minimum is in black, maximum in white, and the mean is centered between 2 std deviations (in grey).
Hi, Leandro! I tried using your GC under D2/Windows, and unfortunately it crashes with Access Violation (I used a version modified by Sean as a starting point with little changes to fix compilation error). Are there any plans supporting it? I'm willing to help with testing it as much as I can. Thanks!
Oct 07 2010
parent reply Leandro Lucarella <luca llucax.com.ar> writes:
Denis Koroskin, el  8 de octubre a las 05:14 me escribiste:
 I tried using your GC under D2/Windows, and unfortunately it crashes
 with Access Violation (I used a version modified by Sean as a
 starting point with little changes to fix compilation error).
 
 Are there any plans supporting it? I'm willing to help with testing
 it as much as I can.
As I told to Sean, I tried my best to not break Windows support very much, but I don't have a Windows environment to test with (and to very interested on creating one to be honest, sorry :S). But I am very interested on accepting patches to make Windows work (and improvements). Testing the GC on Windows might not be very tempting though, as the main improvement is based on the fork(2) system call, which is not available on Windows, so all the fun options will be disabled there. The only thing that might worth trying in Windows is the precise scanning (well, there are some other minor optimizations that proved useful). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Yo soy Peperino, mártir latino, venid al asado pero traed el vino. -- Peperino Pómoro
Oct 07 2010
parent Leandro Lucarella <luca llucax.com.ar> writes:
Leandro Lucarella, el  8 de octubre a las 01:44 me escribiste:
 Denis Koroskin, el  8 de octubre a las 05:14 me escribiste:
 I tried using your GC under D2/Windows, and unfortunately it crashes
 with Access Violation (I used a version modified by Sean as a
 starting point with little changes to fix compilation error).
 
 Are there any plans supporting it? I'm willing to help with testing
 it as much as I can.
As I told to Sean, I tried my best to not break Windows support very much, but I don't have a Windows environment to test with (and to very interested on creating one to be honest, sorry :S). But I am very interested on accepting patches to make Windows work (and improvements). Testing the GC on Windows might not be very tempting though, as the main improvement is based on the fork(2) system call, which is not available on Windows, so all the fun options will be disabled there. The only thing that might worth trying in Windows is the precise scanning (well, there are some other minor optimizations that proved useful).
BTW, I've put up a small (Linux/POSIX) HOWTO to easily try CDGC. People interested on trying CDGC might find it useful: http://llucax.com.ar/blog/blog/post/-4d1fefb4 -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Cuando el Mártir estaba siendo perseguido y aglutinado por los citronetos, aquellos perversos que pretendian, en su maldad, piononizar las enseñanzas de Peperino. -- Peperino Pómoro
Oct 10 2010
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag. I'm using OSX (ie. no usable debug info) but near as I can tell the issue is: private T locked(T, alias Code)() { if (thread_needLock()) synchronized (gc.lock) return Code(); else return Code(); } void* gc_malloc(size_t size, uint attrs = 0) { if (size == 0) return null; return locked!(void*, () { assert (Invariant()); scope (exit) assert (Invariant()); return malloc(size, attrs, null); })(); } In the code above, it appears that the anonymous delegate being passed as an alias to locked() is having its stack data dynamically allocated (ie. as a dynamic closure). For normal delegate calls this can be avoided by accepting "scope delegate" as the function parameter, but I haven't found an analog when using the alias approach. Obviously, what happens is that a call to gc_malloc() ends up needing GCed memory, so gc_malloc() is recursively called, and on until the stack explodes. I'll see if I can come up with a workaround that continues using the alias template parameter.
Sep 14 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 14 Sep 2010 19:09:18 -0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag. I'm using OSX (ie. no usable debug info) but near as I can tell the issue is: private T locked(T, alias Code)() { if (thread_needLock()) synchronized (gc.lock) return Code(); else return Code(); } void* gc_malloc(size_t size, uint attrs = 0) { if (size == 0) return null; return locked!(void*, () { assert (Invariant()); scope (exit) assert (Invariant()); return malloc(size, attrs, null); })(); } In the code above, it appears that the anonymous delegate being passed as an alias to locked() is having its stack data dynamically allocated (ie. as a dynamic closure). For normal delegate calls this can be avoided by accepting "scope delegate" as the function parameter, but I haven't found an analog when using the alias approach. Obviously, what happens is that a call to gc_malloc() ends up needing GCed memory, so gc_malloc() is recursively called, and on until the stack explodes. I'll see if I can come up with a workaround that continues using the alias template parameter.
What if you passed it through a scope delegate function? i.e. void *lockedproxy(scope void* delegate() dg) { return locked!(void *, dg); } -Steve
Sep 15 2010
parent reply Sean Kelly <sean invisibleduck.org> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:
 On Tue, 14 Sep 2010 19:09:18 -0400, Sean Kelly
 <sean invisibleduck.org>  wrote:
 
 Leandro Lucarella Wrote:
 
 Not quite ready for prime-time yet, but I think it's in a stage when
 it
 could be interesting anyway (at least for developers or people that
 want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 > but have ran into a snag. I'm using OSX (ie. no usable debug info) but > near as I can tell the issue is: private T locked(T, alias Code)() { if (thread_needLock()) synchronized (gc.lock) return Code(); else return Code(); } void* gc_malloc(size_t size, uint attrs = 0) { if (size == 0) return null; return locked!(void*, () { assert (Invariant()); scope (exit) assert (Invariant()); return malloc(size, attrs, null); })(); } In the code above, it appears that the anonymous delegate being passed > as an alias to locked() is having its stack data dynamically allocated > (ie. as a dynamic closure). For normal delegate calls this can be > avoided by accepting "scope delegate" as the function parameter, but I > haven't found an analog when using the alias approach. Obviously, what > happens is that a call to gc_malloc() ends up needing GCed memory, so > gc_malloc() is recursively called, and on until the stack explodes. > I'll see if I can come up with a workaround that continues using the > alias template parameter.
What if you passed it through a scope delegate function? i.e. void *lockedproxy(scope void* delegate() dg) { return locked!(void *, dg); }
I could simply change it to a function that accepts a scope delegate. It just isn't as efficient as the alias approach. But an indirect call is pretty small potatoes in this case anyway. I tried this and it worked, and the I ran into an issue where the object used for the lock (classinfo for GCLock) was apparently null. I'll look into that today.
Sep 15 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 15 Sep 2010 10:26:35 -0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:
 On Tue, 14 Sep 2010 19:09:18 -0400, Sean Kelly
 <sean invisibleduck.org>  wrote:

 Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage when
 it
 could be interesting anyway (at least for developers or people that
 want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 > but have ran into a snag. I'm using OSX (ie. no usable debug info) but > near as I can tell the issue is: private T locked(T, alias Code)() { if (thread_needLock()) synchronized (gc.lock) return Code(); else return Code(); } void* gc_malloc(size_t size, uint attrs = 0) { if (size == 0) return null; return locked!(void*, () { assert (Invariant()); scope (exit) assert (Invariant()); return malloc(size, attrs, null); })(); } In the code above, it appears that the anonymous delegate being passed > as an alias to locked() is having its stack data dynamically allocated > (ie. as a dynamic closure). For normal delegate calls this can be > avoided by accepting "scope delegate" as the function parameter, but I > haven't found an analog when using the alias approach. Obviously, what > happens is that a call to gc_malloc() ends up needing GCed memory, so > gc_malloc() is recursively called, and on until the stack explodes. > I'll see if I can come up with a workaround that continues using the > alias template parameter.
What if you passed it through a scope delegate function? i.e. void *lockedproxy(scope void* delegate() dg) { return locked!(void *, dg); }
I could simply change it to a function that accepts a scope delegate. It just isn't as efficient as the alias approach. But an indirect call is pretty small potatoes in this case anyway. I tried this and it worked, and the I ran into an issue where the object used for the lock (classinfo for GCLock) was apparently null. I'll look into that today.
Well, the alias version could be used for other callable types, whereas a delegate version can only be used for delegates. Not sure how important it is. -Steve
Sep 15 2010
parent Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:

 On Wed, 15 Sep 2010 10:26:35 -0400, Sean Kelly <sean invisibleduck.org>  
 wrote:
 
 I could simply change it to a function that accepts a scope delegate. It
 just isn't as efficient as the alias approach.  But an indirect call is
 pretty small potatoes in this case anyway. I tried this and it worked,
 and the I ran into an issue where the object used for the lock
 (classinfo for GCLock) was apparently null. I'll look into that today.
Well, the alias version could be used for other callable types, whereas a delegate version can only be used for delegates. Not sure how important it is.
I think there needs to be a solution for this in the general case, but for this specific application it doesn't matter.
Sep 15 2010
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Sean Kelly Wrote:

 Leandro Lucarella Wrote:
 
 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Sep 15 2010
parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Sean Kelly wrote:

 Sean Kelly Wrote:
 
 Leandro Lucarella Wrote:
 
 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Are you going to integrate it in druntime?
Sep 15 2010
parent reply Sean Kelly <sean invisibleduck.org> writes:
Lutger Wrote:

 Sean Kelly wrote:
 
 Sean Kelly Wrote:
 
 Leandro Lucarella Wrote:
 
 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Are you going to integrate it in druntime?
Dunno. For now, I've just sent my modified copy to Leandro and the druntime list for people to play with. It seems quite promising though.
Sep 15 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 Lutger Wrote:
 Sean Kelly wrote:

 Sean Kelly Wrote:

 Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Are you going to integrate it in druntime?
Dunno. For now, I've just sent my modified copy to Leandro and the druntime
list for people to play with. It seems quite promising though. Been meaning to ask, what does this GC do with regard to precise scanning? Is it substantially less conservative than the current D2 GC?
Sep 15 2010
parent Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:

 == Quote from Sean Kelly (sean invisibleduck.org)'s article

 Dunno.  For now, I've just sent my modified copy to Leandro and the druntime
list for people to play with. It seems quite promising though. Been meaning to ask, what does this GC do with regard to precise scanning? Is it substantially less conservative than the current D2 GC?
Its gc_malloc, etc, accept a PointerMap type which I assume is used for precise scanning, but I couldn't find a definition of PointerMap in the GC code or other obvious locations so I disabled it for now. I assume it's from the precise scanning patch and that I can get whatever else is needed from there. For now, I just wanted a functional port to test.
Sep 16 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 09/15/2010 07:00 PM, Sean Kelly wrote:
 Lutger Wrote:

 Sean Kelly wrote:

 Sean Kelly Wrote:

 Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage when it
 could be interesting anyway (at least for developers or people that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Are you going to integrate it in druntime?
Dunno. For now, I've just sent my modified copy to Leandro and the druntime list for people to play with. It seems quite promising though.
There's a discussion on digitalmars.D about a D program being slower than the equivalent Python script because of the GC. Would be a good test bed! Andrei
Sep 15 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/15/10 22:22 CDT, Andrei Alexandrescu wrote:
 On 09/15/2010 07:00 PM, Sean Kelly wrote:
 Lutger Wrote:

 Sean Kelly wrote:

 Sean Kelly Wrote:

 Leandro Lucarella Wrote:

 Not quite ready for prime-time yet, but I think it's in a stage
 when it
 could be interesting anyway (at least for developers or people
 that want
 to experiment):
 http://llucax.com.ar/blog/blog/post/-1a4bdfba
Nice work! I've gotten this to compile as the GC for druntime using D2 but have ran into a snag.
Okay, all fixed. Porting the GC to D2 was more work than porting it to druntime specifically. There are still a few hacks in place to work around const issues and I faked the PointerMap stuff, but the GC seems to work just great. I'll have to find a reasonable performance test for comparison. Oh, I'll send you the modified code as well.
Are you going to integrate it in druntime?
Dunno. For now, I've just sent my modified copy to Leandro and the druntime list for people to play with. It seems quite promising though.
There's a discussion on digitalmars.D about a D program being slower than the equivalent Python script because of the GC. Would be a good test bed!
This is what I was referring to. http://rounin.livejournal.com/21815.html The author makes quite a few good points comparing D and Python in expressiveness and performance. Andrei
Sep 16 2010
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:
 
 There's a discussion on digitalmars.D about a D program being slower 
 than the equivalent Python script because of the GC. Would be a good 
 test bed!
I was playing with that yesterday, but there was too much setup required for a quick test (it compared an MD5 has of a directory tree to the current directory tree). I'll revisit it though, since it seems a good starting point.
Sep 16 2010