www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - GC Blacklisting

reply dsimcha <dsimcha yahoo.com> writes:
I've been optimizing the druntime garbage collector lately, and as I've 
found several smaller optimizations since I submitted my patch last 
week, I've temporarily forked the druntime Git repository to store those 
optimizations.  The results are here:

https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork

In the process, I've built up such a good mental model of the GC 
codebase that I'd hate to see it go to waste.

On 32-bit systems, false pointers are a big problem with conservative 
garbage collection.  Apparently this problem can be drastically 
mitigated by blacklisting memory addresses that have false pointers 
pointing at them.  (See 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.96
&rep=rep1&type=pdf) 
  I'm wondering if such a technique is worth implementing in the D 
garbage collector.

Reasons why it may be worth doing:

1.  It would drastically mitigate false pointers.

2.  For the most part, it wouldn't be too hard to implement, now that I 
have a good mental model of the GC.

Reasons why it may not be worth doing:

1.  It would create some (probably small) additional overhead in the GC.

2.  IIUC false pointers are, for all practical purposes, not an issue on 
64 bit because the address space is so sparse.  Even on a 16 gigabyte 
heap, a random 64-bit pattern only has about a 1 in a billion 
probability of pointing to a valid heap address.  32 bit is a crufty 
legacy technology that's on its way out (rapidly now that DMD's 64-bit 
codegen works).  64-bit is the future.

3.  If the precise heap scanning patch that's been in Bugzilla for a 
while ever gets in, the false pointer issue might become mostly moot.

4.  Long-term, when/if D becomes a language successful enough to have 
serious resources thrown at it, the GC will probably be rewritten by GC 
experts working full time on it, rather than a few jack-of-all-trades 
programmers who don't specialize in GC.

5.  Figuring out a good heuristic for performing very large allocations 
in the presence of blacklisting would be rather difficult.  Do you keep 
asking the OS for more memory to avoid allocating a blacklisted page no 
matter what?  If not, where/how do you draw the line?

Please comment on whether you think this is worth doing.
Feb 24 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 25 Feb 2011 06:38:52 +0200, dsimcha <dsimcha yahoo.com> wrote:

 I've been optimizing the druntime garbage collector lately, and as I've  
 found several smaller optimizations since I submitted my patch last  
 week, I've temporarily forked the druntime Git repository to store those  
 optimizations.  The results are here:

 https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork

Good work!
 In the process, I've built up such a good mental model of the GC  
 codebase that I'd hate to see it go to waste.

 On 32-bit systems, false pointers are a big problem with conservative  
 garbage collection.  Apparently this problem can be drastically  
 mitigated by blacklisting memory addresses that have false pointers  
 pointing at them.  (See  
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.96
&rep=rep1&type=pdf)  
   I'm wondering if such a technique is worth implementing in the D  
 garbage collector.

Blacklisting deals with cases when data is allocated at referenced addresses, but seems useless when false references are created to already-existing data, which seems to me like the more common case. Furthermore, if the false pointers have a high-enough entropy (and there are enough of them), they will cover the address space densely and consistently enough that blacklisting becomes almost useless. Not to mention the inherent waste of memory associated with this? I have very successfully dealt with false pointers to large objects in the following way: https://github.com/CyberShadow/data.d This method also has other advantages (as mentioned). Personally, I would prefer that D's GC remains fast and lean, optimized for scanning small heaps of small objects, and a method similar to mine be used for handling large data. P.S. I'm currently in the process of tracking down a memory corruption bug, which *might* result in a GC patch for D1. I'm also instinctively worried that touching the GC code may introduce new memory corruption bugs, which can be EXTREMELY hard to find. I've been chasing this one for 4 years. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Feb 24 2011
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Vladimir Panteleev (vladimir thecybershadow.net)'s article
 P.S. I'm currently in the process of tracking down a memory corruption
 bug, which *might* result in a GC patch for D1. I'm also instinctively
 worried that touching the GC code may introduce new memory corruption
 bugs, which can be EXTREMELY hard to find. I've been chasing this one for
 4 years.

I doubt it's a GC bug. If it's not a bug in your code, I'd be more inclined to assume it's a codegen bug, simply because the codegen is much larger and more complex, and there are more opportunities for weird bugs that can only be reproduced under very specific circumstances to creep in. Once you get past the superficial cruftiness and unreadability of the codebase and get a good conceptual model of it, D's GC is actually pretty simple. Also, I've been testing my patches by using the Phobos, std.parallelism/parallelfuture, and dstats unittests, and by simply eating my own dogfood (i.e. using my modified GC's when running some simulations and stuff). So far, so good. Unfortunately, we don't have a specific GC test suite, but IMHO if it works on this much real-world code, it's unlikely that I've created any bugs.
Feb 25 2011
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Vladimir Panteleev (vladimir thecybershadow.net)'s article
 How can you be so sure this is enough? The particular manifestation of the
 bug I was examining crashed my application 5 hours in, because the GC
 attempted to traverse a free list which had ASCII in it because the item
 had been allocated but it occured in the free list twice (so the first
 instance was used by the app to store text), because a freed (GC'd) object
 was manually deleted again when an element was removed from an associated
 array, and it was freed initially because the GC never reached it, because
 its "parent" was marked as NOSCAN, because the GC relies on NOSCAN being
 cleared on freed objects, and allocating in a destructor called during a
 GC breaks that assumption (and messes things up generally).
 Are you at least running your tests with the GC debug options enabled
 (such as MEMSTOMP)? I hope your patches don't break them, either.
 In case you missed my other reply, what I was aiming at is that something
 must be done when allocating from destructors. It must either reliably
 work or immediately fail, and definitely not corrupt the GC's state.
 Phobos allocates in destructors in a few places as well (std.zlib being
 one).

Ok, I didn't realize how deep into debugging this you had gotten. Thanks for your work. BTW, of course it's (practically) impossible to **prove** that the GC is correct and that there aren't any subtle corner case bugs, but the point is that testing on lots of real world code with completely different allocation patterns helps a ton. Ironically, for code that has lots of possible states and codepaths, "regular" unittests don't work well because it's virtually impossible to think of all the possible cases that real world code brings to the table. Also, the GC is currently (pre-patches) so slow that it's practically unusable on large (multi-gigabyte) heaps, to a large extent defeating the purpose of 64-bit support. This is, IMHO, a **severe** bug in itself, and is worth taking some risks to fix.
Feb 25 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 25 Feb 2011 06:38:52 +0200, dsimcha <dsimcha yahoo.com> wrote:

 In the process, I've built up such a good mental model of the GC  
 codebase that I'd hate to see it go to waste.

Here's a question for you. What happens when a class destructor, called by the GC during a collection, allocates? (Hint: nothing good.) -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Feb 24 2011
prev sibling next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 25.02.2011 05:38, schrieb dsimcha:
 I've been optimizing the druntime garbage collector lately, and as I've
 found several smaller optimizations since I submitted my patch last
 week, I've temporarily forked the druntime Git repository to store those
 optimizations.  The results are here:

 https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork

how do you test your optimization - is there a huge (growing) GC-benchmark tests base? (something like the compiler regression tests but for the GC) - is there something like this for java/c# or the others available - can we use these as a testbed?
Feb 25 2011
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
One thing that would help, would be if it could be possible to plug into
D's GC like in other systems.

I mean the tools that allow to plugin into an existing application that 
monitor
runtime activity.

--
Paulo

"dennis luehring" <dl.soluz gmx.net> wrote in message 
news:ik7o0v$2gcd$1 digitalmars.com...
 Am 25.02.2011 05:38, schrieb dsimcha:
 I've been optimizing the druntime garbage collector lately, and as I've
 found several smaller optimizations since I submitted my patch last
 week, I've temporarily forked the druntime Git repository to store those
 optimizations.  The results are here:

 https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork

how do you test your optimization - is there a huge (growing) GC-benchmark tests base? (something like the compiler regression tests but for the GC) - is there something like this for java/c# or the others available - can we use these as a testbed?

Feb 25 2011
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Paulo Pinto (pjmlp progtools.org)'s article
 One thing that would help, would be if it could be possible to plug into
 D's GC like in other systems.
 I mean the tools that allow to plugin into an existing application that
 monitor
 runtime activity.

Can you be a little more specific about what you're asking for? One thing that might be nice is reviving GCStats, which allowed getting some basic statistics about GC performance, but is not exposed and appears to have succumbed to severe bit rot. Other than that, I have no idea what you mean.
Feb 25 2011
parent "Paulo Pinto" <pjmlp progtools.org> writes:
Hi,

I mean something like

http://visualvm.java.net/features.html
http://download.oracle.com/javase/1.5.0/docs/tooldocs/share/jstat.html
http://www.slideshare.net/guest62fd60c/eclipse-memory-analyzer-presentation

In Java's case, or in .Net's

Windows Performance Counters
http://support.microsoft.com/kb/316365
http://www.jetbrains.com/profiler/

In Go's case
http://golang.org/pkg/runtime/pprof/
http://code.google.com/p/google-perftools/

In some Unix systems you have now DTrace (maybe this one works also with D)
http://netbeans.org/kb/docs/ide/NetBeans_DTrace_GUI_Plugin_0_4.html

Or something like  MemoryScape
http://www.roguewave.com/products/totalview-family/memoryscape.aspx

Or even NightStart which also supports Ada, beside C and C++
http://real-time.ccur.com/Libraries/docs_pdf/NightStarTools.pdf

--
Paulo

"dsimcha" <dsimcha yahoo.com> wrote in message 
news:ik8l6v$16pk$1 digitalmars.com...
 == Quote from Paulo Pinto (pjmlp progtools.org)'s article
 One thing that would help, would be if it could be possible to plug into
 D's GC like in other systems.
 I mean the tools that allow to plugin into an existing application that
 monitor
 runtime activity.

Can you be a little more specific about what you're asking for? One thing that might be nice is reviving GCStats, which allowed getting some basic statistics about GC performance, but is not exposed and appears to have succumbed to severe bit rot. Other than that, I have no idea what you mean.

Feb 25 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 25 Feb 2011 18:30:36 +0200, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Vladimir Panteleev (vladimir thecybershadow.net)'s article
 P.S. I'm currently in the process of tracking down a memory corruption
 bug, which *might* result in a GC patch for D1. I'm also instinctively
 worried that touching the GC code may introduce new memory corruption
 bugs, which can be EXTREMELY hard to find. I've been chasing this one  
 for
 4 years.

I doubt it's a GC bug. If it's not a bug in your code, I'd be more inclined to assume it's a codegen bug, simply because the codegen is much larger and more complex, and there are more opportunities for weird bugs that can only be reproduced under very specific circumstances to creep in. Once you get past the superficial cruftiness and unreadability of the codebase and get a good conceptual model of it, D's GC is actually pretty simple.

That's what I've been telling myself for the past few years as well. (I've written patches and a memory debugger for D and even attempted writing my own GCs, so I'm no stranger to D's GC.)
 Also, I've been testing my patches by using the Phobos,
 std.parallelism/parallelfuture, and dstats unittests, and by simply  
 eating my own
 dogfood (i.e. using my modified GC's when running some simulations and  
 stuff).  So
 far, so good.  Unfortunately, we don't have a specific GC test suite,  
 but IMHO if
 it works on this much real-world code, it's unlikely that I've created  
 any bugs.

How can you be so sure this is enough? The particular manifestation of the bug I was examining crashed my application 5 hours in, because the GC attempted to traverse a free list which had ASCII in it because the item had been allocated but it occured in the free list twice (so the first instance was used by the app to store text), because a freed (GC'd) object was manually deleted again when an element was removed from an associated array, and it was freed initially because the GC never reached it, because its "parent" was marked as NOSCAN, because the GC relies on NOSCAN being cleared on freed objects, and allocating in a destructor called during a GC breaks that assumption (and messes things up generally). Are you at least running your tests with the GC debug options enabled (such as MEMSTOMP)? I hope your patches don't break them, either. In case you missed my other reply, what I was aiming at is that something must be done when allocating from destructors. It must either reliably work or immediately fail, and definitely not corrupt the GC's state. Phobos allocates in destructors in a few places as well (std.zlib being one). -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Feb 25 2011