
digitalmars.D - GC performance: collection frequency

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
Over in the d.learn forum, somebody posted a question about poor
performance in a text-parsing program. After a bit of profiling I
discovered that reducing GC collection frequency (i.e., GC.disable()
then manually call GC.collect() at some interval) improved program
performance by about 20%.
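The workaround looks roughly like this (the interval and the parsing
work here are placeholders, not the actual code from that thread):

import core.memory : GC;
import std.stdio : File;

void main()
{
    GC.disable();               // suppress automatic collections
    scope(exit) GC.enable();    // restore normal behaviour when done

    size_t n;
    foreach (line; File("input.txt").byLine)
    {
        auto copy = line.idup;  // one allocation per line, as a text parser would
        // ... parse `copy` ...

        if (++n % 100_000 == 0)
            GC.collect();       // collect manually at a coarse, fixed interval
    }
}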

This isn't the first time I encountered this.  Some time ago (late last
year IIRC) I found that in one of my own CPU-intensive programs,
manually scheduling GC collection cycles won me about 30-40% performance
improvement.

While two data points is hardly statistically significant, these two do
seem to suggest that perhaps part of the GC's perceived poor performance
may stem from an overly-zealous collection schedule.

Since asking users to implement their own GC collection schedule can be
a bit onerous (not to mention greatly uglifying user code), would it be
a good idea to make the GC collection schedule configurable?  At least
that way, people can just call GC.collectSchedule(/*some value*/) as a
first stab at improving overall performance, without needing to rewrite
a whole bunch of code to avoid the GC, or go all-out @nogc.

We could also reduce the default collection frequency, of course, but
lacking sufficient data I wouldn't know what value to set it to.


T

-- 
Computers shouldn't beep through the keyhole.
Sep 14 2015
Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 14 September 2015 at 18:51:36 UTC, H. S. Teoh wrote:
 We could also reduce the default collection frequency, of 
 course, but lacking sufficient data I wouldn't know what value 
 to set it to.
Definitely. I think it hits a case where it is right at the edge of the
line and you are allocating a small amount.

So it is like the limit is 1,000 bytes. You are at 980 and ask it to
allocate 30. So it runs a collection cycle, frees the 30 from the
previous loop iteration, then allocates it again... so the whole loop,
it is on the edge and runs very often. Of course, it has to scan
everything to ensure it is safe to free those 30 bytes, so the GC then
runs way out of proportion.

Maybe we can make the GC detect this somehow and bump up the size. I
don't actually know the implementation that well, though.
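To make the pattern concrete, a toy loop (purely illustrative numbers,
not the program from d.learn):

void main()
{
    foreach (i; 0 .. 1_000_000)
    {
        // Each iteration allocates a small, short-lived buffer. Once the heap
        // sits right at the collection threshold, each of these allocations
        // can trigger a full collection that only frees the previous
        // iteration's buffer, so the whole heap gets rescanned every time.
        ubyte[] buf = new ubyte[](30);
        buf[0] = cast(ubyte) i;   // touch it so the allocation isn't dead code
    }
    // Running with "--DRT-gcopt=profile:1" (2.067+) prints a GC summary at
    // exit, which shows how many collections a loop like this triggers.
}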
Sep 14 2015
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, 14 September 2015 at 18:58:45 UTC, Adam D. Ruppe wrote:
 On Monday, 14 September 2015 at 18:51:36 UTC, H. S. Teoh wrote:
 We could also reduce the default collection frequency, of 
 course, but lacking sufficient data I wouldn't know what value 
 to set it to.
 Definitely. I think it hits a case where it is right at the edge of the
 line and you are allocating a small amount.

 So it is like the limit is 1,000 bytes. You are at 980 and ask it to
 allocate 30. So it runs a collection cycle, frees the 30 from the
 previous loop iteration, then allocates it again... so the whole loop,
 it is on the edge and runs very often. Of course, it has to scan
 everything to ensure it is safe to free those 30 bytes, so the GC then
 runs way out of proportion.

 Maybe we can make the GC detect this somehow and bump up the size. I
 don't actually know the implementation that well, though.
My first inclination would be to make it just allocate more memory and
not run a collection if the last collection was too recent, but there
are bound to be papers and studies on this sort of thing already.

The exact strategy to use likely also depends heavily on the type of GC.
E.g. if our GC were updated to be concurrent, like we've talked about
for a while now, then triggering a concurrent collection at 80% could
keep the program from actually running out of memory while still not
slowing it down much (just long enough to fork for the concurrent
collection), whereas with the non-concurrent GC we have now, triggering
at 80% would just make things worse.

- Jonathan M Davis
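A rough sketch of that first inclination, just to make it concrete (the
names and the interval are invented; this is not how druntime's GC is
actually structured):

import core.time : Duration, MonoTime, msecs;

// Illustrative policy: skip a collection (and grow the heap instead) when
// the previous collection happened too recently.
struct CollectionThrottle
{
    Duration minInterval = 100.msecs;  // made-up threshold
    MonoTime lastCollection;

    // true = run a collection now; false = just allocate more memory instead
    bool shouldCollect()
    {
        immutable now = MonoTime.currTime;
        if (lastCollection != MonoTime.init && now - lastCollection < minInterval)
            return false;
        lastCollection = now;
        return true;
    }
}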
Sep 14 2015
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, 14 September 2015 at 18:51:36 UTC, H. S. Teoh wrote:
 Over in the d.learn forum, somebody posted a question about poor
 performance in a text-parsing program. After a bit of profiling 
 I
 discovered that reducing GC collection frequency (i.e., 
 GC.disable()
 then manually call GC.collect() at some interval) improved 
 program
 performance by about 20%.

 This isn't the first time I encountered this.  Some time ago 
 (late last year IIRC) I found that in one of my own 
 CPU-intensive programs, manually scheduling GC collection 
 cycles won me about 30-40% performance improvement.

 While two data points is hardly statistically significant, 
 these two do seem to suggest that perhaps part of the GC's 
 perceived poor performance may stem from an overly-zealous 
 collection schedule.

 Since asking users to implement their own GC collection 
 schedule can be a bit onerous (not to mention greatly uglifying 
 user code), would it be a good idea to make the GC collection 
 schedule configurable?  At least that way, people can just call 
 GC.collectSchedule(/*some value*/) as a first stab at improving 
 overall performance, without needing to rewrite a whole bunch 
 of code to avoid the GC, or go all-out @nogc.

 We could also reduce the default collection frequency, of 
 course, but lacking sufficient data I wouldn't know what value 
 to set it to.
Isn't there some amount of configuration that can currently be done via
environment variables? Or was that just something that someone had done
in one of the GC-related dconf talks that never made it into druntime
proper? It definitely seemed like a good idea in any case.

- Jonathan M Davis
Sep 14 2015
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Mon, Sep 14, 2015 at 07:19:53PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
[...]
 Isn't there some amount of configuration that can currently be done
 via environment variables? Or was that just something that someone had
 done in one of the GC-related dconf talks that never made it into
 druntime proper?  It definitely seemed like a good idea in any case.
[...]

If it's undocumented, it's as good as not existing as far as end users
are concerned. :-)  I didn't see anything mentioned in core.memory's
docs, nor in dlang.org's page on the GC, nor on the wiki's GC page.


T

-- 
Programming is not just an act of telling a computer what to do: it is
also an act of telling other programmers what you wished the computer to
do. Both are important, and the latter deserves care. -- Andrew Morton
Sep 14 2015
Daniel Kozák via Digitalmars-d writes:
http://dlang.org/changelog/2.067.0.html#gc-options
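For example (option names as listed in that changelog entry; the exact
set may differ between druntime versions):

// GC options can be passed on the command line:
//   ./app "--DRT-gcopt=minPoolSize:16 incPoolSize:8"
// or baked into the binary; druntime reads rt_options at startup.
extern(C) __gshared string[] rt_options = [
    "gcopt=minPoolSize:16 incPoolSize:8 maxPoolSize:128"  // sizes in MB
];

void main()
{
    // runs with the tuned GC pool sizes
}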

On Mon, 14 Sep 2015 12:25:06 -0700
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 On Mon, Sep 14, 2015 at 07:19:53PM +0000, Jonathan M Davis via
 Digitalmars-d wrote: [...]
 Isn't there some amount of configuration that can currently be done
 via environment variables? Or was that just something that someone
 had done in one of the GC-related dconf talks that never made it
 into druntime proper?  It definitely seemed like a good idea in any
 case.
 [...] If it's undocumented, it's as good as not existing as far as end
 users are concerned. :-) I didn't see anything mentioned in
 core.memory's docs, nor in dlang.org's page on the GC, nor on the
 wiki's GC page.

 T
Sep 14 2015
prev sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Sep 15, 2015 at 07:08:01AM +0200, Daniel Kozák via Digitalmars-d wrote:
 
 http://dlang.org/changelog/2.067.0.html#gc-options
[...]

Wow, that is obscure. This really needs to go into the main docs so that
it can actually be found...


T

-- 
People demand freedom of speech to make up for the freedom of thought
which they avoid. -- Soren Aabye Kierkegaard (1813-1855)
Sep 16 2015
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 14-Sep-2015 21:47, H. S. Teoh via Digitalmars-d wrote:
 Over in the d.learn forum, somebody posted a question about poor
 performance in a text-parsing program. After a bit of profiling I
 discovered that reducing GC collection frequency (i.e., GC.disable()
 then manually call GC.collect() at some interval) improved program
 performance by about 20%.
One thing that any remotely production-quality GC does is analyze the
result of a collection with respect to a minimum headroom of X %
(typically 30-50%): if the collection freed only Y % of the heap, where
Y < X, then the GC should extend the heap so that free space reaches the
X % mark in the extended heap.

-- 
Dmitry Olshansky
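In other words, something along these lines after each collection (a
rough sketch with invented names; not actual druntime internals):

// If a collection freed less than the target fraction of the heap, grow the
// heap until free space would reach that fraction of the new, larger heap.
enum double targetHeadroom = 0.4;   // aim for 30-50% free after a collection

size_t extraBytesNeeded(size_t heapSize, size_t freedBytes)
{
    immutable freedFraction = cast(double) freedBytes / heapSize;
    if (freedFraction >= targetHeadroom)
        return 0;   // enough headroom already, no need to extend

    // Solve (freedBytes + growth) / (heapSize + growth) == targetHeadroom:
    return cast(size_t)
        ((targetHeadroom * heapSize - freedBytes) / (1.0 - targetHeadroom));
}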
Sep 17 2015
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Sep 17, 2015 at 11:26:17AM +0300, Dmitry Olshansky via Digitalmars-d
wrote:
 On 14-Sep-2015 21:47, H. S. Teoh via Digitalmars-d wrote:
Over in the d.learn forum, somebody posted a question about poor
performance in a text-parsing program. After a bit of profiling I
discovered that reducing GC collection frequency (i.e., GC.disable()
then manually call GC.collect() at some interval) improved program
performance by about 20%.
 One thing that any remotely production-quality GC does is analyze the
 result of a collection with respect to a minimum headroom of X %
 (typically 30-50%): if the collection freed only Y % of the heap, where
 Y < X, then the GC should extend the heap so that free space reaches
 the X % mark in the extended heap.
[...]

Excellent idea. Sounds reasonably simple to implement, though I'm not
exactly familiar with the current GC, so I don't know if I'll be able to
implement this myself...


T

-- 
Gone Chopin. Bach in a minuet.
Sep 17 2015