www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 3931] New: Associative Arrays on repeated stress testing get progressively slower

http://d.puremagic.com/issues/show_bug.cgi?id=3931

           Summary: Associative Arrays on repeated stress testing get
                    progressively slower
           Product: D
           Version: 2.040
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: druntime
        AssignedTo: sean invisibleduck.org
        ReportedBy: y0uf00bar gmail.com


--- Comment #0 from Michael Rynn <y0uf00bar gmail.com> 2010-03-10 21:48:34 PST
---
This occurs in both D1 and D2. For the specific test code where I noticed this,
check out the aa project on dsource. I first noticed it on the original PyDict
code, and then also noted built-in AA have similar behaviour.

Looked at the Associative Arrays project on http://www.dsource.org/
projects/aa

Ran the test program for PyDict noted the average result for consecutive
runs of the one test it did, for  uint[unit]  associative arrays.

 This was very irritating, because of long wait for command line output.
The test.d default sizes and number of runs were very big and only put out
the average result at the end.

Changed test.d to output each individual run time. The time for each
individual run got progressively longer.

This was fixable by making the calloc block attribute be
GC.BlkAttr.NO_SCAN.  So maybe the randomized uint bits confuse the garbage
collector as it scans through.

In the PyDictD1 test code there was a call to disable the garbage
collector. Thats cheating.

I looked at the code for PyDictD2.d  and I decided that the table of
struct Entry,  holding the key and value, for which calloc is used , could
be replaced by a more D - like (its a template anyway) call to new
Entry[size].  The size is always a power of 2.

Is it better to use new Entry[] rather than calloc,  since the GC has a
map of which values are pointers and which are not?  Hopefully just like
real D code.

Ran the test again after removing the calloc, and the speed improved and
the progressive run time increase went away.

For comparison, I also put in a wrapper class template for the built in D
AA type, and did the same tests. For uint[unit] this also had a
progressive slow down in run-time.  Same issue found for the D1 builtin
AA.

So the built-in associative array type of D suffers from this disease.
(under very stressful array size parameters of course). It might affect
long running programs with big builtin AA tables.


The mystery is  why was the slowdown progressive.  After each run the AA
variable is deleted, and the GC gets told to delete the calloc array
directly. If the previous block of memory is properly gone, and a new
cleared block created, shouldn't the run results be the same each time?

So now the PyDict does not use calloc. and is faster than the built-in. I
added a few missing properties (length, opIndex)

The tests for string[uint] were re-instated.  The built in AA had some
progressive run slowdown tendency for this as well, although not as
marked.

A comment on the news group says its like a "false sharing" problem.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 10 2010