www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 3637] New: Array append patch to prevent stomping and to enhance thread-local append performance

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637

           Summary: Array append patch to prevent stomping and to enhance
                    thread-local append performance
           Product: D
           Version: future
          Platform: x86
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: druntime
        AssignedTo: sean invisibleduck.org
        ReportedBy: schveiguy yahoo.com



12:10:42 PST ---
Created an attachment (id=528)
Patch to prevent array stomping and increase append performance

The attached patch fixes 2 problems:

1. Array stomping.  A slice of an array can stomp on another array if the slice
starts at the beginning of the other array.  This patch prevents such problems
by storing the "allocated length" or the length of valid data at the end of the
array.

2. Array append previously required acquiring the GC lock, even when the array
can be extended in-place.  The patch makes use of an 8-element MRU (most
recently used) cache in thread-local storage that allows the runtime to avoid
taking the global lock to look up an array's allocated length.  This allows one
to append to up to 8 arrays per thread without taking the global lock unless an
actual re-allocation is required.  Shared array appending still requires a
global lock for appending, and does not have a cache associated with it. 
However, shared appending is still safe as far as stomping goes.

Note that with this patch, the following trick *no longer works*:

int[] bigarray = new int[10000];
bigarray.length = 0;
foreach(i; 0..10000)
   bigarray ~= i;

Because truncating the array this way can cause stomping, the first append will
reallocate bigarray and you will lose the benefit of preallocating the original
array.

A (yet unwritten) library function is required to preallocate an array.

To apply the patch, download the 2.037 distribution (this code was only tested
against the distribution), and apply the patch from the src/druntime directory:

cd dmd2/src/druntime
patch -p0 < append.patch

then build and re-install druntime and phobos libraries.

The patch was tested on Linux dmd version 2.037.  An earlier version of the
patch which was mostly the same was tested on Windows dmd 2.036 and Mac OSX
10.6 dmd 2.037.  The difference between the earlier and this version of the
patch is the handling of appending dchar to char[] and wchar[].

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 21 2009
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Leandro Lucarella <llucax gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

          mime type|                            |

              patch|                            |



PST ---
(From update of attachment 528)
Fix the MIME type and mark the attachment as a patch.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 22 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Leandro Lucarella <llucax gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |llucax gmail.com



PST ---
It is really necessary change the GC API to fix this? I think it's a really bad
idea to modify the GC API (add a function) just because dynamic arrays have
problems (I don't want to start over the discussion about dynamic arrays being
broken :).

If you absolutely need this new function: BlkInfo gc_malloc_bi(size_t sz, uint
ba = 0); then maybe it's better to change the existing regular gc_malloc() to:
void*  gc_malloc(size_t sz, uint ba = 0, BlkInfo bi = null);

The advantage is that it's API-backward compatible and doesn't introduce a new
function to the GC API and the downside is it's not binary-backward compatible,
but I don't think people care much about this in D.

The inability to pre-allocate can be a little bad too, but it was awkward
anyway, so providing a better way to do so can be a good thing after all.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 22 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637




20:19:09 PST ---
It's not *necessary* to change the API to fix this, but it is hugely
advantageous.  If you want to know the block size of the chunk you just
allocated, the current API requires *another* lock of the GC, and a search
through the pools.  All the info is there in malloc, it's just not returned.

BTW, just an additional size parameter would suffice (this is how gcx handles
the allocation).  Having it return a BlkInfo struct is convenient because that
is the data type I'm working with when setting allocated length.

And introducing a new function is just as backwards compatible as adding a new
optional parameter to a current function.

Preallocation needs to be a runtime function that is yet to be written which
allocates a large block but sets the "allocated" size to 0.  It can also avoid
pre-initializing the block if the block has no pointers (something that the old
trick didn't do).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 22 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637




PST ---

 It's not *necessary* to change the API to fix this, but it is hugely
 advantageous.  If you want to know the block size of the chunk you just
 allocated, the current API requires *another* lock of the GC, and a search
 through the pools.  All the info is there in malloc, it's just not returned.
 
 BTW, just an additional size parameter would suffice (this is how gcx handles
 the allocation).  Having it return a BlkInfo struct is convenient because that
 is the data type I'm working with when setting allocated length.
Yes, that seems reasonable since BlkInfo is already part of the API.
 And introducing a new function is just as backwards compatible as adding a new
 optional parameter to a current function.
I know that, but it adds complexity to the API, and seems a little weird since gc_malloc() and gc_malloc_bi() are basically the same. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Dec 23 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

           obsolete|                            |



06:28:15 PST ---
Created an attachment (id=569)
Patch to druntime revision 245 to implement array appending

generated with svn diff.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 18 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637




06:29:32 PST ---
Note that shared data isn't properly synchronized when being copied, but this
also happens in all the other array functions (not just appending).  I will
file a separate bug on this.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 18 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

           obsolete|                            |



06:34:43 PST ---
Created an attachment (id=570)
Patch to druntime revision 245 to implement array appending

The only addition to this patch is I added comments identifying where the
compiler fails to deliver the proper typeinfo for shared appending.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 18 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|sean invisibleduck.org      |schveiguy yahoo.com



08:37:48 PST ---
changeset 252

http://www.dsource.org/projects/druntime/changeset/252

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 22 2010
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3637


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |FIXED



22:26:33 PST ---
Fixed dmd 2.041

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 08 2010