
digitalmars.D.learn - dynamic array capacity

reply spir <denis.spir gmail.com> writes:
Hello,

Is there a common idiom to pre-allocate a dynamic array. I mean allocating to avoid numerous re-allocations in a loop, not setting length & filling content.
The differences are:
(0) no content --> no init
(1) one can simply append, instead of setting elements at defined indices
(2) one can append sub-sequences
(3) when the initially requested capacity is full, the array automagically resizes as needed (using D's builtin dynarray realloc scheme).
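[A minimal sketch of points (0)-(3), assuming D2's built-in array `reserve`; the array and values are illustrative, not code from this thread:]

```d
void main()
{
    int[] a;
    a.reserve(1000);        // (0) capacity only: no elements, no init
    assert(a.length == 0);
    a ~= 42;                // (1) simply append, no index bookkeeping
    a ~= [1, 2, 3];         // (2) append a sub-sequence
    foreach (i; 0 .. 2000)
        a ~= i;             // (3) past the reserved capacity, the runtime
                            //     reallocates with its usual growth scheme
    assert(a.length == 2004);
}
```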

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 29 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:

 Hello,

 Is there a common idiom to pre-allocate a dynamic array. I mean  
 allocating to avoid numerous re-allocations in loop, not setting length  
 & filling content.

int[] x;
x.reserve(10000); // pre-allocate at least 10000 elements for appending

Another way to do it:

Appender!(int[]) app;
app.reserve(10000);

this will perform much better than builtin array appending.

-Steve
Dec 29 2010
parent %u <e ee.com> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:
 Hello,

 Is there a common idiom to pre-allocate a dynamic array. I mean
 allocating to avoid numerous re-allocations in loop, not setting length
 & filling content.

x.reserve(10000); // pre-allocate at least 10000 elements for appending

Another way to do it:

Appender!(int[]) app;
app.reserve(10000);

this will perform much better than builtin array appending.

-Steve

Oh, wow! D2 really kicks D1's ass here..
Dec 29 2010
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On Wed, 29 Dec 2010 11:24:01 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:

 Hello,

 Is there a common idiom to pre-allocate a dynamic array. I mean
 allocating to avoid numerous re-allocations in loop, not setting length
 & filling content.

int[] x;
x.reserve(10000); // pre-allocate at least 10000 elements for appending

Another way to do it:

Appender!(int[]) app;
app.reserve(10000);

this will perform much better than builtin array appending.

-Steve

Thank you, Steven. I could not find anything like 'reserve' in the property table because it's not a property ;-)

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 29 2010
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On Wed, 29 Dec 2010 11:24:01 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:

 Hello,

 Is there a common idiom to pre-allocate a dynamic array. I mean
 allocating to avoid numerous re-allocations in loop, not setting length
 & filling content.

int[] x;
x.reserve(10000); // pre-allocate at least 10000 elements for appending

Another way to do it:

Appender!(int[]) app;
app.reserve(10000);

this will perform much better than builtin array appending.

-Steve

I've done some timings using reserve and Appender. Seems not to work on my use case (decomposition of a string [actually a sequence of code points] according to NFD). (see sample code below)
* use reserve (to source string's length) with builtin append (operator '~=') --> 20% slower
* use Appender w/o reserve --> 3 times slower
* use Appender + its own reserve --> 1.5 times slower (i.e. divide above time per 2)

I'm surprised that reserve does not speed up builtin appending, since it can only avoid numerous reallocations. How to interpret that?
I'm even more surprised of Appender's results on this use case, after having read about its performance several times on the list. Strange. Can it be due to the fact that I only append sub-sequences? (the decomposition '*p' below is also a mini-array)

Denis

// NFD decomposition
Code[] decomposition (Code[] codes) {
    /** Decompose code sequence according to form "NFD". */
    //~ Code[] decompCodes;           // new, decomposed; pile
    Appender!(Code[]) decompCodes;
    Decomp* p;                        // pointer to single-code decomposition
    Ordinal i0 = 0;                   // reference index past last decomposed code

    // Decompose each code.
    //~ decompCodes.reserve(codes.length);  // worse time! ???
    //~ foreach (i, code ; codes) {
    //~     // case composed code
    //~     p = (code in DECOMPOSITIONS);
    //~     // Do not even check whether code is composite
    //~     // for simple ASCII & latin characters.
    //~     if (code > 0xBF)
    //~         if (p) {
    //~             // Append past simple codes, if any.
    //~             if (i > i0) decompCodes ~= codes[i0..i];
    //~             // Append current code's decomposition.
    //~             decompCodes ~= *p;
    //~             // Update reference index.
    //~             i0 = i + 1;
    //~         }
    //~ }
    //~ // Append last simple codes (if any).
    //~ decompCodes ~= codes[i0..$];

    decompCodes.reserve(codes.length);  // worse time! ???
    foreach (i, code ; codes) {
        // case composed code
        p = (code in DECOMPOSITIONS);
        // Do not even check whether code is composite
        // for simple ASCII & latin characters.
        if (code > 0xBF)
            if (p) {
                // Append past simple codes, if any.
                if (i > i0) decompCodes.put(codes[i0..i]);
                // Append current code's decomposition.
                decompCodes.put(*p);
                // Update reference index.
                i0 = i + 1;
            }
    }
    // Append last simple codes (if any).
    decompCodes.put(codes[i0..$]);

    //~ return decompCodes;
    return decompCodes.data;
}
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 29 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Dec 2010 13:14:29 -0500, spir <denis.spir gmail.com> wrote:

 I've done some timings using reserve and Appender. Seems not to work on  
 my use case (decomposition of a string [actually a sequence of code  
 points] according to NFD). (see sample code below)
 * use reserve (to source string's length) with builtin append (operator  
 '~=') --> 20% slower
 * use Appender w/o reserve --> 3 times slower
 * use Appender + its own reserve --> 1.5 times slower (i.e. divide  
 above time per 2)

What is the baseline for this? I.e. what is it 20% slower than? FWIW, Appender should be much faster than builtin append, even without reserve. However, Appender has a recently fixed bug (not fixed in 2.051) where appending *arrays* of elements goes very slow. I see you are doing that in a couple spots.
 I'm surprised that reserve does not speed up builtin appending, since it  
 can only avoid numerous reallocations. How to interpret that?
 I'm even more surprised of Appender's results on this use case, after  
 having read about its performance several times on the list. Strange.  
 Can it be due to the fact that I only append sub-sequences? (the  
 decomposition '*p' below is also a mini-array)

It should speed up appending. If it doesn't, then it's either a bug or pilot error. As I said before, Appender in 2.051 and earlier has a bug where appending an array is very slow. But builtin appending should be faster if you reserve. Simple tests I run prove that this is true. Recent developments in how the appending array grows mitigate this quite a bit, but it certainly will result in less memory being consumed, and it always runs faster. -Steve
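[A hedged benchmark sketch of the comparison Steve describes: builtin `~=` with and without `reserve`. The module paths follow current Phobos (`std.datetime.stopwatch`); 2010-era Phobos spelled the stopwatch differently, and N is arbitrary.]

```d
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.stdio : writefln;

void main()
{
    enum N = 1_000_000;

    auto sw = StopWatch(AutoStart.yes);
    int[] plain;
    foreach (i; 0 .. N)
        plain ~= i;                 // builtin append, no reserve
    immutable tPlain = sw.peek;

    sw.reset();
    int[] reserved;
    reserved.reserve(N);            // one up-front allocation
    foreach (i; 0 .. N)
        reserved ~= i;              // same appends, no reallocation needed
    immutable tReserved = sw.peek;

    writefln("plain: %s  reserved: %s", tPlain, tReserved);
    assert(plain == reserved);      // identical results either way
}
```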
Dec 29 2010
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On Wed, 29 Dec 2010 13:48:48 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Wed, 29 Dec 2010 13:14:29 -0500, spir <denis.spir gmail.com> wrote:

 I've done some timings using reserve and Appender. Seems not to work on
 my use case (decomposition of a string [actually a sequence of code
 points] according to NFD). (see sample code below)
 * use reserve (to source string's length) with builtin append (operator
 '~=') --> 20% slower
 * use Appender w/o reserve --> 3 times slower
 * use Appender + its own reserve --> 1.5 times slower (i.e. divide
 above time per 2)

What is the baseline for this? I.e. what is it 20% slower than? FWIW, Appender should be much faster than builtin append, even without reserve.

The baseline is builtin append (~=) without reserving. (Sorry, thought it was clear.)
 However, Appender has a recently fixed bug (not fixed in 2.051) where
 appending *arrays* of elements goes very slow.  I see you are doing that
 in a couple spots.

As explained below, I'm doing _only_ that. So (as guessed), seems it may be the reason why Appender performs so badly in my case.
 I'm surprised that reserve does not speed up builtin appending, since it
 can only avoid numerous reallocations. How to interpret that?


* This question remains. ??? *
 I'm even more surprised of Appender's results on this use case, after
 having read about its performance several times on the list. Strange.
 Can it be due to the fact that I only append sub-sequences? (the
 decomposition '*p' below is also a mini-array)

 It should speed up appending.  If it doesn't, then it's either a bug or
 pilot error.  As I said before, Appender in 2.051 and earlier has a bug
 where appending an array is very slow.

 But builtin appending should be faster if you reserve.

Yop!
 Simple tests I run prove that this is true.  Recent developments in how
 the appending array grows mitigate this quite a bit, but it certainly will
 result in less memory being consumed, and it always runs faster.

I can upload the modified version using reserve & Appender (with the timing module) if you like. (The original one is at https://bitbucket.org/denispir/denispir-d/src, files pile.d and chrono.d.)
Anyway, why reserve does not help builtin append is a mystery. In the worst case, it should not trouble.
By the way, how come I do not need to import anything to be able to use reserve (like if it were a builtin prop)?
About Appender, I have had a look at the code in std.array and could not find any sign of why -- except that appending a slice loops over its elements --calling put() for each one--, but this should not be _that_ slow, I guess. (Or should it?) Maybe a specialisation of put to take an array (or slice) would help? (Instead of using an array interface, as is done now.)
The details of realloc are beyond my competence (e.g. the computation of resizing ratio), so I can hardly search further.
 -Steve

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 29 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Dec 2010 14:49:35 -0500, spir <denis.spir gmail.com> wrote:

 On Wed, 29 Dec 2010 13:48:48 -0500
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Wed, 29 Dec 2010 13:14:29 -0500, spir <denis.spir gmail.com> wrote:

 I've done some timings using reserve and Appender. Seems not to work on
 my use case (decomposition of a string [actually a sequence of code
 points] according to NFD). (see sample code below)
 * use reserve (to source string's length) with builtin append (operator
 '~=') --> 20% slower
 * use Appender w/o reserve --> 3 times slower
 * use Appender + its own reserve --> 1.5 times slower (i.e. divide
 above time per 2)

What is the baseline for this? I.e. what is it 20% slower than? FWIW, Appender should be much faster than builtin append, even without reserve.

The baseline is builtin append (~=) without reserving. (Sorry, thought it was clear.)

Then I would be suspicious of your test. In simple tests I just ran, reserving outperforms normal appending (though not by a huge amount). If it's *slower*, then there's something else wrong, as the worst case is it degenerates to non-reserved appending, since the same code paths are followed.
 However, Appender has a recently fixed bug (not fixed in 2.051) where
 appending *arrays* of elements goes very slow.  I see you are doing that
 in a couple spots.

As explained below, I'm doing _only_ that. So (as guessed), seems it may be the reason why Appender performs so badly in my case.

Yes, most likely. I mean it was like 10x slower in tests I ran. Try this version of std.array (shouldn't need to recompile phobos, it's a template): http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/array.d?rev=2238&format=raw
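[For reference, a small sketch of the two `put` paths in question: whole-slice versus element-at-a-time appending. `appender`, `put`, and `.data` are the std.array API; the data is illustrative.]

```d
import std.array : appender;

void main()
{
    auto bySlice = appender!(int[])();
    bySlice.put([1, 2, 3]);         // whole-slice put: the path that was
                                    // slow before the fix linked above

    auto byElem = appender!(int[])();
    foreach (x; [1, 2, 3])
        byElem.put(x);              // element-at-a-time: the amortized path

    assert(bySlice.data == byElem.data);  // same contents either way
}
```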
 I'm surprised that reserve does not speed up builtin appending, since it
 can only avoid numerous reallocations. How to interpret that?


* This question remains. ??? *

I would lean towards pilot error (sorry), since simple tests show reserve does speed things up. I don't have a lot of time to look into your code in depth. If you can show a simple tests that proves otherwise, I'd be happy to look at it.
 Simple tests I run prove that this is true.  Recent developments in how
 the appending array grows mitigate this quite a bit, but it certainly  
 will
 result in less memory being consumed, and it always runs faster.

I can upload the modified version using reserve & Appender (with the timing module) if you like. (The original one is at https://bitbucket.org/denispir/denispir-d/src, files pile.d and chrono.d.)

I use the 'time' command to run benchmarks usually because I've been burned in the past with using custom-built timing mechanisms that have flaws in them.
 Anyway, why reserve does not help builtin append is a mystery. In the  
 worst case, it should not trouble.

I'll amend that and say "in your application", since it seems to perform as expected in my tests.
 By the way, how comes I do not need to import anything to be able to use  
 reserve (like if it were a builtin prop)?

It's in object.di, always imported. I'm wondering if those should be added to the 'properties' page of the spec.
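[A quick check that `reserve` indeed needs no import, it lives in druntime's object.di; the return value and the `.capacity` property shown here are part of the runtime array API.]

```d
// No import at all: reserve and capacity come from object.di (druntime),
// which every module imports implicitly.
void main()
{
    int[] a;
    auto newCap = a.reserve(100);   // returns the capacity actually granted
    assert(newCap >= 100);
    assert(a.capacity >= 100);
    assert(a.length == 0);          // capacity reserved, nothing appended yet
}
```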
 About Appender, I have had a look at the code in std.array and could not  
 find any sign of why -- except that appending a slice loops over its  
 elements --calling put() for each one--, but this should not be _that_  
 slow, I guess. (Or should it?) Maybe a specialisation of put to take an  
 array (or slice) would help? (Instead of using an array interface, as is  
 done now.)
 The details of realloc are beyond my competence (e.g. the computation of  
 resizing ratio), so I can hardly search further.

The issue is that appending was not amortized when appending arrays vs. appending individual elements (where appending *is* amortized). In order to achieve the appearance of O(1) appending, you must grow the array proportionally instead of linearly. In simple terms, you want to essentially double the capacity whenever you realloc. The actual algorithm uses a logarithmic growth curve that settles around 1.4x.

The flawed appender code was just adding enough space to accommodate the data being appended, but only if an array was appended (appending single elements was using the growth curve). Not only that, but it wasn't taking advantage of the ability to extend into consecutive pages without having to move data around.

-Steve
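[A toy illustration of the amortization described above, assuming a fixed 1.5x growth factor; the type, names, and factor are invented for the example, and the real runtime's curve differs.]

```d
// Toy appender with geometric growth: a sequence of N put() calls costs
// O(N) element copies in total, i.e. amortized O(1) per append. The flawed
// code instead grew by just "enough for this append", which is O(N^2).
struct ToyAppender
{
    int[] buf;
    size_t len;

    void put(int x)
    {
        if (len == buf.length)
        {
            // grow proportionally, never just enough for one more element
            auto newCap = buf.length == 0 ? 4 : buf.length * 3 / 2;
            auto grown = new int[](newCap);
            grown[0 .. len] = buf[0 .. len];
            buf = grown;
        }
        buf[len++] = x;
    }

    int[] data() { return buf[0 .. len]; }
}

void main()
{
    ToyAppender a;
    foreach (i; 0 .. 1000)
        a.put(i);
    assert(a.data.length == 1000);
    assert(a.data[999] == 999);
}
```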
Dec 29 2010