www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Spikes in array capacity

reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
There is something strange in the array capacity algorithm.

import std.stdio;
import std.array;

void main()
{
     int[] a;
     size_t oldCapacity;

     foreach (i; 0 .. 100_000_000) {
         a ~= i;

         if (a.capacity != oldCapacity) {
             writeln("length: ", a.length,
                     " capacity: ", capacity(a),
                     " ratio: ", cast(double)capacity(a) / oldCapacity);
             oldCapacity = capacity(a);
         }
     }
}

Produces:

length: 1 capacity: 3 ratio: inf        please ignore this one ;)
length: 4 capacity: 7 ratio: 2.33333
length: 8 capacity: 15 ratio: 2.14286
length: 16 capacity: 31 ratio: 2.06667
length: 32 capacity: 63 ratio: 2.03226
length: 64 capacity: 127 ratio: 2.01587
length: 128 capacity: 255 ratio: 2.00787
length: 256 capacity: 509 ratio: 1.99608
length: 512 capacity: 1021 ratio: 2.00589
length: 1022 capacity: 2045 ratio: 2.00294
length: 2046 capacity: 3069 ratio: 1.50073
length: 3070 capacity: 4093 ratio: 1.33366
length: 4094 capacity: 5117 ratio: 1.25018
...
length: 1153022 capacity: 1154045 ratio: 1.00089
length: 1154046 capacity: 1155069 ratio: 1.00089
length: 1155070 capacity: 3153917 ratio: 2.7305    <-- spike
length: 3153918 capacity: 3154941 ratio: 1.00032
length: 3154942 capacity: 3155965 ratio: 1.00032
...
length: 4741118 capacity: 4742141 ratio: 1.00022
length: 4742142 capacity: 4743165 ratio: 1.00022
length: 4743166 capacity: 12333053 ratio: 2.60017  <-- spike
length: 12333054 capacity: 12334077 ratio: 1.00008
length: 12334078 capacity: 12335101 ratio: 1.00008
... (there are more spikes later on)

Is that intended?

Ali
Jul 01 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 01 Jul 2010 12:46:27 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

 There is something strange in the array capacity algorithm.

 import std.stdio;
 import std.array;

 void main()
 {
      int[] a;
      size_t oldCapacity;

      foreach (i; 0 .. 100_000_000) {
          a ~= i;

          if (a.capacity != oldCapacity) {
              writeln("length: ", a.length,
                      " capacity: ", capacity(a),
                      " ratio: ", cast(double)capacity(a) / oldCapacity);
              oldCapacity = capacity(a);
          }
      }
 }

 Produces:

 length: 1 capacity: 3 ratio: inf        please ignore this one ;)
 length: 4 capacity: 7 ratio: 2.33333
 length: 8 capacity: 15 ratio: 2.14286
 length: 16 capacity: 31 ratio: 2.06667
 length: 32 capacity: 63 ratio: 2.03226
 length: 64 capacity: 127 ratio: 2.01587
 length: 128 capacity: 255 ratio: 2.00787
 length: 256 capacity: 509 ratio: 1.99608
 length: 512 capacity: 1021 ratio: 2.00589
 length: 1022 capacity: 2045 ratio: 2.00294
 length: 2046 capacity: 3069 ratio: 1.50073
 length: 3070 capacity: 4093 ratio: 1.33366
 length: 4094 capacity: 5117 ratio: 1.25018
 ...
 length: 1153022 capacity: 1154045 ratio: 1.00089
 length: 1154046 capacity: 1155069 ratio: 1.00089
 length: 1155070 capacity: 3153917 ratio: 2.7305    <-- spike
 length: 3153918 capacity: 3154941 ratio: 1.00032
 length: 3154942 capacity: 3155965 ratio: 1.00032
 ...
 length: 4741118 capacity: 4742141 ratio: 1.00022
 length: 4742142 capacity: 4743165 ratio: 1.00022
 length: 4743166 capacity: 12333053 ratio: 2.60017  <-- spike
 length: 12333054 capacity: 12334077 ratio: 1.00008
 length: 12334078 capacity: 12335101 ratio: 1.00008
 ... (there are more spikes later on)

 Is that intended?

 Ali

No, that is not intended. Please file a bugzilla against druntime. -Steve
Jul 01 2010
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Steven Schveighoffer wrote:
 On Thu, 01 Jul 2010 12:46:27 -0400, Ali Çehreli <acehreli yahoo.com> wrote:
 
 There is something strange in the array capacity algorithm.


 length: 1153022 capacity: 1154045 ratio: 1.00089
 length: 1154046 capacity: 1155069 ratio: 1.00089
 length: 1155070 capacity: 3153917 ratio: 2.7305    <-- spike
 length: 3153918 capacity: 3154941 ratio: 1.00032
 length: 3154942 capacity: 3155965 ratio: 1.00032


 No, that is not intended.  Please file a bugzilla against druntime.

http://d.puremagic.com/issues/show_bug.cgi?id=4412 Thanks, Ali
Jul 01 2010
prev sibling next sibling parent reply BCS <none anon.com> writes:
Hello Ali,

 There is something strange in the array capacity algorithm.
 [...]
 Is that intended?
 
 Ali
 

I'm going to take a guess that the gc is (after some point) allocating into the smallest hole with "enough" capacity. If you re run the test but with something else going on to add some noise, do the spikes move? void makeNoise() { static byte[][256] data; data[rand() % $] = new byte[rand() % 512]; } -- ... <IXOYE><
Jul 01 2010
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
BCS wrote:

 There is something strange in the array capacity algorithm.
 [...]
 Is that intended?


 I'm going to take a guess that the gc is (after some point) allocating
 into the smallest hole with "enough" capacity.

You may be right. :)
 If you re run the test
 but with something else going on to add some noise, do the spikes move?

Yes, the spikes move.
 void makeNoise()
 {
      static byte[][256] data;
      data[rand() % $] = new byte[rand() % 512];
 }

I am not familiar with the dmd internals; but following the code took me to the function newCapacity() in src/druntime/src/rt/lifetime.d. The capacity growth is more complicated than what I've been assuming so far. And yes, GC is in the picture. Quoting the comments from that file: /* * Better version by Dave Fladebo: * This uses an inverse logorithmic algorithm to pre-allocate a bit more * space for larger arrays. * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most * common cases, memory allocation is 1 to 1. The small overhead added * doesn't affect small array perf. (it's virtually the same as * current). * - Larger arrays have some space pre-allocated. * - As the arrays grow, the relative pre-allocated space shrinks. * - The logorithmic algorithm allocates relatively more space for * mid-size arrays, making it very fast for medium arrays (for * mid-to-large arrays, this turns out to be quite a bit faster than the * equivalent realloc() code in C, on Linux at least. Small arrays are * just as fast as GCC). * - Perhaps most importantly, overall memory usage and stress on the GC * is decreased significantly for demanding environments. */ So there may not be a bug after all... Ali
Jul 01 2010
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Steven Schveighoffer wrote:
 On Fri, 02 Jul 2010 01:34:26 -0400, Ali Çehreli <acehreli yahoo.com> 

 BCS wrote:

  >> There is something strange in the array capacity algorithm.
  >> [...]
  >> Is that intended?

  > I'm going to take a guess that the gc is (after some point) 


  > into the smallest hole with "enough" capacity.

 You may be right. :)

  > If you re run the test
  > but with something else going on to add some noise, do the spikes
 move?

 Yes, the spikes move.

  > void makeNoise()
  > {
  >      static byte[][256] data;
  >      data[rand() % $] = new byte[rand() % 512];
  > }

 I am not familiar with the dmd internals; but following the code took
 me to the function newCapacity() in src/druntime/src/rt/lifetime.d.
 The capacity growth is more complicated than what I've been assuming
 so far.

 And yes, GC is in the picture. Quoting the comments from that file:

 /*
   * Better version by Dave Fladebo:
   * This uses an inverse logorithmic algorithm to pre-allocate a bit 


   * space for larger arrays.
   * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most
   * common cases, memory allocation is 1 to 1. The small overhead added
   * doesn't affect small array perf. (it's virtually the same as
   * current).
   * - Larger arrays have some space pre-allocated.
   * - As the arrays grow, the relative pre-allocated space shrinks.
   * - The logorithmic algorithm allocates relatively more space for
   * mid-size arrays, making it very fast for medium arrays (for
   * mid-to-large arrays, this turns out to be quite a bit faster 


   * equivalent realloc() code in C, on Linux at least. Small arrays are
   * just as fast as GCC).
   * - Perhaps most importantly, overall memory usage and stress on 


   * is decreased significantly for demanding environments.
   */

 So there may not be a bug after all...

 Ali

If your original code is all that was being run, I think there is a bug.

The original code was it.
 I'll tell you why -- the GC provides a way to append pages to an
 allocated block.  Essentially, you can glue consecutive unused pages
 together to make a larger block, thereby growing your block.

 But anything under a page is simply reallocated because anything less
 than a page is inside a pool of blocks of the same size, and those
 cannot be glued together.

 So when your array is growing, and it gets larger than a page, it should
 grow a single page at a time.  But I think (and I stress think, I'm not
 sure) that when a block of pages is deallocated, each page becomes
 independently free.  They do not stay glued together.  So for an
 allocating requesting 1 or 2 more pages to grow by 2000 pages is
 suspect.  I still have to examine the code, but I think there is a 

 Back to my original assertion -- to disprove the theory that some other
 "larger hole" is why it gets so big, in one instance your number of
 elements grows from 1,155,070 (4MB) to 3,153,917 (12MB), that's almost a
 200% increase.  In order for that 12MB hole to exist, something must
 have allocated it before, and then deallocated it.  It can't be your
 loop because your loop has not allocated that much space yet.  I would
 find that very strange to happen somewhere in the runtime without you
 specifically allocating it.  If however, your original post is not the
 entire code, then there may be some possibility to that.

 BTW, you can take the newCapacity function out of the loop by simply
 setting the length and then writing the last element, i.e.:

 a.length += 1;
 a[$-1] = i;

 newCapacity isn't called when setting the length explicitly, because
 it's not considered to be an append.

You are right. Looks like I was successful in locating the culprit. :) I took liberty to write ++a.length and also outputted the number of elements that are in reserve at each allocation: import std.stdio; import std.array; void main() { int[] a; size_t oldCapacity; foreach (i; 0 .. 100_000_000) { ++a.length; a[$-1] = i; if (capacity(a) != oldCapacity) { writeln("length: ", a.length, " capacity: ", capacity(a), " ratio: ", cast(double)capacity(a) / oldCapacity, " reserve: ", capacity(a) - a.length); oldCapacity = capacity(a); } } } Now the spikes are gone, and the allocations asymptote at once-per-1024 elements for an int array: length: 1 capacity: 3 ratio: inf reserve: 2 length: 4 capacity: 7 ratio: 2.33333 reserve: 3 length: 8 capacity: 15 ratio: 2.14286 reserve: 7 length: 16 capacity: 31 ratio: 2.06667 reserve: 15 length: 32 capacity: 63 ratio: 2.03226 reserve: 31 length: 64 capacity: 127 ratio: 2.01587 reserve: 63 length: 128 capacity: 255 ratio: 2.00787 reserve: 127 length: 256 capacity: 509 ratio: 1.99608 reserve: 253 length: 512 capacity: 1021 ratio: 2.00589 reserve: 509 length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023 length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023 length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023 ... In the long run, there is one allocation per 1024 elements. That still feels like amortized constant, but considering that relocation times would be O(N), I think reserve should still grow with N, but maybe not too much. (I am not sure about this last point at all.)
 -Steve

Thank you very much, Ali
Jul 02 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Ali Çehreli wrote:
[snip]
 Now the spikes are gone, and the allocations asymptote at once-per-1024 
 elements for an int array:
 
 length: 1 capacity: 3 ratio: inf reserve: 2
 length: 4 capacity: 7 ratio: 2.33333 reserve: 3
 length: 8 capacity: 15 ratio: 2.14286 reserve: 7
 length: 16 capacity: 31 ratio: 2.06667 reserve: 15
 length: 32 capacity: 63 ratio: 2.03226 reserve: 31
 length: 64 capacity: 127 ratio: 2.01587 reserve: 63
 length: 128 capacity: 255 ratio: 2.00787 reserve: 127
 length: 256 capacity: 509 ratio: 1.99608 reserve: 253
 length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
 length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
 length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
 length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
 ....
 
 In the long run, there is one allocation per 1024 elements. That still 
 feels like amortized constant, but considering that relocation times 
 would be O(N), I think reserve should still grow with N, but maybe not 
 too much. (I am not sure about this last point at all.)

You may want to also print the pointer to the first element of the array. That provides you information about how often the array is actually moved. Andrei
Jul 02 2010
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Andrei Alexandrescu wrote:
 Ali Çehreli wrote:
 [snip]
 Now the spikes are gone, and the allocations asymptote at
 once-per-1024 elements for an int array:

 length: 1 capacity: 3 ratio: inf reserve: 2
 length: 4 capacity: 7 ratio: 2.33333 reserve: 3
 length: 8 capacity: 15 ratio: 2.14286 reserve: 7
 length: 16 capacity: 31 ratio: 2.06667 reserve: 15
 length: 32 capacity: 63 ratio: 2.03226 reserve: 31
 length: 64 capacity: 127 ratio: 2.01587 reserve: 63
 length: 128 capacity: 255 ratio: 2.00787 reserve: 127
 length: 256 capacity: 509 ratio: 1.99608 reserve: 253
 length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
 length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
 length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
 length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
 ....

 In the long run, there is one allocation per 1024 elements. That still
 feels like amortized constant, but considering that relocation times
 would be O(N), I think reserve should still grow with N, but maybe not
 too much. (I am not sure about this last point at all.)

You may want to also print the pointer to the first element of the array. That provides you information about how often the array is actually moved.

Good point. :) Yes, spikes are only when the array is moved. I've also discovered that a basic array invariant is violated at size 509 as well: import std.string; import std.array; void main() { int[] a; a.length = 509; a ~= 0; assert(a.capacity >= a.length, format("capacity: %s length: %s", a.capacity, a.length)); } Outputs core.exception.AssertError deneme.d(17444): capacity: 509 length: 510 I will update the bug with these. Ali
Jul 02 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Jul 2010 01:34:26 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

 BCS wrote:

  >> There is something strange in the array capacity algorithm.
  >> [...]
  >> Is that intended?

  > I'm going to take a guess that the gc is (after some point) allocating
  > into the smallest hole with "enough" capacity.

 You may be right. :)

  > If you re run the test
  > but with something else going on to add some noise, do the spikes  
 move?

 Yes, the spikes move.

  > void makeNoise()
  > {
  >      static byte[][256] data;
  >      data[rand() % $] = new byte[rand() % 512];
  > }

 I am not familiar with the dmd internals; but following the code took me  
 to the function newCapacity() in src/druntime/src/rt/lifetime.d. The  
 capacity growth is more complicated than what I've been assuming so far.

 And yes, GC is in the picture. Quoting the comments from that file:

 /*
   * Better version by Dave Fladebo:
   * This uses an inverse logorithmic algorithm to pre-allocate a bit more
   * space for larger arrays.
   * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most
   * common cases, memory allocation is 1 to 1. The small overhead added
   * doesn't affect small array perf. (it's virtually the same as
   * current).
   * - Larger arrays have some space pre-allocated.
   * - As the arrays grow, the relative pre-allocated space shrinks.
   * - The logorithmic algorithm allocates relatively more space for
   * mid-size arrays, making it very fast for medium arrays (for
   * mid-to-large arrays, this turns out to be quite a bit faster than the
   * equivalent realloc() code in C, on Linux at least. Small arrays are
   * just as fast as GCC).
   * - Perhaps most importantly, overall memory usage and stress on the GC
   * is decreased significantly for demanding environments.
   */

 So there may not be a bug after all...

 Ali

If your original code is all that was being run, I think there is a bug. I'll tell you why -- the GC provides a way to append pages to an allocated block. Essentially, you can glue consecutive unused pages together to make a larger block, thereby growing your block. But anything under a page is simply reallocated because anything less than a page is inside a pool of blocks of the same size, and those cannot be glued together. So when your array is growing, and it gets larger than a page, it should grow a single page at a time. But I think (and I stress think, I'm not sure) that when a block of pages is deallocated, each page becomes independently free. They do not stay glued together. So for an allocating requesting 1 or 2 more pages to grow by 2000 pages is suspect. I still have to examine the code, but I think there is a problem. Back to my original assertion -- to disprove the theory that some other "larger hole" is why it gets so big, in one instance your number of elements grows from 1,155,070 (4MB) to 3,153,917 (12MB), that's almost a 200% increase. In order for that 12MB hole to exist, something must have allocated it before, and then deallocated it. It can't be your loop because your loop has not allocated that much space yet. I would find that very strange to happen somewhere in the runtime without you specifically allocating it. If however, your original post is not the entire code, then there may be some possibility to that. BTW, you can take the newCapacity function out of the loop by simply setting the length and then writing the last element, i.e.: a.length += 1; a[$-1] = i; newCapacity isn't called when setting the length explicitly, because it's not considered to be an append. -Steve
Jul 02 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Jul 2010 10:01:11 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

 Steven Schveighoffer wrote:
  > newCapacity isn't called when setting the length explicitly, because
  > it's not considered to be an append.

 You are right.

 Looks like I was successful in locating the culprit. :) I took liberty  
 to write ++a.length and also outputted the number of elements that are  
 in reserve at each allocation:

 import std.stdio;
 import std.array;

 void main()
 {
      int[] a;
      size_t oldCapacity;

      foreach (i; 0 .. 100_000_000) {
          ++a.length;
          a[$-1] = i;

          if (capacity(a) != oldCapacity) {
              writeln("length: ", a.length,
                      " capacity: ", capacity(a),
                      " ratio: ", cast(double)capacity(a) / oldCapacity,
                      " reserve: ", capacity(a) - a.length);
              oldCapacity = capacity(a);
          }
      }
 }

 Now the spikes are gone, and the allocations asymptote at once-per-1024  
 elements for an int array:

 length: 1 capacity: 3 ratio: inf reserve: 2
 length: 4 capacity: 7 ratio: 2.33333 reserve: 3
 length: 8 capacity: 15 ratio: 2.14286 reserve: 7
 length: 16 capacity: 31 ratio: 2.06667 reserve: 15
 length: 32 capacity: 63 ratio: 2.03226 reserve: 31
 length: 64 capacity: 127 ratio: 2.01587 reserve: 63
 length: 128 capacity: 255 ratio: 2.00787 reserve: 127
 length: 256 capacity: 509 ratio: 1.99608 reserve: 253
 length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
 length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
 length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
 length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
 ...

 In the long run, there is one allocation per 1024 elements. That still  
 feels like amortized constant, but considering that relocation times  
 would be O(N), I think reserve should still grow with N, but maybe not  
 too much. (I am not sure about this last point at all.)

4x1024 = 4096 == page size. The newCapacity function is supposed to reduce the effect of this phenomenon. i.e., when appending, it's assumed you are going to continue appending, so to reduce the overhead, the amount added to the memory block is supposed to be related to the total number of bytes already allocated. It doesn't seem like the function is working very well... It could have been something I did, or it could have always been broken :) I honestly did not examine the code at all, I just only ever read the comments, assuming it works. Please add these comments to the bug report, it will help when I look at it. I don't have a lot of time to look at it right now, so I'm afraid I'll forget what we were doing later :) -Steve
Jul 02 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Jul 2010 13:01:41 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

 I've also discovered that a basic array invariant is violated at size  
 509 as well:

 import std.string;
 import std.array;

 void main()
 {
      int[] a;
      a.length = 509;
      a ~= 0;

      assert(a.capacity >= a.length,
             format("capacity: %s length: %s", a.capacity, a.length));
 }

 Outputs

 core.exception.AssertError deneme.d(17444): capacity: 509 length: 510

 I will update the bug with these.

Thanks. I just figured this one out :) Very interesting bug indeed. -Steve
Jul 02 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Jul 2010 14:50:53 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 02 Jul 2010 13:01:41 -0400, Ali Çehreli <acehreli yahoo.com>  
 wrote:

 I've also discovered that a basic array invariant is violated at size  
 509 as well:

 import std.string;
 import std.array;

 void main()
 {
      int[] a;
      a.length = 509;
      a ~= 0;

      assert(a.capacity >= a.length,
             format("capacity: %s length: %s", a.capacity, a.length));
 }

 Outputs

 core.exception.AssertError deneme.d(17444): capacity: 509 length: 510

 I will update the bug with these.

Thanks. I just figured this one out :) Very interesting bug indeed.

FYI: http://www.dsource.org/projects/druntime/changeset/317 -Steve
Jul 02 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Jul 2010 14:58:41 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 02 Jul 2010 14:50:53 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Fri, 02 Jul 2010 13:01:41 -0400, Ali Çehreli <acehreli yahoo.com>  
 wrote:

 I've also discovered that a basic array invariant is violated at size  
 509 as well:

 import std.string;
 import std.array;

 void main()
 {
      int[] a;
      a.length = 509;
      a ~= 0;

      assert(a.capacity >= a.length,
             format("capacity: %s length: %s", a.capacity, a.length));
 }

 Outputs

 core.exception.AssertError deneme.d(17444): capacity: 509 length: 510

 I will update the bug with these.

Thanks. I just figured this one out :) Very interesting bug indeed.

FYI: http://www.dsource.org/projects/druntime/changeset/317

Missed something, here is a better view of the change: http://www.dsource.org/projects/druntime/changeset?new=trunk%2Fsrc%2Frt%2Flifetime.d%40318&old=trunk%2Fsrc%2Frt%2Flifetime.d%40299 -Steve
Jul 02 2010