digitalmars.D.learn - Best way to clear dynamic array for reuse

Miguel L <mlabayru gmail.com> writes:

I am using a temporary dynamic array inside a loop this way:
A[] a;
for(....)
{
a=[]; //discard array contents
... append thousands of elements to a
... use a for some calculations
}

I would like to know the best way to clear a's contents while 
avoiding reallocations, as there seem to be lots of 
garbage collection cycles taking place.

The options would be:

a=[];
a.length=0;
a=null;
...
any other?

Can you help me please?

Jul 13 2016

rikki cattermole <rikki cattermole.co.nz> writes:

On 13/07/2016 11:59 PM, Miguel L wrote:
 I am using a temporary dynamic array inside a loop this way:
 A[] a;
 for(....)
 {
 a=[]; //discard array contents
 ... appends thousand of elements to a
 ... use a for some calculations
 }

 I would like to know which would be the best way to clear a contents
 avoiding reallocations, as there seems to be lots of garbage collection
 cycles taking place.

 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

All of those "options" do the same thing, remove all references to that 
data.

There are a couple of options. What I will recommend instead is to start 
using buffers and only expand when you append past the length. You will 
probably want a struct to wrap this up (and disable postblit).

Otherwise if you're lazy just disable GC and reenable after the code 
segment.

Jul 13 2016

Mathias Lang <mathias.lang sociomantic.com> writes:

On Wednesday, 13 July 2016 at 12:05:12 UTC, rikki cattermole 
wrote:
 On 13/07/2016 11:59 PM, Miguel L wrote:
 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

 All of those "options" do the same thing, remove all references 
 to that data.

No they don't. The first and the third change the pointer, so one 
cannot reuse the array.

 Miguel: You want to use 

See the example in the doc. To reset your buffer you can 
use `buff.length = 0` instead of taking a slice as the example 
does.

Jul 13 2016

Lodovico Giaretta <lodovico giaretart.net> writes:

On Wednesday, 13 July 2016 at 11:59:18 UTC, Miguel L wrote:
 I am using a temporary dynamic array inside a loop this way:
 A[] a;
 for(....)
 {
 a=[]; //discard array contents
 ... appends thousand of elements to a
 ... use a for some calculations
 }

 I would like to know which would be the best way to clear a 
 contents avoiding reallocations, as there seems to be lots of 
 garbage collection cycles taking place.

 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

Use std.array.Appender. It allows faster appends, and has a handy 
.clear method that zeroes the length of the managed array, 
without de-allocating it, so the same buffer is reused.

Jul 13 2016

Miguel L <mlabayru gmail.com> writes:

On Wednesday, 13 July 2016 at 12:05:18 UTC, Lodovico Giaretta 
wrote:
 On Wednesday, 13 July 2016 at 11:59:18 UTC, Miguel L wrote:
 I am using a temporary dynamic array inside a loop this way:
 A[] a;
 for(....)
 {
 a=[]; //discard array contents
 ... appends thousand of elements to a
 ... use a for some calculations
 }

 I would like to know which would be the best way to clear a 
 contents avoiding reallocations, as there seems to be lots of 
 garbage collection cycles taking place.

 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

 Use std.array.Appender. It allows faster appends, and has a 
 handy .clear method that zeroes the length of the managed 
 array, without de-allocating it, so the same buffer is reused.

I tried Appender, but for some reason the garbage collector still 
seems to be running every few iterations.
I will try to expand a little on my code because maybe there is 
something I am missing:

  Appender!(A[]) a;

  void foo(out Appender!(A[]) bar)
  {
      // ...
      // bar ~= lots of elements
  }

  for(....)
  {
      //a=[]; //discard array contents
      a.clear();
      foo(a); // appends thousands of elements to a
      // ... use a for some calculations
  }

Jul 13 2016

Lodovico Giaretta <lodovico giaretart.net> writes:

On Wednesday, 13 July 2016 at 12:37:26 UTC, Miguel L wrote:
 I tried Appender, but for some reason garbage collector still 
 seems to be running every few iterations.
 I will try to expand a little on my code because maybe there is 
 something i am missing:

  Appender!(A[]) a;

  void foo( out Appender!(A[]) bar)
 {
 ...
 bar~= lot of elements
 }

  for(....)
  {
  //a=[]; //discard array contents
  a.clear();
  foo(a) appends thousand of elements to a
  ... use a for some calculations
  }

Well, I think foo's parameter should be `ref Appender!(A[]) bar` 
instead of `out Appender!(A[]) bar`. Also, if you know you will 
append lots of elements, doing a.reserve(s), with s being an 
estimate of the number of appends you expect, might be a good 
idea.

Jul 13 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/13/16 8:41 AM, Lodovico Giaretta wrote:
 On Wednesday, 13 July 2016 at 12:37:26 UTC, Miguel L wrote:
 I tried Appender, but for some reason garbage collector still seems to
 be running every few iterations.
 I will try to expand a little on my code because maybe there is
 something i am missing:

  Appender!(A[]) a;

  void foo( out Appender!(A[]) bar)
 {
 ...
 bar~= lot of elements
 }

  for(....)
  {
  //a=[]; //discard array contents
  a.clear();
  foo(a) appends thousand of elements to a
  ... use a for some calculations
  }

 Well, I think foo's parameter should be `ref Appender!(A[]) bar` instead
 of `out Appender!(A[]) bar`.

Yes, this is why you still have issues. An out parameter is set to its 
init value upon function entry, so you have lost all your allocation at 
that point.
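
The difference can be seen in a small sketch. The function names fillOut and fillRef, the two wrapper helpers, and the int element type are all made up for illustration; the OP's foo and A are not shown in full:

```d
import std.array : Appender, appender;
import std.stdio : writeln;

// `out`: the argument is reset to Appender.init on entry, so whatever
// the caller had accumulated (and its buffer) is dropped.
void fillOut(out Appender!(int[]) bar) { bar.put(42); }

// `ref`: the caller's appender is used as-is.
void fillRef(ref Appender!(int[]) bar) { bar.put(42); }

// Hypothetical helper: length seen by the caller after the `out` call.
size_t lengthAfterOut()
{
    auto a = appender!(int[])();
    a.put([1, 2, 3]);
    fillOut(a);
    return a.data.length; // only the element added inside fillOut is left
}

// Hypothetical helper: length seen by the caller after the `ref` call.
size_t lengthAfterRef()
{
    auto a = appender!(int[])();
    a.put([1, 2, 3]);
    fillRef(a);
    return a.data.length; // the three old elements plus the new one
}

void main()
{
    writeln(lengthAfterOut(), " ", lengthAfterRef());
}
```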

 Also, if you know you will append lots of
 elements, doing a.reserve(s), with s being an estimate of the number of
 appends you expect, might be a good idea.

This is true for builtin arrays as well.
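
As a rough sketch of reserve on a built-in array (int elements and the helper name appendsStayInReservedBlock are assumptions for illustration): after reserving room for count elements, appending count elements should not move the array.

```d
import std.stdio : writeln;

// Hypothetical helper: reserves space once, appends `count` elements,
// and reports whether the array stayed in the reserved block.
bool appendsStayInReservedBlock(size_t count)
{
    int[] a;
    const cap = a.reserve(count); // one allocation up front
    const start = a.ptr;

    foreach (i; 0 .. count)
        a ~= cast(int) i;

    return cap >= count && a.ptr is start && a.length == count;
}

void main()
{
    writeln(appendsStayInReservedBlock(1000));
}
```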

-Steve

Jul 13 2016

Miguel L <mlabayru gmail.com> writes:

On Wednesday, 13 July 2016 at 17:19:09 UTC, Steven Schveighoffer 
wrote:
 On 7/13/16 8:41 AM, Lodovico Giaretta wrote:
 On Wednesday, 13 July 2016 at 12:37:26 UTC, Miguel L wrote:
 I tried Appender, but for some reason garbage collector still 
 seems to
 be running every few iterations.
 I will try to expand a little on my code because maybe there 
 is
 something i am missing:

  Appender!(A[]) a;

  void foo( out Appender!(A[]) bar)
 {
 ...
 bar~= lot of elements
 }

  for(....)
  {
  //a=[]; //discard array contents
  a.clear();
  foo(a) appends thousand of elements to a
  ... use a for some calculations
  }

 Well, I think foo's parameter should be `ref Appender!(A[]) 
 bar` instead
 of `out Appender!(A[]) bar`.

 Yes, this is why you still have issues. An out parameter is set 
 to its init value upon function entry, so you have lost all 
 your allocation at that point.

 Also, if you know you will append lots of
 elements, doing a.reserve(s), with s being an estimate of the 
 number of
 appends you expect, might be a good idea.

 This is true for builtin arrays as well.

 -Steve

Ok, I have read about Appender and assumeSafeAppend(), but I am 
still a bit confused.
What I have understood is: dynamic arrays are (almost) always 
reallocating when appending to them unless assumeSafeAppend() is 
used, or when wrapping them with Appender. So, if I'm sure there 
are no slices referencing my array I can use assumeSafeAppend(). 
But is this permanent? I mean if I declare:
  A[] x;
  x.assumeSafeAppend();

Does that last forever so I can append without reallocating after 
emptying array x, or should I call assumeSafeAppend() every time 
I adjust x.length or append to x?

Maybe I should give up trying to use dynamic arrays and use fixed 
length arrays instead.

Jul 14 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Thursday, July 14, 2016 07:07:52 Miguel L via Digitalmars-d-learn wrote:
 Ok, i have read about Appender and assumeSafeAppend(), but i am
 still a bit confused.
 What i have understood is: dynamic arrays are (almost) always
 reallocating when appending to them except assumeSafeAppend() is
 used, or when wrapping them with Appender. So, if i'm sure there
 are no slices referencing my array i can use assumeSafeAppend().
 But is this permanent? I mean if I declare:
   array A[] x;
   x.assumeSafeAppend();

 Does that last forever so I can append without reallocating after
 emptying array x, or should I call assumeSafeAppend() every time
 I adjust x.length or append to x?

 Maybe I should give up trying to use dynamic arrays and use fixed
 length arrays instead.

If you haven't read it yet, I'd suggest reading

http://dlang.org/d-array-article.html

A dynamic array is basically

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

So, it does not own or manage its own memory. It can refer to any memory,
but it's the GC that manages determining whether appending would need to
reallocate or not. And if it does reallocate, then the memory is going to be
GC-allocated regardless of whether it was GC allocated originally or
malloc-ed memory or a slice of a static array or whatever.
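
For instance, here is a small sketch with a slice of a static array (the helper name appendLeavesStaticArrayAlone is made up for illustration). The slice has no capacity, so the first append copies the elements into GC memory and the static array is left untouched:

```d
import std.stdio : writeln;

// Hypothetical helper: appends to a slice of a static array and checks
// that the slice moved to new memory while the static array kept its data.
bool appendLeavesStaticArrayAlone()
{
    int[4] fixed = [1, 2, 3, 4];
    int[] s = fixed[]; // refers to stack memory, not a GC block

    if (s.capacity != 0)
        return false; // not expected: non-GC memory has no append capacity

    s ~= 5;    // has to reallocate into GC memory
    s[0] = 99; // writes to the new copy only

    return s.ptr !is fixed.ptr && fixed[0] == 1 && s.length == 5;
}

void main()
{
    writeln(appendLeavesStaticArrayAlone());
}
```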

When the GC allocates a block of memory for a dynamic array, it keeps track
of the farthest point in that block of memory that any dynamic array refers
to. If the last element in a dynamic array is the last element in that
memory block, and there is additional capacity beyond that within the memory
block, then appending will not result in a reallocation. Rather, it will
expand that dynamic array into the free space in the memory block. But if
that memory block is fully used or if the last element in the dynamic array
is not the last used element in the memory block (or if the dynamic array
refers to memory that was not allocated by the GC), then there is no room
for that array to be expanded, and a reallocation will occur, at which point
that dynamic array _will_ point to the last used element in its new memory
block. So, if you're doing something like

int[] arr;
arr ~= 10;
arr ~= 9;
arr ~= 42;
arr ~= 17;
arr ~= 42;
arr ~= 99;
arr ~= 0;
arr ~= 100;

it could be that only one of those append operations actually results in an
allocation (the first one), or it could be that multiple do. How many of
them require a reallocation depends on what size the underlying buffer is,
and that's implementation dependent. If you want to know how many elements
can be appended without reallocating, then use capacity. For instance, on my
machine, this code

import std.stdio;

void main()
{
    int[] arr;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 10;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 9;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 42;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 17;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 42;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 99;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 0;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 100;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);
}

prints

len: 0, cap: 0
len: 1, cap: 3
len: 2, cap: 3
len: 3, cap: 3
len: 4, cap: 7
len: 5, cap: 7
len: 6, cap: 7
len: 7, cap: 7
len: 8, cap: 15

So, you can see that a reallocation occurred on the 1st, 4th, and 8th append
operations, and that the array won't reallocate again until the 16th append
operation (since it can grow to a length of 15 before a reallocation would be
required). The capacity grows in a manner similar to that of std::vector in
C++ or ArrayList in Java. And it's reasonably efficient in terms of memory
allocations (amortized O(1)). The reason that Appender often gets used is
that checking the capacity of the dynamic array is not cheap (since that's
kept track of by the GC and not the dynamic array itself), and that has to
be checked every time that you append. Appender is able to play some games
to make that more efficient under the assumption that what you're doing is
simply appending to build an array after which you would take it out of the
Appender and not use Appender anymore. But Appender does not change the
allocation scheme. It just makes the checks more efficient. So, use Appender
when you're first building an array, but don't keep using it after that.

What gets more entertaining in terms of capacity and reallocations is when
you start slicing the array or changing its length. For instance, if we add


    arr.length = arr.length - 1;
    writefln("len: %s, cap: %s", arr.length, arr.capacity);

to the end of the previous example, then you get

len: 0, cap: 0
len: 1, cap: 3
len: 2, cap: 3
len: 3, cap: 3
len: 4, cap: 7
len: 5, cap: 7
len: 6, cap: 7
len: 7, cap: 7
len: 8, cap: 15
len: 7, cap: 0

Notice that the capacity is now 0. That's because the dynamic array no
longer refers to the last, used element in the underlying memory buffer.
Similarly, if you instead did


import std.stdio;

void main()
{
    int[] arr;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 10;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 9;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 42;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 17;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 42;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 99;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 0;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    arr ~= 100;
    writefln("arr len: %s, cap: %s\n", arr.length, arr.capacity);

    auto arr2 = arr;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    writefln("arr2 len: %s, cap: %s\n", arr2.length, arr2.capacity);

    auto arr3 = arr[0 .. $ - 1];
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    writefln("arr2 len: %s, cap: %s", arr2.length, arr2.capacity);
    writefln("arr3 len: %s, cap: %s\n", arr3.length, arr3.capacity);

    arr2 ~= 77;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);
    writefln("arr2 len: %s, cap: %s", arr2.length, arr2.capacity);
    writefln("arr3 len: %s, cap: %s", arr3.length, arr3.capacity);
}

it prints out

arr len: 0, cap: 0
arr len: 1, cap: 3
arr len: 2, cap: 3
arr len: 3, cap: 3
arr len: 4, cap: 7
arr len: 5, cap: 7
arr len: 6, cap: 7
arr len: 7, cap: 7
arr len: 8, cap: 15

arr len: 8, cap: 15
arr2 len: 8, cap: 15

arr len: 8, cap: 15
arr2 len: 8, cap: 15
arr3 len: 7, cap: 0

arr len: 8, cap: 0
arr2 len: 9, cap: 15
arr3 len: 7, cap: 0

Notice that only the dynamic arrays which refer to the last used element
within that memory block have a non-zero capacity. So, appending to one of
those won't result in a reallocation so long as the memory block isn't full,
but appending to any of the others _will_ result in a reallocation, and if
you append to a dynamic array without reallocating, any other dynamic arrays
which are slices of the same memory block and which had a non-zero capacity
will no longer have a non-zero capacity, because they no longer refer to the
last, used element in the underlying memory block.

So, if what you're doing is passing around dynamic arrays which are slices
of one another left and right and appending to them willy-nilly, then you're
going to get a lot of reallocations, but if you're just appending to dynamic
arrays which refer to the last element in their memory block, and you don't
append to any of the other dynamic arrays which are slices of that same
block, then you'll get only occasional reallocations.
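
A short sketch of why the reallocation happens in the sliced case (the helper name appendToShorterSliceCopies is made up for illustration): appending to a slice that stops short of the last used element copies the data rather than overwriting what the longer array still sees.

```d
import std.stdio : writeln;

// Hypothetical helper: appends to a slice that ends before the last used
// element and checks that the original array was not overwritten.
bool appendToShorterSliceCopies()
{
    int[] a = [1, 2, 3];
    int[] b = a[0 .. 2]; // ends before the last used element of the block

    if (b.capacity != 0)
        return false; // not expected: b cannot grow in place

    b ~= 42; // reallocates, so a[2] is not stomped

    return b.ptr !is a.ptr && a == [1, 2, 3] && b == [1, 2, 42];
}

void main()
{
    writeln(appendToShorterSliceCopies());
}
```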

But it sounds like what you're trying to do is something like

    auto arr = [99, 45, 33, 22, 19, 46];
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

    arr.length -= 2;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

    arr ~= 99;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

which on my machine results in

arr len: 6, cap: 7
arr len: 4, cap: 0
arr len: 5, cap: 7

It reallocated when 99 was appended, because arr didn't refer to the last used
element in the memory block. To fix that, you use assumeSafeAppend. e.g.

    auto arr = [99, 45, 33, 22, 19, 46];
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

    arr.length -= 2;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

    arr.assumeSafeAppend();
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

    arr ~= 99;
    writefln("arr len: %s, cap: %s", arr.length, arr.capacity);

which prints

arr len: 6, cap: 7
arr len: 4, cap: 0
arr len: 4, cap: 7
arr len: 5, cap: 7

on my machine. It didn't reallocate. Instead, assumeSafeAppend told the GC
that the last element that arr referred to within its memory block was the
last used element in the memory block. So, suddenly, all of the space after
arr was available, and its capacity was non-zero. So, if you're going to be
doing a bunch of adjustments to length to remove elements, then you can use
assumeSafeAppend to then make it so that the GC understands that the
elements after that array are no longer used and that appending can grow
the array into that space, which will significantly reduce the number of
reallocations in the case where you keep removing elements from the end of
the array. _However_, the very large caveat is that if you do this, you
cannot have any other dynamic arrays which refer to any elements past the
end of arr, because if such dynamic arrays do exist, then suddenly, their
values are going to be stomped on by the append operations to arr (and it
might even be that those elements had their destructors called when
assumeSafeAppend was called - I don't know; if so it's that much worse if
you tell it that it's safe to append when it actually isn't).

So, whether you should be using Appender or assumeSafeAppend or neither
depends entirely on what you're doing. However, in general, simply appending
to dynamic arrays does not result in many reallocations (just like it
doesn't result in a lot of reallocations for std::vector or ArrayList). When
reallocations become a problem is when you start slicing a dynamic array so
that you have other dynamic arrays which refer to the same memory, and you
append to those dynamic arrays, or when you reduce the length of an array
and then append to it, because in both of those cases, you're appending to
dynamic arrays which do not refer to the last element in their underlying
memory block.

Hopefully, that makes things at least somewhat clearer.

- Jonathan M Davis

Jul 14 2016

Miguel L <mlabayru gmail.com> writes:

On Thursday, 14 July 2016 at 09:12:50 UTC, Jonathan M Davis wrote:
 So, whether you should be using Appender or assumeSafeAppend or 
 neither depends entirely on what you're doing. However, in 
 general, simply appending to dynamic arrays does not result in 
 many reallocations (just like it doesn't result in a lot of 
 reallocations for std::vector or ArrayList). When reallocations 
 become a problem is when you start slicing a dynamic array so 
 that you have other dynamic arrays which refer to the same 
 memory, and you append to those dynamic arrays, or when you 
 reduce the length of an array and then append to it, because in 
 both of those cases, you're appending to dynamic arrays which 
 do not refer to the last element in their underlying memory 
 block.

 Hopefully, that makes things at least somewhat clearer.

 - Jonathan M Davis

Thank you Jonathan, that really cleared up a lot of things, I 
read the article. But I still have this doubt: is 
assumeSafeAppend() changing a property of the array as "this 
array is never going to be referenced by any other slice, you can 
append or change its length any time and it is never going to be 
reallocated unless it's out of free space"? or it is more like 
"adjust capacity after last operation" so I should be calling it 
whenever I am adjusting length or before appending?

Jul 14 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Thursday, July 14, 2016 09:56:02 Miguel L via Digitalmars-d-learn wrote:
 Thank you Jonathan, that really cleared up a lot of things, I
 read the article. But I still have this doubt: is
 assumeSafeAppend() changing a property of the array as "this
 array is never going to be referenced by any other slice, you can
 append or change its length any time and it is never going to be
 reallocated unless it's out of free space"? or it is more like
 "adjust capacity after last operation" so I should be calling it
 whenever I am adjusting length or before appending?

_All_ that a dynamic array is is

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

All other properties of a dynamic array, such as capacity, come from the GC. So, an
operation like assumeSafeAppend is not doing anything to the array itself
but to the memory block (or rather the metadata associated with the memory
block) that the GC keeps track of. Think of the memory block in the GC as
being something like

struct MemoryBlock(T)
{
    T* start;
    size_t length;
    T* farthestUsed;
}

That's not actually what it looks like, but it should help you understand.
And now let's assume that we somehow have access to this memory block as the
variable memBlock, and you get something like

auto arr = [15, 19, 22, 7, 2];
assert(arr.length == 5);
assert(arr.ptr == memBlock.start);
assert(arr.ptr + arr.length == memBlock.farthestUsed);
assert(arr.capacity == memBlock.length);

If you append to arr, then memBlock.farthestUsed gets adjusted, and you get
something like

arr ~= 42;
assert(arr.length == 6);
assert(arr.ptr == memBlock.start);
assert(arr.ptr + arr.length == memBlock.farthestUsed);
assert(arr.capacity == memBlock.length);

If you then slice arr so that it doesn't refer to the first element in the
memory block, then you'd get something like

arr = arr[1 .. $];
assert(arr.length == 5);
assert(arr.ptr == memBlock.start + 1);
assert(arr.ptr + arr.length == memBlock.farthestUsed);
assert(arr.capacity == memBlock.length - (arr.ptr - memBlock.start));

If you change the array's length so that it has one fewer elements on the
end, then you get something like

--arr.length;
assert(arr.length == 4);
assert(arr.ptr == memBlock.start + 1);
assert(arr.ptr + arr.length + 1 == memBlock.farthestUsed);
assert(arr.capacity == 0);

Because arr.ptr + arr.length != memBlock.farthestUsed, the capacity is now
0. So, appending to arr would then require a reallocation. However, if you
call assumeSafeAppend, then farthestUsed is adjusted, and you get something
like

// Sets memBlock.farthestUsed to arr.ptr + arr.length
arr.assumeSafeAppend();
assert(arr.length == 4);
assert(arr.ptr == memBlock.start + 1);
assert(arr.ptr + arr.length == memBlock.farthestUsed);
assert(arr.capacity == memBlock.length - (arr.ptr - memBlock.start));

So, the main thing that assumeSafeAppend has done is adjust the field that
goes with the memory block which indicates the farthest point in the memory
block that a dynamic array has ever grown to. The GC doesn't try and figure
out which dynamic arrays might currently refer to that memory. It has no
clue and doesn't care. During a collection, it'll check whether anything
points to that memory block and free it if nothing does, but during normal
operations, no attempt is made to determine how many dynamic arrays refer to
a particular memory block or where in that memory block they might point.
All the GC needs to keep track of in order to avoid having arrays stomp on
one another when they grow is the farthest that any dynamic array has grown
into that memory block. And assumeSafeAppend is telling the GC to change
that value to point to the last element in the array that you call it on
rather than wherever it pointed to before.
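
The bookkeeping described above can be mimicked with a toy model. Everything here (ToyBlock, ToySlice, toyCapacity, toyAssumeSafeAppend) is invented for illustration and measured in elements; it is not how druntime actually stores this information, just the same rule written out:

```d
import std.stdio : writeln;

// Toy stand-in for the GC's per-block metadata (not a druntime type).
struct ToyBlock
{
    size_t length;       // total number of elements the block can hold
    size_t farthestUsed; // one past the last element any array has used
}

// Toy stand-in for a dynamic array viewed as an offset into the block.
struct ToySlice
{
    size_t offset; // index of the slice's first element within the block
    size_t length;
}

// Only a slice that ends exactly at farthestUsed may grow in place.
size_t toyCapacity(ToyBlock blk, ToySlice s)
{
    return s.offset + s.length == blk.farthestUsed ? blk.length - s.offset : 0;
}

// Moves the "used" mark back to the end of the given slice.
void toyAssumeSafeAppend(ref ToyBlock blk, ToySlice s)
{
    blk.farthestUsed = s.offset + s.length;
}

void main()
{
    auto blk = ToyBlock(7, 6); // room for 7 elements, 6 in use
    auto arr = ToySlice(1, 5); // like arr[1 .. $] of a 6-element array
    writeln(toyCapacity(blk, arr)); // 6

    arr.length -= 1;                // shrink by one element
    writeln(toyCapacity(blk, arr)); // 0

    toyAssumeSafeAppend(blk, arr);
    writeln(toyCapacity(blk, arr)); // 6
}
```

The model also shows why nothing is stored on the slice itself: toyAssumeSafeAppend only touches the block.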

So, if there actually are any other dynamic arrays referring to the memory
past the end of the array that you called assumeSafeAppend on, then you've
screwed them up, and growing that array will stomp on them, causing bugs.
However, if there really aren't any other dynamic arrays referring to the
memory past the end of that array, then you're fine. Appending won't require
a reallocation until there is no more free memory in the block for the array
to grow into.

However, if you ever change the length of the array again, taking elements
off the end, once again, that array's capacity will be 0, and appending to
it will reallocate.

So, if you want to remove an element from the end of an array and then
append without reallocating, you will need to call assumeSafeAppend after
_every_ time that you reduce the array's length. e.g.

--arr.length;
arr.assumeSafeAppend();

And that will work great as long as there are no other dynamic arrays
referring to the element that was just removed from the array. If you have
something like

auto arr2 = arr;
--arr.length;
arr.assumeSafeAppend();

then that's very bad, because if you then append to arr, then it will be
overwriting the last element in arr2. Additionally, if assumeSafeAppend
results in the destructor being called on the element that was removed from
the array (I'm not sure if it does or doesn't), then accessing the last
element of arr2 without appending to arr would result in an unsafe
operation, because that element would be in an invalid state.

So, shrinking an array and calling assumeSafeAppend to be able to grow the
array without reallocating (until the memory buffer is full anyway) is fine
- but only so long as you make sure that you don't have any other dynamic
arrays referring to the memory past the end of that array.

- Jonathan M Davis

Jul 14 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/14/16 5:56 AM, Miguel L wrote:
 On Thursday, 14 July 2016 at 09:12:50 UTC, Jonathan M Davis wrote:
 So, whether you should be using Appender or assumeSafeAppend or
 neither depends entirely on what you're doing. However, in general,
 simply appending to dynamic arrays does not result in many
 reallocations (just like it doesn't result in a lot of reallocations
 for std::vector or ArrayList). When reallocations become a problem is
 when you start slicing a dynamic array so that you have other dynamic
 arrays which refer to the same memory, and you append to those dynamic
 arrays, or when you reduce the length of an array and then append to
 it, because in both of those cases, you're appending to dynamic arrays
 which do not refer to the last element in their underlying memory block.

 Hopefully, that makes things at least somewhat clearer.

 Thank you Jonathan, that really cleared up a lot of things, I read the
 article. But I still have this doubt: is assumeSafeAppend() changing a
 property of the array as "this array is never going to be referenced by
 any other slice, you can append or change its length any time and it is
 never going to be reallocated unless it's out of free space"? or it is
 more like "adjust capacity after last operation" so I should be calling
 it whenever I am adjusting length or before appending?

No, it's not a permanent adjustment. It simply tells the array runtime 
that the extra elements are no longer used and can be reclaimed.

If you append again, then shrink, you have to call assumeSafeAppend again.

What Jonathan is trying to explain is that the array slice (the int[] 
type) does not store any of this information. It's all stored in the 
runtime. So there's nothing adjusted on the slice itself, just on the 
array runtime "type". This means the slice isn't considered "special" in 
any way.
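
A small sketch of that point (the helper name mustRepeatAssumeSafeAppend is made up for illustration): the capacity drops to 0 after every shrink and only comes back once assumeSafeAppend is called again.

```d
import std.stdio : writeln;

// Hypothetical helper: shrinks and regrows an array several times and
// checks that each shrink needs its own assumeSafeAppend call.
bool mustRepeatAssumeSafeAppend()
{
    int[] arr = [1, 2, 3, 4, 5];
    const start = arr.ptr;
    bool ok = arr.capacity > 0;

    foreach (round; 0 .. 3)
    {
        arr.length -= 1;        // arr no longer ends at the used mark
        ok = ok && arr.capacity == 0;

        arr.assumeSafeAppend(); // fine here: no other slice refers to this block
        ok = ok && arr.capacity > 0;

        arr ~= round;           // grows in place
        ok = ok && arr.ptr is start;
    }
    return ok;
}

void main()
{
    writeln(mustRepeatAssumeSafeAppend());
}
```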

-Steve

Jul 14 2016

ketmar <ketmar ketmar.no-ip.org> writes:

On Wednesday, 13 July 2016 at 11:59:18 UTC, Miguel L wrote:
 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

it really depends on your other code. if you don't have any 
slices of the array, for example, you can use `a.length = 0; 
a.assumeSafeAppend;` -- this will reuse the allocated memory. but 
you should be REALLY sure that no array slices are 
floating around! 'cause you effectively promised the runtime that.

otherwise, `a = [];` and `a = null;` are the same, as `[]` is a 
"null array".

most of the time it is ok to use `a = null;` and let GC do its 
work. it is safe, and you'd better stick to that.

Jul 13 2016

cym13 <cpicard openmailbox.org> writes:

On Wednesday, 13 July 2016 at 11:59:18 UTC, Miguel L wrote:
 I am using a temporary dynamic array inside a loop this way:
 A[] a;
 for(....)
 {
 a=[]; //discard array contents
 ... appends thousand of elements to a
 ... use a for some calculations
 }

 I would like to know which would be the best way to clear a 
 contents avoiding reallocations, as there seems to be lots of 
 garbage collection cycles taking place.

 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

The best option would be a.clear(). From the language specs:

“Removes all remaining keys and values from an associative array. 
The array is not rehashed after removal, to allow for the 
existing storage to be reused. This will affect all references to 
the same instance and is not equivalent to destroy(aa) which only 
sets the current reference to null.”

Jul 13 2016

Lodovico Giaretta <lodovico giaretart.net> writes:

On Wednesday, 13 July 2016 at 12:20:07 UTC, cym13 wrote:
 The best option would be a.clear(). From the language specs:

 “Removes all remaining keys and values from an associative 
 array. The array is not rehashed after removal, to allow for 
 the existing storage to be reused. This will affect all 
 references to the same instance and is not equivalent to 
 destroy(aa) which only sets the current reference to null.”

I don't think OP is using associative arrays, but dynamic arrays 
(if I understood correctly).

Jul 13 2016

cym13 <cpicard openmailbox.org> writes:

On Wednesday, 13 July 2016 at 12:22:55 UTC, Lodovico Giaretta 
wrote:
 On Wednesday, 13 July 2016 at 12:20:07 UTC, cym13 wrote:
 The best option would be a.clear(). From the language specs:

 “Removes all remaining keys and values from an associative 
 array. The array is not rehashed after removal, to allow for 
 the existing storage to be reused. This will affect all 
 references to the same instance and is not equivalent to 
 destroy(aa) which only sets the current reference to null.”

 I don't think OP is using associative arrays, but dynamic 
 arrays (if I understood correctly).

You're right, my bad, I read too fast :/

Jul 13 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Wednesday, July 13, 2016 11:59:18 Miguel L via Digitalmars-d-learn wrote:
 I am using a temporary dynamic array inside a loop this way:
 A[] a;
 for(....)
 {
 a=[]; //discard array contents
 ... appends thousand of elements to a
 ... use a for some calculations
 }

 I would like to know which would be the best way to clear a
 contents avoiding reallocations, as there seems to be lots of
 garbage collection cycles taking place.

 The options would be:

 a=[];
 a.length=0;
 a=null;
 ...
 any other?

 Can you help me please?

a = [];

and

a = null;

both set the .ptr property of an array to null, and the length to 0. Setting
the length to 0, just sets the length to 0. Regardless, appending after any
of those operations is going to result in allocating memory, because the
dynamic array has no unused memory to expand into. The GC determines whether
it can append to a dynamic array without allocating based on whether that
dynamic array's last element is the last element in the block of memory that
the dynamic array refers to which has been used by any dynamic array. It
does not keep track of how many arrays refer to the same memory block or
where in the memory block they refer to. So, if the dynamic array that
you're trying to append to does not refer to the last element in that block
of memory which has been used, then the GC has to assume that another
dynamic array might refer to it. So, it won't expand into that memory and
will instead reallocate. And because of that mechanic, in general, trying to
"clear" a dynamic array doesn't work.

However, if you are certain that there are no other dynamic arrays referring
to the same memory, then you can tell the GC that by using assumeSafeAppend.
e.g.

a.length = 0;
a.assumeSafeAppend();

And then the GC will think that the last element in a (which would be no
element in this case) is the last used point in the block of memory pointed
to by the dynamic array, and so it won't reallocate when you append. The big
caveat here, of course, is that you have to be sure that there are no other
dynamic arrays referring to the memory after a, or you'll be stomping on
their memory when you append to a. But as long as you're sure that no other
dynamic arrays refer to that memory, then you're fine.
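
Applied to the loop from the original question, that might look like the following sketch. The int element type, the helper name loopReusesAllocation, and the up-front reserve are assumptions added for illustration, and it relies on no other slice of the buffer being kept between iterations:

```d
import std.stdio : writeln;

// Hypothetical helper: runs the clear-and-refill loop and reports whether
// every iteration ended up in the same memory block.
bool loopReusesAllocation(size_t rounds, size_t count)
{
    int[] a;
    a.reserve(count); // optional: sized from an estimate of the appends
    const start = a.ptr;

    foreach (r; 0 .. rounds)
    {
        a.length = 0;         // keeps the pointer, drops the elements
        a.assumeSafeAppend(); // promise: nothing else refers to this block

        foreach (i; 0 .. count)
            a ~= cast(int) i;

        // ... use a for some calculations ...
        if (a.ptr !is start || a.length != count)
            return false;
    }
    return true;
}

void main()
{
    writeln(loopReusesAllocation(5, 1000));
}
```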

If you want an array type where you can clear out its elements and affect
all other references to that array as well, then the built-in dynamic arrays 
won't cut it, and you'll need to use something like std.container.Array.

If you haven't yet, I would advise reading

http://dlang.org/d-array-article.html

since it goes into detail on how D's dynamic arrays work - though it uses
the wrong terminology and refers to the GC-allocated buffer that the dynamic
array points to as if it were the dynamic array, and calls the dynamic array
a slice, whereas the official terminology is that T[] is a dynamic array (no
matter what memory it refers to), and if it's non-null, then it's a slice of
whatever memory it points to. However, the memory that it points to is just
the memory that it points to. It has no special name, even if it's
GC-allocated. So, even if T[] is a slice of malloc-ed memory or of a static
array, it's still a dynamic array (though in that case, appending to it will
always reallocate, since the GC can't grow a dynamic array unless it points
to memory allocated by the GC for dynamic arrays). But in spite of the
slight terminology problem, it's a great article, and a must-read for anyone
serious about D.

- Jonathan M Davis

Jul 13 2016
