digitalmars.D - Strange behavior on shrinking a Dynamic Array -- Capacity reduces to 0
- d coder (36/36) Jun 22 2011 Hello List
- Robert Jacques (10/21) Jun 22 2011 Yes. Capacity is only non-zero for arrays which can safely be extended. ...
- Ali =?iso-8859-1?q?=C7ehreli?= (4/30) Jun 22 2011 And an article to go with that: :)
- d coder (7/16) Jun 22 2011 Thanks All
- Steven Schveighoffer (7/28) Jun 23 2011 David Simcha has removed Appender from std.parallelism, due to a perceiv...
- dsimcha (11/40) Jun 23 2011 I don't either. It was one of those things where I just didn't know wha...
- Steven Schveighoffer (34/66) Jun 22 2011 A slice capacity of 0 does *not* indicate that 0 bytes are stored in the...
Hello List As per TDPL and D documentation, shrinking a dynamic array should not relocate it. But that is exactly what is happening. On shrinking an array, its capacity is getting reduced to 0. Surprisingly the capacity after a shrink is even less than the length of the array. As a result the array gets relocated as soon as another element is added to it. I have to call reserve again after shrinking it to make sure that the array is not relocated. Is this the expected behavior? Regards - Puneet P.S. Please compile the following code to see yourself. import std.stdio; void main() { uint [] arr; reserve(arr, 2048); writeln("After reservation:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr)); // Grow the array for (size_t i = 0; i < 64; ++i) arr ~= i; writeln("After appending:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr)); arr.length = 32; // reserve(arr, 2048); // does not relocate if uncommented writeln("After setting length:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr)); // grow the array again for (size_t i = 0; i < 32; ++i) arr ~= i; writeln("After appending again:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr)); }
Jun 22 2011
On Wed, 22 Jun 2011 23:38:30 -0400, d coder <dlang.coder gmail.com> wrote:Hello List As per TDPL and D documentation, shrinking a dynamic array should not relocate it. But that is exactly what is happening. On shrinking an array, its capacity is getting reduced to 0. Surprisingly the capacity after a shrink is even less than the length of the array. As a result the array gets relocated as soon as another element is added to it. I have to call reserve again after shrinking it to make sure that the array is not relocated. Is this the expected behavior?Yes. Capacity is only non-zero for arrays which can safely be extended. If you shrink an array from length 10 to length 5, the GC is smart enough to know that that there may be a dangling reference to elements [5..10] and thus if the length 5 array was appended to, it might stomp on someone else's data. Long ago, the GC was dumber and it caused a massive hole in the const/immutable system. If you know that no other aliases exist (i.e. you're buffering, etc) you can use the function assumeSafeAppend to reset the capacity. However, I'd strongly recommend switching to Appender whenever you're tempted to use assumeSafeAppend.
Jun 22 2011
On Wed, 22 Jun 2011 23:55:23 -0400, Robert Jacques wrote:On Wed, 22 Jun 2011 23:38:30 -0400, d coder <dlang.coder gmail.com> wrote:And an article to go with that: :) http://www.dsource.org/projects/dcollections/wiki/ArrayArticle AliHello List As per TDPL and D documentation, shrinking a dynamic array should not relocate it. But that is exactly what is happening. On shrinking an array, its capacity is getting reduced to 0. Surprisingly the capacity after a shrink is even less than the length of the array. As a result the array gets relocated as soon as another element is added to it. I have to call reserve again after shrinking it to make sure that the array is not relocated. Is this the expected behavior?Yes. Capacity is only non-zero for arrays which can safely be extended. If you shrink an array from length 10 to length 5, the GC is smart enough to know that that there may be a dangling reference to elements [5..10] and thus if the length 5 array was appended to, it might stomp on someone else's data. Long ago, the GC was dumber and it caused a massive hole in the const/immutable system. If you know that no other aliases exist (i.e. you're buffering, etc) you can use the function assumeSafeAppend to reset the capacity. However, I'd strongly recommend switching to Appender whenever you're tempted to use assumeSafeAppend.
Jun 22 2011
Yes. Capacity is only non-zero for arrays which can safely be extended. If you shrink an array from length 10 to length 5, the GC is smart enough to know that that there may be a dangling reference to elements [5..10] and thus if the length 5 array was appended to, it might stomp on someone else's data. Long ago, the GC was dumber and it caused a massive hole in the const/immutable system. If you know that no other aliases exist (i.e. you're buffering, etc) you can use the function assumeSafeAppend to reset the capacity. However, I'd strongly recommend switching to Appender whenever you're tempted to use assumeSafeAppend.Thanks All I would give Appender a try. Any idea if it is safe to use this struct in multithreaded environment. Would it be OK to add elements to it from multiple threads -- of course I would be using synchronized code to make sure that only one element gets added at a time. Any other precautions? Regards - Puneet
Jun 22 2011
On Thu, 23 Jun 2011 00:15:12 -0400, d coder <dlang.coder gmail.com> wrote:David Simcha has removed Appender from std.parallelism, due to a perceived failure during multithreaded appends (even with locking). I don't see why it would happen, but you may want to be cautious. See this commit: https://github.com/D-Programming-Language/phobos/commit/5a3761baac147ff43f4dc2ce264a18ad5e4330bd#std/parallelism.d -SteveYes. Capacity is only non-zero for arrays which can safely be extended. If you shrink an array from length 10 to length 5, the GC is smart enough to know that that there may be a dangling reference to elements [5..10] and thus if the length 5 array was appended to, it might stomp on someone else's data. Long ago, the GC was dumber and it caused a massive hole in the const/immutable system. If you know that no other aliases exist (i.e. you're buffering, etc) you can use the function assumeSafeAppend to reset the capacity. However, I'd strongly recommend switching to Appender whenever you're tempted to use assumeSafeAppend.Thanks All I would give Appender a try. Any idea if it is safe to use this struct in multithreaded environment. Would it be OK to add elements to it from multiple threads -- of course I would be using synchronized code to make sure that only one element gets added at a time. Any other precautions?
Jun 23 2011
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Thu, 23 Jun 2011 00:15:12 -0400, d coder <dlang.coder gmail.com> wrote:https://github.com/D-Programming-Language/phobos/commit/5a3761baac147ff43f4dc2ce264a18ad5e4330bd#std/parallelism.dDavid Simcha has removed Appender from std.parallelism, due to a perceived failure during multithreaded appends (even with locking). I don't see why it would happen, but you may want to be cautious. See this commit:Yes. Capacity is only non-zero for arrays which can safely be extended. If you shrink an array from length 10 to length 5, the GC is smart enough to know that that there may be a dangling reference to elements [5..10] and thus if the length 5 array was appended to, it might stomp on someone else's data. Long ago, the GC was dumber and it caused a massive hole in the const/immutable system. If you know that no other aliases exist (i.e. you're buffering, etc) you can use the function assumeSafeAppend to reset the capacity. However, I'd strongly recommend switching to Appender whenever you're tempted to use assumeSafeAppend.Thanks All I would give Appender a try. Any idea if it is safe to use this struct in multithreaded environment. Would it be OK to add elements to it from multiple threads -- of course I would be using synchronized code to make sure that only one element gets added at a time. Any other precautions?-SteveI don't either. It was one of those things where I just didn't know what else to try. I'm not sure if removing Appender did anything. It seems to have removed one manifestation of the issues with std.parallelism, but occasionally it still segfaults on FreeBSD64. The bug appears to be in initialization code. Running the unittest in a loop 10,000 times by putting a for loop around the unittest code passes. Running the unittest several times by restarting the executable eventually fails. IIRC, though, running in a for loop used to fail before I took out the Appender. I can't reproduce any of this stuff anywhere except the FreeBSD64 auto tester.
Jun 23 2011
On Wed, 22 Jun 2011 23:38:30 -0400, d coder <dlang.coder gmail.com> wrote:Hello List As per TDPL and D documentation, shrinking a dynamic array should not relocate it. But that is exactly what is happening. On shrinking an array, its capacity is getting reduced to 0. Surprisingly the capacity after a shrink is even less than the length of the array. As a result the array gets relocated as soon as another element is added to it.A slice capacity of 0 does *not* indicate that 0 bytes are stored in the slice, it indicates that appending to the slice will reallocate into a new block. It's a special token meaning "any append to this slice will reallocate". There have been a few people that suggested capacity should return *at least* the number of bytes in the slice. However, the zero indicator can be useful. It's easier to check for, generates more efficient code when checking, plus the distinction between "not appendable" and "appendable, but the current block is full" can be very important. We would lose this distinction if the default return was the length of the slice.I have to call reserve again after shrinking it to make sure that the array is not relocated.In fact, it is relocated. Reserve will not overwrite data that is already valid, so it will reallocate the slice into a new array that can hold the requested size. There is a function that allows you to reestablish appendability without reallocating: assumeSafeAppend(). Call this on a slice, and it will become appendable again (non-zero capacity) at its current location. However, be cautious as this will *overwrite* any data that was valid after the slice. This function has no effect on slices of non-heap data.import std.stdio; void main() { uint [] arr; reserve(arr, 2048);The above line will allocate an array, and set arr to the first 2048 elements.writeln("After reservation:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr)); // Grow the array for (size_t i = 0; i < 64; ++i) arr ~= i; writeln("After appending:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr));I expect arr.ptr to be unchanged, arr.length to be 64, and capacity to be unchanged.arr.length = 32;Note that even though you have effectively cut off 32 elements from your slice that latter data is *still valid*. Another slice could be referencing that data, which is why arr is no longer appendable.// reserve(arr, 2048); // does not relocate if uncommented writeln("After setting length:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr));If you leave reserve commented out, I'd expect arr.ptr to remain unchanged, length to be 32, and capacity to be 0 (not appendable) If you uncomment reserve, then arr.ptr will have moved, length will be 32, and capacity will be >= 2048.// grow the array again for (size_t i = 0; i < 32; ++i) arr ~= i; writeln("After appending again:"); writefln("arr is at: %10s, length: %6s, capacity %6s", arr.ptr, arr.length, capacity(arr));Regardless of reserve, arr.ptr will have changed from it's original allocation, arr.length == 64 and capacity will vary depending on whether reserve was called or not. -Steve
Jun 22 2011