www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - GC and slices

reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
This is a very short question regarding the liberties of the GC. Is it 
safe to rely on data outside a slice?

For example, given:

int* a = new int[100_000] + 50_000;
std.gc.fullCollect();

I'd still assume both a[-50_000] and a[49_999] to remain valid.

But does the same hold for slices?

int[] a = (new int[100_000])[50_000..50_010];
std.gc.fullCollect();

Is the memory outside the slice 'a' still guaranteed to be valid, or is 
a compacting GC allowed to remove the unreferenced areas outside the slice?

/Oskar
Sep 14 2006
next sibling parent Sean Kelly <sean f4.ca> writes:
Oskar Linde wrote:
 This is a very short question regarding the liberties of the GC. Is it 
 safe to rely on data outside a slice?
 
 For example, given:
 
 int* a = new int[100_000] + 50_000;
 std.gc.fullCollect();
 
 I'd still assume both a[-50_000] and a[49_999] to remain valid.
 
 But does the same hold for slices?

Yes it does. I imagine it might be possible to create an allocator that shrinks arrays which only have partial slice references to them, but it seems far too complicated to be worthwhile.
 int[] a = (new int[100_000])[50_000..50_010];
 std.gc.fullCollect();
 
 Is the memory outside the slice 'a' still guaranteed to be valid, or is 
 a compacting GC allowed to remove the unreferenced areas outside the slice?

It might be useful to add formal language stating that the entire block will remain valid, but it's a safe assumption to operate under either way. I'll try to add something to this effect in the docs for Ares' std.memory module. Sean
Sep 14 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Oskar Linde wrote:
 This is a very short question regarding the liberties of the GC. Is it 
 safe to rely on data outside a slice?

Yes, because all a slice is is an interior pointer, and interior pointers hold the entire allocated chunk.
Sep 20 2006
parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Walter Bright wrote:
 Oskar Linde wrote:
 This is a very short question regarding the liberties of the GC. Is it 
 safe to rely on data outside a slice?

Yes, because all a slice is is an interior pointer, and interior pointers hold the entire allocated chunk.

Are you guys sure?! I was using a void[] as a fifo queue, appending new items to the list with ~= and removing processed items with list = list[1..$] and was wondering if the memory at the beginning of the list.ptr was ever being freed, so I made this small test program: #import std.stdio; #long[] queue = [123,1234,1233,435,7654,54,3241]; #void main() { # const size_t MASK = 1024*1024*16-1; # size_t size; # while (1) { # queue ~= size; // anything # size += long.sizeof; # long* t = queue.ptr; # queue = queue[1..$]; # assert( queue[0] != 123 ); // popped? # assert( queue.length == 7 ); # assert( (t+1) == queue.ptr ); // moved? # if ((size&MASK) == 0) // stats # writefln(size); # } #} and it seems it IS being freed! The (virtual) memory is not growing! How can this be? Is the GC smarter than we think? L.
Sep 21 2006
parent reply Sean Kelly <sean f4.ca> writes:
Lionello Lunesu wrote:
 Walter Bright wrote:
 Oskar Linde wrote:
 This is a very short question regarding the liberties of the GC. Is 
 it safe to rely on data outside a slice?

Yes, because all a slice is is an interior pointer, and interior pointers hold the entire allocated chunk.

Are you guys sure?! I was using a void[] as a fifo queue, appending new items to the list with ~= and removing processed items with list = list[1..$] and was wondering if the memory at the beginning of the list.ptr was ever being freed, so I made this small test program: #import std.stdio; #long[] queue = [123,1234,1233,435,7654,54,3241]; #void main() { # const size_t MASK = 1024*1024*16-1; # size_t size; # while (1) { # queue ~= size; // anything # size += long.sizeof; # long* t = queue.ptr; # queue = queue[1..$]; # assert( queue[0] != 123 ); // popped? # assert( queue.length == 7 ); # assert( (t+1) == queue.ptr ); // moved? # if ((size&MASK) == 0) // stats # writefln(size); # } #} and it seems it IS being freed! The (virtual) memory is not growing!

When you try to append to a slice a reallocation occurs--growing an array in place is only possible if your array reference refers to the head of the memory block. Also, a rellocation will occur on an append periodically as the as the block grows. Up to 4096 bytes (the size of a memory page) reallocations will occur when the array passes a power of two boundary--16, 32, 64, etc--beyond 4096 bytes a reallocation will occur when the array passes a page boundary. This has to do with how the GC manages memory, and it is a pretty typical allocator design. Sean
Sep 21 2006
parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
 When you try to append to a slice a reallocation occurs--growing an 
 array in place is only possible if your array reference refers to the 
 head of the memory block.  Also, a rellocation will occur on an append 
 periodically as the as the block grows.  Up to 4096 bytes (the size of a 
 memory page) reallocations will occur when the array passes a power of 
 two boundary--16, 32, 64, etc--beyond 4096 bytes a reallocation will 
 occur when the array passes a page boundary.  This has to do with how 
 the GC manages memory, and it is a pretty typical allocator design.

Hi Sean, Thanks for the explanation.. I was thinking about it some more and indeed forgot to test the pointer value after the append :S Pretty stupid of me. But a thought occured. Reallocation often involves a copy, but why should that copy be necessary? I mean, the CPU (286+ anyway) has this mapping table from physical to virtual memory. Wouldn't it be possible to allocate a piece of virtual memory, but to let the 'old' pages refer to the old physical memory so no copy is needed? Is this possible in Windows/Linux? (In windows there's VirtualAlloc, which can be used to grow an existing virtual memory block but I pretty sure that there's a memory copy if that block can not be grown when a new block of virtual address space is reserved.) L.
Sep 21 2006
parent Sean Kelly <sean f4.ca> writes:
Lionello Lunesu wrote:
 When you try to append to a slice a reallocation occurs--growing an 
 array in place is only possible if your array reference refers to the 
 head of the memory block.  Also, a rellocation will occur on an append 
 periodically as the as the block grows.  Up to 4096 bytes (the size of 
 a memory page) reallocations will occur when the array passes a power 
 of two boundary--16, 32, 64, etc--beyond 4096 bytes a reallocation 
 will occur when the array passes a page boundary.  This has to do with 
 how the GC manages memory, and it is a pretty typical allocator design.

Hi Sean, Thanks for the explanation.. I was thinking about it some more and indeed forgot to test the pointer value after the append :S Pretty stupid of me. But a thought occured. Reallocation often involves a copy, but why should that copy be necessary? I mean, the CPU (286+ anyway) has this mapping table from physical to virtual memory. Wouldn't it be possible to allocate a piece of virtual memory, but to let the 'old' pages refer to the old physical memory so no copy is needed? Is this possible in Windows/Linux?

Hrm... Assuming the array was already at least 2048 bytes in length (so it would be living on its own page) and the page(s) allocated could be mapped contiguously with the page the array is on then this could work as you described. Currently however, the GC code isn't quite this sophisticated. If the append needs additional memory then an allocation/copy occurs regardless of array size or location. Also, the internal GC.realloc routine only works if you pass a pointer to the head of a memory block, not a slice. I'll look into whether it would be worthwhile to fix realloc to work regardless. From there, someone could extend realloc further to grow large arrays in place if possible. I'm not sure it's worth the effort, personally, though I suppose it could be if an application frequently appends to huge arrays.
 (In windows there's VirtualAlloc, which can be used to grow an existing 
 virtual memory block but I pretty sure that there's a memory copy if 
 that block can not be grown when a new block of virtual address space is 
 reserved.)

I think the allocation and copy would probably occur separately and manually within the GC code instead of relying on VirtualAlloc to take care of it. Sean
Sep 22 2006