digitalmars.D.learn - Bulk allocation and partial deallocation for tree data structures.

Filip Bystricky (15/15) Jun 29 2017 Hello!

Filip Bystricky (1/1) Jul 02 2017 Anyone?

Stefan Koch (4/5) Jul 02 2017 The answer is no.

=?UTF-8?Q?Ali_=c3=87ehreli?= (4/9) Jul 03 2017 Would it be possible to write a custom std.experimental.allocator that

Moritz Maxeiner (16/27) Jul 03 2017 Since the `deallocate` function an Allocator has to implement

Filip Bystricky (14/43) Jul 03 2017 Thanks for the responses!

Filip Bystricky (7/7) Jul 03 2017 Oh and I forgot to mention: another use-case for this would be

Moritz Maxeiner (10/18) Jul 03 2017 Not sure I understand you here: If an instance of such a manual

Moritz Maxeiner (25/36) Jul 03 2017 Then you must adjust your acceptance parameters, because:

Filip Bystricky (34/91) Jul 06 2017 You're right, this would only work for a layer on top of the

Filip Bystricky <filip_bystricky brown.edu> writes:

Hello!

I'm implementing a persistent hash trie (like in clojure/scala). 
Every 'persisting' insertion involves allocating a fixed number 
(6) of nodes (each chunk is a fixed width ranging between 1 and 
~33 words).

Basically, this data structure always allocates a whole branch at 
a time, but then nodes are deallocated individually.

Is there a way to tell an allocator allocate n chunks at a time? 
Or, alternatively, is there a way to allocate all the memory 
needed at once, and then free just chunks of it at a time? It 
seems like this would provide at least some speed improvement. 
And it could also be useful for batch operations on other 
node-based data structures (such as adding ranges of nodes to a 
graph at a time).

Thanks!

Jun 29 2017

Filip Bystricky <filip_bystricky brown.edu> writes:

Anyone?

Jul 02 2017

Stefan Koch <uplink.coder googlemail.com> writes:

On Monday, 3 July 2017 at 02:51:49 UTC, Filip Bystricky wrote:
 Anyone?

The answer is no.

Partial deallocation in an arbitrary fashion is not advisable.

And there are no standard library mechanisms for it.

Jul 02 2017

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 07/02/2017 07:56 PM, Stefan Koch wrote:
 On Monday, 3 July 2017 at 02:51:49 UTC, Filip Bystricky wrote:
 Anyone?

 The answer is no.

 Partial deallocation in an arbitrary fashion is not advisable.

 And there are no standard library mechanisms for it.

Would it be possible to write a custom std.experimental.allocator that 
does this?

Ali

Jul 03 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Monday, 3 July 2017 at 17:06:10 UTC, Ali Çehreli wrote:
 On 07/02/2017 07:56 PM, Stefan Koch wrote:
 On Monday, 3 July 2017 at 02:51:49 UTC, Filip Bystricky wrote:
 Anyone?

 The answer is no.

 Partial deallocation in an arbitrary fashion is not advisable.

 And there are no standard library mechanisms for it.

 Would it be possible to write a custom 
 std.experimental.allocator that does this?

Since the `deallocate` function an Allocator has to implement 
takes a `void[]` (-> known size), it's possible:
At deallocation the Allocator will have to check if the given 
memory subblock lies within any of the previously allocated 
memory blocks; if it does, it will have to "tear" the 
to-be-deallocated memory subblock from within it, storing the 
resulting prefix and suffix memory subblocks (both remain 
allocated) and removing the original block. It will still have to 
track those two subblocks as belonging together, though, since it 
can only deallocate the original memory block back to the parent 
allocator (another Allocator, or something else) once both of 
these subblocks are no longer allocated. Obviously this applies 
to further partial deallocations of subblocks within subblocks.
I tend to agree with Stefan that this is likely not sensible, but 
if in doubt, benchmark for the specific use case.

Jul 03 2017

Filip Bystricky <filip_bystricky brown.edu> writes:

On Tuesday, 4 July 2017 at 01:56:11 UTC, Moritz Maxeiner wrote:
 On Monday, 3 July 2017 at 17:06:10 UTC, Ali Çehreli wrote:
 On 07/02/2017 07:56 PM, Stefan Koch wrote:
 On Monday, 3 July 2017 at 02:51:49 UTC, Filip Bystricky wrote:
 Anyone?

 The answer is no.

 Partial deallocation in an arbitrary fashion is not advisable.

 And there are no standard library mechanisms for it.

 Would it be possible to write a custom 
 std.experimental.allocator that does this?

 Since the `deallocate` function an Allocator has to implement 
 takes a `void[]` (-> known size), it's possible:
 At deallocation the Allocator will have to check if the given 
 memory subblock lies within any of the previously allocated 
 memory blocks; if it does, it will have to "tear" the 
 to-be-deallocated memory subblock from within it, storing the 
 resulting prefix and suffix memory subblocks (both remain 
 allocated) and removing the original block. It will still have 
 to track those two subblocks as belonging together, though, 
 since it can only deallocate the original memory block back to 
 the parent allocator (another Allocator, or something else) 
 once both of these subblocks are no longer allocated. Obviously 
 this applies to further partial deallocations of subblocks 
 within subblocks.
 I tend to agree with Stefan that this is likely not sensible, 
 but if in doubt, benchmark for the specific use case.

Thanks for the responses!

Moritz, that is a good suggestion. However, in many cases it is 
unacceptable to have to prevent the whole block from being freed 
(especially if the memory is managed by a compacting gc). I think 
the feature would work better as an additional allocator 
primitive, which should be easy to implement for most allocators 
that support reallocate (e.g. bitmapped-block and free-list 
allocators).

In the meantime I realized that since a child node can only be 
freed after its parent is freed, I can allocate all 6 nodes in a 
single allocation, but in reverse order, (putting the leaf at the 
front and the root at the tail). That way, to free an ancestor, 
we just realloc the corresponding leaf's memory.

Jul 03 2017

Filip Bystricky <filip_bystricky brown.edu> writes:

Oh and I forgot to mention: another use-case for this would be 
for arrays. For manually managed arrays like std.container.array, 
it would make it possible to transfer ownership of individual 
objects from the array back to the program after the array goes 
out of scope. For gc slices, it could enable some gc 
implementations to deallocate parts of an array even if there are 
still references pointing inside that array.

Jul 03 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 4 July 2017 at 03:13:14 UTC, Filip Bystricky wrote:
 Oh and I forgot to mention: another use-case for this would be 
 for arrays. For manually managed arrays like 
 std.container.array, it would make it possible to transfer 
 ownership of individual objects from the array back to the 
 program after the array goes out of scope.

Not sure I understand you here: If an instance of such a manual 
array implementation goes out of scope it must destruct (if they 
are objects and not primitives) and deallocate its elements. 
There is no ownership transfer going on here (and who would be 
the target, anyway?).

 For gc slices, it could enable some gc implementations to 
 deallocate parts of an array even if there are still references 
 pointing inside that array.

I'm fairly certain the necessary bookkeeping logic for partial 
deallocations will outweigh any gain from it. In the case of such 
gc slices, I would rather just memcpy to a new, smaller block and 
update pointers to it (-> a moving GC).

Jul 03 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 4 July 2017 at 02:53:00 UTC, Filip Bystricky wrote:
 On Tuesday, 4 July 2017 at 01:56:11 UTC, Moritz Maxeiner wrote:
 [...]

 However, in many cases it is unacceptable to have to prevent 
 the whole block from being freed (especially if the memory is 
 managed by a compacting gc).

Then you must adjust your acceptance parameters, because:
Eventually (down the parent Allocator call chain) you will have 
to get the memory from the operating system, and AFAIK they only 
support deallocating via pointers to the beginning of a 
previously allocated block; let PartiallyDeallocatableAllocator 
(PDA) be our Allocator implementation supporting partial 
deallocations via such a parent Allocator (Parent) that can only 
(de)allocate in whole blocks.
That means even if Parent collects garbage it will not be able to 
allocate memory from within a block previously allocated (by PDA 
calling Parent's allocate) until that (whole) block has been been 
deallocated in Parent, either explicitly (by PDA), or implicitly 
by being collected.
And since PDA will want to track deallocated subblocks and hand 
them out in its own `allocate`, bypassing Parent when feasible 
(otherwise what's the point of supporting partial deallocations 
if you can't reuse the freed memory), it will have to have 
pointers to these subblocks, i.e. even if Parent collects 
garbage, PDA will block the collection, anyway (live pointers 
into the block).

 I think the feature would work better as an additional 
 allocator primitive, which should be easy to implement for most 
 allocators that support reallocate (e.g. bitmapped-block and 
 free-list allocators).

I don't see how reallocate helps here, as you can only 
grow/shrink towards the end of a block, not the beginning (so you 
can only do mimick partial deallocations at the end of a block).

 In the meantime I realized that since a child node can only 
 [...]

Good to see that you found an easier solution :)

Jul 03 2017

Filip Bystricky <filip_bystricky brown.edu> writes:

On Tuesday, 4 July 2017 at 03:52:39 UTC, Moritz Maxeiner wrote:

First of all thank you for your responses!

 On Tuesday, 4 July 2017 at 02:53:00 UTC, Filip Bystricky wrote:
 On Tuesday, 4 July 2017 at 01:56:11 UTC, Moritz Maxeiner wrote:
 [...]

 However, in many cases it is unacceptable to have to prevent 
 the whole block from being freed (especially if the memory is 
 managed by a compacting gc).

 Then you must adjust your acceptance parameters, because:
 Eventually (down the parent Allocator call chain) you will have 
 to get the memory from the operating system, and AFAIK they 
 only support deallocating via pointers to the beginning of a 
 previously allocated block;

You're right, this would only work for a layer on top of the 
system allocator, such as regions.

 let PartiallyDeallocatableAllocator (PDA) be our Allocator 
 implementation supporting partial deallocations via such a 
 parent Allocator (Parent) that can only (de)allocate in whole 
 blocks.
 That means even if Parent collects garbage it will not be able 
 to allocate memory from within a block previously allocated (by 
 PDA calling Parent's allocate) until that (whole) block has 
 been been deallocated in Parent, either explicitly (by PDA), or 
 implicitly by being collected.
 And since PDA will want to track deallocated subblocks and hand 
 them out in its own `allocate`, bypassing Parent when feasible 
 (otherwise what's the point of supporting partial deallocations 
 if you can't reuse the freed memory), it will have to have 
 pointers to these subblocks, i.e. even if Parent collects 
 garbage, PDA will block the collection, anyway (live pointers 
 into the block).

You're right, but I think it's sufficient to allow the PDA to 
hold on to the whole block, provided it can re-allocate chunks of 
partially deallocated blocks.
 I think the feature would work better as an additional 
 allocator primitive, which should be easy to implement for 
 most allocators that support reallocate (e.g. bitmapped-block 
 and free-list allocators).

 I don't see how reallocate helps here, as you can only 
 grow/shrink towards the end of a block, not the beginning (so 
 you can only do mimick partial deallocations at the end of a 
 block).

I meant rather that I think that what makes it possible for 
allocators to implement reallocate should usually make it 
possible to implement partialDeallocate. For example, a free-list 
allocator that tracks blocks by pointer and length will probably 
realloc by splitting a block into two. Well, you could split a 
block in similar ways to partially deallocate it. Another example 
would be a bitmapped block allocator: just clear all the bits of 
the blocks you want to deallocate (in this case a block is a 
fixed-length chunk, and the allocation comprises several blocks).

You're right, realloc doesn't help, and nor do any of the other 
primitives. This is why it would make more sense as its own 
primitive.

 In the meantime I realized that since a child node can only 
 [...]

 Good to see that you found an easier solution :)


On Tuesday, 4 July 2017 at 04:11:11 UTC, Moritz Maxeiner wrote:
 On Tuesday, 4 July 2017 at 03:13:14 UTC, Filip Bystricky wrote:
 Oh and I forgot to mention: another use-case for this would be 
 for arrays. For manually managed arrays like 
 std.container.array, it would make it possible to transfer 
 ownership of individual objects from the array back to the 
 program after the array goes out of scope.

 Not sure I understand you here: If an instance of such a manual 
 array implementation goes out of scope it must destruct (if 
 they are objects and not primitives) and deallocate its 
 elements. There is no ownership transfer going on here (and who 
 would be the target, anyway?).

I was thinking of a Range of some large value type T, that stores 
values in a T[], but that can give ownership of an elements by 
splitting the backing array into two parts, before the element 
and after it, then returning a Unique!T pointer to the element. 
Then whether the range or the Unique!T goes out of scope first, 
their corresponding "owned" memory will be deallocated.
But I realized that in this case it would probably be less 
overhead to just store the values as a (Unique!T)[].

 For gc slices, it could enable some gc implementations to 
 deallocate parts of an array even if there are still 
 references pointing inside that array.

 I'm fairly certain the necessary bookkeeping logic for partial 
 deallocations will outweigh any gain from it. In the case of 
 such gc slices, I would rather just memcpy to a new, smaller 
 block and update pointers to it (-> a moving GC).

But sometimes you have a huge chunk of memory and want to free 
half of it, but don't want to copy the other half (or maybe you 
don't have enough RAM to do so). One operation is O(n) (n = size 
of the new, smaller block), the other is O(m) (m = the number of 
chunks you're splitting the block into). That's why realloc 
exists, and we don't just always memcpy to a new, smaller array.

Jul 06 2017

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Bulk allocation and partial deallocation for tree data structures.