www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - BigInt -- a challenge for std.allocator

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hello all,

First up -- please understand that this is written by someone whose practical 
experience of memory allocation is limited to either malloc/free or new/delete, 
and who is used to programming in scenarios where typically the amount of
memory 
required can be allocated at the start of a program and freed at the end -- so, 
the rationale or use-cases of most of the content in std.allocator is not 
obvious to me.  So, in writing this, I'm looking to be educated ... :-)

As part of the work I've been doing towards getting David Simcha's std.rational 
module into Phobos, I've been looking inside std.bigint.  One aspect of how it 
works is its memory management: it has an internally-allocated data store (at 
its core, an array of words of some appropriate size, typically the same as the 
machine word size) which is stored by reference inside the user-facing object.

By default, when you copy a BigInt, this data is not duplicated -- the
reference 
is copied instead, and BigInt copies only on write.  So:

     BigInt m = 100;
     BigInt n = m;       // just copies the reference
     m = 50;             // creates a new data store and writes to it

However, this has the unfortunate effect that it copies on write _regardless of 
whether multiple BigInts are pointing to the same data_.  So, you can do 
something like:

     BigInt n = 100;
     n += 10;
     ++n;

... and both the 2nd and 3rd lines will result in a reallocation even though 
there is no need for it, because only n is referencing the memory it wraps.

So, a better approach would be for BigInt write to ask,

     if (/* I'm the only person referencing this memory */)
     {
         // just write straight to the memory
     }
     else
     {
         // allocate new memory and write to it
     }

But, how would one go about implementing that using std.allocator?

Or ... am I overthinking this, and std.typecons.RefCounted would be a better 
approach?

Thanks & best wishes,

     -- Joe
Oct 29 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

     if (/* I'm the only person referencing this memory */)
It seems you refer to: http://en.wikipedia.org/wiki/Uniqueness_typing Bye, bearophile
Oct 29 2013
prev sibling parent reply "Froglegs" <barf barf.com> writes:
     BigInt n = 100;
     n += 10;
     ++n;

 ... and both the 2nd and 3rd lines will result in a 
 reallocation even though there is no need for it, because only 
 n is referencing the memory it wraps.
Does BigInt not have overloaded operators for adding normal integers? Is it converting 10 to a BigInt prior to adding it to n?
Oct 29 2013
parent reply "Don" <x nospam.com> writes:
On Tuesday, 29 October 2013 at 11:26:45 UTC, Froglegs wrote:
    BigInt n = 100;
    n += 10;
    ++n;

 ... and both the 2nd and 3rd lines will result in a 
 reallocation even though there is no need for it, because only 
 n is referencing the memory it wraps.
Does BigInt not have overloaded operators for adding normal integers? Is it converting 10 to a BigInt prior to adding it to n?
Yes, it has overloaded operators, but that's not relevant. The issue is that initially n points to an block of memory, containing [100]. Then, if you use COW, n+= 10 isn't in-place. It has to create a new array, containing [110]. Then ++n creates a third array. [111].
Oct 29 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
29-Oct-2013 16:22, Don пишет:
 On Tuesday, 29 October 2013 at 11:26:45 UTC, Froglegs wrote:
    BigInt n = 100;
    n += 10;
    ++n;

 ... and both the 2nd and 3rd lines will result in a reallocation even
 though there is no need for it, because only n is referencing the
 memory it wraps.
Does BigInt not have overloaded operators for adding normal integers? Is it converting 10 to a BigInt prior to adding it to n?
Yes, it has overloaded operators, but that's not relevant. The issue is that initially n points to an block of memory, containing [100]. Then, if you use COW, n+= 10 isn't in-place. It has to create a new array, containing [110]. Then ++n creates a third array. [111].
Can't it use ref-counted COW? Then since there is only 1 reference to the original block it may mutate it in-place else it creates new chunk with refs=1 and decrements the original count. Maybe I'm missing something but it shouldn't be that hard. -- Dmitry Olshansky
Oct 29 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 29/10/13 15:58, Dmitry Olshansky wrote:
 Can't it use ref-counted COW? Then since there is only 1 reference to the
 original block it may mutate it in-place else it creates new chunk with refs=1
 and decrements the original count. Maybe I'm missing something but it shouldn't
 be that hard.
Yes, that's pretty much what I was thinking of. The question is how to implement it in terms of std.allocator -- or is that the wrong base to build ref=counted memory upon?
Oct 29 2013
next sibling parent reply "inout" <inout gmail.com> writes:
On Tuesday, 29 October 2013 at 21:59:38 UTC, Joseph Rushton 
Wakeling wrote:
 On 29/10/13 15:58, Dmitry Olshansky wrote:
 Can't it use ref-counted COW? Then since there is only 1 
 reference to the
 original block it may mutate it in-place else it creates new 
 chunk with refs=1
 and decrements the original count. Maybe I'm missing something 
 but it shouldn't
 be that hard.
Yes, that's pretty much what I was thinking of. The question is how to implement it in terms of std.allocator -- or is that the wrong base to build ref=counted memory upon?
This has almost nothing to do with an allocator.
Oct 29 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 30/10/13 01:38, inout wrote:
 This has almost nothing to do with an allocator.
My impression was that std.allocator as-is was designed to create a toolkit for developers to build different memory management strategies into their applications. Is there any reason why that should exclude implementing ref-counting approaches? As I said at the start, I'm looking to be educated, so I'd appreciate you explaining why something is wrong rather than just telling me.
Oct 30 2013
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 30 October 2013 at 12:39:18 UTC, Joseph Rushton 
Wakeling wrote:
 On 30/10/13 01:38, inout wrote:
 This has almost nothing to do with an allocator.
My impression was that std.allocator as-is was designed to create a toolkit for developers to build different memory management strategies into their applications. Is there any reason why that should exclude implementing ref-counting approaches?
"memory allocation strategy" != "memory management strategy", though those can be possibly coupled. Ref-counting can be done on top of any allocation strategy that support deterministic deallocattion.
Oct 30 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/30/13 5:39 AM, Joseph Rushton Wakeling wrote:
 On 30/10/13 01:38, inout wrote:
 This has almost nothing to do with an allocator.
My impression was that std.allocator as-is was designed to create a toolkit for developers to build different memory management strategies into their applications. Is there any reason why that should exclude implementing ref-counting approaches? As I said at the start, I'm looking to be educated, so I'd appreciate you explaining why something is wrong rather than just telling me.
Reference counting comes somewhere on top of allocators. (For a long time I thought they should be fused; now I think they can be neatly separated.) Allocators know how to manage void[] and that's about it. Andrei
Oct 30 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 30/10/13 16:21, Andrei Alexandrescu wrote:
 Reference counting comes somewhere on top of allocators. (For a long time I
 thought they should be fused; now I think they can be neatly separated.)
 Allocators know how to manage void[] and that's about it.
Ahh, OK. That makes more sense. It rather baffled me to hear people saying, "This has nothing to do with an allocator." My impression is that std.typecons.RefCounted has some issues, and that we might expect better reference counting solutions to be built on top of std.allocator -- otherwise I'd just have used the existing RefCounted.
Oct 30 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
30-Oct-2013 01:59, Joseph Rushton Wakeling пишет:
 On 29/10/13 15:58, Dmitry Olshansky wrote:
 Can't it use ref-counted COW? Then since there is only 1 reference to the
 original block it may mutate it in-place else it creates new chunk
 with refs=1
 and decrements the original count. Maybe I'm missing something but it
 shouldn't
 be that hard.
Yes, that's pretty much what I was thinking of. The question is how to implement it in terms of std.allocator -- or is that the wrong base to build ref=counted memory upon?
Like others said - just use allocator.alloc/allocator.free instead of malloc/free nothing stellar. -- Dmitry Olshansky
Oct 30 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/30/13 9:35 AM, Dmitry Olshansky wrote:
 30-Oct-2013 01:59, Joseph Rushton Wakeling пишет:
 On 29/10/13 15:58, Dmitry Olshansky wrote:
 Can't it use ref-counted COW? Then since there is only 1 reference to
 the
 original block it may mutate it in-place else it creates new chunk
 with refs=1
 and decrements the original count. Maybe I'm missing something but it
 shouldn't
 be that hard.
Yes, that's pretty much what I was thinking of. The question is how to implement it in terms of std.allocator -- or is that the wrong base to build ref=counted memory upon?
Like others said - just use allocator.alloc/allocator.free instead of malloc/free nothing stellar.
Well actually there are things that can be done at the allocator level to explicitly help reference counting: 1. The refcount could be an affix to the allocated block. 2. Small refcounts can be kept together in a compact block a la HeapBlock. 3. Or refcounts can be kept in a freelist. So refounting can be helped by an allocator built especially for it. Andrei
Oct 30 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
30-Oct-2013 22:02, Andrei Alexandrescu пишет:
 Well actually there are things that can be done at the allocator level
 to explicitly help reference counting:
 1. The refcount could be an affix to the allocated block.
Yup. I seriously underappreciated this one.
 2. Small refcounts can be kept together in a compact block a la HeapBlock.
That would entail some work to tie-up refcounts and bigint internal array, but it sounds interesting.
 3. Or refcounts can be kept in a freelist.
Doubtful. That would amount to extra pointer stored per BigInt AND extra indirection to get at ref-count unless I'm missing the details completely.
 So refounting can be helped by an allocator built especially for it.
Agreed. BTW AffixAllocator is quite a feat, I think I wanted it countless times while working on std.regex/std.uni. I expect massive code performance and clarity improvements when {core,std}.allocator comes in.
 Andrei
-- Dmitry Olshansky
Oct 30 2013
prev sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 29 October 2013 at 21:59:38 UTC, Joseph Rushton 
Wakeling wrote:
 On 29/10/13 15:58, Dmitry Olshansky wrote:
 Can't it use ref-counted COW? Then since there is only 1 
 reference to the
 original block it may mutate it in-place else it creates new 
 chunk with refs=1
 and decrements the original count. Maybe I'm missing something 
 but it shouldn't
 be that hard.
Yes, that's pretty much what I was thinking of. The question is how to implement it in terms of std.allocator -- or is that the wrong base to build ref=counted memory upon?
Implement a pool of bigints as an allocator: when refcount reaches zero, return the bigint buffer to the pool - a simple bump the pointer algorithm, when you need a new buffer, request it from the pool. A pool is a stack with fixed capacity, and buffers go in and out.
Oct 30 2013