www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - An important pull request: accessing shared affix for immutable data

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
https://github.com/D-Programming-Language/phobos/pull/3991

A short while ago Dicebot discussed the notion of using the allocator to 
store the reference count of objects (and generally metadata). The 
allocator seems to be a good place because in a way it's a source of 
"ground truth" - no matter how data is qualified, it originated as 
untyped mutable bytes from the allocator.

So after thinking a bit I managed to convince myself that the affixes in 
an AffixAllocator can be accessed with removing immutable, but without 
actually breaking any guarantee made by the type system. (Affixes are 
extra bytes allocated before and after the actual allocation.) The logic 
goes as follows:

* If the buffer is mutable, then the allocator assumes it hasn't been 
shared across threads, so it returns a reference to a mutable affix.

* If the buffer is const, then the allocator must conservatively assume 
it might have been immutable and subsequently shared among threads. 
Therefore, several threads may request the affix of the same buffer 
simultaneously. So it returns a reference to a shared affix.

* If the buffer is shared, then the allocator assumes again several 
threads may access the affix so it returns a reference to a shared affix.

One simple way to look at this is: the allocator keeps an associative 
array mapping allocated buffers to metadata (e.g. reference counts). The 
allocated buffers may be immutable, which doesn't require the metadata 
to be immutable as well. The only difference between an approach based 
on an associative array and AffixAllocator is that the latter is faster 
(the association is fixed by layout).


Destroy!

Andrei
Feb 12 2016
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12.02.2016 20:12, Andrei Alexandrescu wrote:
 https://github.com/D-Programming-Language/phobos/pull/3991

 A short while ago Dicebot discussed the notion of using the allocator to
 store the reference count of objects (and generally metadata). The
 allocator seems to be a good place because in a way it's a source of
 "ground truth" - no matter how data is qualified, it originated as
 untyped mutable bytes from the allocator.

 So after thinking a bit I managed to convince myself that the affixes in
 an AffixAllocator can be accessed with removing immutable, but without
 actually breaking any guarantee made by the type system. (Affixes are
 extra bytes allocated before and after the actual allocation.) The logic
 goes as follows:

 * If the buffer is mutable, then the allocator assumes it hasn't been
 shared across threads, so it returns a reference to a mutable affix.

 * If the buffer is const, then the allocator must conservatively assume
 it might have been immutable and subsequently shared among threads.
 Therefore, several threads may request the affix of the same buffer
 simultaneously. So it returns a reference to a shared affix.

 * If the buffer is shared, then the allocator assumes again several
 threads may access the affix so it returns a reference to a shared affix.

 One simple way to look at this is: the allocator keeps an associative
 array mapping allocated buffers to metadata (e.g. reference counts). The
 allocated buffers may be immutable, which doesn't require the metadata
 to be immutable as well. The only difference between an approach based
 on an associative array and AffixAllocator is that the latter is faster
 (the association is fixed by layout).


 Destroy!

 Andrei
The first thing that comes to mind is that accessing a global associative array is not 'pure'.
Feb 12 2016
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 06:29 PM, Timon Gehr wrote:
 The first thing that comes to mind is that accessing a global
 associative array is not 'pure'.
The weird thing is in this case is. The analogy is limited. -- Andrei
Feb 12 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 13.02.2016 01:30, Andrei Alexandrescu wrote:
 On 02/12/2016 06:29 PM, Timon Gehr wrote:
 The first thing that comes to mind is that accessing a global
 associative array is not 'pure'.
The weird thing is in this case is. The analogy is limited. -- Andrei
Why is it limited? Accessing the affix isn't pure either. I.e. reference counting will not work in pure code.
Feb 12 2016
prev sibling parent ZombineDev <valid_email he.re> writes:
On Friday, 12 February 2016 at 23:29:58 UTC, Timon Gehr wrote:
 The first thing that comes to mind is that accessing a global 
 associative array is not 'pure'.
Not necessarily. Purity depends on the allocator and how you access it (via parameter or through a static/shared instance). I have this code in my project: (note **pure** unittest) /// UniqueRef is encapsulates T* or T (if (is(T == A[], A)) /// It is designed to ensure unique ownership and to /// provide transparent access to its payload struct UniqueRef(T, alias Allocator = Mallocator) if (isScalarValueType!T || isDynamicArray!T) { /* ... */ } /// pure nothrow safe // not nogc because the alias template parameter cause GC context allocation :( // Maybe make the Allocator a ctor parameter of UniqueRef? unittest { // Test forwarding of operators for arrays of user defined types auto allocator = MyStackAllocator!1024(); assert (allocator.usedBytes == 0); assert (allocator.allocationCount == 0); { auto arr = UniqueRef!(Vec3[], allocator)(4); assert (arr.length == 4); assert (arr.ptr !is null); assert (arr.ptr == allocator.bytes.ptr); assert (allocator.usedBytes == 4 * Vec3.sizeof); assert (allocator.allocationCount == 1); arr[] = Vec3(3, 2, 1); Vec3[4] expected1 = [ Vec3(3, 2, 1), Vec3(3, 2, 1), Vec3(3, 2, 1), Vec3(3, 2, 1) ]; assert (arr[] == expected1); arr[] += Vec3(1, 1, 1); Vec3[4] expected2 = [ Vec3(4, 3, 2), Vec3(4, 3, 2), Vec3(4, 3, 2), Vec3(4, 3, 2) ]; assert (arr[] == expected2); } assert (allocator.usedBytes == 0); assert (allocator.allocationCount == 0); } I think I can accomodate a SharedRef that uses the proposed affix allocator fairly easily in my current design. And it would still be conditionally pure (depending on the payload). I just haven't found the need to share ownership in my project yet.
Feb 13 2016
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu 
wrote:
 https://github.com/D-Programming-Language/phobos/pull/3991

 A short while ago Dicebot discussed the notion of using the 
 allocator to store the reference count of objects (and 
 generally metadata). The allocator seems to be a good place 
 because in a way it's a source of "ground truth" - no matter 
 how data is qualified, it originated as untyped mutable bytes 
 from the allocator.

 So after thinking a bit I managed to convince myself that the 
 affixes in an AffixAllocator can be accessed with removing 
 immutable, but without actually breaking any guarantee made by 
 the type system. (Affixes are extra bytes allocated before and 
 after the actual allocation.) The logic goes as follows:

 * If the buffer is mutable, then the allocator assumes it 
 hasn't been shared across threads, so it returns a reference to 
 a mutable affix.

 * If the buffer is const, then the allocator must 
 conservatively assume it might have been immutable and 
 subsequently shared among threads. Therefore, several threads 
 may request the affix of the same buffer simultaneously. So it 
 returns a reference to a shared affix.

 * If the buffer is shared, then the allocator assumes again 
 several threads may access the affix so it returns a reference 
 to a shared affix.

 One simple way to look at this is: the allocator keeps an 
 associative array mapping allocated buffers to metadata (e.g. 
 reference counts). The allocated buffers may be immutable, 
 which doesn't require the metadata to be immutable as well. The 
 only difference between an approach based on an associative 
 array and AffixAllocator is that the latter is faster (the 
 association is fixed by layout).


 Destroy!

 Andrei
So if I may, I think we should avoid affix data in the general case. Providing some metadata in the allocate is in itself a good idea. Locating these data with the object is usually not : - Mutating the metadata will create sharing overhead on the whole cache line. Sharing of immutable would become inefficient. - It tends to create allocation size that aren't friendly to underlying allocators. For instance, an alocation of size 2^n + 8 bumps you to the next size class, often 2^(n+1) or alike. This creates a lot of internal fragmentation. - Buffer overflow/underflow WILL spill in the alocator metadata. You don't want that. This pretty much guaranteed that the worse thing that can happen will happen: corrupting the alocator. jemalloc moved away from using them for small runs for these reasons.
Feb 12 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 06:52 PM, deadalnix wrote:
 Providing some metadata in the allocate is in itself a good idea.
 Locating these data with the object is usually not :
   - Mutating the metadata will create sharing overhead on the whole
 cache line. Sharing of immutable would become inefficient.
   - It tends to create allocation size that aren't friendly to
 underlying allocators. For instance, an alocation of size 2^n + 8 bumps
 you to the next size class, often 2^(n+1) or alike. This creates a lot
 of internal fragmentation.
   - Buffer overflow/underflow WILL spill in the alocator metadata. You
 don't want that. This pretty much guaranteed that the worse thing that
 can happen will happen: corrupting the alocator.

 jemalloc moved away from using them for small runs for these reasons.
I think we're good there. -- Andrei
Feb 12 2016
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Saturday, 13 February 2016 at 00:30:58 UTC, Andrei 
Alexandrescu wrote:
 On 02/12/2016 06:52 PM, deadalnix wrote:
 [...]
I think we're good there. -- Andrei
Is there somewhere where I / others can see an explanation of how "we're good"? Those sound like genuine problems.
Feb 13 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/13/16 7:40 AM, John Colvin wrote:
 On Saturday, 13 February 2016 at 00:30:58 UTC, Andrei Alexandrescu wrote:
 On 02/12/2016 06:52 PM, deadalnix wrote:
 [...]
I think we're good there. -- Andrei
Is there somewhere where I / others can see an explanation of how "we're good"? Those sound like genuine problems.
Those are just a soup of generic considerations that could be brought about any code. -- andrei
Feb 13 2016
parent deadalnix <deadalnix gmail.com> writes:
On Saturday, 13 February 2016 at 13:10:19 UTC, Andrei 
Alexandrescu wrote:
 On 2/13/16 7:40 AM, John Colvin wrote:
 On Saturday, 13 February 2016 at 00:30:58 UTC, Andrei 
 Alexandrescu wrote:
 On 02/12/2016 06:52 PM, deadalnix wrote:
 [...]
I think we're good there. -- Andrei
Is there somewhere where I / others can see an explanation of how "we're good"? Those sound like genuine problems.
Those are just a soup of generic considerations that could be brought about any code. -- andrei
Indeed, I just explained why some major moved away from it, it is not like it is relevant. I mean, it is not like this is proposed as a way to handle memory management in the language down the road.
Feb 13 2016
prev sibling next sibling parent tsbockman <thomas.bockman gmail.com> writes:
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu 
wrote:
 * If the buffer is const, then the allocator must 
 conservatively assume it might have been immutable and 
 subsequently shared among threads. Therefore, several threads 
 may request the affix of the same buffer simultaneously. So it 
 returns a reference to a shared affix.
I understand the reasoning here, but I really dislike the idea of `const` as a de-optimization. However, given the way that const is used, I guess this wouldn't be much of a problem in practice provided that some form of borrowing or inc/dec pair elision is implemented.
Feb 12 2016
prev sibling next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu 
wrote:
 * If the buffer is const, then the allocator must 
 conservatively assume it might have been immutable and 
 subsequently shared among threads. Therefore, several threads 
 may request the affix of the same buffer simultaneously. So it 
 returns a reference to a shared affix.
What about providing a way to mark a piece of data as thread-local at allocation time? The allocator could then stick it in a separate memory range from the (potentially) shared stuff, and just issue a fatal `Error` if an attempt was made to access it from the wrong thread. This would prevent `const` from triggering unnecessary atomic ops or locks. The main disadvantage I see is that the logic to find the correct reference count address would be a little more complicated, but I suspect this would be a good trade-off given how expensive thread-safe memory operations can be.
Feb 12 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 08:02 PM, tsbockman wrote:
 What about providing a way to mark a piece of data as thread-local at
 allocation time?
Yah, could be part of the metadata. Then after inspecting that bit, shared can be casted away. I consider that an optimization that doesn't add or remove generality. -- Andrei
Feb 12 2016
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 13 February 2016 at 01:27:49 UTC, Andrei 
Alexandrescu wrote:
 On 02/12/2016 08:02 PM, tsbockman wrote:
 What about providing a way to mark a piece of data as 
 thread-local at
 allocation time?
Yah, could be part of the metadata. Then after inspecting that bit, shared can be casted away. I consider that an optimization that doesn't add or remove generality. -- Andrei
I was actually suggesting encoding the shared-ness into the memory address itself. (No tricks - just allocate shared data in a separate memory range. This may only be feasible on 64-bit though, with its over-abundance of address space.) This way, you can determine whether the (meta-)data is shared without accessing it at all, just by asking the allocator about the address.
Feb 12 2016
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 10:03 PM, tsbockman wrote:
 On Saturday, 13 February 2016 at 01:27:49 UTC, Andrei Alexandrescu wrote:
 On 02/12/2016 08:02 PM, tsbockman wrote:
 What about providing a way to mark a piece of data as thread-local at
 allocation time?
Yah, could be part of the metadata. Then after inspecting that bit, shared can be casted away. I consider that an optimization that doesn't add or remove generality. -- Andrei
I was actually suggesting encoding the shared-ness into the memory address itself. (No tricks - just allocate shared data in a separate memory range. This may only be feasible on 64-bit though, with its over-abundance of address space.) This way, you can determine whether the (meta-)data is shared without accessing it at all, just by asking the allocator about the address.
Yah, TypedAllocator can be used for exactly that. -- Andrei
Feb 12 2016
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 13 February 2016 at 03:05:08 UTC, Andrei 
Alexandrescu wrote:
 I was actually suggesting encoding the shared-ness into the 
 memory
 address itself. (No tricks - just allocate shared data in a 
 separate
 memory range. This may only be feasible on 64-bit though, with 
 its
 over-abundance of address space.)

 This way, you can determine whether the (meta-)data is shared 
 without
 accessing it at all, just by asking the allocator about the 
 address.
Yah, TypedAllocator can be used for exactly that. -- Andrei
Cool. Then this technique could also be used to address Timon Gehr's concerns about violating the type system, if necessary.
Feb 12 2016
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/12/16 10:10 PM, tsbockman wrote:
 Cool. Then this technique could also be used to address Timon Gehr's
 concerns about violating the type system, if necessary.
Regarding that. Not sure what he meant, but it dawned on me we've never mentioned in the language spec that using synchronization unnecessarily for non-shared data and also using no synchronization (in the same thread) is fine. -- Andrei
Feb 13 2016
prev sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Saturday, 13 February 2016 at 03:03:05 UTC, tsbockman wrote:
 On Saturday, 13 February 2016 at 01:27:49 UTC, Andrei 
 Alexandrescu wrote:
 On 02/12/2016 08:02 PM, tsbockman wrote:
 What about providing a way to mark a piece of data as 
 thread-local at
 allocation time?
Yah, could be part of the metadata. Then after inspecting that bit, shared can be casted away. I consider that an optimization that doesn't add or remove generality. -- Andrei
I was actually suggesting encoding the shared-ness into the memory address itself. (No tricks - just allocate shared data in a separate memory range. This may only be feasible on 64-bit though, with its over-abundance of address space.) This way, you can determine whether the (meta-)data is shared without accessing it at all, just by asking the allocator about the address.
Be careful: https://en.wikipedia.org/wiki/Address_space_layout_randomization
Feb 13 2016
parent tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 13 February 2016 at 08:16:04 UTC, Ola Fosheim 
Grøstad wrote:
 On Saturday, 13 February 2016 at 03:03:05 UTC, tsbockman wrote:
 I was actually suggesting encoding the shared-ness into the 
 memory address itself. (No tricks - just allocate shared data 
 in a separate memory range. This may only be feasible on 
 64-bit though, with its over-abundance of address space.)

 This way, you can determine whether the (meta-)data is shared 
 without accessing it at all, just by asking the allocator 
 about the address.
Be careful: https://en.wikipedia.org/wiki/Address_space_layout_randomization
What issue, specifically, are you trying to point out? Obviously even with ASL the heap has large contiguous chunks of address space available to it, otherwise it could not allocate large arrays. That is all that is required for my scheme to be workable, I think.
Feb 13 2016
prev sibling next sibling parent reply Dicebot <public dicebot.lv> writes:
On 02/12/2016 09:12 PM, Andrei Alexandrescu wrote:
 So after thinking a bit I managed to convince myself that the affixes in
 an AffixAllocator can be accessed with removing immutable, but without
 actually breaking any guarantee made by the type system. (Affixes are
 extra bytes allocated before and after the actual allocation.) The logic
 goes as follows:
 
 * If the buffer is mutable, then the allocator assumes it hasn't been
 shared across threads, so it returns a reference to a mutable affix.
 
 * If the buffer is const, then the allocator must conservatively assume
 it might have been immutable and subsequently shared among threads.
 Therefore, several threads may request the affix of the same buffer
 simultaneously. So it returns a reference to a shared affix.
 
 * If the buffer is shared, then the allocator assumes again several
 threads may access the affix so it returns a reference to a shared affix.
Thank you for moving forward with this. I haven't yet had chance to play with the code but it looks like a good start. Some minor notes on implementation (I have also discussed it a bit with deadalnix in IRC): - cache thrashing for shared allocations can be reduced by aligning those on cache line size (for thread-local ones it is not needed) - considering this API is going to be dangerously system, possible corruption of metadata is indeed very scary - though not a blocker within my own value list - I wonder if would make for a cleaner design to have distinct allocator instances for thread-local and shared memory instead of branching on qualifiers of requested type Right now the main concern to me is how such design is going to interact with existing code like `assumeUnique`. With GC which treats all memory similarly it is legal (and somewhat common) to allocate mutable non-shared data first and only later cook it into immutable via `assumeUnique`. But if allocator has to treat such memory differently (i.e. add barriers/atomics for non-TLS one) such code will start silently introduce major bugs. Sadly, don't have any specific suggeestions about all of this yet.
Feb 12 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 08:24 PM, Dicebot wrote:
 - considering this API is going to be dangerously  system, possible
 corruption of metadata is indeed very scary - though not a blocker
 within my own value list
I don't think this is an argument worth minding, either.
 - I wonder if would make for a cleaner design to have distinct allocator
 instances for thread-local and shared memory instead of branching on
 qualifiers of requested type
TypedAllocator does exactly that. Not difficult to integrate here, but the key here is removing immutability without subverting the type system (which is a distinct matter).
 Right now the main concern to me is how such design is going to interact
 with existing code like `assumeUnique`. With GC which treats all memory
 similarly it is legal (and somewhat common) to allocate mutable
 non-shared data first and only later cook it into immutable via
 `assumeUnique`. But if allocator has to treat such memory differently
 (i.e. add barriers/atomics  for non-TLS one) such code will start
 silently introduce major bugs.
Folks who defined refcounted types use the special AffixAllocator internally in an encapsulated manner. They define it, they use it - outside intervention is not possible. Those who use their own allocation calls and then use assumeUnique cannot use it on the new types. Or maybe I'm not understanding something. Andrei
Feb 12 2016
parent reply Dicebot <public dicebot.lv> writes:
On 02/13/2016 03:35 AM, Andrei Alexandrescu wrote:
 Folks who defined refcounted types use the special AffixAllocator
 internally in an encapsulated manner. They define it, they use it -
 outside intervention is not possible. Those who use their own allocation
 calls and then use assumeUnique cannot use it on the new types. Or maybe
 I'm not understanding something.
Ah probably this is the important point I have totally missed. So you envision this kind of metadata support to be only available via specific type of allocator, AffixAllocator, and not being supported by most of them?
Feb 12 2016
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 08:42 PM, Dicebot wrote:
 So you
 envision this kind of metadata support to be only available via specific
 type of allocator, AffixAllocator, and not being supported by most of them?
That is correct. You want metadata, you put an AffixAllocator together. -- Andrei
Feb 12 2016
parent Dicebot <public dicebot.lv> writes:
On 02/13/2016 04:34 AM, Andrei Alexandrescu wrote:
 On 02/12/2016 08:42 PM, Dicebot wrote:
 So you
 envision this kind of metadata support to be only available via specific
 type of allocator, AffixAllocator, and not being supported by most of
 them?
That is correct. You want metadata, you put an AffixAllocator together. -- Andrei
Aha, I think this kills most of my here-and-now concerns if it is only planned as opt-in allocator extension. Better to do benchmarking of various approaches to layout when there is rcstring prototype in that case. Thanks!
Feb 13 2016
prev sibling parent ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 01:42:08 UTC, Dicebot wrote:
 On 02/13/2016 03:35 AM, Andrei Alexandrescu wrote:
 Folks who defined refcounted types use the special 
 AffixAllocator internally in an encapsulated manner. They 
 define it, they use it - outside intervention is not possible. 
 Those who use their own allocation calls and then use 
 assumeUnique cannot use it on the new types. Or maybe I'm not 
 understanding something.
Ah probably this is the important point I have totally missed. So you envision this kind of metadata support to be only available via specific type of allocator, AffixAllocator, and not being supported by most of them?
That's the most important part of the current design of std.allocator, IMO: 1) allocators are made to be composabled - they only perform suballocations. 2) allocators are first class value types - this makes them usable in pure code. Also only `shared` Allocators objects can be used by more than one thread.
Feb 13 2016
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12.02.2016 20:12, Andrei Alexandrescu wrote:
 * If the buffer is mutable, then the allocator assumes it hasn't been
 shared across threads, so it returns a reference to a mutable affix.

 * If the buffer is const, then the allocator must conservatively assume
 it might have been immutable and subsequently shared among threads.
 Therefore, several threads may request the affix of the same buffer
 simultaneously. So it returns a reference to a shared affix.
Const could also mean mutable. This can hence reference the same data as both shared and unshared, which violates the type system.
Feb 12 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 09:21 PM, Timon Gehr wrote:
 Const could also mean mutable. This can hence reference the same data as
 both shared and unshared, which violates the type system.
If const comes from mutable, then shared is superfluous leading to extra synchronization. That's suboptimal, but how does it violate the type system? -- Andrei
Feb 12 2016
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 13.02.2016 03:35, Andrei Alexandrescu wrote:
 On 02/12/2016 09:21 PM, Timon Gehr wrote:
 Const could also mean mutable. This can hence reference the same data as
 both shared and unshared, which violates the type system.
If const comes from mutable, then shared is superfluous leading to extra synchronization. That's suboptimal, but how does it violate the type system? -- Andrei
One point of 'shared' is that data which is _not_ shared cannot be mutated by some other thread while you are manipulating it. 'shared' data can be accessed by multiple threads simultaneously. Unshared data can be accessed by only one thread at a time. 'shared' data can be passed freely among threads. If you alias the same data as both shared and unshared, the shared version can be sent to another thread which will then modify it, violating the guarantees on the unshared reference.
Feb 13 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/13/2016 11:55 AM, Timon Gehr wrote:
 If you alias the same data as both shared and unshared, the shared
 version can be sent to another thread which will then modify it,
 violating the guarantees on the unshared reference.
Oh, I understand. So you're thinking of someone taking the address of allocator.prefix(block) and sharing it across threads? -- Andrei
Feb 13 2016
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 13.02.2016 18:42, Andrei Alexandrescu wrote:
 On 02/13/2016 11:55 AM, Timon Gehr wrote:
 If you alias the same data as both shared and unshared, the shared
 version can be sent to another thread which will then modify it,
 violating the guarantees on the unshared reference.
Oh, I understand. So you're thinking of someone taking the address of allocator.prefix(block) and sharing it across threads? -- Andrei
Not necessarily. shared is transitive and prefix/suffix are arbitrary types which might contain mutable indirections. I don't really see why this part of the allocator interface should even be typed.
Feb 13 2016
next sibling parent ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 18:16:43 UTC, Timon Gehr wrote:
 On 13.02.2016 18:42, Andrei Alexandrescu wrote:
 On 02/13/2016 11:55 AM, Timon Gehr wrote:
 If you alias the same data as both shared and unshared, the 
 shared
 version can be sent to another thread which will then modify 
 it,
 violating the guarantees on the unshared reference.
Oh, I understand. So you're thinking of someone taking the address of allocator.prefix(block) and sharing it across threads? -- Andrei
Not necessarily. shared is transitive and prefix/suffix are arbitrary types which might contain mutable indirections. I don't really see why this part of the allocator interface should even be typed.
I think it would be a lot simpler if we can consider allocations from unshared allocator objects to be unshareable. This rules out GCAllocator and Mallocator, but thread-local allocators can offer better performance anyway.
Feb 13 2016
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sat, 13 Feb 2016 19:16:43 +0100
schrieb Timon Gehr <timon.gehr gmx.ch>:

 Not necessarily. shared is transitive and prefix/suffix are arbitrary 
 types which might contain mutable indirections.
I don't want to disturb your conversation, but how about designing a working "shared" first? My last attempt at using it on aggregate fields was frustrating and asked for a complete redesign, similar to "inout". Some of my most frustrating moments, some of which I discussed with Sean Kelly who had a better understanding of the runtime: Ramblings on the FAQ on shared: http://www.digitalmars.com/d/archives/digitalmars/D/D_2.0_FAQ_on_shared_246071.html None of core.sync is shared: http://www.digitalmars.com/d/archives/digitalmars/D/learn/m_condition.mutex_cannot_be_used_in_shared_method_64874.html When casting away shared I needed a way to pass a "depth" to the case, resulting in this template by John Colvin (where depth is reduced to 1 level or all, but you get the idea): http://www.digitalmars.com/d/archives/digitalmars/D/learn/How_do_you_get_T_from_shared_T_64230.html Cannot destroy a shared struct: http://www.digitalmars.com/d/archives/digitalmars/D/learn/How_to_destroy_a_shared_struct._55603.html Here I tried to do away with the explicit casts by listing the variables that are protected under a specific mutex: http://www.digitalmars.com/d/archives/digitalmars/D/New_scope_modifier_unshared_for_temporary_ownership_of_shared_228907.html -- Marco
Apr 24 2016
prev sibling parent reply ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 02:35:43 UTC, Andrei 
Alexandrescu wrote:
 On 02/12/2016 09:21 PM, Timon Gehr wrote:
 Const could also mean mutable. This can hence reference the 
 same data as
 both shared and unshared, which violates the type system.
If const comes from mutable, then shared is superfluous leading to extra synchronization. That's suboptimal, but how does it violate the type system? -- Andrei
It violates the expectations that if an object is not shared, it could not possibly be modified from another thread.
Feb 13 2016
parent ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 21:49:48 UTC, ZombineDev wrote:
 On Saturday, 13 February 2016 at 02:35:43 UTC, Andrei 
 Alexandrescu wrote:
 On 02/12/2016 09:21 PM, Timon Gehr wrote:
 Const could also mean mutable. This can hence reference the 
 same data as
 both shared and unshared, which violates the type system.
If const comes from mutable, then shared is superfluous leading to extra synchronization. That's suboptimal, but how does it violate the type system? -- Andrei
It violates the expectations that if an object is not shared, it could not possibly be modified from another thread.
All threads must agree on a protocol in order for synchronization to work correctly. Say thread A has a non-shared ref to an object and thread B has a shared ref to it. What good is this shared qualifier for B when A doesn't honor it?
Feb 13 2016
prev sibling next sibling parent reply Mathias Lang via Digitalmars-d <digitalmars-d puremagic.com> writes:
2016-02-12 20:12 GMT+01:00 Andrei Alexandrescu via Digitalmars-d <
digitalmars-d puremagic.com>:

 https://github.com/D-Programming-Language/phobos/pull/3991
The only difference between an approach based on an associative array and
 AffixAllocator is that the latter is faster (the association is fixed by
 layout).
Could you point to your benchmark / paper about this ? The general idea sounds good, however I share deadalnix's concern over storing the metadata next to the user's data. I also fail to see how those are a "soup of generic considerations", given it only apply when using an affix.
Feb 13 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/13/2016 01:50 PM, Mathias Lang via Digitalmars-d wrote:
 2016-02-12 20:12 GMT+01:00 Andrei Alexandrescu via Digitalmars-d
 <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>>:

     https://github.com/D-Programming-Language/phobos/pull/3991

     The only difference between an approach based on an associative
     array and AffixAllocator is that the latter is faster (the
     association is fixed by layout).


 Could you point to your benchmark / paper about this ?
There's no need. I'll do the implementation with the prefix, and if you do it with a global hashtable within the same or better speed, my hat is off to you.
 The general idea sounds good, however I share deadalnix's concern over
 storing the metadata next to the user's data. I also fail to see how
 those are a "soup of generic considerations", given it only apply when
 using an affix.
It's really easy to throw hypotheticals out and speculate what's going to cause slowdown and what's not (and look awfully expert in the process). I'm very familiar with the concerns guarding storing metadata in jemalloc seeing I've worked in team with Jason Evans for months; most are quoted from rote and don't apply here. But the entire matter of digging down into how efficient the layout is would be a massive missing of the point. AffixAllocator is one example, there are many other tactical possibilities for allocators to stash metadata, neither of which affects the point. The point here was safely getting to modifiable metadata for modifiable data. Timon Gehr was the one person who figured the real problem (inadvertent sharing of data not meant to be shared). I think I'll remove the const overload for now. Would that leave any hole unplugged? Andrei
Feb 13 2016
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Saturday, 13 February 2016 at 21:10:50 UTC, Andrei 
Alexandrescu wrote:
 There's no need. I'll do the implementation with the prefix, 
 and if you do it with a global hashtable within the same or 
 better speed, my hat is off to you.
That is false dichotomy. What about storing the metadata at an address that is computable from from the object's address, while not being contiguous with the object allocated ? Is substracting a constant really the only option here ? (hint, it is not)
Feb 13 2016
next sibling parent ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 22:01:45 UTC, deadalnix wrote:
 On Saturday, 13 February 2016 at 21:10:50 UTC, Andrei 
 Alexandrescu wrote:
 There's no need. I'll do the implementation with the prefix, 
 and if you do it with a global hashtable within the same or 
 better speed, my hat is off to you.
That is false dichotomy. What about storing the metadata at an address that is computable from from the object's address, while not being contiguous with the object allocated ? Is substracting a constant really the only option here ? (hint, it is not)
1) False-sharing overhead: If you look at the case where the allocated data is non-shared, you'll get better cache locality, instead of sharing overhead on the cache coherency system. 2) Unfriendly allocation size: In general you're correct, but in the end, it depends on the type of data. If you consider the case of shared_ptr - refcount as metadata + ptr as the actual allocation - you get a nice allocation size of 2*size_t.sizeof. As I see it, AffixAllocator is not a silver bullet, but in some very specific cases it can actaully be a very good fit. You just have to carefully consider this case-by-case. E.g. for shared types with irregular size we can just switch to a different allocator - that's the whole point of the composable allocators. 3) Increased probability/danger of buffer overflow/underflow: First, I would say that things like slice bounds checks make D a safer language for the average user than C/C++, which should make this a bit less of a problem. Second, I actually think that even if this suddenly leads to a lot of problems for the end users, this would bring more pressure for adding more Ada/Rust-like static analysis for D - which is a Good Thing.
Feb 13 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/13/16 5:01 PM, deadalnix wrote:
 What about storing the metadata at an address that is computable from
 from the object's address, while not being contiguous with the object
 allocated ? Is substracting a constant really the only option here ?
 (hint, it is not)
I'd say, you have at your disposal a flexible allocator framework so a simple way to go about it would be to implement new allocators that take various options about storing metadata. -- Andrei
Feb 13 2016
parent deadalnix <deadalnix gmail.com> writes:
On Sunday, 14 February 2016 at 01:27:40 UTC, Andrei Alexandrescu 
wrote:
 On 2/13/16 5:01 PM, deadalnix wrote:
 What about storing the metadata at an address that is 
 computable from
 from the object's address, while not being contiguous with the 
 object
 allocated ? Is substracting a constant really the only option 
 here ?
 (hint, it is not)
I'd say, you have at your disposal a flexible allocator framework so a simple way to go about it would be to implement new allocators that take various options about storing metadata. -- Andrei
I have nothing against this as some sort of optin allocation scheme. But my understanding was that it is desirable to wire RC into this.
Feb 13 2016
prev sibling parent reply rsw0x <anonymous anonymous.com> writes:
On Saturday, 13 February 2016 at 21:10:50 UTC, Andrei 
Alexandrescu wrote:
 On 02/13/2016 01:50 PM, Mathias Lang via Digitalmars-d wrote:
 2016-02-12 20:12 GMT+01:00 Andrei Alexandrescu via 
 Digitalmars-d
 <digitalmars-d puremagic.com 
 <mailto:digitalmars-d puremagic.com>>:

     https://github.com/D-Programming-Language/phobos/pull/3991

     The only difference between an approach based on an 
 associative
     array and AffixAllocator is that the latter is faster (the
     association is fixed by layout).


 Could you point to your benchmark / paper about this ?
There's no need. I'll do the implementation with the prefix, and if you do it with a global hashtable within the same or better speed, my hat is off to you.
I believe he was referring to BiBoP.
Feb 13 2016
parent rsw0x <anonymous anonymous.com> writes:
On Saturday, 13 February 2016 at 22:16:02 UTC, rsw0x wrote:
 On Saturday, 13 February 2016 at 21:10:50 UTC, Andrei 
 Alexandrescu wrote:
 On 02/13/2016 01:50 PM, Mathias Lang via Digitalmars-d wrote:
 [...]
There's no need. I'll do the implementation with the prefix, and if you do it with a global hashtable within the same or better speed, my hat is off to you.
I believe he was referring to BiBoP.
er, nevermind, sorry for the noise. I shouldn't read threads from the end to the start.
Feb 13 2016
prev sibling next sibling parent reply Iakh <iaktakh gmail.com> writes:
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu 
wrote:
 https://github.com/D-Programming-Language/phobos/pull/3991

 A short while ago Dicebot discussed the notion of using the 
 allocator to store the reference count of objects (and 
 generally metadata). The allocator seems to be a good place 
 because in a way it's a source of "ground truth" - no matter 
 how data is qualified, it originated as untyped mutable bytes 
 from the allocator.
 [...]

 Destroy!

 Andrei
So you can use metadata only with global allocators, until you don't need to save ref to the allocator. Why can't we use attribute intransitive on ref fields to prevents immutable being transitively applied on referred content? Add some restrictions (only private, wrap using with trusted) and it would be enough to implement RefCounting with immutable.
Feb 13 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/13/2016 03:07 PM, Iakh wrote:
 So you can use metadata only with global allocators,
 until you don't need to save ref to the allocator.
Well you can use other allocators if you save them so you have them available for deallocation. -- Andrei
Feb 13 2016
parent Iakh <iaktakh gmail.com> writes:
On Saturday, 13 February 2016 at 21:12:10 UTC, Andrei 
Alexandrescu wrote:
 On 02/13/2016 03:07 PM, Iakh wrote:
 So you can use metadata only with global allocators,
 until you don't need to save ref to the allocator.
Well you can use other allocators if you save them so you have them available for deallocation. -- Andrei
Yeap. You did catch me. And if you don't have " intransitive const" or C++'s mutable you can't save ref to the allocator within object: struct RCExternal { //alias allocator = theGlobalAllocator; auto allocator = &theGlobalAllocator; // assign mutable to immutable private void[] data; ~this () { allocator.decRef(data.ptr); } }
Feb 13 2016
prev sibling next sibling parent reply ZombineDev <valid_email he.re> writes:
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu 
wrote:
 https://github.com/D-Programming-Language/phobos/pull/3991

 A short while ago Dicebot discussed the notion of using the 
 allocator to store the reference count of objects (and 
 generally metadata). The allocator seems to be a good place 
 because in a way it's a source of "ground truth" - no matter 
 how data is qualified, it originated as untyped mutable bytes 
 from the allocator.
I agree. Your allocators framework is very flexible and AffixAllocator looks like the perfect tool for the job. The type system already relies on the allocators and putting more trust in them can indeed enable some interesting opportunities.
 So after thinking a bit I managed to convince myself that the 
 affixes in an AffixAllocator can be accessed with removing 
 immutable, but without actually breaking any guarantee made by 
 the type system. (Affixes are extra bytes allocated before and 
 after the actual allocation.) The logic goes as follows:

 * If the buffer is mutable, then the allocator assumes it 
 hasn't been shared across threads, so it returns a reference to 
 a mutable affix.
I agree that non-shared data implies that it's meta data can't be shared, but I don't exactly agree with you reasoning - see below.
 * If the buffer is const, then the allocator must 
 conservatively assume it might have been immutable and 
 subsequently shared among threads. Therefore, several threads 
 may request the affix of the same buffer simultaneously. So it 
 returns a reference to a shared affix.
Absolutely disagree. See below.
 * If the buffer is shared, then the allocator assumes again 
 several threads may access the affix so it returns a reference 
 to a shared affix.
Correct. Shared data implies shared access to the corresponding metadata.
 One simple way to look at this is: the allocator keeps an 
 associative array mapping allocated buffers to metadata (e.g. 
 reference counts). The allocated buffers may be immutable, 
 which doesn't require the metadata to be immutable as well. The 
 only difference between an approach based on an associative 
 array and AffixAllocator is that the latter is faster (the 
 association is fixed by layout).
Yes, mutable metadata in front of immutable data is OK, as long as the users of the immutable data can't directly access the metadata (e.g. it is well encapsulated and those who do access it, must see is typed as `shared`).
 Destroy!

 Andrei
Overall it's an interesting and strong propsal, except for the part about prefix(const) magically returning shared. Shared is transitive and unless you take special care to remove it - e.g. http://dpaste.dzfl.pl/4294b9c2653e, you shouldn't be able to get non-shared references in to it (and vice-versa - you should have shared references in a non-shared objects). That's why I believe that unless the allocator is explicitly shared, you shouldn't be allowed to share the bytes allocated from it. So in order to get shareable immutable data, we should use shared Allocator and not try to subvert the type system, even though allocators are low-level infrastructure code.
Feb 13 2016
parent ZombineDev <valid_email he.re> writes:
On Saturday, 13 February 2016 at 21:38:30 UTC, ZombineDev wrote:
 ...
 get non-shared references in to it (and vice-versa - you should 
 have shared references in a non-shared objects).
 ...
you should ***not be able to*** have shared references in a non-shared objects
Feb 13 2016
prev sibling next sibling parent reply =?UTF-8?Q?S=c3=b6nke_Ludwig?= <sludwig outerproduct.org> writes:
One thing that I really miss from most of the discussions about RC is 
support for weak references, which IMO should definitely be supported in 
the standard RC implementation. Event if they are only needed rarely, 
there are some things that simply do not work without them. And I don't 
think they can be implemented efficiently on top of ordinary RC.

For them to work natively, the lifetime of the allocated memory block 
and that of the reference count must be separate. So they could either 
simply be a separate allocation, or live in a special reference count 
store, similar to what Objective-C does.
Feb 14 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/14/2016 03:08 AM, Sönke Ludwig wrote:
 For them to work natively, the lifetime of the allocated memory block
 and that of the reference count must be separate.
Not necessarily. C++ makes this work for make_shared by keeping the memory allocated around (but not the object) until the last weak ref goes away. We can do the same, but we also have a better alternative. Most of our allocators support shrink-in-place. For now I haven't exposed it as a primitive but that's short work. When the object goes away we can shrink memory in place to only the length of the metadata. Overall: I also think weak refs should be supported, but I know they can be tacked on later. Andrei
Feb 14 2016
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 February 2016 at 11:14:59 UTC, Andrei Alexandrescu 
wrote:
 We can do the same, but we also have a better alternative. Most 
 of our allocators support shrink-in-place. For now I haven't 
 exposed it as a primitive but that's short work. When the 
 object goes away we can shrink memory in place to only the 
 length of the metadata.
This will lead to massive memory fragmentation. When you allocate a lot of objects they tend to be of the same size...
Feb 14 2016
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/12/2016 02:12 PM, Andrei Alexandrescu wrote:
 https://github.com/D-Programming-Language/phobos/pull/3991
Finally I got around to update this. Please find holes in my logic. Updated documentation at: http://dtest.thecybershadow.net/artifact/website-6a8825cc07dee2db751fa4b61044073b228c20c7-b911819df047024a6d304d01957307a3/web/phobos-prerelease/std_experimental_allocator_building_blocks_affix_allocator.html Thanks! -- Andrei
Apr 23 2016