digitalmars.D - std.container.BinaryHeap + refCounted = WTF???
- dsimcha <dsimcha yahoo.com> Nov 16 2010
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Nov 16 2010
- Michel Fortin <michel.fortin michelf.com> Nov 16 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 16 2010
- dsimcha <dsimcha yahoo.com> Nov 17 2010
- dsimcha <dsimcha yahoo.com> Nov 17 2010
- Sean Kelly <sean invisibleduck.org> Nov 17 2010
- dsimcha <dsimcha yahoo.com> Nov 17 2010
- dsimcha <dsimcha yahoo.com> Nov 17 2010
- dsimcha <dsimcha yahoo.com> Nov 29 2010
- dsimcha <dsimcha yahoo.com> Nov 29 2010
- dsimcha <dsimcha yahoo.com> Nov 29 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 17 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 17 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 17 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 17 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 17 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 29 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Nov 29 2010
I'm trying to use BinaryHeap in a multithreaded programming using
std.parallelism/parallelfuture. I kept getting ridiculous segfaults and
stuff, and when I looked into it in a debugger, I realized the crashes had to
do with reference counting, probably caused by this line in BinaryHeap:
private RefCounted!(Tuple!(Store, "_store", size_t, "_length"),
RefCountedAutoInitialize.no) _payload;
Why the heck are the payload and the length being stored in such a weird way?
IMHO BinaryHeap shouldn't use reference counting unless Store does. More
generally, anything that uses reference counting, especially where unexpected,
should come with a very strong warning in its ddoc so that people are aware of
the hidden caveats, like copying it using memcpy() and sharing it across
threads.
Nov 16 2010
On 11/16/10 12:47 PM, dsimcha wrote:I'm trying to use BinaryHeap in a multithreaded programming using std.parallelism/parallelfuture. I kept getting ridiculous segfaults and stuff, and when I looked into it in a debugger, I realized the crashes had to do with reference counting, probably caused by this line in BinaryHeap: private RefCounted!(Tuple!(Store, "_store", size_t, "_length"), RefCountedAutoInitialize.no) _payload; Why the heck are the payload and the length being stored in such a weird way? IMHO BinaryHeap shouldn't use reference counting unless Store does. More generally, anything that uses reference counting, especially where unexpected, should come with a very strong warning in its ddoc so that people are aware of the hidden caveats, like copying it using memcpy() and sharing it across threads.
BinaryHeap being a container it has reference semantics. In order to also be sealed, it needs to be reference counted. The fact that the store uses itself reference counting is not relevant because BinaryHeap has two fields. Andrei
Nov 16 2010
On 2010-11-16 15:47:07 -0500, dsimcha <dsimcha yahoo.com> said:I'm trying to use BinaryHeap in a multithreaded programming using std.parallelism/parallelfuture. I kept getting ridiculous segfaults and stuff, and when I looked into it in a debugger, I realized the crashes had to do with reference counting, probably caused by this line in BinaryHeap: private RefCounted!(Tuple!(Store, "_store", size_t, "_length"), RefCountedAutoInitialize.no) _payload; Why the heck are the payload and the length being stored in such a weird way? IMHO BinaryHeap shouldn't use reference counting unless Store does. More generally, anything that uses reference counting, especially where unexpected, should come with a very strong warning in its ddoc so that people are aware of the hidden caveats, like copying it using memcpy() and sharing it across threads.
RefCounted, and many other things in Phobos, aren't thread-safe because of those reference counters. This problem can occur even if you don't share it with other threads, as the GC may destroy objects in another thread. See bug 4624: <http://d.puremagic.com/issues/show_bug.cgi?id=4624> Also bug 4621 about the compiler not catching the problem at compile-time (as it should catch low-level races): <http://d.puremagic.com/issues/show_bug.cgi?id=4621> I suggest you try to make RefCounted use an atomic reference counter to see if it removes your segfaults. Andrei acknowledged the issue recently (Walter too, I think), so I trust it'll be fixed at some point. One idea was to change the GC so it knows in which thread it should call destructors... -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 16 2010
On Tue, 16 Nov 2010 15:47:07 -0500, dsimcha <dsimcha yahoo.com> wrote:I'm trying to use BinaryHeap in a multithreaded programming using std.parallelism/parallelfuture. I kept getting ridiculous segfaults and stuff, and when I looked into it in a debugger, I realized the crashes had to do with reference counting, probably caused by this line in BinaryHeap: private RefCounted!(Tuple!(Store, "_store", size_t, "_length"), RefCountedAutoInitialize.no) _payload; Why the heck are the payload and the length being stored in such a weird way? IMHO BinaryHeap shouldn't use reference counting unless Store does. More generally, anything that uses reference counting, especially where unexpected, should come with a very strong warning in its ddoc so that people are aware of the hidden caveats, like copying it using memcpy() and sharing it across threads.
I think in general containers don't work across multiple threads unless specifically designed to do that. dcollections containers would probably all fail if you tried to use them from multiple threads. That being said, I'm not a huge fan of reference counting period. Containers have no business being reference counted anyways, since their resource is memory, and should be handled by the GC. This doesn't mean pieces of it shouldn't be reference counted or allocated via malloc or whatever, but the object itself's lifetime should be managed by the GC IMO. Not coincidentally, this is how dcollections is set up. -Steve
Nov 16 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleI think in general containers don't work across multiple threads unless specifically designed to do that.
I'm making the assumption that you'd handle all the synchronization issues yourself. When you need to update the container, there's an obvious issue. In general I don't like the trend in D of building things into arrays, containers, etc. such that they can't be shared across threads due to obscure implementation details even when it looks safe. (For arrays, I'm referring to the appending issue, which is problematic when I try to append to an array from multiple threads, synchronizing manually.)dcollections containers would probably all fail if you tried to use them from multiple threads. That being said, I'm not a huge fan of reference counting period. Containers have no business being reference counted anyways, since their resource is memory, and should be handled by the GC. This doesn't mean pieces of it shouldn't be reference counted or allocated via malloc or whatever, but the object itself's lifetime should be managed by the GC IMO. Not coincidentally, this is how dcollections is set up.
I think reference counting is an elegant solution to a niche problem (namely, memory management of large containers that won't have circular references when memory is tight), but given all the baggage it creates, I don't think it should be the default for any container. I think we need to start thinking about custom allocators, and allow reference counting but make GC the default.
Nov 17 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleI think that we need a wrapper for containers that implements the shared methods required and manually locks things in order to use them. Then you apply this wrapper to any container type, and it's now a shared container. There are also certain types of containers that lend themselves to shared access. For example, I can see a linked list where each node contains a lock being a useful type.
This is a good idea to some degree, but the thing is that it forces you to use shared even when you're going for fine-grained parallelism and want to just cowboy it. For fine-grained parallelism use cases, my hunch is that cowboying is going to be the only game in town for a long time in all languages, not just D.(For arrays, I'm referring to the appending issue, which is problematic when I try to append to an array from multiple threads, synchronizing manually.)
appending is possible.dcollections containers would probably all fail if you tried to use them from multiple threads.
Ok, I stand corrected. It seemed to work in practice, but always I just assumed that it was a Bad Thing to do and worked for the Wrong Reasons.memory (a relatively abundant resource).
Apparently you've never tired working with multigigabyte datasets using a conservative garbage collector and 32-bit address space.
Nov 17 2010
Steven Schveighoffer Wrote:There is specific code in array appending that locks a global lock when appending to shared arrays. Appending to __gshared arrays from multiple threads likely will not work in some cases though. I don't know how to get around this, since the runtime is not made aware that the data is shared.
The shared attribute will have to become a part of the TypeInfo, much like const is now. Knowing whether data is shared can affect where/how the memory block is allocated by the GC, etc.
Nov 17 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more pages in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably append to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
Nov 17 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more pages in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably append to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
append without using the cache. -Steve
I thought the whole point of assumeSafeAppend is that it puts the current ptr and length into the cache as-is.
Nov 17 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more pages in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably append to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
append without using the cache. -Steve
How about using std.array.Appender? This looks safe as far as I can tell, but I want to make sure there aren't any obscure implementation details that would prevent this from working, too.
Nov 29 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Mon, 29 Nov 2010 10:58:05 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more
in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably
to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
to append without using the cache. -Steve
How about using std.array.Appender? This looks safe as far as I can tell, but I want to make sure there aren't any obscure implementation details that would prevent this from working, too.
fact, if you try to do normal appending to an array built with Appender, it will automatically reallocate because the memory does not have the APPENDABLE bit set (new GC bit I added to avoid misinterpretations). This is a good solution. -Steve
Sounds like a good solution to me, too. As long as this situation is unlikely to change in the future, I think we should scrap the idea of making appending to __gshared using ~= safe with manual synchronization, BUT the following needs to be documented: 1. Due to obscure implementation details, it is **not** safe to append to a non-shared array even if you synchronize manually. 2. If you're doing low-level shared-memory cowboy multithreading, Appender is the correct solution. 3. In general, any data structure, etc. in Phobos/druntime that can't be made thread-safe by using a synchronized/shared wrapper or manual synchronization due to obscure implementation details (especially if this would be safe with a textbook implementation) should come with an **bold** or ALL CAPS explicit warning against doing such things.
Nov 29 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article1. Due to obscure implementation details, it is **not** safe to append to a non-shared array even if you synchronize manually.
append to non-shared arrays. -Steve
No, I meant appending to non-shared arrays from multiple threads, i.e. __gshared or passed by reference between threads.
Nov 29 2010
On Wed, 17 Nov 2010 09:17:05 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleI think in general containers don't work across multiple threads unless specifically designed to do that.
I'm making the assumption that you'd handle all the synchronization issues yourself. When you need to update the container, there's an obvious issue. In general I don't like the trend in D of building things into arrays, containers, etc. such that they can't be shared across threads due to obscure implementation details even when it looks safe.
I think that we need a wrapper for containers that implements the shared methods required and manually locks things in order to use them. Then you apply this wrapper to any container type, and it's now a shared container. There are also certain types of containers that lend themselves to shared access. For example, I can see a linked list where each node contains a lock being a useful type.(For arrays, I'm referring to the appending issue, which is problematic when I try to append to an array from multiple threads, synchronizing manually.)
I'm interested what you mean here, I tried to make sure cross-thread appending is possible.dcollections containers would probably all fail if you tried to use them from multiple threads. That being said, I'm not a huge fan of reference counting period. Containers have no business being reference counted anyways, since their resource is memory, and should be handled by the GC. This doesn't mean pieces of it shouldn't be reference counted or allocated via malloc or whatever, but the object itself's lifetime should be managed by the GC IMO. Not coincidentally, this is how dcollections is set up.
I think reference counting is an elegant solution to a niche problem (namely, memory management of large containers that won't have circular references when memory is tight), but given all the baggage it creates, I don't think it should be the default for any container. I think we need to start thinking about custom allocators, and allow reference counting but make GC the default.
The memory management of a container's innards is open to reference counting (or whatever, I agree that allocators should be supported somewhere). I just object to reference counting of the container itself, as it's not important to me whether a container gets closed outside the GC automatically. With something like a File it's different since you are coupling closing a file (a very limited resource) with cleaning memory (a relatively abundant resource). -Steve
Nov 17 2010
On Wed, 17 Nov 2010 10:14:21 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleI think that we need a wrapper for containers that implements the shared methods required and manually locks things in order to use them. Then you apply this wrapper to any container type, and it's now a shared container. There are also certain types of containers that lend themselves to shared access. For example, I can see a linked list where each node contains a lock being a useful type.
This is a good idea to some degree, but the thing is that it forces you to use shared even when you're going for fine-grained parallelism and want to just cowboy it. For fine-grained parallelism use cases, my hunch is that cowboying is going to be the only game in town for a long time in all languages, not just D.
There is always the possibility of "cowboy"ing it. But I don't see that the standard lib should be catering to this.(For arrays, I'm referring to the appending issue, which is
when I try to append to an array from multiple threads, synchronizing manually.)
appending is possible.dcollections containers would probably all fail if you tried to use
from multiple threads.
Ok, I stand corrected. It seemed to work in practice, but always I just assumed that it was a Bad Thing to do and worked for the Wrong Reasons.
There is specific code in array appending that locks a global lock when appending to shared arrays. Appending to __gshared arrays from multiple threads likely will not work in some cases though. I don't know how to get around this, since the runtime is not made aware that the data is shared.memory (a relatively abundant resource).
Apparently you've never tired working with multigigabyte datasets using a conservative garbage collector and 32-bit address space.
Is that supported by out-of-the-box containers? I would expect you need to create a special data structure to deal with such things. And no, I don't regularly work with such issues. But my point is, reference counting the *container* which uses some sort of memory allocation to implement its innards is not coupling a limited resource to memory allocation/deallocation. In other words, I think it's better to have the container be a non-reference counted type, even if you reference-count the elements. I prefer class semantics to be quite honest, where explicit initialization is required. -Steve
Nov 17 2010
On Wed, 17 Nov 2010 11:58:20 -0500, Sean Kelly <sean invisibleduck.org> wrote:Steven Schveighoffer Wrote:There is specific code in array appending that locks a global lock when appending to shared arrays. Appending to __gshared arrays from multiple threads likely will not work in some cases though. I don't know how to get around this, since the runtime is not made aware that the data is shared.
The shared attribute will have to become a part of the TypeInfo, much like const is now. Knowing whether data is shared can affect where/how the memory block is allocated by the GC, etc.
shared is part of it, but __gshared is not. Since __gshared is the hack to allow "bare metal" sharing, I don't see how it can be part of the type info. The issue is that if you append to such an array and it adds more pages in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably append to such data from multiple threads. -Steve
Nov 17 2010
On Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more pages in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably append to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
No, that does not affect your cache. I probably should add a function to append without using the cache. -Steve
Nov 17 2010
On Wed, 17 Nov 2010 13:58:55 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more
in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably
to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
to append without using the cache. -Steve
I thought the whole point of assumeSafeAppend is that it puts the current ptr and length into the cache as-is.
All the cache does is store the block info -- block start, block size, and block flags. The length is stored in the block directly. The cache allows me to skip a call to the GC (and lock the GC's global mutex) by getting the block info directly from a small cache. The block info is then used to determine where and how the "used" length is stored. Since the length is stored at the end, a change in block size in one cache while being unchanged in another cache can lead to problems. assumeSafeAppend sets the "used" block length as the given array's length so the block can be used again for appending. It does not affect the cache. Another option is to go back to the mode where the "used" length is stored at the beginning of large blocks (this caused alignment problems for some people). -Steve
Nov 17 2010
On Mon, 29 Nov 2010 10:58:05 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more
in place, the block length location will move. Since each thread caches its own copy of the block info, one will be wrong and look at array data thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably
to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
to append without using the cache. -Steve
How about using std.array.Appender? This looks safe as far as I can tell, but I want to make sure there aren't any obscure implementation details that would prevent this from working, too.
That should be fine, std.array.Appender does not affect the LRU cache. In fact, if you try to do normal appending to an array built with Appender, it will automatically reallocate because the memory does not have the APPENDABLE bit set (new GC bit I added to avoid misinterpretations). This is a good solution. -Steve
Nov 29 2010
On Mon, 29 Nov 2010 11:44:23 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Mon, 29 Nov 2010 10:58:05 -0500, dsimcha <dsimcha yahoo.com> wrote:== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleOn Wed, 17 Nov 2010 12:09:11 -0500, dsimcha <dsimcha yahoo.com>
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleThe issue is that if you append to such an array and it adds more
in place, the block length location will move. Since each thread
its own copy of the block info, one will be wrong and look at array
thinking it's a length field. Even if you surround the appends with a lock, it will still cause problems because of the cache. I'm not sure there's any way to reliably
to such data from multiple threads. -Steve
Would assumeSafeAppend() do the trick?
to append without using the cache. -Steve
How about using std.array.Appender? This looks safe as far as I can tell, but I want to make sure there aren't any obscure implementation details that would prevent this from working, too.
In fact, if you try to do normal appending to an array built with Appender, it will automatically reallocate because the memory does not have the APPENDABLE bit set (new GC bit I added to avoid misinterpretations). This is a good solution. -Steve
Sounds like a good solution to me, too. As long as this situation is unlikely to change in the future, I think we should scrap the idea of making appending to __gshared using ~= safe with manual synchronization,
I think it's worth giving people who want low-level access to the runtime functions the ability to avoid using the LRU cache. The cache is meant to support everyday coding, but can possibly cause problems if you are interested in doing low-level optimizations.BUT the following needs to be documented: 1. Due to obscure implementation details, it is **not** safe to append to a non-shared array even if you synchronize manually.
I think you meant __gshared, not non-shared. It's perfectly safe to append to non-shared arrays. -Steve
Nov 29 2010









Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> 