digitalmars.D.learn - improve concurrent queue
- luminousone (85/85) Aug 26 2013 I have been working on a project and needed a good concurrent
- qznc (10/12) Aug 27 2013 What does "good" mean? Concurrent queues have so many design
- luminousone (13/26) Aug 27 2013 Generally, multiple writers and a single reader, but multiple
- Guillaume Piolat (11/21) Jun 04 2016 A mutex usually have a very fast uncontended path. It only blocks
- Dejan Lekic (4/6) Jun 03 2016 The research paper can actually be freely downloaded from
I have been working on a project and needed a good concurrent queue, so I wrote one. Is there anyone more familiar with the atomic arts who could tell me if there is any way I can further improve it?

    module container.concurrentqueue;

    import std.typetuple;
    import core.atomic;

    class ConcurrentQueue( items... )
    {
        // Cache-line-aligned node; the queue always keeps one dummy node.
        align(64) class nodeType
        {
        align(1):
            this()
            {
                atomicStore( this.next, cast(shared nodeType) null );
            }

            this( TypeTuple!items value )
            {
                foreach( k, v ; value )
                {
                    this.value[k] = v;
                }
                this();
            }

            TypeTuple!items value;
            shared nodeType next;
        }

        class ConsumerResult
        {
            TypeTuple!items value;
            alias value this;
        }

        public this()
        {
            shared nodeType temp = cast(shared) new nodeType();
            atomicStore( first, temp );
            atomicStore( last , temp );
            atomicStore( producerLock, false );
            atomicStore( consumerLock, false );
        }

        public void Produce( items item )
        {
            TypeTuple!items t = item;
            shared nodeType temp = cast(shared) new nodeType( t );

            // Spin until the producer lock is acquired.
            while( !cas( &producerLock, false, true ) ) { }
            atomicStore( last.next, temp );
            atomicStore( last, temp );
            atomicStore( producerLock, false );
        }

        public ConsumerResult Consume()
        {
            // Spin until the consumer lock is acquired.
            while( !cas( &consumerLock, false, true ) ) { }
            shared nodeType temp = cast(shared) atomicLoad( first );
            shared nodeType next = cast(shared) atomicLoad( temp.next );
            ConsumerResult result = new ConsumerResult();
            if( next !is null )
            {
                foreach( k, v ; items )
                {
                    result[k] = cast(v) next.value[k];
                }
                atomicStore( first, next );
                atomicStore( consumerLock, false );
                return result;
            }
            atomicStore( consumerLock, false );
            return null;
        }

        // Padding to keep the hot fields on separate cache lines.
        private shared nodeType first;
        private byte[64] padd1;
        private shared nodeType last;
        private byte[64] padd2;
        private shared bool consumerLock;
        private byte[64] padd3;
        private shared bool producerLock;
    }
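For reference, the class above follows the classic two-lock queue layout: one dummy node, a producer lock guarding `last`, and a consumer lock guarding `first`. A condensed C++ sketch of the same structure (the names `TwoLockQueue`, `produce`, and `consume` are illustrative, and plain mutexes stand in for the CAS spinlocks; for a fully race-free version the `next` links would need to be atomic as in the original):

```cpp
#include <mutex>

template <typename T>
class TwoLockQueue {
    struct Node {
        T value{};
        Node* next = nullptr;   // in a real concurrent version, make this atomic
    };
    Node* head;                 // dummy node; head->next is the first real element
    Node* tail;                 // last node in the list
    std::mutex head_lock;       // serializes consumers
    std::mutex tail_lock;       // serializes producers

public:
    TwoLockQueue() { head = tail = new Node{}; }
    ~TwoLockQueue() {
        while (head) { Node* n = head->next; delete head; head = n; }
    }

    void produce(const T& v) {
        Node* n = new Node{v, nullptr};
        std::lock_guard<std::mutex> g(tail_lock);
        tail->next = n;         // link after the current tail...
        tail = n;               // ...then advance the tail
    }

    bool consume(T& out) {
        std::lock_guard<std::mutex> g(head_lock);
        Node* next = head->next;
        if (!next) return false;    // empty: only the dummy node remains
        out = next->value;
        delete head;                // retire the old dummy; next becomes the new dummy
        head = next;
        return true;
    }
};
```

Because producers only touch `tail` and consumers only touch `head`, the two locks never contend with each other except on the shared `next` link of the dummy node when the queue is empty.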
Aug 26 2013
On Monday, 26 August 2013 at 19:35:28 UTC, luminousone wrote:
> I have been working on a project and needed a good concurrent queue

What does "good" mean? Concurrent queues have so many design parameters (single reader? single writer? blocking? bounded? etc.), and no queue performs optimally in every case. I can recommend this paper (paywalled, though): http://dl.acm.org/citation.cfm?id=248106

Nevertheless, I believe you rather wanted some comments on your specific code. I wonder why you implemented spinlocks yourself? If you are writing a blocking algorithm, you should probably use the Phobos locks.
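The suggestion to use library locks for a blocking design can be sketched like this (in C++ here, with `std::mutex` and `std::condition_variable` standing in for the Phobos primitives; `BlockingQueue` is an illustrative name, not from the post):

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

template <typename T>
class BlockingQueue {
    std::deque<T> items;
    std::mutex m;
    std::condition_variable cv;

public:
    void produce(T v) {
        {
            std::lock_guard<std::mutex> g(m);
            items.push_back(std::move(v));
        }
        cv.notify_one();        // wake one waiting consumer
    }

    // Blocks until an item is available instead of busy-spinning.
    T consume() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !items.empty(); });
        T v = std::move(items.front());
        items.pop_front();
        return v;
    }
};
```

The key difference from a spinlock-based queue: a waiting consumer is parked by the OS rather than burning a core, which is usually what you want when the wait can be long.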
Aug 27 2013
On Tuesday, 27 August 2013 at 17:35:13 UTC, qznc wrote:
> What does "good" mean? Concurrent queues have so many design parameters (single reader? single writer? blocking? bounded? etc.), and no queue performs optimally in every case.

Generally, multiple writers and a single reader, but multiple writers and multiple readers may also occur.

> I can recommend this paper (paywalled, though): http://dl.acm.org/citation.cfm?id=248106
>
> Nevertheless, I believe you rather wanted some comments on your specific code. I wonder why you implemented spinlocks yourself? If you are writing a blocking algorithm, you should probably use the Phobos locks.

I was under the impression that an atomic spinlock has lower latency for any waiters than a mutex when it is unlocked?

I am using this as a temporary (or, depending on performance, permanent) replacement for std.concurrency send/receive until such time as the bugs in std.variant are fixed.

I have a thread with an OpenGL context, and I am using the queue to send that thread messages containing Drawable interface objects. Any needed preprocessing has already been done on them, so that only OpenGL calls to the video card are required to render them.
Aug 27 2013
On Tuesday, 27 August 2013 at 19:50:03 UTC, luminousone wrote:
> I was under the impression that an atomic spinlock has lower latency for any waiters than a mutex when it is unlocked?
>
> I am using this as a temporary (or, depending on performance, permanent) replacement for std.concurrency send/receive until such time as the bugs in std.variant are fixed.
>
> I have a thread with an OpenGL context, and I am using the queue to send that thread messages containing Drawable interface objects that have had any needed preprocessing done on them, so that only OpenGL calls have to be made to the video card to render them.

A mutex usually has a very fast uncontended path. It only blocks, and is only slow, when already taken. You can think of it as a spinlock plus a blocking safety net. In the uncontended case, a mutex with a fast path will hardly be any slower than a spinlock.

Use a spinlock instead if:
- you are dead sure you won't spin for "too long"
- you don't need reentrancy
- you don't want the performance hysteresis (fast while uncontended, slow once things begin to pile up in thread queues)
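The fast uncontended path described above is visible in a minimal test-and-test-and-set spinlock, sketched here in C++ (`SpinLock` and its members are illustrative names):

```cpp
#include <atomic>

class SpinLock {
    std::atomic<bool> locked{false};

public:
    // Uncontended fast path: a single atomic exchange, roughly the same
    // cost class as a mutex's uncontended fast path.
    bool try_lock() {
        return !locked.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        for (;;) {
            if (try_lock()) return;
            // Contended path: spin on a plain load so the cache line is
            // shared rather than ping-ponged between cores.
            while (locked.load(std::memory_order_relaxed)) { /* spin */ }
        }
    }

    void unlock() {
        locked.store(false, std::memory_order_release);
    }
};
```

A real mutex takes essentially the same fast path but, once contended, parks the waiting thread in the kernel (e.g. via a futex on Linux) instead of spinning, which is the "blocking safety net" mentioned above.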
Jun 04 2016
On Tuesday, 27 August 2013 at 17:35:13 UTC, qznc wrote:
> I can recommend this paper (paywalled, though): http://dl.acm.org/citation.cfm?id=248106

The research paper can actually be freely downloaded from http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA309412 . Good article, thanks!
Jun 03 2016