digitalmars.D.learn - Getting started with threads in D

"Henrik Valter Vogelius Hansson" <groogy groogy.se> writes:

Hi again!

I have looked around a little at what D offers, but I don't
really know what I should use, since D offers several ways to use
threads, some more high-level than others. I also don't really
know which one would be suitable for me.

A little background could help. I am a game developer, and during
my semester I want to experiment with making games in D. I use
threads to separate tasks that can easily run in parallel with
each other, the most common being a Logic/Graphics separation.
But as development progresses I usually add more threads; inside
graphics alone I can end up with 2 or 3 more.

I want to avoid Amdahl's law as much as possible and keep the
synchronization nodes as small as possible. The data exchange
should be as basic as possible but still leave room for
improvements and future additions.

The Concurrency library looked very promising, but it felt like
the synchronization wouldn't be that nice, although it would
provide random access to the data in your code. Correct me of
course if I am wrong. Is there a good thread pool system that
could be used? Does that system also handle solving dependencies
in the work-flow? This is more or less what we use at my work.

In the worst case I will just use the basic Thread class and
implement my own system on top of it. Then there is the question:
are there any pitfalls in the current library that I should be
aware of?

Jun 16 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

For starters, read this:

http://www.informit.com/articles/article.aspx?p=1609144

And look at these modules in the standard library:

http://dlang.org/phobos/std_concurrency.html
http://dlang.org/phobos/std_parallelism.html

- Jonathan M Davis

Jun 16 2012

Russel Winder <russel winder.org.uk> writes:

On Sun, 2012-06-17 at 03:15 +0200, Henrik Valter Vogelius Hansson wrote:
 Hi again!

 I have looked around a little with what D offers but don't know
 really what I should use since D offers several ways to use
 threads. Some more high level than others. Don't really also know
 which one would be suitable for me.

My take on this is that as soon as an applications programmer talks
about using threads in their program, they have admitted they are
working at the wrong level.  Applications programmers do not manage
their control stacks, applications programmers do not manage their
heaps, so why on earth manage your threads? Threads are an implementation
resource best managed by an abstraction.

Using processes and message passing (over a thread pool, as you are
heading towards in comments below) has proven over the last 30+ years to
be the only scalable way of managing parallelism, so use it as a
concurrency technique as well and get parallelism as near to free as
it is possible to get.

Ancient models and techniques such as actors, dataflow, CSP, data
parallelism are making a resurgence exactly because explicit shared
memory multi-threading is an inappropriate technique. It has just taken
the world 15+ years to appreciate this.

 A little background could help. I am a game developer and during
 my semester I want to experiment with making games in D. I use
 threads to separate some tasks that can easily work in parallel
 with each other. The most common being a Logic/Graphics
 separation. But as development progresses I usually add more
 threads like inside graphics I can end up with 2 or 3 more
 threads.

I can only repeat the above: don't think in terms of threads and shared
memory, think in terms of processes and messages passed between them.

 I want to avoid Amdahl's law as much as possible and have as
 small synchronization nodes. The data exchange should be as basic
 as possible but still have room for improvements and future
 additions.

Isn't the current hypothesis that you can't avoid Amdahl's Law? If what
you mean is that you want to ensure you have an embarrassingly parallel
solution so that speed up is linear that seems entirely reasonable, but
then D has a play in this game with the std.parallelism module.  It uses
the term "task" rather than process or thread to try and enforce an
algorithm-focused view. cf. http://dlang.org/phobos/std_parallelism.html
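
As a rough illustration of the task-based view mentioned above, here is a minimal sketch using std.parallelism. The function sumTo is a hypothetical stand-in for any independent piece of work; it is not taken from the thread. The sketch only shows how work is handed to the default pool without naming a thread anywhere.

```d
import std.parallelism : task, taskPool;

// Hypothetical stand-in for an expensive, independent computation.
long sumTo(int n)
{
    long total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

void main()
{
    // Submit one computation to the default pool as a task.
    auto t = task!sumTo(1_000);
    taskPool.put(t);
    // yieldForce waits for the result, or runs the task in this
    // thread if no worker has picked it up yet.
    assert(t.yieldForce == 500_500);

    // Data parallelism: iterations are distributed over the pool's workers.
    auto squares = new size_t[](100);
    foreach (i, ref sq; taskPool.parallel(squares))
        sq = i * i;
    assert(squares[9] == 81);
}
```

Note that nothing here lets the caller choose which worker thread runs a given task.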

 The Concurrency library looked very promising but felt like the
 synchronization wouldn't be that nice but it would provide a
 random-access to the data in your code. Correct me of course if I
 am wrong. Is there a good thread pool system that could be used?
 Does that system also handle solving dependencies in the
 work-flow? This is what we use at my work more or less.

What makes you say synchronization is not that nice?

Random access, data, threads and parallelism in the same paragraph
raises a red flag of warning!

std.concurrency is a realization of actors so there is effectively a
variety of thread pool involved. std.parallelism has task pools
explicitly.
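
To make the actor style concrete, here is a minimal sketch using std.concurrency. The names doubler and askDoubler are hypothetical and exist only for illustration. As far as I know, with the default scheduler each spawn starts its own thread, so treat any "pool" behaviour as an assumption rather than something this code demonstrates.

```d
import std.concurrency : ownerTid, receive, receiveOnly, send, spawn, Tid;

// Worker "actor": keeps its state private and talks only through messages.
void doubler()
{
    bool running = true;
    while (running)
    {
        receive(
            (int n) { send(ownerTid, n * 2); },
            (string cmd) { if (cmd == "stop") running = false; }
        );
    }
}

// Sends one value to a fresh worker and waits for the reply.
int askDoubler(int n)
{
    Tid worker = spawn(&doubler);
    send(worker, n);
    int reply = receiveOnly!int();
    send(worker, "stop");
    return reply;
}

void main()
{
    assert(askDoubler(21) == 42);
}
```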

 In worst case scenario I will just use the basic thread class and
 implement my own system above that. Then there is the question,
 is there any pitfalls in the current library that I should be
 aware of?

I am sure D's current offerings are not perfect but they do represent a
good part of the right direction to be travelling.  What is missing is a
module for dataflow processing(*) and one for CSP.  Sadly I haven't had
time to get stuck into doing an implementation as I had originally
planned 18 months or so ago: most of my time is now in the Python and
Groovy arena as that is where the income comes from.  cf. GPars
(http://gpars.codehaus.org) and Python-CSP – though the latter has
stopped moving due to planning a whole new Python framework for
concurrency and parallelism.


(*) People who talk about "you can implement dataflow with actors and
vice versa" miss the point about provision of appropriate abstractions
with appropriate performance characteristics.

--
Russel.

Jun 17 2012

"Henrik Valter Vogelius Hansson" <groogy groogy.se> writes:

Alright, I've been reading a lot about it now. I'm interested in
the TaskPool, but there is a problem, which is also why I have to
think about threads: OpenGL/DirectX contexts are only valid for
one thread at a time, and with the task pool I can't control which
thread is used for a given task, right? At least I couldn't find a
way. So that's out of the question. The concurrency library is...
I don't know. I usually do a very fast synchronization swap (just
swapping two pointers), while the concurrency library seems like
it would halt both threads for a longer time. Or am I viewing this
from the wrong direction? Should I do it like lazy evaluation,
maybe? If you need code examples of what I am talking about I can
give you that, though I don't know the code tag for this message
board.

I think I will still use the task pool, though all OpenGL calls
will have to be routed so that they are all made on the same
thread somehow.
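
One way such routing could look is sketched below, under stated assumptions: a single dedicated thread stands in for the owner of the graphics context, and pool workers forward commands to it as messages. DrawCmd, Shutdown, renderThread and renderFrame are hypothetical names, and no real OpenGL call is made; the point is only that every command is executed on one thread.

```d
import std.concurrency : ownerTid, receive, receiveOnly, send, spawn, Tid;
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical command; a real renderer would carry mesh or texture handles.
struct DrawCmd { int objectId; }
struct Shutdown {}

// The only thread that would own the graphics context and issue GL calls.
void renderThread()
{
    int drawn = 0;
    bool running = true;
    while (running)
    {
        receive(
            (DrawCmd cmd) { ++drawn; /* the actual draw call would go here */ },
            (Shutdown _) { running = false; }
        );
    }
    send(ownerTid, drawn); // report how many commands were executed
}

// Runs the per-object logic on the task pool and forwards the results.
int renderFrame(int objectCount)
{
    Tid renderer = spawn(&renderThread);
    foreach (id; taskPool.parallel(iota(objectCount)))
        send(renderer, DrawCmd(id));
    // The parallel loop has finished, so this arrives after every DrawCmd.
    send(renderer, Shutdown());
    return receiveOnly!int();
}

void main()
{
    assert(renderFrame(100) == 100);
}
```

Whether this beats doing everything on one thread depends on how much work the pool does per command, so it is worth profiling before committing to it.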

Are the message boxes for the threads in concurrency thread-safe?
Let's say we have two logic tasks running in parallel and both
are sending messages to the graphics thread. Would that result in
undefined behavior, or does the concurrency library handle this
kind of scenario for you?

Jun 22 2012

Sean Kelly <sean invisibleduck.org> writes:

On Jun 22, 2012, at 11:17 AM, Henrik Valter Vogelius Hansson wrote:

 Aight been reading a lot now about it. I'm interested in the TaskPool
 but there is a problem and also why I have to think about threads.
 OpenGL/DirectX contexts are only valid for one thread at a time. And
 with the task pool I can't control what thread to be used with the
 specified task right?

That's pretty much the entire point of a thread pool--it aims for
optimal task completion time, and does this via an opaque scheduling
mechanism.

 At least from what I could find I couldn't. So that's out of the
 question. The concurrency library is... I don't know. I most usually do
 a very fast synchronization swap(just swap two pointers) while the
 concurrency library seems like it would halt both threads for a longer
 time. Or am I viewing this from the wrong direction? Should I do it like
 lazy evaluation maybe? If you need code examples of what I am talking
 about I can give you that. Though I don't know the code-tag for this
 message board.

Games are an odd bird in that performance comes at the expense of much
else, and that it really isn't easy to parallelize the main loop.  That
said, the only time the concurrency library would halt a thread is if
you do a receive() with no timeout and the message you want isn't in the
queue.  So you can bypass this by using a timeout of 0 (basically a peek
operation), and changing the code path based on whether the desired
message was received.
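
The zero-timeout poll described above can be sketched with receiveTimeout from std.concurrency. The helper tryReceiveInt is a hypothetical name, and the message type int is an assumption chosen to keep the example small.

```d
import core.time : Duration;
import std.concurrency : receiveTimeout, send, thisTid;

// Non-blocking poll: returns true and stores the value if an int
// message is already queued, otherwise returns false immediately.
bool tryReceiveInt(out int value)
{
    int got;
    bool found = receiveTimeout(Duration.zero, (int n) { got = n; });
    value = got;
    return found;
}

void main()
{
    int v;
    assert(!tryReceiveInt(v));  // nothing queued yet, so no waiting
    send(thisTid, 7);           // post a message to our own mailbox
    assert(tryReceiveInt(v) && v == 7);
}
```

A game loop could call such a helper once per frame and take a different code path depending on the result.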

 I will still use the task pool I think though all OpenGL calls will
 have to be routed so they are all done on the same thread somehow.

I think that will net you worse performance than if the main thread just
did everything.  You still have synchronous execution but thread
synchronization on top of that.  Can ownership of an OpenGL/DirectX
context be passed between threads?  Can you maybe just give every thread
its own context and let it process whatever task you give to it, or is a
context necessarily linked with some set of operations?

 The message box for the threads in concurrency, are they thread safe?
 Let's say we have two logic tasks running in parallel and both are
 sending messages to the graphics thread. Would that result in undefined
 behavior or does the concurrency library handle this kind of scenario
 for you?

Since it's a concurrency library, of course the API is thread safe :-)
Basically, how receive() works is it first looks in a thread-local queue
for the desired message.  If one wasn't found it acquires a lock on that
thread's shared message queue, moves the shared queue elements into the
local queue, and releases the mutex.  Then it scans the new elements in
the list for a match.  If it still doesn't find one, it re-acquires the
mutex on the shared queue, and does the same thing.  If the shared queue
is ever empty during this process, receive() will block on a condition
variable up to the supplied timeout value.
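
The two-senders scenario from the question can be tried out with a small sketch like the following. The names producer, graphicsSink and runTwoProducers are hypothetical; the sink merely counts messages and stands in for a graphics thread.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn, Tid;

// Hypothetical logic task: sends `count` messages to the sink.
void producer(Tid sink, int count)
{
    foreach (i; 0 .. count)
        send(sink, i);
}

// Stand-in for the graphics thread: receives `expected` ints,
// then reports how many it got to the thread that spawned it.
void graphicsSink(int expected)
{
    int received = 0;
    foreach (_; 0 .. expected)
    {
        receiveOnly!int();
        ++received;
    }
    send(ownerTid, received);
}

// Two producers send concurrently into one mailbox.
int runTwoProducers(int perProducer)
{
    Tid sink = spawn(&graphicsSink, 2 * perProducer);
    spawn(&producer, sink, perProducer);
    spawn(&producer, sink, perProducer);
    return receiveOnly!int();
}

void main()
{
    assert(runTwoProducers(500) == 1000);
}
```

No explicit locking appears in user code; the senders only call send, and the mailbox handles the synchronization described above.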

The only performance issue with the concurrency API right now is that it
allocates a struct to wrap each sent message, so there is some GC load.
I experimented with using a shared free list instead, however, and it
didn't really help performance in my test cases.  I suspect I'd either
have to go to a lock-free free list or some other fairly fancy
approach.  Beyond that, I've experimented with using ref and not using
ref attributes for parameters everywhere applicable, etc.  The current
implementation is as fast as I could get things.

For future directions, I really want to add inter-process messaging.
That means serialization support and a scalable socket implementation
though.  Not to mention free time.  I've considered just hacking
together the implementation and limiting inter-process messages to
concrete variables as a proof of concept.  That would need just free
time.

Jun 22 2012

"Henrik Valter Vogelius Hansson" <groogy groogy.se> writes:

 Games are an odd bird in that performance comes at the expense 
 of much else, and that it really isn't easy to parallelize the 
 main loop.  That said, the only time the concurrency library 
 would halt a thread is if you do a receive() with no timeout 
 and the message you want isn't in the queue.  So you can bypass 
 this by using a timeout of 0 (basically a peek operation), and 
 changing the code path based on whether the desired message was 
 received.

Well, it also depends on how you do the receive. Right now I am
thinking of something like lazy evaluation: I only try to receive
the messages (with a timeout) at the point where I expect to use
them, instead of doing it all in the same place. The same goes
for the other end. I might be overthinking it, because it's
starting to sound more and more like how I used to work before I
tried task pools. And I guess the ability to specify thread IDs
for tasks won't be added in the near future? For example, all
OpenGL-related tasks would get a specific thread, while for all
the others it doesn't matter.

 Can ownership of an OpenGL/DirectX context be passed between
 threads?  Can you maybe just give every thread its own context
 and let it process whatever task you give to it, or is a
 context necessarily linked with some set of operations?

Ownership cannot be passed between threads. Giving every thread
its own context is possible but bothersome, because, for
instance, the different contexts would have different states
(backface culling, depth settings, and so on). Plus it would be
pretty slow, because I would have to call glFlush or similar to
force the drivers to make sure all texture data has been updated
for all threads, and so on. Most of these problems with contexts
and threads I have learned the hard way :P

If you have an opinion on what would be the best way to do this,
I am interested, even if it is single-threading it. But I want a
motivation, of course. Otherwise I'll just go with the
concurrency library and the lazy evaluation idea. I'll probably
profile a little and weigh up what is easiest to work with and to
expand on later as well.

Jun 22 2012
