digitalmars.D.learn - Concurrency and transferring "ownership" of data between threads?

Heywood Floyd <soul8o8 gmail.com> writes:
Good Evening from Berlin!


Have been reading the chapter about concurrency by Andrei. Nice.

I have some questions, of varying quality, I'm sure.


Let's say that we have some sort of structure of rather complex data. To give
us something concrete to talk about, let's say this data is a tree of nodes
representing 3-dimensional objects. It could look something like this: (Not
complete example, just to give an idea.)

// ...
class Cube : Node{
   float x, y, z, size, mass, elasticity;
   // ...
}
// ...
tree.add(new Cube("cube1"));
tree["cube1"].add(new Cube("cube2"));

Let's further say that this structure of data will be subjected to two
different activities: 1) We will change properties of some nodes according to
some complex lengthy calculations, which may even entail changing the position
of nodes in the tree, and 2) we will traverse the tree in a recursive manner
and read the properties in order to render a representation of these nodes to
screen. 

These two activities will then be repeated many times, so, let's say we wish to
do these two activities in parallel as much as possible!

How do we do that?

From what I can tell, one way of doing this would be to actually have two data
structures, one which is the "original" and is used for the calculations, and
one which is just a copy of the data after each calculation. We could then
insert a third activity, which we can call "copy", in between the two threads.
Something like this:

|== Calculate ===| Copy |
                 |  v   |===== Render ====|

Seems to me this would then allow us to interleave these two activities:

..|== Calculate ===| Copy |== Calculate ====|..
..|==== Render ====|  v   |==== Render =====|..

(Sorry if the ASCII graphics looks skewed.)
So let me just try and set up some kind of rudimentary code for this in
layman's D:

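import std.concurrency;

// Node, Cube, calc(), render() and deepDup() are assumed to exist elsewhere.
void main()
{
    auto tree = new Node();
    tree.add(new Cube("cube1"));

    auto child = spawn(&renderLoop, thisTid);

    while (true)
    {
        calc(tree);
        auto treeCopy = tree.deepDup();
        receiveOnly!bool();     // wait until the renderer is ready
        // As written, std.concurrency rejects a plain mutable Node here;
        // the message would have to be immutable or shared.
        send(child, treeCopy);
    }
}

void renderLoop(Tid parent)
{
    send(parent, true);         // signal that we are ready for the first tree
    while (true)
    {
        Node tree = receiveOnly!Node();
        render(tree);
        send(parent, true);     // ask for the next tree
    }
}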


So a couple of thoughts here. 

- Is this looking ok? What is the "proper" way of going about this?

- How do we send over a large chunk of complex data to another thread and sort
of let them assume "ownership" of this data? So that the thread receiving the
data does not need any locks etc. when working with the data? I suppose we
could send over the entire treeCopy as immutable (?) but let's for the sake of
argument assume that renderLoop() needs to be able to _write_ in the copy of
the tree structure too!

(- Sidenote: How do we efficiently copy a tree-structure of data several times
per second? Sounds insane to me?)
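
For the "send the copy as immutable" option mentioned above, a minimal sketch could look like the following. The Node class here is a hypothetical stand-in for the tree in the post (it is not the poster's actual class), snapshot() plays the role of deepDup(), and countNodes() stands in for render(). It assumes the renderer only needs to read the tree.

```d
import std.concurrency;
import std.stdio;

// Hypothetical stand-in for the Node/Cube tree from the post.
final class Node
{
    string name;
    float x = 0;
    Node[] children;

    this(string name, float x)
    {
        this.name = name;
        this.x = x;
    }

    // Constructor used only when building an immutable snapshot.
    this(string name, float x, immutable(Node)[] children) immutable
    {
        this.name = name;
        this.x = x;
        this.children = children;
    }

    // Deep-copies this subtree into a fresh, fully immutable tree.
    immutable(Node) snapshot() const
    {
        immutable(Node)[] kids;
        foreach (c; children)
            kids ~= c.snapshot();
        return new immutable(Node)(name, x, kids);
    }
}

// Stand-in for render(): walks the tree read-only and counts the nodes.
size_t countNodes(immutable(Node) n)
{
    size_t total = 1;
    foreach (c; n.children)
        total += countNodes(c);
    return total;
}

void renderLoop(Tid parent)
{
    // Immutable data may be passed between threads without locks.
    receive((immutable(Node) tree) {
        send(parent, countNodes(tree));
    });
}

void main()
{
    auto tree = new Node("root", 0);
    tree.children ~= new Node("cube1", 1);
    tree.children[0].children ~= new Node("cube2", 2);

    auto child = spawn(&renderLoop, thisTid);
    send(child, tree.snapshot());      // the original stays mutable here
    writeln(receiveOnly!size_t());     // number of nodes the renderer saw
}
```

The calculating thread keeps mutating its own tree while the renderer works on the snapshot, so neither side needs a lock. The price is the deep copy per frame, which is exactly the cost the sidenote worries about.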

Let's further, for the sake of argument, NOT consider that the GPU will do most
of the rendering and that in effect parallelising in this particular case may
be only marginally beneficial. Let's simply assume we have only one processor
here: your normal household dual-core CPU, and a VGA graphics card from the 80s.

For me, in my head, parallelising is very much about designing a flow of data
through different processes, and this flow chart will have certain "hot spots"
where data must be guarded. You have one process doing something and then
"transferring over" some data to the next process that continues the
calculations. (With process I simply mean data processing activity, not a "unix
process".)

|---calc A--->O----calc C----> etc.
|---calc B-----^

O = transfer point
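
One way to read the diagram above in terms of std.concurrency is that the mailbox of the receiving thread is the transfer point. The sketch below is only an illustration under that assumption: compute() and worker() are hypothetical stand-ins for "calc A" and "calc B", and main() plays "calc C". Each worker builds its array privately and then hands it over with assumeUnique, which converts it to immutable without copying.

```d
import std.concurrency;
import std.exception : assumeUnique;
import std.stdio;

// Hypothetical stand-in for "calc A" / "calc B": builds a private array
// and returns it as immutable. assumeUnique does not copy; it relies on
// no other mutable reference to the array existing.
immutable(float)[] compute(float start)
{
    auto data = new float[](4);
    foreach (i, ref v; data)
        v = start + i;
    return assumeUnique(data);
}

void worker(Tid sink, float start)
{
    // The send is the "O" in the diagram: after it, only the receiver
    // uses the array.
    send(sink, compute(start));
}

void main()
{
    spawn(&worker, thisTid, 0.0f);
    spawn(&worker, thisTid, 10.0f);

    // "calc C": continue working on whatever arrives, in any order.
    double total = 0;
    foreach (_; 0 .. 2)
    {
        auto part = receiveOnly!(immutable(float)[])();
        foreach (v; part)
            total += v;
    }
    writeln(total);
}
```

No locks are held while the workers compute; the only synchronisation happens inside send and receive.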

I find it difficult to see how this is done in D. (Of course, I'm not sure this
"transfer"-idea makes any practical sense.) I understand immutable makes data
read-only, contagiously, and shared makes data heavily guarded. 

But yes, is there any way we can transfer "ownership" (not sure if that is the
right term) of data from one thread to another? So that we can have two threads
working on two pieces of data, then let them copy it (or not!), and then
transfer ownership and have a third thread work on the copied data, without
barriers or guards or stuff like that during the time of actual work?


Kind regards and sorry for a lengthy sporadic post
/Heywood Floyd
Dec 13 2010
Pelle Månsson <pelle.mansson gmail.com> writes:
On 12/13/2010 09:45 PM, Heywood Floyd wrote:
 But yes, is there any way we can transfer "ownership" (not sure if that is the
right term) of data from one thread to another? So that we can have two threads
working on two pieces of data, then let them copy it (or not!), and then
transfer ownership and have a third thread work on the copied data, without
barriers or guards or stuff like that during the time of actual work?


I think you should probably use shared, with the guards and all, and then not copy the tree but just send the reference. If the write barriers make this too slow, try selectively casting away shared at the heavy working spots, when you know that is the only place using the object at that time. I think that should be safe. (?) :)
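
A minimal sketch of the "send the reference as shared, cast shared away while working" idea might look like this. The Node class is again a hypothetical stand-in, and the handover protocol (the sender drops its reference right after sending) is an assumption of this example; the compiler does not enforce it.

```d
import std.concurrency;
import std.stdio;

// Hypothetical stand-in for the tree from the original post.
final class Node
{
    string name;
    float x = 0;
    Node[] children;

    this(string name) { this.name = name; }
}

void renderLoop(Tid parent)
{
    receive((shared(Node) s) {
        // Cast shared away for the duration of the work. This is only safe
        // because, by convention, no other thread touches the tree now.
        auto tree = cast(Node) s;
        tree.x += 1.0f;                  // the receiver may write freely
        send(parent, cast(shared) tree); // hand ownership back
    });
}

void main()
{
    auto tree = new Node("root");
    auto child = spawn(&renderLoop, thisTid);

    // Transfer: no copy is made, only the reference travels.
    send(child, cast(shared) tree);
    tree = null;                         // give up our reference

    // Wait until the other thread hands the tree back.
    receive((shared(Node) s) {
        tree = cast(Node) s;
    });
    writeln(tree.name, " ", tree.x);
}
```

The casts are a promise by the programmer rather than a checked guarantee, so this only holds up if every handover really does leave exactly one thread with a usable reference.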
Dec 14 2010