www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Functional Muti-threading

reply janderson <askme me.com> writes:
Here's one idea about how D could help support multi-threading.  What if 
you could specify which functions where thread-able.  ie:

threadable int foo(int X)
{
...
}

void bar()
{
    int a = 0;
    int array[3];
    x[0] = foo(a);  //Separate worker thread
    x[1] = foo(a);  //Separate worker thread (or could be main thread)
    x[2] = foo(a);  //Main thread
    //Would sleep here until things are done
}

Say foo was a relativity small function.  The compiler may decide to 
pack 2 thread operations together for greater efficiency.  The compiler 
would have the ultimate decision about when something gets placed on 
another thread.  Whether an application is compiled threaded or not 
would be a compiler flag (-threaded).

The compiler should also be able to determine functions that can be 
threaded saftly and add these to the list (kinda like how inline works 
in C++ -> you can mark something as inline to let the compiler know but 
it also makes its own decisions).

Consider the for loop:

int array = new int[];
array.length = GetSize();
for (int n=0; n<array.length; ++n)
{
    array[0] = foo(4);
}
//Sleep until finished

In this particular case the compiler knows at runtime what the size is 
just before the loop starts (and size does not change).  It could divide 
the work into the number of hardware threads.

Consider the for loop:

int array[] = new int[];
for (int n=0; n<GetSize(); ++n)
{
    array ~= foo(4);
}
//Sleep until finished

In this case the compiler has no-idea about how many times the loop will 
run.  It would have to hand out the tasks for foo one at a time to 
whatever worker thread were least busy.

Another thought for the for loop was to do something like:

int array[] = new int[];
Threadable(4) for (int n=0; n<GetSize(); ++n)
{
    array ~= foo(4);
}

Where threadable(4) tells the compiler that the work should be served 
out in lots of 4.  It could compile 4 versions of of the function (ie 
one with 4 foos, one with 3, one with 2 and one with 1).

Because these optimisation could get really complex, I think you'd start 
by just allowing the threadable option and then slowly add optimizations 
to support it.

The downside to this approach is that you would probably not be able to 
get the performance a of hand tuned threading application.  But then 
again, that would take a lot longer to write.

Anyways the point is, the compiler would decide when to do 
multithreading and what threadable functions can be compounded together 
at compile time to make for an optimal program.

Just a ruff idea.  I think compiler support for multi-threading in some 
form would be helpful in D's success.

What do you think?
Jan 21 2007
next sibling parent reply "Saaa" <empty needmail.com> writes:
I'd recommend looking at the futurism lib in D.announce and see how much of 
this is already there ; ) 
Jan 21 2007
parent reply janderson <askme me.com> writes:
Saaa wrote:
 I'd recommend looking at the futurism lib in D.announce and see how much of 
 this is already there ; ) 
 

The futurism library was one of the things that triggered my memory about this idea. Well done on a great library! However I think it could be done better by the compiler in many regards. Things like futurism would still have their place. The difference is that in futurism library threading based on the call, not the function itself. It doesn't determine if something should be threaded or not, users have to decide on a case by case basis. It doesn't compound threaded functions together. Its more work for programmer to maintain. Its harder to read. Its harder to turn on and off. Placing sleep-until-done-messages would also need to be done by hand. One thought though, you may be able to get pretty close to that for-loop thing with delegates and templates (for compounding). Although if you had 2 loops in a row (or a loop and threaded function) it wouldn't be able to figure out that these both can be running at once. -Joel
Jan 21 2007
parent reply janderson <askme me.com> writes:
janderson wrote:
 Saaa wrote:
 I'd recommend looking at the futurism lib in D.announce and see how 
 much of this is already there ; )

The futurism library was one of the things that triggered my memory about this idea. Well done on a great library! However I think it could be done better by the compiler in many regards. Things like futurism would still have their place. The difference is that in futurism library threading based on the call, not the function itself. It doesn't determine if something should be threaded or not, users have to decide on a case by case basis. It doesn't compound threaded functions together. Its more work for programmer to maintain. Its harder to read. Its harder to turn on and off. Placing sleep-until-done-messages would also need to be done by hand. One thought though, you may be able to get pretty close to that for-loop thing with delegates and templates (for compounding). Although if you had 2 loops in a row (or a loop and threaded function) it wouldn't be able to figure out that these both can be running at once. -Joel

Also how would futurism figure out if a function (that isn't labeled) is candidate for multi-threading? -Joel
Jan 21 2007
parent janderson <askme me.com> writes:
Case in point:

To take an example from Kevin Bealer,

int sum(int[] x)
{
     int rv = 0;
     foreach(int y; x) {
         rv += y;
     }
     return rv;
}

int sum_4(int[] x)
{
     alias Future!(int) FInt;
     int part = x.length/4;

     // Each of these queues a work item to be done either on some
     // other thread, or by the main thread here.

     FInt x1 = new FInt({ return sum(x[0..part]); });
     FInt x2 = new FInt({ return sum[part..part*2]); });
     FInt x3 = new FInt({ return sum[part*2..part*3]); });
     FInt x4 = new FInt({ return sum[part*3..$]; });

     // The addition below waits for each results to complete
     // before adding it to the total.

     return x1.value + x2.value + x3.value + x4.value;
}


Could be written like:
int sum(int[] x)
{
     int rv = 0;
     threadable(4) foreach(int y; x) {
         rv += y;
     }
     return rv;
}

Or even better
threadable int sum(int[] x)
{
     int rv = 0;
     threadable(4) foreach(int y; x) {
         rv += y;
     }
     return rv;
}

Now sum can be spliced in with other CPU independent operations. ir

int foo(int[] a, int b[])
{
   return sum(a) + sum(b); //Both run in parallel
}

Better still, we can turn it off with a simple compiler flag to get 
single threaded performance.

Obviously we have Futurism library now however I make this suggestion 
for D2.0.  I'm sure there are a few problems in the syntax that i 
presented that will need to be worked though.  Or maybe the edge cases 
are just too difficult to make this feasible?

Oh, and for anyone who still hasn't see Herb Sutter's video:

http://video.google.com/videoplay?docid=7625918717318948700&q=sutter

It explains why it is so important to figure out how multi-threading is 
going to be handled in D.
Jan 21 2007
prev sibling next sibling parent reply Daniel Keep <daniel.keep+lists gmail.com> writes:
janderson wrote:
 [snip]
 
 What do you think?

"I think all the right thinking people in this country are sick and tired of being sick and tired of being sick and tired. I know I'm not, and I'm sick and tired of being told that I am!" Or something to that effect... *ahem* This is one of the things I've thought about quite a bit over the years. What you call "threadable" I call "pure", as in a pure function. Basically, a pure function is one which has *no* side effects. If a function has no side effects, then it doesn't matter when it gets executed. The nice thing is that "pure function" is more well-defined than "threadable" :3 The idea comes from functional programming languages like LISP. If you're writing pure LISP, then all of your functions are pure, which leaves LISP free to thread your program however it likes. I thought about suggesting that the compiler add a purity check in the same vein as inlining: any function found (or marked) as pure could be threaded off by the compiler without any adverse side-effects. Of course, I never did since this would be a huge amount of work. Which got me thinking about other kinds of multithreading. Which is why I wrote my "future" implementation (not to be confused with "futurism" :P). To be honest, I think we are more likely to see array operations *first* (ie: a[] = b[] + c[] as an element-wise addition) than this stuff. The nice thing is that array operations can also be parrallelised (using things like SIMD, or even multi-threaded SIMD on multi-core systems) but much more easily. So yeah, anything that makes multithreading easier is a good thing in my books. </2> -- Daniel
Jan 21 2007
parent janderson <askme me.com> writes:
Daniel Keep wrote:
 janderson wrote:
 [snip]

 What do you think?

"I think all the right thinking people in this country are sick and tired of being sick and tired of being sick and tired. I know I'm not, and I'm sick and tired of being told that I am!" Or something to that effect... *ahem* This is one of the things I've thought about quite a bit over the years. What you call "threadable" I call "pure", as in a pure function. Basically, a pure function is one which has *no* side effects. If a function has no side effects, then it doesn't matter when it gets executed. The nice thing is that "pure function" is more well-defined than "threadable" :3

I fear pure may be mixed up with pure virtual. I don't like threadable either. Naming aside. I'm glad that I'm not the only one who has thought about this technique (means I'm probably not crazy). It also brings up an interesting point. There are probably other non-threadable operations that can occur on these "pure function"'s. For instance order of operations can be switched around (if that resulted in some optimisation). Or the function could be silently rewritten to work on an array of inputs/outputs rather then a single output so it could take advantage of SIMD.
 
 The idea comes from functional programming languages like LISP.  If 
 you're writing pure LISP, then all of your functions are pure, which 
 leaves LISP free to thread your program however it likes.

Neat, so there are already examples of this at work. Any procedural languages that use it?
 I thought about suggesting that the compiler add a purity check in the 
 same vein as inlining: any function found (or marked) as pure could be 
 threaded off by the compiler without any adverse side-effects.

Exactly. It may be possible to get the expressiveness of the syntax where we can specify functions as pure virtual ourselves. However then it would be impossible to generalize the syntax so that the compiler could add pure functions at it's will.
 Of course, I never did since this would be a huge amount of work.  Which 

I think if it only worked in a few cases to begin with and that got expanded as time went on, I'd be fine with that.
 got me thinking about other kinds of multithreading.  Which is why I 
 wrote my "future" implementation (not to be confused with "futurism" :P).

Nice work on the future implementation BTW... The great thing about future is that you can still run other more complex threads at the same time and have "future" fill some of the idle gaps.
 
 To be honest, I think we are more likely to see array operations *first* 
 (ie: a[] = b[] + c[] as an element-wise addition) than this stuff.  The 
 nice thing is that array operations can also be parrallelised (using 
 things like SIMD, or even multi-threaded SIMD on multi-core systems) but 
 much more easily.

Yeah, I was hanging out for this stuff too when Walter mentioned it a few years back. ... Heres another con: It would make stepping though the code a PITA, unless you turned it off for debug. -Joel
Jan 21 2007
prev sibling next sibling parent janderson <askme me.com> writes:
janderson wrote:
 Here's one idea about how D could help support multi-threading.  What if 
 you could specify which functions where thread-able.  ie:
 
 threadable int foo(int X)
 {
 ...
 }
 
 void bar()
 {
    int a = 0;
    int array[3];
    x[0] = foo(a);  //Separate worker thread
    x[1] = foo(a);  //Separate worker thread (or could be main thread)
    x[2] = foo(a);  //Main thread
    //Would sleep here until things are done
 }
 
 
 What do you think?

Here's an extension to the idea. It would be good to have an "un-pure", "!pure" or "!threadable" functions as well. That way if you included an "un-pure" function in an pure function the compiler would give an error. If an included function was not marked un-pure the user would have to face the side-effect consequences at run-time. -Joel
Jan 21 2007
prev sibling parent Kevin Bealer <kevinbealer gmail.com> writes:
I like this idea, but I think hand tuning and design will be necessary 
for quite a while.  For a big application, it might be enough to run 
with "-profile", then find the most expensive dozen or so functions and 
try to split those up.

On the other hand, the more CPUs there are in the box that lands on 
someone's desktop, the more important it will be to have apps that can 
use a dozen hardware threads.

(I kind of wish I had a big CPU intensive "real world" app written in D 
to play with though, so that I could test my theories about this stuff 
in a more realistic setting.)

Kevin
Jan 24 2007