digitalmars.D - openMP

reply "Farmer" <axionator gmail.com> writes:
Hi,
I am tempted to start D programming but for me it is crucial to 
be able to parallelize for-loops as can be done with OpenMP for 
C/C++ (mainly #pragma omp parallel for, #pragma omp critical).
I have already seen the std.parallelism library but I'm unsure 
whether it can provide me with the same functionality.

Thanks
Oct 02 2012
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 2 October 2012 at 19:15:19 UTC, Farmer wrote:
 Hi,
 I am tempted to start D programming but for me it is crucial 
 to be able to parallelize for-loops as can be done with OpenMP 
 for C/C++ (mainly #pragma omp parallel for, #pragma omp 
 critical).
 I have already seen the std.parallelism library but I'm unsure 
 whether it can provide me with the same functionality.

 Thanks
It can. Here's an example from the docs of parallelising a simple for loop:

auto logs = new double[10_000_000];
foreach(i, ref elem; taskPool.parallel(logs, 100)) {
    elem = log(i + 1.0);
}

This hands the loop to the default task pool, whose worker threads each take work units of 100 iterations of the loop body at a time and run them in parallel.
Oct 02 2012
parent reply "Farmer" <axionator gmail.com> writes:
And is there also a #pragma omp critical analogue?

Oct 02 2012
next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 2 October 2012 at 21:13:33 UTC, Farmer wrote:
 And is there also a #pragma omp critical analogue?
For critical sections you could use a low-level mutex. I don't do much parallel stuff in D, so I don't know if this is the preferred way, but it's an option.

http://dlang.org/phobos/core_sync_mutex.html

import std.stdio;
import std.parallelism;
import std.math;
import core.sync.mutex;

void main()
{
    auto logs = new double[1_000_000];
    double x = 0.0;
    Mutex m = new Mutex();
    foreach(i, ref elem; taskPool.parallel(logs, 100))
    {
        elem = log(i + 1.0);
        m.lock();    // enter the critical section
        x += 1.0;
        m.unlock();  // leave the critical section
    }
}
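As a possible alternative to an explicit Mutex, D's built-in synchronized statement may be the closer match to #pragma omp critical: without an argument it guards the block with a mutex private to that statement. The following is only a sketch; the function name countIterations and the counter are assumptions made up for illustration, not something proposed in the thread.

```d
import std.math : log;
import std.parallelism : taskPool;

// Hypothetical helper: counts loop iterations from inside a parallel foreach.
size_t countIterations(size_t n)
{
    auto logs = new double[n];
    size_t count = 0;

    foreach (i, ref elem; taskPool.parallel(logs, 100))
    {
        // Each element is written by exactly one iteration, so no lock is needed here.
        elem = log(i + 1.0);

        // Argument-less synchronized: one mutex per statement, so only one
        // worker thread at a time can execute this block.
        synchronized
        {
            ++count;
        }
    }
    return count;
}

void main()
{
    assert(countIterations(10_000) == 10_000);
}
```

Taking a lock on every iteration serializes the workers, so this only pays off when the critical section is small compared to the rest of the loop body.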
Oct 02 2012
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Tue, 2012-10-02 at 23:13 +0200, Farmer wrote:
 And is there also a #pragma omp critical analogue?
No.

The D approach in std.parallelism is to offer explicit parallel constructs that also work on a uniprocessor. The OpenMP approach is to provide meta-data to allow the compiler to parallelize sequential code.

Although OpenMP is high-profile in the C, C++ and Fortran world, its raison d'être is to be able to use sequential code in a parallel context. Now that C++ has made the jump to using futures and asynchronous function calls as an integral part of the language, I think we will see it drift away from the OpenMP style camp much more towards the TBB style camp. I am not sure if TBB is the right framework, but a drift towards frameworks that are parallel first and sequential as a special case does seem to be on the cards.

So rather than working with annotation-based systems such as OpenMP, I think D is right to be using parallel map, parallel reduce as functions rather than seeing explicit iterations. Functional languages have always had this. Python, Groovy, Ruby, Clojure brought the ideas to mainstream platforms, as did Scala and C++ (cf. std::for_each). Java 8 will take up this approach. So library-based iteration is with us as the tool for abstracting away the details. Obviously we will still have explicit iteration in imperative languages but significantly less than in the past.

To summarize: OpenMP and parallelizing explicit iteration is backward looking, library calls parallelizing using implicit iteration is the future.

Russel.
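To make the "parallel reduce as a function" style concrete, here is a small sketch modelled on the taskPool.reduce example in the std.parallelism documentation. The wrapper name sumOfSquares is an assumption for illustration; the map and reduce calls use string lambdas as the documentation does.

```d
import std.algorithm.iteration : map;
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical wrapper: sums the squares of 0, 1, ..., n - 1 in parallel.
double sumOfSquares(double n)
{
    auto nums = iota(n);
    // No explicit loop: the pool splits the mapped range into work units
    // and combines the partial sums, starting from the seed 0.0.
    return taskPool.reduce!"a + b"(0.0, map!"a * a"(nums));
}

void main()
{
    // 0^2 + 1^2 + ... + 999^2 = 999 * 1000 * 1999 / 6
    assert(sumOfSquares(1_000.0) == 332_833_500.0);
}
```

Because the reduction is split across threads, the combining function should be associative; for general floating-point data the result can differ slightly from a sequential sum.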
Oct 03 2012
parent reply Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 03 Oct 2012 09:08:47 +0100
Russel Winder <russel winder.org.uk> wrote:
 Now that C++ has made the jump to using futures and
 asynchronous function calls as an integral part of the language,
Speaking of, do we have futures in D yet? IIRC, way back last time I asked about it there was something that needed to be taken care of first, though I don't remember what. If we don't have them ATM, is there currently anything in the way of actually creating them?
Oct 03 2012
parent reply "dsimcha" <dsimcha yahoo.com> writes:
Unless we're using different terminology here, futures are just 
std.parallelism Tasks.

Oct 03 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 3 October 2012 at 14:10:57 UTC, dsimcha wrote:
 Unless we're using different terminology here, futures are just 
 std.parallelism Tasks.
No, std.parallelism.Tasks are not really futures – they offer a constrained [1] future interface, but couple this with the notion that a Task can be executed at some point on a TaskPool chosen by the user. Because of this, I had to implement my own futures for the Thrift async stuff, where I needed a future as a promise [2] by an invoked entity that it kicked off a background activity which will eventually return a value, but which the users can't »start« or choose to »execute it now«, as they can with Tasks.

If TaskPool had an execute() method which took a delegate to execute (or a »Task«, for that matter) and returned a new object which serves as a »handle« with wait()/get()/… methods, _that_ would (likely) be a future.

David

[1] Constrained in the sense that it is only meant for short-/synchronous-running tasks and thus e.g. offers no callback mechanism.

[2] Let's not get into splitting hairs regarding the exact meaning of »Future« vs. »Promise«, especially because C++11 introduced a new interpretation to the mix.
Oct 03 2012
parent reply "dsimcha" <dsimcha yahoo.com> writes:
Ok, now I vaguely remember seeing stuff about futures in your 
Thrift code and wondering why it was there.  I'm a little bit 
confused about what you want.  If I understand correctly, 
std.parallelism can already do it pretty easily, but maybe the 
docs need to be improved a little to make it obvious how to.

All you have to do is something like this:

auto createFuture() {
     // task() returns a _pointer_ to a heap-allocated Task.
     auto myTask = task!someFun();
     taskPool.put(myTask);  // Or myTask.executeInNewThread();

     // A task created with task() can outlive the scope it was created in.
     // A scoped task, created with scopedTask(), cannot.  This is safe,
     // since myTask is NOT scoped and is a _pointer_ to a Task.
     return myTask;
}

In this case myTask is already running using the execution 
resources specified in createFuture().  Does this do what you 
wanted?  If so, I'll clarify the documentation.  If not, please 
clarify what you needed and the relevant use cases so that I can 
fix std.parallelism.
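For readers who want to try the pattern, the following is a minimal complete program built around createFuture(). The body of someFun is a made-up stand-in (the post does not define it), and main only shows one way a caller might consume the returned handle.

```d
import std.parallelism : task, taskPool;

// Hypothetical stand-in for the someFun mentioned in the post.
int someFun()
{
    return 6 * 7;
}

auto createFuture()
{
    // task() heap-allocates the Task and returns a pointer to it,
    // so the handle may safely outlive this function.
    auto myTask = task!someFun();
    taskPool.put(myTask);
    return myTask;
}

void main()
{
    auto future = createFuture();

    // The caller is free to do unrelated work here.

    // yieldForce returns the result, waiting for it if necessary; while
    // waiting, the calling thread may execute other tasks from the pool.
    assert(future.yieldForce == 42);
    assert(future.done);
}
```

Note that the caller still receives a Task pointer, so it can reach the whole Task API rather than a narrower result-only handle.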

Oct 03 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 3 October 2012 at 19:42:07 UTC, dsimcha wrote:
 If not, please clarify what you needed and the relevant use 
 cases so that I can fix std.parallelism.
In my use case, conflating the notion of a future, i.e. a value that becomes available at some point in the future, with the process which creates that future makes no sense.

For example, let's say you are writing a function which computes a complex database query from its parameters and then submits it to your query manager/connection pool/… for asynchronous execution. You cannot use std.parallelism.Task in this case, because there is no way of expressing the process which retrieves the result as a delegate running inside a TaskPool.

Or, say you want to write an "aggregator", combining the results of several futures together, again offering the same future interface (maybe an array of the original result types) to consumers. Again, there is no computation-bound part to that at all, which would make sense to run on a TaskPool – you are only waiting on the other tasks to finish.

The second problem with std.parallelism.Task is that your only choice is polling (or blocking, for that matter). Yes, callbacks are a hairy thing to do if you can't be sure what thread they are executed on, but not having them severely limits the power of your abstraction, especially if you are dealing with non-CPU-bound tasks (as many of today's "modern" use cases are).

For example, something my mentor asked me to implement for Thrift during last year's GSoC was a feature which allows sending a request out to a pool of servers concurrently, returning the first one of the results (apparently, this mechanism is used as a sharding mechanism in some situations – if a server doesn't have the data, it simply ignores the request).

How would you implement something like that as a function Task[] -> Task? For […] universally praised for its take on the matter) also has a »ContinueWith« method which is really just a completion callback mechanism.

std.parallelism.Task is great for expressing local resource-intensive units of work (and fast!), but I think it is too rigid and specialized for that case to be generally useful.

David
Oct 03 2012
parent reply "dsimcha" <dsimcha yahoo.com> writes:
On Wednesday, 3 October 2012 at 21:02:07 UTC, David Nadlinger 
wrote:
 On Wednesday, 3 October 2012 at 19:42:07 UTC, dsimcha wrote:
 If not, please clarify what you needed and the relevant use 
 cases so that I can fix std.parallelism.
In my use case, conflating the notion of a future, i.e. a value that becomes available at some point in the future, with the process which creates that future makes no sense.
So the "process which creates the future" is a Task that executes in a different thread than the caller? And an alternative way that a value might become available in the future is e.g. if it's being retrieved from some slow I/O process like a database or network?
 For example, let's say you are writing a function which 
 computes a complex database query from its parameters and then 
 submits it to your query manager/connection pool/… for 
 asynchronous execution. You cannot use std.parallelism.Task in 
 this case, because there is no way of expressing the process 
 which retrieves the result as a delegate running inside a 
 TaskPool.
Ok, I'm confused here. Why can't the process that retrieves the result be expressed as a delegate running in a TaskPool or a new thread?
 Or, say you want to write an "aggregator", combining the 
 results of several futures together, again offering the same 
 future interface (maybe an array of the original result types) 
 to consumers. Again, there is no computation-bound part to that 
 at all, which would make sense to run on a TaskPool – you are 
 only waiting on the other tasks to finish.
Maybe I'm just being naive since I don't understand the use cases, but why couldn't you just create an array of Task objects?
 The second problem with std.parallelism.Task is that your only 
 choice is polling (or blocking, for that matter). Yes, 
 callbacks are a hairy thing to do if you can't be sure what 
 thread they are executed on, but not having them severely 
 limits the power of your abstraction, especially if you are 
 dealing with non-CPU-bound tasks (as many of today's "modern" 
 use cases are).
I'm a little confused about how the callbacks would be used here. Is the idea that some callback would be called when the task is finished? Would it be called in the worker thread or the thread that submitted the task to the pool? Can you provide a use case?
 For example, something my mentor asked me to implement for Thrift 
 during last year's GSoC was a feature which allows sending a 
 request out to a pool of servers concurrently, returning the 
 first one of the results (apparently, this mechanism is used as 
 a sharding mechanism in some situations – if a server doesn't 
 have the data, it simply ignores the request).
"First one of the results" == the result produced by the first server to return anything?
 How would you implement something like that as a function 
 Task[] -> Task? […] universally praised for its take on the 
 matter) also has a »ContinueWith« method which is really just 
 a completion callback mechanism.
I'll look into ContinueWith and see if it's implementable in std.parallelism without breaking anything.
 std.parallelism.Task is great for expressing local 
 resource-intensive units of work (and fast!), but I think it is 
 too rigid and specialized for that case to be generally useful.
Right. I wrote std.parallelism with resource-intensive units of work in mind because that's the use case I was familiar with. It was designed first and foremost to make using SMP parallelism _simple_. In hindsight I might have erred too much on the side of making simple things simple vs. complicated things possible, or over-specialized it and avoided solving an important, more general problem. I'll try to understand your use cases and see if they can be addressed without making simple things more complicated.

I think the best way you could help me understand what I've overlooked in std.parallelism's design is to give a quick n' dirty example of how an API that does what you want would be used. Even more generally, any _concise, concrete_ use cases, even toy use cases, would be a huge help.
Oct 03 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 3 October 2012 at 23:02:25 UTC, dsimcha wrote:
 So the "process which creates the future" is a Task that 
 executes in a different thread than the caller?  And an 
 alternative way that a value might become available in the 
 future is e.g. if it's being retrieved from some slow I/O 
 process like a database or network?
Yes.
 For example, let's say you are writing a function which 
 computes a complex database query from its parameters and then 
 submits it to your query manager/connection pool/… for 
 asynchronous execution. You cannot use std.parallelism.Task in 
 this case, because there is no way of expressing the process 
 which retrieves the result as a delegate running inside a 
 TaskPool.
Ok, I'm confused here. Why can't the process that retrieves the result be expressed as a delegate running in a TaskPool or a new thread?
Because you already have a system in place for managing these tasks, which is separate from std.parallelism. A reason for this could be that you are using a third-party library like libevent. Another could be that the type of workload requires additional problem knowledge of the scheduler so that different tasks don't tread on each other's toes (for example communicating with some servers via a pool of sockets, where you can handle several concurrent requests to different servers, but can't have two tasks read/write to the same socket at the same time, because you'd just send garbage).

Really, this issue is just about extensibility and/or flexibility. The design of std.parallelism.Task assumes that all values which "become available at some point in the future" are the product of a process for which a TaskPool is a suitable scheduler. C++ has std::future separate from […] Task vs. TaskCompletionSource, etc.
 The second problem with std.parallelism.Task is that your only 
 choice is polling (or blocking, for that matter). Yes, 
 callbacks are a hairy thing to do if you can't be sure what 
 thread they are executed on, but not having them severely 
 limits the power of your abstraction, especially if you are 
 dealing with non-CPU-bound tasks (as many of today's "modern" 
 use cases are).
I'm a little confused about how the callbacks would be used here. Is the idea that some callback would be called when the task is finished? Would it be called in the worker thread or the thread that submitted the task to the pool? Can you provide a use case?
Maybe using the word "callback" was a bit misleading, but the callback would be invoked on the worker thread (or by whoever invokes the hypothetical Future.complete(<result>) method).

Probably the most trivial use case would be to set a condition variable in it in order to implement a waitAny(Task[]) method, which waits until the first of a set of tasks is completed. Ever wanted to wait on multiple condition variables? Or used select() with multiple sockets? This is what I mean.

For more advanced/application-level use cases, just look at any […] for C++, see e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3327.pdf.

I didn't really read the N3327 paper in detail, but from a brief look it seems to be a nice summary of what you might want to do with tasks/asynchronous results – I think you could find it an interesting read.

David
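The waitAny idea can be sketched roughly as follows. Everything here is hypothetical: the Future class, its onComplete and complete methods, and waitAny itself are invented for illustration and exist neither in Phobos nor in the Thrift code; only Mutex, Condition and Thread are real druntime types. Callbacks run on whichever thread calls complete(), as described in the post.

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical future type, for illustration only (not part of Phobos).
final class Future(T)
{
    private Mutex mutex;
    private bool ready;
    private T value;
    private void delegate(T)[] callbacks;

    this()
    {
        mutex = new Mutex;
    }

    // Registers a completion callback. If the value is already available,
    // the callback runs immediately on the calling thread.
    void onComplete(void delegate(T) callback)
    {
        mutex.lock();
        if (!ready)
        {
            callbacks ~= callback;
            mutex.unlock();
            return;
        }
        T result = value;
        mutex.unlock();
        callback(result);
    }

    // Called once by whoever produces the result; pending callbacks are
    // invoked on that thread, outside the lock.
    void complete(T result)
    {
        mutex.lock();
        value = result;
        ready = true;
        auto pending = callbacks;
        callbacks = null;
        mutex.unlock();

        foreach (callback; pending)
            callback(result);
    }
}

// Blocks until the first of the futures completes and returns its value.
T waitAny(T)(Future!T[] futures)
{
    auto mutex = new Mutex;
    auto firstDone = new Condition(mutex);
    bool haveResult;
    T first;

    foreach (future; futures)
    {
        future.onComplete((T result) {
            mutex.lock();
            scope (exit) mutex.unlock();
            if (!haveResult)
            {
                haveResult = true;
                first = result;
                firstDone.notify();
            }
        });
    }

    mutex.lock();
    scope (exit) mutex.unlock();
    while (!haveResult)
        firstDone.wait();
    return first;
}

void main()
{
    auto never = new Future!int;   // stays incomplete in this example
    auto answer = new Future!int;

    // Stand-in for an I/O layer delivering a result from another thread.
    auto producer = new Thread({ answer.complete(42); });
    producer.start();

    assert(waitAny([never, answer]) == 42);
    producer.join();
}
```

The point of the sketch is that the future knows nothing about how or where its value is produced: anything able to call complete() can feed it, which is the decoupling the post asks for.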
Oct 04 2012
next sibling parent reply "dsimcha" <dsimcha yahoo.com> writes:
Ok, I think I see where you're coming from here.  I've replied to 
some points below just to make sure and discuss possible 
solutions.

On Thursday, 4 October 2012 at 16:07:35 UTC, David Nadlinger 
wrote:
 On Wednesday, 3 October 2012 at 23:02:25 UTC, dsimcha wrote:
 Because you already have a system in place for managing these 
 tasks, which is separate from std.parallelism. A reason for 
 this could be that you are using a third-party library like 
 libevent. Another could be that the type of workload requires 
 additional problem knowledge of the scheduler so that different 
 tasks don't tread on each other's toes (for example 
 communicating with some servers via a pool of sockets, where 
 you can handle several concurrent requests to different 
 servers, but can't have two tasks read/write to the same socket 
 at the same time, because you'd just send garbage).

 Really, this issue is just about extensibility and/or 
 flexibility. The design of std.parallelism.Task assumes that 
 all values which "become available at some point in the 
 future" are the product of a process for which a TaskPool is a 
 suitable scheduler. C++ has std::future separate from 

I'll look into these when I have more time, but I guess what it boils down to is the need to separate the **abstraction** of something that returns a value later (I'll call that **abstraction** futures) from the **implementation** provided by std.parallelism (I'll call this **implementation** tasks), which was designed only with CPU-bound tasks and multicore in mind.

On the other hand, I like std.parallelism's simplicity for handling its charter of CPU-bound problems and multicore parallelism. Perhaps the solution is to define another Phobos module that models the **abstraction** of futures and provide an adapter of some kind to make std.parallelism tasks, which are a much lower-level concept, fit this model.

I don't think the **general abstraction** of a future should be defined in std.parallelism, though. std.parallelism includes parallelism-oriented things besides tasks, e.g. parallel map, reduce, foreach. Including a more abstract model of values that become available later would make its charter too unfocused.
 Maybe using the word "callback" was a bit misleading, but the 
 callback would be invoked on the worker thread (or by whoever 
 invokes the hypothetical Future.complete(<result>) method).

 Probably the most trivial use case would be to set a condition 
 variable in it in order to implement a waitAny(Task[]) method, 
 which waits until the first of a set of tasks is completed. 
 Ever wanted to wait on multiple condition variables? Or used 
 select() with multiple sockets? This is what I mean.
Well, implementing something like ContinueWith or Future.complete for std.parallelism tasks would be trivial, and I see how waitAny could easily be implemented in terms of this. I'm not sure I want to define an API for this in std.parallelism, though, until we have something like a std.future and the **abstraction** of a future is better-defined.
 For more advanced/application-level use cases, just look at any 

 for C++, see e.g. 
 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3327.pdf.

 I didn't really read the N3327 paper in detail, but from a 
 brief look it seems to be a nice summary of what you might want 
 to do with tasks/asynchronous results – I think you could 
 find it an interesting read.
I don't have time to look at these right now, but I'll definitely look at them sometime soon. Thanks for the info.
Oct 04 2012
parent reply "Pragma Tix" <bizprac or.fr> writes:
On Thursday, 4 October 2012 at 18:34:29 UTC, dsimcha wrote:

 I don't have time to look at these right now, but I'll 
 definitely look at them sometime soon.  Thanks for the info.
You will find this interesting too, a code snippet from Daniel Keep. http://www.dsource.org/projects/scrapple/browser/trunk/future/future.d
Oct 05 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Friday, 5 October 2012 at 10:26:57 UTC, Pragma Tix wrote:
 You will find this interesting too, a code snippet from Daniel
 Keep.

 http://www.dsource.org/projects/scrapple/browser/trunk/future/future.d
The code only allows you to do something equivalent to »auto t = std.parallelism.task!dg(); t.executeInNewThread()«. So no, I don't think David, being the author of std.parallelism, would find it interesting… ;)

David
Oct 05 2012
prev sibling parent "dsimcha" <dsimcha yahoo.com> writes:
On Thursday, 4 October 2012 at 16:07:35 UTC, David Nadlinger 
wrote:
 For more advanced/application-level use cases, just look at any 

 for C++, see e.g. 
 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3327.pdf.

 I didn't really read the N3327 paper in detail, but from a 
 brief look it seems to be a nice summary of what you might want 
 to do with tasks/asynchronous results – I think you could 
 find it an interesting read.

 David
Thanks for posting this. It was an incredibly useful read for me! Given that the code I write is generally compute-intensive, not I/O intensive, I'd never given much thought to the value of futures in I/O intensive code before this discussion. I stand by what I said before: Someone (not me because I'm not intimately familiar with the use cases; you might be qualified) should write a std.future module for Phobos that properly models the **abstraction** of a future. It's only tangentially relevant to std.parallelism's charter, which includes both a special case of futures that's useful to SMP parallelism and other parallel computing constructs. Then, we should define an adapter that allows std.parallelism Tasks to be modeled more abstractly as futures when necessary, once we've nailed down what the future **abstraction** should look like.
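One way such an adapter might look is sketched below. No std.future module exists; the Future interface, the TaskFuture class and the asFuture helper are all assumptions invented for illustration. Only task, taskPool.put, done and yieldForce are real std.parallelism API.

```d
import std.parallelism : task, taskPool;

// Hypothetical minimal future abstraction (no such interface exists in Phobos).
interface Future(T)
{
    bool done();  // has the value arrived?
    T get();      // wait for the value and return it
}

// Hypothetical adapter: exposes a std.parallelism task pointer as a Future.
final class TaskFuture(TaskPtr, T) : Future!T
{
    private TaskPtr handle;

    this(TaskPtr handle)
    {
        this.handle = handle;
    }

    bool done()
    {
        return handle.done;
    }

    T get()
    {
        return handle.yieldForce;
    }
}

// Wraps a pointer returned by task() in the adapter above.
auto asFuture(TaskPtr)(TaskPtr handle)
{
    alias T = typeof(handle.yieldForce);
    return new TaskFuture!(TaskPtr, T)(handle);
}

// Made-up unit of work for the usage example.
int answer()
{
    return 42;
}

void main()
{
    auto t = task!answer();
    taskPool.put(t);

    // Consumers see only the abstract interface, not the Task behind it.
    Future!int f = asFuture(t);
    assert(f.get() == 42);
    assert(f.done());
}
```

Other implementations of the same interface could be backed by a socket, a database driver or an event loop, which is where a completion-callback hook would have to be added to the interface.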
Oct 04 2012