
digitalmars.D - std.parallelism equivalents for posix fork and multi-machine

reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
Is there value to having equivalents to the std.parallelism 
approach that works with processes rather than threads, and makes 
it easy to manage tasks over multiple machines?

I took a look at std.parallelism and it's beyond what I can do 
for now.  But it seems like this might be a useful project, and 
not one of unmanageable difficulty...
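For reference, the existing thread-based interface that a process-based equivalent might mirror (a minimal sketch using std.parallelism's parallel foreach):

import std.math : sqrt;
import std.parallelism : parallel;

void main()
{
    auto data = new double[1_000_000];
    // each iteration runs as a task on the shared thread pool
    foreach (i, ref x; parallel(data))
        x = sqrt(cast(double) i);
}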
May 13 2015
next sibling parent reply "weaselcat" <weaselcat gmail.com> writes:
On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
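For the curious, the raw pattern looks roughly like this (a sketch assuming a POSIX system; doWork stands in for the real per-chunk job):

import core.sys.posix.sys.types : pid_t;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, fork;

void doWork(size_t chunk)
{
    // hypothetical per-chunk job
}

void main()
{
    enum nWorkers = 4;
    pid_t[nWorkers] kids;
    foreach (i; 0 .. nWorkers)
    {
        auto pid = fork();
        if (pid == 0)      // child: do one chunk, then exit immediately
        {
            doWork(i);
            _exit(0);      // skip normal D runtime shutdown in the child
        }
        kids[i] = pid;
    }
    foreach (pid; kids)    // parent: reap all children
    {
        int status;
        waitpid(pid, &status, 0);
    }
}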
May 13 2015
next sibling parent "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
Yes - that is what spurred me to post, but it had been on my mind for a while (especially the multi-machine stuff).
May 13 2015
prev sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
It was also easy to get D very fast by just being a little more eager with IO and reducing the enormous number of little allocations being made.
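Concretely, the two changes amount to something like this sketch (data.txt and the per-line processing are placeholders):

import std.stdio : File, stdout;

void main()
{
    auto input  = File("data.txt");          // placeholder input file
    auto output = stdout.lockingTextWriter;  // lock and buffer once, up front

    // byLine reuses one internal buffer instead of
    // allocating a fresh string per line
    foreach (line; input.byLine)
    {
        // ... process `line` in place; .dup only if it must outlive the loop
        output.put(line);
        output.put('\n');
    }
}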
May 14 2015
parent "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 16:33:46 UTC, John Colvin wrote:
 On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
It was also easy to get D very fast by just being a little more eager with IO and reducing the enormous number of little allocations being made.
Yes - thank you for your highly educational rewrite, which I very much appreciate your taking the trouble to do. Perhaps this should be turned (by you or someone else) into a mini case study on the wiki of how to write idiomatic and efficient D code. Or maybe just put up the slides from your forthcoming talk (which I look forward to watching when it is up).

It's good to know D can in fact deliver on its implicit promise in a real use case without too much work. (Yes, naively written code was a bit slow when dealing with millions of lines, but in which language of comparable flexibility would that not be true?) It's also interesting that your code was idiomatic. (I was reading up on Scala, which seems beautiful in many ways, but it is terribly disturbing to see that the idiomatic way often seems to be the most inefficient, at least as things stood a couple of years ago.)

But, even so, I think having a wrapper for fork and an API for multiprocessing (which you could then hook up to, e.g., the Digital Ocean and AWS APIs) would be rather helpful.

I spoke with a friend of mine at one of the most admired/hated Wall Street firms, one of the smartest quants I know, who has now moved to portfolio management. He was doing a study on tick data going back to 2000. I asked him how long it took to run on his firm's infrastructure: an hour! And the operations were pretty simple. I think it should only take a couple of minutes. It would be nice to show an example of spinning up 100 Digital Ocean instances from a spreadsheet, running the numbers not just on one security but on every relevant security, and having a nice summary appear back in the sheet within a couple of minutes.

The reason speed matters is that long waits interfere with rapid iteration and the creative thought process. In a market environment you may well have forgotten what you wanted after an hour...

Laeeth.
May 14 2015
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Laeeth Isharc"  wrote in message 
news:ejbhesbstgazkxnpvqsl forum.dlang.org...

 Is there value to having equivalents to the std.parallelism approach that 
 works with processes rather than threads, and makes it easy to manage 
 tasks over multiple machines?

 I took a look at std.parallelism and it's beyond what I can do for now. 
 But it seems like this might be a useful project, and not one of 
 unmanageable difficulty...
Yes, there is enormous value. It's just waiting for someone to do it.
May 14 2015
parent reply "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 10:15:48 UTC, Daniel Murphy wrote:
 "Laeeth Isharc"  wrote in message 
 news:ejbhesbstgazkxnpvqsl forum.dlang.org...

 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?

 I took a look at std.parallelism and it's beyond what I can do 
 for now. But it seems like this might be a useful project, and 
 not one of unmanageable difficulty...
Yes, there is enormous value. It's just waiting for someone to do it.
To start the process off (because small beginnings are better than no beginning): what are the key features of processes vs threads that one would need to bear in mind when designing such a thing? Because I spent the past couple of decades in a different field, multiprocessing passed me by, so I am only now slowly catching up.
May 14 2015
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 14 May 2015 at 20:06:55 UTC, Laeeth Isharc wrote:
 To start the process off (because small beginnings are better 
 than no beginning): what are the key features of processes vs 
 threads one would need to bear in mind when designing such a 
 thing?  Because I spent the past couple of decades in a 
 different field, multiprocessing passed me by, so I am only now 
 slowly catching up.
"nobody" understands multiprocessing. Or rather… you need to understand the hardware and the concrete problem space first. There are no general solutions.
May 14 2015
parent reply "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 20:15:38 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 14 May 2015 at 20:06:55 UTC, Laeeth Isharc wrote:
 To start the process off (because small beginnings are better 
 than no beginning): what are the key features of processes vs 
 threads one would need to bear in mind when designing such a 
 thing?  Because I spent the past couple of decades in a 
 different field, multiprocessing passed me by, so I am only 
 now slowly catching up.
"nobody" understands multiprocessing. Or rather… you need to understand the hardware and the concrete problem space first. There are no general solutions.
Yes, I certainly understand that it is a highly specialist and complex area in which the best minds in the world do not yet have the answers. So if one were addressing the problem from an academic computer-science perspective, then perhaps one would arrive at a different answer.

My own perspective is a pragmatic, commercial one. I have some problems which perhaps scale quite well, and rather than write them using fork directly, I would rather have a higher-level wrapper along the lines of std.parallelism. Perhaps such a wrapper would be flawed and limited, but often something is better than nothing, even if not perfect. And I mention it on the forum only because the problems I face have usually turned out to be those faced by many others too.

If you have any thoughts on what should be considered, I would very much appreciate them. (And I owe you a response on our last suspended discussion, but haven't had time of late.)

Laeeth.
May 14 2015
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 14 May 2015 at 20:28:20 UTC, Laeeth Isharc wrote:
 My own is a pragmatic commercial one.  I have some problems 
 which perhaps scale quite well, and rather than write it using 
 fork directly, I would rather have a higher level wrapper along 
 the lines of std.parallelism.
Languages like Chapel and extended versions of C++ have built-in support for parallel computing that is relatively effortless and designed by experts (Cray/IBM etc.) to cover common patterns in demanding batch processing, for those who want something higher level than plain C++ (or, in this case, D, which is pretty much the same thing).

However, you could consider combining single-threaded processes in D with e.g. Python as a supervising process, if the datasets allow it. You'll find lots of literature on Inter-Process Communication (IPC) for Unix. Performance will be lower, but your own productivity might be higher; YMMV.
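As a rough illustration of that supervisor/worker split (shown here with D in both roles; ./worker is a hypothetical single-threaded worker binary that reads a task id on stdin and writes results to stdout):

import std.process : pipeProcess, wait, Redirect;
import std.stdio : writeln;

void main()
{
    auto p = pipeProcess(["./worker"],
                         Redirect.stdin | Redirect.stdout);
    p.stdin.writeln("task-42");   // hand the worker one task
    p.stdin.close();              // signal end of input
    foreach (line; p.stdout.byLine)
        writeln("worker said: ", line);
    wait(p.pid);                  // reap the worker process
}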
 Perhaps such would be flawed and limited, but often something 
 is better than nothing, even if not perfect.  And I mention it 
 on the forum only because usually I have found the problems I 
 face turn out to be those faced by many others too..
You need momentum in order to get from a raw state to something polished, so you essentially need a larger community that both has experience with the topic and a need for it, in order to end up with a sensible framework that is maintained. If you can get away with it, the most common simplistic approach seems to be map-reduce, because it is easy to distribute over many machines and there are frameworks that do the tedious bits for you.
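The single-machine analogue of that pattern already ships in std.parallelism; a minimal sketch (a multi-machine framework would replace the thread pool here with workers on other hosts):

import std.algorithm : map;
import std.parallelism : taskPool;
import std.range : iota;

void main()
{
    // parallel map + reduce over one machine's cores:
    // sum of squares of 0 .. 999_999
    auto total = taskPool.reduce!"a + b"(
        iota(1_000_000).map!(x => cast(double) x * x));
    assert(total > 0);
}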
 If you have any thoughts on what should be considered, I would 
 very much appreciate them.  (And I owe you a response on our 
 last suspended discussion, but haven't had time of late).
Nah, you owe me nothing ;-). And I also have no time atm. ;-) Ola.
May 14 2015
parent reply "Laeeth Isharc" <nospamlaeeth nospam.laeeth.com> writes:
On Thursday, 14 May 2015 at 20:56:16 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 14 May 2015 at 20:28:20 UTC, Laeeth Isharc wrote:
 My own is a pragmatic commercial one.  I have some problems 
 which perhaps scale quite well, and rather than write it using 
 fork directly, I would rather have a higher level wrapper 
 along the lines of std.parallelism.
Languages like Chapel and extended versions of C++ have built-in support for parallel computing that is relatively effortless and designed by experts (Cray/IBM etc.) to cover common patterns in demanding batch processing, for those who want something higher level than plain C++ (or, in this case, D, which is pretty much the same thing).
Yes - I am sure that there is excellent stuff here, from which one may learn much, especially if approaching it from a more theoretical or enterprisey, industrial-scale perspective.
 However, you could consider combining single threaded processes 
 in D with e.g. Python as a supervising process if the datasets 
 allow it. You'll find lots of literature on Inter Process 
 Communication (IPC) for Unix. Performance will be lower, but 
 your own productivity might be higher, YMMV.
But why would one use Python when fork itself isn't hard to use in a narrow sense, and neither is the kind of interprocess communication I would like to do for the kind of tasks I have in mind? It just seems to make sense to have a light wrapper.

Just because some problems in parallel processing are hard doesn't seem to me a reason not to do some work on addressing the easier ones, where even an imperfect (but real) solution may have great practical value. Sometimes I have the sense when talking with you that the answer to any question is anything but D! ;) (But I am sure I must be mistaken!)
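To make the kind of light wrapper I mean concrete, here is a rough sketch of a hypothetical forkMap that runs a function on each input in its own forked child and reads the result back over a pipe (one child per input for simplicity; return values go unchecked, and error handling and serialization of anything richer than a double are omitted):

import core.sys.posix.sys.types : pid_t;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, close, fork, pipe, read, write;

double[] forkMap(double function(double) fun, double[] inputs)
{
    auto results = new double[inputs.length];
    auto pids = new pid_t[inputs.length];
    auto fds = new int[2][](inputs.length);

    foreach (i, x; inputs)
    {
        pipe(fds[i]);                        // one pipe per child
        pids[i] = fork();
        if (pids[i] == 0)                    // child
        {
            close(fds[i][0]);
            double r = fun(x);
            write(fds[i][1], &r, r.sizeof);  // send result to parent
            _exit(0);
        }
        close(fds[i][1]);                    // parent keeps the read end
    }
    foreach (i; 0 .. inputs.length)
    {
        read(fds[i][0], &results[i], double.sizeof);
        close(fds[i][0]);
        int status;
        waitpid(pids[i], &status, 0);        // reap each child
    }
    return results;
}

A call like forkMap((double x) => x * x, data) would then fan the work out across processes instead of threads.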
 Perhaps such would be flawed and limited, but often something 
 is better than nothing, even if not perfect.  And I mention it 
 on the forum only because usually I have found the problems I 
 face turn out to be those faced by many others too..
You need momentum in order to get from a raw state to something polished, so you essentially need a larger community that both have experience with the topic and a need for it in order to get a sensible framework that is maintained.
True. But we are not speaking of getting from a raw state to perfection but just starting to play with the problem. If Walter Bright had listened to well-intentioned advice, he wouldn't be in the compiler business, let alone have given us what became D. I am no Walter Bright, but this is an easier problem to start exploring, and this would be beyond the scope of anything I would do just by myself.
 If you can get away with it, the most common simplistic 
 approach seems to be map-reduce. Because it is easy to 
 distribute over many machines and there are frameworks that do 
 the tedious bits for you.
Yes, indeed. But my question was more about the distinctions between processes and threads and the non-obvious implications for the design of such a framework. Nice chatting. Laeeth.
May 14 2015
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 15 May 2015 at 00:07:15 UTC, Laeeth Isharc wrote:
 But why would one use python when fork itself isn't hard to use 
 in a narrow sense, and neither is the kind of interprocess 
 communication I would like to do for the kind of tasks I have 
 in mind. It just seems to make sense to have a light wrapper.
The managing process doesn't have to be fast, but it should be easy to reconfigure. It is overall more effective (not efficient) to use a scripting language with a REPL for scripty tasks. Forking comes with its own set of pitfalls. The Unix way is to have a conglomerate of simple processes tied together with a script, which is overall easier to debug and modify.
 Just because some problems in parallel processing are hard 
 doesn't seem to me a reason not to do some work on addressing 
 the easier ones that may in a practical sense have great value 
 in having an imperfect (but real) solution for.  Sometimes I 
 have the sense when talking with you that the answer to any 
 question is anything but D! ;)  (But I am sure I must be 
 mistaken!)
I would have said the same thing about Rust and Nim too. Overall, what other people do with a tool affects the ecosystem and maturity. If you do system-level programming you are less affected by the ecosystem than when you do higher-level, task-oriented programming.

What is your mission: to solve a problem effectively now, or to start building a new framework with a time horizon measured in years? You have to decide this first. Then you have to decide what is more expensive: your time, or spending twice as much on CPU power (whether it is hardware or rented time at a datacenter).
 True.  But we are not speaking of getting from a raw state to 
 perfection but just starting to play with the problem.  If 
 Walter Bright had listened to well-intentioned advice, he 
 wouldn't be in the compiler business, let alone have given us 
 what became D.
He set out to build a new framework with a time horizon measured in decades. That's perfectly reasonable and what you have to expect when starting on a new language.

If you want to build a framework for a specific use, you need both the theoretical insights and the practical experience to complete it in a timely manner. You need many, many iterations to get to a state where it is better than whatever people use today, which is why most (sensible) engineers will pick existing solutions that are receiving polish rather than the next big thing.
 Yes, indeed.  But my question was more about the distinctions 
 between processes and threads and the non-obvious implications 
 for the design of such a framework.
If you want to use fork(), you might as well use threads. The main distinction is that with processes you have to be explicit about what resources to share, but after a fork() you also risk ending up in an inconsistent state if you aren't careful.

With a fork-based solution you still have to deal with a different level of complexity than with a Unixy conglomerate of simple programs that cooperate. The Unix way is easier to debug and test, but slower than an optimized multithreaded solution (and marginally slower than a process that forks itself).
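A small illustration of the "be explicit about what resources to share" point: after fork() the child sees copies of everything, so genuinely shared state must be set up deliberately, e.g. with a MAP_SHARED mapping (a sketch assuming POSIX anonymous mappings; single-threaded, which sidesteps the inconsistent-state hazard mentioned above):

import core.sys.posix.sys.mman : MAP_ANON, MAP_FAILED, MAP_SHARED,
    PROT_READ, PROT_WRITE, mmap, munmap;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, fork;
import std.stdio : writeln;

void main()
{
    // one int visible to both parent and child
    auto p = cast(int*) mmap(null, int.sizeof,
        PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
    assert(p !is cast(int*) MAP_FAILED);

    *p = 0;
    auto pid = fork();
    if (pid == 0)
    {
        *p = 42;    // visible to the parent, unlike an ordinary variable
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);
    writeln(*p);    // prints 42; a plain copied variable would still be 0
    munmap(p, int.sizeof);
}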
May 14 2015