
digitalmars.D - std.parallelism equivalents for posix fork and multi-machine

reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
Is there value to having equivalents to the std.parallelism 
approach that works with processes rather than threads, and makes 
it easy to manage tasks over multiple machines?

I took a look at std.parallelism and it's beyond what I can do 
for now.  But it seems like this might be a useful project, and 
not one of unmanageable difficulty...
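For reference, the existing thread-based interface that a process-based equivalent might mirror (a minimal sketch using std.parallelism's parallel foreach):

import std.math : sqrt;
import std.parallelism : parallel;

void main()
{
    auto data = new double[1_000_000];
    // each iteration runs as a task on the shared thread pool
    foreach (i, ref x; parallel(data))
        x = sqrt(cast(double) i);
}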
May 13 2015
next sibling parent reply "weaselcat" <weaselcat gmail.com> writes:
On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
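For the curious, the raw pattern looks roughly like this (a sketch assuming a POSIX system; doWork stands in for the real per-chunk job):

import core.sys.posix.sys.types : pid_t;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, fork;

void doWork(size_t chunk)
{
    // hypothetical per-chunk job
}

void main()
{
    enum nWorkers = 4;
    pid_t[nWorkers] kids;
    foreach (i; 0 .. nWorkers)
    {
        auto pid = fork();
        if (pid == 0)      // child: do one chunk, then exit immediately
        {
            doWork(i);
            _exit(0);      // skip normal D runtime shutdown in the child
        }
        kids[i] = pid;
    }
    foreach (pid; kids)    // parent: reap all children
    {
        int status;
        waitpid(pid, &status, 0);
    }
}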
May 13 2015
next sibling parent "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
Yes - that is what spurred me to post, but it had been on my mind for a while (especially the multi-machine stuff).
May 13 2015
prev sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
It was also easy to get D very fast by just being a little more eager with IO and reducing the enormous number of little allocations being made.
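Concretely, the two changes amount to something like this sketch (data.txt and the per-line processing are placeholders):

import std.stdio : File, stdout;

void main()
{
    auto input  = File("data.txt");          // placeholder input file
    auto output = stdout.lockingTextWriter;  // lock and buffer once, up front

    // byLine reuses one internal buffer instead of
    // allocating a fresh string per line
    foreach (line; input.byLine)
    {
        // ... process `line` in place; .dup only if it must outlive the loop
        output.put(line);
        output.put('\n');
    }
}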
May 14 2015
parent "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 16:33:46 UTC, John Colvin wrote:
 On Wednesday, 13 May 2015 at 20:34:24 UTC, weaselcat wrote:
 On Wednesday, 13 May 2015 at 20:28:02 UTC, Laeeth Isharc wrote:
 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?
I'm not sure if you're asking because of this thread, but see http://forum.dlang.org/thread/tczkndtepnvppggzmews forum.dlang.org#post-tczkndtepnvppggzmews:40forum.dlang.org: Python outperforming D because it doesn't have to deal with synchronization headaches. I found D to be way faster when reimplemented with fork, but having to use the stdc API is ugly (IMO).
It was also easy to get D very fast by just being a little more eager with IO and reducing the enormous number of little allocations being made.
Yes - thank you for your highly educational rewrite, which I very much appreciate your taking the trouble to do. Perhaps this should be turned (by you or someone else) into a mini case study on the wiki of how to write idiomatic and efficient D code. Or maybe just put up the slides from your forthcoming talk (which I look forward to watching when it is up).

It's good to know D can in fact deliver on its implicit promise in a real use case without too much work. (Yes, naively written code was a bit slow when dealing with millions of lines, but in which language of comparable flexibility would that not be true?) It's also interesting that your code was idiomatic. (I was reading up on Scala, which seems beautiful in many ways, but it is terribly disturbing to see that the idiomatic way often seems to be the most inefficient, at least as things stood a couple of years ago.)

But, even so, I think having a wrapper for fork and an API for multiprocessing (which you could then hook up to, e.g., the Digital Ocean and AWS APIs) would be rather helpful.

I spoke with a friend of mine at one of the most admired/hated Wall Street firms, one of the smartest quants I know, who has now moved to portfolio management. He was doing a study on tick data going back to 2000. I asked him how long it took to run on his firm's infrastructure: an hour! And the operations were pretty simple. I think it should only take a couple of minutes. It would be nice to show an example of spinning up 100 Digital Ocean instances from a spreadsheet, running the numbers not just on one security but on every relevant security, and having a nice summary appear back in the sheet within a couple of minutes.

The reason speed matters is that long waits interfere with rapid iteration and the creative thought process. In a market environment you may well have forgotten what you wanted after an hour...

Laeeth.
May 14 2015
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Laeeth Isharc"  wrote in message 
news:ejbhesbstgazkxnpvqsl forum.dlang.org...

 Is there value to having equivalents to the std.parallelism approach that 
 works with processes rather than threads, and makes it easy to manage 
 tasks over multiple machines?

 I took a look at std.parallelism and it's beyond what I can do for now. 
 But it seems like this might be a useful project, and not one of 
 unmanageable difficulty...
Yes, there is enormous value. It's just waiting for someone to do it.
May 14 2015
parent reply "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 10:15:48 UTC, Daniel Murphy wrote:
 "Laeeth Isharc"  wrote in message 
 news:ejbhesbstgazkxnpvqsl forum.dlang.org...

 Is there value to having equivalents to the std.parallelism 
 approach that works with processes rather than threads, and 
 makes it easy to manage tasks over multiple machines?

 I took a look at std.parallelism and it's beyond what I can do 
 for now. But it seems like this might be a useful project, and 
 not one of unmanageable difficulty...
Yes, there is enormous value. It's just waiting for someone to do it.
To start the process off (because small beginnings are better than no beginning): what are the key features of processes vs threads that one would need to bear in mind when designing such a thing? Because I spent the past couple of decades in a different field, multiprocessing passed me by, so I am only now slowly catching up.
May 14 2015
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 14 May 2015 at 20:06:55 UTC, Laeeth Isharc wrote:
 To start the process off (because small beginnings are better 
 than no beginning): what are the key features of processes vs 
 threads one would need to bear in mind when designing such a 
 thing?  Because I spent the past couple of decades in a 
 different field, multiprocessing passed me by, so I am only now 
 slowly catching up.
"nobody" understands multiprocessing. Or rather… you need to understand the hardware and the concrete problem space first. There are no general solutions.
May 14 2015
parent reply "Laeeth Isharc" <laeeth nospamlaeeth.com> writes:
On Thursday, 14 May 2015 at 20:15:38 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 14 May 2015 at 20:06:55 UTC, Laeeth Isharc wrote:
 To start the process off (because small beginnings are better 
 than no beginning): what are the key features of processes vs 
 threads one would need to bear in mind when designing such a 
 thing?  Because I spent the past couple of decades in a 
 different field, multiprocessing passed me by, so I am only 
 now slowly catching up.
"nobody" understands multiprocessing. Or rather… you need to understand the hardware and the concrete problem space first. There are no general solutions.
Yes, I certainly understand that it is a highly specialist and complex area in which the best minds in the world do not yet have the answers. So if one were addressing the problem from an academic computer-science perspective, then perhaps one would arrive at a different answer.

My own perspective is a pragmatic, commercial one. I have some problems which perhaps scale quite well, and rather than write them using fork directly, I would rather have a higher-level wrapper along the lines of std.parallelism. Perhaps such a wrapper would be flawed and limited, but often something is better than nothing, even if not perfect. And I mention it on the forum only because the problems I face have usually turned out to be those faced by many others too.

If you have any thoughts on what should be considered, I would very much appreciate them. (And I owe you a response on our last suspended discussion, but haven't had time of late.)

Laeeth.
May 14 2015
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 14 May 2015 at 20:28:20 UTC, Laeeth Isharc wrote:
 My own is a pragmatic commercial one.  I have some problems 
 which perhaps scale quite well, and rather than write it using 
 fork directly, I would rather have a higher level wrapper along 
 the lines of std.parallelism.
Languages like Chapel and extended versions of C++ have built-in support for parallel computing that is relatively effortless and designed by experts (Cray/IBM etc.) to cover common patterns in demanding batch processing, for those who want something higher level than plain C++ (or, in this case, D, which is pretty much the same thing).

However, you could consider combining single-threaded processes in D with e.g. Python as a supervising process, if the datasets allow it. You'll find lots of literature on Inter-Process Communication (IPC) for Unix. Performance will be lower, but your own productivity might be higher; YMMV.
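As a rough illustration of that supervisor/worker split (shown here with D in both roles; ./worker is a hypothetical single-threaded worker binary that reads a task id on stdin and writes results to stdout):

import std.process : pipeProcess, wait, Redirect;
import std.stdio : writeln;

void main()
{
    auto p = pipeProcess(["./worker"],
                         Redirect.stdin | Redirect.stdout);
    p.stdin.writeln("task-42");   // hand the worker one task
    p.stdin.close();              // signal end of input
    foreach (line; p.stdout.byLine)
        writeln("worker said: ", line);
    wait(p.pid);                  // reap the worker process
}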
 Perhaps such would be flawed and limited, but often something 
 is better than nothing, even if not perfect.  And I mention it 
 on the forum only because usually I have found the problems I 
 face turn out to be those faced by many others too..
You need momentum in order to get from a raw state to something polished, so you essentially need a larger community that both has experience with the topic and a need for it, in order to end up with a sensible framework that is maintained. If you can get away with it, the most common simplistic approach seems to be map-reduce, because it is easy to distribute over many machines and there are frameworks that do the tedious bits for you.
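The single-machine analogue of that pattern already ships in std.parallelism; a minimal sketch (a multi-machine framework would replace the thread pool here with workers on other hosts):

import std.algorithm : map;
import std.parallelism : taskPool;
import std.range : iota;

void main()
{
    // parallel map + reduce over one machine's cores:
    // sum of squares of 0 .. 999_999
    auto total = taskPool.reduce!"a + b"(
        iota(1_000_000).map!(x => cast(double) x * x));
    assert(total > 0);
}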
 If you have any thoughts on what should be considered, I would 
 very much appreciate them.  (And I owe you a response on our 
 last suspended discussion, but haven't had time of late).
Nah, you owe me nothing ;-). And I also have no time atm. ;-) Ola.
May 14 2015
parent reply "Laeeth Isharc" <nospamlaeeth nospam.laeeth.com> writes:
On Thursday, 14 May 2015 at 20:56:16 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 14 May 2015 at 20:28:20 UTC, Laeeth Isharc wrote:
 My own is a pragmatic commercial one.  I have some problems 
 which perhaps scale quite well, and rather than write it using 
 fork directly, I would rather have a higher level wrapper 
 along the lines of std.parallelism.
Languages like Chapel and extended versions of C++ have built-in support for parallel computing that is relatively effortless and designed by experts (Cray/IBM etc.) to cover common patterns in demanding batch processing, for those who want something higher level than plain C++ (or, in this case, D, which is pretty much the same thing).
Yes - I am sure that there is excellent stuff here, from which one may learn much, especially if approaching it from a more theoretical or enterprisey, industrial-scale perspective.
 However, you could consider combining single threaded processes 
 in D with e.g. Python as a supervising process if the datasets 
 allow it. You'll find lots of literature on Inter Process 
 Communication (IPC) for Unix. Performance will be lower, but 
 your own productivity might be higher, YMMV.
But why would one use Python when fork itself isn't hard to use in a narrow sense, and neither is the kind of interprocess communication I would like to do for the kind of tasks I have in mind? It just seems to make sense to have a light wrapper.

Just because some problems in parallel processing are hard doesn't seem to me a reason not to do some work on addressing the easier ones, where even an imperfect (but real) solution may have great practical value. Sometimes I have the sense when talking with you that the answer to any question is anything but D! ;) (But I am sure I must be mistaken!)
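To make the kind of light wrapper I mean concrete, here is a rough sketch of a hypothetical forkMap that runs a function on each input in its own forked child and reads the result back over a pipe (one child per input for simplicity; return values go unchecked, and error handling and serialization of anything richer than a double are omitted):

import core.sys.posix.sys.types : pid_t;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, close, fork, pipe, read, write;

double[] forkMap(double function(double) fun, double[] inputs)
{
    auto results = new double[inputs.length];
    auto pids = new pid_t[inputs.length];
    auto fds = new int[2][](inputs.length);

    foreach (i, x; inputs)
    {
        pipe(fds[i]);                        // one pipe per child
        pids[i] = fork();
        if (pids[i] == 0)                    // child
        {
            close(fds[i][0]);
            double r = fun(x);
            write(fds[i][1], &r, r.sizeof);  // send result to parent
            _exit(0);
        }
        close(fds[i][1]);                    // parent keeps the read end
    }
    foreach (i; 0 .. inputs.length)
    {
        read(fds[i][0], &results[i], double.sizeof);
        close(fds[i][0]);
        int status;
        waitpid(pids[i], &status, 0);        // reap each child
    }
    return results;
}

A call like forkMap((double x) => x * x, data) would then fan the work out across processes instead of threads.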
 Perhaps such would be flawed and limited, but often something 
 is better than nothing, even if not perfect.  And I mention it 
 on the forum only because usually I have found the problems I 
 face turn out to be those faced by many others too..
You need momentum in order to get from a raw state to something polished, so you essentially need a larger community that both have experience with the topic and a need for it in order to get a sensible framework that is maintained.
True. But we are not speaking of getting from a raw state to perfection but just starting to play with the problem. If Walter Bright had listened to well-intentioned advice, he wouldn't be in the compiler business, let alone have given us what became D. I am no Walter Bright, but this is an easier problem to start exploring, and this would be beyond the scope of anything I would do just by myself.
 If you can get away with it, the most common simplistic 
 approach seems to be map-reduce. Because it is easy to 
 distribute over many machines and there are frameworks that do 
 the tedious bits for you.
Yes, indeed. But my question was more about the distinctions between processes and threads and the non-obvious implications for the design of such a framework. Nice chatting. Laeeth.
May 14 2015
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 15 May 2015 at 00:07:15 UTC, Laeeth Isharc wrote:
 But why would one use python when fork itself isn't hard to use 
 in a narrow sense, and neither is the kind of interprocess 
 communication I would like to do for the kind of tasks I have 
 in mind. It just seems to make sense to have a light wrapper.
The managing process doesn't have to be fast, but it should be easy to reconfigure. It is overall more effective (not efficient) to use a scripting language with a REPL for scripty tasks. Forking comes with its own set of pitfalls. The Unix way is to have a conglomerate of simple processes tied together with a script, which is overall easier to debug and modify.
 Just because some problems in parallel processing are hard 
 doesn't seem to me a reason not to do some work on addressing 
 the easier ones that may in a practical sense have great value 
 in having an imperfect (but real) solution for.  Sometimes I 
 have the sense when talking with you that the answer to any 
 question is anything but D! ;)  (But I am sure I must be 
 mistaken!)
I would have said the same thing about Rust and Nim too. Overall, what other people do with a tool affects the ecosystem and maturity. If you do system-level programming you are less affected by the ecosystem than when you do higher-level, task-oriented programming.

What is your mission: to solve a problem effectively now, or to start building a new framework with a time horizon measured in years? You have to decide this first. Then you have to decide what is more expensive: your time, or spending twice as much on CPU power (whether it is hardware or rented time at a datacenter).
 True.  But we are not speaking of getting from a raw state to 
 perfection but just starting to play with the problem.  If 
 Walter Bright had listened to well-intentioned advice, he 
 wouldn't be in the compiler business, let alone have given us 
 what became D.
He set out to build a new framework with a time horizon measured in decades. That's perfectly reasonable and what you have to expect when starting on a new language.

If you want to build a framework for a specific use, you need both the theoretical insights and the practical experience to complete it in a timely manner. You need many, many iterations to get to a state where it is better than whatever people use today, which is why most (sensible) engineers will pick existing solutions that are receiving polish rather than the next big thing.
 Yes, indeed.  But my question was more about the distinctions 
 between processes and threads and the non-obvious implications 
 for the design of such a framework.
If you want to use fork(), you might as well use threads. The main distinction is that with processes you have to be explicit about what resources to share, but after a fork() you also risk ending up in an inconsistent state if you aren't careful.

With a fork-based solution you still have to deal with a different level of complexity than with a Unixy conglomerate of simple programs that cooperate. The Unix way is easier to debug and test, but slower than an optimized multithreaded solution (and marginally slower than a process that forks itself).
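A small illustration of the "be explicit about what resources to share" point: after fork() the child sees copies of everything, so genuinely shared state must be set up deliberately, e.g. with a MAP_SHARED mapping (a sketch assuming POSIX anonymous mappings; single-threaded, which sidesteps the inconsistent-state hazard mentioned above):

import core.sys.posix.sys.mman : MAP_ANON, MAP_FAILED, MAP_SHARED,
    PROT_READ, PROT_WRITE, mmap, munmap;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, fork;
import std.stdio : writeln;

void main()
{
    // one int visible to both parent and child
    auto p = cast(int*) mmap(null, int.sizeof,
        PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
    assert(p !is cast(int*) MAP_FAILED);

    *p = 0;
    auto pid = fork();
    if (pid == 0)
    {
        *p = 42;    // visible to the parent, unlike an ordinary variable
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);
    writeln(*p);    // prints 42; a plain copied variable would still be 0
    munmap(p, int.sizeof);
}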
May 14 2015