
digitalmars.D - Parallel programming

bearophile <bearophileHUGS lycos.com> writes:
How much time do we have to wait to see some parallel processing features in D?
People are getting more and more rabid because they have few ways to use their
2-4 core CPUs.
Classic multithreading is useful, but sometimes it's not easy to use correctly.

There are other ways to write parallel code that D may adopt (more than one way
is probably better; no silver bullet exists in this field). Their point is to
let programmers use the 2-4+ core CPUs of today (and maybe the 80-1000+ cores
of the future) in non-speed-critical parts of the code where the programmer
wants to use the other cores anyway, without too much programming effort.

I think Walter wants the D language to be multi-paradigm; one of the best ways
to allow multiprocessing in a simple and safer way is Stream Processing
(http://en.wikipedia.org/wiki/Stream_Processing ). D syntax may grow a few
constructs to support that kind of programming in a simple way (C++ has some
such libs, I think).

Another easy way to perform multiprocessing is to vectorize: the compiler can
automatically use all the cores to evaluate expressions like
array1+array2+array3.
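No D compiler does this today, but the transformation such cross-core
vectorization implies is easy to sketch by hand. A minimal illustration
(written in Python for brevity; the function name, chunking scheme, and fixed
worker count are all invented for the example):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_elementwise_sum(arrays, workers=4):
    """Compute array1+array2+...+arrayN elementwise, one chunk per task."""
    n = len(arrays[0])
    out = [0] * n
    chunk = (n + workers - 1) // workers  # ceiling division

    def add_chunk(start):
        for i in range(start, min(start + chunk, n)):
            out[i] = sum(a[i] for a in arrays)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # each worker fills a disjoint slice of the output
        list(pool.map(add_chunk, range(0, n, chunk)))
    return out
```

For example, parallel_elementwise_sum([[1, 2], [10, 20], [100, 200]]) yields
[111, 222]. The point is that the chunks are independent, so the compiler
could emit this split without changing the program's meaning.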

Another way to perform multiprocessing is to add to the D syntax the
parallel_for (and a few related things to merge results back, etc.) that was
present in the "Parallel Pascal" language. Such constructs are much simpler to
use correctly than threads. Sun's new "Fortress" language shows similar things,
but they are more refined than the Parallel Pascal ones (and they look more
complex to understand and use, so they may be overkill for D, I don't know;
some of those parallel features of Fortress look quite difficult to implement
to me).
Some time ago I saw a form of parallel_for and the like in a small and easy
language from MIT, which I think is simple enough.
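The semantics of such a parallel_for are small enough to sketch as a library
function (Python here for illustration; the names and the worker count are
invented, and a real language feature would also have to verify that
iterations are independent):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(start, stop, body, workers=4):
    """Run body(i) for every i in [start, stop).

    Iterations may run concurrently and in any order, so the body must
    not depend on other iterations - the contract a language-level
    parallel_for would impose."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(body, range(start, stop)))  # waits for all iterations

# Each iteration writes a disjoint slot; the "merge things back" step
# then combines the per-iteration results sequentially.
squares = [0] * 8
parallel_for(0, 8, lambda i: squares.__setitem__(i, i * i))
total = sum(squares)  # merge step
```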

Other ways to use parallel code are now being pushed by Intel, by OpenMP, and
by the hairy but usable CUDA from Nvidia (I am not sure I want to learn CUDA:
it's a C variant, but it seems to require a large human memory and a large
human brain to be used, while I think D may have simpler built-in things. Much
more "serious" D programmers may use external libs that allow them any fine
control they want). To me they look too much in flux right now to be copied
much by D.

Bye,
bearophile
Jul 15 2008
Sean Kelly <sean invisibleduck.org> writes:
bearophile wrote:
 How much time do we have to wait to see some parallel processing features in
D? People are getting more and more rabid because they have few ways to use
their 2-4 core CPUs.
 Classic multithreading is useful, but sometimes it's not easy to use correctly.
 
 There are other ways to write parallel code, that D may adopt (more than one
way is probably better, no silver bullet exists in this field). Their point is
to allow to use the 2-4+ core CPUs of today (and maybe the 80-1000+ cores of
the future) in non-speed-critical parts of the code where the programmer wants
to use the other cores anyway, without too much programming efforts.
 
 I think Walter wants D language to be multi-paradigm; one of the best ways to
allow multi-processing in a simple and safer way is the Stream Processing
(http://en.wikipedia.org/wiki/Stream_Processing ), D syntax may grow few
constructs to use such kind of programming in a simple way (C++ has some such
libs, I think).
 
 Another easy way to perform multi processing is to vectorize. It means the
compiler can automatically use all the cores to perform operators like
array1+array2+array3.
 
 Another way to perform multi processing is so add to the D syntax the
parallel_for (and few related things to merge things back, etc) syntax that was
present in the "Parallel Pascal" language. Such things are quite simpler to use
correctly than threads. The new "Fortress" language by Sun shows similar
things, but they are more refined compared to the Parallel Pascal ones (and
they look more complex to understand and use, so they may be overkill for D, I
don't know. Some of those parallel things of Fortress look quite difficult to
implement to me).
 Time ago I have seen a form of parallel_for and the like in a small and easy
language from MIT, that I think are simple enough.

I asked for parallelization support for foreach... well, ages ago. At the time
Walter said no because DMD was years away from being able to do anything like
that, but perhaps with the new focus on multiprogramming one can argue more
strongly that it's important to get something like this into the spec even if
DMD itself doesn't support it. My request was pretty minimal, and partially a
reaction to foreach_reverse. It was:

foreach( ... )       // defaults to "fwd"
foreach(fwd)( ... )
foreach(rev)( ... )
foreach(any)( ... )

Thus foreach(any) is eligible for parallelization, while fwd and rev are what
we have now. This would be easy enough with templates and another keyword:

apply!(fwd)( ... )   // etc.

But passing a delegate literal as an argument isn't nearly as nice as the
built-in foreach, and Tom's (IIRC) proposal to clean up the syntax for this
doesn't look like it will ever be accepted.
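The fwd/rev/any distinction can be mimicked today with ordinary higher-order
functions, which also shows why any is the parallelizable case. A sketch (in
Python rather than D; apply_fwd, apply_rev, and apply_any are invented names):

```python
from concurrent.futures import ThreadPoolExecutor

def apply_fwd(items, body):
    for x in items:            # today's foreach: strict forward order
        body(x)

def apply_rev(items, body):
    for x in reversed(items):  # today's foreach_reverse
        body(x)

def apply_any(items, body, workers=4):
    # Order deliberately unspecified - exactly the property that makes
    # the loop eligible for parallel execution.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(body, items))
```

A body passed to apply_any must tolerate any ordering; the other two keep
their familiar guarantees.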
 Other ways to use parallel code are now being pushed by Intel, OpenMP, and the
hairy but usable CUDA by Nvidia (I am not sure I want to learn CUDA, it's a C
variant, but seems to require a large human memory and a large human brain to
be used, while I think D may have simpler built-in things. Much more "serious"
D programmers may use external libs that allow them any fine control they
want). To me they look too much in flux now to be copied much by D.

D already has coroutines, DCSP, and futures available from various programmers
(Mikola Lysenko for the first two), so I think the state of multiprogramming
in D is actually pretty good even without additional language support.

Sean
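For readers who haven't met them, the future pattern Sean refers to is tiny in
practice. A sketch of the shape of the idea in Python (the D libraries he
mentions have their own APIs; this only shows the pattern):

```python
from concurrent.futures import ThreadPoolExecutor

def expensive():
    return 6 * 7                   # stands in for a long computation

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(expensive)   # computation starts in the background
    # ... the caller keeps doing other work here ...
    value = fut.result()           # rendezvous: block only when needed
```

The caller pays for synchronization only at the single point where the value
is actually consumed, which is what makes futures so easy to retrofit.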
Jul 15 2008
downs <default_357-line yahoo.de> writes:
Sean Kelly wrote:
 D already has coroutines, DCSP, and futures available from various
 programmers (Mikola Lysenko for the first two), so I think the state of
 multiprogramming in D is actually pretty good even without additional
 language support.
 

 Sean

For what it's worth, coroutines and futures are also in tools
(http://dsource.org/projects/scrapple/browser/trunk/tools/tools).

Also, I agree with your sentiment.

--downs
Jul 15 2008
downs <default_357-line yahoo.de> writes:
bearophile wrote:
 How much time do we have to wait to see some parallel processing features in
D? People are getting more and more rabid because they have few ways to use
their 2-4 core CPUs.
 Classic multithreading is useful, but sometimes it's not easy to use correctly.
 

Grow a pair and use threads. It's not _that_ hard.
 There are other ways to write parallel code, that D may adopt (more than one
way is probably better, no silver bullet exists in this field). Their point is
to allow to use the 2-4+ core CPUs of today (and maybe the 80-1000+ cores of
the future) in non-speed-critical parts of the code where the programmer wants
to use the other cores anyway, without too much programming efforts.
 
 I think Walter wants D language to be multi-paradigm; one of the best ways to
allow multi-processing in a simple and safer way is the Stream Processing
(http://en.wikipedia.org/wiki/Stream_Processing ), D syntax may grow few
constructs to use such kind of programming in a simple way (C++ has some such
libs, I think).
 
 Another easy way to perform multi processing is to vectorize. It means the
compiler can automatically use all the cores to perform operators like
array1+array2+array3.
 

Patched GDC supports autovectorization with -ftree-vectorize, although that's
single-core.

One of the good things IMHO about D is that its operations are mostly easy to
understand, i.e. there's little magic going on. PLEASE don't change that.
 Another way to perform multi processing is so add to the D syntax the
parallel_for (and few related things to merge things back, etc) syntax that was
present in the "Parallel Pascal" language. Such things are quite simpler to use
correctly than threads. The new "Fortress" language by Sun shows similar
things, but they are more refined compared to the Parallel Pascal ones (and
they look more complex to understand and use, so they may be overkill for D, I
don't know. Some of those parallel things of Fortress look quite difficult to
implement to me).

 Time ago I have seen a form of parallel_for and the like in a small and easy
language from MIT, that I think are simple enough.

auto tp = new Threadpool(4);
tp.mt_foreach(Range[4], (int e) { });
 
 Other ways to use parallel code are now being pushed by Intel, OpenMP, and the
hairy but usable CUDA by Nvidia (I am not sure I want to learn CUDA, it's a C
variant, but seems to require a large human memory and a large human brain to
be used, while I think D may have simpler built-in things. Much more "serious"
D programmers may use external libs that allow them any fine control they
want). To me they look too much in flux now to be copied much by D.

Please, no hardware-specific features. D is x86-dependent enough as it is; it
would be a bad idea to add dependencies on _graphics cards_.
 Bye,
 bearophile

IMHO what's really needed is good tools to discover interaction between
threads. I'd like a standardized way to grab debug info, like the current
backtrace of a std.thread.Thread. This could be used to implement fairly
sophisticated logging.

Also, something I have requested before: single-statement function bodies
should be able to omit their {}s, to bring them in line with normal loop
statements. This sounds like a hack, but which is better?

void test() { synchronized(this) { ... } }

or

void test() synchronized(this) { ... }

--downs
Jul 15 2008
Markus Koskimies <markus reaaliaika.net> writes:
On Tue, 15 Jul 2008 19:34:24 -0400, bearophile wrote:

 How much time do we have to wait to see some parallel processing
 features in D? People are getting more and more rabid because they have
 few ways to use their 2-4 core CPUs. Classic multithreading is useful,
 but sometimes it's not easy to use correctly.

A very short answer: for true parallel processing, 2-4 processors is nothing.

The success of CFLs (Control-Flow Languages) like C, C++, D, Pascal, Perl,
Python, BASICs, Cobol, Comal, PL/I, whitespace, malbolge, etc. etc. is that
they follow the underlying paradigm of the computer. There have been many
efforts to define languages that are implicitly parallel. The most used
approach is to use DFL (Data-Flow Language) paradigms, and the most well-known
of these is definitely VHDL. Others are e.g. NESL and ID. Then there are
several languages that are in-between, like functional programming languages
(Haskell, Erlang) or reductive languages (like make and Prolog).

Short references:

http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=714561
http://portal.acm.org/citation.cfm?id=359579&dl=GUIDE
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=630241

Especially Hartenstein's articles are good to read if you are trying to
understand why we are still using CFLs & RASP, and why parallel architectures
have failed.

No, the future will not show us any more parallelism at the source level.
Instead, (a) compilers will start to understand source better, to parallelize
the inner kernels of loops automatically, and (b) there will be even more
layers between the source we write and the instructions/configurations
processors execute; thus the main purpose of a source language is no longer to
follow the underlying paradigm, but productivity - how easy it is for humans
to express things; and CFL languages are far from their counterparts in this
area. Comparing CFL/DFL at the compiler level, see e.g.:

http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/fccm/1995/7086/00/7086toc.xml&DOI=10.1109/FPGA.1995.477423

If I were asked to say what the way of writing future programs is, I would say
it is MPS (Message Passing Systems); refer to e.g. Hewitt's Actor Model (1973).
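In its minimal form, the actor model referenced above is just a thread that
owns private state plus a mailbox; everything else (addressing, supervision,
distribution) is layered on top. A small sketch in Python (class and method
names invented for the example):

```python
import queue
import threading

class CounterActor:
    """Minimal Hewitt-style actor: private state, a mailbox, one thread.

    No locks are needed because only the actor's own thread ever touches
    self.total - all interaction happens via messages."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.total = 0
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:        # poison pill: shut down
                return
            self.total += msg

    def send(self, msg):
        self.mailbox.put(msg)      # asynchronous; never blocks the sender

    def stop(self):
        self.mailbox.put(None)
        self._thread.join()
```

Senders never share memory with the actor, only messages, which is what makes
the model scale to the neighborhood-connected machines discussed above.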
Furthermore, I would predict that processors will start to do low-level
reconfiguration, e.g. the RSIP (Reconfigurable Instruction Set Processor)
paradigm. Google for GARP and the awesome performance increases it can offer
for certain tasks.
Jul 15 2008
Sean Kelly <sean invisibleduck.org> writes:
== Quote from Markus Koskimies (markus reaaliaika.net)'s article
 If I would asked to say what is the way of writing future programs, I
 would say it is MPS (Message Passing Systems), refer to e.g. Hewitt's
 Actor Model (1973).

I agree completely. MP is easy to comprehend (it's how people naturally
operate) and the tech behind it is extremely well established.

I remain skeptical that we'll see a tremendous amount of automatic
parallelization of ostensibly procedural code by the interpreter (i.e.
compiler or VM). For one thing, it complicates debugging tremendously, not to
mention the error conditions that such translation can introduce.

As a potentially relevant anecdote, after Herb Sutter's presentation on Concur
a year or two ago at SDWest, I asked him what should happen if two threads of
an automatically parallelized loop both throw an exception, given that the C++
spec dictates that having more than one in-flight exception per thread should
call terminate(). He dodged my question and turned to talk to someone else,
who, interestingly enough, did make an attempt to ensure that Herb understood
what I was asking, but to no avail. Implications about Herb aside, I do think
this suggests that there are known problems with implicit parallelization that
everyone is hoping will just magically disappear. How can one verify the
correctness of code that may fail if implicitly parallelized but work if not?
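One defensive policy a runtime could adopt for exactly this situation -
collect every failure instead of letting a second in-flight exception
terminate the program - is easy to sketch. This is hypothetical (neither
Concur nor any real compiler is claimed to do this), in Python for brevity:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(bodies, workers=4):
    """Run all loop bodies concurrently; return the list of exceptions
    raised, instead of letting a second in-flight exception abort the
    program as the C++ per-thread rules would."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(b) for b in bodies]
        # leaving the with-block waits for every body to finish
    failures = []
    for f in futures:
        try:
            f.result()         # re-raises the body's stored exception
        except Exception as exc:
            failures.append(exc)
    return failures
```

The open question Sean raises remains: whatever aggregation policy the
language picks, the programmer must be told it, or correctness can't be
verified.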
 Furthermore, I would predict processors to start to
 do low-level reconfigurations, e.g. RSIP (Reconfigurable Instruction Set
 Processor) -paradigm. Look google for GARP and the awesome performance
 increasements it can offer for certain tasks.

Interestingly, parallel programming is the topic covered by ACM Communications
magazine this month, and I believe there is a bit about this sort of hardware
parallelism in addition to transactional memory, etc. The articles I've read
so far have all been well-reasoned and pretty honest about the benefits and
problems of each idea.

Sean
Jul 15 2008
Markus Koskimies <markus reaaliaika.net> writes:
On Wed, 16 Jul 2008 00:53:28 +0000, Sean Kelly wrote:

 == Quote from Markus Koskimies (markus reaaliaika.net)'s article
 If I would asked to say what is the way of writing future programs, I
 would say it is MPS (Message Passing Systems), refer to e.g. Hewitt's
 Actor Model (1973).

 I agree completely. MP is easy to comprehend (it's how people naturally
 operate) and the tech behind it is extremely well established.

I couldn't agree more. MP is a very natural way for us humans to organize
parallel things. But there is even more behind it; the fundamental reason that
keeps computers from becoming PRAM machines is the world around us. It
restricts all physical machines, including computers, to a maximum of three
spatial dimensions and to inherently neighborhood-connected models; and those
are very, very far from ideal PRAM things...
 I remain
 skeptical that we'll see a tremendous amount of automatic
 parallelization of ostensibly procedural code by the interpreter (ie.
 compiler or VM).  For one thing, it complicates debugging tremendously,
 not to mention the error conditions that such translation can introduce.

Another thing I completely agree with. It is not about what would ideally be
best; it is the reality that matters. Debugging a highly parallel thing, e.g.
FPGA hardware, is a very, very time-consuming task.
 As an potentially relevant anecdote, after Herb Sutter's presentation on
 Concur [...]

Many highly skillful people are very bound to the great ideas they have in their mind. I'm not an exception :)
 Furthermore, I would predict processors to start to do low-level
 reconfigurations, e.g. RSIP (Reconfigurable Instruction Set Processor)
 -paradigm. Look google for GARP and the awesome performance
 increasements it can offer for certain tasks.

 Interestingly, parallel programming is the topic covered by ACM
 Communications magazine this month, and I believe there is a bit about this
 sort of hardware parallelism in addition to transactional memory, etc. The
 articles I've read so far have all been well-reasoned and pretty honest about
 the benefits and problems with each idea.

If reconfigurable computers - and more or less distributed computing - do not
come as the next major processor architectures, I will go to some distant
place and feel shame. They are not ideal nor optimal computers, far from it -
programming one is very laborious and it is very hard for compilers. But they
just work.
Jul 15 2008
Markus Koskimies <markus reaaliaika.net> writes:
On Wed, 16 Jul 2008 01:15:23 +0000, Markus Koskimies wrote:

 Many highly skillful people are very bound to the great ideas they have
 in their mind. I'm not an exception :)

Many *even* highly etc. etc. :oops:
Jul 15 2008
JAnderson <ask me.com> writes:
bearophile wrote:
 How much time do we have to wait to see some parallel processing features in
D? People are getting more and more rabid because they have few ways to use
their 2-4 core CPUs.
 [...]

I'm hoping that the new "pure" stuff Walter is working on will enable the
compiler to automatically parallelize things like foreach. It won't be as fast
as something that's hand-tuned, but it will be a hell of a lot easier to
write.

-Joel
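The link between purity and parallelization is the classic one: if the loop
body is pure, no iteration can observe another, so the compiler is free to
reorder or parallelize. A hand-written sketch of what the compiler might
effectively emit (Python; parallel_map is an invented stand-in):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # "Pure": the result depends only on the argument; no side effects.
    return x * x

def parallel_map(f, items, workers=4):
    # Legal to parallelize precisely because f is pure; for an impure f
    # this transformation could change the program's meaning.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, items))
```

This is why a compiler-checked pure qualifier matters: it turns "safe to
parallelize" from a programmer's promise into a verified property.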
Jul 15 2008
Jascha Wetzel <ask a-search-engine.de> writes:
Agreed, we absolutely need an OpenMP (http://www.openmp.org)
implementation for D.

bearophile wrote:
 How much time do we have to wait to see some parallel processing features in
D? [...]

Jul 16 2008