digitalmars.D.announce - "Almost there" version of TDPL updated on Safari Rough Cuts

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
http://my.safaribooksonline.com/roughcuts

The current version includes virtually the entire book except (a) 
overloaded operators, (b) qualifiers, (c) threads. In the meantime I 
have finished the new design and written the chapter on overloaded 
operators. The design got Walter's seal of approval but I'm still 
waiting for Don's.

I plan to write the chapter on qualifiers in the next few days, mostly 
on the plane to and from Quebec. Hopefully Walter and I will zero in on a 
solution to code duplication due to qualifiers, probably starting from 
Steven Schveighoffer's proposal.

I'll then have one month to design a small but compelling core 
concurrency framework together with Walter, Sean, and whomever would 
want to participate. The initial framework will emphasize de facto 
isolation between threads and message passing. It will build on an 
Erlang-inspired message passing design defined and implemented by Sean.
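
A minimal sketch of the share-nothing, message-passing style described above. The framework itself had not been designed when this was posted, so this is not its API: it is a Python illustration, and the names `doubler` and `run_doubler` are hypothetical. The worker touches no state except the two queues it is handed.

```python
import queue
import threading


def doubler(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Worker loop: read a message, reply with a message, share nothing else."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: the owner asks the worker to stop
            break
        outbox.put(msg * 2)


def run_doubler(values):
    """Send each value to one worker thread and collect the replies in order."""
    values = list(values)
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=doubler, args=(inbox, outbox))
    worker.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)
    worker.join()
    # A single worker reading a FIFO queue replies in the order it was sent to.
    return [outbox.get() for _ in values]
```

All communication goes through the queues, so the worker could be swapped for a different kind of thread without changing the calling code.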


Andrei
Dec 10 2009
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 The initial framework will emphasize de facto 
 isolation between threads and message passing. It will build on an 
 Erlang-inspired message passing design defined and implemented by Sean.

Sounds good. When you have 30+ CPU cores you don't want shared memory; in those situations message passing (and actors, agents, etc.) seems better :-)

Bye,
bearophile
Dec 10 2009
Álvaro Castro-Castilla <alvcastro yahoo.es> writes:
Andrei Alexandrescu Wrote:

 I'll then have one month to design a small but compelling core 
 concurrency framework together with Walter, Sean, and whomever would 
 want to participate. The initial framework will emphasize de facto 
 isolation between threads and message passing. It will build on an 
 Erlang-inspired message passing design defined and implemented by Sean.

Are these threads going to be green, stackless threads? (I think they are actually recursive functions.) Or is it mostly the share-nothing approach that you are bringing from Erlang, using system threads? More info please! :)

From my point of view, I also think this is the best approach to scalable concurrency.
Dec 10 2009
Sean Kelly <sean invisibleduck.org> writes:
Álvaro Castro-Castilla Wrote:

Are these threads going to be green, stackless threads? (I think they are actually recursive functions)

Not initially, though that may happen later. The default static storage class is thread-local, which would be confusing if the "thread" you're using shares static data with some other thread. I'm pretty sure this could be fixed with some library work, but it isn't done right now. In short, for now you're more likely to want to use a small number of threads than zillions like you would in Erlang.
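
To illustrate the thread-local default mentioned above, here is a rough Python analogue. It is not D: `threading.local` stands in for D's thread-local statics, and `bump_and_read` and `counters_per_thread` are hypothetical names. Each OS thread sees its own copy of the counter, which is exactly the property that would break if several green "threads" shared one OS thread.

```python
import threading

# Attributes set on this object are private to the thread that set them.
_state = threading.local()


def bump_and_read(n: int) -> int:
    """Increment this thread's private counter n times and return it."""
    _state.counter = getattr(_state, "counter", 0)
    for _ in range(n):
        _state.counter += 1
    return _state.counter


def counters_per_thread(increments):
    """Run bump_and_read(n) in a fresh thread for each n; return each result."""
    # The results list is shared only so the demo can collect its output.
    results = [None] * len(increments)

    def work(i, n):
        results[i] = bump_and_read(n)

    threads = [threading.Thread(target=work, args=(i, n))
               for i, n in enumerate(increments)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If the counter were an ordinary module-level global instead, every thread would be incrementing the same value.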

 Or is it mostly the share-nothing approach that you are bringing from Erlang,
 using system threads? More info please! :)

The share-nothing approach is the initial goal. If green threads are used later it wouldn't change the programming model anyway, just the number of threads an app could use with reasonable performance.

 From my point of view, I also think this is the best approach to scalable
 concurrency.

Glad you agree :-)
Dec 10 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 Álvaro Castro-Castilla Wrote:
 Are these threads going to be green, stackless threads? (I think they are
 actually recursive functions.)

 Not initially, though that may happen later.  The default static storage class
 is thread-local, which would be confusing if the "thread" you're using shares
 static data with some other thread. I'm pretty sure this could be fixed with
 some library work, but it isn't done right now. In short, for now you're more
 likely to want to use a small number of threads than zillions like you would
 in Erlang.

 Or is it mostly the share-nothing approach that you are bringing from Erlang,
 using system threads? More info please! :)

 The share-nothing approach is the initial goal.  If green threads are used
 later it wouldn't change the programming model anyway, just the number of
 threads an app could use with reasonable performance.

 From my point of view, I also think this is the best approach to scalable
 concurrency.

 Glad you agree :-)

This is great for super-scalable concurrency, the kind you need for things like servers, but what about the case where you need concurrency mostly to exploit data parallelism in a multicore environment? Are we considering things like parallel foreach, map, reduce, etc. to be orthogonal to what's being discussed here, or do they fit together somehow?
Dec 10 2009
Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:
 
 This is great for super-scalable concurrency, the kind you need for things like
 servers, but what about the case where you need concurrency mostly to exploit
 data parallelism in a multicore environment?  Are we considering things like
 parallel foreach, map, reduce, etc. to be orthogonal to what's being discussed
 here, or do they fit together somehow?

I think it probably depends on the relative efficiency of a message passing approach to one using a thread pool for the small-N case (particularly for very large datasets). If message passing can come close to the thread pool in performance then it's clearly preferable. It may come down to whether pass by reference is allowed in some instances. It's always possible to use casts to bypass checking and pass by reference anyway, but it would be nice if this weren't necessary.
Dec 10 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 dsimcha Wrote:
 This is great for super-scalable concurrency, the kind you need for things like
 servers, but what about the case where you need concurrency mostly to exploit
 data parallelism in a multicore environment?  Are we considering things like
 parallel foreach, map, reduce, etc. to be orthogonal to what's being discussed
 here, or do they fit together somehow?

 I think it probably depends on the relative efficiency of a message passing
 approach to one using a thread pool for the small-N case (particularly for very
 large datasets). If message passing can come close to the thread pool in
 performance then it's clearly preferable. It may come down to whether pass by
 reference is allowed in some instances. It's always possible to use casts to
 bypass checking and pass by reference anyway, but it would be nice if this
 weren't necessary.

What about simplicity? Message passing is definitely safer. Parallel foreach (the kind that allows implicit sharing of basically everything, including stack variables) is basically a cowboy approach that leaves all safety concerns to the programmer. OTOH, parallel foreach is a very easy construct to understand and use in situations where you have data parallelism and you're doing things that are obviously (from the programmer's perspective) safe, even though they can't be statically proven safe (from the compiler's perspective).

Don't get me wrong, I definitely think message passing-style concurrency has its place. It's just the wrong tool for the job if your goal is simply to exploit data parallelism to use as many cores as you can.
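
For readers who have not seen the construct, a rough sketch of the "implicit sharing" style of parallel foreach follows. It is Python rather than D, and `parallel_foreach` and `squares_in_place` are hypothetical names, not an existing library API. The loop body freely reads and writes variables from the enclosing scope, and nothing but programmer discipline keeps two iterations from touching the same slot.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_foreach(items, body, workers=4):
    """Run body(item) for every item on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every call and re-raises any exception from body.
        list(pool.map(body, items))


def squares_in_place(data):
    """Fill `out` from a loop body that shares `data` and `out` implicitly."""
    out = [0] * len(data)

    def body(i):
        # Safe only because each iteration writes a distinct index;
        # nothing checks or enforces that.
        out[i] = data[i] * data[i]

    parallel_foreach(range(len(data)), body)
    return out
```

In CPython the global interpreter lock limits the speedup for pure-Python loop bodies, so this shows the programming model rather than the performance.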
Dec 10 2009
Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:

 What about simplicity? Message passing is definitely safer. Parallel foreach
 (the kind that allows implicit sharing of basically everything, including stack
 variables) is basically a cowboy approach that leaves all safety concerns to
 the programmer. OTOH, parallel foreach is a very easy construct to understand
 and use in situations where you have data parallelism and you're doing things
 that are obviously (from the programmer's perspective) safe, even though they
 can't be statically proven safe (from the compiler's perspective).

I'd like to think it isn't necessary to expose the internals of the algorithm to the user. Parallel foreach (or map, since they're the same thing), could as easily divide up the dataset and send slices to worker threads via messages as with a more visible threading model. The only issue I can think of is that for very large datasets you really have to pass references of one kind or another. Scatter/gather with copying just isn't feasible when you're at the limits of virtual memory.
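
One way to picture this, as a sketch only: a parallel map whose interface hides the threading, and whose implementation sends each worker a message naming a slice (an index range into the one shared list) instead of a copy of the data. It is Python, not the D design, and `parallel_map_inplace` is a hypothetical name.

```python
import queue
import threading


def parallel_map_inplace(data, fn, workers=4):
    """Apply fn to every element of data in place, using slice messages."""
    tasks = queue.Queue()  # messages to workers: (lo, hi) ranges or None
    done = queue.Queue()   # messages from workers: finished (lo, hi) ranges

    def worker():
        while True:
            msg = tasks.get()
            if msg is None:  # sentinel: no more slices
                break
            lo, hi = msg
            for i in range(lo, hi):
                data[i] = fn(data[i])  # works on the shared list, no copy
            done.put((lo, hi))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    chunk = max(1, -(-len(data) // workers))  # ceiling division
    n_chunks = 0
    for lo in range(0, len(data), chunk):
        tasks.put((lo, min(lo + chunk, len(data))))
        n_chunks += 1
    for _ in threads:
        tasks.put(None)
    for _ in range(n_chunks):
        done.get()  # wait for each slice to be reported finished
    for t in threads:
        t.join()
    return data
```

Only the two integers of each range travel through the queue, which is the "pass references of one kind or another" case; a copying scatter/gather would put the slice contents in the message instead.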
Dec 10 2009
Álvaro Castro-Castilla <alvcastro yahoo.es> writes:
Sean Kelly Wrote:

 I'd like to think it isn't necessary to expose the internals of the algorithm
 to the user. Parallel foreach (or map, since they're the same thing), could as
 easily divide up the dataset and send slices to worker threads via messages as
 with a more visible threading model. The only issue I can think of is that for
 very large datasets you really have to pass references of one kind or another.
 Scatter/gather with copying just isn't feasible when you're at the limits of
 virtual memory.

Language extensions for message passing, such as Kilim for Java, send messages by giving away ownership of the data rather than copying it. That is one reason compiler/runtime support is needed. Also, parallel map/foreach is more feasible as a library-only solution, whereas message passing requires some support from the runtime environment. Please correct me if I'm wrong.
Dec 11 2009
Sean Kelly <sean invisibleduck.org> writes:
Álvaro Castro-Castilla Wrote:
 
 Language extensions for message passing, such as Kilim for Java, send messages
 by giving away ownership of the data rather than copying it. That is one reason
 compiler/runtime support is needed.

Knowledge of unique ownership can obviate the need for copying, but copying is a reasonable fall-back in most cases.
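
A small sketch of that trade-off, under stated assumptions: Python cannot verify unique ownership, so the hypothetical `unique` flag below is just the sender's promise that it will not touch the payload again. Without that promise, `send` falls back to a deep copy so the receiver gets data nobody else can mutate.

```python
import copy
import queue


def send(outbox: queue.Queue, payload, unique: bool = False) -> None:
    """Send payload; copy it unless the sender promises to give it away."""
    if unique:
        outbox.put(payload)                 # ownership handoff: no copy made
    else:
        outbox.put(copy.deepcopy(payload))  # safe fallback: receiver gets a copy
```

A language that tracks uniqueness in the type system could choose the first branch automatically and reject code that uses the payload after sending it.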

 Also, parallel map/foreach is more feasible as a library-only solution,
 whereas message passing requires some support from the runtime environment.

It really depends on the language and what your goals are. There are message passing libraries for C, for example, but they don't provide much in the way of safety.
Dec 11 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 Álvaro Castro-Castilla Wrote:
 Language extensions for message passing, such as Kilim for Java, send messages
 by giving away ownership of the data rather than copying it. That is one reason
 compiler/runtime support is needed.

 Knowledge of unique ownership can obviate the need for copying, but copying is
 a reasonable fall-back in most cases.

 Also, parallel map/foreach is more feasible as a library-only solution,
 whereas message passing requires some support from the runtime environment.

 It really depends on the language and what your goals are.  There are message
 passing libraries for C, for example, but they don't provide much in the way of
 safety.

IDK, generally I agree with the idea that D and Phobos should provide safe defaults and more efficiency when you really need it and explicitly ask for it. The exception, though, is when someone is using a construct that would only be used by people who really need efficiency, and thus has already explicitly asked for efficiency. This includes parallel foreach. In these cases, I think that "efficiency first, safety second" has to be the rule and there should never, ever be any implicit copying. If you can't implement this safely, then it should be implemented unsafely and place the onus of ensuring safety on the programmer.

Concurrency for purposes other than pure throughput performance is a completely different animal, though one that I admittedly have never needed to deal with. Here, safety counts much more.
Dec 11 2009
bearophile <bearophileHUGS lycos.com> writes:
dsimcha:
 The exception, though, is when someone is using a construct that would only be
 used by people who really need efficiency, and thus has already explicitly
 asked for efficiency.  This includes parallel foreach.  In these cases, I think
 that "efficiency first, safety second" has to be the rule and there should
 never, ever be any implicit copying.  If you can't implement this safely, then
 it should be implemented unsafely and place the onus of ensuring safety on the
 programmer.

system parallel_foreach() Vs: parallel_foreach() ? :-)
When they can be created it's useful to have handy safe alternatives (possibly the default ones).

Bye,
bearophile
Dec 11 2009
Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:

 IDK, generally I agree with the idea that D and Phobos should provide safe
 defaults and more efficiency when you really need it and explicitly ask for it.
 The exception, though, is when someone is using a construct that would only be
 used by people who really need efficiency, and thus has already explicitly
 asked for efficiency. This includes parallel foreach.

Yeah, that's why I said in my original reply that if message passing were used, it may be necessary to use casting to avoid copying. However, it may be enough to just pass a slice of a shared range. Nothing has really been decided regarding restrictions on what can be passed.
Dec 11 2009