digitalmars.D - std.concurrency: Returning from spawned function

dsimcha <dsimcha yahoo.com> writes:

I was thinking about ways to improve std.concurrency w/o compromising its
safety or the simplicity of what already works.  Isn't it unnecessarily
restrictive that a spawned function must return void?  Since the spawned
thread dies when the spawned function returns, the return value could safely
be moved to the owner thread.  Therefore, the return values wouldn't even have
to be immutable/shared/lacking indirection.  The return value could, for
example, be stored in Tid, with attempts to retrieve it blocking until the
spawned thread returns.

This would enable an important use of concurrency to be implemented safely and
efficiently.  Assume the input to a function is an immutable string specifying
a filename.  The output is a very complex data structure representing the data
in the file.  A main thread could then spawn a worker thread with the
immutable string for input, and get the data structure without using any
shared data or exposing the possibility of any race conditions.

Is there some reason I'm missing why this doesn't already work?

Sep 10 2010

Sean Kelly <sean invisibleduck.org> writes:

dsimcha Wrote:

 I was thinking about ways to improve std.concurrency w/o compromising its
 safety or the simplicity of what already works.  Isn't it unnecessarily
 restrictive that a spawned function must return void?  Since the spawned
 thread dies when the spawned function returns, the return value could safely
 be moved to the owner thread.  Therefore, the return values wouldn't even have
 to be immutable/shared/lacking indirection.  The return value could, for
 example, be stored in Tid, with attempts to retrieve it blocking until the
 spawned thread returns.

That each spawn() results in the creation of a thread whose lifetime ends when
the function returns is an implementation detail.  It could as easily be a
thread pool that resets its TLS data when picking up a new operation, a
user-space thread, etc.  In short, I don't think that the behavior of a thread
exiting should be a motivating factor for design changes.  Does this gain
anything over sending a message on exit?
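
A minimal sketch of the "message on exit" alternative, assuming the result is a
type std.concurrency will accept (here a plain size_t).  The names worker and
data.txt are hypothetical and only illustrate the pattern; the owner blocks in
receiveOnly until the worker sends its result.

```d
import std.concurrency : spawn, send, receiveOnly, thisTid, Tid;

// Hypothetical worker: stands in for "parse the file and produce a result".
void worker(Tid owner, string filename)
{
    // Placeholder computation; a real worker would read and parse the file.
    size_t result = filename.length;
    // Send the result to the owner as the last thing the function does.
    owner.send(result);
}

void main()
{
    // The filename is an immutable string, so it can be passed to spawn().
    spawn(&worker, thisTid, "data.txt");
    // Blocks until the worker's message arrives.
    auto n = receiveOnly!size_t();
    assert(n == 8);
}
```

The limitation discussed in this thread still applies: the message type must be
free of unshared aliasing, so an arbitrary mutable object graph cannot be sent
this way.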

Sep 10 2010

Russel Winder <russel russel.org.uk> writes:

On Sat, 2010-09-11 at 00:52 -0400, Sean Kelly wrote:
 dsimcha Wrote:

 I was thinking about ways to improve std.concurrency w/o compromising its
 safety or the simplicity of what already works.  Isn't it unnecessarily
 restrictive that a spawned function must return void?  Since the spawned
 thread dies when the spawned function returns, the return value could safely
 be moved to the owner thread.  Therefore, the return values wouldn't even have
 to be immutable/shared/lacking indirection.  The return value could, for
 example, be stored in Tid, with attempts to retrieve it blocking until the
 spawned thread returns.

 That each spawn() results in the creation of a thread whose lifetime
 ends when the function returns is an implementation detail.  It could
 as easily be a thread pool that resets its TLS data when picking up a
 new operation, user-space thread, etc.  In short, I don't think that
 the behavior of a thread exiting should be a motivating factor for
 design changes.  Does this gain anything over sending a message on
 exit?

I guess it is really a question of message passing versus data
parallelism.  Clearly in a message passing idiom asynchronous function
execution can (possibly should) always be handled by void functions.  In
a data parallel context you generally want a function that returns the
value.  The idiom here is to create a sequence and then to create a new
sequence which is a function applied to each element of the old sequence
delivering a value to the new sequence -- parallel arrays.
Algorithmically the computation on each result element is independent,
even in the case where non-local read accesses are allowed, so this is
"embarrassingly parallel".  It is left as a runtime implementation issue
as to how the computations map to threads and thence to processors.
C++0x doesn't really get this right.  Chapel and X10 are getting
close, but they are full PGAS (partitioned global address space)
languages, so they should.  Haskell, via DPH, is also getting there,
as indeed is Java -- assuming Java 7 ever makes it into production.

I think my real point is that data parallelism shouldn't have to be
manually constructed from asynchronous functions, as long as you have
closures -- either explicitly or implicitly (as can be constructed with
C++ and Java).
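
For illustration, the "parallel arrays" idiom described above can be sketched
in D with taskPool.amap from std.parallelism as it exists in current Phobos
(an assumption: the module was still under review when this thread was
written).  The function square is a hypothetical stand-in for the per-element
computation.

```d
import std.parallelism : taskPool;

// Hypothetical per-element function; each call is independent of the others.
int square(int x)
{
    return x * x;
}

void main()
{
    auto input = [1, 2, 3, 4, 5];
    // amap applies square to every element and returns a new array.
    // How elements are distributed over worker threads is left to the pool.
    int[] output = taskPool.amap!square(input);
    assert(output == [1, 4, 9, 16, 25]);
}
```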

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

Sep 11 2010

dsimcha <dsimcha yahoo.com> writes:

== Quote from Sean Kelly (sean invisibleduck.org)'s article
 That each spawn() results in the creation of a thread whose lifetime ends when
 the function returns is an implementation detail.

Fair enough.  I didn't realize this was just an implementation detail.  In this
case you're absolutely right.

 It could as easily be a thread pool that resets its TLS data when picking up a
 new operation, user-space thread, etc.

I don't know the details of how TLS is implemented.  Would it be possible to
reset all TLS data for a given thread?  If so, I could make my std.parallelism
module that's currently in review (formerly parallelfuture for those who
haven't been following the Phobos list) support **safe** asynchronous function
calls for a limited but useful number of cases:

1.  Like for std.concurrency, all arguments must be free of unshared aliasing.

2.  The callable object must be a function pointer, not a delegate, template
alias parameter or object w/ overloaded opCall.

3.  If I could reset all TLS data in the worker thread upon returning, then I
could allow for arbitrarily complex object graphs as the return type, since the
worker thread could be guaranteed to not have any local references to the object
after the task returned.

This would probably be better than using std.concurrency for these cases because
even though asynchronous function calls look somewhat similar to message
passing, trying to use std.concurrency for such things is really shoehorning.
std.parallelism is actually designed for asynchronous function calls, but
currently has the disadvantage of being completely unsafe in that it allows
unchecked data sharing.

Also note that what I'm proposing would only be an additional feature for
std.parallelism, which would probably be called something like safeTask.  It
would still allow all the unchecked data sharing you could handle in @system
mode.
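
For reference, a minimal sketch of an asynchronous function call that returns
an object graph, written against std.parallelism as it exists in current
Phobos (an assumption: the API under review at the time may have differed).
This is the unchecked variant, not the proposed safeTask; Node and buildTree
are hypothetical names.

```d
import std.parallelism : task, taskPool;

// Hypothetical result type containing indirection.
struct Node
{
    string name;
    Node*[] children;
}

// Hypothetical stand-in for parsing a file into a complex data structure.
Node* buildTree(string filename)
{
    auto root = new Node(filename);
    root.children ~= new Node("child");
    return root;
}

void main()
{
    // Create the task and hand it to the global pool for execution.
    auto t = task!buildTree("data.txt");
    taskPool.put(t);
    // yieldForce waits for the task to finish and yields its return value.
    Node* root = t.yieldForce;
    assert(root.name == "data.txt");
    assert(root.children.length == 1);
}
```

Nothing here stops buildTree from also stashing a reference to the tree in
thread-local or global state, which is the safety gap the three conditions
above are meant to close.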

 In short, I don't think that the behavior of a thread exiting should be a
 motivating factor for design changes.  Does this gain anything over sending a
 message on exit?

It allows safely passing an arbitrarily complex object graph because the fact
that the thread is exiting means you're moving it from one thread to another
rather than sharing it.

Sep 11 2010

Sean Kelly <sean invisibleduck.org> writes:

dsimcha Wrote:
 
 I don't know the details of how TLS is implemented.  Would it be possible to
 reset all TLS data for a given thread?

Definitely for some platforms, possibly for all of them (perhaps requiring some
trickery).  Look at thread_entryPoint in core.thread for a clue on how TLS is
implemented.

Sep 11 2010

dsimcha <dsimcha yahoo.com> writes:

== Quote from Sean Kelly (sean invisibleduck.org)'s article
 dsimcha Wrote:

 I don't know the details of how TLS is implemented.  Would it be possible to
 reset all TLS data for a given thread?

 Definitely for some platforms, possibly for all of them (perhaps requiring some
 trickery).  Look at thread_entryPoint in core.thread for a clue on how TLS is
 implemented.

Interesting.  Now that I've given it more thought, though, using this is
probably a bad idea for lightweight task-based concurrency because resetting
TLS properly would require re-running thread-local module c'tors (arbitrarily
expensive), not just blitting a few things (cheap).  I guess I'll have to resort
to disallowing arbitrary indirection in return types in any safe subset of
std.parallelism.
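
To illustrate why a reset is more than a blit, here is a small sketch showing
that a thread-local module constructor (static this) runs once per thread, so
restoring a thread to its initial TLS state would mean running such code
again.  The names ctorRuns and cache are hypothetical.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

shared int ctorRuns;   // shared across threads: counts constructor runs
int[] cache;           // thread-local by default in D

// Thread-local module constructor: executed in every thread at startup.
static this()
{
    cache = new int[16];          // arbitrary work, could be expensive
    atomicOp!"+="(ctorRuns, 1);
}

void main()
{
    // The main thread has already run the constructor once.
    assert(atomicLoad(ctorRuns) == 1);

    auto t = new Thread({ assert(cache.length == 16); });
    t.start();
    t.join();

    // Starting a second thread ran the constructor a second time.
    assert(atomicLoad(ctorRuns) == 2);
}
```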

Sep 11 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-09-11 18:14:25 -0400, dsimcha <dsimcha yahoo.com> said:

 Interesting.  Now that I've given it more thought, though, using this is
 probably a bad idea for lightweight task-based concurrency because resetting
 TLS at all properly would require re-running thread-local module c'tors
 (arbitrarily expensive), not just blitting a few things (cheap).  I guess I'll
 have to resort to disallowing arbitrary indirection in return types in any
 safe subset of std.parallelism.

You could still allow it for pure functions.
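
One way to read this suggestion, sketched under the assumption that the D
compiler treats the result of a strongly pure function as unique: such a
function cannot stash references in thread-local or global state, so its
freshly allocated result converts implicitly to immutable and can then be
handed to another thread.  The names makeData and worker are hypothetical.

```d
import std.concurrency : spawn, send, receiveOnly, thisTid, Tid;

// Pure, with no mutable indirection in its parameters, so no other
// reference to the returned array can exist after the call.
int[] makeData(size_t n) pure
{
    auto a = new int[n];
    foreach (i, ref x; a)
        x = cast(int) i * 2;
    return a;
}

void worker(Tid owner, size_t n)
{
    // Implicit conversion to immutable is allowed for the pure result.
    immutable(int)[] result = makeData(n);
    owner.send(result);
}

void main()
{
    size_t n = 4;
    spawn(&worker, thisTid, n);
    auto data = receiveOnly!(immutable(int)[])();
    assert(data == [0, 2, 4, 6]);
}
```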

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 11 2010
