
digitalmars.D - std.concurrency: Returning from spawned function

reply dsimcha <dsimcha yahoo.com> writes:
I was thinking about ways to improve std.concurrency w/o compromising its
safety or the simplicity of what already works.  Isn't it unnecessarily
restrictive that a spawned function must return void?  Since the spawned
thread dies when the spawned function returns, the return value could safely
be moved to the owner thread.  Therefore, the return values wouldn't even have
to be immutable/shared/lacking indirection.  The return value could, for
example, be stored in Tid, with attempts to retrieve it blocking until the
spawned thread returns.

This would enable an important use of concurrency to be implemented safely and
efficiently.  Assume the input to a function is an immutable string specifying
a filename.  The output is a very complex data structure representing the data
in the file.  A main thread could then spawn a worker thread with the
immutable string for input, and get the data structure without using any
shared data or exposing the possibility of any race conditions.

Is there some reason I'm missing why this doesn't already work?
Sep 10 2010
parent reply Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:

 I was thinking about ways to improve std.concurrency w/o compromising its
 safety or the simplicity of what already works.  Isn't it unnecessarily
 restrictive that a spawned function must return void?  Since the spawned
 thread dies when the spawned function returns, the return value could safely
 be moved to the owner thread.  Therefore, the return values wouldn't even have
 to be immutable/shared/lacking indirection.  The return value could, for
 example, be stored in Tid, with attempts to retrieve it blocking until the
 spawned thread returns.

That each spawn() results in the creation of a thread whose lifetime ends when the function returns is an implementation detail. It could as easily be a thread pool that resets its TLS data when picking up a new operation, a user-space thread, etc. In short, I don't think that the behavior of a thread exiting should be a motivating factor for design changes. Does this gain anything over sending a message on exit?
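For reference, "sending a message on exit" might look roughly like the sketch below. It is only an illustration: `FileSummary`, `worker`, and `loadInWorker` are hypothetical names that do not come from the thread, and the "parsing" is a stand-in. Note that the result type here has no mutable indirection, which is exactly the restriction the original post wants to lift.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical result type for illustration. It has no mutable indirection,
// so std.concurrency accepts it as a message.
struct FileSummary
{
    size_t nameLength;
}

// Worker: does its work, then sends the result to the owner as its last act.
void worker(Tid owner, string filename)
{
    // Stand-in for real parsing of the file.
    owner.send(FileSummary(filename.length));
}

// Owner side: spawn the worker and block until its result message arrives.
FileSummary loadInWorker(string filename)
{
    spawn(&worker, thisTid, filename);
    return receiveOnly!FileSummary();
}

void main()
{
    auto summary = loadInWorker("data.txt");
    assert(summary.nameLength == 8);
}
```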
Sep 10 2010
next sibling parent Russel Winder <russel russel.org.uk> writes:

On Sat, 2010-09-11 at 00:52 -0400, Sean Kelly wrote:
 dsimcha Wrote:

 I was thinking about ways to improve std.concurrency w/o compromising its
 safety or the simplicity of what already works.  Isn't it unnecessarily
 restrictive that a spawned function must return void?  Since the spawned
 thread dies when the spawned function returns, the return value could safely
 be moved to the owner thread.  Therefore, the return values wouldn't even have
 to be immutable/shared/lacking indirection.  The return value could, for
 example, be stored in Tid, with attempts to retrieve it blocking until the
 spawned thread returns.

That each spawn() results in the creation of a thread whose lifetime ends when the function returns is an implementation detail. It could as easily be a thread pool that resets its TLS data when picking up a new operation, a user-space thread, etc. In short, I don't think that the behavior of a thread exiting should be a motivating factor for design changes. Does this gain anything over sending a message on exit?

I guess it is really a question of message passing versus data parallelism. Clearly, in a message-passing idiom, asynchronous function execution can (possibly should) always be handled by void functions. In a data-parallel context you generally want a function that returns a value.

The idiom here is to create a sequence and then to create a new sequence by applying a function to each element of the old sequence, delivering a value to the new sequence -- parallel arrays. Algorithmically, the computation of each result element is independent, even where non-local read accesses are allowed, so this is "embarrassingly parallel". How the computations map to threads and thence to processors is left as a runtime implementation issue.

C++0x doesn't really get this right. Chapel and X10 are getting close, but they are full PGAS (partitioned global address space) languages, so they should. Haskell, via DPH, is also getting there, as indeed is Java -- assuming Java 7 ever makes it into production.

I think my real point is that data parallelism shouldn't have to be manually constructed from asynchronous functions, as long as you have closures -- either explicit or implicit (as can be constructed with C++ and Java).

-- 
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
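The "parallel arrays" idiom described above can be sketched in D with `taskPool.amap` from std.parallelism as it exists in Phobos today (the module was still under review when this thread was written, so its API then may have differed). `square` and `squareAll` are hypothetical names used only for this illustration.

```d
import std.parallelism : taskPool;

// Element-wise function: each result depends only on its own input element.
double square(double x)
{
    return x * x;
}

// Builds a new array by applying `square` to every element of `input`.
// How the work is split across threads is left to the task pool.
double[] squareAll(double[] input)
{
    return taskPool.amap!square(input);
}

void main()
{
    assert(squareAll([1.0, 2.0, 3.0]) == [1.0, 4.0, 9.0]);
}
```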
Sep 11 2010
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 That each spawn() results in the creation of a thread whose lifetime ends when

Fair enough. I didn't realize this was just an implementation detail. In this case you're absolutely right.
 It could as easily be a thread pool that resets its TLS data when picking up a

I don't know the details of how TLS is implemented. Would it be possible to reset all TLS data for a given thread? If so, I could make my std.parallelism module that's currently in review (formerly parallelfuture for those who haven't been following the Phobos list) support **safe** asynchronous function calls for a limited but useful number of cases:

1. Like for std.concurrency, all arguments must be free of unshared aliasing.

2. The callable object must be a function pointer, not a delegate, template alias parameter or object w/ overloaded opCall.

3. If I could reset all TLS data in the worker thread upon returning, then I could allow for arbitrarily complex object graphs as the return type, since the worker thread could be guaranteed to not have any local references to the object after the task returned.

This would probably be better than using std.concurrency for these cases because even though asynchronous function calls look somewhat similar to message passing, trying to use std.concurrency for such things is really shoehorning. std.parallelism is actually designed for asynchronous function calls, but currently has the disadvantage of being completely unsafe in that it allows unchecked data sharing.

Also note that what I'm proposing would only be an additional feature for std.parallelism, which would probably be called something like safeTask. It would still allow all the unchecked data sharing you could handle in system mode.
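As a rough illustration of the kind of asynchronous function call being discussed, here is a sketch using `task`, `taskPool.put`, and `yieldForce` from std.parallelism as it exists in Phobos today. This is not the proposed safeTask, and nothing here checks for data sharing. `countWords` is a hypothetical function; it follows the constraints listed above (a function pointer, an immutable string argument) and returns a value with indirection.

```d
import std.array : split;
import std.parallelism : task, taskPool;

// Hypothetical worker function: takes an immutable string and returns a
// result that contains indirection (an associative array).
size_t[string] countWords(string text)
{
    size_t[string] counts;
    foreach (word; text.split())
        ++counts[word];
    return counts;
}

void main()
{
    // Create a task from a function pointer and hand it to the global pool.
    auto t = task(&countWords, "a b a");
    taskPool.put(t);

    // Blocks until the task has finished, then yields its return value.
    auto counts = t.yieldForce;
    assert(counts["a"] == 2);
    assert(counts["b"] == 1);
}
```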
 In short, I don't think that the behavior of a thread exiting should be a
 motivating factor for design changes. Does this gain anything over sending a
 message on exit?

It allows safely passing an arbitrarily complex object graph because the fact that the thread is exiting means you're moving it from one thread to another rather than sharing it.
Sep 11 2010
parent reply Sean Kelly <sean invisibleduck.org> writes:
dsimcha Wrote:
 
 I don't know the details of how TLS is implemented.  Would it be possible to
 reset all TLS data for a given thread?

Definitely for some platforms, possibly for all of them (perhaps requiring some trickery). Look at thread_entryPoint in core.thread for a clue on how TLS is implemented.
Sep 11 2010
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 dsimcha Wrote:
 I don't know the details of how TLS is implemented.  Would it be possible to
 reset all TLS data for a given thread?

 Definitely for some platforms, possibly for all of them (perhaps requiring
 some trickery). Look at thread_entryPoint in core.thread for a clue on how
 TLS is implemented.

Interesting. Now that I've given it more thought, though, using this is probably a bad idea for lightweight task-based concurrency because resetting TLS at all properly would require re-running thread-local module c'tors (arbitrarily expensive), not just blitting a few things (cheap). I guess I'll have to resort to disallowing arbitrary indirection in return types in any safe subset of std.parallelism.
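To make the cost argument concrete, the sketch below shows why a proper TLS reset involves module constructors: in D, module-level variables are thread-local by default, and a `static this()` block runs in every new thread to initialize them. `setting` and `readInNewThread` are hypothetical names used only for this illustration.

```d
import core.thread : Thread;

// Module-level variables are thread-local by default in D.
int setting;

// Thread-local module constructor: runs once in every thread that starts.
static this()
{
    setting = 42;
}

// Starts a fresh thread and reports the value of `setting` it observes.
int readInNewThread()
{
    int seen;
    auto t = new Thread({ seen = setting; });
    t.start();
    t.join();
    return seen;
}

void main()
{
    setting = 7; // changes only the main thread's copy

    // The new thread sees the value set by its own constructor run.
    assert(readInNewThread() == 42);
    assert(setting == 7);
}
```

A pool thread that wanted to look "fresh" for each task would have to redo whatever those constructors do, and they can contain arbitrary code.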
Sep 11 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-09-11 18:14:25 -0400, dsimcha <dsimcha yahoo.com> said:

 Interesting.  Now that I've given it more thought, though, using this is
 probably a bad idea for lightweight task-based concurrency because resetting
 TLS at all properly would require re-running thread-local module c'tors
 (arbitrarily expensive), not just blitting a few things (cheap).  I guess
 I'll have to resort to disallowing arbitrary indirection in return types in
 any safe subset of std.parallelism.

You could still allow it for pure functions.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
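A short sketch of why purity helps, assuming the behavior of current D compilers (which may differ from the compiler of 2010): a `pure` function cannot read or write mutable global or thread-local variables, so it has nowhere to stash a reference to its result. Current compilers even let such a freshly built result convert implicitly to `immutable`. `makeSquares` is a hypothetical function used only for illustration.

```d
// Pure function whose parameter has no indirection: the array it returns
// cannot be referenced from any global or thread-local variable.
int[] makeSquares(size_t n) pure
{
    auto result = new int[n];
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

void main()
{
    // Accepted because the compiler can treat the result as unique.
    immutable int[] squares = makeSquares(4);
    assert(squares == [0, 1, 4, 9]);
}
```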
Sep 11 2010