digitalmars.D - std.concurrency and fibers

Alex Rønne Petersen <alex lycus.org> writes:
Hi,

We currently have std.concurrency as a message-passing mechanism. We 
encourage people to use it instead of OS threads, which is great. 
However, what is *not* great is that spawned tasks correspond 1:1 to OS 
threads. This is not even remotely scalable for Erlang-style 
concurrency. There's a fairly simple way to fix that: Fibers.
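
For readers unfamiliar with the module, here is a minimal sketch of the API as it stands. `spawn`, `send`, `receive`, `receiveOnly` and `ownerTid` are the real std.concurrency names; `worker` and `doubleViaTask` are made up for illustration. Each `spawn` call below is backed by a new OS thread, which is the 1:1 mapping described above.

```d
import std.concurrency;

// Hypothetical worker: waits for one int and sends back its double.
void worker()
{
    receive((int n) { ownerTid.send(n * 2); });
}

// Hypothetical helper: spawns one task, sends it a value, waits for the reply.
int doubleViaTask(int n)
{
    Tid tid = spawn(&worker); // currently starts a full OS thread
    tid.send(n);
    return receiveOnly!int();
}

unittest
{
    assert(doubleViaTask(21) == 42);
}
```

Nothing in the calling code mentions threads apart from the name `Tid`, which is why swapping the backing mechanism looks tempting.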

The only problem with adding fiber support to std.concurrency is that 
the interface is just not flexible enough. The current interface is 
completely and entirely tied to the notion of threads (contrary to what 
its module description says).

Now, I see a number of ways we can fix this:

A) We completely get rid of the notion of threads and instead simply 
speak of 'tasks'. This trivially allows us to use threads, fibers, 
whatever to back the module. I personally think this is the best way to 
build a message-passing abstraction because it gives enough transparency 
to *actually* distribute tasks across machines without things breaking.
B) We make the module capable of backing tasks with both threads and 
fibers, and expose an interface that allows the user to choose what kind 
of task is spawned. I'm *not* convinced this is a good approach because 
it's extremely error-prone (imagine doing a thread-based receive inside 
a fiber-based task!).
C) We just swap out threads with fibers and document that the module 
uses fibers. See my comments in A for why I'm not sure this is a good idea.

All of these are going to break code in one way or another - that's 
unavoidable. But we really need to make std.concurrency grow up; other 
languages (Erlang, Rust, Go, ...) have had micro-threads (in some form) 
for years, and if we want D to be seriously usable for large-scale 
concurrency, we need to have them too.

Thoughts? Other ideas?

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
+1, but what about TLS?
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 04-10-2012 14:11, Timon Gehr wrote:
 On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
+1, but what about TLS?
I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance).

Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
 On 04-10-2012 14:11, Timon Gehr wrote:
 On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
+1, but what about TLS?
I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance). Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.
If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.

What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 04-10-2012 14:48, Timon Gehr wrote:
 On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
 On 04-10-2012 14:11, Timon Gehr wrote:
 On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
+1, but what about TLS?
I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance). Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.
If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.
I suppose it could be done. But keep in mind the side-effects of an approach like this: Some thread-local variables (for instance, think 'chunk' inside emplace) would break (or at least behave very weirdly) if you switch the *entire* TLS context when entering a task. Sure, we could use the runtime interface for TLS switching only for task-local state, but then we're back to square one with it not being seamless.
 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.
Yeah, I never understood why. It's essential for functional-style code running in constrained tasks. It's not just about conserving memory; it's to make recursion feasible.

In any case, fibers currently allocate PAGE_SIZE * 4 bytes for stacks.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
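
To make the stack-size point concrete: core.thread.Fiber takes the stack size as a constructor argument, so a fiber can be given something other than the default. A minimal sketch, where the 64 KiB figure and the name `runOnSmallStack` are arbitrary choices for illustration:

```d
import core.thread : Fiber;

// Hypothetical helper: runs a two-step fiber on an explicitly sized stack.
int runOnSmallStack()
{
    int result;
    auto fib = new Fiber({
        result = 1;
        Fiber.yield(); // hand control back to the caller
        result = 2;
    }, 64 * 1024);     // stack size in bytes, fixed for the fiber's lifetime

    fib.call(); // runs until the yield
    fib.call(); // resumes and runs to completion
    return result;
}

unittest
{
    assert(runOnSmallStack() == 2);
}
```

Whatever size is chosen is fixed up front, which is why deep recursion inside a fiber is a concern without some form of stack growth.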
Oct 04 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 04-Oct-12 16:48, Timon Gehr wrote:
 On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
 On 04-10-2012 14:11, Timon Gehr wrote:
[snip]
 I think that no matter what we do, we have to simply say "don't do that"
 to thread-local state (it would break in distributed scenarios too, for
 instance).

 Instead, I think we should do what the Rust folks did: Use *task*-local
 state and leave it up to std.concurrency to figure out how to deal with
 it. It won't be as 'seamless' as TLS variables in D of course, but I
 think it's good enough in practice.
If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.
Agreed.
 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.
Allocating a fixed-size stack is costly only in terms of virtual address space, and running out of address space is a concern on 32-bit only. On 64-bit you may as well reserve 1 GB per task; it will only get committed if it's used.

-- 
Dmitry Olshansky
Oct 04 2012
Sean Kelly <sean invisibleduck.org> writes:
On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:
 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.
This is another reason I've been delaying using fibers. The correct approach is probably to go the distance by reserving a large block, committing only a portion, and commit the rest dynamically as needed. The current fiber implementation does have a guard page in some cases, but doesn't go so far as to reserve/commit portions of a larger stack space.
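
One way to read the reserve-then-commit idea, sketched for POSIX systems only. This is an illustration, not how core.thread.Fiber allocates its stacks: `reserveStack` and `demoReserve` are hypothetical names, the page counts are arbitrary, and the step of committing further pages on demand (which would need a fault handler or compiler-inserted checks) is left out.

```d
import core.sys.posix.sys.mman;
import core.sys.posix.unistd : sysconf, _SC_PAGESIZE;

// Reserve `reservePages` pages of address space with no access rights, then
// make only the top `commitPages` readable and writable (stacks grow downward).
// Returns the base of the reservation, or null on failure.
ubyte* reserveStack(size_t reservePages, size_t commitPages, out size_t pageSize)
{
    pageSize = cast(size_t) sysconf(_SC_PAGESIZE);
    immutable total = reservePages * pageSize;

    void* p = mmap(null, total, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return null;

    auto base = cast(ubyte*) p;
    auto top = base + (reservePages - commitPages) * pageSize;
    if (mprotect(top, commitPages * pageSize, PROT_READ | PROT_WRITE) != 0)
    {
        munmap(p, total);
        return null;
    }
    return base;
}

// Touches the last byte of the usable region to show that it is accessible.
bool demoReserve()
{
    enum size_t reservePages = 256, commitPages = 4;
    size_t pageSize;
    auto base = reserveStack(reservePages, commitPages, pageSize);
    if (base is null)
        return false;

    immutable total = reservePages * pageSize;
    base[total - 1] = 42;
    immutable ok = base[total - 1] == 42;
    munmap(base, total);
    return ok;
}

unittest
{
    assert(demoReserve());
}
```

The pages below the committed region stay inaccessible, so touching them faults rather than silently corrupting memory; turning that fault into "commit one more page" is the part that needs runtime or compiler cooperation.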
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 05-10-2012 01:34, Sean Kelly wrote:
 On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:
 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.
This is another reason I've been delaying using fibers. The correct approach is probably to go the distance by reserving a large block, committing only a portion, and commit the rest dynamically as needed. The current fiber implementation does have a guard page in some cases, but doesn't go so far as to reserve/commit portions of a larger stack space.
I think we'd need compiler support to be able to do it in a reasonable way at all. Doing it via OS virtual memory hacks seems like a bad idea to me.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
 [snip]

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.
Cool, but currently it's a leaky abstraction. For instance if task is implemented with fibers static variables will be shared among threads. Essentially I think Fibers need TLS (or rather FLS) synced with language 'static' keyword. Otherwise the whole TLS by default is a useless chunk of machinery.
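
A small sketch of the leak described here, using only core.thread.Fiber. The names `counter`, `bump` and `bumpFromTwoFibers` are made up for illustration. A module-level variable is thread-local by default, but two fibers running on the same thread see the same instance:

```d
import core.thread : Fiber;

int counter; // thread-local by default, but NOT fiber-local

void bump()
{
    counter++;
}

// Hypothetical helper: runs two independent fibers on the calling thread.
int bumpFromTwoFibers()
{
    counter = 0;
    auto a = new Fiber(&bump);
    auto b = new Fiber(&bump);
    a.call();
    b.call();
    return counter; // both fibers incremented the same variable
}

unittest
{
    assert(bumpFromTwoFibers() == 2);
}
```

Had `bump` been run via two spawned threads instead, each would have incremented its own copy, so code that relies on that isolation changes behaviour once tasks become fibers.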
 B) We make the module capable of backing tasks with both  threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).
Bleh.
 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good idea.
Seems a lot like A but with task defined to be a fiber. I'd prefer this. However then it needs a user-defined policy for distributing fibers across real threads (pools). Btw A is full of this too.
 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?
+1

-- 
Dmitry Olshansky
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 04-10-2012 22:04, Dmitry Olshansky wrote:
 On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
 [snip]

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.
Cool, but currently it's a leaky abstraction. For instance if task is implemented with fibers static variables will be shared among threads. Essentially I think Fibers need TLS (or rather FLS) synced with language 'static' keyword. Otherwise the whole TLS by default is a useless chunk of machinery.
Yeah, it's a problem all right. But we'll need compiler support for this stuff in any case.

Can't help but wonder if it's really worth it. It seems to me like a simple AA-like API based on the typeid of data would be better -- as in, much more generic -- than trying to teach the compiler and runtime how to deal with this stuff. Think something like this:

    struct Data
    {
        int foo;
        float bar;
    }

    void myTask()
    {
        auto data = Data(42, 42.42f);

        TaskStore.save(data);

        // work ...

        foo();

        // work ...
    }

    void foo()
    {
        auto data = TaskStore.load!Data();

        // work ...
    }

I admit, not as seamless as static variables, but a hell of a lot less magical.
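
`TaskStore` does not exist in Phobos; it is only proposed in this post. As a rough sketch of what such an AA-like store could look like, assuming one slot per type keyed on `typeid` and using `std.variant.Variant` for the values (both are assumptions, as is the helper `roundTrip`):

```d
import std.variant : Variant;

struct Data
{
    int foo;
    float bar;
}

// Hypothetical store. The table below is an ordinary static, so it is
// per-thread here; a fiber-aware std.concurrency would have to keep one
// table per task and switch between them.
struct TaskStore
{
    private static Variant[TypeInfo] slots;

    static void save(T)(T value)
    {
        slots[typeid(T)] = Variant(value);
    }

    static T load(T)()
    {
        return slots[typeid(T)].get!T;
    }
}

// Saves a value in one place and loads it back by type in another.
int roundTrip()
{
    TaskStore.save(Data(42, 42.42f));
    return TaskStore.load!Data().foo;
}

unittest
{
    assert(roundTrip() == 42);
}
```

Keying on the type means each task can hold at most one value per type, which is the trade-off of this particular shape of API.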
 B) We make the module capable of backing tasks with both  threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).
Bleh.
 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good
 idea.
Seems a lot like A but with task defined to be a fiber. I'd prefer this. However then it needs a user-defined policy for distributing fibers across real threads (pools). Btw A is full of this too.
By choosing C we effectively give up any hope of distributed tasks and especially if we have a scheduler API. Is that really a good idea in this day and age?
 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?
+1
-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 05-Oct-12 08:27, Alex Rønne Petersen wrote:
 On 04-10-2012 22:04, Dmitry Olshansky wrote:
 Cool, but currently it's a leaky abstraction. For instance if task is
 implemented with fibers static variables will be shared among threads.
 Essentially I think Fibers need TLS (or rather FLS) synced with language
 'static' keyword. Otherwise the whole TLS by default is a useless chunk
 of machinery.
 Yeah, it's a problem all right. But we'll need compiler support for this stuff in any case.

 Can't help but wonder if it's really worth it. It seems to me like a simple AA-like API based on the typeid of data would be better -- as in, much more generic than trying to teach the compiler and runtime how to deal with this stuff. Think something like this:

     struct Data
     {
         int foo;
         float bar;
     }

     void myTask()
     {
         auto data = Data(42, 42.42f);

         TaskStore.save(data);

         // work ...

         foo();

         // work ...
     }

     void foo()
     {
         auto data = TaskStore.load!Data();

         // work ...
     }

 I admit, not as seamless as static variables, but a hell of a lot less magical.
This just doesn't work though.

The true problem is not in the code that you, as a programmer doing distributed stuff, write. It's library writers that typically use TLS for some persistent state inside a module, and D currently makes that easy and transparent, just like in the old non-MT days, but for threads ONLY. Having them all pack up their stuff and go about converting globals to TaskStore.save/.load is not realistic and downright horrible.

Currently I suspect that, w.r.t. Fibers, whatever works does so by convention and luck.

One problem with making everything FLS is that the cost becomes darn high. On the other hand, Fibers are yielded only manually (+scheduler now? probably on receive/send etc.) and a lot of things can be "fiber-safe" as is.

Also it seems like for this to work we need not only a scheduler but reworked libraries that are fiber-aware (so they don't block on I/O etc.). See e.g. vibe.d.
 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good
 idea.
Seems a lot like A but with task defined to be a fiber. I'd prefer this. However then it needs a user-defined policy for distributing fibers across real threads (pools). Btw A is full of this too.
By choosing C we effectively give up any hope of distributed tasks and especially if we have a scheduler API. Is that really a good idea in this day and age?
Why? Remote fibers should do for distributed tasks. Like I said, just make Fiber == task. As long as there is a suitable protocol for communication, it's all right.

I'm insisting on fiber as a task because this makes for simpler message-passing logic. And a scheduler is still inevitable, as fibers wait for messages and are multiplexed onto only so many threads. I just don't see what other abstraction you want to put in place of task. It should be a self-contained, persistent 'worker' so that message passing works transparently.

-- 
Dmitry Olshansky
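
As a toy illustration of the multiplexing mentioned above, here is a single-threaded round-robin loop over fibers. It is a sketch only: `makeTask` and `runRoundRobin` are hypothetical names, a real scheduler would yield on send/receive rather than after every step, and it would spread fibers over a pool of threads.

```d
import core.thread : Fiber;

// Hypothetical task: appends its name to the log twice, yielding after each step.
Fiber makeTask(string name, string[]* log)
{
    return new Fiber({
        foreach (step; 0 .. 2)
        {
            *log ~= name;
            Fiber.yield(); // give the scheduler a chance to run another task
        }
    });
}

// Runs all tasks on the calling thread, one step at a time, until all finish.
string[] runRoundRobin()
{
    string[] log;
    Fiber[] tasks = [makeTask("a", &log), makeTask("b", &log)];

    bool anyAlive = true;
    while (anyAlive)
    {
        anyAlive = false;
        foreach (t; tasks)
        {
            if (t.state != Fiber.State.TERM)
            {
                t.call();
                anyAlive = true;
            }
        }
    }
    return log;
}

unittest
{
    assert(runRoundRobin() == ["a", "b", "a", "b"]);
}
```

The interleaved log shows two tasks making progress on one thread, which is the property that lets many tasks share few threads.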
Oct 05 2012
Johannes Pfau <nospam example.com> writes:
On Fri, 05 Oct 2012 12:58:18 +0400, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 The true problem is not in the code you as a programmer doing
 distributed stuff do.
 It's library writers that typically use TLS for some persistent state 
 inside module
 and D currently makes it easy and transparent just like in the old
 non-MT days but for threads ONLY.
We should probably do some analysis on the Phobos source code to see if this really is the case. I thought TLS is mainly used to avoid threading issues, which works for Fibers. Things like the thread-local RNG variables work fine with the usual TLS, and even if the Fiber is passed between different threads, this still works well.

I think we'd only have problems with APIs which leave TLS variables in an inconsistent state between calls to functions. But I always thought such behavior doesn't fit TLS variables well and should be abstracted into a struct with member variables as state. In the end, isn't 'global TLS' state just as bad as global state in C, and something to be avoided?
Oct 05 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 04, 2012 13:32:01 Alex Rønne Petersen wrote:
 Thoughts? Other ideas?
std.concurrency is supposed to be designed such that it can be used for more than just threads (e.g. sending messages across the network), so if it needs to be adjusted to accommodate that, then we should do so, but we need to be careful to do it in a way that minimizes code breakage as much as reasonably possible.

- Jonathan M Davis
Oct 04 2012
Sean Kelly <sean invisibleduck.org> writes:
On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:

 [snip]

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).
How is the interface tied to the notion of threads? I had hoped to design it with the underlying concurrency mechanism completely abstracted.

The most significant reason that fibers aren't used behind the scenes today is because the default storage class of static data is thread-local, and this would really have to be made fiber-local. I'm reasonably certain this could be done and have considered going so far as to make the main thread in D a fiber, but the implementation is definitely non-trivial and will probably be slower than the built-in TLS mechanism as well.

So consider the current std.concurrency implementation to be a prototype. I'd also like to add interprocess messaging, but that will be another big task.
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 05-10-2012 01:30, Sean Kelly wrote:
 On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:

 [snip]
How is the interface tied to the notion of threads? I had hoped to design it with the underlying concurrency mechanism completely abstracted. The most significant reason that fibers aren't used behind the scenes today is because the default storage class of static data is thread-local, and this would really have to be made fiber-local. I'm reasonably certain this could be done and have considered going so far as to make the main thread in D a fiber, but the implementation is definitely non-trivial and will probably be slower than the built-in TLS mechanism as well. So consider the current std.concurrency implementation to be a prototype. I'd also like to add interprocess messaging, but that will be another big task.
Mostly in that everything operates on Tids (as opposed to some opaque Cid type) and, as you mentioned, TLS. The problem is basically that people have gotten used to std.concurrency always using OS threads due to subtle things like that from day one.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
Sean Kelly <sean invisibleduck.org> writes:
On Oct 4, 2012, at 9:18 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 05-10-2012 01:30, Sean Kelly wrote:
 [snip]
 Mostly in that everything operates on Tids (as opposed to some opaque Cid type) and, as you mentioned, TLS. The problem is basically that people have gotten used to std.concurrency always using OS threads due to subtle things like that from day one.
A Tid is a Cid and in the first iteration I actually named it Cid and was asked to change it. Tid seems reasonable since it represents a logical thread anyway. It just may not actually be a kernel thread.

I think we have to make TLS work for fibers or using them isn't an option. It would be ridiculous to say "D has this cool new idea about statics but you can't use it if you're using the standard concurrency package."
Oct 05 2012
deadalnix <deadalnix gmail.com> writes:
On 04/10/2012 13:32, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
Something I've wondered for a while: why not run everything in fibers?
Oct 04 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 05-10-2012 04:14, deadalnix wrote:
 On 04/10/2012 13:32, Alex Rønne Petersen wrote:
 [snip]
 Thoughts? Other ideas?
 Something I've wondered for a while: why not run everything in fibers?
Because then we definitely need dynamic stack growth wired into both the compiler and the runtime. Not impossible, but there's a *lot* of effort required (and convincing, in Walter's case).

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012