digitalmars.D - std.concurrency and fibers

Alex Rønne Petersen (33/33) Oct 04 2012 Hi,

Timon Gehr (2/31) Oct 04 2012 +1, but what about TLS?

Alex Rønne Petersen (12/51) Oct 04 2012 I think that no matter what we do, we have to simply say "don't do that"...

Timon Gehr (6/54) Oct 04 2012 If it is not seamless, we have failed. IMO the runtime should expose an

Alex Rønne Petersen (17/76) Oct 04 2012 I suppose it could be done.
Dmitry Olshansky (9/26) Oct 04 2012 Agreed.
Sean Kelly (7/10) Oct 04 2012 This is another reason I've been delaying using fibers. The correct ...

Alex Rønne Petersen (8/13) Oct 04 2012 I think we'd need compiler support to be able to do it in a reasonable

Dmitry Olshansky (13/42) Oct 04 2012 Cool, but currently it's a leaky abstraction. For instance if task is

Alex Rønne Petersen (35/82) Oct 04 2012 Yeah, it's a problem all right. But we'll need compiler support for this...

Dmitry Olshansky (30/73) Oct 05 2012 This just doesn't work though.

Johannes Pfau (12/18) Oct 05 2012 We should probably do some analysis on the phobos source code to see if

Jonathan M Davis (11/12) Oct 04 2012 std.concurrency is supposed to be designed such that it can be used for...
Sean Kelly (20/25) Oct 04 2012 encourage people to use it instead of OS threads, which is great. ...

Alex Rønne Petersen (9/16) Oct 04 2012 Mostly in that everything operates on Tids (as opposed to some opaque

Sean Kelly (27/39) Oct 05 2012 ourage people to use it instead of OS threads, which is great. However, ...

deadalnix (2/31) Oct 04 2012 Something I wonder for a while : why not run everything in fibers ?

Alex Rønne Petersen (9/48) Oct 04 2012 Because then we definitely need dynamic stack growth wired into both the...
Alex Rønne Petersen <alex lycus.org> writes:

Hi,

We currently have std.concurrency as a message-passing mechanism. We 
encourage people to use it instead of OS threads, which is great. 
However, what is *not* great is that spawned tasks correspond 1:1 to OS 
threads. This is not even remotely scalable for Erlang-style 
concurrency. There's a fairly simple way to fix that: Fibers.
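
For reference, a minimal sketch of the message-passing style in question, using only spawn, send, receiveOnly and ownerTid from std.concurrency. The names worker and squareViaTask are made up for illustration. The point is only that, with the thread-backed implementation discussed here, every spawn call costs one OS thread:

```d
import std.concurrency : Tid, ownerTid, receiveOnly, send, spawn;
import std.stdio : writeln;

// Hypothetical worker: waits for one int and replies with its square.
void worker()
{
    int n = receiveOnly!int();
    ownerTid.send(n * n);
}

// Hypothetical helper: spawns one task, sends it a value, waits for the reply.
int squareViaTask(int n)
{
    Tid tid = spawn(&worker); // backed by a full OS thread in the implementation under discussion
    tid.send(n);
    return receiveOnly!int();
}

void main()
{
    writeln(squareViaTask(7)); // prints 49
}
```

Nothing in the calling code mentions threads explicitly, which is why swapping the backing mechanism looks tempting in the first place.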

The only problem with adding fiber support to std.concurrency is that 
the interface is just not flexible enough. The current interface is 
completely and entirely tied to the notion of threads (contrary to what 
its module description says).

Now, I see a number of ways we can fix this:

A) We completely get rid of the notion of threads and instead simply 
speak of 'tasks'. This trivially allows us to use threads, fibers, 
whatever to back the module. I personally think this is the best way to 
build a message-passing abstraction because it gives enough transparency 
to *actually* distribute tasks across machines without things breaking.
B) We make the module capable of backing tasks with both threads and 
fibers, and expose an interface that allows the user to choose what kind 
of task is spawned. I'm *not* convinced this is a good approach because 
it's extremely error-prone (imagine doing a thread-based receive inside 
a fiber-based task!).
C) We just swap out threads with fibers and document that the module 
uses fibers. See my comments in A for why I'm not sure this is a good idea.
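
As a rough illustration of what backing tasks with fibers amounts to (closest to option C), here is a sketch that multiplexes several core.thread.Fiber objects onto a single OS thread with a naive round-robin loop. The names makeTask, runAll, demo and trace are hypothetical and not part of Phobos; a real scheduler would also have to handle blocking receives and multiple worker threads:

```d
import core.thread : Fiber;
import std.conv : to;
import std.stdio : writeln;

string[] trace; // thread-local; every fiber below runs on the calling thread

// Hypothetical task: records two steps, yielding to the scheduler after each.
Fiber makeTask(string name)
{
    return new Fiber({
        foreach (step; 0 .. 2)
        {
            trace ~= name ~ to!string(step);
            Fiber.yield();
        }
    });
}

// Naive round-robin scheduler: resume every unfinished fiber until all are done.
void runAll(Fiber[] tasks)
{
    bool anyAlive = true;
    while (anyAlive)
    {
        anyAlive = false;
        foreach (t; tasks)
        {
            if (t.state != Fiber.State.TERM)
            {
                t.call();
                anyAlive = true;
            }
        }
    }
}

string[] demo()
{
    trace = null;
    runAll([makeTask("a"), makeTask("b")]);
    return trace;
}

void main()
{
    writeln(demo()); // ["a0", "b0", "a1", "b1"]
}
```

The two tasks interleave without any additional OS thread being created, which is the scalability argument in a nutshell.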

All of these are going to break code in one way or another - that's 
unavoidable. But we really need to make std.concurrency grow up; other 
languages (Erlang, Rust, Go, ...) have had micro-threads (in some form) 
for years, and if we want D to be seriously usable for large-scale 
concurrency, we need to have them too.

Thoughts? Other ideas?

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Timon Gehr <timon.gehr gmx.ch> writes:

+1, but what about TLS?

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 04-10-2012 14:11, Timon Gehr wrote:

 +1, but what about TLS?

I think that no matter what we do, we have to simply say "don't do that" 
to thread-local state (it would break in distributed scenarios too, for 
instance).

Instead, I think we should do what the Rust folks did: Use *task*-local 
state and leave it up to std.concurrency to figure out how to deal with 
it. It won't be as 'seamless' as TLS variables in D of course, but I 
think it's good enough in practice.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Timon Gehr <timon.gehr gmx.ch> writes:

On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:

 I think that no matter what we do, we have to simply say "don't do that"
 to thread-local state (it would break in distributed scenarios too, for
 instance).

 Instead, I think we should do what the Rust folks did: Use *task*-local
 state and leave it up to std.concurrency to figure out how to deal with
 it. It won't be as 'seamless' as TLS variables in D of course, but I
 think it's good enough in practice.

If it is not seamless, we have failed. IMO the runtime should expose an
interface for allocating TLS, switching between TLS instances and
destroying TLS.

What about the stack? Allocating a fixed-size stack per task is costly
and Walter opposes dynamic stack growth.

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 04-10-2012 14:48, Timon Gehr wrote:

 If it is not seamless, we have failed. IMO the runtime should expose an
 interface for allocating TLS, switching between TLS instances and
 destroying TLS.

I suppose it could be done.

But keep in mind the side-effects of an approach like this: Some 
thread-local variables (for instance, think 'chunk' inside emplace) 
would break (or at least behave very weirdly) if you switch the *entire* 
TLS context when entering a task.

Sure, we could use the runtime interface for TLS switching only for 
task-local state, but then we're back to square one with it not being 
seamless.

 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.

Yeah, I never understood why. It's essential for functional-style code 
running in constrained tasks. It's not just about conserving memory; 
it's to make recursion feasible.

In any case, fibers currently allocate PAGE_SIZE * 4 bytes for stacks.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 04-Oct-12 16:48, Timon Gehr wrote:
 On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
 On 04-10-2012 14:11, Timon Gehr wrote:


[snip]
 I think that no matter what we do, we have to simply say "don't do that"
 to thread-local state (it would break in distributed scenarios too, for
 instance).

 Instead, I think we should do what the Rust folks did: Use *task*-local
 state and leave it up to std.concurrency to figure out how to deal with
 it. It won't be as 'seamless' as TLS variables in D of course, but I
 think it's good enough in practice.

 If it is not seamless, we have failed. IMO the runtime should expose an
 interface for allocating TLS, switching between TLS instances and
 destroying TLS.

Agreed.

 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.

Allocating a fixed-size stack is costly only in terms of virtual address 
space. Running out of address space is then a concern on 32-bit systems only. 
On 64-bit you may as well reserve 1 GB per task; memory will only get 
committed if it's actually used.

-- 
Dmitry Olshansky

Oct 04 2012

Sean Kelly <sean invisibleduck.org> writes:

On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:

 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.

This is another reason I've been delaying using fibers.  The correct
approach is probably to go the distance by reserving a large block,
committing only a portion, and commit the rest dynamically as needed.
The current fiber implementation does have a guard page in some cases,
but doesn't go so far as to reserve/commit portions of a larger stack
space.
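
To make the reserve/commit idea concrete, here is a POSIX-only sketch (it will not compile on Windows) using mmap and mprotect from core.sys.posix.sys.mman. The helper name reserveStack and the sizes are assumptions for illustration; this is not how the current Fiber implementation allocates its stack, and growing the committed region on demand would additionally need a fault handler, which is not shown:

```d
import core.sys.posix.sys.mman;
import std.stdio : writeln;

/// Hypothetical helper: reserve `reserved` bytes of address space and make
/// only the top `committed` bytes usable (stacks grow downward on common
/// platforms). Both sizes are assumed to be multiples of the page size.
/// Returns null on failure.
void* reserveStack(size_t reserved, size_t committed)
{
    // PROT_NONE: the range is reserved, but any access to it faults.
    void* base = mmap(null, reserved, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return null;

    // Make only the highest part of the range readable and writable.
    void* usable = cast(ubyte*) base + (reserved - committed);
    if (mprotect(usable, committed, PROT_READ | PROT_WRITE) != 0)
    {
        munmap(base, reserved);
        return null;
    }
    return base;
}

void main()
{
    enum size_t reserved = 1024 * 1024; // 1 MiB of address space
    enum size_t committed = 64 * 1024;  // 64 KiB usable from the start

    void* base = reserveStack(reserved, committed);
    assert(base !is null);

    // The top of the range is usable; the rest stays inaccessible.
    ubyte* top = cast(ubyte*) base + reserved;
    *(top - 1) = 42;
    writeln(*(top - 1));

    munmap(base, reserved);
}
```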

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 05-10-2012 01:34, Sean Kelly wrote:
 On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:
 What about the stack? Allocating a fixed-size stack per task is costly
 and Walter opposes dynamic stack growth.

 This is another reason I've been delaying using fibers.  The correct approach
is probably to go the distance by reserving a large block, committing only a
portion, and commit the rest dynamically as needed.  The current fiber
implementation does have a guard page in some cases, but doesn't go so far as
to reserve/commit portions of a larger stack space.

I think we'd need compiler support to be able to do it in a reasonable 
way at all. Doing it via OS virtual memory hacks seems like a bad idea 
to me.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

 Now, I see a number of ways we can fix this:

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.

Cool, but currently it's a leaky abstraction. For instance, if a task is 
implemented with fibers, static variables will be shared among all tasks 
running on the same thread.
Essentially I think Fibers need TLS (or rather FLS) synced with the language's 
'static' keyword. Otherwise the whole TLS-by-default is a useless chunk 
of machinery.
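
The leak is easy to reproduce with plain core.thread.Fiber. In this small sketch (the names counter and runTwoFibers are made up for illustration), a module-level variable is thread-local by default, yet both fibers see and modify the same instance because they run on the same thread:

```d
import core.thread : Fiber;
import std.stdio : writeln;

int counter; // thread-local by default in D, but NOT fiber-local

int runTwoFibers()
{
    counter = 0;
    auto a = new Fiber({ counter += 1; Fiber.yield(); counter += 1; });
    auto b = new Fiber({ counter += 10; });

    a.call(); // counter == 1, then a yields
    b.call(); // b sees a's update: counter == 11
    a.call(); // a resumes and sees b's update: counter == 12
    return counter;
}

void main()
{
    writeln(runTwoFibers()); // prints 12
}
```

Had the two bodies been passed to spawn on separate threads instead, each would have worked on its own copy of counter.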

 B) We make the module capable of backing tasks with both  threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).

Bleh.

 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good idea.

Seems a lot like A but with task defined to be a fiber. I'd prefer this. 
However then it needs a user-defined policy for distributing fibers 
across real threads (pools). Btw A is full of this too.

 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?

+1

-- 
Dmitry Olshansky

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 04-10-2012 22:04, Dmitry Olshansky wrote:
 On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

 Now, I see a number of ways we can fix this:

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.

 Cool, but currently it's a leaky abstraction. For instance if task is
 implemented with fibers static variables will be shared among threads.
 Essentially I think Fibers need TLS (or rather FLS) synced with language
 'static' keyword. Otherwise the whole TLS by default is a useless chunk
 of machinery.

Yeah, it's a problem all right. But we'll need compiler support for this 
stuff in any case.

Can't help but wonder if it's really worth it. It seems to me like a 
simple AA-like API based on the typeid of data would be better -- as in, 
much more generic -- than trying to teach the compiler and runtime how 
to deal with this stuff.

Think something like this:

struct Data
{
     int foo;
     float bar;
}

void myTask()
{
     auto data = Data(42, 42.42f);

     TaskStore.save(data);

     // work ...

     foo();

     // work ...
}

void foo()
{
     auto data = TaskStore.load!Data();

     // work ...
}

I admit, not as seamless as static variables, but a hell of a lot less 
magical.
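
For what it's worth, a runnable sketch of how such a store could look when tasks are fibers. Everything here is an assumption for illustration: Task is a hypothetical Fiber subclass carrying its own slots, TaskStore is the hypothetical API from the snippet above, and values are keyed by the type's name via typeid:

```d
import core.thread : Fiber;
import std.stdio : writeln;
import std.variant : Variant;

struct Data
{
    int foo;
    float bar;
}

// Hypothetical task type: a fiber that carries its own type-keyed storage.
class Task : Fiber
{
    Variant[string] slots;

    this(void delegate() dg)
    {
        super(dg);
    }
}

// Hypothetical task-local store, keyed by the type name of the stored value.
struct TaskStore
{
    static void save(T)(T value)
    {
        current().slots[typeid(T).toString()] = Variant(value);
    }

    static T load(T)()
    {
        return current().slots[typeid(T).toString()].get!T;
    }

    private static Task current()
    {
        auto task = cast(Task) Fiber.getThis();
        assert(task !is null, "TaskStore used outside of a Task");
        return task;
    }
}

int[] demo()
{
    int[] seen;
    auto a = new Task({
        TaskStore.save(Data(1, 1.5f));
        Fiber.yield();
        seen ~= TaskStore.load!Data().foo;
    });
    auto b = new Task({
        TaskStore.save(Data(2, 2.5f));
        Fiber.yield();
        seen ~= TaskStore.load!Data().foo;
    });

    // Interleave the two tasks; each one still reads back its own Data.
    a.call();
    b.call();
    a.call();
    b.call();
    return seen;
}

void main()
{
    writeln(demo()); // [1, 2]
}
```

The trade-off is the one already mentioned: explicit save/load calls instead of plain static variables, in exchange for not needing any compiler or runtime support.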

 B) We make the module capable of backing tasks with both  threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).

 Bleh.

 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good
 idea.

 Seems a lot like A but with task defined to be a fiber. I'd prefer this.
 However then it needs a user-defined policy for distributing fibers
 across real threads (pools). Btw A is full of this too.

By choosing C we effectively give up any hope of distributed tasks, 
especially if we have a scheduler API. Is that really a good idea in 
this day and age?

 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?

 +1


-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 05-Oct-12 08:27, Alex Rønne Petersen wrote:
 On 04-10-2012 22:04, Dmitry Olshansky wrote:

 Cool, but currently it's a leaky abstraction. For instance if task is
 implemented with fibers static variables will be shared among threads.
 Essentially I think Fibers need TLS (or rather FLS) synced with language
 'static' keyword. Otherwise the whole TLS by default is a useless chunk
 of machinery.

 Yeah, it's a problem all right. But we'll need compiler support for this
 stuff in any case.

 Can't help but wonder if it's really worth it. It seems to me like a
 simple AA-like API based on the typeid of data would be better -- as in,
 much more generic -- than trying to teach the compiler and runtime how
 to deal with this stuff.

 Think something like this:

 struct Data
 {
      int foo;
      float bar;
 }

 void myTask()
 {
      auto data = Data(42, 42.42f);

      TaskStore.save(data);

      // work ...

      foo();

      // work ...
 }

 void foo()
 {
      auto data = TaskStore.load!Data();
      // work ...
 }

 I admit, not as seamless as static variables, but a hell of a lot less
 magical.

This just doesn't work though.
The real problem is not in the code that you, as a programmer doing 
distributed stuff, write.
It's library writers, who typically use TLS for some persistent state 
inside a module,
and D currently makes that easy and transparent, just like in the old non-MT
days, but for threads ONLY.

Now, having them all pack their stuff and go about converting globals to 
TaskStore.save/.load
is not realistic and downright horrible.
Currently I suspect that, w.r.t. Fibers, all that works is based on conventions 
& luck.

One problem with making everything FLS is that the cost becomes darn high. 
On the other hand, Fibers yield only manually (+scheduler now? 
probably on receive/send etc.) and a lot of things can be "fiber-safe" as is.

Also it seems like for this to work we need not only a scheduler but also 
reworked libraries that are fiber-aware (so they don't block on I/O 
etc.). See e.g. vibe.d.

 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good
 idea.

 Seems a lot like A but with task defined to be a fiber. I'd prefer this.
 However then it needs a user-defined policy for distributing fibers
 across real threads (pools). Btw A is full of this too.

 By choosing C we effectively give up any hope of distributed tasks and
 especially if we have a scheduler API. Is that really a good idea in
 this day and age?

Why? Remote fibers would serve as distributed tasks. Like I said, just 
make Fiber == task.
As long as there is a suitable protocol for communication it's all 
right. I'm insisting on a fiber as a task because this makes for simpler 
message-passing logic. And a scheduler is still inevitable, as fibers wait for 
messages and are multiplexed onto only so many threads.

I just don't see any other abstraction you'd want to put in place of a task. 
It should be a self-contained, persistent 'worker' so that message passing 
works transparently.


-- 
Dmitry Olshansky

Oct 05 2012

Johannes Pfau <nospam example.com> writes:

On Fri, 05 Oct 2012 12:58:18 +0400,
Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 The true problem is not in the code you as a programmer doing
 distributed stuff do.
 It's library writers that typically use TLS for some persistent state 
 inside module
 and D currently makes it easy and transparent just like in the old
 non-MT days but for threads ONLY.

We should probably do some analysis on the Phobos source code to see if
this really is the case. I thought TLS was mainly used to avoid
threading issues, which works for Fibers. Things like the thread-local
RNG variables work fine with usual TLS, and even if the
Fiber is passed between different threads, this still works well.

I think we'd only have problems with APIs which leave TLS variables in
an inconsistent state between calls to functions. But I always thought
such behavior doesn't fit TLS variables well and should be abstracted
into a struct with member variables as state. In the end, isn't 'global TLS'
state just as bad as global state in C, and shouldn't it be avoided?

Oct 05 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, October 04, 2012 13:32:01 Alex Rønne Petersen wrote:
 Thoughts? Other ideas?

std.concurrency is supposed to be designed such that it can be used for more
than just threads (e.g. sending messages across the network), so if it needs
to be adjusted to accommodate that, then we should do so, but we need to be
careful to do it in a way that minimizes code breakage as much as reasonably
possible.

- Jonathan M Davis

Oct 04 2012

Sean Kelly <sean invisibleduck.org> writes:

On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:

 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

How is the interface tied to the notion of threads?  I had hoped to
design it with the underlying concurrency mechanism completely
abstracted.  The most significant reason that fibers aren't used behind
the scenes today is because the default storage class of static data is
thread-local, and this would really have to be made fiber-local.  I'm
reasonably certain this could be done and have considered going so far
as to make the main thread in D a fiber, but the implementation is
definitely non-trivial and will probably be slower than the built-in TLS
mechanism as well.  So consider the current std.concurrency
implementation to be a prototype.  I'd also like to add interprocess
messaging, but that will be another big task.

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 05-10-2012 01:30, Sean Kelly wrote:
 On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:

 Hi,

 We currently have std.concurrency as a message-passing mechanism. We encourage
people to use it instead of OS threads, which is great. However, what is *not*
great is that spawned tasks correspond 1:1 to OS threads. This is not even
remotely scalable for Erlang-style concurrency. There's a fairly simple way to
fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that the
interface is just not flexible enough. The current interface is completely and
entirely tied to the notion of threads (contrary to what its module description
says).

 How is the interface tied to the notion of threads?  I had hoped to design it
with the underlying concurrency mechanism completely abstracted.  The most
significant reason that fibers aren't used behind the scenes today is because
the default storage class of static data is thread-local, and this would really
have to be made fiber-local.  I'm reasonably certain this could be done and
have considered going so far as to make the main thread in D a fiber, but the
implementation is definitely non-trivial and will probably be slower than the
built-in TLS mechanism as well.  So consider the current std.concurrency
implementation to be a prototype.  I'd also like to add interprocess messaging,
but that will be another big task.

Mostly in that everything operates on Tids (as opposed to some opaque 
Cid type) and, as you mentioned, TLS. The problem is basically that 
people have gotten used to std.concurrency always using OS threads due 
to subtle things like that from day one.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012

Sean Kelly <sean invisibleduck.org> writes:

On Oct 4, 2012, at 9:18 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 05-10-2012 01:30, Sean Kelly wrote:
 On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:

 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

 How is the interface tied to the notion of threads?  I had hoped to design
 it with the underlying concurrency mechanism completely abstracted.  The
 most significant reason that fibers aren't used behind the scenes today is
 because the default storage class of static data is thread-local, and this
 would really have to be made fiber-local.  I'm reasonably certain this
 could be done and have considered going so far as to make the main thread
 in D a fiber, but the implementation is definitely non-trivial and will
 probably be slower than the built-in TLS mechanism as well.  So consider
 the current std.concurrency implementation to be a prototype.  I'd also
 like to add interprocess messaging, but that will be another big task.

 Mostly in that everything operates on Tids (as opposed to some opaque
 Cid type) and, as you mentioned, TLS. The problem is basically that
 people have gotten used to std.concurrency always using OS threads due
 to subtle things like that from day one.

A Tid is a Cid and in the first iteration I actually named it Cid and was
asked to change it.  Tid seems reasonable since it represents a logical
thread anyway. It just may not actually be a kernel thread. I think we have
to make TLS work for fibers or using them isn't an option. It would be
ridiculous to say "D has this cool new idea about statics but you can't use
it if you're using the standard concurrency package."
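
As a rough illustration of the "a Tid is a logical thread" point, the sketch below uses only the public std.concurrency calls (`spawn`, `send`, `receive`, `receiveOnly`, `ownerTid`). The functions `doubler` and `askDoubler` are hypothetical names for this example. Nothing in the calling code inspects what actually backs the `Tid`.

```d
import std.concurrency : Tid, ownerTid, receive, receiveOnly, send, spawn;

// Waits for one int and replies to whoever spawned this task.
void doubler()
{
    receive((int x) { send(ownerTid, x * 2); });
}

// Spawns a task, sends it a value, and waits for the reply.
// The Tid is used purely as an opaque handle for addressing messages.
int askDoubler(int value)
{
    Tid tid = spawn(&doubler);
    send(tid, value);
    return receiveOnly!int();
}

void main()
{
    assert(askDoubler(21) == 42);
}
```

Code written in this style would not need to change if the handle referred to something other than a kernel thread. The TLS behaviour is the part that would differ.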

Oct 05 2012

deadalnix <deadalnix gmail.com> writes:

On 04/10/2012 13:32, Alex Rønne Petersen wrote:
 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

 Now, I see a number of ways we can fix this:

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.
 B) We make the module capable of backing tasks with both threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).
 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good idea.

 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?

Something I've wondered for a while: why not run everything in fibers?

Oct 04 2012

Alex Rønne Petersen <alex lycus.org> writes:

On 05-10-2012 04:14, deadalnix wrote:
 On 04/10/2012 13:32, Alex Rønne Petersen wrote:
 Hi,

 We currently have std.concurrency as a message-passing mechanism. We
 encourage people to use it instead of OS threads, which is great.
 However, what is *not* great is that spawned tasks correspond 1:1 to OS
 threads. This is not even remotely scalable for Erlang-style
 concurrency. There's a fairly simple way to fix that: Fibers.

 The only problem with adding fiber support to std.concurrency is that
 the interface is just not flexible enough. The current interface is
 completely and entirely tied to the notion of threads (contrary to what
 its module description says).

 Now, I see a number of ways we can fix this:

 A) We completely get rid of the notion of threads and instead simply
 speak of 'tasks'. This trivially allows us to use threads, fibers,
 whatever to back the module. I personally think this is the best way to
 build a message-passing abstraction because it gives enough transparency
 to *actually* distribute tasks across machines without things breaking.
 B) We make the module capable of backing tasks with both threads and
 fibers, and expose an interface that allows the user to choose what kind
 of task is spawned. I'm *not* convinced this is a good approach because
 it's extremely error-prone (imagine doing a thread-based receive inside
 a fiber-based task!).
 C) We just swap out threads with fibers and document that the module
 uses fibers. See my comments in A for why I'm not sure this is a good
 idea.

 All of these are going to break code in one way or another - that's
 unavoidable. But we really need to make std.concurrency grow up; other
 languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
 for years, and if we want D to be seriously usable for large-scale
 concurrency, we need to have them too.

 Thoughts? Other ideas?

 Something I've wondered for a while: why not run everything in fibers?

Because then we definitely need dynamic stack growth wired into both the 
compiler and the runtime.

Not impossible, but there's a *lot* of effort required (and convincing, 
in Walter's case).
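
For context on the stack-growth concern, here is a small sketch, assuming the druntime `core.thread.Fiber` API. A fiber's stack size is chosen when the fiber is created and does not grow afterwards. The helper name `sumTo` and the 64 KiB figure are arbitrary choices for illustration.

```d
import core.thread : Fiber;

// Sums 1..n inside a fiber, yielding back to the caller after every step.
int sumTo(int n)
{
    int total = 0;
    // The second constructor argument fixes the fiber's stack size up front
    // (64 KiB here, an arbitrary illustrative value).
    auto fib = new Fiber({
        foreach (i; 1 .. n + 1)
        {
            total += i;
            Fiber.yield(); // hand control back to the caller
        }
    }, 64 * 1024);

    // Resume the fiber until its function has returned.
    while (fib.state != Fiber.State.TERM)
        fib.call();
    return total;
}

void main()
{
    assert(sumTo(4) == 10);
}
```

Code that recurses deeply or keeps large buffers on the stack can exceed a small fixed stack like this one, which is why running everything in fibers would seem to call for growable stacks.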

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Oct 04 2012
