www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - An idea; Coroutines

reply Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
Coroutines! Oh coroutines.

Recently Walter has shown some interest in me describing my 
library implementation thereof, due to my belief that they should 
be used for asynchronous tasks and hence need to be part of the 
language.

Before I begin I just want to say, that while I have a library 
implementation working, this is not the only way to do 
coroutines, nor is the proposed syntax.

-----------

So to begin with, let's define what the library author considers 
the library API to be.

```d
struct InstantiableCoroutine(ResultType, Args...) {
     this(T:__coroutine)(immutable T co);

     bool isNull();
     bool isInstantiated();
     InstantiableCoroutine makeInstance(Args args);

     GenericCoroutine opCast(T=GenericCoroutine)();
     Future opCast(T=Future!ResultType)();

     GenericCoroutine waitingOn();
     COResult!ResultType result();

     bool isComplete();
     void unsafeContinue()  system;
}
```

This is the premise that the other two API representations are 
built from.

What we wish to describe in this API, is a type that can be 
instantiated with new instances, after having a descriptor passed 
in via say a closure or from function pointer (whose function is 
being compiled also).
It needs to be able to tell the user some information of it, if 
it is instantiated and if so, is it currently complete or waiting 
on another coroutine.
Along with the respective values.
Lastly we must be able to continue execution once we are no 
longer waiting on our condition.

Of note is the ``GenericCoroutine`` and ``Future``. Take notice 
of the template arguments that are provided by the ``opCast``'s. 
Their respective methods are a subset of 
``InstantiableCoroutine``'s.

For my implementation I use structs and reference counting.
I find that this works quite well.
However due to the possibility of cyclic relationships, a GC may 
be desired if you want a more naive solution.

-----------

The ``GenericCoroutine`` is where the real magic happens.
It is what all the coroutine API's boil down to without templates.

When you do dependency analysis on what coroutine is waiting on 
another, and when no longer waiting on another sent off to the 
worker pool to execute, this is the abstraction that is used 
internally to the event loop.

-----------

Next up is a specialized coroutine implementation that I call 
"future-completion".

A future completion uses the same API as above, but one very key 
difference.
The descriptor state does not come from the user.
It comes from a prebuilt library function that only wants to know 
the return type of the future.

It will only complete, after it has been externally triggered as 
such.
This can be done for say a socket read based upon such functions 
as ``poll``.

For reference counting this helps break up cycles since you can 
store the trigger separately from the reference counted 
abstraction.
So it does a lot of jobs, I have found it to be irreplacable.

As a feature, it is quite easily one of my most genius ideas ever.
Simply because it is an ordinary coroutine, that integrates into 
the worker dependency state for coroutines, and does not require 
the user to know that it is special.

-----------

For a language feature, it should not be tied to a given library 
implementation.
There is no reason for it to be.

Done properly and with enough understanding of the compiler, it 
should be possible to write code similar to what I posted 
earlier, without using any attributes to say that this is a 
coroutine object that can be implicitly constructed.
It can see that it can be because of the constructor template 
check.

This leads us to wanting to describe the user experience:

```d
struct ListenSocket {
     static ListenSocket 
listen(A...)(InstantiableCoroutine!(Socket, A) coroutine, ushort 
port);
}

ListenSocket.listen((Socket socket) /*  notls */ {
     writeln(socket.read(4));

     // is equivalent to

     /+
     Future!(ubyte[]) __temp0 = socket.read(4);
     await __temp0;
     assert(__temp0.isComplete);
     writeln(__temp0.result);
     +/
}, port: 8080);
```

It looks sequential.

The user knows nothing about the coroutines happening underneath.

This is quite honestly the holy grail of asynchronous programming 
that we as a field have been studying and trying to make work 
since the 1950's.
See[0] for more information.

We can do it.

There is nothing stopping us except political will.

[0] 
https://www.amazon.com.au/Concurrent-Programming-C-R-Snow/dp/0521339936
[1] 
https://github.com/Project-Sidero/eventloop/tree/master/source/sidero/eventloop/coroutine
[2] 
https://github.com/Project-Sidero/eventloop/blob/master/source/sidero/eventloop/tasks/future_completion.d
[3] 
https://github.com/Project-Sidero/eventloop/blob/master/source/sidero/eventloop/internal/workers.d#L169

-----------

My code while not ready for users, does work.
See [1] for coroutines API, builder has the unit test.
[2] for future completions logic (yes there is unittest in 
there!).
And [3] for the internal coroutine dependency state.


For those interested in the state that the compiler would 
generate consider:

```d
struct CO_Object_xx(LibraryType) {
     enum Functions = [&__stage1, &__stage2];
     alias UserVars = AliasSeq!(...);
     alias ResultType = ...;

     static struct State {
         Stage stage;
         LibraryType waitingOn;

         UserVars vars;
         ResultType result;

         version(D_BetterC) {
         } else {
             Exception resultException;
         }

         ~this(); // cleanup if required

         bool isComplete() {
             return this.stage == Stage.CompleteValue || 
this.stage == Stage.CompleteException;
         }
     }

     enum Stage {
         Stage_1,
         Stage_2,
         ReadyToStart,
         CompleteValue,
         CompleteException,
     }

     void __stage1(scope State* state) {
     }

     void __stage2(scope State* state) {
     }
}
```
Jan 16
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 16 January 2024 at 14:20:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Coroutines! Oh coroutines.

 Recently Walter has shown some interest in me describing my 
 library implementation thereof, due to my belief that they 
 should be used for asynchronous tasks and hence need to be part 
 of the language.

 [...]
We have fibers in the runtime; why would we need the coroutines you propose and what's the difference to the fibers we already have?
Jan 16
parent Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
On Tuesday, 16 January 2024 at 14:46:03 UTC, Stefan Koch wrote:
 We have fibers in the runtime; why would we need the coroutines 
 you propose and what's the difference to the fibers we already 
 have?
To quote: https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p1364r0.pdf Which comes from Microsoft: https://devblogs.microsoft.com/oldnewthing/20191011-00/?p=102989
 We have accumulated more than a quarter century of experience 
 with  fibers across variety of programming languages and 
 platforms. In the 90s fibers and N : M scheduling looked 
 promising, now, with improvements in hardware, operating system 
 kernel and painful experience of trying to make the fibers work 
 has resulted in a recommendation: **DO NOT USE FIBERS!** Use 
 threads instead and/or write your code using asynchronous APIs 
 with hand-crafted state machines.
Fibers as we have them are a runtime hack using inline assembly. Microsoft had to add them to WinAPI specifically because people kept messing up their implementation with the calling conventions. Ours on OSX broke a few years ago too (was fixed quickly). But the key difference to understand about stackless coroutines instead of fibers, coroutines understand the dependency that a coroutine that has yielded has, a fiber does not.
Jan 16
prev sibling next sibling parent zjh <fqbqrr 163.com> writes:
On Tuesday, 16 January 2024 at 14:20:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Coroutines! Oh coroutines.
`The stack free protocol of C++20 ` is very good!
Jan 16
prev sibling next sibling parent reply victoroak <victoroak victoroak.com> writes:
On Tuesday, 16 January 2024 at 14:20:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 
 [...]
I was playing around with the concept of generators in D. They could be used as a base for coroutines. For me it would generete code like this: ``` // string fizzBuzz(size_t n) { // for(size_t i; i < n; i++) { // if (i % 3 == 0 && i % 5 == 0) // yield "fizz"; // else if(i % 3 == 0) // yield "buzz"; // else if (i % 5 == 0) // yield "fizzbuzz"; // else // yield i.to!string; // } // } struct Generator { // params size_t n; // state of the generator size_t state = 0; // locals size_t i = void; this(size_t n) { this.n = n; } bool next(ref string output) { switch(state) { case 0: goto LABEL0; case 1: goto LABEL1; default: assert(0); } LABEL0: for (i = 0; i < n; i++) { if (i % 3 == 0 && i % 5 == 0) { output = "fizz"; state = 1; return false; } else if(i % 3 == 0) { output = "buzz"; state = 1; return false; } else if (i % 5 == 0) { output = "fizzbuzz"; state = 1; return false; } else { output = i.to!string; state = 1; return false; } LABEL1: } state = 2; return true; } } ``` I was thinking about this. The generated struct would probably need to allocate in the GC if any parameter to the generator function escaped. We could use something like scope but I don't know how reliable it would be to make it safe. I would really like to see some kind of stackless resumable function in D.
Jan 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/01/2024 8:05 AM, victoroak wrote:
 I was thinking about this. The generated struct would probably need to 
 allocate in the GC if any parameter to the generator function escaped. 
 We could use something like scope but I don't know how reliable it would 
 be to make it safe. I would really like to see some kind of stackless 
 resumable function in D.
What you are thinking about is the inverse of scope. You are not allowed to be borrowed from the state and I don't think we have the logic to represent that. I suppose something like `` live`` forced on you could work though for owners only.
Jan 16
prev sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 16 January 2024 at 14:20:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Coroutines! Oh coroutines.
Coroutines are nice, but for general asynchronous computation you'll want something more abstract. Have you seen the Sender/Receiver work in C++? https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2300r7.html#intro I highly recommend understanding its working principles. Happy to help answer questions of course. We did an implementation at https://github.com/symmetryinvestments/concurrency which I am (very slowly) cleaning up and prepping for inclusion into Phobos.
Jan 16
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/01/2024 9:23 AM, Sebastiaan Koppe wrote:
 On Tuesday, 16 January 2024 at 14:20:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Coroutines! Oh coroutines.
Coroutines are nice, but for general asynchronous computation you'll want something more abstract. Have you seen the Sender/Receiver work in C++? https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2300r7.html#intro I highly recommend understanding its working principles. Happy to help answer questions of course. We did an implementation at https://github.com/symmetryinvestments/concurrency which I am (very slowly) cleaning up and prepping for inclusion into Phobos.
I think I did look at it at some point, but went yeah this is way more complex than what I need. It may be possible that the language coroutine support may be able to drive it. Which could be worth considering.
Jan 16
parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 16 January 2024 at 20:43:19 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 17/01/2024 9:23 AM, Sebastiaan Koppe wrote:
 We did an implementation at 
 https://github.com/symmetryinvestments/concurrency which I am 
 (very slowly) cleaning up and prepping for inclusion into 
 Phobos.
I think I did look at it at some point, but went yeah this is way more complex than what I need.
That was my reaction too, but the moving parts are actually a lot simpler than they look. Furthermore they do this with great composability, with practically no allocations, and, most importantly, using structured concurrency principles.
 It may be possible that the language coroutine support may be 
 able to drive it. Which could be worth considering.
Its the other way around really. I have already been able to integrate them with Threads, Fibers, epoll, iouring, iocp, even with external C eventloops. The C++ folks have them integrated with their stackless coroutines as well. Sender/Receivers is the simplest complete model for asynchronous computation I have come across. Emphasize on complete. Not saying we have to copy at verbatim, but they have a lot that we should at least consider. My pet peeve is the typical lack of support for cancellation many async libs have. While in reality it should be front and center.
Jan 16
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/01/2024 10:15 AM, Sebastiaan Koppe wrote:
     It may be possible that the language coroutine support may be able
     to drive it. Which could be worth considering.
 
 Its the other way around really.
That wasn't what I meant. I meant that the language feature could be put into its state. So library coroutine representation and sender/receiver library representation would both take as argument the language coroutine feature. This could be a good example of why I am proposing that the language feature should not be library specific. Because we could have multiple library representations in our standard library!
Jan 16
parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 16 January 2024 at 21:19:29 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 17/01/2024 10:15 AM, Sebastiaan Koppe wrote:
     It may be possible that the language coroutine support may 
 be able
     to drive it. Which could be worth considering.
 
 Its the other way around really.
That wasn't what I meant. ... So library coroutine representation and sender/receiver library representation would both take as argument the language coroutine feature.
I don't know enough of the specific language coroutine feature you have in mind, but I don't think its the right base building block for async computations. For one they allocate too often, and I am not sure how they can handle cancellation without explicitly passing stoptokens around. However, I absolutely do agree with you that whatever language support D gets, it needs to look like regular sequential code.
Jan 16
prev sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On Tuesday, 16 January 2024 at 20:23:01 UTC, Sebastiaan Koppe 
wrote:
 We did an implementation at 
 https://github.com/symmetryinvestments/concurrency which I am 
 (very slowly) cleaning up and prepping for inclusion into 
 Phobos.
I was curious about this library. Has it been battle-tested or used in production in any capacity?
Jan 16
parent max haughton <maxhaton gmail.com> writes:
On Wednesday, 17 January 2024 at 03:50:45 UTC, Andrej Mitrovic 
wrote:
 On Tuesday, 16 January 2024 at 20:23:01 UTC, Sebastiaan Koppe 
 wrote:
 We did an implementation at 
 https://github.com/symmetryinvestments/concurrency which I am 
 (very slowly) cleaning up and prepping for inclusion into 
 Phobos.
I was curious about this library. Has it been battle-tested or used in production in any capacity?
Yes
Jan 17