
digitalmars.D - Stackless resumable functions

reply "Martin Nowak" <code dawg.eu> writes:
This is so much better than Fibers.
http://youtu.be/KUhSjfSbINE

What I like most about the proposal is that you can adapt await 
by specializing template functions, similar to how range based 
foreach works.
It also isn't tied to a particular scheduling mechanism and of 
course consumes much less memory than stack based suspension.
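Roughly, I imagine the D side could look like the sketch below (all 
names here are invented; the C++ paper spells its hooks 
await_ready/await_suspend/await_resume, if I remember right). The point 
is just that await compiles against whatever members the awaitable 
provides, exactly like foreach compiles against empty/front/popFront.

import std.stdio;

// A value that is already available, so "awaiting" it never blocks.
struct CompletedValue(T)
{
    T value;
    bool isReady() { return true; }
    void waitForCompletion() {}      // nothing to wait for
    T getResult() { return value; }
}

// Hypothetical library-side await: works with any type providing the
// three hooks, with no particular scheduler baked in.
auto await(A)(A awaitable)
    if (is(typeof(A.init.isReady()) : bool) &&
        is(typeof(A.init.waitForCompletion())) &&
        is(typeof(A.init.getResult())))
{
    if (!awaitable.isReady())
        awaitable.waitForCompletion(); // a real implementation would suspend here
    return awaitable.getResult();
}

void main()
{
    writeln(await(CompletedValue!int(42))); // prints 42
}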
Oct 24 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 24 October 2014 at 10:33:40 UTC, Martin Nowak wrote:
 What I like most about the proposal is that you can adapt await 
 by specializing template functions, similar to how range based 
 foreach works.
 It also isn't tied to a particular scheduling mechanism and of 
 course consumes much less memory than stack based suspension.
This is how all truly object-oriented languages with concurrency work. Block activation records are conceptually on the heap and there is no difference between an object and a function: Simula 67, Beta, Self… It is slower than using a stack, but if done as in Beta you get a back pointer to the caller (which instantiated the function/object), which can be handy for modelling.
Oct 24 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 24 October 2014 at 14:50:53 UTC, Ola Fosheim Grøstad 
wrote:
 On Friday, 24 October 2014 at 10:33:40 UTC, Martin Nowak wrote:
 What I like most about the proposal is that you can adapt 
 await by specializing template functions, similar to how range 
 based foreach works.
 It also isn't tied to a particular scheduling mechanism and of 
 course consumes much less memory than stack based suspension.
This is how all truly object-oriented languages with concurrency work. Block activation records are conceptually on the heap and there is no difference between an object and a function: Simula 67, Beta, Self… It is slower than using a stack, but if done as in Beta you get a back pointer to the caller (which instantiated the function/object), which can be handy for modelling.
It is worth pointing out that one advantage of taking this uniform view is that you can more easily define a system to persist/migrate a transitive closure of objects/fibers and transfer them to other servers. However, it does not have to be stackless in terms of implementation. A stack is then an optimization, the compiled code can put things on the stack until it at runtime hits a yield (at which point you have to pick it up).
Oct 24 2014
prev sibling next sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 24 October 2014 at 10:33:40 UTC, Martin Nowak wrote:
 This is so much better than Fibers.
 http://youtu.be/KUhSjfSbINE

 What I like most about the proposal is that you can adapt await 
 by specializing template functions, similar to how range based 
 foreach works.
 It also isn't tied to a particular scheduling mechanism and of 
 course consumes much less memory than stack based suspension.
I'm about halfway through the talk and it's a bit confusing so far because all of what I'd consider the interesting part seems to be implemented as a compiler extension and so is invisible by looking at the code. He's talking about suspend points in the function but there's no indication from the code that they are present where he says. It seems a bit like these functions are closures and the compiler is figuring this out according to the return type or something. So it's potentially interesting but difficult to see how this directly compares to classic coroutines. I'm hoping all will be clear by the end of the talk.
Oct 24 2014
next sibling parent reply "ROOAR" <youwants yahoo.com> writes:
  I really liked this proposal for resumable lambda:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf
Oct 24 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/24/14 10:51 AM, ROOAR wrote:
   I really liked this proposal for resumable lambda:

 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf
Is this related to the video? -- Andrei
Oct 27 2014
next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Tuesday, 28 October 2014 at 02:10:47 UTC, Andrei Alexandrescu 
wrote:
 On 10/24/14 10:51 AM, ROOAR wrote:
  I really liked this proposal for resumable lambda:

 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf
Is this related to the video? -- Andrei
I haven't been following the developments too closely, but I think the talk essentially describes N4134, which supersedes N3977/N3858. Chris Kohlhoff references the latter in his introduction in N4244, but I haven't read that paper in any detail. David
Oct 27 2014
prev sibling next sibling parent "ROOAR" <youwants yahoo.com> writes:
On Tuesday, 28 October 2014 at 02:10:47 UTC, Andrei Alexandrescu 
wrote:
 On 10/24/14 10:51 AM, ROOAR wrote:
  I really liked this proposal for resumable lambda:

 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf
Is this related to the video? -- Andrei
It is a separate proposal, not the one shown in the video.
Oct 27 2014
prev sibling parent "Martin Nowak" <code dawg.eu> writes:
On Tuesday, 28 October 2014 at 02:10:47 UTC, Andrei Alexandrescu 
wrote:
 On 10/24/14 10:51 AM, ROOAR wrote:
  I really liked this proposal for resumable lambda:

 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf
Is this related to the video? -- Andrei
There is a good summary of the current state in http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4232.pdf.
Oct 28 2014
prev sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
Alright, done.  It's a pretty interesting proposal.  They are
effectively closures with coroutine-like semantics.  It seems
like the overhead for a complex system might actually be greater
than with classic coroutines, as closure data allocations could
be happening all over the place, but this is pure speculation.

I think a direct comparison could be drawn between their API and
ours, as std.concurrency now has a Generator object and one of
his early examples is a generator as well.  From a use
perspective, the two are really pretty similar, though our
Generator allocates an entire stack while theirs allocates N
function-level context blocks (one per contained awaitable).

Overall I see this proposal as being complementary to actors as
per std.concurrency.  Theirs provides a fairly simple and
lightweight model for composing code that doesn't normally
compose well (like recursive iterators), which is one traditional
use of coroutines.  But for high levels of concurrency to be
achieved, a scheduler needs to sit behind the await mechanism so
other things can happen when execution is suspended waiting for a
result.  This could integrate well with the Scheduler that is now
a part of std.concurrency, as it would be fairly trivial for a
context switch to occur whenever an awaitable suspend occurs.
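For comparison, his fib example written against std.concurrency's
Generator would look roughly like the following (fiber-backed, so it
reserves a whole stack per generator rather than per-function context
blocks; the API is new enough that details may still shift):

import std.concurrency : Generator, yield;
import std.stdio;

void main()
{
    // Infinite Fibonacci sequence as a fiber-backed input range.
    auto fib = new Generator!int({
        int a = 0, b = 1;
        while (true)
        {
            yield(a);          // suspend the fiber and hand 'a' to the caller
            immutable next = a + b;
            a = b;
            b = next;
        }
    });

    foreach (n; fib)
    {
        if (n > 100)
            break;
        write(n, " ");         // 0 1 1 2 3 5 8 13 21 34 55 89
    }
    writeln();
}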
Oct 24 2014
prev sibling next sibling parent reply "Kagamin" <spam here.lot> writes:
In my experience with .NET, async/await introduces a non-obvious 
multithreading model which, while remaining hidden under the hood, can 
still inflict concurrency issues on your code: race conditions 

shared types, D will need to catch concurrent access to data, as 
required by the D type system.
Oct 28 2014
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 28 October 2014 at 12:30:22 UTC, Kagamin wrote:
 In my experience with .net, async/await introduce a non-obvious 
 multithreading model, which remaining hidden under the hood, 
 can still inflict concurrency issues on your code: race 

 shared types, D will need to catch concurrent access to data, 
 as it's required by D type system.
This is why one of the biggest improvements in Visual Studio 2013 is async/await debugging. Visual Studio 2014 has a few more improvements, if I am not mistaken.
Oct 28 2014
parent reply "Kagamin" <spam here.lot> writes:
I think it's for seamless debugging. The debugger support for 
async/await is indeed non-trivial, because the code is heavily 
transformed by the compiler, but I don't think it has anything to do 
with concurrency. The official MS position is that async/await has no 
concurrency problems.
Oct 28 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/28/14 8:25 AM, Kagamin wrote:
 I think, it's for seamless debugging. The debugger support for
 async/await is indeed non-trivial, because code is mutated by the
 compiler a lot, but I don't think it has anything to do with
 concurrency. Official MS position is async/await has no concurrency
 problems.
Yah, my understanding of async/await is it's all single-threaded. -- Andrei
Oct 28 2014
parent "Kagamin" <spam here.lot> writes:
It's usually single-threaded by default, but this is achieved by 
routing the continuation back to the original thread through its 
synchronization context; in multithreaded mode the continuation can be 
executed on any thread-pool thread.
Oct 29 2014
prev sibling parent reply "bitwise" <bitwise.pvt gmail.com> writes:
Just out of curiosity, is anyone planning to implement this?

As for why it's not already implemented, is it because of lack of 
interest, or prohibitive reasons?



On Friday, 24 October 2014 at 10:33:40 UTC, Martin Nowak wrote:
 This is so much better than Fibers.
 http://youtu.be/KUhSjfSbINE

 What I like most about the proposal is that you can adapt await 
 by specializing template functions, similar to how range based 
 foreach works.
 It also isn't tied to a particular scheduling mechanism and of 
 course consumes much less memory than stack based suspension.
Feb 20 2015
parent reply "CraigDillabaugh" <craig.dillabaugh gmail.com> writes:
On Friday, 20 February 2015 at 17:51:17 UTC, bitwise wrote:
 Just out of curiosity, is anyone planning to implement this?

 As for why it's not already implemented, is it because of lack 
 of interest, or prohibitive reasons?



 On Friday, 24 October 2014 at 10:33:40 UTC, Martin Nowak wrote:
 This is so much better than Fibers.
 http://youtu.be/KUhSjfSbINE

 What I like most about the proposal is that you can adapt 
 await by specializing template functions, similar to how range 
 based foreach works.
 It also isn't tied to a particular scheduling mechanism and of 
 course consumes much less memory than stack based suspension.
They are waiting for your pull request :o)
Feb 20 2015
parent reply "bitwise" <bitwise.pvt gmail.com> writes:
 They are waiting for your pull request :o)
Well... I would like to try, but there is a good reason behind the 
noncommittal phrasing of my question :)

I experimented with stackful coroutines in C++, but I think stackless 
resumable functions will be more difficult. I assume I can harvest most 
of what I need from other parts of D.

My initial thought is that I could have resumable functions return a 
special range containing 3 things:
- a pointer to the function
- all the stack variables from the function
- an integer index into a jump table at the beginning of the resumable 
function

At compile time, each usage of yield in the resumable function would be 
replaced with:
- a command to update the jump table index, followed by
- a label, whose offset would be contained in the jump table

So initially calling a resumable function would return the range in 
question, which would repeatedly call into the resumable function with 
a different index until the end of the resumable function was reached.

Input on this would be appreciated.
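To make that concrete, this is roughly the expansion I have in mind, 
written out by hand with no compiler support (a switch stands in for 
the jump table, and every name below is invented):

import std.stdio;

// Hand-written stand-in for what the compiler would emit for a resumable
// function containing three yields. The "frame" holds the locals plus the
// jump-table index; each yield becomes "store the next index, then return".
struct CountTo3Frame
{
    int state;    // index of the resume point to jump to on the next call
    int current;  // the value produced by the most recent yield

    // Returns false once the resumable function has run to completion.
    bool resume()
    {
        switch (state)
        {
            case 0: current = 1; state = 1; return true; // yield 1;
            case 1: current = 2; state = 2; return true; // yield 2;
            case 2: current = 3; state = 3; return true; // yield 3;
            default: return false;                       // ran to completion
        }
    }
}

void main()
{
    CountTo3Frame frame;          // in the real thing this would be heap-allocated
    while (frame.resume())
        writeln(frame.current);   // 1 2 3
}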
Feb 21 2015
next sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
On Sat, 21 Feb 2015 21:19:32 +0000, bitwise wrote:

 Input on this would be appreciated.
it seems to me that you can "abuse" delegates for this (i mean the 
underlying "fat pointer"). a resumable function can create a closure (i 
think all the code for this is already here), a "resume address" 
pointer (this can be added to the closure), a pointer to the "real" 
delegate, a hidden "yield point" address, and return a delegate which 
does something like this on call:

* executes the "real" delegate from the "resume address"
* the "real" delegate yields by doing an indirect call to the "resume 
address"
* assembler code at the "resume address" stores the new resume address 
(it's on the stack now)
* and returns from the "wrapping delegate"

so from the programmer's point of view this will look like an ordinary 
delegate, even with the same type, and it will be usable anywhere an 
ordinary delegate is usable.
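something like this, if we drop the asm part and let an ordinary 
closure stand in for the saved resume state (all names below are made 
up; the point is only that the caller sees a plain int delegate() and 
nothing more):

import std.stdio;

// a "resumable" counter disguised as an ordinary delegate: the captured
// variable i is the whole saved state, and each call resumes where the
// previous one stopped.
int delegate() makeCounter(int from, int to)
{
    int i = from;
    return () {
        if (i > to)
            return -1;   // crude "finished" marker: a bare delegate has no empty()
        return i++;
    };
}

void main()
{
    auto next = makeCounter(1, 3);
    for (int v = next(); v != -1; v = next())
        writeln(v);      // 1 2 3
}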
Feb 21 2015
parent reply "bitwise" <bitwise.pvt gmail.com> writes:
On Sunday, 22 February 2015 at 03:35:28 UTC, ketmar wrote:
 On Sat, 21 Feb 2015 21:19:32 +0000, bitwise wrote:

 Input on this would be appreciated.
it seems to me that you can "abuse" delegates for this (i mean the 
underlying "fat pointer"). resumable function can create closure (i 
think all the code for this is already here), "resume address" pointer 
(this can be added to closure), pointer to "real" delegate, hidden 
"yield point" address, and return a delegate which does something like 
this on call:
* executes "real" delegate from "resume address".
* "real" delegate yields by doing indirect call to "resume address"
* assembler code at "resume address" stores new resume address (it's on 
stack now)
* and returns from "wrapping delegate"
so from the point of programmer this will look as an ordinary delegate, 
even with the same type, and it will be usable anywhere where ordinary 
delegate is usable.
I'm glad you mentioned "resume address". I guess I don't need a jump 
table since the function is not switching on a user variable ;)

I don't like the idea of having a resumable function look like a 
regular delegate. There would be no clean way to check when the 
resumable function had finished. I've only skimmed the C++ proposal for 
resumable lambdas below, but it seems like their solution is to 
introduce a 'stop_iteration' exception which gets thrown when you try 
to call a resumable lambda that has run to completion. So basically, 
all uses of delegates would have to be wrapped in try/catch blocks..

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4244.pdf

The newest Visual Studio preview version has "generators". This is an 
example from the Visual Studio blog:

generator<int> fib()
{
    int a = 0;
    int b = 1;
    for (;;)
    {
        __yield_value a;
        auto next = a + b;
        a = b;
        b = next;
    }
}

So I'm assuming the above function would return something like this:

template<class T>
struct generator
{
    T value;
    iterator begin() { /* executes function, stores value, and returns iterator to first yield */ }
    iterator end() { ... }
    void operator++() { /* execute function to next yield and store value */ }
    T& operator*() { return value; }
};

I am considering something similar. Please excuse my incoherent 
pseudocode :)

interface IResumableClosure
{
    // address of resumable function
    // stack vars
    // methods for initializing a new MethodRange(T)
}

struct Generator(T)
{
    IResumableClosure closure;

    Generator(IResumableClosure closure)
    {
        this.closure = closure;
    }

    int opApply(int delegate(ref T) dg)
    {
        foreach(value; MethodRange!T(closure))
            if(dg(value))
                return 1;
        return 0;
    }

    MethodRange!T getRange()
    {
        return MethodRange!T(closure);
    }
}

struct MethodRange(T)
{
    IResumableClosure closure;

    // contains stack vars, resume address,
    // last returned value, and 'finished' flag
    void *context;

    T value;

    MethodRange(IResumableClosure closure)
    {
        this.closure = closure;
        this.context = closure.createContext();
        if(!context->finished)
            this.value = invoke(context);
    }

    private T invoke(void* context)
    {
        // run function until yield, store return address in context
    }

    T front() { return value; }
    void popFront() { value = invoke(context); }
    bool empty() { return context.finished; }
}

Generator!int myResumableFunction()
{
    for(int i = 0; i < 10; ++i)
        yield i;

    return;      // exits the resumable function
    // return 1; // illegal inside resumable function, use yield
}

foreach(number; myResumableFunction())
    writeln(number);

// or

auto gen = myResumableFunction();
auto metRang = gen.getRange(); // runs to first yield

while(!metRang.empty())
{
    writeln(metRang.front);
    metRang.popFront();
}

// or the caller could optionally return a MethodRange(T) directly if they were
// ok with the function being invoked immediately.

MethodRange!int myResumableFunction()
{
    yield 1;
}

auto metRang = myResumableFunction(); // runs to first yield

while(!metRang.empty())
{
    writeln(metRang.front);
    metRang.popFront();
}

So like you said, I could wrap the resumable function in some kind of 
specialized closure, but I could use that to optionally initialize a 
MethodRange(T) or a Generator(T).
Feb 22 2015
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
On Mon, 23 Feb 2015 00:48:42 +0000, bitwise wrote:

 I don't like the idea of having a resumable function look like a regular
 delegate. There would be no clean way to check when the resumable
 function had finished.
you don't need to. if you really need to do that, you're doing something 
wrong trying to recreate ranges and exposing their internal design to the 
user. this resumable function *is* a delegate. nobody cares about its 
internals.
 I've only skimmed the C++ proposal for resumable
 lambdas below, but it seems like their solution is to introduce a
 'stop_iteration' exception which gets thrown when you try to call a
 resumable lambda that has run to completion. So basically, all uses of
 delegates would have to be wrapped in try/catch blocks..
resumable functions are not iterators. it's a slightly perverse flow 
control method, and iteration/generation is only one of the use cases. 
and, by the way, yielding is a two-way communication channel.

you seem to be stuck on the iteration/generation idea, but this is not 
the way resumable functions should be seen. resumable functions are the 
basis of cooperative multitasking, and iteration/generation is just a 
special case of coop mt.

mimicking delegates allows using a resumable function in any code that 
expects a delegate. and defining interfaces (yielding is a two-way comm 
channel!) will allow building a generator abstraction a la ranges (or 
even mimicking any necessary range).
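here's a tiny illustration of the two-way part, done with the stackful 
Fiber from druntime only because that is runnable today (a stackless 
version would pass the same values through its context struct): the 
caller deposits a request, the routine deposits a reply, and control 
ping-pongs across one suspension point.

import core.thread : Fiber;
import std.stdio;

void main()
{
    int request; // caller -> coroutine
    int reply;   // coroutine -> caller

    // the "resumable function": consumes a request, produces a reply, suspends.
    auto worker = new Fiber({
        while (true)
        {
            reply = request * request;
            Fiber.yield();          // give control back to the caller
        }
    });

    foreach (i; 1 .. 4)
    {
        request = i;                // send a value in
        worker.call();              // resume; runs until the next yield
        writeln(i, " -> ", reply);  // 1 -> 1, 2 -> 4, 3 -> 9
    }
}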
Feb 22 2015
parent reply "bitwise" <bitwise.pvt gmail.com> writes:
 you don't need to. if you really need to do that, you're doing 
 something
This makes no sense to me. A usage example may be helpful.
 resumable functions are not iterators. it's a slightly 
 perversed flow
 control method. iteration/generation is one of the use cases.
c++?
 and, by the way, yielding is a two-way communication channel.
http://www.boost.org/doc/libs/1_57_0/libs/coroutine/doc/html/index.html

[quote]
Coroutine
    Asymmetric coroutine
    Symmetric coroutine
[/quote]
 you seem to stuck with iteration/generation idea, but this is 
 not the way resumable functions should be seen.
Not sure what to say to this..
 mimicking delegates allows to use resumable function in any 
 code that expects delegate.
If you can come up with even one example (with code) where it would 
make sense to accept both coroutines and delegates, I will be very 
surprised.

In any case, I have revised my design. Generator(T) was redundant, so I 
removed it. Below is something that I think is well formed enough for 
me to start digging through the compiler for specifics.

interface MethodRange(T)
{
    @property T front();
    void popFront();
    @property bool empty();
    int opApply(int delegate(T) dg);
}

class __ResumeableMethod(T, METHOD_TYPE, CONTEXT_TYPE) : MethodRange!T
{
    // method locals and passed args
    CONTEXT_TYPE* context;

    // -initially contains method entry point
    // -once executed, contains the resume address
    // -when finished, contains null
    void *fptr;

    // object reference for instance methods
    void *obj;

    T value;

    this(CONTEXT_TYPE* context, void *fptr, void *obj)
    {
        this.context = context;
        this.fptr = fptr;
        this.obj = obj;
        invoke(context, &value);
    }

    private void invoke(CONTEXT_TYPE *context, T *yielded_value)
    {
        fptr(context);
        fptr = context->return_address;
        if(fptr)
            *yielded_value = context->yielded_value;
    }

    @property override T front()
    {
        assert(!this.empty);
        return value;
    }

    override void popFront()
    {
        assert(!this.empty);
        invoke(context, &value);
    }

    @property override bool empty()
    {
        return fptr == null;
    }

    int opApply(int delegate(T) dg)
    {
        while(!this.empty)
        {
            if(dg(this.front))
                return 1;
            this.popFront();
        }
        return 0;
    }
}

MethodRange!int myResumableMethod()
{
    for(int i = 0; i < 10; ++i)
        yield i;
}

// the compiler would automatically transform the above
// method into something like the one below

MethodRange!int myResumableMethod()
{
    void __method(__ContextFor__method *ctx)
    {
        for(ctx->i = 0; ctx->i < 10; ++ctx->i)
        {
            ctx->yielded_value = ctx->i;
            ctx->return_address = &return_pos;
            return;
        return_pos:
        }
        ctx->return_address = null;
    }

    return new __ResumeableMethod!int(new __ContextFor__method, &__method, null);
}

// usage
void main()
{
    foreach(num; myResumableMethod())
        writeln(num);

    // or

    MethodRange!int[] methods;

    auto mr = myResumableMethod();
    if(!mr.empty)
        methods ~= mr;

    for(int i = 0; i < methods.length; )
    {
        auto m = methods[i];

        int status = m.front;
        // handle item status..

        m.popFront();

        if(m.empty)
            methods.remove(i);
        else
            ++i;
    }
}
Feb 23 2015
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
On Mon, 23 Feb 2015 23:10:28 +0000, bitwise wrote:

 you don't need to. if you really need to do that, you're doing
 something
 This makes no sense to me. A usage example may be helpful.
you're still thinking in terms of generators. a generator is a 
high-level construct, a resumable routine is a low-level construct. the 
latter is used to build the former.
 resumable functions are not iterators. it's a slightly perversed flow
 control method. iteration/generation is one of the use cases.
as one of the possible high-level constructs that can be built with 
resumable routines.
 http://www.boost.org/doc/libs/1_57_0/libs/coroutine/doc/html/index.html
 [quote]
 Coroutine
      Asymmetric coroutine
      Symmetric coroutine
 [/quote]
i don't even want to know what boost does. it stinks anyway.
 mimicking delegates allows to use resumable function in any code that
 expects delegate.
 If you can come up with even one example (with code) where it would 
 make sense to accept both coroutines and delegates, I will be very 
 surprised.
anywhere. any code. any delegate usage. it's a simple idiom of 
"switching control", where "a function that calls a delegate" can be 
seen as "a delegate that calls the function". heh. anytime you need to 
switch the controlling side without rewriting the code. if you can't 
think of a use case for this, you're still looking at resumable 
routines from the wrong POV.
 In any case, I have revised my design. Generator(T) was redundant, so I
 removed it. Below is something that I think is well formed enough for me
 to start digging through the compiler for specifics.
it's like building a brick house without having the bricks. you're 
trying to build the house when you still have to make the bricks. sure, 
you can do that... and then curse while smashing it back into bricks 
when you need to build the fence. or build fences from houses.

what you're *really* trying to do is build a specific actor instead of 
building the foundation for programming with actors. building the 
foundation for the actor model is way better and will allow you to 
build your generators easily.
Feb 23 2015
parent reply "bitwise" <bitwise.pvt gmail.com> writes:
On Tuesday, 24 February 2015 at 00:48:34 UTC, ketmar wrote:
 On Mon, 23 Feb 2015 23:10:28 +0000, bitwise wrote:
 you still thinking in terms of generators. Generator is a 
 high-level construct, resumable routine is a low-level 
 construct. latter is used to build the former.
I think you're getting confused between "stackless" and "stackful" 
resumable functions. If I wanted stackful resumable functions, I would 
just use D's Fiber. If I wanted something lower level, I would simply 
make a D wrapper for boost::fcontext.

http://www.boost.org/doc/libs/1_57_0/boost/context/fcontext.hpp

#define RF "resumable function" // :)

But I am not talking about stackful RFs. The control flow you're 
describing is that of a stackful RF. With stackless RFs, you can't just 
randomly switch control flow between coroutines, nor can you yield from 
nested stack frames. A stackless RF runs on the same stack as 
everything else; only the local variables and the RF's arguments are 
allocated on the heap.

That said, I can't think of a lower-level abstraction than what I have 
provided that would actually be useful...
 i don't even want to know what boost does. it stinks anyway.
Maybe this is why you don't get what I'm saying ;)
 anywhere. any code. any delegate usage. it's a simple idiom of 
 "switching control", where "function that calling delegate" can 
 be seen as "delegate that calling the function".
Again, I'm pretty sure you're thinking of stackful RFs.
 it's like building the brick house without having the bricks.
Don't be silly, we're clearly building a bike shed here.
Feb 23 2015
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
On Tue, 24 Feb 2015 03:22:10 +0000, bitwise wrote:

 I think you're getting confused between "stackless" and "stackful"
 resumable functions.
ahem. seems that you are right, and i outsmarted myself yet again. 
seems that my head is too small to keep three different tasks unmessed 
(forum discussion, plus two of my projects). sorry.
Feb 23 2015
next sibling parent "bitwise" <bitwise.pvt gmail.com> writes:
 ahem. seems that you are right, and i outsmarted myself yet again.
 seems that my head is too small to keep three different task unmessed
 (forum discussion, plus two of my projects).

 sorry.
If it's any consolation, I did accidentally use a C++ constructor in my first example... I'd better sleep with one eye open tonight ;)
Feb 23 2015
prev sibling parent reply "bitwise" <bitwise.pvt gmail.com> writes:
@ketmar

You know you may be kinda right about reusability though...

I am not as familiar with "await" as I am with coroutines, but I 
think the principle is similar. The Visual Studio team seems to 
have chosen to tackle stackless resumable routines and await at 
the same time, so there may be some reusability there. I couldn't 
say for sure though.

In any case, I don't think it would be hard to refactor my 
suggested implementation under the covers to facilitate something 
like await without hurting the user-facing functionality.
Feb 24 2015
parent ketmar <ketmar ketmar.no-ip.org> writes:
On Tue, 24 Feb 2015 15:53:53 +0000, bitwise wrote:

 In any case, I don't think it would be hard to refactor my suggested
 implementation under the covers to facilitate something like await
 without hurting the user-facing functionality.
sure, if you get something working, it will be easier to chop it into 
pieces if necessary. or one can simply emulate one thing with another 
thing as a quick hack.
Feb 24 2015
prev sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
On Sat, 21 Feb 2015 21:19:32 +0000, bitwise wrote:

p.s. i think you will need to do some housekeeping in the "wrapping 
delegate", of course. it depends on compiler internals though, and on 
codegen details.
Feb 21 2015