www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - First-class polymorphic lamdas?

reply ZombineDev <valid_email he.re> writes:
Background: I'm working on a transducers proposal for Phobos, 
which
generalizes ranges to support push data-flow model (in addition 
to the current pull-only) enabling composable algorithmic 
transformations that can be applied to other sources of data such 
as RX observables[0], signal & slots, channels, etc, as well as 
the traditional data sources like containers, generator ranges 
like `iota` and so on.
For example, this would allow one to use a single transformation 
object e.g. { filter -> map -> uniq } for both synchronous and 
asynchronous sequences of data.

The existing implementations that I'm studying are the one 
introduced originally by Rich Hickey in Clojure[1] and Ableton's 
Atria C++ library presented at CppCon 2015 [2].
My main obstacle is that they both rely on returning polymorphic 
lambdas. E.g. polymorphic lambdas are need to be first-class 
value types.

AFAIU, as per proposal N3649[3] which was accepted for C++14, if 
one of the parameters of a lambda expression has type-specifier 
`auto`, the compiler should generate a **non-templated** 
placeholder struct with a **templated** call operator. This 
allows the following code to work:

```
// Compile with -std=c++14
#include <iostream>
#include <vector>
using namespace std;

int main()
{
     double local_var = 1.0;

     auto add = [&](auto number) { local_var += number; };

     add(3);     cout << local_var << endl; // Prints 4
     add(3.14);  cout << local_var << endl; // Prints 7.14
     add(20LL);  cout << local_var << endl; // Prints 27.14

     auto vec = vector<decltype(add)> { add, add, add, add };

     for (auto&& fun : vec)
     {
         fun(1);
         fun(0.5);
         fun(2LL);
     }

     cout << local_var << endl;  // Prints 41.14
}
```

In D, on the other hand, the compiler stores a reference to the 
lambda function as a function-pointer (or a delagate-pointer, if 
any local variables are captured), however this precludes support 
for polymorhpic lambdas. AFAIU, delegates are implemented the way 
they are because this enables much easier ABI interoperability.

Kenji worked on supporting polymorphic lambdas for DMD 2.070 
[4][5] and his design
(if I understand it correctly) works like so:
```
alias add = (a,b) => a + b;
// is syntactically lowered to

alias add = add__compile_generated;
template add__compile_generated(T, U)
{
     auto add__compile_generated(T a, U b) { return a + b; }
}
```

This by itself was a good enhancement, however it is incapable of 
handling some very useful cases which C++14 supports, showcased 
in the below example, which is adopted from the Transducers 
CppCon 2015 lecture:

```
// 0) accumulate
// (nothing special here)
template <typename In, typename S, typename Rf>
S accumulate(In first, In last, S state, Rf step) {
     for (; first != last; ++first) {
         state = step(state, *first);
     }
     return state;
}

// 1) output_rf
// !!! Proper value-type with templated operator() !
auto output_rf = [] (auto out, auto input) {
     *out++ = input;
     return out;
};

// 2) map & filter a value-types with templated operator()
// that return value-types with tempalted operator()

// !!!
auto map = [] (auto fn) {
   return [=] (auto step) {
     return [=] (auto s, auto ...ins) {
       return step(s, fn(ins...));
     };
   };
};

// !!!
auto filter = [] (auto pred) {
     return [=] (auto step) {
         return [=] (auto s, auto ...ins) {
             return pred(ins...)?
             step(s, ins) :
             s;
         };
     };
};

auto f = filter([](int x) { return x > 0; });

auto xf = comp(
     filter([](int x) { return x > 0; }),
     map([](int x) { return x * 2; })
);

accumulate(first, last, out, xf);
// which turns into:
accumulate(first, last, out, comp(filter(pred), map(fn)) 
(output_rf));
// which turns into:
accumulate(first, last, out, filter(pred) (map(fn) (output_rf)));

// In order to replicate this in D I need to use ugly workarounds 
like:
// * wrap the lambda expression in a dummy struct
// * put the body of the lambda expression in a templated opCall
// * return DummyStruct.init, in order to prevent the compiler 
from trying to call
//   the opCall function
// *  property just in case some future version decides to 
enforce if for
//   calling functions without parentheses
```


```
// 1) translated to D
// current syntax
 property auto outputRf()
{
     struct OutputReducer
     {
         Out opCall(OutRange, E)(ref OutRange output, E elem)
             if (isOutputRange!(OutRange, E))
         {
             output.put(elem);
             return output;
         }
     }
     return OutputReducer.init;
}

// desired syntax:

auto outputRf(O, E) = (ref O output, E elem) {
     output.put(elem);
     return output;
};

// 2) translated to D
// current syntax:
auto map(alias fn, Rf)(Rf step)
{
     struct MapResult
     {
         Rf step;

         auto opCall(S, Input...)
             (auto ref S state, auto ref Input inputs)
         {
             return step(state, fn(inputs));
         }
     }
     MapResult result = { step : step };
     return result;
}

// New problems:
// * Need to use alias template parameter to capture the mapping 
function.
//   It would be better if I could take it as a value parameter 
like in C++.
// * Need to manually put all of the captured variables as 
members of the struct
// * Can't return MapResult(step), because the compiler confuses 
it with a call to
//   opCall, even though opCall is not a static function. Instead 
I need to use
//   the C struct initializer syntax.
//
// Desired syntax:
// 1. Omitting the type-specifier let's the compiler know
//    that there should be a corresponding template parameter.
// 2. Allow for optional template parameters before the
//    run-time parameters.

auto filter = (pred) {
     return (step) {
         return (S, Inputs...)(S s, Inputs ins) {
             return pred(ins)?
             step(s, ins) :
             s;
         };
     };
};
```

To sum up, I want the compiler to do all those things for me, 
like the C++14 compilers do. Presently it feels like I'm doing 
the compiler's job.

[0]: http://reactivex.io/intro.html
[1]: https://www.youtube.com/watch?v=6mTbuzafcII
[2]: https://www.youtube.com/watch?v=vohGJjGxtJQ
[3]: 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3649.html
[4]: http://dlang.org/changelog/2.070.0.html#alias-funclit
[5]: https://github.com/D-Programming-Language/dmd/pull/3638
Feb 05 2016
next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 5 February 2016 at 20:36:20 UTC, ZombineDev wrote:
 In D, on the other hand, the compiler stores a reference to the 
 lambda function as a function-pointer (or a delagate-pointer, 
 if any local variables are captured), however this precludes 
 support for polymorhpic lambdas. AFAIU, delegates are 
 implemented the way they are because this enables much easier 
 ABI interoperability.
AIUI, what's happening underneath, is that any time something needs a context pointer, it is "nested" under it, and its "this" becomes the context. I.e.: void main() { int n; alias dg = x => n+=x; } This is obviously equivalent to: void dg(T)(T x) { n += x; } i.e. dg is a template of a local function. Obvious so far. However, if next you do: void callIt(alias fun)() { fun(5); } callIt!dg(); callIt becomes nested under main(), i.e. its context pointer will be main's stack frame. This precludes callIt already having a context pointer, which is why you can't use methods together with alias parameters to symbols with context, or multiple alias parameters to things with different context. AIUI SDC doesn't have this limitation. Anyway, you probably knew all this already.
 Kenji worked on supporting polymorphic lambdas for DMD 2.070 
 [4][5] and his design
 (if I understand it correctly) works like so:
 ```
 alias add = (a,b) => a + b;
 // is syntactically lowered to

 alias add = add__compile_generated;
 template add__compile_generated(T, U)
 {
     auto add__compile_generated(T a, U b) { return a + b; }
 }
 ```
I believe the 2.070 changes are entirely confined to syntax. Previously, the syntax: alias add = (a,b) => a + b; would not be valid. I mentioned this (and the workaround) in a blog post a while ago: https://blog.thecybershadow.net/2015/04/28/the-amazing-template-that-does-nothing/ and briefly at DConf IIRC. TL;DR: The solution was to wrap it in a template which aliases to itself: alias I(alias X) = X; alias add = I!((a,b) => a + b);
 auto map(alias fn, Rf)(Rf step)
 {
     struct MapResult
     {
         Rf step;

         auto opCall(S, Input...)
             (auto ref S state, auto ref Input inputs)
         {
             return step(state, fn(inputs));
         }
     }
     MapResult result = { step : step };
     return result;
 }
For some reason I didn't realize you can also create a value type templated functor by using explicitly nested structs. I had stumbled upon the new feature from https://github.com/dlang/dmd/pull/5518 by accident while working on something similar (heterogenous iteration and range-like composition, e.g. over struct fields). Will post that here later today.
 To sum up, I want the compiler to do all those things for me, 
 like the C++14 compilers do. Presently it feels like I'm doing 
 the compiler's job.
I think that alias template parameters and context binding are a very rich and unexplored area of D, which will allow writing some really efficient and composable code. A while ago I created a few DMD pull requests with related improvements, and based on those DMD pull requests I wrote an allocator and serialization library. The entire code was messy but extremely efficient (e.g. the "this" pointer was shared between all composed layers of the entire allocator instance). DMD pull requests: https://github.com/dlang/dmd/pull/3329 https://github.com/dlang/dmd/pull/3345 https://github.com/dlang/dmd/pull/3361 Compiler capability tests: https://github.com/CyberShadow/ae/blob/master/utils/meta/caps.d Allocators (note that this predates std.experimental.allocator): https://github.com/CyberShadow/ae/blob/master/utils/alloc.d Serialization: https://github.com/CyberShadow/ae/tree/master/utils/serialization For personal use, I still use a hacked compiler with https://github.com/dlang/dmd/pull/3884 reverted, even though at the time I argued in favor of that change (since the introduction of field context binding broke code). It is just too useful. A while ago Kenji proposed "static alias", to explicitly control whether an alias argument passes context or not (sometimes not passing context is required, e.g. if the parameter will be used only for introspection or with my __traits(child)): https://wiki.dlang.org/Brush_Up_Language_Features#Nested_Symbols I think by now we would need "this alias" since fields and methods no longer bind with context to alias parameters. I would like to some day wrap all of the above work together into a coherent proposal to improve D aliases, but haven't had the time/energy so far (few people seem to share my enthusiasm for the subject :)). Perhaps for DConf 2017...
Jun 03 2016
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03.06.2016 13:11, Vladimir Panteleev wrote:
 I think that alias template parameters and context binding are a very
 rich and unexplored area of D, which will allow writing some really
 efficient and composable code. A while ago I created a few DMD pull
 requests with related improvements, and based on those DMD pull requests
 I wrote an allocator and serialization library. The entire code was
 messy but extremely efficient (e.g. the "this" pointer was shared
 between all composed layers of the entire allocator instance).

 DMD pull requests:

 https://github.com/dlang/dmd/pull/3329
 https://github.com/dlang/dmd/pull/3345
I agree that those cases should work.
 https://github.com/dlang/dmd/pull/3361
 ...
Interesting. Not sure about this. It seems it should work by analogy with this: struct Test{ int x=2; int foo(){ return x; } int call(alias a)(){ return a(); } pragma(msg,Test().call!foo); } So I have implemented it. :) https://github.com/tgehr/d-compiler/commit/ea7cdcc43dce8fd05b64b872e83ac28905b81dff However, there are some funny cases: struct A{ int x; this(int x){ this.x=x; } int fun(){ return x;} int[] caller(T)(T t){ int[] r; r~=T.fun(); r~=(x++,t).fun(); return r; } } struct B{ alias fun = A.fun; } pragma(msg, A(2).caller(B())); // [2,3] pragma(msg, A(2).caller(A(10))); // [2,10] I.e. this code exhibits 'special' behaviour when the passed type is A.
 Compiler capability tests:

 https://github.com/CyberShadow/ae/blob/master/utils/meta/caps.d

 Allocators (note that this predates std.experimental.allocator):

 https://github.com/CyberShadow/ae/blob/master/utils/alloc.d

 Serialization:

 https://github.com/CyberShadow/ae/tree/master/utils/serialization

 For personal use, I still use a hacked compiler with
 https://github.com/dlang/dmd/pull/3884 reverted, even though at the time
 I argued in favor of that change (since the introduction of field
 context binding broke code). It is just too useful.
 ...
It works with my compiler too. There shouldn't be any restrictions on context binding.
 A while ago Kenji proposed "static alias", to explicitly control whether
 an alias argument passes context or not (sometimes not passing context
 is required, e.g. if the parameter will be used only for introspection
 or with my __traits(child)):

 https://wiki.dlang.org/Brush_Up_Language_Features#Nested_Symbols

 I think by now we would need "this alias" since fields and methods no
 longer bind with context to alias parameters.
 ...
It must work by default, otherwise libraries can accidentally not support it. I think it should just be inferred. I.e. if the template instance needs access to the context, that requirement is propagated to the caller, otherwise it isn't. The arguments that Kenji raises against it seem to be DMD-specific, no? Additionally, I'm in favor of having both static alias and this alias.
 I would like to some day wrap all of the above work together into a
 coherent proposal to improve D aliases, but haven't had the time/energy
 so far (few people seem to share my enthusiasm for the subject :)).
 Perhaps for DConf 2017...
I share it. The main limitation that needs fixing is capture of multiple contexts. Unfortunately I have not gotten around to implementing multiple contexts and context inference myself so far.
Jun 03 2016
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 3 June 2016 at 15:14:59 UTC, Timon Gehr wrote:
 https://github.com/dlang/dmd/pull/3361
 ...
Interesting. Not sure about this. It seems it should work by analogy with this: struct Test{ int x=2; int foo(){ return x; } int call(alias a)(){ return a(); } pragma(msg,Test().call!foo); } So I have implemented it. :) https://github.com/tgehr/d-compiler/commit/ea7cdcc43dce8fd05b64b872e83ac28905b81dff
Hah, very cool :)
 However, there are some funny cases:

 struct A{
     int x;
     this(int x){ this.x=x; }
     int fun(){ return x;}
     int[] caller(T)(T t){
         int[] r;
         r~=T.fun();
         r~=(x++,t).fun();
         return r;
     }
 }
 struct B{
     alias fun = A.fun;
 }

 pragma(msg, A(2).caller(B())); // [2,3]
 pragma(msg, A(2).caller(A(10))); // [2,10]

 I.e. this code exhibits 'special' behaviour when the passed 
 type is A.
Maybe when I don't follow your argument, but it acts exactly as before when the passed type is A. It also acts as before when you pass some other type with a regular old fun() method (i.e. when fun is static or the T.fun call is removed). It acts differently only when fun is an alias to a method in A, which otherwise wouldn't have compiled. Either way, this piggy-backs on that T.fun() works when fun is not static, which I think would not have been necessary had we had a nice syntax for __traits(child) (with the one I proposed it would be `this.(T.fun)();`). Anyway, I don't feel too strongly about this one.
 It must work by default, otherwise libraries can accidentally 
 not support it. I think it should just be inferred. I.e. if the 
 template instance needs access to the context, that requirement 
 is propagated to the caller, otherwise it isn't. The arguments 
 that Kenji raises against it seem to be DMD-specific, no?
I must admit that I don't fully understand Kenji's explanation either.
 I share it. The main limitation that needs fixing is capture of 
 multiple contexts. Unfortunately I have not gotten around to 
 implementing multiple contexts and context inference myself so 
 far.
Yeah, multiple contexts would be the crown jewel. I've been told, though, that the idea that things can have at most one context pointer is very deeply rooted into DMD. I think Walter Bright is also not a fan of the idea, if I understand correctly his main counter-argument is that for untemplated function pointers, when it has 0 contexts it's just a function pointer, when it has 1 context it's a delegate, when it has 2 or more contexts, it's something else, what is it - how would you describe its type in language syntax?
Jun 03 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 03.06.2016 18:36, Vladimir Panteleev wrote:
 On Friday, 3 June 2016 at 15:14:59 UTC, Timon Gehr wrote:
 https://github.com/dlang/dmd/pull/3361
 ...
Interesting. Not sure about this. It seems it should work by analogy with this: struct Test{ int x=2; int foo(){ return x; } int call(alias a)(){ return a(); } pragma(msg,Test().call!foo); } So I have implemented it. :) https://github.com/tgehr/d-compiler/commit/ea7cdcc43dce8fd05b64b872e83ac28905b81dff
Hah, very cool :) ...
There's also this commit: https://github.com/tgehr/d-compiler/commit/dc34b0b2529300a05fc00c0fc54d006bc96aa790 I noticed during your nice talk at DConf that I lacked support for lazy void parameters. This was fixed before the end of the talk. :)
 However, there are some funny cases:

 struct A{
     int x;
     this(int x){ this.x=x; }
     int fun(){ return x;}
     int[] caller(T)(T t){
         int[] r;
         r~=T.fun();
         r~=(x++,t).fun();
         return r;
     }
 }
 struct B{
     alias fun = A.fun;
 }

 pragma(msg, A(2).caller(B())); // [2,3]
 pragma(msg, A(2).caller(A(10))); // [2,10]

 I.e. this code exhibits 'special' behaviour when the passed type is A.
Maybe when I don't follow your argument, but it acts exactly as before when the passed type is A. ...
You are right, the behavior in question is not actually newly introduced. It's probably fine.
 It must work by default, otherwise libraries can accidentally not
 support it. I think it should just be inferred. I.e. if the template
 instance needs access to the context, that requirement is propagated
 to the caller, otherwise it isn't. The arguments that Kenji raises
 against it seem to be DMD-specific, no?
I must admit that I don't fully understand Kenji's explanation either. ...
Well, one argument seems to be that it is clearly possible to create somewhat weird situations if both inference and introspection happen at the same time. I handle all forward reference issues in an unified manner and therefore shouldn't run into major problems there.
 I share it. The main limitation that needs fixing is capture of
 multiple contexts. Unfortunately I have not gotten around to
 implementing multiple contexts and context inference myself so far.
Yeah, multiple contexts would be the crown jewel. I've been told, though, that the idea that things can have at most one context pointer is very deeply rooted into DMD. I think Walter Bright is also not a fan of the idea, if I understand correctly his main counter-argument is that for untemplated function pointers, when it has 0 contexts it's just a function pointer, when it has 1 context it's a delegate, when it has 2 or more contexts, it's something else, what is it - how would you describe its type in language syntax?
It should still be a delegate. One context pointer suffices to address an arbitrary amount of data. We can just put a void*[n] with all remaining contexts inside the stack frame context.
Jun 03 2016
prev sibling parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 3 June 2016 at 11:11:50 UTC, Vladimir Panteleev wrote:
 For some reason I didn't realize you can also create a value 
 type templated functor by using explicitly nested structs. I 
 had stumbled upon the new feature from 
 https://github.com/dlang/dmd/pull/5518 by accident while 
 working on something similar (heterogenous iteration and 
 range-like composition, e.g. over struct fields). Will post 
 that here later today.
https://github.com/CyberShadow/ae/blob/master/utils/meta/chain.d
Jun 03 2016
prev sibling parent reply QAston <qaston gmail.com> writes:
On Friday, 5 February 2016 at 20:36:20 UTC, ZombineDev wrote:
 Background: I'm working on a transducers proposal for Phobos, 
 which
 generalizes ranges to support push data-flow model (in addition 
 to the current pull-only) enabling composable algorithmic 
 transformations that can be applied to other sources of data 
 such as RX observables[0], signal & slots, channels, etc, as 
 well as the traditional data sources like containers, generator 
 ranges like `iota` and so on.
Take a look at my impl: https://github.com/QAston/transducers-dlang I doubt transducers are a material for stdlib.
Jun 03 2016
parent QAston <qaston gmail.com> writes:
On Friday, 3 June 2016 at 23:25:25 UTC, QAston wrote:
 On Friday, 5 February 2016 at 20:36:20 UTC, ZombineDev wrote:
 Background: I'm working on a transducers proposal for Phobos, 
 which
 generalizes ranges to support push data-flow model (in 
 addition to the current pull-only) enabling composable 
 algorithmic transformations that can be applied to other 
 sources of data such as RX observables[0], signal & slots, 
 channels, etc, as well as the traditional data sources like 
 containers, generator ranges like `iota` and so on.
Take a look at my impl: https://github.com/QAston/transducers-dlang I doubt transducers are a material for stdlib.
Few notes: my impl is finished, just lacks some config options, like easy allocator configuration. It extends OutputRange interface for push operations instead of inventing a new one. You could use that library and just plug your rx impl.
Jun 03 2016