www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How can you read and understand the source of *naryFun in functional.d?

reply Tom <tom gmail.com> writes:
Hi,

When I saw the ways to use std.algorithem's functions, I have noticed that the
input lambda's can be writen as strings. Somewhat like the pythonic "exec". I
went to the source of this feature in functional.d
("https://github.com/D-Programming-Language/phobos/blob/master/std/functional.d").
The functions unaryFun and binaryFun. Is there a way I can read them and
understand them easily? or maybe I missed something?
Jan 29 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Tom:

 The functions unaryFun and binaryFun. Is there a way I can read them and
 understand them easily? or maybe I missed something?
If you learn D well, you will be able to read them. But even if you know D they aren't easy to read. It's not easy stuff, that's standard-library-grade material :-) Bye, bearophile
Jan 29 2011
prev sibling next sibling parent reply Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:
Tom napisa=B3:


 When I saw the ways to use std.algorithem's functions, I have noticed tha=
t the
 input lambda's can be writen as strings. Somewhat like the pythonic "exec=
". I
 went to the source of this feature in functional.d
 ("https://github.com/D-Programming-Language/phobos/blob/master/std/functi=
onal.d").
 The functions unaryFun and binaryFun. Is there a way I can read them and
 understand them easily? or maybe I missed something?
The standard library implementation must cater for a lot of corner-cases. B= ut the essence is this: template binaryFun(string expr) { auto binaryFun(T, U)(T a, U b) { return mixin(expr); } } unittest { assert (binaryFun!"a+b"(1,2) =3D=3D 3); assert (binaryFun!"a-b"(1,2) =3D=3D -1); } The magic happens at the mixin line. It takes any expression or statement i= n string form and compiles it in context of the function. Unlike pythonic e= xec, the string must be known at compile-time. --=20 Tomek
Jan 29 2011
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Andrei, I think there's a lurking bug in unaryFunImpl, on line 53:
            enum testAsExpression = "{ ElementType "~parmName
                ~"; return ("~fun~");}()";
            enum testAsStmts = "{"~ElementType.stringof
                ~" "~parmName~"; "~fun~"}()";

Notice the second enum, I think it's missing a ";" after "~fun~".
It might need changing the last chars from "}()"; to ";}()";

That enum is never used in code which explains why the typo hasn't
been caught yet.
Jan 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
P.S. Tom, I'm writing an explanation on how unaryFunImpl works right
now, but it's long. Give me a few more minutes and it'll be done. :)
Jan 29 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I hope this gets pasted right, otherwise I'll have to open a freakin' blog. :)

Up until a few months ago, I had no experience with templates whatsoever.

At first, I started using function templates, because they were the
easiest to grasp. Or so I thought. Basically I just used function
templates as if they are a special kind of function that takes any
types of arguments. Sort of like having a regular Python function in
your D code. I didn't quite grasp the point of templates back then.

What really opened my eyes was the realization that in D there's a
distinct notion between manipulating types at compile-time and
manipulating types at run-time.

Once I've realized that, I knew why template arguments have to be
known at compile time, because templates are a compile-time feature.
And templates aren't really a "type" at runtime, they're only visible
at compile-time. They're just "placeholders" with their own scope, and
they hold declarations in that scope. And to access those declarations
(or fields would be a proper word), you generally use the dot
operator. There's also the eponymous trick where you can use the
template instantiation itself to refer to one of its fields (this is
useful when you're only interesting in getting back one result, as it
saves you the burden of having to explicitly specify that field when
instantiating the template).

Then, I've realized that templates can be used almost anywhere. A
template can have fields which are templated structs and classes,
which in turn can have templated functions, which in turn have
templated declarations in them, and the functions can have templated
return values.

So, let's talk about unaryFun now. unaryFun is a template which has
the ability to construct a function, by simply passing a string as an
argument. It's named *unary* because it constructs a function that
takes only a single argument. You pass the string in the syntax form
where it's as if you're only writing the return expression. unaryFun
has the ability to convert some special predefined characters to
parameters of a function. For example, it recognizes that 'a' should
be replaced with the argument to the newly constructed function.

Here's a simple instantiation of unaryFun:
unaryFun!("(a & 1) == 0")

unaryFun takes the string argument and might create this function:


Let me walk you through unaryFun line by line (I'm using DMD 2.051).
First, our example code:

alias unaryFun!("(a & 1) == 0") isEven;
assert(isEven(2) && !isEven(1));

Now open std.functional.

Line 40: unaryFun instantiates unaryFunImpl. You'll notice it passes
the same arguments to unaryFunImpl as the ones it got from the calling
code. We passed a string, and the two other arguments are set to
'false' and 'a' by default, since we didn't specify them. Once the
unaryFunImpl template is instantiated, we can select any of its fields
with the dot operator. In this case, we're interested in the result
field.
So.. what does the alias do? It's a nice thing called the eponymous
trick, where you can use the name of the template itself as the result
of the instantiation. In other words, if you alias the name of a
template to some declaration, when you instantiate that template you
won't have to use the dot prefix to get a single result.
A quick example without the eponymous trick:
template foo(int value)
{
    int result = value;
}
void main()
{
    writeln( foo!(4).result );
}

And here's one with the eponymous trick:
template foo(int value)
{
    int result = value;
    alias result foo;  // we've aliased the foo template name to the
result field
}
void main()
{
    writeln( foo!(4) );  // no need to explicitly use .result here
}

This is very handy when you're only interested in getting one field of
a template.
Now let's see how unaryFunImpl works.

Line 45: This is a *compile-time* check. It tries to determine if the
fun argument is a string.
There's an else clause way down on line 98, which, in case the fun
argument isn't a string, simply aliases the fun argument to the name
'result'.
So, why do we need to check if the 'fun' argument is a string? Well,
because 'fun' in the template
signature of unaryFunImpl is defined as an 'alias' type. An alias is a
parameter type that can be a string, a value, a type, a function, or
any other valid D symbol name. The reason alias is used in
unaryFunImpl is because we can call unaryFunImpl in different ways. In
our own code we've used:

void main()
{
    alias unaryFun!("(a & 1) == 0") isEven;
}

In this case the 'static if' check on line 45 will pass since the
argument is a string. But we could have called unaryFun like so:

void foo(int x)
{
}
void main()
{
    alias unaryFun!(foo) isEven;
}

In this case we're passing a function parameter at compile time. The
'static if' check will evaluate to false since the fun parameter is an
actual function, not a string. This brings us to the else clause on
line 98. Since the whole purpose of unaryFun is to construct a
function out of a string, it doesn't make sense to do any work if
'fun' isn't a string. And if 'fun' isn't a string, it's likely a
function, so we alias it to 'result' and we're done. 'fun' could also
be a delegate (a function pointer), or a struct with a defined opCall
method (these kinds of structs can be called as if they were
functions).

Ok let's continue. We've passed a string ("(a & 1) == 0"), so our
static if check on line 45 evaluates to true. Now comes the fun part,
a template within a template!

Line 47: 'Body' is a template that is only used within unaryFunImpl,
which explains why it's nested inside it. It takes one compile-time
argument, in this case that argument will be a 'type' argument (such
as int/double/etc).
Let's just assume for a second that we've instantiated the Body
template with an int argument ( Body!(int) ).
On line 51 and 52 there are two enum declarations:
    enum testAsExpression = "{ ElementType "~parmName ~"; return ("~fun~");}()";
    enum testAsStmts = "{"~ElementType.stringof ~" "~parmName~"; "~fun~"}()";
Note how we're able to use the arguments from the enclosing template.
I'm refering to the use of 'parmName' and 'fun', which are arguments
belonging to unaryFunImpl.
Recall that our initial example code is this:
    unaryFun!("(a & 1) == 0")

We can now easily expand these two enums to something more meaningful:
    enum testAsExpression = "{ ElementType a; return ((a & 1) == 0);}()";
    enum testAsStmts = "{ int a; (a & 1) == 0; }()";

It might be a bit hard to read. These two enums are two versions of a
function (in string form) that will be tested to see if they're valid
and compilable as actual functions.
Notice that in the string we don't specify the function name, or the
function arguments for that matter. This is called a 'lambda', an
anonymous function which doesn't have a name.
Also notice that at the end of the string there's an open and closed
paranthesis '()'. Keep that in mind for now.

Moving to line 57:
Another static if check. __traits is a compiler feature, similar to a
pragma, where you can do some compiler intrinsics and get information
from the actual compiler itself. __traits takes two arguments, the
first can be one of the predefined keywords that defines what you're
asking the compiler to check for (e.g. "give me all members of some
struct", or "hey compiler, does the following code compile?", see
http://www.digitalmars.com/d/2.0/traits.html for more). The second
argument is what the compiler will do intrinsics on. Here, we're using
the 'compiles' keyword, which asks the compiler if it can succesfully
compile an expression. That expression can be anything that's a valid
expression.

This is where our enums come to play. What we want to do, is convert
the enums from their string representation into actual D code that is
compiled in. This is where 'mixin' comes into play. mixin takes any
string, and compiles it in as if it were any regular D code.
For example:
void main()
{
    mixin("int x;");
    writeln( x );  // x exists now

    mixin(" struct Foo { } ");
    Foo foo;  // foo is a Foo type
}

So with the power of __traits(compiles..) and mixin('string'), we're
able to compile strings as if they were D code, and then check if that
code is actually valid. Here's some examples of the two in work:

    static if (__traits(compiles, mixin("int x;")))
    {
    }

    static if (__traits(compiles, mixin("class Foo { void bar() { } }")))
    {
    }

    static if (__traits(compiles, mixin(" { int a; a++; return a; } ")))
    {
    }

Notice the last example was an anonymous function (a lambda).
Ok, we were on line 56, the static if is checking if the enum
'testAsExpression' is compilable as D code. If it's not compilable (if
we passed an invalid string representation of a function in our 'fun'
argument), then we reach the else clause on line 67. We'll hit a
static assert here, which means all compilation stops and you'll get
an error. Here's an example that does just that:

import std.functional;
void main()
{
    string brokenFunctionRepresentation = "(a & 1) == b";
    alias unaryFun!(brokenFunctionRepresentation) isEven;
    isEven(4);
}

 D:\DMD\dmd2\windows\bin\..\..\src\phobos\std\functional.d(74): Error: static
assert  "Bad unary function: " ~ brokenFunctionRepresentation ~ " for type " ~
"int"c
 D:\DMD\dmd2\windows\bin\..\..\src\phobos\std\functional.d(87):       
instantiated from here: Body!(int)
 testing45.d(10):        instantiated from here: result!(int)
But, we *did* pass a valid argument, so the static if check will evaluate to true. Line 59: This enum simply expands to: enum string code = "return ( (a & 1) == b );"; The next line can be difficult to grasp, so let me walk you through it. Here again we're using mixin to compile code that's in string form. Remember the value of 'testAsExpression': enum testAsExpression = "{ ElementType a; return ((a & 1) == 0);}()"; Ok, so how do we compile this in? Well, remember that ElementType is a type argument passed to the Body template (the one we're in now). And remember that I've said that for the moment we're pretending that that argument is an int. This means when mixed in the expression becomes: { int a; return ((a & 1) == 0);}(); Remember when I discussed the two enums and how they have the open and closed paranthesis at the end? These are used to call the function that was just defined. It's a lambda function, so we have no other immediate way of calling it, since a lamda doesn't have a name. Here's some lambda examples that are immediately called: import std.functional; import std.stdio; void main() { enum lamdaWithCall = " { return 10 * 20; }()"; writeln( mixin(lamdaWithCall) ); // writes 200 enum lambdaWithoutCall = " { return 10 * 20; }"; writeln( mixin(lambdaWithoutCall)() ); // notice paranthesis outside the mixin } Back to our code at line 60. We're mixing in the enum that has a lambda and a call. This will be compiled, which means the function will be called and it will return a value. 'typeof' returns the type of its argument. Since our function will most likely return an int, typeof will simply return 'int' (In other words, if the function returned '1', it's no longer 1, it's the type of 1 which is an int). Now on the left you see we're using an alias. We're aliasing the int type to the 'ReturnType' symbol name. That's it for the Body template. It's function is to parse a string, try to construct and evaluate a function from the string argument UnaryFunImpl passes in, and in return it defines two important declarations that our unaryFunImpl template will need: 1. The actual code that will be used as the function definition ('enum string code'). 2. The return type of this function ('ReturnType'). Back to our UnaryFunImpl template. We've just pretended that Body was called, but now UnaryFunImpl will call it for the first time. Line 78: This static if checks if our function takes a reference as an argument. We have to manually let UnaryFunImpl know about this (actually we have to let UnaryFun know about it, it passed that argument down to UnaryFunImpl), which is the purpose of the 'byRef' argument. Here's an example demonstrating the differences: import std.functional; import std.stdio; void main() { alias unaryFun!("a = 2") byValue; alias unaryFun!("a = 2", true) byReference; int x = 5; byValue(x); assert(x == 5); byReference(x); assert(x == 2); } After the static if check on line 78 comes the complicated function definition. The full definition is: Body!(ElementType).ReturnType result(ElementType)(ref ElementType a) Let's chop this down into pieces: 'result' is a templated function. It takes a compile-time argument ElementType, and a runtime argument 'a' which is of type reference of ElementType. The return type is specified as: Body!(ElementType).ReturnType This is one of the two fields that the Body template defines for us after we instantiate it. Remember at the very beginning I've noted said you can use templates on return values, well this is one such use case. So, the compiler instantiates the Body template with the ElementType argument, and then looks at the field 'ReturnType'. Remember, 'Body.ReturnType' is the actual type of the return value of that function string that was mixed in. But, where is ElementType defined? We have to go back and look at the result function definition again: Body!(ElementType).ReturnType result(ElementType)(ref ElementType a) Okay, ElementType is figured out from the compile-time argument to result. So now you're probably wondering "wait a minute. We never passed any argument to result!". That's right. We're not calling result yet. We're just *defining a templated function*. Remember that a template is nothing more then a set of declarations. The 'result' templated function is just a declaration within UnaryFunImpl. Now, onto line 82: Here we instantiate Body with the ElementType argument, we take it's 'code' field (remember Body is used to make two fields, 'code' and 'ReturnType'), and we mix in the code. This was the case if our function took a ref parameter. Now, let's jump back to line 40, to unaryFun itself: alias unaryFunImpl!(funbody, byRef, parmName).result unaryFun; Here, unaryFun instantiates unaryFunImpl with a string ('funbody' here, although unaryFunImpl uses 'fun' as the name) that is going to be constructed as a lambda and then checked for validity using the Body template. We also pass two other arguments which are usually set to default. After the template is instantiated, we want its 'result' field. 'result' is a full-fledged templated function which we have just been talking about. We alias this new function to 'unaryFun', and this is our eponymous template trick where we alias some symbol to the template name. Now, up the stack, we've made a complete trip so let's see our original code again: alias unaryFun!("(a & 1) == 0") isEven; assert(isEven(2) && !isEven(1)); We instantiated unaryFun with a string, it calls unaryFunImpl and passes it the string and some default arguments, unaryFunImpl in turn calls Body to parse the string and construct the code and return type. unaryFunImpl then proceeds to construct the 'result' templated function. unaryFun in turn takes this function and aliases it to itself. And now we take the template instantiation (which is aliased to this new function), and alias *that* to isEven. isEven is now equal to the 'result' templated function. You can prove it, too! Here's another instantiation of unaryFun: import std.functional; import std.stdio; void main() { alias unaryFun!(" 2 / a") twoDividedByArg; assert(twoDividedByArg!(int)(1) == 2); assert(twoDividedByArg!(double)(0.5) == 4); assert(twoDividedByArg(1) == 2); assert(twoDividedByArg(0.5) == 4); } The reason the last two calls work is because D has a shorthand for calling templated functions. If the runtime argument can be determined at compile-time (in this case these are integral literals, so the answer is yes), then you don't have to use the bang-parenthesis !() syntax to call a templated function. I still haven't explained two things. One is how 'result' is constructed when called with a function that doesn't take ref parameters. Let's go to line 87 in functional.d. This is almost the same definition of 'result' as the one in the static if body. Except this time it doesn't take a reference a argument. On the next line there's some weird mixin: mixin("alias __a "~parmName~";"); The reason it's weird is that I haven't explained what the *third* argument to unaryFun is. The "string parmName" parameter. This one is used in case you want to use a different character to designate is as the parameter in your function literal string. Here's how this works: import std.functional; import std.stdio; void main() { alias unaryFun!(" 2 / a", false) twoDividedByArg; alias unaryFun!(" 2 / bcd", false, "bcd") twoDividedByArgB; assert(twoDividedByArg!(int)(1) == 2); assert(twoDividedByArgB!(double)(0.5) == 4); } So essentially it allows us to use any string of characters instead of "b". Now these two mixins will make a lot more sense: mixin("alias __a "~parmName~";"); mixin(Body!(ElementType).code); Remember, 'Body.code' is the code of our function literal. If our function uses 'bcd' instead of the name that was explicitly defined in 'result', then compilation will fail. See the 'result' definition on line 87: Body!(ElementType).ReturnType result(ElementType)(ElementType __a) So how do we make our code use any character of strings as the argument and yet it has to refer to __a? By using an alias, of course! mixin("alias __a "~parmName~";"); expands to: mixin("alias __a bcd;"); And mixin(Body!(ElementType).code); expands to: return (2 / bcd); The whole 'result' function then reads: Body!(ElementType).ReturnType result(ElementType)(ElementType __a) { alias __a bcd; return (2 / bcd); } And that's the whole story! I hope this wasn't too difficult to follow. unaryFun might actually be a rather complicated example to explain. But the good news is, once you understand one template like unaryFun, you'll start grasping the rest in no-time. The reason why Phobos has such difficult to read templates is because writing templates in D is very easy. Before you know it, your templates end up being hundreds of lines long. But the benefit over C++ is that you can *read* the template syntax with little to no missunderstanding. The complications arise when the templates have a lot of nested templates, and when they make calls to a lot of other templates. But template calls are easy to trace. Dissasembling C++ syntax however is not. Well, unless your name is Andrei Alexandrescu, that is. :)
Jan 29 2011
next sibling parent reply "Akakima" <akakima33 gmail.com> writes:
"Andrej Mitrovic" <andrej.mitrovich gmail.com>

I hope this gets pasted right, otherwise I'll have to open a freakin' blog. 
:)

 Up until a few months ago, I had no experience with templates whatsoever.
That's very good stuff! Well written and very helpful. May i suggest you to put that somewhere in a [ maybe new ] section of digitalmars site reserved to other documentation like that. As a newbee, i can tell you that this would help. Thanks
Jan 29 2011
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/29/11, Akakima <akakima33 gmail.com> wrote:
 "Andrej Mitrovic" <andrej.mitrovich gmail.com>

I hope this gets pasted right, otherwise I'll have to open a freakin' blog.

:)

 Up until a few months ago, I had no experience with templates whatsoever.
That's very good stuff! Well written and very helpful. May i suggest you to put that somewhere in a [ maybe new ] section of digitalmars site reserved to other documentation like that. As a newbee, i can tell you that this would help. Thanks
Thanks. Well, I'm not the maintainer of the D website. Maybe I could put it up on wiki4d (http://prowiki.org/wiki4d/wiki.cgi?FrontPage). Someone will have to direct the newbies there.
Jan 29 2011
prev sibling parent reply Trass3r <un known.com> writes:
Save this somewhere or it will be lost here ;)
Jan 29 2011
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Done: http://prowiki.org/wiki4d/wiki.cgi?D__Tutorial/D2Templates

I've replaced my obvious English mistakes with slightly less obvious
ones. Kinda wish prowiki had a nicer code display though (syntax
highlighting would be nice).
Jan 29 2011
parent reply Tom <tom gmail.com> writes:
I would like to thank you for that such a great explanation of the
Template-based
programming in that unary example. I think it's great you have uploaded that to
the wiki4d, and it's definite will help a lot of people that come from other

templating, to understand that design pattern.
And with that close tracking to the functional.d code, I thing but not sure that
we've found some bugs. I will do some testing later today, and if I will see
what
I think to be wrong, I'll update.

and again,
 Thanks
Jan 30 2011
next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Jan 30, 2011 at 12:03, Tom <tom gmail.com> wrote:
 I would like to thank you for that such a great explanation of the
Template-based
 programming in that unary example. I think it's great you have uploaded that to
 the wiki4d, and it's definite will help a lot of people that come from other

 templating, to understand that design pattern.
 And with that close tracking to the functional.d code, I thing but not sure
that
 we've found some bugs. I will do some testing later today, and if I will see
what
 I think to be wrong, I'll update.
If anyone is interested, I coded a n-args version of unaryFun/binaryFun called naryFun. If you use 'a', 'b', ... as args names, it can automatically determine the templated function arity. So naryFun!"a + b * c - sin(d-a)" is a 4-args template function. It's there: http://svn.dsource.org/projects/dranges/trunk/dranges/docs/functional.html (look for naryFun at the bottom) code is here: http://dsource.org/projects/dranges/browser/trunk/dranges/functional.d It could be simpler now: when I did it 18 months ago, CTFE wasn't so powerful. The looping templates can now easily be done with a simple foreach. This has been on my todo list for quite some time... Philippe
Jan 30 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/30/11, Philippe Sigaud <philippe.sigaud gmail.com> wrote:
 If anyone is interested, I coded a n-args version of
 unaryFun/binaryFun called naryFun. If you use 'a', 'b', ... as args
 names, it can automatically determine the templated function arity.
Is this going in the next release? There's an nary template in std.functional right now but it's commented out. It does look like an older attempt that probably didn't work.
Jan 30 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Wait, why won't this compile?:

template foo(int value)
{
    int result;
    alias result foo;
}

void main()
{
    writeln( foo!(4) );
}
Jan 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Oh yeah, it's that damn bug where you can't have multiple
declarations, otherwise aliases don't work.

I still don't know if that bug will ever be fixed or if I should add
that to the tutorial.
Jan 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Well I've updated the tutorial anyway. I just left a note and used an
templateImpl companion template.
Jan 29 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Well now I'm feeling really stupid because apparently this is how it
*should* work, at least according to the docs.

I'd love to see 'alias symbol this' work even with multiple
declarations. It'd be nicee..
Jan 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Woops, sorry about the missing function definition after the line:
"unaryFun takes the string argument and might create this function:"

This should be maybe:
int unaryFun(int a)
{
    return (a & 1) == 0;
}

But that's oversimplification since that instantiation is a function
template itself, not a specific function.
Jan 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Note when I said "But, we *did* pass a valid argument," I was refering
to the original code example I gave at the top where we called
unaryFun for the first time.
Jan 29 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Fix: So essentially it allows us to use any string of characters instead of
"b".

I've meant "a" here, not "b".
Jan 29 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 29 January 2011 16:01:06 Andrej Mitrovic wrote:
 Well now I'm feeling really stupid because apparently this is how it
 *should* work, at least according to the docs.
 
 I'd love to see 'alias symbol this' work even with multiple
 declarations. It'd be nicee..
It's how it should work according to TDPL. It'll get fixed It's just a question of when. - Jonathan M Davis
Jan 29 2011