digitalmars.D - Getting the names of all the top-level functions in module

reply Stefan Koch <uplink.coder googlemail.com> writes:
Hello there,

I was just working on an example for my upcoming library 
`core.reflect`, and I wanted to contrast it with the current 
template way of doing things.
The issue at hand is getting the names of all top-level (free) 
functions in a module as a `string[]`.

The way I found to do this with templates involves a bit of 
knowledge of `is` expressions. Here it is:

```d
template functionNames (alias M)
{
     static const functionNames = ()
     {
         string[] names;
         foreach(m;__traits(derivedMembers, M))
         {
             alias sym = __traits(getMember, M, m);
             static if (is(typeof(sym) T) && is(T F == function))
             {
                 names ~= __traits(identifier, sym);
             }
         }
         return names;
     } ();
}
```

Note that at the call site you have to instantiate the helper 
template as:
`static const fNames = functionNames!(mixin(__MODULE__));`
The parens around the mixin are not optional; it will not parse 
without them.

Now comes the way to do that with core.reflect.

```d
// The annotation prevents code-gen, since the `nodeFromName` builtin
// which is called in there is not a real function and therefore has
// no body for the code generator to look at.
@(core.reflect)
string[] getFreeFunctionNamesFromModule(
     string module_ = __MODULE__,
     ReflectFlags flags = ReflectFlags.NoMemberRecursion,
     immutable Scope _scope = currentScope()
)
{
     string[] result;
     auto mod_ = nodeFromName(module_, flags, _scope);
     if (auto mod = cast(Module) mod_)
     foreach(member;mod.members)
     {
         if (auto fd = cast(const FunctionDeclaration) member)
         {
             result ~= fd.name;
         }
     }

     return result;
}
```

And you use it like this:
`static const functionNames = getFreeFunctionNamesFromModule;`
I am using a parenthesis-less call to make it look like a magic 
builtin, but in reality the only bits of magic used are the two 
builtins `nodeFromName` and `currentScope`,

whereas the template above used 5 magic constructs:
  - 2 different types of pattern-matching `is` expressions
    and
  - 3 different `__traits` expressions.
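
For readers unfamiliar with the two `is` forms counted above, here is a minimal sketch that isolates them. The names `isFreeFunction`, `bar` and `notAFunction` are hypothetical and exist only for this illustration; they are not part of the code in this post.

```d
// Hypothetical symbols, used only to exercise the check below.
void bar(int x) {}
int notAFunction;

template isFreeFunction(alias sym)
{
    // Form 1: is(typeof(sym) T) succeeds if sym has a type and binds it to T.
    // Form 2: is(T F == function) succeeds only if T is a function type
    //         (F is then bound to the parameter types).
    static if (is(typeof(sym) T) && is(T F == function))
        enum isFreeFunction = true;
    else
        enum isFreeFunction = false;
}

static assert( isFreeFunction!bar);
static assert(!isFreeFunction!notAFunction);
```

The checks are `static assert`s, so compiling the file (for example with `dmd -c`) is enough to exercise them.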
Sep 21
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 13:29:44 UTC, Stefan Koch wrote:
 Hello there,
 Note that on the "callsite" you have to instantiate the helper 
 template as:
 `static const fNames = functionNames!(mixin(__MODULE__))`;
Correction: you can't actually use it like that, because that introduces a symbol into the module you are trying to reflect over, and the compiler will complain about a `circular reference` to `fNames`. You have to use it directly; you cannot assign it to a module-level variable. Ah well :-)
Sep 21
prev sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Tuesday, 21 September 2021 at 13:29:44 UTC, Stefan Koch wrote:
 Note that on the "callsite" you have to instantiate the helper 
 template as:
Not true. You can pass module names as plain names now, no more need for the special mixin, with the exception of getting the current module (just because __MODULE__ is a string), but if this were a library function, you wouldn't be using the current module anyway.
 whereas the template above used 5 magic constructs.
  - 2 different types of pattern matching is expressions.
    and
  - 3 different __traits expressions.
Not true. There's two extremely basic traits and one is expression actually necessary here. You overcomplicated it again.

This works today:

import arsd.simpledisplay; // or whatever

enum string[] functionNames(alias M) = ()
{
        string[] names;
        foreach(memberName; __traits(derivedMembers, M))
            static if (is(typeof(__traits(getMember, M, memberName)) == function))
                names ~= memberName;
        return names;
}();

pragma(msg, functionNames!(arsd.simpledisplay));

As you can see, there's one trait, and one is expression, to see if it is a function. Not that complicated.
 you have to use it directly you cannot assign it to a module 
 level variable.
Also not true.

---
module qwe.whatever;

enum string[] functionNames(alias M) = ()
{
        string[] names;
        foreach(memberName; __traits(derivedMembers, M))
            static if (is(typeof(__traits(getMember, M, memberName)) == function))
                names ~= memberName;
        return names;
}();

const fNames = functionNames!(qwe.whatever);

void main() {
        import std.stdio;
        writeln(fNames);
}

void foo() {}
---

$ dmd qwe
$ ./qwe
["main", "foo"]
$ nm qwe.o | grep functionNames // notice this is empty btw
$
Sep 21
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 14:20:50 UTC, Adam D Ruppe wrote:
 On Tuesday, 21 September 2021 at 13:29:44 UTC, Stefan Koch
 This works today:

 import arsd.simpledisplay; // or whatever

 enum string[] functionNames(alias M) = ()
 {
         string[] names;
         foreach(memberName; __traits(derivedMembers, M))
             static if (is(typeof(__traits(getMember, M, 
 memberName)) ==
             function))
                 names ~= memberName;
         return names;
 }();

 pragma(msg, functionNames!(arsd.simpledisplay));
This does work. Interesting. But had I not posted here, would I have been able to get to the template you posted above? I am not trying to over-complicate anything; this is what I naturally arrive at after spending more than 20 minutes getting it to show me the answer I am looking for.
Sep 21
parent reply Adam D Ruppe <destructionator gmail.com> writes:
Here's one with param names:

----

module qwe;

struct FunctionInfo {
         string name;
         string[] paramNames;
}

enum FunctionInfo[] functionInfo(alias M) = ()
{
         FunctionInfo[] res;
         foreach(memberName; __traits(derivedMembers, M))
             static if (is(typeof(__traits(getMember, M, 
memberName)) Params == __parameters)) {
                 FunctionInfo fi = FunctionInfo(memberName);
                 foreach(idx, param; Params)
                         fi.paramNames ~= __traits(identifier, 
Params[idx .. idx + 1]);
                 res ~= fi;
             }
         return res;
}();

void main() {
         import std.stdio;
         writeln(functionInfo!qwe);
}

void foo(int a, int b){}

----


Of course the weird thing here is the need to slice the params 
tuple to get the identifier. I had to borrow that trick from 
Phobos, but now that I know it, it is very useful.

Otherwise the code is still fairly simple. Only the one is 
expression, since if it doesn't have params it doesn't trigger the 
condition, and all functions have params (even if it is an 
empty set).
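
To make the slicing trick easier to see on its own, here is a minimal sketch for a single function rather than a whole module. `paramNamesOf` and `foo` are hypothetical names for this illustration, and the sketch assumes every parameter is named. Indexing with `Params[idx]` would yield only the parameter's type, while the one-element slice `Params[idx .. idx + 1]` keeps the identifier attached, which is why `__traits(identifier, ...)` is applied to the slice.

```d
// Hypothetical function, used only for illustration.
void foo(int a, string b) {}

// Collects the parameter names of one function symbol at compile time.
enum string[] paramNamesOf(alias fn) = ()
{
    string[] names;
    // Binds Params to the parameter tuple if fn has a function type.
    static if (is(typeof(fn) Params == __parameters))
    {
        static foreach (idx; 0 .. Params.length)
            // A one-element slice still carries the parameter's name.
            names ~= __traits(identifier, Params[idx .. idx + 1]);
    }
    return names;
}();

static assert(paramNamesOf!foo == ["a", "b"]);
```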
Sep 21
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 15:07:35 UTC, Adam D Ruppe wrote:
 Here's one with param names:

 ----

 module qwe;

 struct FunctionInfo {
         string name;
         string[] paramNames;
 }

 enum FunctionInfo[] functionInfo(alias M) = ()
 {
         FunctionInfo[] res;
         foreach(memberName; __traits(derivedMembers, M))
             static if (is(typeof(__traits(getMember, M, 
 memberName)) Params == __parameters)) {
                 FunctionInfo fi = FunctionInfo(memberName);
                 foreach(idx, param; Params)
                         fi.paramNames ~= __traits(identifier, 
 Params[idx .. idx + 1]);
                 res ~= fi;
             }
         return res;
 }();

 void main() {
         import std.stdio;
         writeln(functionInfo!qwe);
 }

 void foo(int a, int b){}
Here's the core.reflect version for that:

```d
struct FunctionInfo
{
    string name;
    string[] params;
}

@(core.reflect)
const(FunctionInfo)[] FunctionsInfoFromModule(
    string module_ = __MODULE__,
    ReflectFlags flags = ReflectFlags.NoMemberRecursion,
    immutable Scope _scope = currentScope()
)
{
    const(FunctionInfo)[] result;
    auto mod_ = nodeFromName(module_, flags, _scope);
    if (auto mod = cast(Module) mod_)
    foreach(member;mod.members)
    {
        if (auto fd = cast(const FunctionDeclaration) member)
        {
            string[] paramNames;
            // because that's how dmd looks and I didn't smooth that part out yet;
            // it's going to be nicer in the future
            foreach(p;fd.type.parameterTypes)
            {
                paramNames ~= p.identifier; // TODO maybe rename identifier to name?
            }
            result ~= FunctionInfo(fd.name, paramNames);
        }
    }
    return result;
}
```

Alternatively you could of course just use the `FunctionDeclaration` object ;)
As the information is already bundled.
Sep 21
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 15:18:01 UTC, Stefan Koch wrote:
 On Tuesday, 21 September 2021 at 15:07:35 UTC, Adam D Ruppe 
 wrote:
 Here's one with param names:

 ----
I've done a little benchmark of the core.reflect version vs. the template version.

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| 0.88 | 0.99 | 1.11 | 1.17 | 1.44 | 1.62 | 1.69 | 1.90 |

Which corresponds to this graph.

![perfgraph](https://i.ibb.co/BVHy1LQ/graph.png)

Blue is Adam's template; orange is core.reflect.
X is correlated with N, the number of functions being reflected, and Y is correlated with compile time.
Sep 21
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 17:15:23 UTC, Stefan Koch wrote:
 On Tuesday, 21 September 2021 at 15:18:01 UTC, Stefan Koch 
 wrote:
 On Tuesday, 21 September 2021 at 15:07:35 UTC, Adam D Ruppe 
 wrote:
 Here's one with param names:

 ----
I've done a little benchmark the core.reflect version vs the template version.
Scratch that. It turns out I should have tested the output.
`nodeFromName` will return an Import node in this case, so the dynamic cast to `Module` fails and CTFE never ran to form the data structure.
Hence the previous numbers are garbage.

Here is an updated version. We can see that the CTFE overhead is clearly dominating here.

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| | 0.82 | 0.95 | 1.01 | 1.20 | 1.21 | 1.29 | 1.38 |

So those are the proper numbers. And instead of a 10x change we have what almost seems to be a constant difference.
The simple truth of this is that the CTFE concatenation is the weak link here.

![plot](https://i.ibb.co/KWpd4vS/real-diff.png)

Blue is the template; red is core.reflect.
Sep 21
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 21 September 2021 at 22:53:26 UTC, Stefan Koch wrote:
 On Tuesday, 21 September 2021 at 17:15:23 UTC, Stefan Koch 
 wrote:
 On Tuesday, 21 September 2021 at 15:18:01 UTC, Stefan Koch 
 wrote:
 On Tuesday, 21 September 2021 at 15:07:35 UTC, Adam D Ruppe 
 wrote:
 Here's one with param names:

 ----
I've done a little benchmark the core.reflect version vs the template version.
Scratch that. It turns out I should have tested the output. The simple truth of this is that the CTFE concatenation is the weak link here.
So if CTFE is the limiting factor, how does the graph look when CTFE is taken out of the picture?

![](https://i.ibb.co/bJsshPZ/without-ctfe.png)

That's how. core.reflect looks linear, while the template shows signs of becoming superlinear.
However, I cannot test it on bigger N because there is a restriction on how many synthetic symbol names the compiler can create ...

```d
// Perturb the name mangling so that the symbols can co-exist
// instead of colliding
s.localNum = cast(ushort)(originalSymbol.localNum + 1);
assert(s.localNum);         // 65535 should be enough for anyone
```

Ah well :-)
It should be noted that core.reflect can be used with much higher N because it doesn't create symbols in that way.
Sep 21