digitalmars.D.learn - Dynamically binding to D code using extern(D)

Hipreme <msnmancini hotmail.com> writes:

I write this post as both a learning tool, a question and an 
inquiry.

There are just a lot of drawbacks in trying to do function 
exporting while using D.

That interface is absurdly confusing, and that is probably why I've 
never seen a project here that uses extern(D) together with 
a DLL.

While generating my DLL, there are a lot of pitfalls 
that I can fall into.

**Simple Function**

```

module something;
extern(D) export int sum(int a, int b){return a + b;}
```

The correct way to bind to that function would be:

```

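module app;
import core.demangle;
import core.sys.windows.windows; // HMODULE, LoadLibraryA, GetProcAddress
import std.string : toStringz;

int function(int a, int b) sum;

void main()
{
     // "something.dll" is a placeholder for wherever module something was built
     HMODULE someDll = LoadLibraryA("something.dll");
     // GetProcAddress expects a NUL-terminated C string
     sum = cast(typeof(sum))GetProcAddress(someDll,
         mangleFunc!(typeof(sum))("something.sum").toStringz);
}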

```

And that should be it for loading a simple function.
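
As a rough illustration of what `mangleFunc` produces for the example above, the sketch below prints the symbol name that would be handed to `GetProcAddress`. It only uses `core.demangle`, so it runs without any DLL. The helper name `symbolFor` is made up for this example and is not part of druntime.

```d
import core.demangle : mangleFunc;
import std.stdio : writeln;

// Hypothetical helper: mangled name of a D function, given its
// function pointer type F and its fully qualified name.
string symbolFor(F)(string fullyQualifiedName)
{
    return mangleFunc!F(fullyQualifiedName).idup;
}

void main()
{
    alias SumFn = int function(int, int);
    // _D + length-prefixed name parts + F (D linkage) + ii (parameters) + Z + i (return type)
    writeln(symbolFor!SumFn("something.sum")); // prints _D9something3sumFiiZi
}
```

The module name and the full function type are both baked into the symbol, which is why the loader has to know them.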

Now, let's make our case a bit more complicated:

**Overloaded function**


```

module something;

extern(D) export int add(int a, int b)
{
     return a + b;
}

extern(D) export float add(float a, float b)
{
    return a+b;
}
```


For loading those functions, the correct way would be

```

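module app;
import core.demangle;
import core.sys.windows.windows; // HMODULE, LoadLibraryA, GetProcAddress
import std.string : toStringz;

int function(int a, int b) sumInt;
float function(float a, float b) sumFloat;

// Local overload set that forwards to the loaded pointers
int sum(int a, int b){return sumInt(a, b);}
float sum(float a, float b){return sumFloat(a,b);}

void main()
{
     // "something.dll" is a placeholder for wherever module something was built
     HMODULE dll = LoadLibraryA("something.dll");
     // The exported name is something.add; the pointer type selects the overload
     sumInt = cast(typeof(sumInt))GetProcAddress(dll,
         mangleFunc!(typeof(sumInt))("something.add").toStringz);
     sumFloat = cast(typeof(sumFloat))GetProcAddress(dll,
         mangleFunc!(typeof(sumFloat))("something.add").toStringz);
}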
```

Notice how much the overall complexity starts to increase, as 
there seems to be no way to simply get the overloads, and there 
doesn't seem to be any advantage in using extern(D).
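
For what it's worth, the two overloads do end up as two distinct symbols, because the parameter and return types are part of the mangled name. A minimal sketch, assuming the exported functions are `something.add` as declared above; the alias names are only for illustration:

```d
import core.demangle : mangleFunc;
import std.stdio : writeln;

void main()
{
    alias AddInt = int function(int, int);
    alias AddFloat = float function(float, float);

    // Same fully qualified name, different function types
    auto intName = mangleFunc!AddInt("something.add");
    auto floatName = mangleFunc!AddFloat("something.add");

    writeln(intName);   // prints _D9something3addFiiZi
    writeln(floatName); // prints _D9something3addFffZf
    assert(intName != floatName);
}
```

So the type passed to `mangleFunc` is what picks the overload. The loader still has to spell out each signature by hand.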


**Static Methods**

The only difference from plain functions is that we need to 
pass the class name as part of the fully qualified name, as if it were a module name.
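
A small self-contained check of that claim, as a sketch: the module `demo` and the class `Counter` are invented for this example. It compares the name built by `mangleFunc` with the one the compiler reports through `.mangleof`.

```d
module demo;

import core.demangle : mangleFunc;

class Counter
{
    static int next() { return 1; }
}

void main()
{
    // The class name is just one more component of the fully qualified name.
    auto name = mangleFunc!(typeof(&Counter.next))("demo.Counter.next");
    // Compare against the symbol name the compiler itself assigned.
    assert(name == Counter.next.mangleof);
}
```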


**Static Methods returning user data**

That is mainly the reason I'm writing this post. It made me 
really wonder if I should use extern(D) at all.


This section will use 3 files because, after all, there really is 
a (consistency?) problem


```
module supertest;
import ultratest;

class SuperTest
{
    extern(D) export static SuperTest getter(){return new SuperTest();}
    extern(D) export static UltraTest ultraGetter(){return new UltraTest();}

    import core.demangle;

    pragma(msg, mangleFunc!(typeof(&SuperTest.getter))("supertest.SuperTest.getter"));
    //Prints _D9supertest9SuperTest6getterFZCQBeQx
    pragma(msg, mangleFunc!(typeof(&SuperTest.ultraGetter))("supertest.SuperTest.ultraGetter"));
    //Prints _D9supertest9SuperTest11ultraGetterFZC9ultratest9UltraTest
}
```

```
module ultratest;
class UltraTest{}
```

```
module app;
import core.demangle;

void main()
{
    //???

}
```

As you can see in module supertest, the pattern seems to break 
when returning user data from another module. As far as I know, 
I have no idea how I could get this function, especially because 
you need to know both the module that you're importing the 
function from and the module where the userdata is defined.
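
One way to look at the two names is to run them back through the demangler. In D's mangling scheme, a `Q` followed by letters is a back reference to an identifier that already appeared earlier in the same symbol, so `CQBeQx` in the first name stands for the class `supertest.SuperTest`, and both names follow the same layout: function name, then `F`, `Z`, then the return type. The sketch below just prints the demangled forms; the two strings are copied from the `pragma(msg)` output above.

```d
import core.demangle : demangle;
import std.stdio : writeln;

void main()
{
    // Both strings are copied from the pragma(msg) output in module supertest.
    writeln(demangle("_D9supertest9SuperTest6getterFZCQBeQx"));
    writeln(demangle("_D9supertest9SuperTest11ultraGetterFZC9ultratest9UltraTest"));
}
```

If that reading is right, a loader would need `UltraTest` declared under the same fully qualified name (for instance by importing `ultratest`) so that something like `mangleFunc!(UltraTest function())("supertest.SuperTest.ultraGetter")` can encode the return type. That part is an untested assumption, not something confirmed in this thread.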


It seems pretty insane to work with that.


extern(D) advantages:

-

extern(D) disadvantages:

- Code only callable from D (probably no other language has a 
demangler)
- I don't remember seeing any other code doing that before this 
post, so there is no documentation at all
- You will need to call the demangler to bind to a symbol, 
which, in my project, could cost 15KB for each unique type 
passed to the demangler
- You will need to know the module from which you imported your 
function
- If your function returns userdata from another module, there 
doesn't seem to be any workaround
- Doesn't provide any support for binding overloads, though the 
language supports overloading


extern(C) advantages:

- Code callable from any language as it is absolutely intuitive
- Well documented

extern(C) disadvantages:

- You will need to declare your function pointer as extern(C) or 
it will swap the arguments order.
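
A minimal sketch of that point, with the exported function defined in the same file as a stand-in for a real library symbol. The alias names `SumFn` and `PlainFn` are invented for the example.

```d
import std.stdio : writeln;

// Stand-in for a function exported from a library with C linkage.
extern(C) int sum(int a, int b) { return a + b; }

// Function pointer types carry their linkage; declare them in an extern(C) block.
extern(C)
{
    alias SumFn = int function(int, int);
}
alias PlainFn = int function(int, int); // default extern(D) linkage

void main()
{
    static assert(!is(SumFn == PlainFn)); // different linkage, different types
    SumFn fp = &sum;                      // matches the C linkage of sum
    writeln(fp(2, 3));                    // prints 5
}
```

When the address comes from `GetProcAddress` and is cast, the compiler cannot catch a linkage mismatch, so the pointer declaration is the only place to get it right.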



I have not even gone into the case where I tried overloading 
static methods, which I think would require declaring aliases 
for the static methods' types in order to actually generate a 
mangled name.

I want to know if extern(D) is actually meant to not be touched. 
adr said that his use for that was actually when doing


extern(C):
//Funcs defined here


extern(D): //Resets the linkage to the default one
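
A small sketch of that pattern, checked with `__traits(getLinkage, ...)`; the function names `cSum` and `dSum` are made up for the example.

```d
extern(C):
int cSum(int a, int b) { return a + b; }

extern(D): // back to the default linkage
int dSum(int a, int b) { return a + b; }

// Verified at compile time: the label form applies until the next linkage label.
static assert(__traits(getLinkage, cSum) == "C");
static assert(__traits(getLinkage, dSum) == "D");

void main() {}
```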


So, there are just too many disadvantages to using extern(D) for 
binding to any code. I would like to know where we can get 
more documentation than what I posted here right now (really, 
I've never seen any code binding to extern(D) code). And I do 
believe that is the main reason why people usually don't use 
dynamic libs in D: it is just unviable, as you would need to 
regenerate all the API yourself.

Sep 30 2021

jfondren <julian.fondren gmail.com> writes:

On Thursday, 30 September 2021 at 18:09:46 UTC, Hipreme wrote:
> I write this post as both a learning tool, a question and an 
> inquiry.
>
> There are just a lot of drawbacks in trying to do function 
> exporting while using D.

The terms that people use are a bit sloppy. There are three kinds 
of 'linking' here:

1. static linking, performed during compilation, once. If linking 
fails, the compile fails.
2. dynamic linking (option 1), performed when an executable 
starts up, before your program gains control, by the system 
linker. If linking fails, your program never gets control.
3. dynamic linking (option 2), performed arbitrarily at runtime, 
by your program. If linking fails, you can do whatever you want 
about that.

All of the loadSymbol and 'userdata module' hassle that you're 
frustrated by is from option 2. Option 1 is really the normal way 
to link large shared libraries and there's nothing to it. What 
your code looks like that loads a shared library is just `import 
biglib;`, and the rest of the work is in dub, pkg-config, 
`LD_LIBRARY_PATH`, etc. Phobos is commonly linked in this way.

Pretty much anything that isn't a plugin in a plugin directory 
can use option 1 instead of option 2.

> extern(C) advantages:
>
> - Code callable from any language as it is absolutely intuitive
> - Well documented

You can call scalding water 'hot' even when you're fresh from 
observing a lava flow. People still find the C ABI frustrating in 
a lot of ways, and especially when they encounter it for the 
first time.

But the C ABI rules the world right now, yes. The real advantages 
are

- it 'never' changes
- 'everyone' already makes it easy to use

> extern(C) disadvantages:
>
> - You will need to declare your function pointer as extern(C) 
> or it will swap the arguments order.

- you're limited to using C's types
- you can't use overloading, lazy parameters, default values; you 
can't rely on scope parameters, etc., etc.
- you can't casually hand over GC-allocated data and expect the 
other side to handle it right, or structs with lifetime functions 
that you expect to be called
- very little of importance is statically checked: to use a C ABI 
right you need to very carefully read documentation that needs to 
exist to even know who is expected to clean up a pointer and how, 
how large buffers should be. (I wasn't feeling a lot of the C 
ABI's "absolute intuitiveness" when I was passing libpcre an 
ovector sized to the number of pairs I wanted back rather than 
the correct number of `pairs*3/2`)

Option 2 dynamic linking of D libraries sounds pretty 
frustrating. Even with a plugin architecture, maybe I'd prefer 
just recompiling the application each time the plugins change to 
retain option 1 dynamic linking. Using a C ABI instead is a good 
idea if just to play nice with other languages.

And if you were wanting something like untrusted plugins, a way 
to respond to a segfault in a plugin, like I think you mentioned 
in Discord, then I'd still suggest not linking at all but having 
separate applications and some form of interprocess communication 
(pipes, unix sockets, TCP sockets) instead of function calls. 
This is something that you could design, or with D's reflection, 
generate code for against the function calls you already have. 
But this is even more work that you'll have to do. If we add "a 
separate process telling you what to do with some kind of 
protocol" as a fourth kind of linking, then the respective effort 
is

1. free! it compiles, it's probably good!
2. free! if the program starts, it's probably good!
3. wow, why don't you just write your own loadSymbol DSL?
4. wow, why don't you just reimplement Erlang/OTP and call it 
std.distributed? maybe protobufs will be enough.

Sep 30 2021

Hipreme <msnmancini hotmail.com> writes:

Okay, I do agree with you that I may have exaggerated with 
absolute intuitiveness, but I was talking about that 
intuitiveness for loading a symbol from a shared library.

> You're limited to using C's types

- I don't think I understood what you meant by that. If the 
data type is known beforehand, it is possible to just declare it 
on the other side


On Thursday, 30 September 2021 at 22:30:30 UTC, jfondren wrote:
> - you can't use overloading, lazy parameters, default values; 
> you can't rely on scope parameters, etc., etc.

- That seems to be much more of a problem for dynamically 
loading a function, although default values can be mirrored in 
the D API.


> - you can't casually hand over GC-allocated data and expect the 
> other side to handle it right, or structs with lifetime 
> functions that you expect to be called

- That is another problem that doesn't seem related to the 
external linkage either. Handling GC-allocated data with extern(D) 
doesn't stop it from being garbage collected; I'm fixing that 
kind of error right now again.

> separate applications and some form of interprocess 
> communication (pipes, unix sockets, TCP sockets) instead of 
> function calls.

- I'm pretty interested in how to make that thing work, but I 
think that would change a lot in how I'm designing my code, and 
with that way, it would probably become absolutely data oriented, 
right?

Sep 30 2021

Mike Parker <aldacron gmail.com> writes:

On Thursday, 30 September 2021 at 22:30:30 UTC, jfondren wrote:

> 3. dynamic linking (option 2), performed arbitrarily at 
> runtime, by your program. If linking fails, you can do whatever 
> you want about that.

That's actually "dynamic loading".

https://en.wikipedia.org/wiki/Dynamic_loading

Sep 30 2021
