digitalmars.D.learn - Can not get struct member addresses at compile time

reply Doeme <doeme the-internet.org> writes:
Hi!

I'm currently investigating why I can not take the address of a 
static struct-element at compile time (afaik the linker should be 
able to resolve this, and doing the identical thing in C works...)

```d
struct Foo{
  ubyte bar;
}

__gshared Foo foo;
void* baz = &foo;       //works
void* bar = &foo.bar;   //Error: static variable `foo` cannot be read at compile time
```

Does this have to do with ".bar" potentially being a function 
call?
How does one get the address of a struct member?

The higher level problem of this question is that I want to make 
a pointer-map of a struct:

```d
import std.meta;
struct Foo{
     ubyte a;
     ushort b;
}

template ptrize(alias S){
	enum ptrize = &S;
}

__gshared Foo f;
static immutable void*[] map = [staticMap!(ptrize, f.tupleof)];
```

Is there an alternative to get to this point? Static module 
initializers are not really an option, since the whole thing 
should be -betterC.

Thanks for any hints!
Jun 16 2021
next sibling parent reply Mike Parker <aldacron gmail.com> writes:
On Wednesday, 16 June 2021 at 09:27:25 UTC, Doeme wrote:

 Is there an alternative to get to this point? Static module 
 initializers are not really an option, since the whole thing 
 should be -betterC.
```D
import core.stdc.stdio;

struct Foo{
  ubyte bar;
}

__gshared Foo foo;
void* baz = &foo;
void* bar;

extern(C):

pragma(crt_constructor)
void initialize()
{
    bar = &foo.bar;
}

void main()
{
    *(cast(ubyte*)bar) = 10;
    printf("%d", foo.bar);
}
```

https://dlang.org/spec/pragma.html#crtctor
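
For comparison, the same run-before-main idea can be sketched in C. This assumes GCC or Clang, whose `__attribute__((constructor))` extension plays a role similar to `pragma(crt_constructor)`; it is not standard C, and the names `init_bar` and `bar_points_at_member` are made up for illustration.

```c
#include <assert.h>

struct Foo {
    unsigned char bar;
};

struct Foo foo;
void *bar;          /* starts out null; filled in during startup */

/* GCC/Clang extension (not standard C): run this function before main(). */
__attribute__((constructor))
static void init_bar(void)
{
    bar = &foo.bar; /* the address is stored at run time, not by the linker */
}

/* Hypothetical helper: checks the pointer, writes through it,
   and returns the member's new value. */
int bar_points_at_member(void)
{
    assert(bar == (void *)&foo.bar);
    *(unsigned char *)bar = 10;
    return foo.bar;
}
```

Calling `bar_points_at_member()` from `main` should return 10, since the constructor has already run by then.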
Jun 16 2021
next sibling parent reply jfondren <julian.fondren gmail.com> writes:
On Wednesday, 16 June 2021 at 11:56:31 UTC, Mike Parker wrote:
 https://dlang.org/spec/pragma.html#crtctor
"as a simple replacement for shared static this in betterC mode"

Cool. However,

```d
immutable int example;

version(D_BetterC) {
    pragma(crt_constructor) extern(C)
    void initialize() {
        example = 1;
    }
} else {
    shared static this() {
        example = 1;
    }
}
```

this compiles without -betterC; with -betterC it errors out:

Error: cannot modify `immutable` expression `example`

Is this a bug?
Jun 16 2021
parent Mike Parker <aldacron gmail.com> writes:
On Wednesday, 16 June 2021 at 13:26:43 UTC, jfondren wrote:

 Is this a bug?
Probably. Please file an issue if there isn't one already: https://issues.dlang.org/
Jun 16 2021
prev sibling parent Doeme <doeme the-internet.org> writes:
On Wednesday, 16 June 2021 at 11:56:31 UTC, Mike Parker wrote:
 https://dlang.org/spec/pragma.html#crtctor
Very interesting, thanks for the hint! This is definitely a viable solution, though if there's a way to let the linker determine the pointer address, that'd be even better. In C, it's a comparatively standard thing to do:

```c
struct Foo{
  char bar;
};

struct Foo foo;
void *ptr = &foo.bar; //compiles, links and works
```

Why the D compilers would not pass this down to the linker eludes me.
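
The pointer map from the original question can be written the same way in C. The sketch below is an illustration rather than code from the thread: `map` and `check_map` are made-up names, and the field types mirror the D struct `Foo` with members `a` and `b`.

```c
#include <assert.h>
#include <stddef.h>

struct Foo {
    unsigned char a;
    unsigned short b;
};

struct Foo f;

/* Each initializer is an address constant, so no startup code is needed;
   the final addresses are filled in when the program is linked/loaded. */
static const void *const map[] = { &f.a, &f.b };

/* Hypothetical helper: verifies every entry against base + offsetof. */
void check_map(void)
{
    const char *base = (const char *)&f;
    assert(map[0] == (const void *)(base + offsetof(struct Foo, a)));
    assert(map[1] == (const void *)(base + offsetof(struct Foo, b)));
}
```

Unlike the D version, the member list is written out by hand here, since C has no counterpart to `tupleof`.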
Jun 16 2021
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 6/16/21 2:27 AM, Doeme wrote:

 How does one get the address of a struct member?
Here is an experiment with offsetof and opDispatch:

    struct Foo{
      ubyte bar;
      int i;
    }

    auto addrOf(T)(ref T t) {
      static struct AddrOf {
        void * origin;
        auto opDispatch(string member)() {
          return origin + mixin ("T." ~ member ~ ".offsetof");
        }
      }
      return AddrOf(&t);
    }

    import std.stdio;

    __gshared Foo foo;

    void main() {
      writeln(&foo);
      writeln(foo.addrOf.bar);
      writeln(foo.addrOf.i);

      // Alternative syntax
      writeln(addrOf(foo).bar);
      writeln(addrOf(foo).i);
    }

Ali
Jun 16 2021
parent reply Doeme <doeme the-internet.org> writes:
On Wednesday, 16 June 2021 at 13:36:07 UTC, Ali Çehreli wrote:
 On 6/16/21 2:27 AM, Doeme wrote:

 How does one get the address of a struct member?
Here is an experiment with offsetof and opDispatch:
Cool stuff! I actually tried a very similar approach once, but it did not work out, since the D compilers refuse to do pointer arithmetic at compile time :/

```d
struct Foo{
  ubyte bar;
}

void* membaddr(void *ptr, ulong offset){
     return ptr+offset;
}

__gshared Foo foo;
void* bar = membaddr(&foo, foo.bar.offsetof);
//Error: cannot perform arithmetic on `void*` pointers at compile time
```

I guess that the opDispatch-method will suffer from the same issue...
Jun 16 2021
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 6/16/21 8:47 AM, Doeme wrote:

 On Wednesday, 16 June 2021 at 13:36:07 UTC, Ali Çehreli wrote:
 On 6/16/21 2:27 AM, Doeme wrote:

 How does one get the address of a struct member?
Here is an experiment with offsetof and opDispatch:
Cool stuff! I actually tried a very similar approach once, but it did not work out,
 since the D compilers refuse to do pointer arithmetic at compile time :/
 ```d
 struct Foo{
   ubyte bar;
 }

 void* membaddr(void *ptr, ulong offset){
      return ptr+offset;
 }

 __gshared Foo foo;
 void* bar = membaddr(&foo, foo.bar.offsetof);
 //Error: cannot perform arithmetic on `void*` pointers at compile time
 ```

 I guess that the opDispatch-method will suffer from the same issue...
No, opDispatch does not work either for compile time addresses.

Actually, it is news to me that the compiler can know (determine?) the address of a global variable. I thought the loader would determine the addresses, but apparently not. Is it really a constant compiled value in the case of C? Can you show an example please?

Ali
Jun 16 2021
next sibling parent Doeme <doeme the-internet.org> writes:
On Wednesday, 16 June 2021 at 21:42:41 UTC, Ali Çehreli wrote:
 On 6/16/21 8:47 AM, Doeme wrote:

 On Wednesday, 16 June 2021 at 13:36:07 UTC, Ali Çehreli wrote:
 On 6/16/21 2:27 AM, Doeme wrote:

 How does one get the address of a struct member?
Here is an experiment with offsetof and opDispatch:
Cool stuff! I actually tried a very similar approach once, but it did not work out,
 since the D compilers refuse to do pointer arithmetic at compile time :/
 ```d
 struct Foo{
   ubyte bar;
 }

 void* membaddr(void *ptr, ulong offset){
      return ptr+offset;
 }

 __gshared Foo foo;
 void* bar = membaddr(&foo, foo.bar.offsetof);
 //Error: cannot perform arithmetic on `void*` pointers at compile time
 ```

 I guess that the opDispatch-method will suffer from the same issue...
 No, opDispatch does not work either for compile time addresses.

 Actually, it is news to me that the compiler can know (determine?) the address of a global variable. I thought the loader would determine the addresses, but apparently not. Is it really a constant compiled value in the case of C? Can you show an example please?

 Ali
The compiler can, indeed, not know the address, but the linker can. Example:

```c
#include <stdio.h>

struct Foo{
  int bar;
  int baz;
};

struct Foo foo;
static const void *fooptr = &foo;
static const void *barptr = &foo.bar;
static const void *bazptr = &foo.baz;

int main(int argc, char **argv){
  printf("Address of foo: %p\n", fooptr); //Address of foo: 0x55fbd8292030
  printf("Address of bar: %p\n", barptr); //Address of bar: 0x55fbd8292030
  printf("Address of baz: %p\n", bazptr); //Address of baz: 0x55fbd8292034
  return 0;
}
```

We can see that the code actually outputs the right addresses. If we investigate the object file passed down to the linker, we see:

```
$ gcc -c test.c
$ objdump -x test.o
[...]
SYMBOL TABLE:
0000000000000000 l    df *ABS*            0000000000000000 test.c
0000000000000000 l    d  .text            0000000000000000 .text
0000000000000000 l    d  .data.rel.local  0000000000000000 .data.rel.local
0000000000000000 l     O .data.rel.local  0000000000000008 fooptr
0000000000000008 l     O .data.rel.local  0000000000000008 barptr
0000000000000010 l     O .data.rel.local  0000000000000008 bazptr
0000000000000000 l    d  .rodata          0000000000000000 .rodata
0000000000000000 g     O .bss             0000000000000008 foo
0000000000000000 g     F .text            0000000000000070 main
0000000000000000         *UND*            0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000000000         *UND*            0000000000000000 printf
[...]
RELOCATION RECORDS FOR [.data.rel.local]:
OFFSET           TYPE              VALUE
0000000000000000 R_X86_64_64       foo
0000000000000008 R_X86_64_64       foo
0000000000000010 R_X86_64_64       foo+0x0000000000000004
[...]
```

This tells us that:

* There are 3 variables in the initialized .data.rel.local section, our pointers.
* There is one variable in the zeroed .bss section, our instance of struct Foo.
* At the positions of our three pointers in the .data.rel.local section, the linker writes the address of the symbol foo, foo, and, lastly, foo+4.

Thus, the address is only known at link time, but it _can_ be known by placing the right relocation entries in the ELF object file (i.e. relocation + offset, foo+4).
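
The "relocation + offset" form can also be spelled out by hand. In C, an address constant plus an integer constant expression is still a valid static initializer, so the following minimal sketch should compile without any startup code. The names `bazptr_by_offset` and `check_same_address` are made up for illustration, and the actual relocation entries emitted depend on the toolchain.

```c
#include <assert.h>
#include <stddef.h>

struct Foo {
    int bar;
    int baz;
};

struct Foo foo;

/* Member form, as in the example above. */
static const void *bazptr = &foo.baz;

/* Explicit form: address of foo plus a constant byte offset.
   Both initializers are constant expressions in C. */
static const void *bazptr_by_offset =
    (const char *)&foo + offsetof(struct Foo, baz);

/* Hypothetical helper: both spellings must name the same address. */
void check_same_address(void)
{
    assert(bazptr == bazptr_by_offset);
}
```

This is the same pointer arithmetic that `membaddr` attempted in D, except that C accepts it directly in the initializer.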
Jun 16 2021
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 16, 2021 at 02:42:41PM -0700, Ali Çehreli via Digitalmars-d-learn wrote:
[...]
 Actually, it is news to me that the compiler can know (determine?) the
 address of a global variable.
[...]

The compiler does not (and cannot) know. But the runtime dynamic linker can, and does. The two are bridged by the compiler emitting a relocatable symbol for the address of the global variable, with a table of relocations (offsets in the code) that the runtime linker patches the actual addresses into when the program is executed.


T

-- 
War doesn't prove who's right, just who's left. -- BSD Games' Fortune
Jun 16 2021
parent reply Doeme <doeme the-internet.org> writes:
On Wednesday, 16 June 2021 at 22:16:54 UTC, H. S. Teoh wrote:
 The compiler does not (and cannot) know.  But the runtime 
 dynamic linker can, and does.  The two are bridged by the 
 compiler emitting a relocatable symbol for the address of the 
 global variable, with a table of relocations (offsets in the 
 code) that the runtime linker patches the actual addresses into 
 when the program is executed.


 T
Exactly! Except that the dynamic linker is not really involved here, since all the symbols can/must be relocated statically at link time.
Jun 16 2021
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 6/16/21 3:27 PM, Doeme wrote:
 On Wednesday, 16 June 2021 at 22:16:54 UTC, H. S. Teoh wrote:
 The compiler does not (and cannot) know. But the runtime dynamic
 linker can, and does. The two are bridged by the compiler emitting a
 relocatable symbol for the address of the global variable, with a
 table of relocations (offsets in the code) that the runtime linker
 patches the actual addresses into when the program is executed.


 T
 Exactly! Except that the dynamic linker is not really involved here, since all the symbols can/must be relocated statically at link time.
Thank you, both. It still rules out an address at "compile time" in general. For example, we cannot use such an address when instantiating a template, or static array length, etc.

And if I understand it correctly, there must be a pointer *variable* for the linker to initialize. Fine then: That's how this usage works for C but not for D. :)

Thank you,
Ali
Jun 16 2021
parent Doeme <doeme the-internet.org> writes:
On Wednesday, 16 June 2021 at 23:20:26 UTC, Ali Çehreli wrote:
 Thank you, both. It still rules out an address at "compile 
 time" in general. For example, we cannot use such an address 
 when instantiating a template, or static array length, etc.

 And if I understand it correctly, there must be a pointer 
 *variable* for the linker to initialize. Fine then: That's how 
 this usage works for C but not for D. :)

 Thank you,
 Ali
Yes, there must be a pointer variable, which explains why we cannot do compile time pointer arithmetic, which is fair and square (although I think it might be possible with some extra compiler effort).

It does _not_ explain why we can't take addresses of struct members, though, since we have a pointer variable there!

Coming back to the working part of my first example:

```d
struct Foo{
  int bar;
}

__gshared Foo foo;
void *fooptr = &foo;
```

This works! And it yields very similar relocations to the C version (plus some overhead and name-mangling):

```
[...]
0000000000000000  w    O .data._D19TypeInfo_S4test3Foo6__initZ 0000000000000091 _D19TypeInfo_S4test3Foo6__initZ
0000000000000000 g     O .bss     0000000000000004 _D4test3Foo6__initZ
0000000000000004 g     O .bss     0000000000000004 _D4test3fooSQk3Foo
0000000000000000 g       .tdata.  0000000000000008 _D4test6fooptrPv
0000000000000000 g     O .rodata  000000000000000d _D4test12__ModuleInfoZ
[...]
RELOCATION RECORDS FOR [.tdata.]:
OFFSET           TYPE              VALUE
0000000000000000 R_X86_64_64       _D4test3fooSQk3Foo
[...]
```

The only difference is that the compiler will not pass relocations plus an offset (i.e. relocating to a member of a struct) down to the linker, and I don't quite see a specific reason why it should not.
Jun 17 2021