digitalmars.D.learn - Inline assembly question

reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
Hi,

I have recently started work on building a VM for Lua (actually a 
derivative of Lua) in X86-64 assembly. I am using the dynasm tool 
that is part of LuaJIT. I was wondering whether I could also 
write this in D's inline assembly perhaps, but there is one 
aspect that I am not sure how to do.

The assembly code uses static allocation of registers, but 
because Win64 and Unix x64 use different calling conventions, 
different registers have to be assigned depending on the 
platform. dynasm makes this easy to do using macros, e.g. the 
ones below.

|.if X64WIN
|.define CARG1,		rcx		// x64/WIN64 C call arguments.
|.define CARG2,		rdx
|.define CARG3,		r8
|.define CARG4,		r9
|.else
|.define CARG1,		rdi		// x64/POSIX C call arguments.
|.define CARG2,		rsi
|.define CARG3,		rdx
|.define CARG4,		rcx
|.endif

With the above in place, the code can use the mnemonics to refer 
to the registers rather than the register names themselves. This 
allows the assembly code to be written once for both platforms.

How would one do this in D inline assembly?

Thanks and Regards
Dibyendu
Nov 12
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
wrote:
 [...]
You could do it with a mixin, though it would be rather ugly. I'm not sure of another way off the top of my head.
Nov 12
prev sibling next sibling parent reply Eugene Wissner <belka caraus.de> writes:
On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
wrote:
 [...]
Here is an example with mixins:

version (Windows)
{
    enum Reg : string
    {
        CARG1 = "RCX",
        CARG2 = "RDX",
    }
}
else
{
    enum Reg : string
    {
        CARG1 = "RDI",
        CARG2 = "RSI",
    }
}

// I is the instruction mnemonic, e.g. "mov".
template Instruction(string I, Reg target, Reg source)
{
    enum string Instruction = "asm { " ~ I ~ " " ~ target ~ ", " ~ source ~ "; }";
}

void func()
{
    mixin(Instruction!("mov", Reg.CARG1, Reg.CARG2));
}
Nov 12
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 11:55:23 UTC, Eugene Wissner wrote:
 [...]
Thank you - I could probably use something like this. It is uglier than the simpler approach in dynasm, of course.

How about when I need to combine this with some struct/union access? In dynasm I can write:

|  mov BASE, CI->u.l.base       // BASE = CI->u.l.base (volatile)
|  mov PC, CI->u.l.savedpc      // PC = CI->u.l.savedpc

How can I combine the mixin above with struct offsets?

Thanks and Regards
Dibyendu
Nov 12
parent reply Basile B. <b2.temp gmx.com> writes:
On Sunday, 12 November 2017 at 12:17:51 UTC, Dibyendu Majumdar 
wrote:
 On Sunday, 12 November 2017 at 11:55:23 UTC, Eugene Wissner 
 wrote:
 [...]
 [...] How can I mix the mixin above and combine with struct 
 offsets?
https://dlang.org/spec/iasm.html#agregate_member_offsets

aggregate.member.offsetof[someregister]
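
A minimal sketch of that form, assuming an x86-64 target and a compiler with DMD-style inline asm. `CallInfo` here is a hypothetical flat stand-in (not the real Lua struct, which nests unions), and `loadBase` is just an illustrative name:

```d
// Hypothetical, simplified stand-in for Lua's CallInfo (flat, no unions).
struct CallInfo
{
    long func;
    long base;
    long savedpc;
}

// Reads ci.base through inline asm using Type.member.offsetof[register].
long loadBase(CallInfo* ci)
{
    version (D_InlineAsm_X86_64)
    {
        long result;
        asm
        {
            mov RAX, ci;                           // RAX = the pointer held in ci
            mov RAX, CallInfo.base.offsetof[RAX];  // RAX = qword at ci + offset of base
            mov result, RAX;                       // copy into the D local
        }
        return result;
    }
    else
    {
        return ci.base; // fallback where x86-64 inline asm is unavailable
    }
}

unittest
{
    auto ci = CallInfo(1, 42, 3);
    assert(loadBase(&ci) == 42);
}
```

Nested members such as `u.l.base` are deliberately left out of the sketch; how their offsets compose inside an asm operand would need checking against the spec.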
Nov 12
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 12:32:09 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 12:17:51 UTC, Dibyendu Majumdar 
 wrote:
 On Sunday, 12 November 2017 at 11:55:23 UTC, Eugene Wissner 
 wrote:
 [...]
 [...] How can I mix the mixin above and combine with struct 
 offsets?
 https://dlang.org/spec/iasm.html#agregate_member_offsets

 aggregate.member.offsetof[someregister]
Sorry, I didn't phrase my question accurately. Presumably, to use the above with the mnemonics I would need additional mixin templates where the aggregate type, member etc. would need to be parameters?
Nov 12
parent reply Eugene Wissner <belka caraus.de> writes:
On Sunday, 12 November 2017 at 15:25:43 UTC, Dibyendu Majumdar 
wrote:
 On Sunday, 12 November 2017 at 12:32:09 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 12:17:51 UTC, Dibyendu Majumdar 
 wrote:
 On Sunday, 12 November 2017 at 11:55:23 UTC, Eugene Wissner 
 wrote:
 [...]
 Sorry I didn't phrase my question accurately. Presumably to use 
 above with the mnemonics I would need additional mixin templates 
 where the aggregate type and member etc would need to be 
 parameters?
You can use just string parameters instead of enums, then you can pass arbitrary arguments to the instructions. The compiler will tell you if something is wrong with the syntax of the generated assembly.
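
A rough sketch of the string-parameter idea, assuming x86-64 and DMD-style inline asm. The helpers `ins`, `field` and `asmBlock`, the flat `CallInfo` struct and `loadBaseViaMixin` are all hypothetical names made up for illustration:

```d
// Register mnemonics as compile-time strings, selected per calling convention.
version (Win64)
{
    enum CARG1 = "RCX";
    enum CARG2 = "RDX";
}
else
{
    enum CARG1 = "RDI";
    enum CARG2 = "RSI";
}

// Hypothetical helper: one two-operand instruction from arbitrary operand text.
string ins(string op, string dst, string src)
{
    return op ~ " " ~ dst ~ ", " ~ src ~ "; ";
}

// Hypothetical helper: memory operand of the form Type.member.offsetof[REG].
string field(string member, string reg)
{
    return member ~ ".offsetof[" ~ reg ~ "]";
}

// Hypothetical helper: wraps the generated instructions in a single asm block.
string asmBlock(string[] instructions...)
{
    string code = "asm { ";
    foreach (i; instructions)
        code ~= i;
    return code ~ "}";
}

// Hypothetical flat stand-in for the real struct.
struct CallInfo
{
    long func;
    long base;
}

long loadBaseViaMixin(CallInfo* ci)
{
    version (D_InlineAsm_X86_64)
    {
        long result;
        mixin(asmBlock(
            ins("mov", CARG1, "ci"),                          // CARG1 = ci
            ins("mov", "RAX", field("CallInfo.base", CARG1)), // RAX = ci.base
            ins("mov", "result", "RAX")));                    // result = RAX
        return result;
    }
    else
    {
        return ci.base; // fallback where x86-64 inline asm is unavailable
    }
}

// The generated text can be checked at compile time.
static assert(ins("mov", "RAX", field("CallInfo.base", "RCX"))
    == "mov RAX, CallInfo.base.offsetof[RCX]; ");
```

Because the operands are plain strings, the same helper accepts registers, locals and offset expressions alike; the cost is that mistakes only surface when the compiler parses the generated asm.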
Nov 12
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 18:48:02 UTC, Eugene Wissner wrote:
 https://dlang.org/spec/iasm.html#agregate_member_offsets

 aggregate.member.offsetof[someregister]
 Sorry I didn't phrase my question accurately. [...]
 You can use just string parameters instead of enums, then you 
 can pass arbitrary arguments to the instructions. The compiler 
 will tell you if something is wrong with the syntax of the 
 generated assembly.
Okay, thank you. Sigh. It would be so much simpler to be able to just define mnemonics for registers.

Anyway, another question: does the compiler generate appropriate unwind information on Win64? Presumably if a function is marked 'naked' then it doesn't?

Thanks and Regards
Dibyendu
Nov 12
parent reply Basile B. <b2.temp gmx.com> writes:
On Sunday, 12 November 2017 at 21:27:28 UTC, Dibyendu Majumdar 
wrote:
 On Sunday, 12 November 2017 at 18:48:02 UTC, Eugene Wissner 
 wrote:
 https://dlang.org/spec/iasm.html#agregate_member_offsets

 aggregate.member.offsetof[someregister]
 [...]
 Does the compiler generate appropriate unwind information on 
 Win64? Presumably if a function is marked 'naked' then it 
 doesn't?
Yeah, about the stack frame... Also, don't forget to mark the asm block "pure nothrow" if possible. It's not documented, but the syntax is like this:

```
void foo()
{
    asm pure nothrow
    {
        naked;
        ret;
    }
}
```
Nov 12
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 22:00:58 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 21:27:28 UTC, Dibyendu Majumdar
 Does the compiler generate appropriate unwind information on 
 Win64? Presumably if a function is marked 'naked' then it 
 doesn't?
 yeah about stack frame..., also don't forget to mark the asm 
 block "pure nothrow" if possible... [...]
I am not sure I have understood the above; will DMD generate the right Win64 unwind info for this contrived example:

int luaV_interp(lua_State *L)
{
    asm pure nothrow
    {
        naked;
        push RBP;       // added so that the final pop RBP is balanced
        push RDI;
        push RSI;
        push RBX;
        push R12;
        push R13;
        push R14;
        push R15;
        sub RSP, 5*8;
        mov RAX, 0;
        add RSP, 5*8;
        pop R15;
        pop R14;
        pop R13;
        pop R12;
        pop RBX;
        pop RSI;
        pop RDI;
        pop RBP;
        ret;
    }
}
Nov 12
parent reply Basile B. <b2.temp gmx.com> writes:
On Sunday, 12 November 2017 at 22:20:46 UTC, Dibyendu Majumdar 
wrote:
 On Sunday, 12 November 2017 at 22:00:58 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 21:27:28 UTC, Dibyendu Majumdar
 I am not sure I have understood above; will DMD generate the 
 right Win64 unwind info for this contrived example:
 [...]
No, in naked mode you have to save and restore by hand.
Nov 12
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 22:24:08 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 22:20:46 UTC, Dibyendu Majumdar 
 wrote:
 On Sunday, 12 November 2017 at 22:00:58 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 21:27:28 UTC, Dibyendu Majumdar
 [...]
 no in naked mode you have to save and restore by hand.
So how does one manually generate the .pdata and .xdata sections? Are you saying that this is what I would need to do?

Another question - how can I tell DMD not to generate the frame pointer?

Thanks for answering my questions.

Regards
Dibyendu
Nov 12
next sibling parent Guillaume Piolat <contact spam.com> writes:
 On Sunday, 12 November 2017 at 22:20:46 UTC, Dibyendu Majumdar

 no in naked mode you have to save and restore by hand.
Note that on Win64, even if not naked, you'll have to save/restore some registers yourself, such as XMMx with x >= 6.
 Another question - how can I tell DMD to no generate the frame 
 pointer?
naked;

However, with naked; you have the problem of respecting the various ABIs out there.
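
As a small illustration of that ABI burden, here is a hedged sketch of a naked function that picks its argument register per platform, assuming x86-64 and DMD-style inline asm. `CARG1` and `addOne` are hypothetical names, and the sketch does not address Win64 unwind data at all:

```d
// First integer argument register, chosen per calling convention.
version (Win64)
    enum CARG1 = "RCX";   // Win64
else
    enum CARG1 = "RDI";   // System V x86-64

// extern (C) pins the calling convention so the register choice is well defined.
extern (C) long addOne(long x)
{
    version (D_InlineAsm_X86_64)
    {
        // naked: the compiler emits no prologue or epilogue, so the asm
        // is the entire function body and must return on its own.
        mixin("asm pure nothrow @nogc { naked; lea RAX, ["
            ~ CARG1 ~ " + 1]; ret; }");
    }
    else
    {
        return x + 1; // fallback where x86-64 inline asm is unavailable
    }
}

unittest
{
    assert(addOne(41) == 42);
}
```

The function touches only caller-saved registers and no stack, which is why nothing needs saving here; anything larger would have to preserve callee-saved registers by hand.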
Nov 13
prev sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 22:40:06 UTC, Dibyendu Majumdar 
wrote:
 On Sunday, 12 November 2017 at 22:00:58 UTC, Basile B. wrote:
 no in naked mode you have to save and restore by hand.
 So how does one manually generate the .pdata and .xdata 
 sections? Are you saying that this is what I would need to do? 
 Another question - how can I tell DMD not to generate the frame 
 pointer?
Hi, any further info on this? I am not talking here about the assembly push/pop instructions, but rather about the .pdata and .xdata sections needed on Win64.

Thanks and Regards
Dibyendu
Nov 13
prev sibling next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
wrote:
 Hi,
 [...]
 The assembly code uses static allocation of registers, but 
 because of the differences in how registers are used in Win64 
 versus Unix X64 - different registers are assigned depending on 
 the architecture. dynasm makes this easy to do using macros; 
 e.g. below.
 [...]
 With above in place, the code can use the mnemonics to refer to 
 the registers rather than the registers themselves. This allows 
 the assembly code to be coded once for both architectures.
I see... the problem is not the input parameters but function calls **inside** iasm, right?
Nov 12
parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 12 November 2017 at 12:00:00 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
 wrote:
 [...]
 The assembly code uses static allocation of registers, but 
 because of the differences in how registers are used in Win64 
 versus Unix X64 - different registers are assigned depending 
 on the architecture. dynasm makes this easy to do using 
 macros; e.g. below.
 [...]
 With above in place, the code can use the mnemonics to refer 
 to the registers rather than the registers themselves. This 
 allows the assembly code to be coded once for both 
 architectures.
 I see...the problem is not the input parameters but functions 
 calls **inside** iasm, right ?
Not sure I understand the question. Once the defines are there I can write the following:

|  // Call luaF_close
|  mov CARG1, L              // arg1 = L
|  mov CARG2, BASE           // arg2 = base
|  call extern luaF_close    // call luaF_close

As you can see above, CARG1, L, CARG2 and BASE are all mnemonics that map to registers. However, each of them is only defined in one place.

Regards
Dibyendu
Nov 12
prev sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
wrote:
 [...]
TBH I wonder if this is not worth an enhancement request (or even a DIP) to have a special alias syntax in asm blocks...

{
    asm
    {
        version(...)
        {
           alias First = RDI;
           alias Second = RSI;
           // ...
        }
        else
        {
           alias First = RCX;
           alias Second = RDX;
        }
        mov First, Second;
        call aFunctionWithOneParam; // called with 2nd parent param as 1st param
    }
}

since the whole mixin solution makes the custom asm unreadable just because of this problem. And maybe even some special identifiers, since extern(...) may lead to problems...
Nov 13
next sibling parent Basile B. <b2.temp gmx.com> writes:
On Monday, 13 November 2017 at 18:40:42 UTC, Basile B. wrote:
 On Sunday, 12 November 2017 at 11:01:39 UTC, Dibyendu Majumdar 
 wrote:
 [...]
 And Even maybe some special identifiers since extern(...) may 
 lead to problems...
Hmmm, given that "First" might be XMM0, I actually think predefined identifiers are clearly not a solution. "alias" in asm blocks still stands, though.
Nov 13
prev sibling next sibling parent Guillaume Piolat <contact spam.com> writes:
On Monday, 13 November 2017 at 18:40:42 UTC, Basile B. wrote:
 TBH I wonder if this is not worth a enhancement (or even a DIP)
 to have in asm blocks a special alias syntax...

 [...]
Something that happens quite frequently is duplicating very similar blocks of assembly between 32-bit and 64-bit x86. In almost all cases the differences are small, though, and if "version" blocks were accepted inside asm, that would be enough.
Nov 13
prev sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Monday, 13 November 2017 at 18:40:42 UTC, Basile B. wrote:

 TBH I wonder if this is not worth a enhancement (or even a DIP)
 to have in asm blocks a special alias syntax...

 [...]

 since the whole mixin solution make the custom asm unreadable 
 just because of this problem.
Hi, that would be nice, but I won't be holding my breath for such a feature to appear. I have a simple solution: I will just run the source through a C preprocessor, or maybe a custom one.

Regards
Dibyendu
Nov 13