digitalmars.D.learn - Mixin in Inline Assembly

Chris M. <chrismohrfeld comcast.net> writes:
Right now I'm working on a project where I'm implementing a VM in 
D. I'm on the rotate instructions, and realized I could *almost* 
abstract the ror and rol instructions with the following function

private void rot(string ins)(int *op1, int op2)
{
     int tmp = *op1;
     asm
     {
         mov EAX, tmp; // I'd also like to know if I could just load *op1 directly into EAX
         mov ECX, op2[EBP];
         mixin(ins ~ " EAX, CL;"); // Issue here
         mov tmp, EAX;
     }
     *op1 = tmp;
}

However, the inline assembler doesn't like me trying to do a 
mixin. Is there a way around this?

(There is a reason op1 is a pointer instead of a ref int, please 
don't ask about it)
Jan 08
Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 9 January 2017 at 02:31:42 UTC, Chris M. wrote:
 [...]
Yes, make the whole inline asm a mixin.
Jan 08
Chris M. <chrismohrfeld comcast.net> writes:
On Monday, 9 January 2017 at 02:38:01 UTC, Stefan Koch wrote:
 On Monday, 9 January 2017 at 02:31:42 UTC, Chris M. wrote:
  [...]
 Yes, make the whole inline asm a mixin.
Awesome, got it working. Thanks to both replies.
Jan 08
ketmar <ketmar ketmar.no-ip.org> writes:
On Monday, 9 January 2017 at 02:31:42 UTC, Chris M. wrote:
 However, the inline assembler doesn't like me trying to do a 
 mixin.
yep. iasm is completely independent from the rest of the frontend; it has its own lexer, parser and so on. don't expect those things to work. the only way is to mixin the whole iasm block, including `asm{}`.
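
A minimal sketch of that approach, assuming DMD-style inline asm on x86 or x86_64. The function mirrors `rot` from the question, but the operands are referenced by name rather than through `[EBP]`, and the `unittest` block is only there for illustration:

```d
// Sketch only: assumes DMD-style inline asm on x86 or x86_64.
private void rot(string ins)(int* op1, int op2)
{
    int tmp = *op1;
    // The whole statement, `asm { ... }` included, is assembled as a string
    // at compile time, so `ins` is spliced in before the inline assembler
    // ever parses the block.
    mixin("asm
    {
        mov EAX, tmp;
        mov ECX, op2;
        " ~ ins ~ " EAX, CL;
        mov tmp, EAX;
    }");
    *op1 = tmp;
}

unittest
{
    int x = 1;
    rot!"rol"(&x, 4); // rotate left by 4 bits
    assert(x == 16);
    rot!"ror"(&x, 4); // rotate right by 4 bits restores the value
    assert(x == 1);
}
```

Because `ins` is a template parameter, `rot!"rol"` and `rot!"ror"` are instantiated as two separate functions, each with the instruction hard-coded.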
Jan 08
Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 9 January 2017 at 02:31:42 UTC, Chris M. wrote:
 [...]
 However, the inline assembler doesn't like me trying to do a mixin. Is there a way around this?
You should be able to break it up too:

    asm { mov EAX, tmp; }
    mixin("asm { " ~ ins ~ " EAX, CL; }");
    asm { mov tmp, EAX; }

You get the idea. It should compile to the same thing.
Jan 08
Basile B. <b2.temp gmx.com> writes:
On Monday, 9 January 2017 at 02:31:42 UTC, Chris M. wrote:
 [...]
Don't forget to flag the asm block:

    asm pure nothrow {}

otherwise it's slow.
Jan 10
Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 10 January 2017 at 10:41:54 UTC, Basile B. wrote:
 don't forget to flag

 asm pure nothrow {}

 otherwise it's slow.
Why?
Jan 10
Basile B. <b2.temp gmx.com> writes:
On Tuesday, 10 January 2017 at 11:38:43 UTC, Guillaume Piolat wrote:
 On Tuesday, 10 January 2017 at 10:41:54 UTC, Basile B. wrote:
  don't forget to flag

  asm pure nothrow {}

  otherwise it's slow.
 Why?
It's an empirical observation. In September I tried to work out why an inline asm function was slow. It turned out that I hadn't marked the asm block as nothrow: https://forum.dlang.org/post/xznocpxtalpayvkrwxey@forum.dlang.org
I opened an issue asking for the specification to explain that clearly.
Jan 10
Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 10 January 2017 at 13:13:17 UTC, Basile B. wrote:
 [...]
Interesting, thanks.
Jan 10
Chris M <chrismohrfeld comcast.net> writes:
On Tuesday, 10 January 2017 at 13:13:17 UTC, Basile B. wrote:
 [...]
Huh, that's really interesting, thanks for posting. I guess my other question would be: how do I determine if a block of assembly is pure?

I also figured out how to work on *op1 directly by loading the pointer into RAX. I guess it makes sense that a 64-bit pointer needs a 64-bit register :)

private void rot(string ins)(int *op1, int op2)
{
     mixin(" asm { mov RAX, op1; mov ECX, op2[EBP];" ~ ins ~ " [RAX], CL; } ");
}
Jan 10
Basile B. <b2.temp gmx.com> writes:
On Wednesday, 11 January 2017 at 00:11:50 UTC, Chris M wrote:
 [...]
 I guess my other question would be how do I determine if a block of assembly is pure?
The game changer for performance is just "nothrow".
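
For illustration, a sketch of what the flagged block looks like in context, assuming DMD-style inline asm on x86 or x86_64. `rol8` is a hypothetical function, not code from this thread. An asm block without attributes is assumed to possibly throw, so the compiler rejects it inside a `nothrow` function; the attributes on the block are the programmer's promise that the instructions really behave that way.

```d
// Sketch only: assumes DMD-style inline asm on x86 or x86_64.
// `rol8` is a hypothetical example, not code from the thread.
uint rol8(uint value) pure nothrow @nogc
{
    uint result = value;
    // The attributes on the block are what allow it inside a
    // pure nothrow @nogc function; the compiler cannot check them,
    // it simply trusts the annotation.
    asm pure nothrow @nogc
    {
        mov EAX, result;
        rol EAX, 8;
        mov result, EAX;
    }
    return result;
}

unittest
{
    // Rotating left by 8 bits moves the top byte to the bottom.
    assert(rol8(0x12345678) == 0x34567812);
}
```

One plausible reading of the slowdown reported above is that a block which may throw forces the compiler to keep exception-handling machinery around it, but that is an assumption rather than something the thread confirms.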
Jan 10
Era Scarecrow <rtcvb32 yahoo.com> writes:
On Tuesday, 10 January 2017 at 10:41:54 UTC, Basile B. wrote:
 don't forget to flag

 asm pure nothrow {}

 otherwise it's slow.
That suddenly reminds me of some of the speed-up assembly I was writing for wideint, but it seems I lost my code. Too bad: the 128-bit multiply had sped up and the division needed some work.
Jan 10
Guillaume Piolat <first.last gmail.com> writes:
On Wednesday, 11 January 2017 at 06:14:35 UTC, Era Scarecrow 
wrote:
 Suddenly reminds me some of the speedup assembly I was writing 
 for wideint, but seems I lost my code. too bad, the 128bit 
 multiply had sped up and the division needed some work.
I'm a taker if you have some algorithm to reuse 32-bit divide in wideint division instead of scanning bits :)
Jan 11
Era Scarecrow <rtcvb32 yahoo.com> writes:
On Wednesday, 11 January 2017 at 15:39:49 UTC, Guillaume Piolat wrote:
 [...]
 I'm a taker if you have some algorithm to reuse 32-bit divide in wideint division instead of scanning bits :)
I remember the divide was giving me some trouble. The idea was to use the built-in registers and limits of the instruction set to take advantage of a full 128-bit division. Unfortunately, if the result is too large to fit in a 64-bit register it breaks, rather than giving me half the result and letting me work with it. Still, I think I'll implement my own version, and if it's faster I'll submit it.
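
The overflow described here is how the x86 `div` instruction behaves: it raises a divide error when the quotient does not fit in the destination register. A common way around it, sketched below in plain D rather than assembly, is schoolbook long division one limb at a time. This is not the lost wideint code. `divideLimbs` is a hypothetical helper, and it only handles a divisor that fits in a single 32-bit limb.

```d
/// Hypothetical helper: divides a little-endian array of 32-bit limbs
/// in place by a 32-bit divisor and returns the remainder.
uint divideLimbs(uint[] limbs, uint divisor) pure nothrow @nogc @safe
{
    assert(divisor != 0, "division by zero");
    ulong remainder = 0;
    // Walk from the most significant limb down to the least significant.
    foreach_reverse (ref limb; limbs)
    {
        // remainder < divisor, so the quotient of this 64-by-32-bit
        // step always fits in 32 bits and can never overflow.
        immutable ulong chunk = (remainder << 32) | limb;
        limb = cast(uint)(chunk / divisor);
        remainder = chunk % divisor;
    }
    return cast(uint) remainder;
}

unittest
{
    // 2^64 stored as four limbs, least significant first.
    uint[4] value = [0, 0, 1, 0];
    immutable rem = divideLimbs(value[], 10);
    // 2^64 = 10 * 0x1999_9999_9999_9999 + 6
    assert(rem == 6);
    assert(value == [0x9999_9999, 0x1999_9999, 0, 0]);
}
```

Each step carries the previous remainder into the high half of the next dividend, which is what keeps every partial quotient inside one limb. Dividing by a multi-limb divisor needs a more involved algorithm that this sketch does not cover.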
Jan 11