digitalmars.D.learn - D ASM. Program fails

reply Iakh <iaktakh gmail.com> writes:
This code compiles but program exits with code -11
What's wrong?

import std.stdio;
import core.simd;

int pmovmskb(inout byte16 v)
{
     asm
     {
         movdqa XMM0, v;
         pmovmskb EAX, XMM0;
         ret;
     }
}
void main()
{
     byte16 a = [-1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
     auto i = pmovmskb(a);
}

Program exited with code -11

DMD64 D Compiler v2.069
Jan 21
parent reply anonymous <anonymous example.com> writes:
On 22.01.2016 06:59, Iakh wrote:
 [...]
I don't know much about these things, but it seems to be the `ret;`. This doesn't segfault:
----
int pmovmskb(byte16 v)
{
    int r;
    asm
    {
        movdqa XMM0, v;
        pmovmskb EAX, XMM0;
        mov r, EAX;
    }
    return r;
}
----
Removed the `inout` because it doesn't make sense. You may be looking for `ref`.
Jan 22
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 22 January 2016 at 12:18:53 UTC, anonymous wrote:
 I don't know much about these things, but it seems to be the 
 `ret;`.
Right. This is an ordinary D function, so the compiler generates code to set up a stack frame for local variables. It looks like:

    push ebp;
    mov ebp, esp;
    sub esp, some_size;
    /* sometimes a few other register saves */
    /* your code here */
    /* sometimes a few other register restores */
    leave;
    ret;

`leave`, by the way, is the same as `mov esp, ebp; pop ebp;`: it undoes the effect of those first three instructions.

All this setup is about creating a stack frame for the function's local variables. If you `ret` without restoring the frame, the stack pointer no longer points at the return address, so `ret` jumps to whatever happens to be on top of the stack. From there on, local variables and return addresses are out of sync, which leads to memory access violations. That's what happened to you.

If you want to write a whole function in assembly without the compiler inserting any additional code, start it off with `asm { naked; }` inside so dmd knows what you are trying to do. Then you are in complete control.

Otherwise, remember to tear down the frame correctly, or better yet, just return using the ordinary D statement instead of the asm instruction.
Jan 22
parent reply userABCabc123 <userABCabc123 skdf.gz> writes:
On Friday, 22 January 2016 at 14:06:42 UTC, Adam D. Ruppe wrote:
 [...]
naked version:

int pmovmskb2(byte16 v)
{
    asm
    {
        naked;
        push RBP;
        mov RBP, RSP;
        sub RSP, 0x20;
        movdqa dword ptr[RBP-0x10], XMM0;
        mov dword ptr[RBP-0x18], 0;
        movdqa XMM0, dword ptr[RBP-0x10];
        pmovmskb EAX, XMM0;
        mov RSP, RBP;
        pop RBP;
        ret;
    }
}

Note that there may be a DMD codegen bug: the asm generated for the non-naked version copies the result to the stack and then from the stack back into the result register, although after pmovmskb it is already in EAX.

000000000044C580h  push rbp
000000000044C581h  mov rbp, rsp
000000000044C584h  sub rsp, 20h
000000000044C588h  movdqa dqword ptr [rbp-10h], xmm0
000000000044C58Dh  mov dword ptr [rbp-18h], 00000000h
000000000044C594h  movdqa xmm0, dqword ptr [rbp-10h]
000000000044C599h  pmovmskb eax, xmm0 ; already in result
000000000044C59Dh  mov dword ptr [rbp-18h], eax ; what?
000000000044C5A0h  mov eax, dword ptr [rbp-18h] ; what?
000000000044C5A3h  mov rsp, rbp
000000000044C5A6h  pop rbp
000000000044C5A7h  ret
Jan 22
parent reply userABCabc123 <userABCabc123 skdf.gz> writes:
On Friday, 22 January 2016 at 17:12:25 UTC, userABCabc123 wrote:
 Note that there is maybe a DMD codegen bug because the asm 
 generated for the non naked version copy the result to the 
 stack and then the stack to result but after pmovmskb it's 
 already setup in EAX.

 000000000044C580h  push rbp
 000000000044C581h  mov rbp, rsp
 000000000044C584h  sub rsp, 20h
 000000000044C588h  movdqa dqword ptr [rbp-10h], xmm0
 000000000044C58Dh  mov dword ptr [rbp-18h], 00000000h
 000000000044C594h  movdqa xmm0, dqword ptr [rbp-10h]
 000000000044C599h  pmovmskb eax, xmm0 ; already in result
 000000000044C59Dh  mov dword ptr [rbp-18h], eax ; what?
 000000000044C5A0h  mov eax, dword ptr [rbp-18h] ; what?
 000000000044C5A3h  mov rsp, rbp
 000000000044C5A6h  pop rbp
 000000000044C5A7h  ret
Oops, there is no DMD codegen bug. The non-naked version explicitly uses a local variable for the return value, so without the local `r` this gives:

int pmovmskb(byte16 v)
{
    asm
    {
        naked;
        push RBP;
        mov RBP, RSP;
        sub RSP, 0x10;
        movdqa dword ptr[RBP-0x10], XMM0;
        movdqa XMM0, dword ptr[RBP-0x10];
        pmovmskb EAX, XMM0;
        mov RSP, RBP;
        pop RBP;
        ret;
    }
}
Jan 22
parent reply Iakh <iaktakh gmail.com> writes:
On Friday, 22 January 2016 at 17:27:35 UTC, userABCabc123 wrote:
 int pmovmskb(byte16 v)
 {
     asm
     {
         naked;
         push RBP;
         mov RBP, RSP;
         sub RSP, 0x10;
         movdqa dword ptr[RBP-0x10], XMM0;
         movdqa XMM0, dword ptr[RBP-0x10];
         pmovmskb EAX, XMM0;
         mov RSP, RBP;
         pop RBP;
         ret;
     }
 }
Thanks. It works. But a shorter version works too:

asm
{
    naked;
    push RBP;
    mov RBP, RSP;
    //sub RSP, 0x10;
    //movdqa dword ptr[RBP-0x10], XMM0;
    //movdqa XMM0, dword ptr[RBP-0x10];
    pmovmskb EAX, XMM0;
    mov RSP, RBP;
    pop RBP;
    ret;
}

Looks like the SIMD parameter is passed in a SIMD register.
Jan 22
parent userABCabc123 <userABCabc123 skdf.gz> writes:
On Friday, 22 January 2016 at 20:54:46 UTC, Iakh wrote:
 On Friday, 22 January 2016 at 17:27:35 UTC, userABCabc123 wrote:
 [...]
 Looks like the SIMD param is passed by SIMD reg
Right, I must be blind. So you can even remove the prologue and the epilogue:

int pmovmskb2(byte16 v)
{
    asm
    {
        naked;
        pmovmskb EAX, XMM0;
        ret;
    }
}
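
For anyone who wants to try the minimal naked version end to end, here is a small sketch of a complete program. It assumes DMD on x86-64 with a calling convention that delivers the `byte16` argument in XMM0, which is what was observed in this thread; on other targets or compilers the argument may arrive elsewhere, and then this function would read the wrong data. The name `pmovmskbNaked` is made up for the example.

```d
import core.simd;
import std.stdio;

// Naked: the compiler emits no prologue or epilogue, so the asm block
// owns the entire function body, including the final `ret`.
// ASSUMPTION: `v` arrives in XMM0 (as observed in this thread with DMD64).
int pmovmskbNaked(byte16 v)
{
    asm
    {
        naked;
        pmovmskb EAX, XMM0; // sign bit of each byte -> low 16 bits of EAX
        ret;                // safe here: no frame was set up, nothing to undo
    }
}

void main()
{
    byte16 a = [-1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    writeln(pmovmskbNaked(a));
}
```

Because the result is left in EAX, which is where an `int` return value is expected, no D `return` statement is needed. If portability matters more than the two saved instructions, the non-naked variant that copies EAX into a local and returns it from D is the safer choice, since it does not depend on where the argument is passed.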
Jan 22
prev sibling parent reply Iakh <iaktakh gmail.com> writes:
On Friday, 22 January 2016 at 12:18:53 UTC, anonymous wrote:
 ----
 int pmovmskb(byte16 v)
 {
     int r;
     asm
     {
         movdqa XMM0, v;
         pmovmskb EAX, XMM0;
         mov r, EAX;
     }
     return r;
 }
 ----
This code returns 0 for any input v
 Removed the `inout` because it doesn't make sense. You may be 
 looking for `ref`.
yeah
Jan 22
parent reply anonymous <anonymous example.com> writes:
On 22.01.2016 21:34, Iakh wrote:
 This code returns 0 for any input v
It seems to return 5 here: http://dpaste.dzfl.pl/85fb8e5c4b6b
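
The value 5 is what one would expect: pmovmskb sets bit i of the result to the sign bit of byte i, and in the test vector only bytes 0 and 2 are negative, giving binary 101. A plain D sketch of the same computation, useful for cross-checking an asm version, could look like this (`moveMaskRef` is a made-up helper name, not a library function):

```d
import std.stdio;

// Scalar reference for what pmovmskb computes:
// bit i of the result is the sign bit of byte i.
int moveMaskRef(const byte[16] v)
{
    int mask = 0;
    foreach (i, b; v)
    {
        if (b < 0)          // sign bit set
            mask |= 1 << i;
    }
    return mask;
}

void main()
{
    byte[16] a = [-1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    writeln(moveMaskRef(a)); // prints 5 (bits 0 and 2 set)
}
```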
Jan 22
parent Iakh <iaktakh gmail.com> writes:
On Friday, 22 January 2016 at 20:41:23 UTC, anonymous wrote:
 On 22.01.2016 21:34, Iakh wrote:
 This code returns 0 for any input v
It seems to return 5 here: http://dpaste.dzfl.pl/85fb8e5c4b6b
Yeah. Sorry. My bad.
Jan 22