
digitalmars.D.learn - Using inline assembler

Etienne <etcimon gmail.com> writes:
I'm a bit new to the inline assembler. I'm trying to use the `movdqu` 
instruction to move a 128-bit double quadword from one pointer location 
to another, like this:

align(16) union __m128i { ubyte[16] data };

void store(__m128i* src, __m128i* dst) {
	asm { movdqu [dst], src; }
}


The compiler complains about a "bad type/size of operands 'movdqu'", but 
these two data segments are 16-byte aligned, so they should fit in an XMM# 
register, right? Is there something I'm missing here?
Oct 09 2014
"anonymous" <anonymous example.com> writes:
I know virtually nothing about SSE, but you can't move directly from memory 
to memory, can you? You need to go through a register, no?

This compiles:

align(16) union __m128i { ubyte[16] data; } /* note the position of the semicolon */

void store(__m128i* src, __m128i* dst) {
	asm
	{
		movdqu XMM0, [src]; /* note: [src] */
		movdqu [dst], XMM0;
	}
}
Oct 09 2014
Etienne <etcimon gmail.com> writes:
Yes, this does compile, but the value from src never ends up stored in dst.

import std.stdio : writeln;

void main() {
	__m128i src;
	src.data[0] = 255;
	__m128i dst;
	writeln(src.data); // shows 255 at offset 0
	store(&src, &dst);
	writeln(dst.data); // remains set as the initial array
}

http://x86.renejeschke.de/html/file_module_x86_id_184.html

Is this how it's meant to be used?
Oct 09 2014
"anonymous" <anonymous example.com> writes:
I'm out of my knowledge zone here, but it seems to work when you move the 
pointers to registers first:

void store(__m128i* src, __m128i* dst) {
	asm
	{
		mov RAX, src;
		mov RBX, dst;
		movdqu XMM0, [RAX];
		movdqu [RBX], XMM0;
	}
}
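
A likely explanation for why the register version behaves differently, offered as a reading of DMD-style inline asm and not as a definitive account: a parameter name such as src stands for the memory slot that holds the pointer. So `movdqu XMM0, [src]` would read 16 bytes starting at that slot, not the 16 bytes the pointer refers to. Loading the pointer value into a general-purpose register first and then dereferencing the register gets at the real data. The sketch below repeats the test from earlier in the thread against the register version. The `version` guard and the plain-copy fallback are additions for illustration, assuming a compiler that defines `D_InlineAsm_X86_64`.

```d
import std.stdio : writeln;

align(16) union __m128i { ubyte[16] data; }

void store(__m128i* src, __m128i* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, src;        // RAX = pointer value stored in parameter src
            mov RBX, dst;        // RBX = pointer value stored in parameter dst
            movdqu XMM0, [RAX];  // load the 16 bytes that src points to
            movdqu [RBX], XMM0;  // store them where dst points to
        }
    }
    else
    {
        // Assumed fallback for targets without DMD-style x86_64 inline asm.
        *dst = *src;
    }
}

void main()
{
    __m128i src;
    src.data[0] = 255;
    __m128i dst;
    store(&src, &dst);
    assert(dst.data == src.data); // dst now holds a copy of src
    writeln(dst.data);
}
```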
Oct 09 2014
Etienne <etcimon gmail.com> writes:
Absolutely incredible! My first useful working assembler code. You saved the day. Now I can probably write a whole SIMD library ;)
Oct 09 2014
Etienne <etcimon gmail.com> writes:
Maybe someone can help with the more specific problem. I'm translating a 
crypto engine here:

https://github.com/etcimon/botan/blob/master/source/botan/block/aes_ni/aes_ni.d

But I need this to work on DMD, LDC and GDC. I decided to write the 
assembler code directly for the functions in this module:

https://github.com/etcimon/botan/blob/master/source/botan/utils/simd/xmmintrin.d

If there's anything someone can tell me about this, I'd be thankful. I'm 
very experienced in every aspect of programming, but still at my first 
baby steps in assembler.
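
One possible way to keep a single source file building on several compilers is to select the asm block with the predefined version identifiers `D_InlineAsm_X86_64` and `D_InlineAsm_X86`, and fall back to a plain copy elsewhere. This is only a sketch under assumptions: the helper name `storeUnaligned128` is hypothetical and not taken from the linked xmmintrin.d, and as far as I know GDC uses GCC-style extended asm instead of this syntax, so it would take the fallback branch here.

```d
align(16) union __m128i { ubyte[16] data; }

// Hypothetical helper: copies 16 bytes from *src to *dst.
void storeUnaligned128(__m128i* src, __m128i* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, src;        // 64-bit pointer values go through RAX/RBX
            mov RBX, dst;
            movdqu XMM0, [RAX];
            movdqu [RBX], XMM0;
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, src;        // 32-bit pointer values go through EAX/EBX
            mov EBX, dst;
            movdqu XMM0, [EAX];
            movdqu [EBX], XMM0;
        }
    }
    else
    {
        // No DMD-style inline asm available: element-wise copy.
        dst.data[] = src.data[];
    }
}

void main()
{
    __m128i src, dst;
    foreach (i, ref b; src.data)
        b = cast(ubyte) i;       // fill with 0, 1, ..., 15
    storeUnaligned128(&src, &dst);
    assert(dst.data == src.data);
}
```

The fallback branch is slower than a single `movdqu` pair but keeps the behavior identical, which makes it easy to compare results across compilers before writing compiler-specific versions.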
Oct 09 2014