digitalmars.D.learn - SSE, Inline assembler, Structs, ...

reply Audun Wilhelmsen <seronor gmail.com> writes:
I want to use SSE to create a fast vector/matrix library (if anyone has done
this already I'd like to know). 

It seems that there's quite a bit of overhead with operator overloading, so I'd
probably want to write some of the algorithms in my final app in assembly, but
I'd still like to have optimized operators too. I'm having some problems, though.
For instance, I can't get this to work:
align struct Vec4 {
  float x,y,z,w;
  ....
Vec4 opAdd(Vec4 v) {
  Vec4 res;
  asm {
    movaps XMM0, [this]; 
    addps XMM0, v[EBP];
    movaps res[EBP], XMM0;
  }
  return res;
}
}

If I add Vec4 *me = this and replace this with me, it compiles, but it crashes.

Also, this confuses me:
	Vec4 v1 = Vec4(1,2,3,4);
//	Vec4* p = &v1;
	asm {
		movaps XMM1, v1[EBP]; 
	}

If I remove the comment (enabling the Vec4* p line), the program crashes.
Apr 01 2008
parent reply Sascha Katzner <sorry.no spam.invalid> writes:
Audun Wilhelmsen wrote:
 Also, this confuses me:
 	Vec4 v1 = Vec4(1,2,3,4);
 //	Vec4* p = &v1;
 	asm {
 		movaps XMM1, v1[EBP]; 
 	}
 
 if I remove the comment, the program crashes.

It seems that this is a data alignment problem. IIRC the "a" in movaps stands for "aligned", which means the instruction expects its memory operand to be 16-byte aligned. You can use movups (unaligned), but that is not as fast. Or you can align your data.

LLAP, Sascha
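
A minimal sketch of the movups route, assuming a current D2 compiler with DMD-style inline asm (DMD or LDC on x86/x86_64). The free function addVec4 and the local pointers pa, pb and pr are hypothetical names, not from the original post. Each pointer is loaded into a register first, because a local name used inside a D asm block refers to the variable's own stack slot rather than to what it points at.

```d
import std.stdio;

struct Vec4 { float x, y, z, w; }

// Adds two vectors with SSE using unaligned loads and stores, so the
// operands may live anywhere (stack, heap or static data).
Vec4 addVec4(Vec4 a, Vec4 b)
{
    Vec4 res;
    Vec4* pa = &a, pb = &b, pr = &res; // plain pointers for the asm block

    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, pa;          // load the pointer values into registers
            mov RCX, pb;
            mov RDX, pr;
            movups XMM0, [RAX];   // unaligned load of a
            movups XMM1, [RCX];   // unaligned load of b
            addps XMM0, XMM1;     // four float additions at once
            movups [RDX], XMM0;   // unaligned store into res
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, pa;
            mov ECX, pb;
            mov EDX, pr;
            movups XMM0, [EAX];
            movups XMM1, [ECX];
            addps XMM0, XMM1;
            movups [EDX], XMM0;
        }
    }
    else
    {
        // Scalar fallback for targets without x86 inline asm.
        res = Vec4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
    }
    return res;
}

void main()
{
    auto r = addVec4(Vec4(1, 2, 3, 4), Vec4(10, 20, 30, 40));
    assert(r.x == 11 && r.y == 22 && r.z == 33 && r.w == 44);
    writefln("%s %s %s %s", r.x, r.y, r.z, r.w);
}
```

Swapping movups for movaps in this sketch is only safe when all three addresses are known to be multiples of 16.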
Apr 01 2008
parent reply Audun Wilhelmsen <seronor gmail.com> writes:
Sascha Katzner Wrote:

It seems that this is a data alignment problem. IIRC the "a" in movaps stands for "aligned", which means the instruction expects its memory operand to be 16-byte aligned. You can use movups (unaligned), but that is not as fast. Or you can align your data. LLAP, Sascha

Well, I've tried align, align(4) and align(16) in front of struct Vec4. Isn't that supposed to align the data?
Apr 03 2008
next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Audun Wilhelmsen" <seronor gmail.com> wrote in message 
news:ft3bdl$1enj$1 digitalmars.com...
Well, I've tried align, align(4) and align(16) in front of struct Vec4. Isn't that supposed to align the data?

I think align(n) only controls the alignment of the members within the struct, not the alignment of the struct itself in memory. I _think_.
Apr 03 2008
prev sibling parent Sascha Katzner <sorry.no spam.invalid> writes:
Audun Wilhelmsen wrote:
 Well I've tried align, align(4) and align(16) in front of struct
 Vec4. Isn't that supposed to align the data?

Since 1.023...
 Data items in static data segment >= 16 bytes in size are now
 paragraph aligned.

So you have to put your structs in the static data segment; as far as I know, structs on the stack are not guaranteed to be 16-byte aligned.

LLAP, Sascha
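
For data that has to stay on the stack, one common workaround is to over-allocate a byte buffer and round the address up to the next 16-byte boundary. The sketch below assumes a current D2 compiler; isAligned16 and alignUp16 are hypothetical helper names, not part of any library.

```d
import std.stdio;

struct Vec4 { float x, y, z, w; }

// True when p lies on a 16-byte (paragraph) boundary.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

// Rounds p up to the next 16-byte boundary. The caller must make sure
// the underlying buffer has at least 15 spare bytes.
Vec4* alignUp16(void* p)
{
    return cast(Vec4*)((cast(size_t) p + 15) & ~cast(size_t) 15);
}

void main()
{
    // Over-allocated stack buffer: room for one Vec4 plus the worst-case
    // 15 bytes of padding needed to reach a 16-byte boundary.
    ubyte[Vec4.sizeof + 15] raw;
    Vec4* v = alignUp16(raw.ptr);
    *v = Vec4(1, 2, 3, 4);
    assert(isAligned16(v)); // this address is safe for movaps

    // A plain stack variable may or may not be aligned, depending on the
    // compiler and on the other locals in the frame.
    Vec4 s;
    writefln("plain stack Vec4 aligned: %s", isAligned16(&s));
}
```

Checking an address with isAligned16 before an asm block is also a cheap way to confirm whether a crash on movaps is really an alignment fault.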
Apr 04 2008