digitalmars.D - Aligning data in memory

Peter Alexander <peter.alexander.au gmail.com> writes:
I posted this in d.learn and also on stackoverflow.com, with no
satisfactory answer. Can anyone help me with this?

http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

---

Is there a way to align data on the stack? In particular, I want to
create a 16-byte aligned array of floats to load into XMM registers
using movaps, which is significantly faster than movups.

e.g.

void foo()
{
     float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
     asm
     {
         movaps XMM0, v; // v must be 16-byte aligned for this to work.
         ...
     }
}
Sep 17 2011
Adam D. Ruppe <destructionator gmail.com> writes:
Perhaps:

void foo() {
    struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
    V v;
    asm {
        movaps XMM0, v;
    }
}


It compiles, but I'm not sure if it's actually correct.
Sep 17 2011
Peter Alexander <peter.alexander.au gmail.com> writes:
On 17/09/11 7:11 PM, Adam D. Ruppe wrote:
 Perhaps:

 void foo() {
          struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
 	V v;
 	asm {
 		movaps XMM0, v;
 	}
 }


 It compiles, but I'm not sure if it's actually correct.

If I am correct, that only aligns the field within the struct; it doesn't align the struct itself.
Sep 17 2011
Trass3r <un known.com> writes:
On 17.09.2011 at 20:11, Adam D. Ruppe <destructionator gmail.com> wrote:

 Perhaps:

 void foo() {
         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
 	V v;
 	asm {
 		movaps XMM0, v;
 	}
 }

 It compiles, but I'm not sure if it's actually correct.

That align directive is fucked up anyways. Why does it even exist if the value you specify doesn't change anything? I can't make sense out of the description: http://www.d-programming-language.org/attribute.html#align
Sep 19 2011
Peter Alexander <peter.alexander.au gmail.com> writes:
On 19/09/11 9:17 AM, Rory McGuire wrote:
 surely you would have to use
   movaps XMM0, v.v;

   because the alignment would only happen inside the struct?


 On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe
 <destructionator gmail.com> wrote:

     Perhaps:

     void foo() {
             struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
             V v;
             asm {
                     movaps XMM0, v;
             }
     }


     It compiles, but I'm not sure if it's actually correct.

v has offset 0 in the struct, so &v.v == &v, which is all the inline asm cares about.
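
For illustration only, the offset argument can be checked with a small D sketch. The helper name isAligned is made up for this example and is not a library function. The sketch only verifies that the field sits at offset 0 and reports whether a particular stack instance happens to land on a 16-byte boundary; what it prints depends on the compiler and platform, so it is a diagnostic rather than a guarantee.

```d
import std.stdio : writefln;

struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }

// The only field sits at offset 0, so &v.v and &v are the same address.
static assert(V.v.offsetof == 0);

/// Hypothetical helper: true if `p` is a multiple of `n`
/// (`n` is assumed to be a power of two).
bool isAligned(const(void)* p, size_t n)
{
    return (cast(size_t) p & (n - 1)) == 0;
}

void main()
{
    V v;
    // Same address either way, which is all the inline asm would see.
    assert(cast(void*) &v == cast(void*) &v.v);
    // Report what this compiler and platform actually did for the stack copy.
    writefln("V.alignof = %s, stack instance 16-byte aligned: %s",
             V.alignof, isAligned(&v, 16));
}
```

If the second value prints false, movaps on v would fault, which is the concern raised earlier in the thread.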
Sep 20 2011
Peter Alexander <peter.alexander.au gmail.com> writes:
I could be wrong, but I think so.

As I understand it, align(N) only aligns the member *within the structure*.

At offset 0 you are already aligned for every N, so I don't see
why it would add padding before the first member of a struct.


On 21/09/11 11:22 AM, Rory McGuire wrote:
 Would that even be true in the case where you specify an alignment
 (keeping in mind that the alignment is for that specific variable)?



 On Tue, Sep 20, 2011 at 7:25 PM, Peter Alexander
 <peter.alexander.au gmail.com> wrote:

     On 19/09/11 9:17 AM, Rory McGuire wrote:

         surely you would have to use
           movaps XMM0, v.v;

           because the alignment would only happen inside the struct?


         On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe
         <destructionator gmail.com> wrote:

             Perhaps:

             void foo() {
                     struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
                     V v;
                     asm {
                             movaps XMM0, v;
                     }
             }


             It compiles, but I'm not sure if it's actually correct.



     v has offset 0 in the struct, so &v.v == &v, which is all the inline
     asm cares about.

Sep 21 2011
Rory McGuire <rjmcguire gmail.com> writes:
Would that even be true in the case where you specify an alignment (keeping
in mind that the alignment is for that specific variable)?



On Tue, Sep 20, 2011 at 7:25 PM, Peter Alexander
<peter.alexander.au gmail.com> wrote:

 On 19/09/11 9:17 AM, Rory McGuire wrote:

 surely you would have to use
  movaps XMM0, v.v;

  because the alignment would only happen inside the struct?


 On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe
 <destructionator gmail.com> wrote:

    Perhaps:

    void foo() {
            struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
            V v;
            asm {
                    movaps XMM0, v;
            }
    }


    It compiles, but I'm not sure if it's actually correct.

 v has offset 0 in the struct, so &v.v == &v, which is all the inline asm
 cares about.

Sep 21 2011
"Robert Jacques" <sandford jhu.edu> writes:
On Sat, 17 Sep 2011 14:01:19 -0400, Peter Alexander
<peter.alexander.au gmail.com> wrote:
 I posted this in d.learn and also on stackoverflow.com, with no
 satisfactory answer. Can anyone help me with this?

 http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

 ---

 Is there a way to align data on the stack? In particular, I want to
 create a 16-byte aligned array of floats to load into XMM registers
 using movaps, which is significantly faster than movups.

 e.g.

 void foo()
 {
      float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
      asm
      {
          movaps XMM0, v; // v must be 16-byte aligned for this to work.
          ...
      }
 }

It depends. OS X requires 16-byte stack alignment, which DMD complies with, so on Mac the above code is okay. On PC, however, the only ways to get aligned memory are to a) use the heap or b) request extra stack space and align it yourself (i.e. declare a float[7] and then slice it appropriately). You could also use alloca or a region allocator to get aligned memory.

The other option is to just use movups. IIRC, movups on aligned data had the same speed as movaps on my CPU (Core 2), and I'd be really surprised if that weren't true on any modern architecture. (That said, movups does slow down on unaligned memory.)
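
A minimal sketch of option (b), for illustration only. The function name alignedSlice is invented for this example, and it assumes the buffer is at least 4-byte aligned, which holds for a float array. Seven floats are enough because at most three floats (12 bytes) ever need to be skipped to reach the next 16-byte boundary.

```d
import std.stdio : writeln;

/// Hypothetical helper: returns a 4-float slice of `buf` whose first
/// element is 16-byte aligned. `buf` must hold at least 7 floats.
float[] alignedSlice(float[] buf)
{
    assert(buf.length >= 7, "need 4 floats plus up to 3 floats of slack");
    auto addr = cast(size_t) buf.ptr;
    // Bytes to skip to reach the next multiple of 16 (0 if already aligned).
    size_t skipBytes = (16 - (addr & 15)) & 15;
    // The buffer is float-aligned, so this division is exact.
    size_t skip = skipBytes / float.sizeof;
    return buf[skip .. skip + 4];
}

void main()
{
    float[7] storage = 0;                  // over-allocated stack buffer
    float[] v = alignedSlice(storage[]);   // aligned window into it
    v[] = [1.0f, 2.0f, 3.0f, 4.0f];
    assert((cast(size_t) v.ptr & 15) == 0);
    writeln(v);
}
```

Inline asm would then load v.ptr into a general-purpose register and use movaps on that address instead of naming the variable directly. The slice is only valid while storage is in scope.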
Sep 17 2011
Rory McGuire <rjmcguire gmail.com> writes:
surely you would have to use
 movaps XMM0, v.v;

 because the alignment would only happen inside the struct?


On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:

 Perhaps:

 void foo() {
        struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
        V v;
        asm {
                movaps XMM0, v;
        }
 }


 It compiles, but I'm not sure if it's actually correct.

Sep 19 2011