digitalmars.D - std.simd module

Manu <turkeyman gmail.com> writes:

So I've been trying to collate a sensible framework for a standard
cross-platform simd module since Walter added the SIMD stuff.
I'm sure everyone will have a million opinions on this, so I've drawn my
approach up to a point where it properly conveys the intent, and I've
proven the code gen works, and is good. Now I figure I should get everyone
to shoot it down before I commit to the tedious work filling in all the
remaining blanks.

(Note: I've only written code against GDC as yet, since DMD's SSE only
supports x64, and x64 is not supported in Windows)
https://github.com/TurkeyMan/phobos/blob/master/std/simd.d

The code might surprise a lot of people... so I'll give a few words about
the approach.

The key goal here is to provide the lowest level USEFUL set of functions,
all the basic functions that people actually use in their algorithms,
without requiring them to understand the quirks of various platforms' vector
hardware.
Different SIMD hardware tends to have very different shuffling, load/store,
component addressing, support for more/less of the primitive maths
operations, etc.
This library, which is the lowest level library I expect programmers would
ever want to use in their apps, should provide that API at the lowest
useful level.

First criticism I expect is for many to insist on a class-style vector
library, which I personally think has no place as a low level, portable API.
Everyone has a different idea of what the perfect vector lib should look
like, and it tends to change significantly with respect to its application.

I feel this flat API is easier to implement, maintain, and understand, and
I expect the most common use of this lib will be in the back end of people's
own vector/matrix/linear algebra libs that suit their apps.

My key concern is with my function names... should I be worried about name
collisions in such a low level lib? I already shadow a lot of standard
float functions...
I prefer them abbreviated in this (fairly standard) way, keeps lines of
code short and compact. It should be particularly familiar to anyone who
has written shaders and such.

Opinions? Shall I continue as planned?

Feb 04 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 02/04/2012 08:57 PM, Manu wrote:
 My key concern is with my function names... should I be worried about
 name collisions in such a low level lib? I already shadow a lot of
 standard float functions...

That is not really an issue. If it actually bothers someone, static or named imports come to the rescue.
 I prefer them abbreviated in this (fairly standard) way, keeps lines of
 code short and compact. It should be particularly familiar to anyone who
 has written shaders and such.

I agree.
 Opinions? Shall I continue as planned?

Looks good. I think it should provide emulation in case of non-existent hardware support (maybe even with a possibility to opt-out).
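
On the name-collision point above, static and renamed imports could be used along these lines. This is only a minimal sketch: the local `abs` and the `demo` function are hypothetical and merely stand in for a short-named std.simd-style function living next to std.math.

```d
static import std.math;   // std.math's names are reachable only as std.math.xxx
import m = std.math;      // renamed import: the same names, reachable as m.xxx

/// Hypothetical short-named function that would otherwise clash with std.math.abs.
float abs(float x)
{
    return x < 0 ? -x : x;
}

/// Hypothetical caller mixing the local name with qualified library names.
float demo(float x)
{
    // Unqualified abs resolves to the local function above; the library
    // functions are picked explicitly through the module name or its alias.
    return abs(x) + std.math.sqrt(m.fabs(x));
}
```

Neither import form puts std.math's names into the local scope unqualified, so the short local name stays unambiguous.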
Feb 04 2012
Walter Bright <newshound2 digitalmars.com> writes:
On 2/4/2012 11:57 AM, Manu wrote:
 My key concern is with my function names... should I be worried about name
 collisions in such a low level lib?

No. D's module resolution is good enough that prefixing names is not D-style and is to be avoided.
 I prefer them abbreviated in this (fairly standard) way, keeps lines of code
 short and compact. It should be particularly familiar to anyone who has written
 shaders and such.

 Opinions? Shall I continue as planned?

I'm far too overloaded at the moment to give this an in-depth review. I'm hoping others here will step up!
Feb 04 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 02/05/2012 02:03 AM, Manu wrote:
 On 5 February 2012 02:55, Walter Bright <newshound2 digitalmars.com> wrote:

     On 2/4/2012 11:57 AM, Manu wrote:

         My key concern is with my function names... should I be worried
         about name collisions in such a low level lib?

     No. D's module resolution is good enough that prefixing names is not
     D-style and is to be avoided.

 One concern that has occurred to me relating to the D module system
 is... without any traditional header files, how will this API inline
 properly? It helps that every function is a template, so I suppose that
 forces it to inline yeah?
 I'm quite concerned by a lack of force-inline keyword... it can't be
 left to the compiler to decide to inline these or not. they MUST be
 inlined, there is no compromise.

The 'enum' storage class would mean force-inline when generalized to functions.
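
D had no force-inline facility when this thread was written. Later versions of the language added `pragma(inline, true)`, which asks the compiler to always inline the function it is attached to. A minimal sketch, assuming such a later compiler; `madd` is a hypothetical helper and is not taken from std.simd:

```d
// Hypothetical helper, not from std.simd: a multiply-add that the caller
// wants expanded at every call site instead of being called.
pragma(inline, true)
float madd(float a, float b, float c)
{
    return a * b + c;
}
```

How strictly the request is enforced is up to the compiler, so the generated code is still worth checking for functions that really must be inlined.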
Feb 04 2012
Manu <turkeyman gmail.com> writes:

On 5 February 2012 02:55, Walter Bright <newshound2 digitalmars.com> wrote:

 On 2/4/2012 11:57 AM, Manu wrote:

  My key concern is with my function names... should I be worried about name
  collisions in such a low level lib?

 No. D's module resolution is good enough that prefixing names is not D-style
 and is to be avoided.

One concern that has occurred to me relating to the D module system is...
without any traditional header files, how will this API inline properly? It
helps that every function is a template, so I suppose that forces it to
inline yeah?
I'm quite concerned by a lack of force-inline keyword... it can't be left to
the compiler to decide to inline these or not. they MUST be inlined, there is
no compromise.
Feb 04 2012
Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:
Looks good so far:

   it could use float[2] code wherever there is float[3] code
(magnitude2 etc)

   any/all should have template overloads to let you specify exactly
which channels match, and simple hardcoded ones for the common cases
(any1, any2, any3, any4 aka the default 'any')

   I have implementations of floor/ceil/round(to-even) that work on
pre-SSE4 hardware for floats and doubles that I can give out; they are
fairly simple. I also have the main transcendentals (pow, exp, log, sin,
cos, tan, asin, acos, atan), with sinh and cosh being the only major ones
I left out.

I just need a place or address to post or mail the code.

   D should be able to handle names and overloading better, though
giving everything unique names was the design choice I made for my
library, primarily to make the code searchable and potentially portable
to C (aside from the heavy use of const references as argument types).



Feb 04 2012
"F i L" <witte2008 gmail.com> writes:
Looks good to me so far ;-)


 First criticism I expect is for many to insist on a class-style vector
 library, which I personally think has no place as a low level, portable API.
 Everyone has a different idea of what the perfect vector lib should look
 like, and it tends to change significantly with respect to its application.

I think it would be useful, especially to newcomers who are unfamiliar
with D's lib terrain, to have an officially supported "utils" library for
these higher-level structures.

core // to the metal
std // low-level but useful
util // get the job done
Feb 04 2012
Manu <turkeyman gmail.com> writes:

On 5 February 2012 07:47, Sean Cavanaugh <WorksOnMyMachine gmail.com> wrote:

 Looks good so far:

  it could use float[2] code wherever there is float[3] code (magnitude2
 etc)

Yep, I intended to do this. You'll see I added dot2, I just didn't add the
others yet :P

Note: this is FAR from complete, I just wanted to get initial opinions
before I took it too far.

 any/all should have template overloads to let you specify exactly which
 channels match, and simple hardcoded ones for the common cases (any1, any2,
 any3, any4 aka the default 'any')

... I'll look into it again more closely, but I don't think I can bring
myself to do this. It's ONLY really possible on SSE. Something so expensive
shouldn't be in the base API I don't think.
The only case where this operation is particularly common is working with 3d
vectors. In my experience (fairly extensive, on many architectures) you will
almost always have 0's or 1's in the W anyway, so you can control the mask
by choosing greater or greater-equal. With careful consideration, you can
achieve this at zero cost, and not providing that API leads you to consider
such a construct.

 I have implementations of floor/ceil/round(to-even) that work on pre-SSE4
 hardware for float and doubles I can give out they are fairly simple, as
 well as the main transcendentals (pow, exp, log, sin, cos, tan, asin, acos,
 atan).  sinh and cosh being the only major ones I left out.

I did plan to add all of these, just haven't gotten to it. You're more than
welcome to contribute your implementations.
I recommend a sincos() function (and friends) as well. Assuming you
implement them as a Taylor series, it's more efficient to calculate both at
once, and it's rare that you ever call one and not the other.

 I just need a place or address to post or mail the code.

Pull request? :)
Or email me: turkeyman at gmail

 D should be able to handle names and overloading better, though giving
 everything unique names was the design choice I made for my library,
 primarily to make the code searchable and potentially portable to C (aside
 from the heavy use of const references as argument types).

/agree, but the names I've used are so standardised and expected, that I'm
really apprehensive to use different names.
Need more opinions to make a good decision, but currently I'm leaning
heavily towards keeping it how it is.
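
The point about computing sin and cos together can be sketched in scalar form. This is an illustration under stated assumptions, not code from std.simd: `sincosTaylor` is a hypothetical name, the series are truncated after four terms, and the input is assumed to be already reduced to a small range (roughly |x| <= pi/4). The saving comes from both series sharing the same x*x.

```d
/// Hypothetical helper: approximates sin(x) and cos(x) in one call using
/// truncated Taylor series. Only meant for small |x| (about pi/4 or less).
void sincosTaylor(float x, out float s, out float c)
{
    immutable float x2 = x * x; // computed once, used by both series

    // sin x ~ x - x^3/3! + x^5/5! - x^7/7!, written in Horner form over x2
    s = x * (1.0f + x2 * (-1.0f / 6 + x2 * (1.0f / 120 + x2 * (-1.0f / 5040))));

    // cos x ~ 1 - x^2/2! + x^4/4! - x^6/6!, also in Horner form over x2
    c = 1.0f + x2 * (-0.5f + x2 * (1.0f / 24 + x2 * (-1.0f / 720)));
}
```

A SIMD version would apply the same arithmetic to whole vectors, with range reduction done beforehand.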
Feb 05 2012
Manu <turkeyman gmail.com> writes:

On 5 February 2012 09:22, F i L <witte2008 gmail.com> wrote:

 Looks good to me so far ;-)

  First criticism I expect is for many to insist on a class-style vector
  library, which I personally think has no place as a low level, portable
  API.
  Everyone has a different idea of what the perfect vector lib should look
  like, and it tends to change significantly with respect to its
  application.

 I think it would be useful, especially to newcomers who are unfamiliar
 with D's lib terrain, to have an officially supported "utils" library for
 these higher-level structures.

 core // to the metal
 std // low-level but useful
 util // get the job done

Precisely my thoughts too. Something like 'util' may produce comprehensive,
very generic, standard constructs, but makes no guarantees that they are
efficient, or the best possible implementation for your application/context.
Feb 05 2012
"a" <a a.com> writes:
On Saturday, 4 February 2012 at 23:15:17 UTC, Manu wrote:

 First criticism I expect is for many to insist on a class-style vector
 library, which I personally think has no place as a low level, portable API.
 Everyone has a different idea of what the perfect vector lib should look
 like, and it tends to change significantly with respect to its application.

 I feel this flat API is easier to implement, maintain, and understand, and
 I expect the most common use of this lib will be in the back end of peoples
 own vector/matrix/linear algebra libs that suit their apps.

 My key concern is with my function names... should I be worried about name
 collisions in such a low level lib? I already shadow a lot of standard
 float functions...
 I prefer them abbreviated in this (fairly standard) way, keeps lines of
 code short and compact. It should be particularly familiar to anyone who
 has written shaders and such.

I prefer the flat API and short names too.
 Opinions? Shall I continue as planned?

Looks nice. Please do continue :)

You have only run this on a 32 bit machine, right? Cause I tried to compile
this simple example and got some errors about converting ulong to int:

auto testfun(float4 a, float4 b)
{
    return swizzle!("yxwz")(a);
}

It compiles if I make these changes:

566c566
<         foreach(i; 0..N)
---
>         foreach(int i; 0..N)

<                 int i = countUntil(s, swizzleKey[0]);
---
>                 int i = cast(int)countUntil(s, swizzleKey[0]);

<                     foreach(j, c; s) // find the offset of the
---
>                     foreach(int j, c; s) // find the offset of the
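
The countUntil part of the fix can be shown in isolation. A minimal sketch; `componentIndex` is a hypothetical helper written for illustration and does not appear in std.simd:

```d
import std.algorithm : countUntil;

/// Hypothetical helper: maps a swizzle letter such as 'z' to its position
/// in a component string such as "xyzw".
int componentIndex(string components, char key)
{
    // countUntil returns ptrdiff_t, which is 64 bits wide on x86_64, so the
    // narrowing to int has to be spelled out; -1 still means "not found".
    return cast(int) countUntil(components, key);
}
```

On a 32 bit target ptrdiff_t is already 32 bits wide, which is why the original code compiled there without the cast.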

Feb 06 2012
Manu <turkeyman gmail.com> writes:

On 6 February 2012 10:49, a <a a.com> wrote:

 I prefer the flat API and short names too.

 Looks nice. Please do continue :)

 You have only run this on a 32 bit machine, right? Cause I tried to compile
 this simple example and got some errors about converting ulong to int:

True, I have only been working in x86 GDC so far, but I just wanted to get feedback about my approach and API design at this point. It seems there are no serious objections, I'll continue as is. I have an ARM compiler too now, so I'll be implementing/testing against that as reference also.
 auto testfun(float4 a, float4 b)
 {
   return swizzle!("yxwz")(a);
 }

 It compiles if I make these changes:

 566c566
 <               foreach(i; 0..N)
 ---
 >               foreach(int i; 0..N)
 574c574
 <                               int i = countUntil(s, swizzleKey[0]);
 ---
 >                               int i = cast(int)countUntil(s, swizzleKey[0]);
 591c591
 <                                       foreach(j, c; s) // find the offset of the
 ---
 >                                       foreach(int j, c; s) // find the offset of the

Feb 06 2012
prev sibling next sibling parent "a" <a a.com> writes:
 True, I have only been working in x86 GDC so far, but I just 
 wanted to get
 feedback about my approach and API design at this point.
 It seems there are no serious objections, I'll continue as is.

I have one proposal about API design of matrix operations. Maybe there could be functions that would take row vectors as parameters in addition to those that take matrix structs. That way one could call matrix functions on data that isn't stored as matrix structures without copying. So for example for the transpose function there would also be a function that would be used like this (a* are inputs and r* are outputs):

transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

Maybe those functions could be used to implement the functions that take and return structs.

I also think that interleave and deinterleave operations would be useful. For four element float vectors those can be implemented with only one instruction at least for SSE (using unpcklps, unpckhps and shufps) and NEON (using vuzp and vzip).
 I have an
 ARM compiler too now, so I'll be implementing/testing against 
 that as
 reference also.

Could you please tell me how did you get the ARM compiler to work?
Feb 06 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 6 February 2012 15:13, a <a a.com> wrote:
 True, I have only been working in x86 GDC so far, but I just wanted to get
 feedback about my approach and API design at this point.
 It seems there are no serious objections, I'll continue as is.

I have one proposal about API design of matrix operations. Maybe there could
 be functions that would take row vectors as parameters in addition to those
 that take matrix structs. That way one could call matrix functions on data
 that isn't stored as matrix structures without copying. So for example for
 the transpose function there would also be a function that would be used
 like this (a* are inputs and r* are outputs):

 transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

 Maybe those functions could be used to implement the functions that take and
 return structs.

 I also think that interleave and deinterleave operations would be useful.
 For four element float vectors those can be implemented with only one
 instruction at least for SSE (using unpcklps, unpckhps and shufps) and NEON
 (using vuzp and vzip).


 I have an
 ARM compiler too now, so I'll be implementing/testing against that as
 reference also.

Could you please tell me how did you get the ARM compiler to work?

There's a thread in d.gnu with Linux and MinGW cross compiler binaries.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Feb 06 2012
prev sibling next sibling parent "a" <a a.com> writes:
 There's a thread in d.gnu with Linux and MinGW cross compiler 
 binaries.

I didn't know that, thanks.
Feb 06 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 6 February 2012 17:13, a <a a.com> wrote:

 True, I have only been working in x86 GDC so far, but I just wanted to get
 feedback about my approach and API design at this point.
 It seems there are no serious objections, I'll continue as is.

I have one proposal about API design of matrix operations. Maybe there could be functions that would take row vectors as parameters in addition to those that take matrix structs. That way one could call matrix functions on data that isn't stored as matrix structures without copying. So for example for the transpose function there would also be a function that would be used like this (a* are inputs and r* are outputs):

 transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

... the problem is, without multiple return values (come on, D should have multiple return values!), how do you return the result? :)
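One way to approximate multiple return values in current D is std.typecons.Tuple. The sketch below is only illustrative and rests on assumptions: Vec4 is a plain float[4] standing in for float4, and transposeRows is a hypothetical name. It says nothing about whether a given compiler keeps the result in registers.

```d
import std.typecons : Tuple, tuple;

alias Vec4 = float[4]; // assumption: scalar stand-in for float4

// Hypothetical function: returns the four transposed rows as one tuple.
Tuple!(Vec4, Vec4, Vec4, Vec4) transposeRows(Vec4 aX, Vec4 aY, Vec4 aZ, Vec4 aW)
{
    Vec4 rX = [aX[0], aY[0], aZ[0], aW[0]];
    Vec4 rY = [aX[1], aY[1], aZ[1], aW[1]];
    Vec4 rZ = [aX[2], aY[2], aZ[2], aW[2]];
    Vec4 rW = [aX[3], aY[3], aZ[3], aW[3]];
    return tuple(rX, rY, rZ, rW);
}

unittest
{
    Vec4 x = [1, 2, 3, 4], y = [5, 6, 7, 8];
    Vec4 z = [9, 10, 11, 12], w = [13, 14, 15, 16];
    auto r = transposeRows(x, y, z, w);
    assert(r[0] == [1f, 5, 9, 13]); // first column becomes first row
}
```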
 Maybe those functions could be used to implement the functions that take
 and return structs.

Yes... I've been pondering how to do this properly for ages actually. That's the main reason I haven't fleshed out any matrix functions yet; I'm still not at all sold on how to represent the matrices.
Ideally, there should not be any memory access. But even if they pass by ref/pointer, as soon as the function is inlined, the memory access will disappear, and it'll effectively generate the same code...

So the problem is not so much with respect to THIS API, but with respect to the matrix calling convention in general...

 I also think that interleave and deinterleave operations would be useful.
 For four element float vectors those can be implemented with only one
 instruction at least for SSE (using unpcklps, unpckhps and shufps) and
 NEON (using vuzp and vzip).

Sure. I wasn't sure how useful they were in practise... I didn't want to load it with countless silly permutation routines so I figured I'll add them by request, or as they are proven useful in real world apps.
What would you typically do with the interleave functions at a high level? Sure you don't just use it as a component behind a few actually useful functions which should be exposed instead?

 I have an
 ARM compiler too now, so I'll be implementing/testing against that as
 reference also.

Could you please tell me how did you get the ARM compiler to work?

I did not.. It was the work of another fine chap in the gdc newsgroup ;)
Feb 06 2012
prev sibling parent "a" <a a.com> writes:
 used like this (a* are inputs and r* are outputs):

 transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

... the problem is, without multiple return values (come on, D should have multiple return values!), how do you return the result? :)
 Maybe those functions could be used to implement the functions 
 that take
 and return structs.

Yes... I've been pondering how to do this properly for ages actually. That's the main reason I haven't fleshed out any matrix functions yet; I'm still not at all sold on how to represent the matrices. Ideally, there should not be any memory access. But even if they pass by ref/pointer, as soon as the function is inlined, the memory access will disappear, and it'll effectively generate the same code...

I meant having functions that would return through reference parameters. The transpose function above would have signature transpose(float4, float4, float4, float4, ref float4, ref float4, ref float4, ref float4).
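A minimal sketch of that signature, assuming a plain float[4] (aliased as Vec4) in place of float4 and a hypothetical Matrix4 struct. It is a scalar reference version, not the shuffle-based code a real implementation would use, and it shows the struct overload built on the row-vector one:

```d
alias Vec4 = float[4]; // assumption: scalar stand-in for float4

// Row-vector form: inputs by value, results through ref parameters.
void transpose(Vec4 aX, Vec4 aY, Vec4 aZ, Vec4 aW,
               ref Vec4 rX, ref Vec4 rY, ref Vec4 rZ, ref Vec4 rW)
{
    // Inputs are copies, so passing the same variable as input and
    // output is safe in this sketch.
    rX = [aX[0], aY[0], aZ[0], aW[0]];
    rY = [aX[1], aY[1], aZ[1], aW[1]];
    rZ = [aX[2], aY[2], aZ[2], aW[2]];
    rW = [aX[3], aY[3], aZ[3], aW[3]];
}

// Hypothetical matrix struct, only for illustration.
struct Matrix4 { Vec4 x, y, z, w; }

// Struct form implemented on top of the row-vector form.
Matrix4 transpose(Matrix4 m)
{
    Matrix4 r;
    transpose(m.x, m.y, m.z, m.w, r.x, r.y, r.z, r.w);
    return r;
}

unittest
{
    Matrix4 m;
    m.x = [1, 2, 3, 4];
    m.y = [5, 6, 7, 8];
    m.z = [9, 10, 11, 12];
    m.w = [13, 14, 15, 16];
    auto t = transpose(m);
    assert(t.x == [1f, 5, 9, 13]);
    assert(t.w == [4f, 8, 12, 16]);
}
```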
 Sure. I wasn't sure how useful they were in practise... I 
 didn't want to
 load it with countless silly permutation routines so I figured 
 I'll add
 them by request, or as they are proven useful in real world 
 apps.
 What would you typically do with the interleave functions at a 
 high level?
 Sure you don't just use it as a component behind a few actually 
 useful
 functions which should be exposed instead?

I think they would be useful when you work with arrays of structs with two elements such as complex numbers. For example to calculate a square of a complex array you could do:

for(size_t i = 0; i < a.length; i += 2)
{
    float4 first = a[i];
    float4 second = a[i + 1];
    float4 re = deinterleaveLow(first, second);
    float4 im = deinterleaveHigh(first, second);
    float4 re2 = re * re - im * im;
    float4 im2 = re * im;
    im2 += im2;
    a[i] = interleaveLow(re2, im2);
    a[i + 1] = interleaveHigh(re2, im2);
}

Interleave and deinterleave can also be useful when you want to shuffle data in some custom way. You can't cover all possible permutations of elements over multiple vectors in a library (unless you do something like A* search at compile time and generate code based on that - but that would probably be way too slow), but you can expose at least the capabilities that are common to most platforms, such as interleave and deinterleave.
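To make the element order concrete, here is a scalar reference sketch of the four operations and the complex-square loop. It assumes Vec4 = float[4] in place of float4, and that the low/high variants follow the SSE lane order of unpcklps/unpckhps and of shufps with masks 0x88/0xDD. The function names follow the example above, but nothing here is taken from std.simd.

```d
alias Vec4 = float[4]; // assumption: scalar stand-in for float4

// Lane order modelled on unpcklps / unpckhps.
Vec4 interleaveLow(Vec4 a, Vec4 b)    { return [a[0], b[0], a[1], b[1]]; }
Vec4 interleaveHigh(Vec4 a, Vec4 b)   { return [a[2], b[2], a[3], b[3]]; }

// Lane order modelled on shufps with masks 0x88 and 0xDD.
Vec4 deinterleaveLow(Vec4 a, Vec4 b)  { return [a[0], a[2], b[0], b[2]]; }
Vec4 deinterleaveHigh(Vec4 a, Vec4 b) { return [a[1], a[3], b[1], b[3]]; }

// Squares complex numbers stored as interleaved (re, im) pairs,
// two per Vec4, processing two vectors (four numbers) per step.
void squareComplex(Vec4[] a)
{
    assert(a.length % 2 == 0);
    for (size_t i = 0; i < a.length; i += 2)
    {
        Vec4 re = deinterleaveLow(a[i], a[i + 1]);
        Vec4 im = deinterleaveHigh(a[i], a[i + 1]);
        Vec4 re2, im2;
        foreach (k; 0 .. 4)
        {
            re2[k] = re[k] * re[k] - im[k] * im[k]; // real part of z*z
            im2[k] = 2 * re[k] * im[k];             // imaginary part of z*z
        }
        a[i]     = interleaveLow(re2, im2);
        a[i + 1] = interleaveHigh(re2, im2);
    }
}

unittest
{
    // (1+2i), (3+4i), (0+1i), (1+0i)
    Vec4[] a = [[1f, 2, 3, 4], [0f, 1, 1, 0]];
    squareComplex(a);
    assert(a[0] == [-3f, 4, -7, 24]);
    assert(a[1] == [-1f, 0, 1, 0]);
}
```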
Feb 06 2012