
digitalmars.D - std.simd

reply Manu <turkeyman gmail.com> writes:
Hey chaps (and possibly lasses?)

I've been slowly working on a std.simd library, the aim of which is to provide
a lowest-level hardware-independent SIMD interface. core.simd currently
implements SSE for x86; other architectures are currently exposed via
gcc.builtins.
The purpose of std.simd is to be the lowest-level API that people make
direct use of, while still having an as-close-to-direct-as-possible mapping to
the hardware opcodes, yet still being portable. I would expect that custom,
more-feature-rich SIMD/vector/matrix/linear algebra libraries will be
built on top of std.simd in future, that way being portable to as many
systems as possible.

Now I've reached a question in the design of the library, and I'd like to take
a general consensus.

The lowest-level vectors are defined by: __vector(type[width])
But core.simd also defines a bunch of handy 'nice' aliases for common
vector types, i.e. float4, int4, short8, etc.

I want to claim those names for std.simd. They should be the lowest-level
names that people use, and therefore associate with the std.simd
functionality.
I also want to enhance them a bit:
  I want to make them a struct that wraps the primitive rather than an
alias. I understand this single-POD struct will be handled the same as the
POD itself, is that right? If I pass the wrapper struct by value to a
function, it will be passed in a register as it should, yeah?
  I then intend to add CTFE support, and maybe some properties and
opDispatch bits.

Does this sound reasonable?
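As a rough illustration of the proposal (names and layout here are hypothetical, not the actual std.simd code), the wrapper might look something like:

```d
// Hypothetical sketch of wrapping the __vector primitive in a struct.
// A struct with a single POD member should still be passed in a SIMD
// register by the ABI, just like the raw __vector type.
alias rawFloat4 = __vector(float[4]);

struct float4
{
    rawFloat4 v;      // the single wrapped primitive
    alias v this;     // forward arithmetic etc. to the underlying vector

    // room for CTFE support, properties, opDispatch swizzles, etc.
}
```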
Mar 15 2012
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:
[snip]

This sounds reasonable. However, please realize that if you wish to use the short vector names (i.e. float4, float3, float2, etc.) you should support the full set with a decent range of operations and methods. Several people (myself included) have written similar short vector libraries; I think having short vectors in Phobos is important, but having one library provide float4 and another float2 is less than ideal, even if not all of the types could leverage the SIMD backend.

For myself, the killer feature for such a library would be having the CUDA-compatible alignments for the types (or an equivalent enum to that effect).
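For what it's worth, the CUDA-compatible alignments mentioned here (CUDA aligns float4 to 16 bytes and float2 to 8; float3 gets no special alignment) could in principle be expressed directly with D's align attribute. A hypothetical sketch, with illustrative struct layouts:

```d
// Hypothetical layouts only; shown just to illustrate the alignment idea.
align(16) struct float4 { float x, y, z, w; }
align(8)  struct float2 { float x, y; }

static assert(float4.alignof == 16 && float4.sizeof == 16);
static assert(float2.alignof == 8  && float2.sizeof == 8);
```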
Mar 15 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:

[snip]

This sounds reasonable. However, please realize that if you wish to use the short vector names (i.e. float4, float3, float2, etc.) you should support the full set with a decent range of operations and methods. Several people (myself included) have written similar short vector libraries; I think having short vectors in Phobos is important, but having one library provide float4 and another float2 is less than ideal, even if not all of the types could leverage the SIMD backend. For myself, the killer feature for such a library would be having the CUDA-compatible alignments for the types (or an equivalent enum to that effect).

I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

I really feel it's important to keep std.simd STRICTLY about the hardware SIMD operations, only implementing what the hardware can express efficiently, and not trying to emulate anything else. In some areas I feel I've already violated that premise, by adding some functions to make good use of something that NEON/VMX can express in a single opcode, but takes SSE 2-3. I don't want to push that bar, otherwise the user will lose confidence that the functions in std.simd will actually work efficiently on any given hardware.

It's not a do-everything library, it's a hardware SIMD abstraction, and most functions map to exactly one hardware opcode. I expect most people will want to implement their own higher-level lib on top tbh; almost nobody will ever agree on what the perfect maths library should look like, and it's also context specific.
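To illustrate the kind of gap Manu is talking about (a hedged sketch; the function name is hypothetical): a bitwise select is a single opcode on NEON (vbsl) and VMX/AltiVec (vsel), but before SSE4.1's blendv it takes three logical ops on x86.

```d
import core.simd;

// Hypothetical helper: selects bits from a where mask is set, else from b.
// One opcode on NEON/VMX; the expression below is the classic three-op
// (and / andnot / or) emulation that SSE2 codegen would produce.
int4 selectBits(int4 mask, int4 a, int4 b)
{
    return (a & mask) | (b & ~mask);
}
```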
Mar 15 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 15 Mar 2012 14:02:15 -0500, Manu <turkeyman gmail.com> wrote:
 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:
 On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:


[snip]

I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

Then you should leave namespace room for that higher-level library.
Mar 15 2012
next sibling parent Manu <turkeyman gmail.com> writes:
On 16 March 2012 01:32, Robert Jacques <sandford jhu.edu> wrote:

 On Thu, 15 Mar 2012 14:02:15 -0500, Manu <turkeyman gmail.com> wrote:

 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:

 On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:

[snip]


I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

Then you should leave namespace room for that higher-level library.

I haven't stolen any names that aren't already taken by core.simd. I just want to claim those primitive types already aliased in core.simd, and enhance them with some more useful base-level functionality.
Mar 16 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

What makes you think that there would be only one such high-level library wanting to define a floatN type?

There is no such thing as a global namespace in D (well, one could probably argue that the things defined in object are). Thus, I don't see a problem with re-using a name in a third-party library, if it's a good fit in both places – and you'll probably have a hard time coming up with a better name for SIMD stuff than float4.

If at some point you want to mix types from both modules, you could always use static or renamed imports. For example, »import lowlevel = std.simd« would give you »lowlevel.float4 upVector;«, which might be clearer in the context of your application than any longer, pre-defined name could ever be.

True, we shouldn't generally pick very likely-to-collide names by default just because we can, but denying the existence of the D module system altogether is going to set us back to using library name prefixes everywhere, like in C (and sometimes C++) code.

David
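David's suggestion in code (a sketch; `gamemath` is a hypothetical higher-level library that also defines a float4):

```d
import lowlevel = std.simd;   // std.simd's types, accessed qualified
import gamemath;              // hypothetical high-level vector library

void orient()
{
    lowlevel.float4 upVector;   // unambiguously the raw SIMD type
    // an unqualified float4 here would refer to gamemath's type
}
```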
Mar 16 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

 What makes you think that there would be only one such high-level library wanting to define a floatN type?

The fact that several people have proposed unifying the existing libraries and putting them into phobos :)

[snip]

Unrelated libraries using the same name is relatively painless. Highly related libraries that conflict, on the other hand, are generally painful. Yes, there are a lot of mechanisms available to work around this, but selective imports and renaming all add to the cognitive load of using and writing the code.

To me float4 isn't a SIMD name; it's a vector name, and if it's implemented using SIMD, great, but that's an implementation detail. I can understand a close-to-the-metal SIMD library and encourage the work. But if it isn't also going to be a vector library, if possible, it shouldn't use the vector names.
Mar 16 2012
parent reply Manu <turkeyman gmail.com> writes:
On 16 March 2012 22:39, Robert Jacques <sandford jhu.edu> wrote:

 On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

 What makes you think that there would be only one such high-level library wanting to define a floatN type?

 The fact that several people have proposed unifying the existing libraries and putting them into phobos :)

I personally can't see it happening. Above the most primitive level that I've tried to cover with std.simd, I think it'll be very hard to find agreement on what that API should look like. If you can invent a proposal that everyone agrees on, I'd be very interested to see it. Perhaps if you extend the fairly raw and D-ish API that I've tried to use in std.simd it could work, but I don't think many people will like using that in their code. I anticipate std.simd will be wrapped in some big bloated class by almost everyone that uses it, so why bother to add the emulation at that level?

[snip]

 To me float4 isn't a SIMD name; it's a vector name, and if it's implemented using SIMD, great, but that's an implementation detail. I can understand a close-to-the-metal SIMD library and encourage the work. But if it isn't also going to be a vector library, if possible, it shouldn't use the vector names.

Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point.

Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.
Mar 16 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 16 Mar 2012 16:45:05 -0500, Manu <turkeyman gmail.com> wrote:
 On 16 March 2012 22:39, Robert Jacques <sandford jhu.edu> wrote:
 On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at>
 wrote:
  On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:


[snip]
 Unrelated libraries using the same name is relatively painless. Highly
 related libraries that conflict, on the other hand, are generally painful.
 Yes, there are a lot of mechanisms available to work around this, but
 selective imports and renaming all add to the cognitive load of using and
 writing the code.

 To me float4 isn't a SIMD name; its a vector name and if it's implemented
 using SIMD, great, but that's an implementation detail. I can understand a
 close to the metal SIMD library and encourage the work. But if it isn't
 also going to be a vector library, if possible, it shouldn't use the vector
 names.

Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point. Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.

Shaders. :)

Actually, float4 isn't supported in hardware if you're on NVIDIA. And IIRC ATI/AMD is moving away from hardware support as well. I'm not sure what Intel or the embedded GPUs do. On the CPU side, SIMD support on both ARM and PowerPC is optional.

As for examples, pretty much every graphics, vision, imaging and robotics library has a small vector library attached to it; were you looking for something else?

Also, clean support for float3 / float2 / etc. has shown up in Intel's Larrabee and its Knights derivatives; so maybe we'll see it in a desktop processor someday. To say nothing of the 256-bit and 512-bit SIMD units on some machines.

My concern is that std.simd is (for good reason) leaking the underlying hardware implementation (std.simd seems to be very x86-centric), while vectors, in my mind, are a higher-level abstraction.
Mar 17 2012
parent Manu <turkeyman gmail.com> writes:
On 17 March 2012 20:42, Robert Jacques <sandford jhu.edu> wrote:

 On Fri, 16 Mar 2012 16:45:05 -0500, Manu <turkeyman gmail.com> wrote:

 Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point. Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.

 Shaders. :) Actually, float4 isn't supported in hardware if you're on NVIDIA. And IIRC ATI/AMD is moving away from hardware support as well. I'm not sure what Intel or the embedded GPUs do. On the CPU side, SIMD support on both ARM and PowerPC is optional. As for examples, pretty much every graphics, vision, imaging and robotics library has a small vector library attached to it; were you looking for something else?

GPU hardware is fundamentally different from CPU vector extensions. The goal is not to imitate shaders on the CPU. There are already other possibilities for that anyway.

 Also, clean support for float3 / float2 / etc. has shown up in Intel's Larrabee and its Knights derivatives; so maybe we'll see it in a desktop processor someday. To say nothing of the 256-bit and 512-bit SIMD units on some machines.

Well, when that day comes we'll add hardware abstraction for it. There are two that do currently exist: 3DNow!, which is so antiquated I see no reason to support it, and the Gamecube/Wii/WiiU line of consoles, which all have 2D vector hardware. I'll gladly add support for that the very moment anyone threatens to use D on a Nintendo system, but there's no point right now.

float3, on the other hand, is not supported on any machine, and it's very inefficient. Use of float3 should be discouraged at all costs. People should be encouraged to use float4s and pack something useful in W if they can; and if not, they should be aware that they are wasting 25% of their flops.

I don't recall ever dismissing 256-bit vector units. In fact I've suggested support for AVX is mandatory on plenty of occasions. I'm also familiar with a few 512-bit vector architectures, but I don't think they warrant a language-level implementation yet. Let's just work through what we have and what will be used to start with. I'd be keen to see how it tends to be used, and make any fundamental changes before blowing it way out of proportion.
 My concern is that std.simd is (for good reason) leaking the underlying
 hardware implementation (std.simd seems to be very x86 centric), while
 vectors, in my mind, are a higher level abstraction.

It's certainly not SSE-centric. Actually, if you can legitimately criticise me for anything, it's being biased AGAINST x86-based processors. I'm critically aware of VMX, NEON, SPU, and many architectures that came before. What parts of my current work in progress do you suggest are biased to an x86 hardware implementation? From my experience, I'm fairly happy with how it looks at this point as an efficiency-first, architecture-abstracted interface.

As I've said, I'm still confident that people will just come along and wrap it up with what they feel is a simple/user-friendly interface anyway. If I try to make this higher-level/more-user-friendly, I still won't please everyone, and I'll sacrifice raw efficiency in the process, which defeats the purpose.

How do you define vectors in your mind?
Mar 17 2012
prev sibling next sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 08:02, Manu <turkeyman gmail.com> wrote:
 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:
[snip]

I think that having the low-level vectors makes sense. Since technically only float4, int4, short8 and byte16 actually make sense in the context of direct SIMD, providing other vectors would be straying into vector-library territory, as people would then expect interoperability between them and standard vector/matrix operations, and that could get too high-level. Third-party libraries have to be useful for something!

Slightly off-topic questions: Are you planning on providing a way to fall back if certain operations aren't supported, even if it can only be picked at compile time? Is your work on GitHub or something? I wouldn't mind having a peek, since this stuff interests me. How well does this stuff inline? I can imagine that a lot of the benefit of using SIMD would be lost if every SIMD instruction ends up wrapped in 3-4 more instructions, especially if you need to do consecutive operations on the same data.

--
James Miller
Mar 15 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 15 March 2012 22:27, James Miller <james aatch.net> wrote:

[snip]

 Slightly off-topic questions: Are you planning on providing a way to fall back if certain operations aren't supported?

I think it depends on HOW unsupported they are. If it can be emulated efficiently (and in the context, the emulation would be as efficient as possible on the architecture anyway), then probably; but if it's a problem that should simply be solved another way, I'd rather encourage that with a compile error.

 Even if it can only be picked at compile time? Is your work on GitHub or something?

Yup: https://github.com/TurkeyMan/phobos/commits/master/std/simd.d

 I wouldn't mind having a peek, since this stuff interests me. How well does this stuff inline?

It inlines perfectly; I pay very close attention to the codegen of every single function, and have loads of static branches to select more efficient versions for more recent revisions of the SSE instruction set.

 I can imagine that a lot of the benefit of using SIMD would be lost if every SIMD instruction ends up wrapped in 3-4 more instructions, especially if you need to do consecutive operations on the same data.

It will lose 100% of its benefit if it is wrapped up in even ONE function call, and equally so if the vectors don't pass/return in hardware registers as they should. I'm crafting it to have the same performance characteristics as 'int'.
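The "static branches" described above might look roughly like this (a hypothetical, self-contained sketch; `sseLevel` and both code paths are illustrative, not the actual std.simd source):

```d
// Illustrative compile-time selection between SSE revisions.
enum sseLevel = 3;   // assumed target for this sketch

float horizontalSum(float[4] v)
{
    static if (sseLevel >= 3)
    {
        // on SSE3 hardware this pairing would lower to two haddps opcodes
        return (v[0] + v[1]) + (v[2] + v[3]);
    }
    else
    {
        // scalar fallback for pre-SSE3 targets
        float r = 0;
        foreach (x; v) r += x;
        return r;
    }
}
```

Because `static if` is resolved at compile time, the unselected branch generates no code at all, so nothing stands between the caller and the chosen instruction sequence.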
Mar 15 2012
prev sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 11:44, Manu <turkeyman gmail.com> wrote:
[snip]
Cool, thanks for answering my questions. Some of what I'm working on atm would benefit from SIMD.

--
James Miller
Mar 15 2012
prev sibling parent reply "F i L" <witte2008 gmail.com> writes:
Great to hear this is coming along. Can I get a link to the 
(github?) source?

Do the simd functions have fallback functionality for unsupported 
hardware? Is that planned? Or is that something I'd be writing 
into my own Vector structures?

Also, I noticed Phobos now includes an "etc" library; do you have 
plans to eventually make a general-purpose higher-level linear 
systems library in that?
Mar 15 2012
next sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 11:14, F i L <witte2008 gmail.com> wrote:
 Great to hear this is coming along. Can I get a link to the (github?)
 source?

 Do the simd functions have fallback functionality for unsupported hardware?
 Is that planned? Or is that something I'd be writing into my own Vector
 structures?

 Also, I noticed Phobos now includes an "etc" library; do you have plans to
 eventually make a general-purpose higher-level linear systems library in
 that?

Looks like we have the same questions. Great minds think alike and all that :D

--
James Miller
Mar 15 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 16 March 2012 00:14, F i L <witte2008 gmail.com> wrote:

 Do the simd functions have fallback functionally for unsupported hardware?
 Is that planned? Or is that something I'd be writing into my own Vector
 structures?

I am thinking more and more that it'll have a fallback for unsupported hardware (since the same code will need to run for CTFE anyway), and it may as well just pipe unsupported platforms through that code. But it probably won't be as efficient as possible for those platforms, so the jury is still out. It might be better to encourage them to do it properly.
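A hedged sketch of that idea: D's `__ctfe` flag lets one function body use a plain scalar path at compile time while the run-time path uses the hardware vectors. Everything here is illustrative, not std.simd's actual implementation:

```d
// Illustrative only: scalar path for CTFE (and, potentially, for targets
// with no SIMD support), hardware __vector path at run time.
import core.simd;

float4 add(float4 a, float4 b)
{
    if (__ctfe)
    {
        // compile-time evaluation: element-wise scalar loop
        float4 r;
        foreach (i; 0 .. 4)
            r.array[i] = a.array[i] + b.array[i];
        return r;
    }
    else
    {
        return a + b;   // lowers to a single vector add (addps on SSE)
    }
}
```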
 Also, I noticed Phobos now includes a "etc" library, do you have plans to
 eventually make a general purpose higher-level Linear systems library in
 that?

I don't plan to. If I end out using one in my personal code, I'll share it though.
Mar 15 2012