
digitalmars.D - std.simd

reply Manu <turkeyman gmail.com> writes:
Hey chaps (and possibly lasses?)

I've been slowly working on a std.simd library, the aim of which is to provide
a lowest-level hardware-independent SIMD interface. core.simd currently
implements SSE for x86; other architectures are currently exposed via
gcc.builtins.
The purpose of std.simd is to be the lowest-level API that people make
direct use of, while still having an as-close-to-direct-as-possible mapping to
the hardware opcodes, yet still being portable. I would expect that custom,
more-feature-rich SIMD/vector/matrix/linear algebra libraries will be
built on top of std.simd in future, that way being portable to as many
systems as possible.

Now I've reached a question in the design of the library, and I'd like to take
a general consensus.

The lowest-level vectors are defined by: __vector(type[width])
But core.simd also defines a bunch of handy 'nice' aliases for common
vector types, i.e. float4, int4, short8, etc.

I want to claim those names for std.simd. They should be the lowest-level
names that people use, and therefore associate with the std.simd
functionality.
I also want to enhance them a bit:
  I want to make them a struct that wraps the primitive rather than an
alias. I understand this single-POD struct will be handled the same as the
POD itself, is that right? If I pass the wrapper struct by value to a
function, it will be passed in a register as it should, yeah?
  I then intend to add CTFE support, and maybe some properties and
opDispatch bits.

Does this sound reasonable?
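As a rough illustration of the proposal (names and layout here are hypothetical, not the actual std.simd code), the wrapper might look something like:

```d
// Hypothetical sketch of wrapping the __vector primitive in a struct.
// A struct with a single POD member should still be passed in a SIMD
// register by the ABI, just like the raw __vector type.
alias rawFloat4 = __vector(float[4]);

struct float4
{
    rawFloat4 v;      // the single wrapped primitive
    alias v this;     // forward arithmetic etc. to the underlying vector

    // room for CTFE support, properties, opDispatch swizzles, etc.
}
```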
Mar 15 2012
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:
[snip]

This sounds reasonable. However, please realize that if you wish to use the short vector names (i.e. float4, float3, float2, etc.) you should support the full set with a decent range of operations and methods. Several people (myself included) have written similar short vector libraries; I think having short vectors in Phobos is important, but having one library provide float4 and another float2 is less than ideal, even if not all of the types could leverage the SIMD backend.

For myself, the killer feature for such a library would be having the CUDA-compatible alignments for the types (or an equivalent enum to that effect).
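For what it's worth, the CUDA-compatible alignments mentioned here (CUDA aligns float4 to 16 bytes and float2 to 8; float3 gets no special alignment) could in principle be expressed directly with D's align attribute. A hypothetical sketch, with illustrative struct layouts:

```d
// Hypothetical layouts only; shown just to illustrate the alignment idea.
align(16) struct float4 { float x, y, z, w; }
align(8)  struct float2 { float x, y; }

static assert(float4.alignof == 16 && float4.sizeof == 16);
static assert(float2.alignof == 8  && float2.sizeof == 8);
```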
Mar 15 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:

[snip]

This sounds reasonable. However, please realize that if you wish to use the short vector names (i.e. float4, float3, float2, etc.) you should support the full set with a decent range of operations and methods. Several people (myself included) have written similar short vector libraries; I think having short vectors in Phobos is important, but having one library provide float4 and another float2 is less than ideal, even if not all of the types could leverage the SIMD backend. For myself, the killer feature for such a library would be having the CUDA-compatible alignments for the types (or an equivalent enum to that effect).

I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

I really feel it's important to keep std.simd STRICTLY about the hardware SIMD operations, only implementing what the hardware can express efficiently, and not trying to emulate anything else. In some areas I feel I've already violated that premise, by adding some functions to make good use of something that NEON/VMX can express in a single opcode, but takes SSE 2-3. I don't want to push that bar, otherwise the user will lose confidence that the functions in std.simd will actually work efficiently on any given hardware.

It's not a do-everything library, it's a hardware SIMD abstraction, and most functions map to exactly one hardware opcode. I expect most people will want to implement their own higher-level lib on top tbh; almost nobody will ever agree on what the perfect maths library should look like, and it's also context specific.
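To illustrate the kind of gap Manu is talking about (a hedged sketch; the function name is hypothetical): a bitwise select is a single opcode on NEON (vbsl) and VMX/AltiVec (vsel), but before SSE4.1's blendv it takes three logical ops on x86.

```d
import core.simd;

// Hypothetical helper: selects bits from a where mask is set, else from b.
// One opcode on NEON/VMX; the expression below is the classic three-op
// (and / andnot / or) emulation that SSE2 codegen would produce.
int4 selectBits(int4 mask, int4 a, int4 b)
{
    return (a & mask) | (b & ~mask);
}
```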
Mar 15 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 15 Mar 2012 14:02:15 -0500, Manu <turkeyman gmail.com> wrote:
 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:
 On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:


[snip]

I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

Then you should leave namespace room for that higher-level library.
Mar 15 2012
next sibling parent Manu <turkeyman gmail.com> writes:
On 16 March 2012 01:32, Robert Jacques <sandford jhu.edu> wrote:

 On Thu, 15 Mar 2012 14:02:15 -0500, Manu <turkeyman gmail.com> wrote:

 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:

 On Thu, 15 Mar 2012 12:09:58 -0500, Manu <turkeyman gmail.com> wrote:

[snip]


I can see how you come to that conclusion, but I generally feel that that's a problem for a higher layer of library.

Then you should leave namespace room for that higher-level library.

I haven't stolen any names that aren't already taken by core.simd. I just want to claim those primitive types already aliased in core.simd, and enhance them with some more useful base-level functionality.
Mar 16 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

What makes you think that there would be only one such high-level library wanting to define a floatN type?

There is no such thing as a global namespace in D (well, one could probably argue that the things defined in object are). Thus, I don't see a problem with re-using a name in a third-party library, if it's a good fit in both places – and you'll probably have a hard time coming up with a better name for SIMD stuff than float4.

If at some point you want to mix types from both modules, you could always use static or renamed imports. For example, »import lowlevel = std.simd« would give you »lowlevel.float4 upVector;«, which might be clearer in the context of your application than any longer, pre-defined name could ever be.

True, we shouldn't generally pick very likely-to-collide names by default just because we can, but denying the existence of the D module system altogether is going to set us back to using library name prefixes everywhere, like in C (and sometimes C++) code.

David
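David's suggestion in code (a sketch; `gamemath` is a hypothetical higher-level library that also defines a float4):

```d
import lowlevel = std.simd;   // std.simd's types, accessed qualified
import gamemath;              // hypothetical high-level vector library

void orient()
{
    lowlevel.float4 upVector;   // unambiguously the raw SIMD type
    // an unqualified float4 here would refer to gamemath's type
}
```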
Mar 16 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

 What makes you think that there would be only one such high-level library wanting to define a floatN type?

The fact that several people have proposed unifying the existing libraries and putting them into phobos :)

[snip]

Unrelated libraries using the same name is relatively painless. Highly related libraries that conflict, on the other hand, are generally painful. Yes, there are a lot of mechanisms available to work around this, but selective imports and renaming all add to the cognitive load of using and writing the code.

To me float4 isn't a SIMD name; it's a vector name, and if it's implemented using SIMD, great, but that's an implementation detail. I can understand a close-to-the-metal SIMD library and encourage the work. But if it isn't also going to be a vector library, if possible, it shouldn't use the vector names.
Mar 16 2012
parent reply Manu <turkeyman gmail.com> writes:
On 16 March 2012 22:39, Robert Jacques <sandford jhu.edu> wrote:

 On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:
 Then you should leave namespace room for that higher-level library.

 What makes you think that there would be only one such high-level library wanting to define a floatN type?

 The fact that several people have proposed unifying the existing libraries and putting them into phobos :)

I personally can't see it happening. Above the most primitive level that I've tried to cover with std.simd, I think it'll be very hard to find agreement on what that API should look like. If you can invent a proposal that everyone agrees on, I'd be very interested to see it. Perhaps if you extend the fairly raw and D-ish API that I've tried to use in std.simd it could work, but I don't think many people will like using that in their code. I anticipate std.simd will be wrapped in some big bloated class by almost everyone that uses it, so why bother to add the emulation at that level?

[snip]

 To me float4 isn't a SIMD name; it's a vector name, and if it's implemented using SIMD, great, but that's an implementation detail. I can understand a close-to-the-metal SIMD library and encourage the work. But if it isn't also going to be a vector library, if possible, it shouldn't use the vector names.

Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point.

Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.
Mar 16 2012
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 16 Mar 2012 16:45:05 -0500, Manu <turkeyman gmail.com> wrote:
 On 16 March 2012 22:39, Robert Jacques <sandford jhu.edu> wrote:
 On Fri, 16 Mar 2012 08:24:58 -0500, David Nadlinger <see klickverbot.at>
 wrote:
  On Thursday, 15 March 2012 at 23:32:29 UTC, Robert Jacques wrote:


[snip]
 Unrelated libraries using the same name is relatively painless. Highly
 related libraries that conflict, on the other hand, are generally painful.
 Yes, there are a lot of mechanisms available to work around this, but
 selective imports and renaming all add to the cognitive load of using and
 writing the code.

 To me float4 isn't a SIMD name; its a vector name and if it's implemented
 using SIMD, great, but that's an implementation detail. I can understand a
 close to the metal SIMD library and encourage the work. But if it isn't
 also going to be a vector library, if possible, it shouldn't use the vector
 names.

Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point. Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.

Shaders. :)

Actually, float4 isn't supported in hardware if you're on NVIDIA. And IIRC ATI/AMD is moving away from hardware support as well. I'm not sure what Intel or the embedded GPUs do. On the CPU side, SIMD support on both ARM and PowerPC is optional.

As for examples, pretty much every graphics, vision, imaging and robotics library has a small vector library attached to it; were you looking for something else?

Also, clean support for float3 / float2 / etc. has shown up in Intel's Larrabee and its Knights derivatives; so maybe we'll see it in a desktop processor someday. To say nothing of the 256-bit and 512-bit SIMD units on some machines.

My concern is that std.simd is (for good reason) leaking the underlying hardware implementation (std.simd seems to be very x86-centric), while vectors, in my mind, are a higher-level abstraction.
Mar 17 2012
parent Manu <turkeyman gmail.com> writes:
On 17 March 2012 20:42, Robert Jacques <sandford jhu.edu> wrote:

 On Fri, 16 Mar 2012 16:45:05 -0500, Manu <turkeyman gmail.com> wrote:

 Can you give me an example of a non-simd context where this is the case? Don't say shaders, because that is supported in hardware, and that's my point. Also there's nothing stopping a secondary library adding/emulating the additional types. They could work seamlessly together. float4 may come from std.simd, float3/float2 may be added by a further lib that simply extends std.simd.

 Shaders. :) Actually, float4 isn't supported in hardware if you're on NVIDIA. And IIRC ATI/AMD is moving away from hardware support as well. I'm not sure what Intel or the embedded GPUs do. On the CPU side, SIMD support on both ARM and PowerPC is optional. As for examples, pretty much every graphics, vision, imaging and robotics library has a small vector library attached to it; were you looking for something else?

GPU hardware is fundamentally different from CPU vector extensions. The goal is not to imitate shaders on the CPU. There are already other possibilities for that anyway.

 Also, clean support for float3 / float2 / etc. has shown up in Intel's Larrabee and its Knights derivatives; so maybe we'll see it in a desktop processor someday. To say nothing of the 256-bit and 512-bit SIMD units on some machines.

Well, when that day comes we'll add hardware abstraction for it. There are two that do currently exist: 3DNow!, which is so antiquated I see no reason to support it, and the Gamecube/Wii/WiiU line of consoles, which all have 2D vector hardware. I'll gladly add support for that the very moment anyone threatens to use D on a Nintendo system, but there's no point right now.

float3, on the other hand, is not supported on any machine, and it's very inefficient. Use of float3 should be discouraged at all costs. People should be encouraged to use float4s and pack something useful in W if they can; and if not, they should be aware that they are wasting 25% of their flops.

I don't recall ever dismissing 256-bit vector units. In fact I've suggested support for AVX is mandatory on plenty of occasions. I'm also familiar with a few 512-bit vector architectures, but I don't think they warrant a language-level implementation yet. Let's just work through what we have and what will be used to start with. I'd be keen to see how it tends to be used, and make any fundamental changes before blowing it way out of proportion.
 My concern is that std.simd is (for good reason) leaking the underlying
 hardware implementation (std.simd seems to be very x86 centric), while
 vectors, in my mind, are a higher level abstraction.

It's certainly not SSE-centric. Actually, if you can legitimately criticise me for anything, it's being biased AGAINST x86-based processors. I'm critically aware of VMX, NEON, SPU, and many architectures that came before. What parts of my current work in progress do you suggest are biased to an x86 hardware implementation? From my experience, I'm fairly happy with how it looks at this point as an efficiency-first, architecture-abstracted interface.

As I've said, I'm still confident that people will just come along and wrap it up with what they feel is a simple/user-friendly interface anyway. If I try to make this higher-level/more-user-friendly, I still won't please everyone, and I'll sacrifice raw efficiency in the process, which defeats the purpose.

How do you define vectors in your mind?
Mar 17 2012
prev sibling next sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 08:02, Manu <turkeyman gmail.com> wrote:
 On 15 March 2012 20:35, Robert Jacques <sandford jhu.edu> wrote:
[snip]

I think that having the low-level vectors makes sense. Since technically only float4, int4, short8 and byte16 actually make sense in the context of direct SIMD, providing other vectors would be straying into vector-library territory, as people would then expect interoperability between them and standard vector/matrix operations, and that could get too high-level. Third-party libraries have to be useful for something!

Slightly off-topic questions: Are you planning on providing a way to fall back if certain operations aren't supported, even if it can only be picked at compile time? Is your work on GitHub or something? I wouldn't mind having a peek, since this stuff interests me. How well does this stuff inline? I can imagine that a lot of the benefit of using SIMD would be lost if every SIMD instruction ends up wrapped in 3-4 more instructions, especially if you need to do consecutive operations on the same data.

--
James Miller
Mar 15 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 15 March 2012 22:27, James Miller <james aatch.net> wrote:

[snip]

 Slightly off-topic questions: Are you planning on providing a way to fall back if certain operations aren't supported?

I think it depends on HOW unsupported they are. If it can be emulated efficiently (and in the context, the emulation would be as efficient as possible on the architecture anyway), then probably; but if it's a problem that should simply be solved another way, I'd rather encourage that with a compile error.

 Even if it can only be picked at compile time? Is your work on GitHub or something?

Yup: https://github.com/TurkeyMan/phobos/commits/master/std/simd.d

 I wouldn't mind having a peek, since this stuff interests me. How well does this stuff inline?

It inlines perfectly; I pay very close attention to the codegen of every single function, and have loads of static branches to select more efficient versions for more recent revisions of the SSE instruction set.

 I can imagine that a lot of the benefit of using SIMD would be lost if every SIMD instruction ends up wrapped in 3-4 more instructions, especially if you need to do consecutive operations on the same data.

It will lose 100% of its benefit if it is wrapped up in even ONE function call, and equally so if the vectors don't pass/return in hardware registers as they should. I'm crafting it to have the same performance characteristics as 'int'.
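The "static branches" described above might look roughly like this (a hypothetical, self-contained sketch; `sseLevel` and both code paths are illustrative, not the actual std.simd source):

```d
// Illustrative compile-time selection between SSE revisions.
enum sseLevel = 3;   // assumed target for this sketch

float horizontalSum(float[4] v)
{
    static if (sseLevel >= 3)
    {
        // on SSE3 hardware this pairing would lower to two haddps opcodes
        return (v[0] + v[1]) + (v[2] + v[3]);
    }
    else
    {
        // scalar fallback for pre-SSE3 targets
        float r = 0;
        foreach (x; v) r += x;
        return r;
    }
}
```

Because `static if` is resolved at compile time, the unselected branch generates no code at all, so nothing stands between the caller and the chosen instruction sequence.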
Mar 15 2012
prev sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 11:44, Manu <turkeyman gmail.com> wrote:
[snip]
Cool, thanks for answering my questions. Some of what I'm working on atm would benefit from SIMD.

--
James Miller
Mar 15 2012
prev sibling parent reply "F i L" <witte2008 gmail.com> writes:
Great to hear this is coming along. Can I get a link to the 
(github?) source?

Do the simd functions have fallback functionality for unsupported 
hardware? Is that planned? Or is that something I'd be writing 
into my own Vector structures?

Also, I noticed Phobos now includes an "etc" library; do you have 
plans to eventually make a general-purpose higher-level linear 
systems library in that?
Mar 15 2012
next sibling parent James Miller <james aatch.net> writes:
On 16 March 2012 11:14, F i L <witte2008 gmail.com> wrote:
 Great to hear this is coming along. Can I get a link to the (github?)
 source?

 Do the simd functions have fallback functionality for unsupported hardware?
 Is that planned? Or is that something I'd be writing into my own Vector
 structures?

 Also, I noticed Phobos now includes an "etc" library; do you have plans to
 eventually make a general-purpose higher-level linear systems library in
 that?

Looks like we have the same questions. Great minds think alike and all that :D

--
James Miller
Mar 15 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 16 March 2012 00:14, F i L <witte2008 gmail.com> wrote:

 Do the simd functions have fallback functionally for unsupported hardware?
 Is that planned? Or is that something I'd be writing into my own Vector
 structures?

I am thinking more and more that it'll have a fallback for unsupported hardware (since the same code will need to run for CTFE anyway), and it may as well just pipe unsupported platforms through that code. But it probably won't be as efficient as possible for those platforms, so the jury is still out. It might be better to encourage them to do it properly.
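A hedged sketch of that idea: D's `__ctfe` flag lets one function body use a plain scalar path at compile time while the run-time path uses the hardware vectors. Everything here is illustrative, not std.simd's actual implementation:

```d
// Illustrative only: scalar path for CTFE (and, potentially, for targets
// with no SIMD support), hardware __vector path at run time.
import core.simd;

float4 add(float4 a, float4 b)
{
    if (__ctfe)
    {
        // compile-time evaluation: element-wise scalar loop
        float4 r;
        foreach (i; 0 .. 4)
            r.array[i] = a.array[i] + b.array[i];
        return r;
    }
    else
    {
        return a + b;   // lowers to a single vector add (addps on SSE)
    }
}
```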
 Also, I noticed Phobos now includes a "etc" library, do you have plans to
 eventually make a general purpose higher-level Linear systems library in
 that?

I don't plan to. If I end out using one in my personal code, I'll share it though.
Mar 15 2012