digitalmars.D - suggestion: basic SIMD types modelled after Cg/HLSL

reply cschueler <cschueler_member pathlink.com> writes:
Hi list,
I'll throw in a suggestion for extending D, for when no other worries are left
over.

Why not incorporate the 4-way SIMD (single instruction, multiple data) vectors,
which by now are available in hardware on virtually every platform, as built-in
types with the same syntax as in Cg (Nvidia's C for graphics) and HLSL
(High-Level Shading Language)?

The rationale is simple: the hardware can do it, so in the same sense that D is
enthusiastic about supporting 80-bit floats, it could just as well map these
capabilities to language primitives.

Here, in very compressed form, is what it currently looks like in Cg.
You can define vectors like this:

float2 a;
float3 b;
float4 c;

matrices like this:

float2x2 A;
float3x4 B;
float4x4 C;

Add/Sub/Mul/Div/Compare and math functions like sqrt(), sin(), exp() etc.
operate element-wise. So in Cg

float4 result = a < b;

yields a float4 whose elements are set to the comparison results of the
individual elements (0 or 1). (I've seen one thread in the archives discussing
that the result of an opCmp-based comparison is fixed to bool, so this behavior
might conflict with the existing language spec. Mind, however, that SIMD compare
capability is very useful for muxing constructs.)
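
To make the muxing remark concrete, here is a minimal sketch in plain C++ (not Cg, and not a proposed D syntax). The names Float4, less_than, mux and min4 are made up for illustration; the point is only that a 0/1 compare result can drive a per-lane select without branching on the vector as a whole.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-wide vector type, standing in for Cg's float4.
struct Float4 {
    std::array<float, 4> v;
};

// Element-wise a < b: each lane becomes 1.0f if true, 0.0f otherwise.
inline Float4 less_than(const Float4& a, const Float4& b) {
    Float4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r.v[i] = (a.v[i] < b.v[i]) ? 1.0f : 0.0f;
    return r;
}

// Mux: per lane, pick x where the mask is non-zero, otherwise y.
inline Float4 mux(const Float4& mask, const Float4& x, const Float4& y) {
    Float4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r.v[i] = (mask.v[i] != 0.0f) ? x.v[i] : y.v[i];
    return r;
}

// Element-wise minimum built from compare + mux.
inline Float4 min4(const Float4& a, const Float4& b) {
    return mux(less_than(a, b), a, b);
}
```

On real SIMD hardware the compare would typically produce an all-ones/all-zeros bit mask rather than 0.0/1.0, but the compare-then-select shape is the same.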

You can swizzle element access:

// put the contents of vector a in swizzled order into result
float4 result = a.xzyw; 
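
For readers who don't know Cg, a rough C++ approximation of what a swizzle does might look like the sketch below. Float4 and swizzle are hypothetical names, and the indices 0..3 stand for x, y, z, w; a built-in language feature would of course use the .xzyw spelling instead of template arguments.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-wide vector type, standing in for Cg's float4.
struct Float4 {
    std::array<float, 4> v;
};

// swizzle<0, 2, 1, 3>(a) plays the role of a.xzyw: each template argument
// names the source lane that ends up in that output position.
template <std::size_t X, std::size_t Y, std::size_t Z, std::size_t W>
inline Float4 swizzle(const Float4& a) {
    static_assert(X < 4 && Y < 4 && Z < 4 && W < 4,
                  "component index out of range");
    return Float4{{a.v[X], a.v[Y], a.v[Z], a.v[W]}};
}
```

Because the indices are compile-time constants, a compiler is free to map such a call to a single shuffle instruction where the hardware has one.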

Literal constants expand to the vector type of their context, so the constant
"1" may silently propagate to float4(1,1,1,1) if needed:

// will promote to ( a + float4(1,1,1,1) ) / float4(2,2,2,2)
float4 result = ( a + 1 ) / 2;
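
The promotion rule can be imitated in C++ with a non-explicit broadcasting constructor. This is only a sketch under that assumption (Float4 and its operators are made up here), but it shows the same expression compiling and behaving element-wise.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-wide vector type, standing in for Cg's float4.
struct Float4 {
    std::array<float, 4> v;

    Float4(float x, float y, float z, float w) : v{{x, y, z, w}} {}

    // Deliberately not 'explicit': a scalar silently broadcasts to all lanes,
    // which is what lets "a + 1" mean "a + Float4(1, 1, 1, 1)".
    Float4(float s) : v{{s, s, s, s}} {}
};

// Element-wise addition.
inline Float4 operator+(const Float4& a, const Float4& b) {
    return Float4(a.v[0] + b.v[0], a.v[1] + b.v[1],
                  a.v[2] + b.v[2], a.v[3] + b.v[3]);
}

// Element-wise division.
inline Float4 operator/(const Float4& a, const Float4& b) {
    return Float4(a.v[0] / b.v[0], a.v[1] / b.v[1],
                  a.v[2] / b.v[2], a.v[3] / b.v[3]);
}
```

With these definitions, Float4 r = (a + 1) / 2; averages each lane of a with 1, just like the Cg line above.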

Of course you can also apply a swizzle to a propagated constant:

// no syntax error (why should there be one?)
float4 result = (1).xyzw; 

Some intrinsic functions like dot and cross products are available.
Matrix multiplication is done with mul.

float4 result = mul( matrix, vector );
float3 result = cross( v1, v2 );
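
As a reference for what these intrinsics compute, here is a scalar C++ sketch. The type names and the row-major layout are my assumptions for illustration; Cg's mul( matrix, vector ) treats the vector as a column, which is what the code below does.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for Cg's float3, float4 and float4x4.
struct Float3 { std::array<float, 3> v; };
struct Float4 { std::array<float, 4> v; };
struct Float4x4 { std::array<Float4, 4> rows; };  // rows[i] is the i-th row

// Dot product: sum of the per-lane products.
inline float dot(const Float4& a, const Float4& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < 4; ++i)
        sum += a.v[i] * b.v[i];
    return sum;
}

// mul(matrix, vector): the vector is treated as a column, so each result
// lane is the dot product of one matrix row with the vector.
inline Float4 mul(const Float4x4& m, const Float4& x) {
    Float4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r.v[i] = dot(m.rows[i], x);
    return r;
}

// Right-handed cross product, defined for 3-component vectors only.
inline Float3 cross(const Float3& a, const Float3& b) {
    return Float3{{a.v[1] * b.v[2] - a.v[2] * b.v[1],
                   a.v[2] * b.v[0] - a.v[0] * b.v[2],
                   a.v[0] * b.v[1] - a.v[1] * b.v[0]}};
}
```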


That's it for the basics. Cg/HLSL is already familiar to a number of people, so
in one swoop you'd obviate the need for a large number of people (basically the
non-scientific crowd) to write their own matrix/vector classes, and give them a
familiar syntax to boot.

I'd envision that a sufficiently capable compiler could generate code for these
language features even if the target platform has no SIMD hardware, much in the
same spirit as a float emulator when native hardware is not available.
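
The fallback idea can be sketched today with conditional compilation in C++. This is an illustration, not a proposal for how a D compiler would do it: Float4 and add are made-up names, and the __SSE__ macro check assumes a GCC- or Clang-style compiler (other compilers signal SSE support differently, in which case the scalar path is used).

```cpp
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#endif

// Hypothetical 4-wide vector type, standing in for Cg's float4.
struct Float4 {
    float v[4];
};

// Element-wise addition: one SSE instruction where available,
// a plain scalar loop everywhere else. Callers see the same result.
inline Float4 add(const Float4& a, const Float4& b) {
    Float4 r;
#if defined(__SSE__)
    // Unaligned loads/stores, since Float4 makes no alignment promise.
    __m128 va = _mm_loadu_ps(a.v);
    __m128 vb = _mm_loadu_ps(b.v);
    _mm_storeu_ps(r.v, _mm_add_ps(va, vb));
#else
    for (std::size_t i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}
```

With built-in types the compiler would make this choice itself, so user code would not need the #if at all.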

Anyway, these are only points to kick off a discussion. Obviously I'd like D to
move in a direction where it is useful for me (as you may infer from my other
posts :) )
Jul 12 2006
next sibling parent reply "Craig Black" <cblack ara.com> writes:
Good ideas. Perhaps it would be easier to integrate Cg/HLSL with D so that
they could work well together. Perhaps Cg/HLSL code could be inlined like
assembler. That way the full functionality of these languages could be
leveraged. I don't know how difficult this would be to implement though.

-Craig
Jul 12 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
Craig Black wrote on 2006-07-12:
 Good ideas. Perhaps it would be easier to integrate Cg/HLSL with D so that
 they could work well together. Perhaps Cg/HLSL code could be inlined like
 assembler. That way the full functionality of these languages could be
 leveraged. I don't know how difficult this would be to implement though.

The suggestion sounds good to me; however, implementing an optimizing compiler
that uses SSE 1/2/3 and MMX is no small task.

Maybe GCC's -mfpmath=sse might provide a starting point for SSE support in GDC?

Thomas
Jul 13 2006
next sibling parent Mike Capp <mike.capp gmail.com> writes:
In article <mqgho3-487.ln1 birke.kuehne.cn>, Thomas Kuehne says...
 The suggestion sounds good to me, however implementing an optimizing compiler
 that uses SSE 1/2/3 and MMX is no small task. Maybe GCC's -mfpmath=sse might
 provide a starting point for SSE support in GDC?

You might be interested in Sh - http://libsh.org/

cheers
Mike
Jul 13 2006
prev sibling parent "Craig Black" <cblack ara.com> writes:
"Thomas Kuehne" <thomas-dloop kuehne.cn> wrote in message 
news:mqgho3-487.ln1 birke.kuehne.cn...
 The suggestion sounds good to me, however implementing an optimizing compiler
 that uses SSE 1/2/3 and MMX is no small task. Maybe GCC's -mfpmath=sse might
 provide a starting point for SSE support in GDC?

I was thinking more along the lines of using the compilers that already exist
rather than creating our own.

-Craig
Jul 13 2006
prev sibling next sibling parent Anders Runesson <anders runesson.info> writes:
I love this idea. I wouldn't have to learn assembler to use SIMD!
/Anders

Jul 12 2006
prev sibling next sibling parent "Andrei Khropov" <andkhropov nospam_mtu-net.ru> writes:
cschueler wrote:

 you'd obviate the need for a large number of people (basically
 the non-scientific crowd) to write their own matrix/vector classes and give
 them a familiar syntax to boot.
As D is quite popular among game developers, why not include these classes in
the standard library? There's a project, http://www.dsource.org/projects/helix,
that contains this functionality. With some template/conditional-compilation
trickery it can be extended to optionally use SIMD instructions.
 
 I'd envision that a sufficiently endowed compiler could generate code for the
 language features even if the target platform has no SIMD hardware; much in the
 same spirit of a float-emulator when native hardware is not available.
 
 Anyway, these are only points to kick off a discussion. Obviously I'd like D
 to move into a direction where it is useful for me (As you may infer from my
 other posts :) )

Well, I believe matrices and vectors and simple operations on them should be
built into the language anyway (though at the moment that's a beyond-1.0
feature). IMHO, using SIMD for these operations is a matter of compiler
optimization. I'm not a compiler writer, however, so I can't judge how tough it
is to implement.

-- 
AKhropov
Jul 13 2006
prev sibling parent mclysenk mtu.edu writes:
In article <e93n8a$27cl$1 digitaldaemon.com>, cschueler says...
 Hi list,
 I'll throw in a suggestion for extending D when no other worries are left over.

 Why not incorporate the 4-way SIMD (single instruction multiple data) vectors,
 which are available in hardware by now on virtually every platform, by means of
 builtin types in the same syntax as in Cg (Nvidia's C for graphics) resp. HLSL
 (high-level shading language).

In my opinion, this feature belongs in a library, not in the language spec. Cg
is totally different in design and scope from D, because shaders are a very
different problem domain from systems programs.

The real issue with vectors is that there is no one-size-fits-all solution.
Most everyone can agree on simple stuff like vector addition or scalar
multiplication, but what about vector-vector products? Hardware can easily do a
full per-component multiply or a dot product, and both have equal claim to the
* operator.

What about cross products? Throwing aside issues like handedness, they only
exist in 3 dimensions, which means they will cause headaches with 2D and 4D
vectors. A nice mathematical solution is geometric algebra; however, it is
hardly 'efficient.' Matrix multiplication is also full of traps, since there
are so many different kinds of inner, outer and tensor products.

Ultimately, any simple language solution will cause more trouble than it's
worth. However, a library would be more than flexible enough to implement any
sort of vector routine, especially with incipient features like implicit
template instantiation.

-Mik
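
One way the library route could sidestep the fight over the * operator is sketched below in C++ templates (a D library would look different, and Vec, hadamard, dot and cross are hypothetical names). Only the uncontroversial scalar product gets an operator; the competing vector-vector products get explicit names, and cross is simply not offered outside three dimensions.

```cpp
#include <array>
#include <cstddef>

// Hypothetical N-component vector; N is fixed at compile time.
template <std::size_t N>
struct Vec {
    std::array<float, N> v;
};

// Uncontroversial: scalar * vector scales every component.
template <std::size_t N>
Vec<N> operator*(float s, const Vec<N>& a) {
    Vec<N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r.v[i] = s * a.v[i];
    return r;
}

// Per-component product under an explicit name instead of operator*.
template <std::size_t N>
Vec<N> hadamard(const Vec<N>& a, const Vec<N>& b) {
    Vec<N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Dot product, likewise named rather than spelled with '*'.
template <std::size_t N>
float dot(const Vec<N>& a, const Vec<N>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i)
        sum += a.v[i] * b.v[i];
    return sum;
}

// Cross product exists for N == 3 only; calling it with a Vec<2> or
// Vec<4> fails to compile instead of guessing at a meaning.
inline Vec<3> cross(const Vec<3>& a, const Vec<3>& b) {
    return Vec<3>{{a.v[1] * b.v[2] - a.v[2] * b.v[1],
                   a.v[2] * b.v[0] - a.v[0] * b.v[2],
                   a.v[0] * b.v[1] - a.v[1] * b.v[0]}};
}
```

The trade-off is verbosity: dot(a, b) reads less like math than an operator would, but nobody has to remember which product the library author picked.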
Jul 14 2006