digitalmars.D.learn - SIMD c = a op b

reply Cecil Ward <cecil cecilward.com> writes:
Is it true that this doesn’t work (in either branch)?

float4 a,b;
static if (__traits(compiles, a/b))
     c = a / b;
else
     c[] = a[] / b[];

I tried it with 4 x 64-bit ulongs in a 256-bit vector instead.
Hoping I have done things correctly, I got an error message about
requiring a destination variable, as in c = a op b, when I tried
simply "return a / b;". In the else branch, I got a type
conversion error. Is that because a[] is an array of 256-bit
vectors, in the else case, not an array of ulongs?
Jun 17 2023
parent reply Cecil Ward <cecil cecilward.com> writes:
On Sunday, 18 June 2023 at 04:54:08 UTC, Cecil Ward wrote:
 Is it true that this doesn’t work (in either branch)?

 float4 a,b;
 static if (__traits(compiles, a/b))
     c = a / b;
 else
     c[] = a[] / b[];

 I tried it with 4 x 64-bit ulongs in a 256-bit vector instead.
 Hoping I have done things correctly, I got an error message
 about requiring a destination variable, as in c = a op b, when I
 tried simply "return a / b;". In the else branch, I got a type
 conversion error. Is that because a[] is an array of 256-bit
 vectors, in the else case, not an array of ulongs?
Correction: I should have written "doesn't always work". I just copied the example straight from the language documentation for SIMD and adapted it to use ulongs and a wider vector. I was using GDC.
Jun 17 2023
parent Guillaume Piolat <first.last spam.org> writes:
On Sunday, 18 June 2023 at 05:01:16 UTC, Cecil Ward wrote:
 On Sunday, 18 June 2023 at 04:54:08 UTC, Cecil Ward wrote:
 Is it true that this doesn’t always work (in either branch)?

 float4 a,b;
 static if (__traits(compiles, a/b))
     c = a / b;
 else
     c[] = a[] / b[];
It's because SIMD stuff doesn't always work that intel-intrinsics was created. It insulates you from the compiler underneath.

     import inteli.emmintrin;

     void main()
     {
         float4 a, b, c;
         c = a / b;            // _always_ works
         c = _mm_div_ps(a, b); // _always_ works
     }

Sure, in some cases it may emulate those vectors, but for vectors of float that only happens in DMD -m32. It relies on the excellent __vector work done a long time ago, and supplements it.

For 32-byte vectors such as __vector(float[8]), you will have trouble on GDC when -mavx isn't there, or with DMD.

Do you think the builtin __vector supports the same operations across the compilers? The answer is "it's getting there"; in the meantime, using intel-intrinsics will lower your exposure to the compiler woes.

If you want to use DMD and -O -inline, you should also expect many more problems unless you do extra work to get SIMD.
Jun 19 2023
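
Editorial sketch, not part of any post above: the "destination variable" complaint matches a general rule for D array operations, which must be assigned into existing memory rather than returned directly. The example below assumes the four lanes are kept in a plain ulong[4] static array instead of a __vector type, so it does not depend on vector operator support in any particular compiler. The name divLanes is hypothetical. Whether the compiler turns this into SIMD instructions is not guaranteed.

```d
// Hypothetical helper: lane-wise division on a plain static array,
// used here instead of a 256-bit __vector(ulong[4]).
ulong[4] divLanes(ulong[4] a, ulong[4] b)
{
    ulong[4] c;
    // An array operation needs destination memory, so assign into c[]
    // rather than writing `return a[] / b[];`.
    c[] = a[] / b[];
    return c;
}

void main()
{
    ulong[4] a = [8, 12, 100, 7];
    ulong[4] b = [2, 3, 10, 7];
    ulong[4] expected = [4, 4, 10, 1];
    assert(divLanes(a, b) == expected);
}
```

This only illustrates the array-operation form of the else branch; it says nothing about which vector operators GDC, LDC or DMD accept for a given vector type.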