digitalmars.D.ldc - Simd instructions

reply "bearophile" <bearophileHUGS lycos.com> writes:
As the final step in computing the product of two complex numbers, I 
perform a SIMD operation on double2:

x3 = [x3.array[1] - x2.array[1], x3.array[0] + x2.array[0]];

But ldc2 compiles that quite badly (I don't know who is to blame; 
if necessary I will open an LLVM bug report), so I have tried to 
use the addsubpd instruction instead.

To do that, I import ldc.gccbuiltins_x86 and then write:

x3 = __builtin_ia32_addsubpd(x3, x2);

but ldc2 gives me:

LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse3.addsub.pd
Stack dump:
0.      Running pass 'X86 DAG->DAG Instruction Selection' on function ' "\01__D12complex_mul217__T8compMul6Vk12Z8compMul6FNaNbNfKG12NhG2dKG12NhG2dKG12NhG2dZv"'

Can you help me?

Bye,
bearophile
Jul 18 2013
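
For readers unfamiliar with the instruction: ADDSUBPD subtracts in the low lane and adds in the high lane, which is exactly the sign pattern needed for the last step of a complex product. The following is a minimal scalar sketch in plain D, not the poster's actual code. The names addsub and complexMul are hypothetical, and double[2] stands in for double2 so that it compiles without any SSE3 flag. Note that the builtin does not swap lanes, so the hand-written expression in the post matches it only if the operands were arranged accordingly earlier in the function, which the post does not show.

```d
// Scalar model of ADDSUBPD (assumption: plain arrays instead of double2).
// Lane 0 is subtracted, lane 1 is added.
double[2] addsub(double[2] a, double[2] b) pure nothrow @safe
{
    return [a[0] - b[0], a[1] + b[1]];
}

// (a + bi) * (c + di) = (ac - bd) + (bc + ad)i
// x = [a, b], y = [c, d], stored as [real, imaginary].
double[2] complexMul(double[2] x, double[2] y) pure nothrow @safe
{
    double[2] t1 = [x[0] * y[0], x[1] * y[0]]; // [ac, bc]
    double[2] t2 = [x[1] * y[1], x[0] * y[1]]; // [bd, ad]
    return addsub(t1, t2);                     // [ac - bd, bc + ad]
}

void main()
{
    // (1 + 2i) * (3 + 4i) = -5 + 10i
    assert(complexMul([1.0, 2.0], [3.0, 4.0]) == [-5.0, 10.0]);
}
```

In a vectorized version the two multiplications and the lane swap would each be one SIMD operation, with the addsub step as the final instruction.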
parent reply "jerro" <a a.com> writes:
Try adding flag -mattr=sse3.
Jul 18 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
jerro:

> Try adding flag -mattr=sse3.

Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)

Bye,
bearophile
Jul 18 2013
parent "Kai Nacke" <kai redstar.de> writes:
On Thursday, 18 July 2013 at 21:30:12 UTC, bearophile wrote:
> jerro:
>
>> Try adding flag -mattr=sse3.
>
> Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)
>
> Bye,
> bearophile

Hi,

the behaviour was changed because you can't create a generic package if the compiler optimizes for your own CPU. But the change created other problems, see issue #414 (https://github.com/ldc-developers/ldc/issues/414).

With LLVM 3.3, the auto-vectorizer is not enabled by default; you have to specify -vectorize on the command line. Maybe you want to try that with your original code.

Kai
Jul 22 2013