digitalmars.D.bugs - [Issue 19663] New: On x86_64 the fabs intrinsic should use SSE
- d-bugmail puremagic.com (31/31) Feb 09 2019 https://issues.dlang.org/show_bug.cgi?id=19663
https://issues.dlang.org/show_bug.cgi?id=19663

          Issue ID: 19663
           Summary: On x86_64 the fabs intrinsic should use SSE
           Product: D
           Version: D2
          Hardware: x86_64
                OS: All
            Status: NEW
          Keywords: performance
          Severity: enhancement
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: b2.temp gmx.com

Currently, on x86_64, the dmd backend implements the fabs intrinsic with the FPU instruction of the same name, FABS. However, the ABI passes `float` and `double` parameters in SSE registers, so they have to travel from these SSE registers to GP registers, only then to FPU registers and, depending on what is done with the resulting absolute value, back to a GP register (all of this to clear one bit!), then back again to an SSE register if the function has to return the value, etc.

It would be wiser to use an SSE logical AND with a mask. This would be done only for the `float` and `double` types.

Several options exist:

1. generate the mask, then ANDPS/ANDPD
2. ANDPS/ANDPD with a constant mask (LDC2 does that, btw)
3. left shift then right shift by one

Forum discussion: https://forum.dlang.org/post/diljelbvmenuxtaqbuxw forum.dlang.org

Reference for the possible solutions: https://stackoverflow.com/questions/32408665/fastest-way-to-compute-absolute-value-using-sse
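To make the three options concrete, here is a minimal sketch using C SSE2 intrinsics. It is not what the dmd backend emits or would emit. The function names (`fabs_genmask`, `fabs_andpd`, `fabs_andps`, `fabs_shift`) are hypothetical and chosen only for illustration. C is used because each intrinsic corresponds closely to one of the instructions named above. The sketch assumes an x86_64 target, where SSE2 is always available.

```c
#include <emmintrin.h> /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Option 1: build the mask in registers (all ones, shifted right by one),
   then AND it with the value to clear the sign bit. */
static inline double fabs_genmask(double x)
{
    __m128i ones = _mm_cmpeq_epi32(_mm_setzero_si128(), _mm_setzero_si128());
    __m128d mask = _mm_castsi128_pd(_mm_srli_epi64(ones, 1)); /* 0x7FFF... */
    return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
}

/* Option 2: AND with a constant mask (ANDPD for double). */
static inline double fabs_andpd(double x)
{
    const __m128d mask =
        _mm_castsi128_pd(_mm_set1_epi64x(0x7FFFFFFFFFFFFFFFLL));
    return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
}

/* Option 2, single precision (ANDPS for float). */
static inline float fabs_andps(float x)
{
    const __m128 mask = _mm_castsi128_ps(_mm_set1_epi32(0x7FFFFFFF));
    return _mm_cvtss_f32(_mm_and_ps(_mm_set_ss(x), mask));
}

/* Option 3: shift the sign bit out on the left, then shift a zero back in. */
static inline double fabs_shift(double x)
{
    __m128i bits = _mm_castpd_si128(_mm_set_sd(x));
    bits = _mm_srli_epi64(_mm_slli_epi64(bits, 1), 1);
    return _mm_cvtsd_f64(_mm_castsi128_pd(bits));
}
```

In all four variants the value stays in an SSE register from argument to return value, which is the point of the request. Which variant is cheapest depends on how the mask is materialized, so this sketch makes no performance claim.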