|
Archives
D Programming
digitalmars.Ddigitalmars.D.bugs digitalmars.D.dtl digitalmars.D.ide digitalmars.D.dwt digitalmars.D.announce digitalmars.D.learn digitalmars.D.debugger D.gnu D C/C++ Programming
c++c++.announce c++.atl c++.beta c++.chat c++.command-line c++.dos c++.dos.16-bits c++.dos.32-bits c++.idde c++.mfc c++.rtl c++.stl c++.stl.hp c++.stl.port c++.stl.sgi c++.stlsoft c++.windows c++.windows.16-bits c++.windows.32-bits c++.wxwindows digitalmars.empire digitalmars.DMDScript electronics |
digitalmars.D.bugs - [Issue 3751] New: Optimalization error in some floating point code
http://d.puremagic.com/issues/show_bug.cgi?id=3751 Summary: Optimalization error in some floating point code Product: D Version: 2.039 Platform: x86 OS/Version: Linux Status: NEW Severity: major Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: baryluk smp.if.uj.edu.pl --- Comment #0 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-28 20:17:58 PST --- I tested this in 2.039 and v2.028, so probably all version beetwen are affected. I don't know how far this bug was introducent. Here is simplified test case (functionally it is buggy but it is sufficient to show bug): import std.math; import std.stdio; double bisect(double z) { double left = -7.0, right = 7.0, half; while (true) { // do {} while(true); // also have this problem half = (left+right)*0.5; version(something) { writefln("%s", half); // adding this solves problem } if (left == half) { return half; // or break } if (half == right) { return half; // or break } /+ // the same effect as two if statments. if ((left == half) || (half == right)) { return half; // or break } +/ double fhalf = exp(-0.5*half*half) * (half + 0.7) - z; if (fhalf > 0.0) { right = half; } else if (fhalf <= 0.0) { left = half; } }; //return half; // not rechable, irrevelant } void main() { foreach (i; 1 .. 10) { auto x = 0.1 + 0.001*i; writefln("%g %g", x, bisect(x)); } } compile without optimalisation: # dmd blad.d; ./blad 0.101 -0.580467 0.102 -0.579361 0.103 -0.578256 0.104 -0.577153 0.105 -0.57605 0.106 -0.574949 0.107 -0.573849 0.108 -0.57275 0.109 -0.571653 # compile with optimalisation: # dmd -O blad.d; ./blad 0.101 -0.580467 .... nothing happens, CPU usage 100% ^C # compile with optimalisation and dummy write: # dmd -O -version=something blad.d; ./blad ... ... 0.109 -0.571653 # // correctly ends just like without optimalisations You can check asmbler code here: version without optimalisations: .text._D4blad6bisectFdZd segment assume CS:.text._D4blad6bisectFdZd _D4blad6bisectFdZd: push EBP mov EBP,ESP sub ESP,020h fld qword ptr FLAT:.rodata[08h] fstp qword ptr -020h[EBP] fld qword ptr FLAT:.rodata[019h] fstp qword ptr -018h[EBP] fld qword ptr FLAT:.rodata[02Ah] fstp qword ptr -010h[EBP] L21: fld qword ptr -020h[EBP] fadd qword ptr -018h[EBP] fmul qword ptr _TMP6 SYM32[09h] fstp qword ptr -010h[EBP] fld qword ptr -020h[EBP] fld qword ptr -010h[EBP] fucompp ST(1),ST fstsw AX sahf jne L46 jp L46 fld qword ptr -010h[EBP] leave ret 8 L46: fld qword ptr -010h[EBP] fld qword ptr -018h[EBP] fucompp ST(1),ST fstsw AX sahf jne L5C jp L5C fld qword ptr -010h[EBP] leave ret 8 L5C: fld qword ptr -010h[EBP] fmul qword ptr _TMP6 SYM32[049h] fmul qword ptr -010h[EBP] sub ESP,8 fstp qword ptr [ESP] call near ptr _D3std4math3expFNaNbdZd PC32 fld qword ptr -010h[EBP] fadd qword ptr _TMP10 SYM32 fmulp ST(1),ST fsub qword ptr 8[EBP] fstp qword ptr -8[EBP] fld qword ptr -8[EBP] ftst fstsw AX sahf fstp ST jbe L98 fld qword ptr -010h[EBP] fstp qword ptr -018h[EBP] jmp short L21 L98: fld qword ptr -8[EBP] ftst fstsw AX sahf fstp ST ja L21 jp L21 fld qword ptr -010h[EBP] fstp qword ptr -020h[EBP] jmp near ptr L21 nop nop nop .text._D4blad6bisectFdZd ends Version with optimalistion: _D4blad6bisectFdZd: push EBP mov EBP,ESP sub ESP,020h mov dword ptr -010h[EBP],0 fld qword ptr FLAT:.rodata[0Fh] fld qword ptr FLAT:.rodata[01Dh] fxch ST1 mov dword ptr -0Ch[EBP],0 fstp qword ptr -020h[EBP] fstp qword ptr -018h[EBP] L28: fld qword ptr -010h[EBP] fld qword ptr -018h[EBP] fucompp ST(1),ST fstsw AX sahf jne L40 jp L40 L37: fld qword ptr -010h[EBP] mov ESP,EBP pop EBP ret 8 L40: fld qword ptr -010h[EBP] fmul qword ptr _TMP6 SYM32[025h] fmul qword ptr -010h[EBP] sub ESP,8 fstp qword ptr [ESP] call near ptr _D3std4math3expFNaNbdZd PC32 fld qword ptr -010h[EBP] fadd qword ptr _TMP6 SYM32[044h] fmulp ST(1),ST fsub qword ptr 8[EBP] fst qword ptr -8[EBP] ftst fstsw AX fstp ST sahf jbe L93 fld qword ptr -010h[EBP] fstp qword ptr -018h[EBP] L77: fld qword ptr -020h[EBP] fld ST0 fadd qword ptr -018h[EBP] fmul qword ptr _TMP10 SYM32[09h] fst qword ptr -010h[EBP] fucompp ST(1),ST fstsw AX sahf jp L28 je L37 jmp short L28 L93: fld qword ptr -8[EBP] ftst fstsw AX fstp ST sahf ja L77 jp L77 fld qword ptr -010h[EBP] fstp qword ptr -020h[EBP] jmp short L77 nop nop nop .text._D4blad6bisectFdZd ends version with "something" added artifacialy: .text._D4blad6bisectFdZd segment assume CS:.text._D4blad6bisectFdZd _D4blad6bisectFdZd: push EBP mov EBP,ESP sub ESP,020h mov dword ptr -010h[EBP],0 fld qword ptr _TMP0 SYM32[017h] fld qword ptr _TMP0 SYM32[025h] fxch ST1 mov dword ptr -0Ch[EBP],0 fstp qword ptr -020h[EBP] fstp qword ptr -018h[EBP] push dword ptr _TMP0 SYM32[02Eh] push dword ptr _TMP0 SYM32[030h] push 0 push 0 call near ptr ...writeflnTAyaTdZ8writeflnFAyadZv PC32 L3D: fld qword ptr -010h[EBP] fld qword ptr -018h[EBP] fucompp ST(1),ST fstsw AX sahf jne L55 jp L55 L4C: fld qword ptr -010h[EBP] mov ESP,EBP pop EBP ret 8 L55: fld qword ptr -010h[EBP] fmul qword ptr _TMP7 SYM32[03Ah] fmul qword ptr -010h[EBP] sub ESP,8 fstp qword ptr [ESP] call near ptr _D3std4math3expFNaNbdZd PC32 fld qword ptr -010h[EBP] fadd qword ptr _TMP10 SYM32[09h] fmulp ST(1),ST fsub qword ptr 8[EBP] fst qword ptr -8[EBP] ftst fstsw AX fstp ST sahf jbe LCA fld qword ptr -010h[EBP] fstp qword ptr -018h[EBP] L8C: push dword ptr _TMP10 SYM32[02h] push dword ptr _TMP10 SYM32[04h] fld qword ptr -020h[EBP] fadd qword ptr -018h[EBP] fmul qword ptr _TMP11 SYM32[028h] fst qword ptr -010h[EBP] sub ESP,8 fstp qword ptr [ESP] call near ptr ...writeflnTAyaTdZ8writeflnFAyadZv PC32 fld qword ptr -020h[EBP] fld qword ptr -010h[EBP] fucompp ST(1),ST fstsw AX sahf jp L3D je L4C jmp near ptr L3D LCA: fld qword ptr -8[EBP] ftst fstsw AX fstp ST sahf ja L8C jp L8C fld qword ptr -010h[EBP] fstp qword ptr -020h[EBP] jmp short L8C .text._D4blad6bisectFdZd ends What is interesting, that just adding single writefln makes this asembler code change in much more places than just call of this function. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- Jan 28 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 --- Comment #1 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-28 20:24:33 PST --- exp(-0.5*half*half) can be change to 1.0/(half*half) and bug is still the same. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- Jan 28 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 Stephan Dilly <spam extrawurst.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |spam extrawurst.org --- Comment #2 from Stephan Dilly <spam extrawurst.org> 2010-01-28 22:30:29 PST --- is this maybe related to issue 3736 ? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- Jan 28 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 --- Comment #3 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-29 04:28:20 PST --- (In reply to comment #2)is this maybe related to issue 3736 ? Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 Don <clugdbug yahoo.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |clugdbug yahoo.com.au --- Comment #4 from Don <clugdbug yahoo.com.au> 2010-01-29 04:55:17 PST --- I'm not certain that there's a bug here (difficult to tell without reducing the test case further). It could be that in the optimised case you have a temporary value stored in a register, increasing the precision. It's legal for the compiler to do that. Adding a function call stops DMD from keeping the value in a register. Anyway you need to cut the test case down significantly. BTW it behaves the same way on DMD0.175, so this is definitely not a regression. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 Steven Schveighoffer <schveiguy yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |schveiguy yahoo.com --- Comment #5 from Steven Schveighoffer <schveiguy yahoo.com> 2010-01-29 05:44:47 PST --- (In reply to comment #4)I'm not certain that there's a bug here (difficult to tell without reducing the test case further). It could be that in the optimised case you have a temporary value stored in a register, increasing the precision. It's legal for the compiler to do that. Adding a function call stops DMD from keeping the value in a register. Anyway you need to cut the test case down significantly. BTW it behaves the same way on DMD0.175, so this is definitely not a regression. Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 --- Comment #6 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-29 05:48:52 PST --- (In reply to comment #4)I'm not certain that there's a bug here (difficult to tell without reducing the test case further). It could be that in the optimised case you have a temporary value stored in a register, increasing the precision. It's legal for the compiler to do that. Adding a function call stops DMD from keeping the value in a register. Anyway you need to cut the test case down significantly. BTW it behaves the same way on DMD0.175, so this is definitely not a regression. Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 --- Comment #7 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-29 05:50:47 PST --- (In reply to comment #5)I think this is exactly the problem. You are using floating point expecting that all floats are created equal. One rule about floating point is never to build an infinite loop waiting for two floats to converge using ==. You must use an epsilon if you expect any reasonable result. Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 --- Comment #8 from Steven Schveighoffer <schveiguy yahoo.com> 2010-01-29 06:06:41 PST --- (In reply to comment #7)Yes I know, but I was relaying of IEEE 754 semantic, in which my code should work (there will be point in execuion when left == half or right == half exactly!). I was just trying to write this routine to be slightly faster just by not using EPSILONS and substractions. Jan 29 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3751 Witold Baryluk <baryluk smp.if.uj.edu.pl> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |INVALID --- Comment #9 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-29 15:50:21 PST --- (In reply to comment #8)(In reply to comment #7)Yes I know, but I was relaying of IEEE 754 semantic, in which my code should work (there will be point in execuion when left == half or right == half exactly!). I was just trying to write this routine to be slightly faster just by not using EPSILONS and substractions. Jan 29 2010
|