digitalmars.D.bugs - [Issue 9387] New: Compiler switch -O changes behavior of correct code

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387

           Summary: Compiler switch -O changes behavior of correct code
           Product: D
           Version: D2
          Platform: x86_64
        OS/Version: Mac OS X
            Status: NEW
          Severity: major
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: stephan.schiffels mac.com



---
Created an attachment (id=1182)
Source file with program that

The attached program implements a part of Brent's minimization algorithm for
one-dimensional functions. The code is from Numerical Recipes 3rd edition.

I use dmd 2.061.

When I run the program with "rdmd brent_test.d" it runs fine and gives the
correct result.

When I run it with optimization, i.e. with "rdmd -O brent_test.d", it behaves
differently. It enters some infinite loop and eventually throws the expected
exception for too many iterations.

You can see that I placed a writefln() into line 45, which outputs the value of
variable a. When you move this writefln statement just one line below, i.e.
below the if-statement, the code runs fine, even with optimization.

A colleague of mine suggested that there might be a bug related to a large
number of local variables. Maybe running out of registers causes the compiler to
spill values to memory and reload them incorrectly, or something like that.

Appreciate help!

Stephan

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 24 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




---
During debugging, I actually looked at the value of every single local
variable, and you can actually see how the value of some variables (for example
"a") changes from one iteration to the next, without any assignment.

Jan 24 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




---
I just checked: The bug definitely was introduced with version 2.061!
With dmd version 2.060, everything works fine, with and without the "-O"
switch.

Jan 24 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla digitalmars.com
           Severity|major                       |regression


Jan 24 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




00:30:21 PST ---
I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
can try.

Jan 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




---

> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

What actually seems to be corrupted are the precompiled executables on the
zip-file on the web. We checked this for the osx and the linux version. Both of
these precompiled versions produce this bug. When we compile dmd from source,
even for version 2.061 from the web, this bug does not occur.

Stephan

Jan 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




---

> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

Sorry to jump back and forth here. I have to again correct my previous
statement: With the latest version of dmd/druntime/phobos (2.062 from git), this
bug does occur! But only when you compile and run separately. When you use
dmd -run, both versions with and without -O work fine. This is quite weird.

So:

dmd -O brent_test.d
./brent_test

should produce a different outcome than

dmd brent_test.d
./brent_test

I will try to use bisect to find out when this bug was introduced.

Stephan

Jan 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




10:37:02 PST ---
When I compile and run separately, it works fine.

You should also clarify whether you are using -m64 or not.

Jan 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




---
Right, I use the 64-bit model.
And I tested this on OS X and on Linux, with the same outcome on both platforms.
It's frustrating that you can't reproduce it. Thanks for responding quickly on
this anyway.
I will see what I can find out with bisect.

Stephan

Jan 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




BTW you might be interested in std.numeric.findRoot, which is the
root-finding-by-bracketing algorithm (in contrast to "Brent's algorithm" which
is minima-finding-by-bracketing). In terms of number of calls, I believe it
beats all published algorithms (in some cases, by an order of magnitude). I
should really publish it. I did some work on the minima problem as well, and
put it into Tango, but it isn't in Phobos.
The code is very old now, dating from a time when there were many compiler
limitations, and it could use a review.
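
For readers who want to try std.numeric.findRoot as suggested here, a minimal
sketch follows. The test function residual() and the bracket [0, 1] are
assumptions chosen only for illustration and are not taken from the attached
program. findRoot expects the function values at the two bracket ends to have
opposite signs.

```d
import std.math : cos, fabs;
import std.numeric : findRoot;
import std.stdio : writefln;

/// Hypothetical test function: cos(x) - x changes sign on [0, 1].
double residual(double x)
{
    return cos(x) - x;
}

/// Brackets the root of residual() between 0 and 1 and lets findRoot refine it.
double solveExample()
{
    // residual(0) > 0 and residual(1) < 0, so the bracket is valid.
    return findRoot(&residual, 0.0, 1.0);
}

void main()
{
    immutable root = solveExample();
    // Print the root and how close residual(root) is to zero.
    writefln("root = %.12f, residual = %g", root, fabs(residual(root)));
}
```

Note that this finds a root, not a minimum, so it is not a drop-in replacement
for the Brent minimizer in the bug report.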

Jan 28 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




...and I can reproduce your bug.

Jan 28 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




I think there is an uninitialized variable in there. When I compile with -O, if
I run the same executable multiple times, sometimes it passes, sometimes it
fails.
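
One way to check this kind of intermittent failure is to run the same
executable repeatedly and count the non-zero exit codes. The sketch below
assumes a POSIX system and a binary named ./brent_test in the current
directory; the helper countFailures() and the run count are made up for
illustration. It also assumes the program terminates on failure (for example by
throwing), since a run that hangs would block the loop.

```d
import std.process : execute;
import std.stdio : writefln;

/// Runs `cmd` `runs` times and returns how many runs exited with a
/// non-zero status (an uncaught exception or a signal both count).
int countFailures(string[] cmd, int runs)
{
    int failures = 0;
    foreach (i; 0 .. runs)
    {
        // execute() waits for the child and captures its status and output.
        auto result = execute(cmd);
        if (result.status != 0)
            ++failures;
    }
    return failures;
}

void main()
{
    enum runs = 20; // arbitrary sample size
    immutable failures = countFailures(["./brent_test"], runs);
    writefln("%s of %s runs failed", failures, runs);
}
```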

Jan 28 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




PST ---
Hi Don,

glad to hear that you can reproduce the bug! I tested initializing all
variables by hand, and the bug still occurs.
Thanks for the suggestion to use std.numeric. Looks very useful! The Numerical
Recipes Code style is worse than horrible! All those 1-letter variables...

Jan 28 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387


Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code



Here is a more reduced test case (still enormous):
Without -O, it returns on the first pass through the loop. With -O, one of two
things happens:
(a) it hits the assert(0) on the first pass through the loop; or
(b) it generates an alignment hardware exception.

It looks as though it is an issue with misalignment of SSE registers.
Removing the assert(0) causes an ICE.

---
import std.math : abs;

void minimize()
{
    double a,b,d=0.0,etemp,fu,fv,fw,fx;
    double p;
    double q,r,tol1,tol2,u,v,w,x,xm;
    double e=0.0;
    double ax,bx,cx,fa,fb,fc;
    double tol;
  ax = 2.8541;
  bx = 3;
  cx = 3.0458;
  fa = 0.145898;
  fb = 0;
  fc = 0.381966;
 tol = 3.0e-8;

    a= ax;
    b= cx;
    v = bx;
    w = bx;
    x = bx;
    fx = 0;
    fv = fx;
    fw = fx;

  a = 2.97871347812973974456; b = 3.0458; v =2.9442711606; w =2.9787134781;
  x = 3;  fx= 0; fv = 0.00310570354087098691;
  fw = 0.00045311601333306815;
  e =-0.0557288394;
  d = -0.0212865219;
    for (int iter=0;iter<1;iter++) {
      xm=0.5*(a+b);
      tol1=tol*abs(x);
      tol2=2.0*(tol1);
      if (abs(x-xm) <= (tol2-0.5*(b-a))) {
        return;
      }
      if (abs(e) > tol1) {
       r=(x-w)*(fx-fv);
        q=(x-v)*(fx-fw);
        p=(x-v)*q-(x-w)*r;
        q=2.0*(q-r);
        if (q > 0.0) p = -p;
        q=abs(q);
        etemp=e;
        e=d;
        if (abs(p) >= abs(0.5*q*etemp) || q < p) {
          d= b-x;
        }
        else {
          d=p/q;
          u=x+d;
          if (u-a < tol2 || b-u < tol2)
            d = xm - x;
        }
      }
      else { d= (e=(x >= xm ? a-x : b-x)); }
      u= (abs(d) >= tol1) ? x+d : x+3.0e-8;

     if (u < 3.01) return;
     else
        assert(0);  // FAILS HERE

      fu = (u-3.0)*(u-3.0);

      if (fu <= fx) {
        assert(0);
      }
    }
}

void main() {
  minimize();
}

Jan 29 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387


Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice
         OS/Version|Mac OS X                    |All



A reduced test case for the ICE:

import std.math : abs;

void bug9387()
{
    double x = 3;
    double r = (x-2.1)*0.1;
    double q = (x-2.1)*0.1 - r;
    double p = (x-2.1)*q - (x-2.1)*r;

    if (q > 0.0) p = -p;
    if (abs(p) >= q ) { }
}
---
dmd -O -m64 bug.d
Internal error: backend/cgcod.c 769

Jan 29 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




ICE, further reduced:
--------------
void bug9387a(double x) { }

void ice9387()
{
    double x = 0.3;
    double r = x*0.1;
    double q = x*0.1 + r;
    double p = x*0.1 + r*0.2;
    if ( q )
        p = -p;
    bug9387a(p);
}

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




And a reduction for the wrong-code case. This sometimes segfaults but usually
hangs. Looks like the saved RBX register gets trampled:

double brent(double x) { return x; }

void wrong9387()
{
    for (int iter=0; iter<1; iter++) {
        double v =2.94;
        if (brent(v)<= 2.9) { return; }
        double w = 2.97;
        double r = (0.2-w) * 0.1;
        double q = (0.2-v) * 0.1 - r;
        double p = 0.7*q - (0.2-v)*0.3;
        if (q > 0.0) p = -p;
        q = brent(q);
        double d = p-q;
        if (2.94 + d)
            w = v -v;
        brent(w);
    }
}

void main()
{
  wrong9387();
}

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




Commit pushed to dmd-1.x at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/bfa5d0f0ba80c7ff6e0d67806714763584666fb2
fix Issue 9387 - Compiler switch -O changes behavior of correct code

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




14:42:56 PST ---
https://github.com/D-Programming-Language/dmd/pull/1584

Thanks, Don, for the minimizations which made it easy for me to find the
problem. It was not a regression, although it looked like one. The bug is nasty
and I'm glad to get it fixed.

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




PST ---
Don and Walter, thanks for reducing the code and fixing the bug, all on a very
short timescale! This is going to be a very important fix for me. Using the
optimization switch is critical for me.

Stephan

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/06d991f039eab23561398aea4ea764ea49a6dea4
fix Issue 9387 - Compiler switch -O changes behavior of correct code

https://github.com/D-Programming-Language/dmd/commit/9f3ab3f0b4713bd12a3ada71ca783bab1edae663
fix Issue 9387 - Compiler switch -O changes behavior of correct code

Jan 30 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


Jan 30 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9387




> Don and Walter, thanks for reducing the code and fixing the bug, all on a very
> short timescale!

Thanks. Optimizer bugs get top priority, and this was one of the worst bugs of
all time. I found test cases where the executable was wrong, yet still produced
correct results in 90% of runs. I don't think I've ever seen a bug that was so
difficult to reduce.

Jan 31 2013