digitalmars.D.bugs - [Issue 6660] New: Problem with SSE registers in array ops
- d-bugmail puremagic.com (42/42) Sep 13 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (19/19) Sep 26 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (12/25) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (17/17) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (19/19) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (40/40) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (13/13) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (14/14) Sep 27 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (13/13) Dec 22 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6660
- d-bugmail puremagic.com (13/13) Mar 28 2012 http://d.puremagic.com/issues/show_bug.cgi?id=6660
http://d.puremagic.com/issues/show_bug.cgi?id=6660

           Summary: Problem with SSE registers in array ops
           Product: D
           Version: D1 & D2
          Platform: Other
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: clugdbug yahoo.com.au

--- Comment #0 from Don <clugdbug yahoo.com.au> 2011-09-13 01:12:12 PDT ---
This program, arrayop.d,

    void main()
    {
        double[4] a;
        double[4] b;
        a[] = b[] + b[];
    }

compiled and run repeatedly in a batch file

    dmd arrayop
    arrayop
    dmd arrayop
    arrayop
    dmd arrayop
    arrayop
    ...

(I put it in about 20 times) eventually generates this error on a Sandy Bridge processor, Windows 7.

    C:\sandbox\bugs>dmd arrayop
    DMD v2.055 DEBUG
    OPTLINK (R) for Win32 Release 8.00.12
    Copyright (C) Digital Mars 1989-2010 All rights reserved.
    http://www.digitalmars.com/ctg/optlink.html
    OPTLINK : Error 3: Cannot Create File arrayop.exe
    --- errorlevel 1

It also happens with the release version of DMD 2.055.
I think it is an SSE issue, since it only happens with arrays of floats and doubles (not reals). But I'm just guessing. Maybe it is corrupting the stack.

--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 13 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

Brad Roberts <braddr puremagic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |braddr puremagic.com

--- Comment #1 from Brad Roberts <braddr puremagic.com> 2011-09-26 23:50:42 PDT ---
Another data point...

In the auto tester where it's building each test with the sequence of different parameter combinations, it used to fail every once in a while due to the same error below. Changing it to write to a different executable every time (I just added a counter so it's testfoo_0.exe, testfoo_1.exe, etc..) completely fixed that problem.

I have no recollection which tests were failing.. I thought it was pretty random, but it might not have been. My assumption is/was that windows isn't releasing the exclusive write lock on the executable file synchronously with the exiting of the application.

Have you tried the same loop with an empty main?
Sep 26 2011
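The counter-based naming described in comment #1 could look roughly like the sketch below. This is only an illustration, not the auto tester's actual code: `uniqueExeName` is a hypothetical helper, and the base name and the loop count of 3 are assumptions. It prints the commands rather than running them; `-of<file>` is dmd's switch for naming the output file.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper (not taken from the auto tester): builds a distinct
// executable name per run, e.g. testfoo_0.exe, testfoo_1.exe, ...
string uniqueExeName(string base, size_t counter)
{
    return format("%s_%d.exe", base, counter);
}

void main()
{
    // Print the build-and-run commands instead of executing them, so the
    // sketch has no side effects. Each iteration targets a fresh file name,
    // so the linker never has to overwrite an executable that just exited.
    foreach (i; 0 .. 3)
    {
        immutable exe = uniqueExeName("arrayop", i);
        writeln("dmd arrayop.d -of", exe);
        writeln(exe);
    }
}
```

Because every build writes to a new file, a lingering lock on the previous executable (whatever holds it) no longer blocks the link step; the old executables can be cleaned up afterwards.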
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #2 from Don <clugdbug yahoo.com.au> 2011-09-27 01:00:39 PDT ---
(In reply to comment #1)
> Another data point...
>
> In the auto tester where it's building each test with the sequence of different parameter combinations, it used to fail every once in a while due to the same error below. Changing it to write to a different executable every time (I just added a counter so it's testfoo_0.exe, testfoo_1.exe, etc..) completely fixed that problem.
>
> I have no recollection which tests were failing.. I thought it was pretty random, but it might not have been. My assumption is/was that windows isn't releasing the exclusive write lock on the executable file synchronously with the exiting of the application.
>
> Have you tried the same loop with an empty main?

Yes, I have, and it never fails. It also never fails when 'double' is replaced by 'real'. This makes it very hard for me to blame Windows for this.

I found three tests from the test suite which failed: test15, arrayop, and hospital. I reduced arrayop down to that minimum size. Might be worth trying to reduce the others as well.

It's also possible that it could be an issue with core.cpuid.
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Problem with SSE registers  |Problem with core.cpuid on
                   |in array ops                |Windows7

--- Comment #3 from Don <clugdbug yahoo.com.au> 2011-09-27 01:07:33 PDT ---
Yup, it's core.cpuid. This one fails (intermittently):
-----
    import core.cpuid;

    void main()
    {
        bool b = sse();
    }
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #4 from Don <clugdbug yahoo.com.au> 2011-09-27 01:42:17 PDT ---
Reduced test case is very, very strange:

    void main()
    {
        __gshared uint a;
        asm {
            mov EAX, 2;
            cpuid;
            mov a, EAX;
        }
        uint numinfos = a & 0xFF;
        do {
        } while (--numinfos);
    }

It only happens with CPUID function 2 (EAX = 2).
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #5 from Don <clugdbug yahoo.com.au> 2011-09-27 03:57:59 PDT ---
This is really incredible. I've removed all of the D code, and I can still reproduce the behaviour. If you uncomment the jz line, it won't happen. The 'int 3' line is just a breakpoint, to prove that the branch is never taken.

    void main()
    {
        int ctr; // also works with __gshared int ctr;
        asm {
            mov EAX, 2;
            cpuid;
            and EAX, 0xFF;
            mov ctr, EAX;
            // jz was_zero;
        Lxx:
            dec int ptr ctr;
            jnz Lxx;
            jmp done;
        was_zero:
            int 3;
        done:
            ;
        }
    }

Wild speculation: there's a bug in CPUID 2: it's not clearing the loopback buffer. The loop is executed as if 'ctr' were still zero. This means that it loops 2^^32 times. This is long enough that Windows does a task switch.

In Core 2, the loopback buffer was between the predecoders and the decoders, but on Core i7, they moved it after the decoders.

I tried to confirm this by extending the size of the loop, by padding with nops. When the loop is 63 bytes of code (56 nops), it fails. Once I add a 57th nop, it stops failing. These aren't the numbers I expected -- the loopback buffer is 256 bytes on the Core i7. However, I have a Core i3; perhaps it's different, or it may be a decoding bug.

Regardless, this looks very much like a CPU erratum. My guess is that it is affecting the loop predictor, which isn't the same thing as branch prediction.
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #6 from Don <clugdbug yahoo.com.au> 2011-09-27 12:20:31 PDT ---
My theory is not correct. I figured that I could check if the number of iterations was wrong by using rdtsc to measure how long the loop takes. But it shows nothing unusual. I'm no longer convinced that this is a loopback issue.

I also found that if I include a writefln after the relevant code, the critical length of the loop drops from 64 (0x40) to 40 (0x28). It doesn't seem to be affected by code alignment, so it's not a cache line issue.

This whole thing is very, very strange.
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #7 from Don <clugdbug yahoo.com.au> 2011-09-27 18:12:05 PDT ---
The reduced test case from test15.d looks _completely_ different:

    void main()
    {
        char[] a = new char[0];
        uint c = 20000;
        while (c--)
            a ~= 'x';
    }

This looks as though the gc is still running after the app has exited.
Sep 27 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

--- Comment #8 from Don <clugdbug yahoo.com.au> 2011-12-22 02:41:41 PST ---
This is interesting.

http://msdn.microsoft.com/en-us/library/windows/hardware/ff538528%28v=vs.85%29.aspx

"A CPUID intercept message is delivered by the hypervisor when a virtual processor executes a CPUID instruction and the parent partition previously called the HvInstallIntercept hypercall function to install an intercept on such instructions."

Wow. There is a hypervisor running on my laptop. And it's buggy. Could it be a rootkit?
Dec 22 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Don <clugdbug yahoo.com.au> 2012-03-28 00:06:39 PDT ---
Turns out to be caused by Windows Defender. Disabling it in the development directory solves the problem. Looks like a bug in Windows Defender to me.
Mar 28 2012