digitalmars.D - Segfault with DMD optimization switch

"Stephan" <stephan80 mac.com> writes:
Hi,

I just spent the last 5 hours trying to find the cause of a 
segmentation fault. It was clearly my mistake that I did all 
those tests with dmd's optimization switch on ("-O").
When I disabled optimization, i.e. removed the "-O" flag, the 
code runs perfectly.

I can't exclude an actual bug in my code with 100 percent 
certainty. But given various segmentation faults that I have seen 
with dmd previous to version 2.061, and given the fact that I am 
not using any pointers or other memory-unsafe constructs, I am 
inclined to believe that this segfault is actually caused by dmd 
itself, or at least dmd's optimization. Please correct me if you 
think that is extremely unlikely.

Has anyone experienced this kind of problem, which occurs only 
with the "-O" flag but not without it? Or can anyone point me to 
a related bug that has already been reported on Bugzilla?

Unfortunately, in my case the code crashed in a position late in 
some iteration loop, so I can't easily reproduce it in a simple 
program to file a bug report.

Stephan
Jan 22 2013
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, January 22, 2013 19:21:33 Stephan wrote:
 Unfortunately, in my case the code crashed in a position late in
 some iteration loop, so I can't easily reproduce it in a simple
 program to file a bug report.

DustMite might be able to help you reduce your code to a simpler program which also has the failure:

https://github.com/CyberShadow/DustMite

- Jonathan M Davis
Jan 22 2013
"bearophile" <bearophileHUGS lycos.com> writes:
Stephan:

 Has anyone experienced those kind of problems which only occur 
 with the "-O" flag, but not without it?

Compiler crashes caused by -O happen. But I think in your case looking at Bugzilla won't be a big help, because there are too many possible causes.

 Unfortunately, in my case the code crashed in a position late 
 in some iteration loop, so I can't easily reproduce it in a 
 simple program to file a bug report.

If you want the bug to be fixed, you will probably have to localize and submit it.

Bye,
bearophile
Jan 22 2013
"Stephan" <stephan80 mac.com> writes:
Thanks for that hint. DustMite looks promising.
I will try to file a report.

Stephan

On Tuesday, 22 January 2013 at 18:28:08 UTC, Jonathan M Davis wrote:
 On Tuesday, January 22, 2013 19:21:33 Stephan wrote:
  Unfortunately, in my case the code crashed in a position late in
  some iteration loop, so I can't easily reproduce it in a simple
  program to file a bug report.

 DustMite might be able to help you reduce your code to a simpler
 program which also has the failure:
 https://github.com/CyberShadow/DustMite

 - Jonathan M Davis

Jan 22 2013
"ixid" <nuaccount gmail.com> writes:
On Tuesday, 22 January 2013 at 18:28:08 UTC, Jonathan M Davis wrote:
 On Tuesday, January 22, 2013 19:21:33 Stephan wrote:
  Unfortunately, in my case the code crashed in a position late in
  some iteration loop, so I can't easily reproduce it in a simple
  program to file a bug report.

 DustMite might be able to help you reduce your code to a simpler
 program which also has the failure:
 https://github.com/CyberShadow/DustMite

 - Jonathan M Davis

Using DMD 2.061 on Windows XP with VisualD, with optimization turned on in debug mode, this throws an exception:

    import std.datetime;

    void main()
    {
        StopWatch sw;
        sw.start;
    }

What seems to cause it is setting a breakpoint on the last line. Without the breakpoint, it exits successfully with code 0.

This just happens to be the minimal case that the first program I could think of reduces to while still showing the issue. I have no idea whether it has anything to do with std.datetime, or whether I am demonstrating the same issue; apologies if this is irrelevant.
Jan 22 2013
"Rob T" <alanb ucora.com> writes:
I encountered a segfault once after compiling with -O -release. 
It was the -release that caused an assert to be removed from a 
function that did not return due to an error condition. When the 
error was encountered there was no longer an assert to catch it, 
resulting in a segfault.

-rt
Jan 22 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 23, 2013 06:39:02 Rob T wrote:
 I encountered a segfault once after compiling with -O -release.
 It was the -release that caused an assert to be removed from a
 function that did not return due to an error condition. When the
 error was encountered there was no longer an assert to catch it,
 resulting in a segfault.

assert(0) is put at the end of functions in case the end of the function is reached without returning, and unlike normal assertions, assert(0) is left in in release mode, except it becomes a halt instruction, which is pretty much the same thing as a segfault. So, it sounds like you ran into a situation which was normal and expected given the bug that resulted in the end of the function being reached without returning.

- Jonathan M Davis
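
A minimal sketch of the distinction described above, assuming a hypothetical function `sign` that is not code from this thread. It only illustrates the two kinds of assertion: an ordinary `assert`, which -release removes, and `assert(0)`, which stays and is compiled to a halt instruction.

```d
// Hypothetical example; `sign` is an illustrative name, not code from this thread.
int sign(int x)
{
    // Ordinary assertion: checked in normal builds, removed by -release.
    assert(x != 0, "sign() expects a nonzero argument");

    if (x > 0)
        return 1;
    if (x < 0)
        return -1;

    // Reached only if x == 0. Unlike the assert above, assert(0) is kept
    // in -release builds, where it becomes a halt instruction.
    assert(0, "unreachable: x was zero");
}

void main()
{
    assert(sign(5) == 1);
    assert(sign(-3) == -1);
}
```

Calling `sign(0)` in a build without -release should stop at the first assertion with its message; with -release only the final `assert(0)` remains to stop the program.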
Jan 22 2013
"Rob T" <alanb ucora.com> writes:
On Wednesday, 23 January 2013 at 05:55:34 UTC, Jonathan M Davis 
wrote:
 assert(0) is put at the end of functions in case the end of the function
 is reached without returning, and unlike normal assertions, assert(0) is
 left in in release mode, except it becomes a halt instruction, which is
 pretty much the same thing as a segfault. So, it sounds like you ran into
 a situation which was normal and expected given the bug that resulted in
 the end of the function being reached without returning.

 - Jonathan M Davis

I just tried reproducing the assert(0) segfault with the released 2.061, but it no longer segfaults; instead it halts and displays the text message in the assert statement, as expected. I guess it's no longer an issue, or it was a side effect of something else I was doing that was later resolved.

--rt
Jan 22 2013
Jacob Carlborg <doob me.com> writes:
On 2013-01-22 19:21, Stephan wrote:
 Hi,

 I just spent the last 5 hours trying to find the cause of a segmentation
 fault. It was clearly my mistake that I did all those tests with dmd's
 optimization switch on ("-O").
 When I disabled optimization, i.e. removed the "-O" flag, the code runs
 perfectly.

 I can't exclude an actual bug in my code with 100 percent certainty. But
 given various segmentation faults that I have seen with dmd previous to
 version 2.061, and given the fact that I am not using any pointers or
 other memory-unsafe constructs, I am inclined to believe that this
 segfault is actually caused by dmd itself, or at least dmd's
 optimization. Please correct me if you think that is extremely unlikely.

I would say that if the code behaves differently with the -O flag than without it, that's always a bug. The -O flag should never change the semantic meaning of the code. It should only make it faster.

-- 
/Jacob Carlborg
Jan 22 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2013 11:43 PM, Jacob Carlborg wrote:
 I would say that if using the -O flag the code behaves differently from without
 using the flag it's always a bug.

 The -O flag should never change the semantic meaning of the code. It should
 only make it faster.

If your program has undefined behavior in it, the -O flag can definitely cause that behavior to change.
Jan 22 2013
Jacob Carlborg <doob me.com> writes:
On 2013-01-23 08:50, Walter Bright wrote:

 If your program has undefined behavior in it, the -O flag can definitely
 cause that behavior to change.

Yes, I guess that is to be expected. I'm mostly talking about correct code here.

-- 
/Jacob Carlborg
Jan 23 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2013 8:06 AM, Jacob Carlborg wrote:
 On 2013-01-23 08:50, Walter Bright wrote:
  If your program has undefined behavior in it, the -O flag can definitely
  cause that behavior to change.

 Yes, I guess that is to be expected. I'm mostly talking about correct code here.

-O can especially expose code bugs like referencing an out-of-scope stack frame.
Jan 23 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2013 2:15 PM, Stephan wrote:
 What would be the prototypic short program that simulates referencing an out of
 scope stack frame? It would be great to see an example that produces a
 deterministic segfault.

Just return a pointer to a local (first passing it through another function in order to hide what you're doing from the compiler). btw, bug 8832 is an example of one.
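
A minimal sketch of the recipe above, assuming hypothetical helper names (`launder`, `dangling`, `clobber`) that are not from the thread or from bug 8832. Reading through the returned pointer is undefined behavior, so no particular output or crash is guaranteed; the point is only that the result may differ between builds with and without -O.

```d
import std.stdio;

// Hypothetical names throughout; none of this is code from the thread.

// Returns its argument unchanged. Routing the address through a call is
// meant to hide the escape from the compiler, which rejects a direct
// `return &local;`.
int* launder(int* p)
{
    return p;
}

// Returns the address of a local whose stack frame ends on return.
int* dangling()
{
    int local = 42;
    return launder(&local);
}

// A call that may reuse the stack memory `local` used to occupy.
int clobber()
{
    int[16] scratch = 7;
    int sum = 0;
    foreach (v; scratch)
        sum += v;
    return sum;
}

void main()
{
    int* p = dangling();
    clobber();
    // Undefined behavior: *p refers to a dead stack frame. The value
    // printed (or a crash) may change with the -O flag.
    writeln(*p);
}
```

This sketch is not expected to produce a deterministic segfault; it only shows the kind of bug whose symptoms can appear or disappear when optimization is toggled.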
Jan 23 2013
"Stephan" <stephan80 mac.com> writes:
On Wednesday, 23 January 2013 at 20:03:03 UTC, Walter Bright 
wrote:
 On 1/23/2013 8:06 AM, Jacob Carlborg wrote:
  On 2013-01-23 08:50, Walter Bright wrote:
   If your program has undefined behavior in it, the -O flag can
   definitely cause that behavior to change.

  Yes, I guess that is to be expected. I'm mostly talking about correct
  code here.

 -O can especially expose code bugs like referencing an out-of-scope stack
 frame.

This is a very good point, and I do think that there might be a mistake like that in my code, but it's definitely not obvious to me where. I did use delegates at some point, but in order to narrow down the causes I changed that delegate to a functor, and the segfault still occurred.

What would be the prototypical short program that simulates referencing an out-of-scope stack frame? It would be great to see an example that produces a deterministic segfault.

Thanks,
Stephan
Jan 23 2013
"Stephan" <stephan_schiffels mac.com> writes:
On Thursday, 24 January 2013 at 00:01:26 UTC, Walter Bright wrote:
 On 1/23/2013 2:15 PM, Stephan wrote:
  What would be the prototypic short program that simulates referencing
  an out of scope stack frame? It would be great to see an example that
  produces a deterministic segfault.

 Just return a pointer to a local (first passing it through another function
 in order to hide what you're doing from the compiler). btw, bug 8832 is an
 example of one.

OK, I spent quite a while narrowing it down, and I can definitely say that there is a compiler bug related to the optimization switch. I filed an issue (9387).

I could no longer reproduce a segfault, but what I found is even worse: the optimization switch changes the behavior of the program! I marked it as "major", since it is not clear whether this bug is even detectable when it doesn't cause a segfault. It silently changes the behavior, which seems very dangerous to me.

The program I attached to the bug report is an implementation of a minimization routine from Numerical Recipes, 3rd edition. The bug is possibly caused by having a large number of local variables, which are somehow written out to memory before being moved into registers, or something like that. I looked at the assembly with a colleague, but we couldn't find anything conclusive.

I hope that you guys can reproduce the bug on dmd 2.061: simply run "rdmd -O brent_test.d" and compare the result with "rdmd brent_test.d". There is nothing exotic in the source code, only assignments. No pointers, no references, no structs. It must be a compiler bug.

Stephan
Jan 24 2013
"Stephan" <stephan_schiffels mac.com> writes:
On Thursday, 24 January 2013 at 00:01:26 UTC, Walter Bright wrote:
 On 1/23/2013 2:15 PM, Stephan wrote:
  What would be the prototypic short program that simulates referencing
  an out of scope stack frame? It would be great to see an example that
  produces a deterministic segfault.

 Just return a pointer to a local (first passing it through another function
 in order to hide what you're doing from the compiler). btw, bug 8832 is an
 example of one.

OK. This bug was introduced in 2.061! With 2.060, the optimization switch does not corrupt the program. I put this in the bug report as well.

Stephan
Jan 24 2013