digitalmars.D - It's always something

Walter Bright <newshound2 digitalmars.com> writes:
I just spent about 3 hours tracking down the strangest problem. Win64 exception 
handling would work fine, but when I'd turn on -g, it would crash.

At first I thought "I'm generating a bad object file with -g." But since dmd 
isn't emitting any Win64 symbolic debug info, a check shows the object files 
are the same with and without -g.

So it must be something with linking? I spent a lot of time going over and over 
the relocation fixups (could there be a missing offset?), but could find 
nothing wrong.

Instrumenting the Phobos eh handler code, it seems my handler table was 
pointing off into la-la land when linking with /DEBUG. What the hell?

I thought I'd dump the first byte of where it was pointing to, and look in the 
assembler output and see if that byte was at a predictable offset from where it 
was supposed to be. The actual function started with C3, but the handler table 
pointed to E9. There was no E9 in the object file. What the hell?

Finally, it dawned on me. E9 is a JMP instruction! For some reason, the 
Microsoft linker inserts a bunch of trampolines when linking for debug.

The fix, then, was for the eh handler to look and see if the handler table is 
pointing to an E9, and if so, then adjust the function address to be wherever 
the JMP goes.
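
For illustration, the adjustment described above can be sketched in C roughly
as follows. This is not the actual Phobos handler code. The function name
resolve_trampoline is hypothetical, and the sketch assumes the trampoline is a
single 5-byte near JMP: opcode E9 followed by a signed 32-bit little-endian
displacement measured from the end of the instruction.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from Phobos.
   If addr points at a near JMP rel32 (opcode E9), return the address the
   jump lands on; otherwise return addr unchanged. */
const unsigned char *resolve_trampoline(const unsigned char *addr)
{
    if (addr[0] == 0xE9) {
        /* Assemble the little-endian 32-bit displacement byte by byte,
           so the decoding does not depend on host byte order. */
        uint32_t raw = (uint32_t)addr[1]
                     | ((uint32_t)addr[2] << 8)
                     | ((uint32_t)addr[3] << 16)
                     | ((uint32_t)addr[4] << 24);
        int32_t rel = (int32_t)raw;

        /* The displacement is relative to the next instruction,
           which starts 5 bytes after the E9 opcode. */
        return addr + 5 + rel;
    }
    return addr; /* no trampoline: the table already points at the function */
}
```

A check on the first byte alone cannot tell a trampoline from a function that
really does begin with a JMP, so this only works as long as that case does not
come up in the handler table.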

And so it goes...
Sep 22 2012
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/22/12, Walter Bright <newshound2 digitalmars.com> wrote:
 Win64 exception

Maybe unrelated, but I've read this recently (WindowProc):
http://msdn.microsoft.com/en-us/library/windows/desktop/ms633573%28v=vs.85%29.aspx

It says: "However, if your application runs on a 64-bit version of Windows operating system or WOW64, you should be aware that a 64-bit operating system handles uncaught exceptions differently based on its 64-bit processor architecture, exception architecture, and calling convention. The following table.."

Apparently there's a hotfix for some x64 exception issues:
http://support.microsoft.com/kb/976038
http://stackoverflow.com/questions/2631452/64bit-exceptions-in-wndproc-silently-fail

We've been discussing recently in D.learn whether WndProc needs to be nothrow. The exceptions are propagated back to WinMain on 32-bit Windows, but maybe on x64 WndProc needs to be nothrow to avoid these x64 issues.
Sep 22 2012
Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-09-22 09:19:33 +0000, Walter Bright <newshound2 digitalmars.com> said:

 I just spent about 3 hours tracking down the strangest problem. Win64 
 exception handling would work fine, but when I'd turn on -g, it would 
 crash.
 
 At first I thought "I'm generating a bad object file with -g." But 
 since dmd isn't emitting any Win64 symbolic debug info, a check shows 
 the object files are the same with and without -g.
 
 So it must be something with linking? I spent a lot of time going over 
 and over the relocation fixups (could there be a missing offset?), but 
 could find nothing wrong.
 
 Instrumenting the Phobos eh handler code, it seems my handler table was 
 pointing off into la-la land when linking with /DEBUG. What the hell?
 
 I thought I'd dump the first byte of where it was pointing to, and look 
 in the assembler output and see if that byte was at a predictable 
 offset from where it was supposed to be. The actual function started 
 with C3, but the handler table pointed to E9. There was no E9 in the 
 object file. What the hell?
 
 Finally, it dawned on me. E9 is a JMP instruction! For some reason, the 
 Microsoft linker inserts a bunch of trampolines when linking for debug.
 
 The fix, then, was for the eh handler to look and see if the handler 
 table is pointing to an E9, and if so, then adjust the function address 
 to be where ever the JMP goes.
 
 And so it goes...

But there should be a reason why there's a jump there. Have you found it? If you're just bypassing the jump, you might be breaking something else. For instance, this jump table might have been a means to allow the debugger to break on exceptions more easily. Or it might be something else, I don't know, but it's likely there's a reason.

You should keep a record of those anomalies somewhere; it might prove useful as a starting point for investigating future problems that might arise.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Sep 22 2012
"Sandeep Datta" <datta.sandeep gmail.com> writes:
 You should keep a record of those anomalies somewhere, it might 
 prove useful as a starting point for investigating future 
 problems that might arise.

You are right. I think it is a good thing Walter took the time out to write about this. In the absence of better documentation this post might come in handy (we can always use Google).
Sep 22 2012
Walter Bright <newshound2 digitalmars.com> writes:
On 9/22/2012 6:37 AM, Michel Fortin wrote:
 But there should be a reason why there's a jump there. Have you found it? If
 you're just bypassing the jump you might be breaking something else. For
 instance, this jump table might have been a means to allow the debugger to more
 easily break on exceptions. Or it might be something else, I don't know, but
 it's likely there's a reason.

Such trampolines are most often used so that a function can be easily "hot swapped" with another function. This may be a debugging feature of VS. It took me so long to figure this one out because I had no idea that the MS linker would do this.

 You should keep a record of those anomalies somewhere, it might prove useful as
 a starting point for investigating future problems that might arise.

I'll probably write a blog post about it eventually.
Sep 22 2012