
digitalmars.D.learn - segfaults

reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Is there any decent way to figure out where segfaults are coming from?

e.g. 200k lines of bad code converted from java

I tried gdb, and it didn't seem to work too well.

Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
   parent at offset: 0xb
   has children: FALSE
   attributes:
     DW_AT_byte_size (DW_FORM_data1) constant: 8
Dwarf Error: Missing children for type unit [in module /home/ellery/dxl.exe]
Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


And I'm not proficient with gdb.

dmd 1.056 / tango .99999 or whatever

fedora linux
Feb 22 2010
next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Ellery Newcomer wrote:

 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
    parent at offset: 0xb
    has children: FALSE
    attributes:
      DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

I never tried getting a core dump with D, but you should be able to get one by enabling them[1]. And then you can use GDB[2]. But as implied earlier I don't know how well it will work.

1. http://en.linuxreviews.org/HOWTO_enable_core-dumps
2. http://www.unix.com/unix-advanced-expert-users/19128-how-do-core-dump-analysis.html
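A minimal sketch of the core-dump route described above, assuming a POSIX shell on Linux. The binary path is the one shown in the gdb output earlier in the thread. Where the core file ends up depends on the system's core_pattern setting, so the `core` / `core.*` names below are an assumption, and none of this has been checked against dmd 1.056 binaries.

```shell
#!/bin/sh
# Raise the soft core-file size limit for this shell session only.
ulimit -c unlimited 2>/dev/null || echo "could not raise the core limit" >&2
echo "core limit is now: $(ulimit -c)"

prog=/home/ellery/dxl.exe   # path from the thread; substitute your own binary

if [ -x "$prog" ] && command -v gdb >/dev/null 2>&1; then
    "$prog"   # a segfault here should leave a core file behind
    # Where the core file lands depends on /proc/sys/kernel/core_pattern;
    # "core" or "core.<pid>" in the current directory is only an assumption.
    for core in core core.*; do
        if [ -f "$core" ]; then
            gdb -batch -ex bt "$prog" "$core"   # print the backtrace and exit
            break
        fi
    done
fi
```

If gdb cannot parse the debug info in the binary, loading the core file will most likely hit the same Dwarf error as a live session.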
Feb 22 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/22/2010 08:41 PM, Jesse Phillips wrote:
 Ellery Newcomer wrote:

 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
     parent at offset: 0xb
     has children: FALSE
     attributes:
       DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

 I never tried getting a core dump with D, but you should be able to get one by enabling them[1]. And then you can use GDB[2]. But as implied earlier I don't know how well it will work.

 1. http://en.linuxreviews.org/HOWTO_enable_core-dumps
 2. http://www.unix.com/unix-advanced-expert-users/19128-how-do-core-dump-analysis.html

No good. Gives the same message.
Feb 22 2010
prev sibling next sibling parent Bernard Helyer <b.helyer gmail.com> writes:
On 23/02/10 15:14, Ellery Newcomer wrote:
 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
 parent at offset: 0xb
 has children: FALSE
 attributes:
 DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module
 /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

It's a bit hit and miss with DMD, GDB, and debugging, I'm afraid (I have no real experience with D1.X and Tango, but I assume it is the same deal). The debugging information put out by DMD seems to be in error, possibly coupled with a bug in GDB. Sometimes a different debug switch (-g vs -gc) can help, as can (as in my current project) omitting it altogether.

So once you compile your file into an executable, run

    gdb exename

then

    run

then, once you hit a SIGSEGV,

    bt

And see what luck you have with the switches and their combinations.

Good luck!

-Bernard.
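The switch-by-switch experiment above can be scripted. This is only a sketch: `test.d` and `./test` are placeholder names, it handles a single source file rather than a whole project, and `flag_label` and `bt_run` are hypothetical helper names. It builds once with -g, once with -gc, and once with no debug switch, then runs each result under gdb in batch mode.

```shell
#!/bin/sh
# Human-readable label for a switch; the empty string means "no debug info".
flag_label() {
    if [ -n "$1" ]; then printf '%s\n' "$1"; else printf '%s\n' "(no debug switch)"; fi
}

# Run a program under gdb non-interactively: start it, then print a
# backtrace once it stops (for example on SIGSEGV).
bt_run() {
    gdb -batch -ex run -ex bt --args "$@"
}

src=test.d    # placeholder source file
exe=./test    # placeholder output name

if command -v dmd >/dev/null 2>&1 && command -v gdb >/dev/null 2>&1 && [ -f "$src" ]; then
    for flag in -g -gc ""; do
        echo "== dmd $(flag_label "$flag") =="
        # $flag is deliberately unquoted so the empty case adds no argument.
        dmd $flag -of"$exe" "$src" && bt_run "$exe"
    done
fi
```

Comparing the three backtraces side by side shows quickly which switch, if any, gdb can cope with.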
Feb 22 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Ellery Newcomer:
 Is there any decent way to figure out where segfaults are coming from?
 e.g. 200k lines of bad code converted from java

To perform that translation you first have to adapt the original Java code to D as much as possible while keeping it working, then add unit tests to each method of each class of the Java code (if they are missing), and then translate the classes one at a time. For each class you translate the unit tests too, and you add preconditions and invariants. You have to do all that one little step at a time, and test every tiny step you take very well.

Bye,
bearophile
Feb 23 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 22 Feb 2010 21:14:08 -0500, Ellery Newcomer  
<ellery-newcomer utulsa.edu> wrote:

 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
    parent at offset: 0xb
    has children: FALSE
    attributes:
      DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module  
 /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

GDB is probably the best option. Have you tried compiling with -gc? I've had luck with that in the past.

Segfault gives you a signal (SIGSEGV), you could try handling the signal to print information about where the program is.

Other than that, you could try logging. Tango has really good logging facilities. Try putting logging messages in functions that get called at various stages in the program. You can continue to narrow down where the failure is by instrumenting even further. Once you determine the function it's in, then I usually do something like this after every line:

Stdout.formatln("here {}", __LINE__);

Of course, Tango's Stdout is thread unsafe, I can't remember the thread safe version, something like Trace (if you are using threads).

You should get a line number just before the failed call.

-Steve
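Once the "here N" markers are in place, picking out the last one can be automated. A small sketch, assuming each marker is printed on its own line exactly as `here <number>` and is flushed before the crash; `last_marker` is a hypothetical helper and `./dxl.exe` a placeholder path.

```shell
#!/bin/sh
# Print the last "here <line>" marker found on standard input.
last_marker() {
    grep '^here [0-9][0-9]*$' | tail -n 1
}

prog=./dxl.exe   # placeholder; substitute your own binary

if [ -x "$prog" ]; then
    # Merge stderr into stdout so markers are caught whichever stream they use.
    "$prog" 2>&1 | last_marker
fi
```

The number it prints is the last instrumented line that ran, so the failing call should be the next statement after it.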
Feb 23 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/23/2010 06:28 AM, Steven Schveighoffer wrote:
 On Mon, 22 Feb 2010 21:14:08 -0500, Ellery Newcomer
 <ellery-newcomer utulsa.edu> wrote:

 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
 parent at offset: 0xb
 has children: FALSE
 attributes:
 DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module
 /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

 GDB is probably the best option. Have you tried compiling with -gc? I've
 had luck with that in the past.

it gives me

Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0x11bd4 [in module /home/ellery/dxl.exe]

I'm thinking it's an issue with DMD. I can get backtraces with simple programs.
 Segfault gives you a signal (SIGSEGV), you could try handling the signal
 to print information about where the program is.

 Other than that, you could try logging. Tango has really good logging
 facilities. Try putting logging messages in functions that get called at
 various stages in the program. You can continue to narrow down where the
 failure is by instrumenting even further. Once you determine the
 function it's in, then I usually do something like this after every line:

that's what I've been doing. it's tedious.
 Stdout.formatln("here {}", __LINE__);

 Of course, Tango's Stdout is thread unsafe, I can't remember the thread
 safe version, something like Trace (if you are using threads).

 You should get a line number just before the failed call.

 -Steve

Feb 23 2010
parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 24/02/10 03:45, Ellery Newcomer wrote:

 I'm thinking it's an issue with DMD. I can get backtraces with simple
 programs.

If you use a dynamic array in there somewhere, the chances of it not working go up, I'm afraid. This doesn't leave many programs that *work*.
Feb 23 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/23/2010 03:22 PM, Bernard Helyer wrote:
 On 24/02/10 03:45, Ellery Newcomer wrote:

 I'm thinking it's an issue with DMD. I can get backtraces with simple
 programs.

 If you use a dynamic array in there somewhere, the chances of it not
 working go up, I'm afraid. This doesn't leave many programs that *work*.

Hey! You're right!

import tango.io.Stdout;
void main(){
    Object obj = null;
    int[] a;
    a ~= 1;
    Stdout(obj.toString()).newline;
}

gives me

Die: DW_TAG_type_unit (abbrev 7, offset 0x6f)
   parent at offset: 0xb
   has children: FALSE
   attributes:
     DW_AT_byte_size (DW_FORM_data1) constant: 8
     DW_AT_type (DW_FORM_ref4) constant ref: 0x68 (adjusted)
Dwarf Error: Missing children for type unit [in module /home/ellery/test/test]

Do you think this should be a separate issue from 1079?
Feb 23 2010
parent Bernard Helyer <b.helyer gmail.com> writes:
On 24/02/10 12:53, Ellery Newcomer wrote:
 Hey! You're right!

 import tango.io.Stdout;
 void main(){
 Object obj = null;
 int[] a;
 a ~= 1;
 Stdout(obj.toString()).newline;
 }

 gives me

 Die: DW_TAG_type_unit (abbrev 7, offset 0x6f)
 parent at offset: 0xb
 has children: FALSE
 attributes:
 DW_AT_byte_size (DW_FORM_data1) constant: 8
 DW_AT_type (DW_FORM_ref4) constant ref: 0x68 (adjusted)
 Dwarf Error: Missing children for type unit [in module
 /home/ellery/test/test]

 Do you think this should be a separate issue from 1079?

Well, I've noted it down on 1079. I think it's related. Seeing as *that* bug doesn't seem to be getting much attention, I don't know how much good it'd do. You are, of course, welcome to open a new issue.
Feb 23 2010
prev sibling parent reply Robert Clipsham <robert octarineparrot.com> writes:
On 23/02/10 02:14, Ellery Newcomer wrote:
 Is there any decent way to figure out where segfaults are coming from?

 e.g. 200k lines of bad code converted from java

 I tried gdb, and it didn't seem to work too well.

 Die: DW_TAG_type_unit (abbrev 3, offset 0x6d)
 parent at offset: 0xb
 has children: FALSE
 attributes:
 DW_AT_byte_size (DW_FORM_data1) constant: 8
 Dwarf Error: Missing children for type unit [in module
 /home/ellery/dxl.exe]
 Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.i686


 And I'm not proficient with gdb.

 dmd 1.056 / tango .99999 or whatever

 fedora linux

I'm no expert, but that looks like a dmd bug, can you reproduce with ldc? The actual segfault is probably to do with your code, but if gdb gives that then there's a problem with the debug info that dmd is writing. The only easy way to debug this if dmd's giving bad debug info is to use another compiler (I've never had issues using ldc and gdb with the patches from http://dsource.org/projects/gdb-patches/ ).

I'd suggest reporting a dmd bug too, but unless you can work this down to a test case then it'll probably just be ignored... If your code is open source then someone else could work it down, otherwise I can't see much use in reporting a bug that can't be reproduced.

Other options to try are compiling with -g, compiling with -gc and seeing if you manage to get a better result, printf() debugging, using another compiler, or compiling without debug info and getting a backtrace... providing the binary isn't stripped you should still be able to get function names, even if you can't get files/line numbers.
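Whether the binary is stripped can be checked before reaching for gdb. A rough sketch, assuming binutils' `nm` is installed; `has_symbols` is a hypothetical helper and `./dxl.exe` a placeholder path.

```shell
#!/bin/sh
# Succeed if nm can list at least one symbol from the given file.
# A stripped or missing file produces no output on stdout, so this fails.
has_symbols() {
    nm "$1" 2>/dev/null | grep -q .
}

exe=./dxl.exe   # placeholder; substitute your own binary

if [ -f "$exe" ]; then
    if has_symbols "$exe"; then
        echo "$exe still has a symbol table; a backtrace should show function names"
    else
        echo "$exe looks stripped; expect bare addresses in a backtrace"
    fi
fi
```

The names that show up will be in D's mangled form, which is usually still enough to identify the function.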
Feb 23 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/23/2010 10:34 AM, Robert Clipsham wrote:
 I'm no expert, but that looks like a dmd bug, can you reproduce with
 ldc? The actual segfault is probably to do with your code, but if gdb
 gives that then there's a problem with the debug info that dmd is
 writing. The only easy way to debug this if dmd's giving bad debug info
 is to use another compiler (I've never had issues using ldc and gdb with
 the patches from

Oh. good idea.

mua ha ha. ldc dies on compile:

ldc: /home/kamm/eigenes/projekte/ldc/llvm-26/lib/VMCore/Instructions.cpp:921: void llvm::StoreInst::AssertOK(): Assertion `getOperand(0)->getType() == cast<PointerType>(getOperand(1)->getType())->getElementType() && "Ptr must be a pointer to Val type!"' failed.
0 ldc 0x0000000000df97bf
1 ldc 0x0000000000dfb46d
2 libpthread.so.0 0x000000302540f0f0
3 libc.so.6 0x00000030248326c5 gsignal + 53
4 libc.so.6 0x0000003024833ea5 abort + 373
5 libc.so.6 0x000000302482b7b5 __assert_fail + 245
6 ldc 0x0000000000d62c01 llvm::StoreInst::AssertOK() + 145
7 ldc 0x0000000000673206 DtoStore(llvm::Value*, llvm::Value*) + 102
8 ldc 0x000000000064814c DtoNewClass(Loc, TypeClass*, NewExp*) + 444
9 ldc 0x000000000065f930 NewExp::toElem(IRState*) + 832
10 ldc 0x000000000065f4f1 AssignExp::toElem(IRState*) + 193
11 ldc 0x0000000000617e14 DtoDeclarationExp(Dsymbol*) + 596
12 ldc 0x000000000065bf81 DeclarationExp::toElem(IRState*) + 65
13 ldc 0x000000000063fcdc ExpStatement::toIR(IRState*) + 108
14 ldc 0x000000000063fe67 CompoundStatement::toIR(IRState*) + 103
15 ldc 0x000000000063fdc8 ScopeStatement::toIR(IRState*) + 72
16 ldc 0x0000000000642f67 ForStatement::toIR(IRState*) + 951
17 ldc 0x000000000063fe67 CompoundStatement::toIR(IRState*) + 103
 http://dsource.org/projects/gdb-patches/ ). I'd suggest reporting a dmd
 bug too, but unless you can work this down to a test case then it'll
 probably just be ignored... If your code is open source then someone
 else could work it down, otherwise I can't see much use in reporting a
 bug that can't be reproduced.

yeah, it's open source, but I don't think it's morally right to inflict it on others..
 Other options to try are compiling with -g, compiling with -gc and
 seeing if you manage to get a better result, printf() debugging, using
 another compiler, or compiling without debug info and getting a
 backtrace... providing the binary isn't stripped you should still be
 able to get function names, even if you can't get files/line numbers.

ah. compile sans debug info works a bit better. thanks!
Feb 23 2010
parent Robert Clipsham <robert octarineparrot.com> writes:
On 23/02/10 17:33, Ellery Newcomer wrote:
 Oh. good idea.

 mua ha ha. ldc dies on compile:

 ldc:
 /home/kamm/eigenes/projekte/ldc/llvm-26/lib/VMCore/Instructions.cpp:921:
 void llvm::StoreInst::AssertOK(): Assertion `getOperand(0)->getType() ==
 cast<PointerType>(getOperand(1)->getType())->getElementType() && "Ptr
 must be a pointer to Val type!"' failed.
 0 ldc 0x0000000000df97bf
 1 ldc 0x0000000000dfb46d
 2 libpthread.so.0 0x000000302540f0f0
 3 libc.so.6 0x00000030248326c5 gsignal + 53
 4 libc.so.6 0x0000003024833ea5 abort + 373
 5 libc.so.6 0x000000302482b7b5 __assert_fail + 245
 6 ldc 0x0000000000d62c01 llvm::StoreInst::AssertOK() + 145
 7 ldc 0x0000000000673206 DtoStore(llvm::Value*, llvm::Value*) + 102
 8 ldc 0x000000000064814c DtoNewClass(Loc, TypeClass*, NewExp*) + 444
 9 ldc 0x000000000065f930 NewExp::toElem(IRState*) + 832
 10 ldc 0x000000000065f4f1 AssignExp::toElem(IRState*) + 193
 11 ldc 0x0000000000617e14 DtoDeclarationExp(Dsymbol*) + 596
 12 ldc 0x000000000065bf81 DeclarationExp::toElem(IRState*) + 65
 13 ldc 0x000000000063fcdc ExpStatement::toIR(IRState*) + 108
 14 ldc 0x000000000063fe67 CompoundStatement::toIR(IRState*) + 103
 15 ldc 0x000000000063fdc8 ScopeStatement::toIR(IRState*) + 72
 16 ldc 0x0000000000642f67 ForStatement::toIR(IRState*) + 951
 17 ldc 0x000000000063fe67 CompoundStatement::toIR(IRState*) + 103

I don't know how you're compiling, but if you can figure out what file causes this you could file an ldc bug, working down a test case shouldn't be too hard from that :)
 http://dsource.org/projects/gdb-patches/ ). I'd suggest reporting a dmd
 bug too, but unless you can work this down to a test case then it'll
 probably just be ignored... If your code is open source then someone
 else could work it down, otherwise I can't see much use in reporting a
 bug that can't be reproduced.

 yeah, it's open source, but I don't think it's morally right to inflict
 it on others..

I see your point, but you may as well report a bug if it's open source, eventually someone will take a look at it... it's better to report it with a 200k LoC test than let the bug go unnoticed :) Of course if you can work it down then it'll be even easier, but I suspect that's a lot of effort.
 Other options to try are compiling with -g, compiling with -gc and
 seeing if you manage to get a better result, printf() debugging, using
 another compiler, or compiling without debug info and getting a
 backtrace... providing the binary isn't stripped you should still be
 able to get function names, even if you can't get files/line numbers.

 ah. compile sans debug info works a bit better. thanks!

No problem, has it helped to narrow it down to the function/line of code it's happening in? If it has you might be able to narrow down a dmd bug ^
Feb 23 2010