digitalmars.D.bugs - [Issue 6014] New: rt_finalize Segmentation fault , dmd 2.053 on linux & freebsd

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6014

           Summary: rt_finalize Segmentation fault , dmd 2.053 on linux &
                    freebsd
           Product: D
           Version: D2
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: regression
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: changlon gmail.com



I build my projects on linux64 and freebsd32, and the same runtime error keeps
troubling me.

Program received signal SIGSEGV, Segmentation fault.
0x00000000004ba73f in rt_finalize () .

The dmd version is the 2.053 release. I removed every dtor "~this()" from my
code, but the error still occurs.

I have no idea how to reduce the example; I am only sure that the error is
raised when I call Parse.parse.


http://gool.googlecode.com/files/jade_dtor_bug.tar.bz2

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 15 2011
d-bugmail puremagic.com writes:




I noticed that throwing an exception in a dtor can cause problems, but in this
case there is no exception and no dtor either.

The package I posted here still contains dtors; you can remove them and test
again (util.pool.dtor, jade.Compiler.dtor).

May 16 2011
d-bugmail puremagic.com writes:




The same code works fine on Win32. The rt_finalize runtime error has been
present since dmd 2.052.


Win32 dmd 2.052 had the same problem, but it was fixed in dmd 2.053.

May 16 2011
d-bugmail puremagic.com writes:




I have no idea how to reduce this test case or how to trace the bug.

I built the project with -g -debug and then ran gdb, but the error is not in
the project code; it is in druntime.


Can anybody tell me how to build a debug version of the druntime lib?

May 27 2011
d-bugmail puremagic.com writes:


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com



04:25:16 PDT ---


> Can anybody tell me how to build a debug version druntime lib ?

In posix.mak, change flags:

DFLAGS=-gc -Isrc -Iimport -nofloat -d -w
UDFLAGS=$(DFLAGS)

make -f posix.mak

then rebuild phobos, and copy the phobos library into your lib dir.

Probably want to build phobos in debug mode as well.

I'm actually surprised you have to edit the makefile, it should be easier...

May 27 2011
d-bugmail puremagic.com writes:




Steven Schveighoffer, thank you.


I updated my dmd to 2.054, built the debug libphobos2 on linux, and used gdb
to catch this error.
---------------------------------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x00000000004cd10c in rt.lifetime.rt_finalize (p=0x7ffff729d000, det=false)
    at src/rt/lifetime.d:1154
1154                ClassInfo c = **pc;
----------------------------------------------------------------------

Jul 13 2011
d-bugmail puremagic.com writes:




04:47:25 PDT ---
A stack trace would help.

Jul 14 2011
d-bugmail puremagic.com writes:




Hi Steven Schveighoffer,

I imported core.runtime, but the stack trace is not printed automatically.

Can you tell me how to print the stack trace?

Jul 14 2011
d-bugmail puremagic.com writes:




09:45:27 PDT ---
I meant a stack trace from gdb...

use bt I think.

Jul 14 2011
d-bugmail puremagic.com writes:




Starting program: /web/www/tmp/jade/jade2test 
[Thread debugging using libthread_db enabled]
f = 0x4fc4b0,32, t = 0x713030,32, size = 1
f = 0x4fed20,176, t = 0x7ffff7ed5f00,176, size = 1
f = 0x4ff200,72, t = 0x71f490,72, size = 1
f = 0x4f8c10,64, t = 0x7ffff7ed8fc0,64, size = 1
f = 0x4f8d00,64, t = 0x7ffff7ed8f80,64, size = 1
f = 0x4f0880,12, t = 0x7ffff7ed9ff0,12, size = 1
f = 0x4f3920,56, t = 0x7ffff7ed8f00,56, size = 1
f = 0x4f3920,56, t = 0x7ffff7ed8ec0,56, size = 1
1 times use time = 1ms 



Program received signal SIGSEGV, Segmentation fault.
0x00000000004cda08 in rt.lifetime.rt_finalize (p=0x7ffff729d000, det=false)
    at src/rt/lifetime.d:1154
1154                ClassInfo c = **pc;
(gdb) bt

    at src/rt/lifetime.d:1154

    stackTop=0x7fffffffe260) at src/gc/gcx.d:2631

    at src/gc/gcx.d:2391

    at src/gc/gcx.d:1329


    at src/rt/dmain2.d:515

    dg=0x00000000004abbdc00007fffffffe4a0) at src/rt/dmain2.d:471

    at src/rt/dmain2.d:518

Jul 14 2011
d-bugmail puremagic.com writes:




05:49:10 PDT ---
So here is what I can learn from this information:

1. The crash is happening on the final collection cycle when the runtime is
shutting down.
2. The memory block (pointer value 0x7ffff729d000) is marked as having a
finalizer.
3. The memory block being collected does not have a valid classinfo pointer
(which resides at the very beginning of the block), which means either:

  a. It's not really a class, and is incorrectly marked as having a finalizer
or
  b. The pointer has somehow been corrupted.
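
To illustrate point 3, here is a minimal sketch of the lookup that the
`ClassInfo c = **pc;` line in the gdb output performs: the first word of a
class instance is the vtable pointer, and the first vtable slot refers to the
ClassInfo. The ASTNode class and the classInfoOf helper are stand-ins invented
for this sketch; it mirrors only that one expression, not the whole of
rt_finalize.

```d
class ASTNode {}   // stand-in class; the real one lives in the reporter's project

// Hypothetical helper: read the ClassInfo the way `ClassInfo c = **pc;` does,
// i.e. object pointer -> vtable pointer -> first vtable slot.
ClassInfo classInfoOf(void* p)
{
    auto pc = cast(ClassInfo**) p;
    return (*pc is null) ? null : **pc;   // a null vtable pointer has no ClassInfo
}

void main()
{
    auto node = new ASTNode;
    assert(classInfoOf(cast(void*) node) is typeid(node));

    // Memory that is not a class instance has no such pointer at its start.
    // Here the first word is zero, so the lookup is skipped; a garbage value
    // in that word would be dereferenced instead.
    void*[2] raw;
    assert(classInfoOf(raw.ptr) is null);
}
```

If the first word of a block holds neither null nor a real vtable pointer, the
double dereference reads an arbitrary address, which matches the crash site
reported above.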

The issue with a problem like this is, the corruption could happen anywhere.

Given that dtors allocating memory has now been disallowed by 2.054 (a known
cause of corruption), I don't think your code could be doing that.

So that leaves examining your code for incorrect memory operations.  I don't
really have time to look through your code, but I'd recommend looking
suspiciously at things where casts are used, or where you are using raw
pointers.

One other thing is to add (or uncomment) some druntime debug printf statements
-- print out the classinfo name and addresses for memory blocks being
allocated.  That at least should tell you what the *original* type was being
allocated for the failed memory block.  Sometimes this is the only way to debug
such corruption issues.
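
A user-side approximation of that logging might look like the sketch below. It
does not touch druntime's own debug printf statements; logAlloc, classNameOf
and the two classes are hypothetical names made up for illustration.

```d
import core.stdc.stdio : printf;

class Parser {}    // hypothetical stand-ins for the project's classes
class ASTNode {}

// Hypothetical helper: the dynamic class name of an object.
string classNameOf(Object o)
{
    return typeid(o).name;
}

// Hypothetical helper: print the class name and address of a fresh object,
// so a crashing block address can later be matched to its original type.
void logAlloc(Object o)
{
    string name = classNameOf(o);
    printf("alloc %.*s at %p\n", cast(int) name.length, name.ptr, cast(void*) o);
}

void main()
{
    logAlloc(new Parser);
    logAlloc(new ASTNode);
}
```

Comparing the logged addresses with the pointer reported by gdb shows which
allocation, if any, originally owned the failing block.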

Jul 15 2011
d-bugmail puremagic.com writes:




I do not understand the mechanism of druntime, and this problem has troubled me
for a long time.

According to my simple understanding, the assertion in the following code
should not fail, but in fact it does. Have I got it wrong, or does druntime
have a bug?

------------------------------------------------------
import core.memory;
void main(){
        auto attr = cast(GC.BlkAttr) 0b1 ;
        auto test = GC.malloc(10, attr);
        GC.setAttr(test, 0);
        auto _attr = GC.getAttr(test);
        assert(attr != _attr);
}
-----------------------------------------------------

I use GC.malloc and GC.realloc to speed up memory allocation. The attr of a
memory block has been changed before the main function exits.
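
For what it is worth, the failing assertion above is consistent with the
documented behaviour of core.memory: GC.setAttr only adds the given bits to a
block, so passing 0 changes nothing, while GC.clrAttr is the call that removes
bits. A minimal sketch follows; NO_SCAN is used in place of FINALIZE purely so
that the example is harmless to run.

```d
import core.memory : GC;

void main()
{
    // NO_SCAN stands in for FINALIZE here; the set/clear semantics are the same.
    void* p = GC.malloc(16, GC.BlkAttr.NO_SCAN);

    GC.setAttr(p, 0);                        // adds zero bits: nothing changes
    assert(GC.getAttr(p) & GC.BlkAttr.NO_SCAN);

    GC.clrAttr(p, GC.BlkAttr.NO_SCAN);       // this is what removes a bit
    assert(!(GC.getAttr(p) & GC.BlkAttr.NO_SCAN));
}
```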

Jul 17 2011
d-bugmail puremagic.com writes:




I think the bug is not caused by a class cast.

In the code I have Token and ASTNode; Token is a struct and ASTNode is a
class. If I allocate both of them on the heap, it works fine. If I allocate
only ASTNode on the heap, the problem is still there. If I allocate only Token
on the heap, the test works fine.

I stored the pointer to a Token struct in a global pointer and printed it
before exiting the main function. I found that it was different, which means
that Pool.data has been moved, and after exiting main I got a segmentation
fault.


I did nothing but change struct Token to class Token; the objects are still
allocated from the pool rather than the heap, and the problem no longer exists.


So I guess this is not a cast(class) issue. It is a struct issue, related to
the druntime GC.


The segmentation fault is very rare; if I change a few little things in
example.jade, it does not occur.


If I allocate struct Token on the heap, performance is very bad: it takes 100
times as long as when it is not allocated on the heap.



After several months of debugging and testing, I finally resolved this problem.
Thanks a lot to Steven Schveighoffer for his help.


My problem went away after switching struct Token to class Token, but I believe
there is still a hidden druntime GC bug, so I will not close this bug.

Aug 08 2011
d-bugmail puremagic.com writes:




The struct implementation, which shows the issue (test case):
http://gool.googlecode.com/files/jade_dtor_bug.tar.bz2

The class implementation, which does not show the issue (test case):
http://gool.googlecode.com/files/jade_dtor_bug_fixed.tar.bz2


I really can't reduce the test case, because it is a runtime issue.

jade is a web view template compiler, like http://www.smarty.net/ for PHP.
jade converts the jade template language to D source for web development
purposes.

The hard part is that if I change anything in the jade view template source
(example.jade), the issue no longer occurs, so I really do not know how to
reduce this test case.

Aug 08 2011
d-bugmail puremagic.com writes:


David Simcha <dsimcha yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsimcha yahoo.com



---
FWIW this is where/how/why the sporadic segfaults in the std.parallelism
unittests on Linux and FreeBSD that the auto tester keeps flagging are
occurring.

Aug 13 2011
d-bugmail puremagic.com writes:




*** Issue 5766 has been marked as a duplicate of this issue. ***

Sep 06 2011
d-bugmail puremagic.com writes:


dawg dawgfoto.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dawg dawgfoto.de



It could as well be a double finalization.
The vtable pointer is cleared when calling rt_finalize on a class.

There is also a deterministic bug happening due to an oversight in the
finalization design. Finalization is done in memory order and does not take
hierarchies into account.

---
class A
{
    ~this() {}
    void cleanup() {}
}

class B
{
    this(A a) { this.a = a; }
    ~this() { a.cleanup(); }
    A a;
}

void main() {
    auto a = new A();
    auto b = new B(a);
    // allocating a at a lower address than b causes it to be finalized earlier
    assert(cast(void*)b.a < cast(void*)b);
}
---

When b.a is finalized before b, its vtable is set to null, hence
the segfault at accessing the classinfo.

It seems like we need to somehow sort the to be finalized memory while
scanning.
Any cheap ideas to do that are welcome.

Sep 13 2011
d-bugmail puremagic.com writes:


dawg dawgfoto.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID



 changlon

You won't like the cause of your bug.
All fields in a struct are default initialized.
Pointers with null, Integrals with 0, Floats with NaN and enums with
the first enum member.

enum BlkAttr : uint
{
    FINALIZE    = 0b0000_0001, /// Finalize the data in this block on collect.
    NO_SCAN     = 0b0000_0010, /// Do not scan through this block on collect.
    NO_MOVE     = 0b0000_0100,  /// Do not move this memory block on collect.
    APPENDABLE  = 0b0000_1000, /// This block contains the info to allow appending.
    NO_INTERIOR = 0b0001_0000
}

That means the attr flag in your memory pool is always set to BlkAttr.FINALIZE.
Every GC.malloc you do will get a wrong finalization.
This can be avoided by giving a default value to the field.
GC.BlkAttr attr = cast(GC.BlkAttr)0;
Arguably this could be the default member in BlkAttr.
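
A minimal sketch of that pitfall, under stated assumptions: OldBlkAttr is a
hypothetical two-member stand-in for the enum quoted above (whose first member
is FINALIZE), and Pool is an invented struct, not the reporter's util.pool.

```d
// Hypothetical stand-in for the enum as quoted above: FINALIZE comes first.
enum OldBlkAttr : uint
{
    FINALIZE = 0b0000_0001,
    NO_SCAN  = 0b0000_0010,
}

// Invented pool struct, for illustration only.
struct Pool
{
    OldBlkAttr attr;                        // implicitly OldBlkAttr.init
    OldBlkAttr safe = cast(OldBlkAttr) 0;   // explicit "no attributes"
}

void main()
{
    Pool p;
    assert(p.attr == OldBlkAttr.FINALIZE);  // first enum member, not zero
    assert(p.safe == 0);                    // the explicit default really is zero
}
```

Passing p.attr to an allocation call would therefore request finalization for
every block, whereas p.safe would not.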

I will close this bug and open a new one for the order of class finalization.

Sep 13 2011
d-bugmail puremagic.com writes:




05:22:43 PDT ---


> There is also a deterministic bug happening due to an oversight in the
> finalization design. Finalization is done in memory order and does not take
> hierarchies into account.

Just to clarify as you discovered in your new bug report, this is by design --
a destructor cannot rely on any heap-allocated data being present.

A concept in many GC-based languages is to have two "destructors", one which
is only ever called synchronously, and one that can be called asynchronously by
the GC.  The synchronous one always calls the asynchronous one.  This is
sometimes called a finalizer (and in fact, ~this is a finalizer).

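
One way to sketch that split in D is shown below. This is an assumption-laden
illustration, not code from druntime or from the reporter's project: Buffer,
close, isClosed and release are invented names, and here both paths share one
release routine instead of one calling the other directly.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical resource holder; all names are illustrative only.
class Buffer
{
    private void* mem;   // C-heap memory, which the GC never frees or finalizes

    this(size_t n) { mem = malloc(n); }

    // Finalizer: may run during a collection, in any order, so it touches
    // only this object's own non-GC state.
    ~this() { release(); }

    // Deterministic cleanup, called by the owner while everything is alive.
    void close() { release(); }

    bool isClosed() const { return mem is null; }

    private void release() nothrow @nogc
    {
        if (mem !is null)
        {
            free(mem);
            mem = null;
        }
    }
}

void main()
{
    auto b = new Buffer(64);
    b.close();             // explicit, synchronous cleanup
    assert(b.isClosed);    // the finalizer's later call to release() is a no-op
}
```

Anything that needs other GC-allocated objects, such as the a.cleanup() call in
the earlier example, belongs in the explicit close step rather than in ~this.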
Sep 14 2011
d-bugmail puremagic.com writes:




PDT ---
I've added BlkAttr.NONE as a default for this enum.  Seems like an easy way to
avoid weird errors like this.

Sep 14 2011
d-bugmail puremagic.com writes:


Martin Nowak <code dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED



https://github.com/D-Programming-Language/druntime/commit/7189e90005e156ea2e826a47d47ed0e97efc0286

Oct 09 2013