digitalmars.D.bugs - [Bug 114] New: Multithreaded applications crash upon garbage collection

d-bugmail puremagic.com writes:
http://d.puremagic.com/bugzilla/show_bug.cgi?id=114

           Summary: Multithreaded applications crash upon garbage collection
           Product: D
           Version: 0.154
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Keywords: patch
          Severity: critical
          Priority: P1
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: juanjo comellas.com.ar


There is a problem in std/thread.d in Phobos that appears when the garbage
collector runs and the gcx.mark() method is executed. Dave
<dave_member pathlink.com> provided a fix for this with the following message:

The problem is that t.stackTop is not yet valid when it is passed into
gcx.mark(): pauseAll returns (and lets the GC commence) before stackTop
has been set for all of the paused threads.

    extern (C) static void pauseHandler(int sig)
    {
        int result;

        // Save all registers on the stack so they'll be scanned by the GC
        asm
        {
            pusha   ;
        }

        assert(sig == SIGUSR1);
        // Move sem_post to after t.stackTop = getESP();
        //sem_post(&flagSuspend);

        sigset_t sigmask;
        result = sigfillset(&sigmask);
        assert(result == 0);
        result = sigdelset(&sigmask, SIGUSR2);
        assert(result == 0);

        Thread t = getThis();
        t.stackTop = getESP();
        t.flags &= ~1;
        sem_post(&flagSuspend); // moved here, after stackTop is set
        while (1)
        {
            sigsuspend(&sigmask);   // suspend until SIGUSR2
            if (t.flags & 1)        // ensure it was resumeHandler()
                break;
        }

        // Restore all registers
        asm
        {
            popa    ;
        }
    }

I have already verified that this modification fixes the problem.
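
The ordering rule behind the fix (publish the stack position first, signal
the collector second) can be sketched outside Phobos. The following is only
an illustration in C under stated assumptions, not the std/thread.d code:
it assumes Linux with unnamed POSIX semaphores, and the names stack_top,
resumed, worker_started, install and pause_and_resume_once are hypothetical
stand-ins. A single worker is paused with SIGUSR1 and resumed with SIGUSR2,
mirroring the handler above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>

static sem_t flag_suspend;             /* stands in for flagSuspend */
static sem_t worker_started;           /* hypothetical: worker is up and running */
static volatile uintptr_t stack_top;   /* stands in for t.stackTop */
static volatile sig_atomic_t resumed;  /* stands in for (t.flags & 1) */
static atomic_int done;                /* tells the worker to exit */

static void resume_handler(int sig) { (void)sig; resumed = 1; }

static void pause_handler(int sig)
{
    int marker;                        /* a local inside the handler's frame */
    sigset_t mask;
    (void)sig;

    sigfillset(&mask);
    sigdelset(&mask, SIGUSR2);

    resumed = 0;
    stack_top = (uintptr_t)&marker;    /* 1. publish the stack position */
    sem_post(&flag_suspend);           /* 2. only then wake the collector */

    while (!resumed)
        sigsuspend(&mask);             /* sleep until SIGUSR2 arrives */
}

static void *worker(void *arg)
{
    (void)arg;
    sem_post(&worker_started);
    while (!atomic_load(&done))
        sched_yield();
    return NULL;
}

static void install(int sig, void (*fn)(int))
{
    struct sigaction sa = {0};
    int r;
    sa.sa_handler = fn;
    sigfillset(&sa.sa_mask);           /* SIGUSR2 stays blocked until sigsuspend */
    r = sigaction(sig, &sa, NULL);
    assert(r == 0);
    (void)r;
}

/* Pause one worker, read its published stack position, resume it.
   Returns 1 if the value read after the semaphore wait was set. */
int pause_and_resume_once(void)
{
    pthread_t tid;
    uintptr_t seen;

    stack_top = 0;
    atomic_store(&done, 0);
    sem_init(&flag_suspend, 0, 0);
    sem_init(&worker_started, 0, 0);
    install(SIGUSR1, pause_handler);
    install(SIGUSR2, resume_handler);

    pthread_create(&tid, NULL, worker, NULL);
    sem_wait(&worker_started);

    pthread_kill(tid, SIGUSR1);        /* ask the worker to pause */
    sem_wait(&flag_suspend);           /* returns only after step 2 above */
    seen = stack_top;                  /* so step 1 has already happened */

    pthread_kill(tid, SIGUSR2);        /* resume the worker */
    atomic_store(&done, 1);
    pthread_join(tid, NULL);
    sem_destroy(&flag_suspend);
    sem_destroy(&worker_started);
    return seen != 0;
}
```

Built with -pthread and called from main, pause_and_resume_once() should
return 1. If the sem_post in pause_handler were moved above the stack_top
assignment, the read after sem_wait could see the old value, which is the
kind of window the report describes.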


-- 
Apr 24 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/bugzilla/show_bug.cgi?id=114


juanjo comellas.com.ar changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |juanjo comellas.com.ar




------- Comment #1 from juanjo comellas.com.ar  2006-04-24 09:59 -------
BTW, when the application crashes, the line reported by gdb is:

#0  0x0806a978 in _D3gcx3Gcx4markFPvPvZv () at gcx.d:1318
1318                byte *p = cast(byte *)(*p1);

The pointer that's being dereferenced by the GC is invalid. Here's a backtrace
of a test program that has two threads. The crash happens in thread 1.

  (gdb) thread apply all bt

Thread 2 (process 8953):
#0  0x5557db9d in sem_post GLIBC_2.0 () from /lib/tls/libpthread.so.0
#1  0x08062f27 in _D3std6thread6Thread12pauseHandlerUiZv () at std/thread.d:940
#2  <signal handler called>
#3  0x5557e83e in send () from /lib/tls/libpthread.so.0
#4  0x08050a61 in
_D5mango2io6Socket6Socket4sendFAvE5mango2io6Socket6Socket5FlagsZi () at
/home/jcomellas/devel/d/mango_test/mango/io/Socket.d:1423
#5  0x08050290 in _D5mango2io6Socket6Socket6writerFAvZk () at
/home/jcomellas/devel/d/mango_test/mango/io/Socket.d:879
#6  0x0804cbde in _D5mango2io7Conduit7Conduit5writeFAvZk () at
/home/jcomellas/devel/d/mango_test/mango/io/Conduit.d:198
#7  0x0805821f in _D8selector16clientThreadFuncFZv () at selector.d:363
#8  0x0805816e in _D8selector21dummyClientThreadFuncFPvZi () at selector.d:327
#9  0x080628c5 in _D3std6thread6Thread3runFZi () at std/thread.d:609
#10 0x08062d50 in _D3std6thread6Thread11threadstartUPvZPv () at
std/thread.d:845
#11 0x55579ced in start_thread () from /lib/tls/libpthread.so.0
#12 0x5567ddde in clone () from /lib/tls/libc.so.6

Thread 1 (process 8949):
#0  0x0806a978 in _D3gcx3Gcx4markFPvPvZv () at gcx.d:1318
#1  0x0806ad05 in _D3gcx3Gcx11fullcollectFPvZk () at gcx.d:1462
#2  0x0806aab5 in _D3gcx3Gcx16fullcollectshellFZk () at gcx.d:1382
#3  0x080692de in _D3gcx2GC12mallocNoSyncFkZPv () at gcx.d:275
#4  0x080691c1 in _D3gcx2GC6mallocFkZPv () at gcx.d:228
#5  0x080684db in _d_newclass () at gc.d:127
#6  0x08053df7 in
_D5mango2io8selector12PollSelector12PollSelector11selectedSetFZC5mango2io8selector5model9ISelector13ISelectionSet
()
    at /home/jcomellas/devel/d/mango_test/mango/io/selector/PollSelector.d:353
#7  0x08057d69 in
_D8selector12testSelectorFC5mango2io8selector5model9ISelector9ISelectorZv () at
selector.d:142
#8  0x08057c24 in _Dmain () at selector.d:66
#9  0x0805a38a in main () at internal/dmain2.d:94
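
Note that thread 2 is still inside sem_post in pauseHandler while thread 1
is already marking, which is consistent with the ordering problem described
in the report. To show why a bad stack bound faults on a line like gcx.d:1318,
here is a rough sketch in C of a conservative scan loop. It is not the gcx.d
source; scan_range, scan_demo and the heap bounds are hypothetical names used
only for illustration. Every word in the range is loaded and treated as a
possible pointer, so a range that starts at a stale or garbage address is
read directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the words in [bot, top) whose value lies inside [heap_lo, heap_hi).
   A real collector would mark the object each such word points into. */
size_t scan_range(void **bot, void **top, uintptr_t heap_lo, uintptr_t heap_hi)
{
    size_t hits = 0;
    void **p1;

    for (p1 = bot; p1 < top; p1++) {
        /* This load is the counterpart of "byte *p = cast(byte *)(*p1);".
           If bot was never set correctly, p1 itself is the bad pointer. */
        uintptr_t p = (uintptr_t)*p1;
        if (p >= heap_lo && p < heap_hi)
            hits++;
    }
    return hits;
}

/* Scan a small fake "stack" against a fake "heap"; returns the hit count. */
int scan_demo(void)
{
    char heap[64];
    void *words[4];
    size_t hits;

    words[0] = heap + 8;    /* inside the heap */
    words[1] = NULL;        /* not a heap address */
    words[2] = heap + 63;   /* last byte of the heap */
    words[3] = heap + 64;   /* one past the end, excluded */

    hits = scan_range(words, words + 4,
                      (uintptr_t)heap, (uintptr_t)(heap + 64));
    assert(hits == 2);
    return (int)hits;
}
```

With valid bounds the loop only reads memory the thread owns; the crash in
frame #0 corresponds to the case where the lower bound passed in was not a
usable stack address.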


-- 
Apr 24 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/bugzilla/show_bug.cgi?id=114


juanjo comellas.com.ar changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




-- 
May 04 2006