digitalmars.D.bugs - [Issue 15939] New: GC.collect causes deadlock in multi-threaded
- via Digitalmars-d-bugs (76/76) Apr 18 2016 https://issues.dlang.org/show_bug.cgi?id=15939
https://issues.dlang.org/show_bug.cgi?id=15939

          Issue ID: 15939
           Summary: GC.collect causes deadlock in multi-threaded environment
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: blocker
          Priority: P1
         Component: druntime
          Assignee: nobody puremagic.com
          Reporter: apreobrazhensky gmail.com

I have a multi-threaded application with threads doing memory-intensive work
and the main thread cleaning up garbage periodically by calling GC.collect
manually. Sometimes GC.collect causes a deadlock. I don't have a simple
example, but I do have stack traces of the threads at the moment of the
deadlock.

It happens with both dmd 2.071.0 and dmd 2.070.* (so it is not related to the
recent GC spinlock change).

That seems like a blocker to me: I suspect that if it happens when I call it
manually, it could also happen during normal collections. I'm not familiar
with the runtime code, but judging from the stack traces below I would expect
some sort of race condition.

Configuration:
dmd 2.071.0 with -O -release -inline -boundscheck=off
Linux 3.2.0-95-generic #135-Ubuntu SMP Tue Nov 10 13:33:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

This is the main thread's stack trace:

Thread 1 (Thread 0x7ff6653bb6c0 (LWP 6857)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00000000007b3ff6 in thread_suspendAll ()
#2  0x000000000079980d in gc.gc.Gcx.fullcollect() ()
#3  0x000000000079c2b2 in gc.gc.GC.__T9runLockedS49_D2gc2gc2GC11fullCollectMFNbZ2goFNbPS2gc2gc3GcxZmTPS2gc2gc3GcxZ.runLocked() ()
#4  0x0000000000796535 in gc.gc.GC.fullCollect() ()
#5  0x000000000076091c in gc_collect ()
...application stack

This is what the stack trace looks like for the threads that were suspended
correctly:

Thread XX (Thread 0x7ff5c6ffd700 (LWP 6897)):
#0  0x00007ff6640e6454 in do_sigsuspend (set=0x7ff5c6ff9bc0) at ../sysdeps/unix/sysv/linux/sigsuspend.c:63
#1  __GI___sigsuspend (set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:78
#2  0x00000000007c0401 in core.thread.thread_suspendHandler() ()
#3  0x00000000007c045c in core.thread.callWithStackShell() ()
#4  0x00000000007c038f in thread_suspendHandler ()
#5  <signal handler called>
... application stack

This is what the stack trace looks like for the threads that weren't
suspended:

Thread YY (Thread 0x7ff5c67fc700 (LWP 6898)):
#0  0x00007ff664d9b52d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000000075dfde in core.thread.Thread.sleep() ()
#2  0x00000000007b46e0 in core.internal.spinlock.SpinLock.yield() ()
#3  0x00000000007b467c in core.internal.spinlock.SpinLock.lock() ()
#4  0x000000000079bc21 in gc.gc.GC.__T9runLockedS46_D2gc2gc2GC12extendNoSyncMFNbPvmmxC8TypeInfoZmS21_D2gc2gc10extendTimelS21_D2gc2gc10numExtendslTPvTmTmTxC8TypeInfoZ.runLocked() ()
#5  0x0000000000760bcc in gc_extend ()
#6  0x0000000000763c85 in _d_arraysetlengthT ()
... application stack

Thread ZZ (Thread 0x7ff566ffd700 (LWP 6918)):
#0  0x00007ff664d9b52d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000000075dfde in core.thread.Thread.sleep() ()
#2  0x00000000007b46e0 in core.internal.spinlock.SpinLock.yield() ()
#3  0x00000000007b467c in core.internal.spinlock.SpinLock.lock() ()
#4  0x000000000079ba3c in gc.gc.GC.__T9runLockedS47_D2gc2gc2GC12mallocNoSyncMFNbmkKmxC8TypeInfoZPvS21_D2gc2gc10mallocTimelS21_D2gc2gc10numMallocslTmTkTmTxC8TypeInfoZ.runLocked() ()
#5  0x00000000007953be in gc.gc.GC.malloc() ()
#6  0x0000000000760a04 in gc_malloc ()
#7  0x0000000000762c43 in _d_newclass ()
... application stack
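
No reduced test case is attached to the report. Purely as an illustration, the
workload it describes (worker threads allocating heavily through array
resizing and class instantiation while the main thread periodically calls
GC.collect) might be sketched in D as below. The names runWorkload,
allocateUntilStopped and stopWorkers are hypothetical, the thread count,
iteration count and sleep interval are arbitrary assumptions, and this sketch
is not known to reproduce the deadlock.

```d
import core.atomic : atomicLoad, atomicStore;
import core.memory : GC;
import core.thread : Thread;
import core.time : msecs;

// Hypothetical flag used to tell the workers to finish.
shared bool stopWorkers;

// Worker body: allocate continuously until asked to stop.
void allocateUntilStopped()
{
    while (!atomicLoad(stopWorkers))
    {
        int[] buffer;
        foreach (i; 0 .. 256)
        {
            // Growing an array goes through the runtime's array-resize hook,
            // which may call gc_extend (compare thread YY in the traces).
            buffer.length = buffer.length + 1024;
            // Class allocation goes through _d_newclass and gc_malloc
            // (compare thread ZZ in the traces).
            auto obj = new Object;
        }
    }
}

// Starts workerCount allocating threads, then calls GC.collect from the
// calling thread `collections` times. Returns the number of collections
// that completed.
size_t runWorkload(size_t workerCount, size_t collections)
{
    atomicStore(stopWorkers, false);

    Thread[] workers;
    foreach (i; 0 .. workerCount)
    {
        auto t = new Thread(&allocateUntilStopped);
        t.start();
        workers ~= t;
    }

    size_t completed;
    foreach (i; 0 .. collections)
    {
        GC.collect();            // manual full collection, as in the report
        ++completed;
        Thread.sleep(10.msecs);  // arbitrary pause between collections
    }

    atomicStore(stopWorkers, true);
    foreach (t; workers)
        t.join();
    return completed;
}

void main()
{
    // Arbitrary sizes; the report does not state the real ones.
    runWorkload(8, 100);
}
```

If a run of such a program hangs, attaching gdb and dumping all threads with
"thread apply all bt" gives traces in the same form as the ones above.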