digitalmars.D.bugs - phobos - Thread.pauseAll breaks with custom signal handlers? Maybe

Mike Swieton <mike swieton.net> writes:
What I have here is only a theory, so if I'm completely off base just beat me
with a banana and I'll shut up ;) I've been seeing some hangs in some of the
multithreaded code I've been working on, and my browsing through stack traces
and phobos code suggests this. 

Executive Summary: A thread that is already waiting will not respond to
Thread.pauseAll(), causing Thread.pauseAll() to never return (stuck in
sem_wait). The garbage collector calls Thread.pauseAll(), triggering the
problem.

The longer version:

My suspicion right now is this: when I call pthread_cond_wait, it suspends the
current process, using sigsuspend (which ignores all signals but those
specified). When Thread.pauseAll suspends a thread, it does so by sending
SIGUSR1. If the thread is already in a wait state, the thread will *stay
suspended* (i.e. it will ignore SIGUSR1, because it's waiting for SIGUSR2).
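A minimal single-threaded sketch of the masking behaviour described above, written against plain POSIX rather than Phobos or any particular pthread implementation. The function name `usr1_deferred_during_sigsuspend` and the handler are made up for illustration. It assumes only that the waiter sits in sigsuspend() with SIGUSR1 masked and SIGUSR2 allowed, and it shows that SIGUSR1 is not discarded in that state: it stays pending until the mask is lifted.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_usr1 = 0;
static volatile sig_atomic_t got_usr2 = 0;

/* Hypothetical handler: only records which signal was delivered. */
static void on_signal(int sig) {
    if (sig == SIGUSR1) got_usr1 = 1;
    if (sig == SIGUSR2) got_usr2 = 1;
}

/* Returns 1 if SIGUSR2 woke sigsuspend() while SIGUSR1 stayed pending,
 * 0 if not, and -1 if the setup calls failed. */
int usr1_deferred_during_sigsuspend(void) {
    struct sigaction sa;
    sigset_t both, wait_mask, pending, old;

    got_usr1 = 0;
    got_usr2 = 0;

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;

    /* Block both signals so that raise() leaves them pending. */
    sigemptyset(&both);
    sigaddset(&both, SIGUSR1);
    sigaddset(&both, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &both, &old) != 0) return -1;

    raise(SIGUSR1);
    raise(SIGUSR2);

    /* Wait with SIGUSR1 masked and SIGUSR2 allowed, as the post assumes
     * the waiting thread does. The pending SIGUSR2 is delivered and
     * sigsuspend() returns after its handler has run. */
    sigemptyset(&wait_mask);
    sigaddset(&wait_mask, SIGUSR1);
    sigsuspend(&wait_mask);

    /* Both signals are blocked again here; check what is still pending. */
    if (sigpending(&pending) != 0) return -1;
    int deferred = (got_usr2 == 1 && got_usr1 == 0 &&
                    sigismember(&pending, SIGUSR1) == 1);

    /* Restore the caller's mask; a still-pending SIGUSR1 is delivered
     * now if the caller did not have it blocked. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return deferred;
}
```

On a POSIX system this should return 1. If so, a masked SIGUSR1 is delayed rather than lost, and the theory only produces a permanent hang if the waiter never leaves its masked wait.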

Then, Thread.pauseAll() will sit in sem_wait(), waiting for all the
threads to acknowledge being suspended. However, my user thread - already
waiting - doesn't acknowledge, because it ignored the signal (being already
suspended).

This means that Thread.pauseAll() never completes, and all threads are left in
a wait state that is impossible for them to leave.
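The handshake described above can be modelled in a few lines of C. This is a sketch of the theory, not the Phobos implementation: `pause_is_acknowledged`, `pause_handler`, and the three semaphores are hypothetical names, and a worker that keeps SIGUSR1 blocked stands in for the thread that "ignores" the signal. The sketch also uses sem_timedwait() with a short timeout where the post describes a plain sem_wait(), so that the failing case returns instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <time.h>

static sem_t ack;    /* posted by the handler: "I have seen the pause request" */
static sem_t ready;  /* posted by the worker once its signal mask is set */
static sem_t done;   /* posted by the pauser to let the worker exit */

/* sem_post() is async-signal-safe, so it may be called from a handler. */
static void pause_handler(int sig) { (void)sig; sem_post(&ack); }

static void *worker(void *arg) {
    int block_usr1 = *(int *)arg;
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    pthread_sigmask(block_usr1 ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
    sem_post(&ready);
    /* Stand-in for "already waiting"; retry if a handler interrupts us. */
    while (sem_wait(&done) != 0 && errno == EINTR) { }
    return NULL;
}

/* Returns 1 if the worker acknowledged SIGUSR1 within 200 ms, 0 if the
 * wait timed out, and -1 if setup failed. */
int pause_is_acknowledged(int block_usr1) {
    struct sigaction sa;
    struct timespec deadline;
    pthread_t t;
    int rc;

    sem_init(&ack, 0, 0);
    sem_init(&ready, 0, 0);
    sem_init(&done, 0, 0);

    sa.sa_handler = pause_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    if (pthread_create(&t, NULL, worker, &block_usr1) != 0) return -1;

    while (sem_wait(&ready) != 0 && errno == EINTR) { }
    pthread_kill(t, SIGUSR1);            /* the "pause" request */

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 200000000L;      /* wait at most 200 ms */
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    while ((rc = sem_timedwait(&ack, &deadline)) != 0 && errno == EINTR) { }

    sem_post(&done);                     /* release the worker */
    pthread_join(t, NULL);
    sem_destroy(&ack);
    sem_destroy(&ready);
    sem_destroy(&done);
    return rc == 0;
}
```

With `block_usr1` set to 0 the worker's handler posts the acknowledgement and the call returns 1. With it set to 1 the signal is never delivered, the wait times out, and the call returns 0. An untimed sem_wait() in that second case would block forever, which is the hang the post describes.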

Any thoughts?

Mike Swieton
__
Freedom lies in being bold.
	- Robert Frost
Jun 05 2004
Mike Swieton <mike swieton.net> writes:
Well, I've been unable to duplicate this in a small example, so it's probably
not a bug. Not in DM code, anyway ;)

Mike Swieton
__
But it is vital to remember that information - in the sense of raw data - is
not knowledge; that knowledge is not wisdom; and that wisdom is not foresight.
	- Sir Arthur C Clarke
Jun 06 2004