digitalmars.D.learn - GC dead-locking ?

Marco Leise (19/19) Jun 13 2013 Here is an excerpt from a stack trace I got while profiling

Marco Leise (7/7) Jun 13 2013 One more note: I get this consistently during profiling, but
Sean Kelly (11/29) Jun 17 2013 size=16401, this=...) gc/gcx.d:503

Marco Leise (7/29) Jun 18 2013 No, I have not overridden the signal handler. I'm aware of the

Sean Kelly (13/42) Jun 18 2013 =16401, this=...) gc/gcx.d:503

Marco Leise (8/41) Jul 01 2013 I could do that (with a little work setting the scenario up

Marco Leise <Marco.Leise gmx.de> writes:

Here is an excerpt from a stack trace I got while profiling
with OProfile:





alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099

this=...) gc/gcx.d:503

alloc_size=0x7fc3d4bfe418) gc/gcx.d:421




bitLengths=...) sequencer/algorithm/gzip.d:444

Two more threads are alive, but waiting on a condition
variable (i.e. in pthread_cond_wait()), called from my own
code and not from druntime code. Is there some obvious way I
could have deadlocked the GC? Or is there a bug?

This was compiled with GDC using DMD FE 2.062.

-- 
Marco

Jun 13 2013

Marco Leise <Marco.Leise gmx.de> writes:

One more note: I get this consistently during profiling, but
not without it.
I don't rule out kernel involvement either, since OProfile is
a kernel-based profiler and there could be a quirk in its
interaction with semaphores.

-- 
Marco

Jun 13 2013

Sean Kelly <sean invisibleduck.org> writes:

On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise gmx.de> wrote:

> Here is an excerpt from a stack trace I got while profiling
> with OProfile:
>
> poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
> size=16401, this=...) gc/gcx.d:503
> alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
> (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
>
> Two more threads are alive, but waiting on a condition
> variable (i.e. in pthread_cond_wait()), called from my own
> code and not from druntime code. Is there some obvious way I
> could have deadlocked the GC? Or is there a bug?

I assume you're running on Linux, where druntime uses signals (SIGUSR1,
specifically) to suspend threads for a collection.  So I imagine what's
happening is that your thread is trying to suspend all the other threads
so it can collect, and those threads are ignoring the signal for some
reason.  I would expect pthread_cond_wait to be interrupted if a signal
arrives though.  Have you overridden the signal handler for SIGUSR1?

Jun 17 2013

Marco Leise <Marco.Leise gmx.de> writes:

No, I have not overridden the signal handler. I'm aware that
signals can make pthread_cond_wait() return early, so I call
it in a while loop, as one would expect. That is all.
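The while-loop pattern mentioned here can be sketched in C as follows. This is an illustration under assumptions, not the poster's actual code: the `gate_t` type and the `gate_*` functions are invented for the example. POSIX allows pthread_cond_wait() to return without the condition being signalled (a spurious wakeup, which may follow a signal handler), so the predicate is rechecked after every return.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot gate: waiters block until `ready` becomes true. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;
} gate_t;

static void gate_init(gate_t *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->ready = false;
}

static void gate_wait(gate_t *g)
{
    pthread_mutex_lock(&g->lock);
    /* Recheck the predicate after every wakeup, spurious or not. */
    while (!g->ready)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

static void gate_open(gate_t *g)
{
    pthread_mutex_lock(&g->lock);
    g->ready = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

static void *waiter(void *arg)
{
    gate_wait((gate_t *)arg);
    return NULL;
}

/* Starts one waiter, opens the gate, and joins. Returns 1 on success. */
int gate_demo(void)
{
    gate_t g;
    pthread_t t;
    gate_init(&g);
    if (pthread_create(&t, NULL, waiter, &g) != 0)
        return -1;
    gate_open(&g);
    if (pthread_join(t, NULL) != 0)
        return -1;
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.lock);
    return 1;
}
```

Because the flag is set under the mutex before the broadcast, the waiter passes the gate whether it reaches gate_wait() before or after gate_open() runs.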

-- 
Marco

Jun 18 2013

Sean Kelly <sean invisibleduck.org> writes:

Hrm... Can you trap this in a debugger and post the stack traces of all
threads?  That stack above is a thread waiting for others to say they're
suspended so it can collect.
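The waiting state described here can be illustrated with a small C sketch of a signal-based stop-the-world handshake. It is a simplified model written for this thread, assuming Linux with unnamed POSIX semaphores and linking with -pthread. It is not druntime's implementation; the use of SIGUSR2 as the resume signal and all function names are assumptions made for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

#define SUSPEND_SIG SIGUSR1
#define RESUME_SIG  SIGUSR2   /* assumption made for this sketch */

static sem_t suspend_ack;     /* posted once by each suspended thread */
static atomic_int worker_ready;
static atomic_int worker_quit;

static void resume_handler(int sig) { (void)sig; /* only ends sigsuspend */ }

static void suspend_handler(int sig)
{
    (void)sig;
    int saved_errno = errno;
    sigset_t wait_mask;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, RESUME_SIG);
    sem_post(&suspend_ack);   /* tell the collector: "I am suspended" */
    sigsuspend(&wait_mask);   /* park here until RESUME_SIG arrives */
    errno = saved_errno;
}

static void *worker(void *arg)
{
    (void)arg;
    struct timespec ts = { 0, 1000000 };   /* 1 ms */
    atomic_store(&worker_ready, 1);
    while (!atomic_load(&worker_quit))
        nanosleep(&ts, NULL);              /* may return early on a signal */
    return NULL;
}

/* Suspends one worker, "collects", resumes it. Returns the number of
 * acknowledgments received (1 on success) or -1 on error. */
int stop_the_world_demo(void)
{
    struct sigaction sa;
    pthread_t t;
    int rc;

    memset(&sa, 0, sizeof sa);
    sigfillset(&sa.sa_mask);   /* RESUME_SIG stays pending inside the handler */
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = suspend_handler;
    if (sigaction(SUSPEND_SIG, &sa, NULL) != 0) return -1;
    sa.sa_handler = resume_handler;
    if (sigaction(RESUME_SIG, &sa, NULL) != 0) return -1;
    if (sem_init(&suspend_ack, 0, 0) != 0) return -1;

    atomic_store(&worker_ready, 0);
    atomic_store(&worker_quit, 0);
    if (pthread_create(&t, NULL, worker, NULL) != 0) return -1;
    while (!atomic_load(&worker_ready))
        sched_yield();

    if (pthread_kill(t, SUSPEND_SIG) != 0) return -1;
    /* The collector blocks here until the worker acknowledges. If the
     * signal never reaches the handler, this wait never returns. */
    do { rc = sem_wait(&suspend_ack); } while (rc != 0 && errno == EINTR);
    if (rc != 0) return -1;

    /* ... the heap scan would happen here, with the worker parked ... */

    pthread_kill(t, RESUME_SIG);
    atomic_store(&worker_quit, 1);
    pthread_join(t, NULL);
    sem_destroy(&suspend_ack);
    return 1;
}
```

In this model, a thread that has SUSPEND_SIG blocked, or a process where the signal is ignored, would leave the collector stuck in sem_wait(), which resembles the state described above. Whether that is what happened under OProfile is not established in this thread.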

Jun 18 2013

Marco Leise <Marco.Leise gmx.de> writes:

I could do that (with a little work setting the scenario up
again), but it won't help. As I said, the other two threads
were paused in pthread_cond_wait() in my own code. There was
nothing special about their stack traces.

-- 
Marco

Jul 01 2013
