www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - std.thread bug

reply Ben Hinkle <bhinkle4 juno.com> writes:
During testing of various threading behaviors I think there is a bug in
std.thread in the routine "threadstart". The problem is that lines 289 and
745 decrement the nthreads global without grabbing the threadLock before
hand. When I have a bunch of small threads that start and stop on top of
one another the application fails to quit properly. I assume that is
because the nthreads global is wrong. If I throw in enough Thread.yield()
to spread out the thread starts and stops then it exits fine.
So I think those lines (at least) need to be wrapped with
synchronized(threadLock){...}

-Ben
May 30 2004
next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
Perhaps it's not the nthreads global Ben, since it's only read by pauseAll()
and resumeAll(). I think it's likely to be the allThreads[] instead (line
287) where there's a race condition with start() while being set to null
during thread termination.

I'd like to convince Walter to change the implementation such that it places
the Thread object handle in O/S thread-local-storage (TLS) instead. By doing
so, we'd eliminate the linear lookup within getThis() when asking for the
current thread.

Having done that, the introduction of your Concurrent port would likely
eliminate the need for pauseAll() and resumeAll(). This would render a
triple bonus:

1) Concurrent brings solid high value functionality to D threading
2) two deadlock prone methods removed
3) Thread.d would no longer need allThreads[], so an inherent Phobos
limitation on the maximum number of threads is also eliminated.

All of this would be "A Good Thing" tm.

Does linux have native TLS ??

- Kris


"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:c9e7a3$1o8o$1 digitaldaemon.com...
 During testing of various threading behaviors I think there is a bug in
 std.thread in the routine "threadstart". The problem is that lines 289 and
 745 decrement the nthreads global without grabbing the threadLock before
 hand. When I have a bunch of small threads that start and stop on top of
 one another the application fails to quit properly. I assume that is
 because the nthreads global is wrong. If I throw in enough Thread.yield()
 to spread out the thread starts and stops then it exits fine.
 So I think those lines (at least) need to be wrapped with
 synchronized(threadLock){...}

 -Ben
May 30 2004
parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Kris wrote:

 Perhaps it's not the nthreads global Ben, since it's only read by
 pauseAll() and resumeAll(). I think it's likely to be the allThreads[]
 instead (line 287) where there's a race condition with start() while being
 set to null during thread termination.
That could be, too. pauseAll gets called by gc_term (see internal/dmain2.c) so it is possible that nthreads is involved but I haven't debugged anything so I don't really know.
 I'd like to convince Walter to change the implementation such that it
 places the Thread object handle in O/S thread-local-storage (TLS) instead.
 By doing so, we'd eliminate the linear lookup within getThis() when asking
 for the current thread.
yeah I was wondering about that list, too. Actually instead of TLS I wonder if the Thread object reference could be put at some known offset from the bottom of the stack. I guess knowing that offset is the icky part.
 Having done that, the introduction of your Concurrent port would likely
 eliminate the need for pauseAll() and resumeAll(). This would render a
 triple bonus:
 
 1) Concurrent brings solid high value functionality to D threading
 2) two deadlock prone methods removed
 3) Thread.d would no longer need allThreads[], so an inherent Phobos
 limitation on the maximum number of threads is also eliminated.
hmm. I think the GC still would need pauseAll - or some way to get at all hoses the order of threads waiting on locks so low priority threads get shafted after a GC (to put it technically). It would be nice to get rid of pauseAll somehow.
 All of this would be "A Good Thing" tm.
 
 Does linux have native TLS ??
yup. pthread_key_create and friends.
 - Kris
 
 
 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:c9e7a3$1o8o$1 digitaldaemon.com...
 During testing of various threading behaviors I think there is a bug in
 std.thread in the routine "threadstart". The problem is that lines 289
 and 745 decrement the nthreads global without grabbing the threadLock
 before hand. When I have a bunch of small threads that start and stop on
 top of one another the application fails to quit properly. I assume that
 is because the nthreads global is wrong. If I throw in enough
 Thread.yield() to spread out the thread starts and stops then it exits
 fine. So I think those lines (at least) need to be wrapped with
 synchronized(threadLock){...}

 -Ben
May 30 2004
parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle" wrote
 That could be, too. pauseAll gets called by gc_term (see
internal/dmain2.c) Hmmm. Didn't look that far myself. Would be nice if D supported Daemon threads a la Java. Upon exit, the main thread would kill all non-daemons and then the last thread to subsequently exit would call gc_term() ?
 yeah I was wondering about that list, too. Actually instead of TLS I
wonder
 if the Thread object reference could be put at some known offset from the
 bottom of the stack. I guess knowing that offset is the icky part.
That sounds like a reasonable idea.
 hmm. I think the GC still would need pauseAll - or some way to get at all

 hoses the order of threads waiting on locks so low priority threads get
 shafted after a GC (to put it technically). It would be nice to get rid of
 pauseAll somehow.
Perhaps a dumb question, but how does the GC use pauseAll() to get at the stacks? - Kris
 All of this would be "A Good Thing" tm.

 Does linux have native TLS ??
yup. pthread_key_create and friends.
 - Kris


 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:c9e7a3$1o8o$1 digitaldaemon.com...
 During testing of various threading behaviors I think there is a bug in
 std.thread in the routine "threadstart". The problem is that lines 289
 and 745 decrement the nthreads global without grabbing the threadLock
 before hand. When I have a bunch of small threads that start and stop
on
 top of one another the application fails to quit properly. I assume
that
 is because the nthreads global is wrong. If I throw in enough
 Thread.yield() to spread out the thread starts and stops then it exits
 fine. So I think those lines (at least) need to be wrapped with
 synchronized(threadLock){...}

 -Ben
May 31 2004
parent "Walter" <newshound digitalmars.com> writes:
"Kris" <someidiot earthlink.dot.dot.dot.net> wrote in message
news:c9fodo$t44$1 digitaldaemon.com...
 Perhaps a dumb question, but how does the GC use pauseAll() to get at the
 stacks?
It doesn't. All pauseAll does is pause all the threads. GC finds the stacks by walking the array of all threads.
Jul 19 2004
prev sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Ben Hinkle wrote:

 
 During testing of various threading behaviors I think there is a bug in
 std.thread in the routine "threadstart". The problem is that lines 289 and
 745 decrement the nthreads global without grabbing the threadLock before
 hand. When I have a bunch of small threads that start and stop on top of
 one another the application fails to quit properly. I assume that is
 because the nthreads global is wrong. If I throw in enough Thread.yield()
 to spread out the thread starts and stops then it exits fine.
 So I think those lines (at least) need to be wrapped with
 synchronized(threadLock){...}
 
 -Ben
This is also related to the defining when the program exits. In Java the program exits when all the non-daemon threads are done. In D it exits when the main thread is done. Calling gc_term before all the threads are done seems like trouble. I wonder if it is really worth doing a full collect at exit anyway - it would skip call the destructors on garbage but is that so bad? see Note that wrapping pauseAll with synchronized(threadLock) is a problem, too, since it gets called at various times during startup before the threadLock has been created (from gc_init or the GC invariants or something). grumble.
May 31 2004
parent reply Mike Swieton <mike swieton.net> writes:
On Mon, 31 May 2004 10:38:20 -0400, Ben Hinkle wrote:
 This is also related to the defining when the program exits. In Java the
 program exits when all the non-daemon threads are done. In D it exits when
 the main thread is done. Calling gc_term before all the threads are done
 seems like trouble. I wonder if it is really worth doing a full collect at
 exit anyway - it would skip call the destructors on garbage but is that so
 bad?
 
 see 

 
 Note that wrapping pauseAll with synchronized(threadLock) is a problem, too,
 since it gets called at various times during startup before the threadLock
 has been created (from gc_init or the GC invariants or something). grumble.
I made a post about this a while back. A solution that worked for me here was to change the dmain2.d internal main to do something like: foreach (Thread t; Thread.getAll()) if (!t.isSelf()) t.wait(); Thus not cleaning up or exiting the main thread until all children have exited on their own. I think calling the destructors is important, for external resources. It seems to me that sockets, files, etc. could be left open if the collect isn't run? On a side note: do we want daemon threads? I don't think it'd be all that hard to implement (read: I think I can do it), and my recent work on Concurrent has shown me some of their usefulness, along with some of the other Java threads features ;) Mike Swieton __ Has this world been so kind to you that you should leave with regret? There are better things ahead than any we leave behind. - C. S. Lewis
May 31 2004
next sibling parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Mike Swieton" <mike swieton.net> wrote
 Thus not cleaning up or exiting the main thread until all children have
exited
 on their own. I think calling the destructors is important, for external
 resources. It seems to me that sockets, files, etc. could be left open if
the
 collect isn't run?
The OS does tend to cleanup all file.socket handles at process termination, but I too would still prefer all destructors were called since there's no telling what else might be bound to the app. The only downside of this is the shutdown time: I recall there was a post early in the year about how a MEGA-wordcount took ~six minutes to complete -- all of the time appeared to be in the gazzilions of destructors <g>
 On a side note: do we want daemon threads? I don't think it'd be all that
hard
 to implement (read: I think I can do it), and my recent work on Concurrent
has
 shown me some of their usefulness, along with some of the other Java
threads
 features ;)
For server use, I would absolutely appreciate daemon threads. - Kris
May 31 2004
prev sibling parent Ben Hinkle <bhinkle4 juno.com> writes:
Mike Swieton wrote:

 On Mon, 31 May 2004 10:38:20 -0400, Ben Hinkle wrote:
 This is also related to the defining when the program exits. In Java the
 program exits when all the non-daemon threads are done. In D it exits
 when the main thread is done. Calling gc_term before all the threads are
 done seems like trouble. I wonder if it is really worth doing a full
 collect at exit anyway - it would skip call the destructors on garbage
 but is that so bad?
 
 see
 
 Note that wrapping pauseAll with synchronized(threadLock) is a problem,
 too, since it gets called at various times during startup before the
 threadLock has been created (from gc_init or the GC invariants or
 something). grumble.
I made a post about this a while back. A solution that worked for me here was to change the dmain2.d internal main to do something like: foreach (Thread t; Thread.getAll()) if (!t.isSelf()) t.wait();
nice.
 Thus not cleaning up or exiting the main thread until all children have
 exited on their own. I think calling the destructors is important, for
 external resources. It seems to me that sockets, files, etc. could be left
 open if the collect isn't run?
I'm not positive but I thought most everything gets cleaned up by the OS at exit. Calling abort or kill -9 might leave things open but then the GC doesn't get a chance to clean up anyway. One place where having a destructor would make a difference is if the socket or file was buffered and the buffer hadn't been flushed. <rant> But personally I think such things should be flushed by hand since the GC is free to do whatever it wants with garbage. It can ignore a given piece of garbage for a long long time. The GC is designed to manage memory so it shouldn't be relied upon to clean up other resources, too. </rant>
 On a side note: do we want daemon threads? I don't think it'd be all that
 hard to implement (read: I think I can do it), and my recent work on
 Concurrent has shown me some of their usefulness, along with some of the
 other Java threads features ;)
I bet Walter would be open to suggestions/contributions.
 Mike Swieton
 __
 Has this world been so kind to you that you should leave with regret?
 There are better things ahead than any we leave behind.
 - C. S. Lewis
May 31 2004