digitalmars.D.learn - How to avoid running out of OS thread handles when spawning lots

David Nadlinger <see klickverbot.at> writes:
The title says it all – how can I avoid running out of OS thread handles 
when spawning lots of short-lived threads?

In reality, I encountered the issue while writing tests for a piece of code 
which spawns a thread, but this is the basic issue:

---
import core.thread;

void doNothing() {}
void main() {
   foreach (i; 0 .. 100_000) {
     auto t = new Thread(&doNothing);
     t.start();

     // Just to make sure the thread has time to terminate.
     Thread.sleep(dur!"msecs"(1));
   }
}
---

Even though the threads terminate immediately, the D Thread objects stay 
around a lot longer (until the garbage collector decides to collect 
them), and as pthread_detach is only called in the Thread destructor, this 
causes the application to eventually fail with

core.thread.ThreadException src/core/thread.d(812): Error creating thread

because the available OS thread handles are exhausted.

Any ideas how to properly fix that?

David
Jun 09 2011
Robert Clipsham <robert octarineparrot.com> writes:
On 10/06/2011 00:17, David Nadlinger wrote:
 Any ideas how to properly fix that?

 David
As far as I'm aware, you cannot avoid it; it's a hard-coded limit set by
the operating system. May I suggest using Fibers instead of threads? If
your threads are short-lived, their overhead is probably not worth it.
You can also combine fibers with threads if you need to take advantage
of multiple cores (use a couple of worker threads, with N fibers each).

See http://octarineparrot.com/article/view/getting-more-fiber-in-your-diet
for more information about fibers.

--
Robert
http://octarineparrot.com/
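
The fiber suggestion above could look roughly like the following. This is only a minimal sketch: the helper name runInFiber is made up for illustration, and it reuses one Fiber via reset() on the assumption that the work items are independent and run to completion without yielding.

```d
import core.thread : Fiber;

// Hypothetical helper (not part of druntime): runs `count` short tasks
// on a single reusable fiber instead of creating one OS thread per task.
size_t runInFiber(size_t count)
{
    size_t completed = 0;

    // The delegate is the fiber's body; it runs on the calling thread.
    auto fiber = new Fiber({ ++completed; });

    foreach (i; 0 .. count)
    {
        fiber.call();                             // run the body to completion
        assert(fiber.state == Fiber.State.TERM);  // body has returned
        fiber.reset();                            // make the fiber callable again
    }
    return completed;
}

void main()
{
    assert(runInFiber(100_000) == 100_000);
}
```

No OS thread handle is created per task here, so the limit discussed in this thread does not come into play; the trade-off is that fibers alone do not run in parallel.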
Jun 09 2011
David Nadlinger <see klickverbot.at> writes:
On 6/10/11 1:21 AM, Robert Clipsham wrote:
 As far as I'm aware, you cannot avoid it, it's a hard coded limit set by
 the operating system. May I suggest using Fibers instead of threads? If
 your threads are short lived, their overhead is probably not worth it.
 You can also combine fibers with threads if you need to take advantage
 of multiple cores (use a couple of worker threads, with N fibers each).
The threads are only short-lived during that test; they aren't in the real
application (incidentally, it would typically be a single thread running
for the whole application lifetime).

Also, the OS thread limit isn't a problem in my example, as no more than a
couple of threads are running at the same time – the problem is just
freeing the handle by calling pthread_detach for the finished threads.

By the way, trying to force a garbage collection as a last resort exposes a
druntime bug on OS X: http://d.puremagic.com/issues/show_bug.cgi?id=6135.

David
Jun 09 2011
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 09 Jun 2011 19:17:28 -0400, David Nadlinger <see klickverbot.at>  
wrote:

 Any ideas how to properly fix that?
t.join() ?

http://www.digitalmars.com/d/2.0/phobos/core_thread.html#join

AFAIK, a thread cannot go away until you join it, because it still has to
give you its exit status. See this man page:

http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_join.html

It is unspecified whether a thread that has exited but remains unjoined
counts against _POSIX_THREAD_THREADS_MAX.

-Steve
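
Applied to the core.thread example from the first post, the join() suggestion might look like this. It is a sketch only: spawnAndJoin is a hypothetical helper name, and it assumes the caller still holds the Thread object, which is not the case when the thread is created behind std.concurrency.

```d
import core.thread : Thread;

void doNothing() {}

// Hypothetical helper: starts `count` threads one after another and joins
// each of them, so cleanup of the OS thread does not wait for the GC to
// finalize the Thread object.
size_t spawnAndJoin(size_t count)
{
    size_t joined = 0;
    foreach (i; 0 .. count)
    {
        auto t = new Thread(&doNothing);
        t.start();
        t.join();   // blocks until doNothing has returned
        ++joined;
    }
    return joined;
}

void main()
{
    assert(spawnAndJoin(2_000) == 2_000);
}
```

Because join() waits for termination, the sleep from the original example is no longer needed.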
Jun 09 2011
David Nadlinger <see klickverbot.at> writes:
On 6/10/11 1:37 AM, Steven Schveighoffer wrote:
 t.join() ?

 http://www.digitalmars.com/d/2.0/phobos/core_thread.html#join
Doesn't work, in my application I'm a) using std.concurrency, and b) even
that is hidden behind the API I want to test. A better example would
probably be the following, which crashes before reaching 600 iterations on
my Arch Linux VM:

---
import std.concurrency;

void doNothing() {}
void main() {
   foreach (i; 0 .. 1_000_000) {
     spawnLinked(&doNothing);
     receive((LinkTerminated t){});
   }
}
---

I realize that spawn()ing lots of threads without causing much GC
activity may be an edge case that probably only occurs during testing,
but I'd still be interested in a workaround. (GC.collect() to collect
the remaining Thread objects and thus cause the threads to be detach()ed
unfortunately breaks on OS X:
http://d.puremagic.com/issues/show_bug.cgi?id=6135).
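
For reference, the GC.collect() workaround mentioned above could be sketched as follows. The helper name spawnWithCollect and the collection interval are arbitrary choices for illustration. Nothing guarantees that a collection finalizes every dead Thread object, and per the bug report linked above this approach broke on OS X at the time of writing.

```d
import core.memory : GC;
import std.concurrency : LinkTerminated, receive, spawnLinked;

void doNothing() {}

// Hypothetical helper: spawns `count` linked threads sequentially and
// forces a collection every `collectEvery` iterations, in the hope that
// unreachable Thread objects get finalized (and thereby detached).
// `collectEvery` must be greater than zero.
size_t spawnWithCollect(size_t count, size_t collectEvery)
{
    size_t finished = 0;
    foreach (i; 0 .. count)
    {
        spawnLinked(&doNothing);
        // Wait for the notification that the spawned thread has ended.
        receive((LinkTerminated t) { ++finished; });

        if ((i + 1) % collectEvery == 0)
            GC.collect();
    }
    return finished;
}

void main()
{
    assert(spawnWithCollect(2_000, 100) == 2_000);
}
```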
 AFAIK, a thread cannot go away until you join it, because it still has
 to give you its exit status.
Not if you call pthread_detach(), after which calls to pthread_join() are
defined to be invalid.

David
Jun 09 2011
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 09 Jun 2011 19:57:27 -0400, David Nadlinger <see klickverbot.at>  
wrote:

 On 6/10/11 1:37 AM, Steven Schveighoffer wrote:
 t.join() ?

 http://www.digitalmars.com/d/2.0/phobos/core_thread.html#join
 Doesn't work, in my application I'm a) using std.concurrency, and b) even
 that is hidden behind the API I want to test. A better example would
 probably be the following, which crashes before reaching 600 iterations on
 my Arch Linux VM:
In my previous application (admittedly, this was D1 and Tango, but essentially the same Thread class), I've run and terminated thousands and thousands of threads from a single process execution, and never had that problem. Now, std.concurrency, I'm not too familiar with. So I'm of little help there...
 ---
 import std.concurrency;

 void doNothing() {}
 void main() {
    foreach (i; 0 .. 1_000_000) {
      spawnLinked(&doNothing);
      receive((LinkTerminated t){});
    }
 }
 ---

 I realize that spawn()ing lots of threads without causing much GC  
 activity may be an edge case that probably only occurs during testing,  
 but I'd still be interested in a workaround. (GC.collect() to collect  
 the remaining Thread objects and thus cause the threads to be detach()ed  
 unfortunately breaks on OS X:  
 http://d.puremagic.com/issues/show_bug.cgi?id=6135).
I would be uneasy if thread joining or cleanup depends on the GC. Indeed, I
don't see anything in std.concurrency that joins threads which terminate.
That may be the issue. Receiving a termination message probably should join
the thread...

What I'd suggest is this: modify std.concurrency to put the thread inside
the Tid struct (currently, it's not stored anywhere), then call join on
that thread when you are sure it's dead. So your code will look like this:

auto tid = spawnLinked(&doNothing);
receive((LinkTerminated t) {});
tid.theThread.join();

See if that helps. If it does, file a bug against std.concurrency.
 AFAIK, a thread cannot go away until you join it, because it still has
 to give you its exit status.
 Not if you call pthread_detach(), after which calls to pthread_join() are
 defined to be invalid.
Are you calling that? The only place that calls that in druntime I can find
is the Thread dtor.

-Steve
Jun 09 2011