www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 318] New: wait does not release thread resources on Linux

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=318

           Summary: wait does not release thread resources on Linux
           Product: D
           Version: 0.165
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: blocker
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: mclysenk mtu.edu


While wait is supposed to release a thread's resources, it will fail if the
thread has already completed.  This makes it impossible to use more than 400
threads reliably.  Here is an example which demonstrates the problem:


import std.stdio, std.thread;

void main()
{
    for(int i=0; i<80000; i++)
    {
        writefln("Creating thread %d", i);
        Thread t = new Thread({writefln("    Created!"); return 0;});
        t.start;
        for(int x=0; x<1000; x++)
            Thread.yield;
        t.wait;
        writefln("    Finished.");
    }
}


Within a few hundred iterations, this code will likely produce a "failed to
start" error.  From my testing, this issue only affects Linux.

So far, there are no workarounds.


-- 
Sep 02 2006
next sibling parent Sean Kelly <sean f4.ca> writes:
d-bugmail puremagic.com wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=318
 
            Summary: wait does not release thread resources on Linux
            Product: D
            Version: 0.165
           Platform: All
         OS/Version: Linux
             Status: NEW
           Severity: blocker
           Priority: P2
          Component: Phobos
         AssignedTo: bugzilla digitalmars.com
         ReportedBy: mclysenk mtu.edu
 
 
 While wait is supposed to release a thread's resources, it will fail if the
 thread has already completed.  This makes it impossible to use more than 400
 threads reliably.  Here is an example which demonstrates the problem:
 
 
 import std.stdio, std.thread;
 
 void main()
 {
     for(int i=0; i<80000; i++)
     {
         writefln("Creating thread %d", i);
         Thread t = new Thread({writefln("    Created!"); return 0;});
         t.start;
         for(int x=0; x<1000; x++)
             Thread.yield;
         t.wait;
         writefln("    Finished.");
     }
 }
 
 
 Within a few hundred iterations, this code will likely produce a "failed to
 start" error.  From my testing, this issue only affects Linux.
I think line 667 of thread.d should be changed from: if (state == TS.RUNNING) to: if (state != TS.INITIAL) Because it is not only legal to call pthread_join on a thread that has run and finished, but calling pthread_join or pthread_detach is required for the thread resources to be released. However, it is illegal to call pthread_join more than once, and I believe it is also illegal to detach a thread that has already been joined, so 'id' should probably be cleared after join/detach is called, and this value tested along with 'state' before performing thread ops. As an unrelated issue, I just noticed that CloseHandle is not being called on the thread handle for Win32, and pthread_detach is not being called for Posix. I think this should be done in a thread dtor or the equivalent to ensure resources are properly released. Sean
Sep 02 2006
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=318






 As an unrelated issue, I just noticed that CloseHandle is not being 
 called on the thread handle for Win32, and pthread_detach is not being 
 called for Posix.  I think this should be done in a thread dtor or the 
 equivalent to ensure resources are properly released.
 
Yes, this is very important. This is a huge bug. Sometimes one uses "throwaway" threads that just do one thing and terminate. Currently, it will cause a huge leak and potential errors. --
Nov 19 2006
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=318


braddr puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bugzilla digitalmars.com    |braddr puremagic.com
             Status|NEW                         |ASSIGNED





Created an attachment (id=197)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=197&action=view)
Proposed fix for phobos 1.x - v1

I've run this through a bit of testing of this diff, both 1.x and 2.x, using
the provided example test case and a few variations of my own.  (so far just on
linux, but I'll test on windows shortly).

I can no longer reproduce the problem.  That said, threading problems are
notoriously difficult to be sure about.  I'd appreciate it if some of you could
take a look and hopefully even build your own phobos and do some testing.

I need to think a little bit more about the running -> terminated -> finished
transition steps a bit to make sure it's safe in all cases.  I really would
prefer not to have to make state management synchronized.

Thanks,
Brad


-- 
Oct 19 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=318


braddr puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

           obsolete|                            |





Created an attachment (id=198)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=198&action=view)
patch v2

Further testing showed race conditions between the gc and the thread library so
I went ahead with the conservative approach.  I'm not happy with this many sync
points, but my test cases no longer show any problems.


-- 
Oct 21 2007
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=318


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED





Fixed dmd 1.023 and 2.007


-- 
Nov 03 2007