digitalmars.D.learn - How do I limit the number of active threads (queuing spawn calls)
- Andrej Mitrovic (23/23) Mar 26 2011 I'm testing various compilation schemes with DMD. Right now I'm...
- Jonathan M Davis (37/85) Mar 26 2011 I don't believe that std.concurrency has any way to manage the number of...
- Andrej Mitrovic (95/95) Mar 26 2011 Well I've worked around this by polling a variable which holds the
- Andrej Mitrovic (5/5) Mar 26 2011 Edit: It looks like I did almost the same as Jonathan advised.
- Brad Roberts (10/16) Mar 26 2011 The way I've typically done this sort of pattern is with a thread pool t...
I'm testing various compilation schemes with DMD. Right now I'm spawning multiple threads which simply do a `system` call with a string like "dmd -c somefile.d". I'd like to limit the number of active threads to something my CPU can handle (4 in this case, since I've got 4 cores). How do I go about doing this?

Here's the function which I spawn:

void compileObjfile(string name)
{
    shell(r"dmd -od" ~ r".\cache\" ~ r" -c -version=Unicode -version=WindowsNTonly -version=Windows2000 -version=WindowsXP -I..\ " ~ name ~ " ");
}

So I just need to pass the module name to it. The trouble is, if I spawn this function inside a foreach loop, I'll inadvertently create a few dozen threads. This hogs the system for a while. :) (Although it does produce some rather impressive compilation speeds!)

This is what the main function might look like:

void main()
{
    foreach (string name; dirEntries(curdir, SpanMode.shallow))
    {
        if (name.isfile && name.getExt == "d")
        {
            spawn(&compileObjfile, name);
        }
    }
}

Sidenotes: I've tried compiling the win32 libraries via `dmd -lib`. DMD eats up over 300 MB of memory, and it's quite scary how fast that number grows. It took over 25 seconds to compile the lib file. On the other hand, compiling the .obj files one by one by blocking a single thread on the system calls (in other words, the single-threaded version) takes about 15 seconds to create the library file, and each DMD instance only uses about a dozen or so MB, maybe less. When I spawn an unlimited number of threads via the foreach loop, again compiling object by object, the lib file is generated in only 5(!) seconds. I'm running a quad-core on XP32, btw.

So I'm a little perplexed, because according to Tomasz (maker of xfBuild) and his various posts, compiling .obj file by .obj file should be really slow and -lib should give the fastest builds. But I'm getting the exact opposite results.
Mar 26 2011
On 2011-03-26 18:15, Andrej Mitrovic wrote:
> [...]

I don't believe that std.concurrency has any way to manage the number of threads that are running. It gives you the means to communicate between threads and a nice way to spawn a thread, but it doesn't really do much with thread management.

You could use core.thread.Thread.getAll to get an array of all of the Threads and spin until the number is below whatever threshold you want, but that's not terribly efficient, since then you have a thread spinning, eating up CPU as it waits for the others to finish.

What I have done when I've wanted something like this is to have each spawned thread send a message back when it's done. Then I increment a thread count when I spawn a thread and decrement it when I receive a message indicating that a thread has terminated. The loop which processes whatever list of things I want processed will only spawn a thread if the thread count is below the chosen threshold; otherwise it sits there waiting to receive a message. So, it would do something like this:

foreach (string name; dirEntries(curdir, SpanMode.shallow))
{
    if (name.isfile && name.getExt == "d")
    {
        if (currThreads < maxThreads)
            receiveTimeout(1, recProc);
        else
            receive(recProc);

        spawn(&compileObjfile, name);
        ++currThreads;
    }
}

recProc is then a function which handles receiving messages, and it decrements currThreads when it receives the message that a thread has terminated.

std.concurrency does not manage threads. It only gives you tools for creating them and communicating between them.
So, you need to manage the threads yourself. However, it should be noted that the task you're looking to solve here may be better solved by std.parallelism, which David has been working on and which is currently being reviewed on the main list.

- Jonathan M Davis
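A minimal, compilable sketch of that counting pattern, assuming a fixed maxThreads of 4 and a simple bool "done" message, and using today's std.process.executeShell and std.path.extension in place of the 2011-era shell/getExt (the exact dmd flags are abbreviated):

import std.concurrency;
import std.file : dirEntries, SpanMode;
import std.path : extension;
import std.process : executeShell;
import core.time : dur;

enum maxThreads = 4;

void compileObjfile(string name)
{
    executeShell(r"dmd -od.\cache\ -c -I..\ " ~ name);
    ownerTid.send(true);   // tell the parent that one worker has finished
}

void main()
{
    int currThreads;

    // handler that decrements the counter when a worker reports completion
    void recProc(bool done) { --currThreads; }

    foreach (entry; dirEntries(".", SpanMode.shallow))
    {
        if (!entry.isFile || entry.name.extension != ".d")
            continue;

        if (currThreads < maxThreads)
            receiveTimeout(dur!"msecs"(1), &recProc);  // opportunistic check for finished workers
        else
            receive(&recProc);                         // block until a slot frees up

        spawn(&compileObjfile, entry.name);
        ++currThreads;
    }

    // drain the remaining workers before exiting (or before running lib)
    while (currThreads > 0)
        receive(&recProc);
}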
Mar 26 2011
Well, I've worked around this by polling a variable which holds the number of active threads. It's not a pretty solution, and I'd probably be better off using std.parallelism or some upcoming module. My solution for now is:

import std.stdio;
import std.file;
import std.path;
import std.process;
import std.concurrency;
import core.atomic;
import core.thread;

shared int threadsCount;

void compileObjfile(string name)
{
    system(r"dmd -od" ~ r".\cache\" ~ r" -c -version=Unicode -version=WindowsNTonly -version=Windows2000 -version=WindowsXP -I..\ " ~ name ~ " ");
    atomicOp!"-="(threadsCount, 1);
}

int main()
{
    string libfileName = r".\cache\win32.lib ";
    string objFiles;

    foreach (string name; dirEntries(curdir, SpanMode.shallow))
    {
        if (name.isfile && name.basename.getName != "build" && (name.getExt == "d" || name.getExt == "di"))
        {
            string objfileName = r".\cache\" ~ name.basename.getName ~ ".obj";
            objFiles ~= objfileName ~ " ";

            atomicOp!"+="(threadsCount, 1);

            while (threadsCount > 3)
            {
                Thread.sleep(dur!("msecs")(1));
            }

            spawn(&compileObjfile, name);
        }
    }

    while (threadsCount)
    {
        Thread.sleep(dur!("msecs")(1)); // wait for threads to finish before the call to lib
    }

    system(r"lib -c -n -p64 " ~ objFiles);
    return 0;
}

The timing:

D:\dev\projects\win32\win32>timeit build
Digital Mars Librarian Version 8.02n
Copyright (C) Digital Mars 2000-2007 All Rights Reserved http://www.digitalmars.com/ctg/lib.html
Digital Mars Librarian complete.

Version Number:   Windows NT 5.1 (Build 2600)
Exit Time:        3:49 am, Sunday, March 27 2011
Elapsed Time:     0:00:06.437
Process Time:     0:00:00.062
System Calls:     627101
Context Switches: 123883
Page Faults:      734997
Bytes Read:       93800813
Bytes Written:    7138927
Bytes Other:      1043652

So about ~6.5 seconds. Now compare this to this build script, which simply invokes DMD with -lib and all the modules:

import std.stdio;
import std.process;
import std.path;
import std.file;

void main()
{
    string files;

    foreach (string name; dirEntries(curdir, SpanMode.shallow))
    {
        if (name.isfile && name.basename.getName != "build" && name.getExt == "d")
            files ~= name ~ " ";
    }

    system(r"dmd -lib -I..\ -version=Unicode -version=WindowsNTonly -version=Windows2000 -version=WindowsXP " ~ files);
}

D:\dev\projects\win32\win32>timeit build.exe

Version Number:   Windows NT 5.1 (Build 2600)
Exit Time:        3:54 am, Sunday, March 27 2011
Elapsed Time:     0:00:25.750
Process Time:     0:00:00.015
System Calls:     139172
Context Switches: 44648
Page Faults:      87440
Bytes Read:       7427284
Bytes Written:    7413372
Bytes Other:      45798

Compiling object by object is almost exactly 4 times faster with threading than using -lib on all module files. And my multithreaded script is probably wasting some time by calling Thread.sleep(), but I'm new to threading and I don't know how else to limit the number of threads.
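One way to avoid the Thread.sleep() polling is a counting semaphore from core.sync.semaphore: the main thread takes a slot before starting a compile and each worker gives it back when dmd finishes. A rough sketch, assuming a maxJobs of 4 and an abbreviated dmd command line; the makeWorker helper is only there so each thread's closure gets its own copy of the file name:

import core.sync.semaphore : Semaphore;
import core.thread : Thread;
import std.file : dirEntries, SpanMode;
import std.path : extension;
import std.process : executeShell;

// One worker thread per file; the semaphore slot is released when dmd is done.
Thread makeWorker(Semaphore slots, string name)
{
    return new Thread({
        scope (exit) slots.notify();               // free the slot even if dmd fails
        executeShell(r"dmd -od.\cache\ -c -I..\ " ~ name);
    });
}

void main()
{
    enum maxJobs = 4;
    auto slots = new Semaphore(maxJobs);           // four compile "slots"
    Thread[] workers;

    foreach (entry; dirEntries(".", SpanMode.shallow))
    {
        if (!entry.isFile || entry.name.extension != ".d")
            continue;

        slots.wait();                              // blocks while maxJobs compiles are running
        auto t = makeWorker(slots, entry.name);
        t.start();
        workers ~= t;
    }

    foreach (t; workers)
        t.join();                                  // everything done before calling lib
}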
Mar 26 2011
Edit: It looks like I did almost the same as Jonathan advised. I'm looking forward to std.parallelism though. I'm thinking I'd probably use some kind of parallel foreach loop that iterates over 4 files at once and let it do its work by spawning 4 threads. Or something like that. We'll see.
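For reference, a sketch of what that might look like with std.parallelism's parallel foreach once it is available; the up-front file collection, the work unit size of 1, and the abbreviated dmd command line are assumptions rather than anything from the thread:

import std.algorithm : filter, map;
import std.array : array;
import std.file : dirEntries, SpanMode;
import std.parallelism : defaultPoolThreads, parallel;
import std.path : extension;
import std.process : executeShell;

void main()
{
    defaultPoolThreads = 3;   // pool threads; the calling thread also works, so 4 concurrent compiles

    auto files = dirEntries(".", SpanMode.shallow)
                 .filter!(e => e.isFile && e.name.extension == ".d")
                 .map!(e => e.name)
                 .array;

    // work unit size of 1 so that each file becomes its own task
    foreach (name; parallel(files, 1))
        executeShell(r"dmd -od.\cache\ -c -I..\ " ~ name);
}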
Mar 26 2011
On 3/26/2011 7:00 PM, Andrej Mitrovic wrote:
> [...]

The way I've typically done this sort of pattern is with a thread pool that gets its work from a queue. The main thread shoves work into the queue and then calls a .join or .waitForEmpty sort of API on the pool. So it'd look something like:

void workerFunc(string str)
{
    ...
}

auto tp = new ThreadPool(getNumCpus(), &workerFunc);

foreach (...)
    tp.push(str);

tp.join();

This can suffer from queue size problems if the amount of work is huge, but that's not been a problem for the vast majority of the cases I've had, so I never worried about making push capable of blocking or otherwise throttling the producer side.
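A rough sketch of such a pool in D, built from core.sync primitives; the ThreadPool, push and join names follow the outline above, but everything else (string work items, a closed flag as the shutdown signal) is an assumption rather than an existing library API:

import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

class ThreadPool
{
    private Mutex mtx;
    private Condition cond;
    private string[] queue;
    private bool closed;
    private Thread[] workers;

    this(uint nThreads, void delegate(string) work)
    {
        mtx = new Mutex;
        cond = new Condition(mtx);
        foreach (i; 0 .. nThreads)
        {
            auto t = new Thread({
                for (;;)
                {
                    string item;
                    synchronized (mtx)
                    {
                        while (queue.length == 0 && !closed)
                            cond.wait();
                        if (queue.length == 0 && closed)
                            return;              // no more work coming; exit the worker
                        item = queue[0];
                        queue = queue[1 .. $];
                    }
                    work(item);                  // run the job outside the lock
                }
            });
            t.start();
            workers ~= t;
        }
    }

    void push(string item)
    {
        synchronized (mtx)
        {
            queue ~= item;
            cond.notify();                       // wake one idle worker
        }
    }

    void join()
    {
        synchronized (mtx)
        {
            closed = true;
            cond.notifyAll();                    // wake everyone so they can drain and exit
        }
        foreach (t; workers)
            t.join();
    }
}

Usage then matches the outline: construct it with the thread count and a compile delegate, push one file name per job, and call join() before invoking lib.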
Mar 26 2011