
digitalmars.D.learn - Garbage Collector and Foreign Threads

reply will75g <will75g yahoo.it> writes:
If I understand correctly how the D garbage collector works, in order to 
work properly it needs to stop every thread before performing collection.

What happens if one or more of the application threads have been created 
outside the control of the D runtime? For example imagine a C/C++ 
application (of which I don't even have the source code) loading a 
plug-in written by me in D... Since the D runtime knows only the threads 
created in D, the threads created by the C/C++ application could be 
still running and calling D code while the D runtime is performing a 
collection.

So far the only solution I can think of is suspending the GC while my D 
code is called from the C/C++ application.
My fear is that such a solution would lead to a lot of problems. For 
example if I have to allocate and release memory frequently, there will 
be a point where the D heap is exhausted, since the memory that I 
release won't be collected until the GC is enabled again. At that point 
every memory allocation will generate a 
std.outofmemory.OutOfMemoryException, but there's nothing I can do to 
give the collector more memory.

Any idea?
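
A minimal sketch of the workaround described above, written against the present-day druntime API (`core.memory.GC`), which postdates this thread and is an assumption here. The exported names `plugin_process_event` and `plugin_idle` are hypothetical. Note that, per the druntime documentation, `GC.disable` does not strictly forbid collections: the implementation may still collect when it considers it necessary, for example when memory runs out, so this narrows the window rather than closing it.

```d
import core.memory : GC;

// Hypothetical entry point called by the host on one of its own threads.
extern (C) int plugin_process_event(int code)
{
    GC.disable();               // ask the GC not to collect during this call
    scope (exit) GC.enable();   // re-enable on every exit path

    auto scratch = new int[](64);   // GC allocation made while disabled
    scratch[0] = code + 1;
    return scratch[0];
}

// Hypothetical idle hook: a point where the plug-in knows that no host
// thread is inside D code, so an explicit collection can reclaim garbage.
extern (C) void plugin_idle()
{
    GC.collect();
}
```

Calls to `GC.disable` and `GC.enable` nest, so the guard can be used in functions that call each other.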
Mar 19 2007
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
will75g wrote:
 If I understand correctly how the D garbage collector works, in order to
 work properly it needs to stop every thread before performing collection.
 
 What happens if one or more of the application threads have been created
 outside the control of the D runtime? For example imagine a C/C++
 application (of which I don't even have the source code) loading a
 plug-in written by me in D... Since the D runtime knows only the threads
 created in D, the threads created by the C/C++ application could be
 still running and calling D code while the D runtime is performing a
 collection.
 
 So far the only solution I can think of is suspending the GC while my D
 code is called from the C/C++ application.
 My fear is that such a solution would lead to a lot of problems. For
 example if I have to allocate and release memory frequently, there will
 be a point where the D heap is exhausted, since the memory that I
 release won't be collected until the GC is enabled again. At that point
 every memory allocation will generate a
 std.outofmemory.OutOfMemoryException, but there's nothing I can do to
 give the collector more memory.
 
 Any idea?

In so far as I'm aware, there shouldn't be any problems with leaving the GC enabled when your application is used as a plugin to a C app: the C app shouldn't have any GC'ed memory (it really, really shouldn't), thus its threads aren't a problem. So long as the GC knows about all the threads that *do* use GC'ed memory, you're fine.

Incidentally, the GC only (currently) runs if you try to 'new' something, and you're out of memory. So unless the host application is calling into your plugin from multiple threads simultaneously, I don't think you need to worry about the GC running and then having your plugin called in the middle of that.

Disclaimer: IANAGCE[1]

-- Daniel

[1]: I Am Not A Garbage Collector Expert :P

-- 
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP    http://hackerkey.com/
Mar 19 2007
next sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Daniel Keep wrote:
 
 will75g wrote:
 If I understand correctly how the D garbage collector works, in order to
 work properly it needs to stop every thread before performing collection.

 What happens if one or more of the application threads have been created
 outside the control of the D runtime? For example imagine a C/C++
 application (of which I don't even have the source code) loading a
 plug-in written by me in D... Since the D runtime knows only the threads
 created in D, the threads created by the C/C++ application could be
 still running and calling D code while the D runtime is performing a
 collection.

 So far the only solution I can think of is suspending the GC while my D
 code is called from the C/C++ application.
 My fear is that such a solution would lead to a lot of problems. For
 example if I have to allocate and release memory frequently, there will
 be a point where the D heap is exhausted, since the memory that I
 release won't be collected until the GC is enabled again. At that point
 every memory allocation will generate a
 std.outofmemory.OutOfMemoryException, but there's nothing I can do to
 give the collector more memory.

 Any idea?

 In so far as I'm aware, there shouldn't be any problems with leaving the
 GC enabled when your application is used as a plugin to a C app: the C
 app shouldn't have any GC'ed memory (it really, really shouldn't), thus
 its threads aren't a problem.  So long as the GC knows about all the
 threads that *do* use GC'ed memory, you're fine.

In fact, as long as the GC knows of at least one pointer to any object the application can still reference, you should be fine. So as long as any GC'ed objects referenced in other threads are also referenced from a D thread or from global variables the GC knows about you can even reference GC'ed objects in non-D threads. (At least for the current implementation, this might blow up with a moving collector)
 Incidentally, the GC only (currently) runs if you try to 'new'
 something, and you're out of memory.  So unless the host application is

Or you concatenate arrays (~ or ~=) or increase the .length of an array manually... Or you call any Phobos routine that does any of those things...
 calling into your plugin from multiple threads simultaneously, I don't
 think you need to worry about the GC running and then having your plugin
 called in the middle of that.

Also, normally C/C++ applications keep track of when objects can be freed in some way. As long as it then notifies your D library it no longer has any references to it (instead of trying to free()/delete them itself) you should be able to keep a reference the GC is aware of until then (in a global AA, for instance) so it doesn't get prematurely deleted and then remove that reference. If no D code has any references to it, it should then be picked up by the next GC run. Not that I've ever done this, but theoretically it should work...
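
The keep-alive idea above can be sketched as follows. This swaps the global AA for `GC.addRoot` / `GC.removeRoot` from present-day `core.memory`, which serves the same purpose (a reference the GC knows about until the host says it is done). That API postdates this thread, and the `Voice` class and the `voice_*` functions are hypothetical names used only for illustration.

```d
import core.memory : GC;

// Hypothetical plug-in object whose only long-lived reference is held by the host.
final class Voice
{
    double gain = 1.0;
}

// Create an object and hand an opaque pointer to the C/C++ host.
extern (C) void* voice_create()
{
    auto v = new Voice;
    auto p = cast(void*) v;
    GC.addRoot(p);                       // the GC now treats p as reachable
    GC.setAttr(p, GC.BlkAttr.NO_MOVE);   // guard against a moving collector
    return p;
}

// Example accessor working on the opaque handle.
extern (C) double voice_gain(void* p)
{
    return (cast(Voice) p).gain;
}

// The host calls this instead of free()/delete when it drops its reference.
extern (C) void voice_release(void* p)
{
    GC.removeRoot(p);
    GC.clrAttr(p, GC.BlkAttr.NO_MOVE);
    // With no remaining references, a later collection can reclaim the object.
}
```

The `NO_MOVE` attribute addresses the moving-collector caveat mentioned earlier in this post. Which threads the GC can suspend and scan is a separate concern that this sketch does not solve.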
Mar 19 2007
prev sibling parent reply will75g <will75g yahoo.it> writes:
Daniel Keep wrote:
 So long as the GC knows about all the threads that *do* use GC'ed
 memory, you're fine.

Unfortunately things are not that simple. The two most important threads of my plug-in will be an audio processing thread and a GUI thread, both created and managed by the hosting application (the host, not my plug-in, runs the event loop).

The audio thread isn't really a problem: it's a real-time thread, and allocating memory inside it is not advisable even in a language without a GC, so I'm used to that. The real problem is the GUI thread: I really can't imagine writing a GUI and all the application logic without resorting to dynamic memory.

The ironic part is that if I only had these two threads, there wouldn't be any problem, since the audio thread can't allocate or release memory anyway, while the GUI thread could block for a collection without causing any damage. The real problem is that I need at least one more thread, created internally by my plug-in, and this thread also needs to allocate memory. If a collection is triggered by this third thread, the GC won't be able to stop the GUI thread (since it doesn't know about it), and that's where I foresee problems.
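
For the "audio thread never allocates" rule, a hedged sketch: present-day D has a `@nogc` function attribute (added long after this thread, so an assumption here) that makes the compiler reject GC allocations inside the annotated function. The callback name and signature are hypothetical. The `static assert` lines check that the allocating operations mentioned earlier in the thread ('new', `~=`, growing `.length`) do not compile in `@nogc` code.

```d
// Hypothetical real-time audio callback. @nogc turns any GC allocation
// inside it into a compile-time error instead of a run-time hazard.
extern (C) void plugin_process_audio(float* samples, size_t count, float gain) @nogc nothrow
{
    foreach (i; 0 .. count)
        samples[i] *= gain;   // in-place processing, no allocation
}

// Each of these function literals would allocate from the GC heap,
// so none of them compiles when marked @nogc.
static assert(!__traits(compiles, () @nogc { auto p = new int; }));
static assert(!__traits(compiles, () @nogc { int[] a; a ~= 1; }));
static assert(!__traits(compiles, () @nogc { int[] a; a.length = 4; }));
```

This only guarantees that the audio thread does not trigger a collection itself; it says nothing about whether the GC can suspend that thread.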
Mar 19 2007
parent reply Sean Kelly <sean f4.ca> writes:
will75g wrote:
 Daniel Keep wrote:
 So long as the GC knows about all the threads that *do* use GC'ed
 memory, you're fine.

 Unfortunately things are not that simple. The two most important threads
 of my plug-in will be an audio processing thread and a GUI thread, both
 created and managed by the hosting application (the host, not my
 plug-in, runs the event loop).
 
 The audio thread isn't really a problem: it's a real-time thread, and
 allocating memory inside it is not advisable even in a language without
 a GC, so I'm used to that. The real problem is the GUI thread: I really
 can't imagine writing a GUI and all the application logic without
 resorting to dynamic memory.
 
 The ironic part is that if I only had these two threads, there wouldn't
 be any problem, since the audio thread can't allocate or release memory
 anyway, while the GUI thread could block for a collection without
 causing any damage. The real problem is that I need at least one more
 thread, created internally by my plug-in, and this thread also needs to
 allocate memory. If a collection is triggered by this third thread, the
 GC won't be able to stop the GUI thread (since it doesn't know about
 it), and that's where I foresee problems.

Right. I think the GC will only be able to "see" the GUI thread if it is the one which triggers the collection (this *might* not be true if the GUI thread is the one that initializes the plugin, since a Thread object is created for the "main" thread on app initialization, but it sounds risky).

If you're up for modifying Phobos, one thing that might work would be to modify std.thread such that proxy objects could be created for non-D threads. Assuming this actually works, it might be the cleanest solution, though it would mean suspending your GUI thread while collections were in progress.

I think the alternative would be to access data from the plugin via some sort of communication mechanism. A producer/consumer model, for example. The GUI thread could issue requests to the plugin and then wait for a response. Since this would all be happening within the plugin logic itself, this would all be invisible to the GUI code.

Sean
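
A rough sketch of the request/response idea, under several assumptions: it uses `core.sync.mutex`, `core.sync.condition` and `core.thread` from present-day druntime (not the Phobos 1.x of this thread), it assumes a single requesting thread, and `Request`, `Mailbox` and `startWorker` are hypothetical names. The foreign thread only copies plain values in and out, so all GC allocation stays on the plug-in's own D thread.

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

struct Request { int id; double value; }   // plain data, no GC pointers

final class Mailbox
{
    private Mutex lock;
    private Condition changed;
    private Request pending;
    private double reply;
    private bool hasRequest, hasReply, closing;

    this()
    {
        lock = new Mutex;
        changed = new Condition(lock);
    }

    // Called on the host's GUI thread: posts a request and blocks for the
    // answer. It performs no GC allocation.
    double ask(Request r)
    {
        lock.lock();
        scope (exit) lock.unlock();
        pending = r;
        hasRequest = true;
        changed.notifyAll();
        while (!hasReply)
            changed.wait();
        hasReply = false;
        return reply;
    }

    // Runs on a D thread created by the plug-in; free to use the GC.
    void serve()
    {
        lock.lock();
        scope (exit) lock.unlock();
        for (;;)
        {
            while (!hasRequest && !closing)
                changed.wait();
            if (closing)
                return;
            hasRequest = false;
            reply = pending.value * 2;   // placeholder for the real work
            hasReply = true;
            changed.notifyAll();
        }
    }

    // Asks the worker loop to exit.
    void close()
    {
        lock.lock();
        scope (exit) lock.unlock();
        closing = true;
        changed.notifyAll();
    }
}

// Starts the plug-in's worker as a regular D thread known to the runtime.
Thread startWorker(Mailbox box)
{
    auto t = new Thread(&box.serve);
    t.start();
    return t;
}
```

While the GUI thread is blocked in `ask`, the only GC reference it holds is the `Mailbox` itself, which the worker thread also references, so the object stays reachable even though the GUI thread's stack is not scanned.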
Mar 19 2007
parent reply Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 
 Right.  I think the GC will only be able to "see" the GUI thread if it 
 is the one which triggers the collection (this *might* not be true if 
 the GUI thread is the one that initializes the plugin, since a Thread 
 object is created for the "main" thread on app initialization, but it 
 sounds risky).

Just a clarification. By the parenthesized statement I meant that the GC might be aware of the GUI thread if it is the one which initialized the library, because this thread may well be considered the "main" thread. After some consideration however, I don't think the GC will otherwise be aware of the GUI thread *even if it is the thread which triggers the collection*.

The pertinent bit of code in Phobos 1.009 is internal/gc/gcx.d lines 1466 to 1500. As you can see, the GC only inspects threads in the list returned by Thread.getAll(). The GUI thread won't be in that list (ignoring the comment about it possibly being the "main" thread above) so its stack won't be inspected. The GC could make the calling thread a special case and obtain its thread id, etc, manually, but that sort of breaks encapsulation for what is a pretty weird situation.

It may be worth seeing if the bit about the GUI thread being the "main" thread is valid. If not, the best integrated approach would probably be to generate proxy thread objects or to add some other means for unmanaged threads to be suspended/inspected. The safest approach for now, however, would likely be the producer/consumer method I mentioned in my prior post.

Sean
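
One way to check the "main" thread question, sketched against present-day druntime rather than Phobos 1.009 (an assumption; the `plugin_*` names are hypothetical). `Runtime.initialize` sets up the runtime and registers the calling thread as the main thread, and `Thread.getThis` returns null for a thread the runtime does not know, so a host-facing probe can report whether the current thread would be visible to the GC.

```d
import core.runtime : Runtime;
import core.thread : Thread;

// Hypothetical load hook: the thread that calls it becomes the runtime's
// "main" thread if the runtime was not yet initialized.
extern (C) int plugin_load()
{
    return Runtime.initialize() ? 1 : 0;
}

// Hypothetical unload hook, paired with plugin_load.
extern (C) int plugin_unload()
{
    return Runtime.terminate() ? 1 : 0;
}

// Returns 1 when the calling thread is registered with the D runtime,
// meaning the GC can suspend it and scan its stack.
extern (C) int plugin_thread_known()
{
    return Thread.getThis() !is null ? 1 : 0;
}
```

Calling `plugin_thread_known` from each host callback during development would show which host threads the runtime can actually see.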
Mar 19 2007
next sibling parent reply "Chris Warwick" <sp m.me.not> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:etn7tt$2p86$1 digitalmars.com...
 Sean Kelly wrote:
 Right.  I think the GC will only be able to "see" the GUI thread if it is 
 the one which triggers the collection (this *might* not be true if the 
 GUI thread is the one that initializes the plugin, since a Thread object 
 is created for the "main" thread on app initialization, but it sounds 
 risky).

 Just a clarification.  By the parenthesized statement I meant that the GC 
 might be aware of the GUI thread if it is the one which initialized the 
 library because this thread may well be considered the "main" thread.  
 After some consideration however, I don't think the GC will otherwise be 
 aware of the GUI thread *even if it is the thread which triggers the 
 collection*.  The pertinent bit of code in Phobos 1.009 is 
 internal/gc/gcx.d lines 1466 to 1500.  As you can see, the GC only 
 inspects threads in the list returned by Thread.getAll().  The GUI thread 
 won't be in that list (ignoring the comment about it possibly being the 
 "main" thread above) so its stack won't be inspected.  The GC could make 
 the calling thread a special case and obtain its thread id, etc, 
 manually, but that sort of breaks encapsulation for what is a pretty 
 weird situation.

Assuming win32 here, as that's all I know, but...

Couldn't the plugin create a local GUI thread? So when the host requests a plugin instance, the DLL actually creates a local thread that has its own Windows message loop and that basically runs the plugin and GUI. So as long as the host GUI and audio threads don't cause any allocations or interfere in a way that would break the GC, it should work? Of course some way for the threads to interact/communicate would need to be worked out, but at least by creating your own thread you can control where the allocations occur.

With modern multicore it could be possible that more than one host GUI thread exists? And that would be very problematic, I think.

cw
Mar 19 2007
parent reply will75g <will75g yahoo.it> writes:
Chris Warwick wrote:

 Assuming win32 here, as that's all I know, but...
 
 Couldn't the plugin create a local GUI thread? So when the host requests a 
 plugin instance, the DLL actually creates a local thread that has its own 
 Windows message loop and that basically runs the plugin and GUI. So as long 
 as the host GUI and audio threads don't cause any allocations or interfere 
 in a way that would break the GC, it should work? Of course some way for the 
 threads to interact/communicate would need to be worked out, but at least by 
 creating your own thread you can control where the allocations occur. With 
 modern multicore it could be possible that more than one host GUI thread 
 exists? And that would be very problematic, I think.

Probably that would work on win32, but I have to support both Mac and PC, and if I remember correctly, on OSX it's forbidden to call any GUI-related code from a thread that is not the main application thread (a simple operation such as invalidating a window area would cause a crash). Things are even more complicated if you add the fact that my plug-in GUI must live in a child window of the host window.
Mar 20 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
will75g wrote:
 Chris Warwick wrote:
 
 Assuming win32 here, as that's all I know, but...

 Couldn't the plugin create a local GUI thread? So when the host 
 requests a plugin instance, the DLL actually creates a local thread 
 that has its own Windows message loop and that basically runs the 
 plugin and GUI. So as long as the host GUI and audio threads don't 
 cause any allocations or interfere in a way that would break the GC, 
 it should work? Of course some way for the threads to 
 interact/communicate would need to be worked out, but at least by 
 creating your own thread you can control where the allocations occur. 
 With modern multicore it could be possible that more than one host GUI 
 thread exists? And that would be very problematic, I think.

 Probably that would work on win32, but I have to support both Mac and 
 PC, and if I remember correctly, on OSX it's forbidden to call any 
 GUI-related code from a thread that is not the main application thread 
 (a simple operation such as invalidating a window area would cause a 
 crash). Things are even more complicated if you add the fact that my 
 plug-in GUI must live in a child window of the host window.

So that would mean the GUI thread of your plugin would have to be the main thread? Then perhaps the proxy object created by std.thread for the main thread mentioned somewhere in this thread would be able to handle this on OSX, and you could use the above method for win32? (Assuming by 'PC' above you meant win32, and not e.g. Linux as well. If you also need to support other PC platforms, you'll need to figure out what works there.) You may want to create some kind of OS abstraction layer for this if you try to implement it.

This is obviously not as nice as a cross-platform solution. On the other hand, it might actually work. Which is after all a very important quality for a program to have...
Mar 20 2007
parent will75g <will75g yahoo.it> writes:
Frits van Bommel wrote:
 So that would mean the GUI thread of your plugin would have to be the 
 main thread?
 Then perhaps the proxy object created by std.thread for the main thread 
 mentioned somewhere in this thread would be able to handle this on OSX, 
 and you could use above method for win32?
 (Assuming by 'PC' above you meant win32, and not e.g. Linux as well. If 
 you also need to support other PC platforms, you'll need to figure out 
 what works there)
 You may want to create some kind of OS abstraction layer for this if you 
 try to implement it.
 
 This is obviously not as nice as a cross-platform solution. On the other 
 hand, it might actually work. Which is after all a very important 
 quality for a program to have...

Yes, by PC I mean win32... unfortunately Linux is rather irrelevant for audio applications.

The solution you're proposing would work with the simplified model I described. But audio plug-ins are a mess... The whole truth is that you're guaranteed to have a GUI thread and an audio thread, but there could be more. Some notifications from the host are performed from an unspecified thread: it can be the GUI thread, the audio thread or a completely different thread. There's no standard for this and each host implements it in a different way, so the plug-in must be able to adapt to every possible situation...

That's why I'm looking for a generic solution. Being able to identify a foreign thread and wrap it in a proxy thread looks like the most promising solution so far.
Mar 20 2007
prev sibling parent reply will75g <will75g yahoo.it> writes:
Sean Kelly wrote:
 Just a clarification.  By the parenthesized statement I meant that the 
 GC might be aware of the GUI thread if it is the one which initialized 
 the library because this thread may well be considered the "main" 
 thread.  After some consideration however, I don't think the GC will 
 otherwise be aware of the GUI thread *even if it is the thread which 
 triggers the collection*.  The pertinent bit of code in Phobos 1.009 is 
 internal/gc/gcx.d lines 1466 to 1500.  As you can see, the GC only 
 inspects threads in the list returned by Thread.getAll().  The GUI 
 thread won't be in that list (ignoring the comment about it possibly 
 being the "main" thread above) so its stack won't be inspected.  The GC 
 could make the calling thread a special case and obtain its thread id, 
 etc, manually, but that sort of breaks encapsulation for what is a pretty 
 weird situation.
 
 It may be worth seeing if the bit about the GUI thread being the "main" 
 thread is valid.  If not, the best integrated approach would probably be 
 to generate proxy thread objects or to add some other means of unmanaged 
 threads to be suspended/inspected.  The safest approach for now, 
 however, would likely be the producer/consumer method I mentioned in my 
 prior post.

The producer/consumer method looks like a good solution and is not too far from the design I'm currently using in C++.

I would still prefer a generic solution for these situations. My plug-in is a special case, but if D doesn't have a means to interact with foreign threads, it will be a big limitation for any DLL written in D and meant to be used from languages other than D.

If I understand correctly, being able to block the foreign thread is not enough, because the GC must also be able to inspect the thread stack. If that's the case, your proxy thread idea is probably the cleanest solution. I'll see if I can manage to implement it (having something like that in Tango would be fantastic).

Thanks for your help.
Mar 20 2007
next sibling parent reply "Chris Warwick" <sp m.me.not> writes:
"will75g" <will75g yahoo.it> wrote in message 
news:eto8v2$1fft$1 digitalmars.com...
 If I understand correctly being able to block the foreign thread is not 
 enough, because the GC must also be able to inspect the thread stack. If 
 that's the case, your proxy thread idea is probably the cleanest solution. 
 I'll see if I can manage to implement it (having something like that in 
 Tango would be fantastic).

 Thanks for your help.

I assume you are doing VST plugins? In any case I've played around with that a bit (not in D), so I'd be interested in hearing how you get on with it ;-)

cheers,

cw
Mar 20 2007
parent will75g <will75g yahoo.it> writes:
Chris Warwick wrote:
 I assume you are doing VST plugins? In any case I've played around with 
 that a bit (not in D), so I'd be interested in hearing how you get on 
 with it ;-)

Yes, it's a VST plug-in. Currently I'm only evaluating how feasible it is to create audio plug-ins in D, and this GC issue is the first roadblock, so unfortunately there isn't much I can say.
Mar 20 2007
prev sibling parent Sean Kelly <sean f4.ca> writes:
will75g wrote:
 
 If I understand correctly being able to block the foreign thread is not 
 enough, because the GC must also be able to inspect the thread stack.

Yes. And the thread which initiates the collection doesn't need to be blocked (since it's running the collection) but its stack still must be inspected, assuming it holds references to GCed data.
 If
 that's the case, your proxy thread idea is probably the cleanest 
 solution. I'll see if I can manage to implement it (having something 
 like that in Tango would be fantastic).

To do this, I suggest looking at how the proxy for the main thread is set up in std.thread.Thread.thread_init(). Tango does something similar in thread_init() within tango.core.Thread.

Let me think about this a bit more to make sure I can't think of any weird problems with the approach, and if not then I'll add the feature to Tango.

Sean
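
For readers arriving later: present-day druntime exposes this kind of proxy registration as `thread_attachThis` and `thread_detachThis` in `core.thread`. The sketch below assumes that API (it did not exist in the Phobos or Tango versions discussed here), and the `plugin_*` entry points and the `ensureAttached` helper are hypothetical names.

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Thread-local flag (module variables are thread-local in D2): set only
// on threads that this module attached itself.
private bool attachedHere;

/// Registers the calling thread with the D runtime if it is unknown.
/// Returns true when this call performed the attachment.
bool ensureAttached()
{
    if (Thread.getThis() !is null)
        return false;        // already a D thread, or attached earlier
    thread_attachThis();     // the GC can now suspend and scan this thread
    attachedHere = true;
    return true;
}

// Hypothetical host callback that may arrive on any host thread.
extern (C) int plugin_on_gui_event(int code)
{
    ensureAttached();
    auto scratch = new int[](16);   // GC allocation from a registered thread
    scratch[0] = code;
    return scratch[0];
}

// Hypothetical hook for the host to call before one of its threads exits.
extern (C) void plugin_on_thread_exit()
{
    if (attachedHere)
    {
        thread_detachThis();        // drop the proxy created above
        attachedHere = false;
    }
}
```

Attaching means the thread will be suspended during collections, so the real-time audio thread is probably better left unattached and kept free of GC references.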
Mar 20 2007