www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - D2 weak references

reply Jason House <jason.james.house gmail.com> writes:
Does anyone have a weak reference library for D2?

Without that, hash tables and search trees don't mix. I'm hoping that a weak
reference library can be merged into druntime. 
Apr 15 2009
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Jason House wrote:
 Does anyone have a weak reference library for D2?
 
 Without that, hash tables and search trees don't mix. I'm hoping that 
 a weak reference library can be merged into druntime.
Weak references don't fit well into D at the moment. It's been talked about before (albeit not in reference to D2 in particular): http://www.digitalmars.com/d/archives/digitalmars/D/713.html http://www.digitalmars.com/d/archives/digitalmars/D/Weak_references._69761.html and probably others. But I think that, if D gained weak references as a built-in feature it would work well and be useful. A decent library implementation may otherwise be possible if D's GC API gains: - an explicit pinning mechanism - a means of creating listeners on the GC Stewart.
Apr 15 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Stewart Gordon Wrote:

 Jason House wrote:
 Does anyone have a weak reference library for D2?
 
 Without that, hash tables and search trees don't mix. I'm hoping that 
 a weak reference library can be merged into druntime.
Weak references don't fit well into D at the moment. It's been talked about before (albeit not in reference to D2 in particular): http://www.digitalmars.com/d/archives/digitalmars/D/713.html http://www.digitalmars.com/d/archives/digitalmars/D/Weak_references._69761.html and probably others.
I participated in the more recent of those threads. I feel like I have a great application for weak pointers. The best manual memmory management of search trees and hashtables is to do mark and sweep. That should sound awfully familiar...
 But I think that, if D gained weak references as a built-in feature it 
 would work well and be useful.  A decent library implementation may 
 otherwise be possible if D's GC API gains:
 - an explicit pinning mechanism
 - a means of creating listeners on the GC
 
 Stewart.
Everything is pinned at the moment and Tango's GC (and therefore druntime's GC) has an explicit notification mechanism. The big advantage to having a weak ref implementation in the runtime is that it'll evolve with the GC as well as pool people's bug fixes / ports.
Apr 15 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :( The best design I've come up with is to make a helper class weakReferencable(T) that will add a finalizer to fix any outstanding weak references. I'd then use a double linked list of structs as weak references. As long as collections are are single threaded, it should be pretty easy to make work. Introducing a new type besides just a weak reference seems like a bit of a hack. I'm hoping others will have suggestions on how to do that better.
Apr 18 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
Jason House wrote:
 Jason House wrote:
 
 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
Apr 19 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
How are these events dispatched? I remember a discussion about race conditions if these events were fired off after all threads had been resumed; you could have the event zero out the weak ref AFTER something else had dereferenced it. If that's still a possibility, maybe you could add a case where if the delegate's funcptr is null, it just assumes the context pointer points to a void*.sizeof word and zeroes it out. You could do that without resuming threads. -- Daniel
Apr 19 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
How are these events dispatched? I remember a discussion about race conditions if these events were fired off after all threads had been resumed; you could have the event zero out the weak ref AFTER something else had dereferenced it.
It's the responsibility of the person making the weak ref to make it thread-safe if that's a design consideration. If the cleanup is performed before the threads are restarted and the weak ref (or signal/slot mechanism) uses mutexes then a deadlock is possible. At least this approach provides the option for correct code to actually work.
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.
This is all tied into the finalizer. To do what you suggest the GC would have to perform two sweeps--one before resuming threads and one after.
Apr 20 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Sean Kelly Wrote:

 Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
How are these events dispatched? I remember a discussion about race conditions if these events were fired off after all threads had been resumed; you could have the event zero out the weak ref AFTER something else had dereferenced it.
It's the responsibility of the person making the weak ref to make it thread-safe if that's a design consideration. If the cleanup is performed before the threads are restarted and the weak ref (or signal/slot mechanism) uses mutexes then a deadlock is possible. At least this approach provides the option for correct code to actually work.
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.
This is all tied into the finalizer. To do what you suggest the GC would have to perform two sweeps--one before resuming threads and one after.
If I understand you correctly, a weak reference library needs to query the GC to see if an object is marked for collection. How do I do this?
Apr 20 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Jason House Wrote:

 Sean Kelly Wrote:
 
 Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
How are these events dispatched? I remember a discussion about race conditions if these events were fired off after all threads had been resumed; you could have the event zero out the weak ref AFTER something else had dereferenced it.
It's the responsibility of the person making the weak ref to make it thread-safe if that's a design consideration. If the cleanup is performed before the threads are restarted and the weak ref (or signal/slot mechanism) uses mutexes then a deadlock is possible. At least this approach provides the option for correct code to actually work.
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.
This is all tied into the finalizer. To do what you suggest the GC would have to perform two sweeps--one before resuming threads and one after.
If I understand you correctly, a weak reference library needs to query the GC to see if an object is marked for collection. How do I do this?
Here's the implementation I'm currently thinking of: weak ref construction: Copy pointer into struct without casting Call rt_attach_dispose_event weak ref dereference: copy value into temp var ** query gc to see if garbage ** If not garbage, Recopy value and return If garbage set value = null dispose of weakly referenced obj: set value = null The emphasized line is impossible according to the publicized gc interface. Given the delayed finalization, I don't see any way around it.
Apr 20 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is
marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc interface.
Given the delayed
finalization, I don't see any way around it. You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all. Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
Apr 20 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Sean Kelly Wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is
marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc interface.
Given the delayed
finalization, I don't see any way around it. You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all. Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the following steps: 1. Pause all threads 2. Scan objects to clear marks 3. Scan objects to add marks 4. Resume all threads 5. Scan objects, if no mark then... 5a. Call object finalizer 5b. Free memory for later use Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen. I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.
Apr 20 2009
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.
Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even
though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.
Oh I see what you're saying. So this could happen: 4. The GC resume all threads 4a. Another thread derefs the WeakRef 4b. This thread checks to see if the GC has notified that the object is being collected. No notification, so it uses the object. 5. The GC scans its memory and finalizes the object 6. The other thread tries to use the object and explodes You're right, it sounds like you'd need to add a step between 4a and 4b where the WeakRef notifies the GC that the object it's using shouldn't be collected (if a collection is in progress). In fact, all you really need to do is acquire the GCs mutex for a moment. Once you've acquired it then you know that any collection in progress must be complete (assuming of course that the GC uses a mutex to prevent corruption). The horrible thing about all this is it makes using a multithreaded WeakRef horribly slow. Even if the GC facility were provided for what you want to do you're talking about acquiring a mutex in the GC for every deref operation. What a mess.
Apr 20 2009
next sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. =A0Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. =A0What a mess.
Then take a hint from D2 and make them thread-local by default ;) But provide a SharedWeakRef for the cases where you need it.
Apr 20 2009
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jarrett Billingsley (jarrett.billingsley gmail.com)'s article
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.
Then take a hint from D2 and make them thread-local by default ;) But provide a SharedWeakRef for the cases where you need it.
Right. These problems all go away if the object is not shared.
Apr 20 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Jarrett Billingsley Wrote:

 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:
 
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.
Then take a hint from D2 and make them thread-local by default ;) But provide a SharedWeakRef for the cases where you need it.
I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread local won't help. Anything that stops the sweep from occurring in parallel with a thread using the data bypasses this issue.
Apr 20 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Jarrett Billingsley Wrote:
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.
Then take a hint from D2 and make them thread-local by default ;) But provide a SharedWeakRef for the cases where you need it.
I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread local won't help.
Anything that stops the sweep from occurring in parallel with a thread using the data bypasses this issue. Yeah. The unfortunate tradeoff is that finalizers are run while all these app threads are suspended, and if these finalizers enter a synchronized block they're likely to deadlock. This happened enough with the Phobos GC that I was motivated to make the change in the first place. It's unfortunate, but at least WeakRef is the only victim of the current approach that I'm aware of.
Apr 20 2009
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-04-20 18:31:28 -0400, Sean Kelly <sean invisibleduck.org> said:

 Yeah.  The unfortunate tradeoff is that finalizers are run while all these
 app threads are suspended, and if these finalizers enter a synchronized
 block they're likely to deadlock.  This happened enough with the Phobos
 GC that I was motivated to make the change in the first place.  It's
 unfortunate, but at least WeakRef is the only victim of the current
 approach that I'm aware of.
Then can't you notify the weak refs, restart the world, and then finalize the objects for real? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Apr 20 2009
parent Leandro Lucarella <llucax gmail.com> writes:
Michel Fortin, el 20 de abril a las 18:53 me escribiste:
 On 2009-04-20 18:31:28 -0400, Sean Kelly <sean invisibleduck.org> said:
 
Yeah.  The unfortunate tradeoff is that finalizers are run while all these
app threads are suspended, and if these finalizers enter a synchronized
block they're likely to deadlock.  This happened enough with the Phobos
GC that I was motivated to make the change in the first place.  It's
unfortunate, but at least WeakRef is the only victim of the current
approach that I'm aware of.
Then can't you notify the weak refs, restart the world, and then finalize the objects for real?
I've not followed this thread very closely but I think that would imply that the sweep phase had to run with all threads suspended, which is a bad idea unless you *really* enjoy pauses =) -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Karma police arrest this man, he talks in maths, he buzzes like a fridge, he's like a detuned radio.
Apr 20 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Sean Kelly wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jarrett Billingsley Wrote:
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org>
 wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. �Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. �What a mess.
Then take a hint from D2 and make them thread-local by default ;) But provide a SharedWeakRef for the cases where you need it.
I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread local won't help.
Anything that stops the sweep from occurring in parallel with a thread using the data bypasses this issue. Yeah. The unfortunate tradeoff is that finalizers are run while all these app threads are suspended, and if these finalizers enter a synchronized block they're likely to deadlock. This happened enough with the Phobos GC that I was motivated to make the change in the first place. It's unfortunate, but at least WeakRef is the only victim of the current approach that I'm aware of.
Is it possible for druntime to expose an option as to whether or not the sweep phase / finalizers should run while all threads are suspended? It'd really suck to implement some kind of mark and sweep garbage collection just to compensate for how the standard mark and sweep garbage collector is implemented. Another alternative I came up with is notification when the GC clears or sets a mark on an object. That would allow the weak ref to know that its information is suspect. That approach would only required that the mark phase is done with all threads stopped. If you like either of these ideas, I could attempt a patch to add the functionality.
Apr 21 2009
next sibling parent reply grauzone <none example.net> writes:
 Is it possible for druntime to expose an option as to whether or not the 
 sweep phase / finalizers should run while all threads are suspended?
But then you can only do boring stuff in the finalizer. Calling any C or library function must be avoided, because it can lead to deadlocks. At the same time, your code is called at arbitrary times, and anything you do will cause race conditions by default. It's like a UNIX signal handler: programming in it sucks so much. Maybe the finalizer register function could take two callbacks: one called before threads are resumed, and one after threads are resumed. You could use the first callback to zero out the weak reference, while the second takes care of advanced functionality, like notifying someone that the reference has been collected.
Apr 21 2009
parent Jason House <jason.james.house gmail.com> writes:
grauzone Wrote:

 Is it possible for druntime to expose an option as to whether or not the 
 sweep phase / finalizers should run while all threads are suspended?
But then you can only do boring stuff in the finalizer. Calling any C or library function must be avoided, because it can lead to deadlocks. At the same time, your code is called at arbitrary times, and anything you do will cause race conditions by default. It's like a UNIX signal handler: programming in it sucks so much.
I've never had this kind of issue with finalizers, so I have limited appreciation for the argument. In the application I'm working on, I have no need for any functionality in this area except weak reference support.
 Maybe the finalizer register function could take two callbacks: one 
 called before threads are resumed, and one after threads are resumed. 
 You could use the first callback to zero out the weak reference, while 
 the second takes care of advanced functionality, like notifying someone 
 that the reference has been collected.
Wouldn't that implicitly require two sweep phases? In my last post, I suggested the possibiluty of a callback in the mark phase. At the moment, I'm thinking one callback at the start of collection and another callback when marking a weakly referenced object would do the trick. Any weakly referenced object not marked after the last collection is invalid
Apr 22 2009
prev sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.
I think instead of a notification mechanism access to the mark bit can be provided. The mark phase is ran with all thread paused, so you don't need any notifications in that phase because nobody can ask the weak ref object for the underlaying pointer then. When threads are resumed, the mark phase is complete, so the invariant of all live objects having the mark bit set and all the garbage having the bit unset should hold. If the GC guarantee that the mark bit is set *always* when a object is live (the mark bit clearing should be done with all threads stopped and that should be it), except only when the mark phase is running, then, a weak ref implementation can look at the mark bit to safely see if the reference is garbage or not (even before the sweep phase find it and call the finalizer). I think this approach is both simple and efficient, it doesn't have races at all (unless I'm missing something, of course). The only problem is it's too tied to the current GC implementation. But I think any weak ref implementation will, so it looks like weak ref really belong to the GC. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- no longer afraid of the dark or midday shadows nothing so ridiculously teenage and desperate, nothing so childish - at a better pace, slower and more calculated, no chance of escape,
Apr 22 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.
I think instead of a notification mechanism access to the mark bit can be provided. The mark phase is ran with all thread paused, so you don't need any notifications in that phase because nobody can ask the weak ref object for the underlaying pointer then. When threads are resumed, the mark phase is complete, so the invariant of all live objects having the mark bit set and all the garbage having the bit unset should hold. If the GC guarantee that the mark bit is set *always* when a object is live (the mark bit clearing should be done with all threads stopped and that should be it), except only when the mark phase is running, then, a weak ref implementation can look at the mark bit to safely see if the reference is garbage or not (even before the sweep phase find it and call the finalizer). I think this approach is both simple and efficient, it doesn't have races at all (unless I'm missing something, of course). The only problem is it's too tied to the current GC implementation. But I think any weak ref implementation will, so it looks like weak ref really belong to the GC.
Unfortunately, that is not efficient. The state used by the GC is in flux during the sweep phase, so there is no easy lockless way to access that data. Having to acquire a lock on every weakref dereference absolutely kills performance. I'm definitely open to hearing more ideas on how to efficiently implement.
Apr 22 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 22 de abril a las 11:32 me escribiste:
 Leandro Lucarella Wrote:
 
 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.
I think instead of a notification mechanism access to the mark bit can be provided. The mark phase is ran with all thread paused, so you don't need any notifications in that phase because nobody can ask the weak ref object for the underlaying pointer then. When threads are resumed, the mark phase is complete, so the invariant of all live objects having the mark bit set and all the garbage having the bit unset should hold. If the GC guarantee that the mark bit is set *always* when a object is live (the mark bit clearing should be done with all threads stopped and that should be it), except only when the mark phase is running, then, a weak ref implementation can look at the mark bit to safely see if the reference is garbage or not (even before the sweep phase find it and call the finalizer). I think this approach is both simple and efficient, it doesn't have races at all (unless I'm missing something, of course). The only problem is it's too tied to the current GC implementation. But I think any weak ref implementation will, so it looks like weak ref really belong to the GC.
Unfortunately, that is not efficient. The state used by the GC is in flux during the sweep phase, so there is no easy lockless way to access that data. Having to acquire a lock on every weakref dereference absolutely kills performance.
The sweep phase don't touch mark bits. I can't possible understand why you should synchronize access to a (temporarily) read-only variable. Mark bits are only written when: 1) Clearing all the mark bits before marking (this is done with all threads running now, and this is what it should be changed to be done when all threads are suspended) 2) Marking (threads are suspended in the current implementation)
 I'm definitely open to hearing more ideas on how to efficiently
 implement.
The only performance penalty of my proposal would be incrementing the time the world is stooped a bit (adding the marks cleaning phase there). I think is a fair price to pay (being that mark bits are implemented using bitmaps, it should be pretty fast resetting all bits to 0) for an easy and safe weak references implementations. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- ELLA FUE INFIEL, PERO EX POLOLO PAGĂ“ -- TV CHILE
Apr 22 2009
next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Leandro Lucarella, el 22 de abril a las 14:19 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.
The sweep phase don't touch mark bits. I can't possible understand why you should synchronize access to a (temporarily) read-only variable. Mark bits are only written when: 1) Clearing all the mark bits before marking (this is done with all threads running now, and this is what it should be changed to be done when all threads are suspended) 2) Marking (threads are suspended in the current implementation)
 I'm definitely open to hearing more ideas on how to efficiently
 implement.
The only performance penalty of my proposal would be incrementing the time the world is stooped a bit (adding the marks cleaning phase there). I think is a fair price to pay (being that mark bits are implemented using bitmaps, it should be pretty fast resetting all bits to 0) for an easy and safe weak references implementations.
I forgot to mention that at allocation time, the mark bit should be set to keep the invariant that all live data has the mark bit set. I don't think this should impose a very big performance penalty either. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Ideas claras, libertad y patriotismo, para mejorar la calidad de vida; claro que todo con mucha calidad, porque, joven, ideas claras, libertad o patriotismo tiene cualquiera. -- Ricardo Vaporeso
Apr 22 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 22 de abril a las 11:32 me escribiste:
 Leandro Lucarella Wrote:
 
 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.
I think instead of a notification mechanism access to the mark bit can be provided. The mark phase is ran with all thread paused, so you don't need any notifications in that phase because nobody can ask the weak ref object for the underlaying pointer then. When threads are resumed, the mark phase is complete, so the invariant of all live objects having the mark bit set and all the garbage having the bit unset should hold. If the GC guarantee that the mark bit is set *always* when a object is live (the mark bit clearing should be done with all threads stopped and that should be it), except only when the mark phase is running, then, a weak ref implementation can look at the mark bit to safely see if the reference is garbage or not (even before the sweep phase find it and call the finalizer). I think this approach is both simple and efficient, it doesn't have races at all (unless I'm missing something, of course). The only problem is it's too tied to the current GC implementation. But I think any weak ref implementation will, so it looks like weak ref really belong to the GC.
Unfortunately, that is not efficient. The state used by the GC is in flux during the sweep phase, so there is no easy lockless way to access that data. Having to acquire a lock on every weakref dereference absolutely kills performance.
The sweep phase don't touch mark bits. I can't possible understand why you should synchronize access to a (temporarily) read-only variable.
I'm mostly regurgitating what Sean Kelly said. I know that if the sweep deallocates at an object, the bits marking it become invalid. I don't disagree that reading the bits without a lock should be possible, but I also trust Sean's statement that it isn't easy. I'm trying to find something that doesn't require reworking or rewriting large parts of a garbage collector.
 Mark bits are only written when:
 1) Clearing all the mark bits before marking (this is done with all
    threads running now, and this is what it should be changed to be done
    when all threads are suspended)
 2) Marking (threads are suspended in the current implementation)
Is that what it does? I realized recently that one could simple toggle a mark bit and avoid a clearing phase altogether.
 I'm definitely open to hearing more ideas on how to efficiently
 implement.
The only performance penalty of my proposal would be incrementing the time the world is stooped a bit (adding the marks cleaning phase there). I think is a fair price to pay (being that mark bits are implemented using bitmaps, it should be pretty fast resetting all bits to 0) for an easy and safe weak references implementations. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- ELLA FUE INFIEL, PERO EX POLOLO PAGĂ“ -- TV CHILE
Apr 22 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 22 de abril a las 13:31 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.
The sweep phase don't touch mark bits. I can't possible understand why you should synchronize access to a (temporarily) read-only variable.
I'm mostly regurgitating what Sean Kelly said. I know that if the sweep deallocates at an object, the bits marking it become invalid. I don't disagree that reading the bits without a lock should be possible, but I also trust Sean's statement that it isn't easy. I'm trying to find something that doesn't require reworking or rewriting large parts of a garbage collector.
Sean's idea was, if I recall correctly, to move the sweep phase into the stop-the-world. My idea is different, and it doesn't need rewriting any large part of the collector. I think the change is pretty small (and can be easily applied to D1/Tango too). I'm looking at the Tango code right now, and you know what? The unmarking is already done when all threads are paused, so all you need for this to work is a method to expose the mark bit. This can be really easily done by extending the BlkAttr defined values (or using a reserved value if this should not be a standard attribute) and ensuring that the mark bit is set when new pools/bins are allocated. This patch (Tango) should do that (untested): ------>8------>8------>8------>8------>8------>8------>8------>8------>8------ diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d index a760465..fd21d44 100644 --- a/lib/gc/basic/gcx.d +++ b/lib/gc/basic/gcx.d -81,6 +81,7 private NO_SCAN = 0b0000_0010, NO_MOVE = 0b0000_0100, ALL_BITS = 0b1111_1111 + MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit } extern (C) void* rt_stackBottom(); -519,7 +520,14 class GC Pool *pool = gcx.findPool(p); assert(pool); - gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits); + uint biti = cast(size_t)(p - pool.baseAddr) / 16; + gcx.setBits(pool, biti, bits); + // Set the mark bit to maintain the invariant that live objects + // always have the mark bit set, except when the collector is in + // the mark phase (useful for weak reference implementations) + // This is not done in setBits() to keep this bit read-only for + // the user. + pool.mark.set(biti) } return p; } -2576,6 +2584,8 struct Gcx // if (pool.nomove.nbits && // pool.nomove.test(biti)) // bits |= BlkAttr.NO_MOVE; + if (pool.mark.test(biti)) + bits |= BlkAttr.MARK_BIT; return bits; } ------8<------8<------8<------8<------8<------8<------8<------8<------8<------ This implementation use an "extension" (according to Druntime specs) bit to return the mark bit when querying objects attributes. The mark bit is kept read-only, but I'm not sure about that (there are already plenty of ways to break the GC now, so I don't think security is a good enough reason to ban the user from doing nasty things ;) If you try it, please tell me how it went.
 Mark bits are only written when:
 1) Clearing all the mark bits before marking (this is done with all
    threads running now, and this is what it should be changed to be done
    when all threads are suspended)
 2) Marking (threads are suspended in the current implementation)
Is that what it does? I realized recently that one could simple toggle a mark bit and avoid a clearing phase altogether.
Yes, but that would mean to touch all the mark bits in the sweep phase, which can have bad cache behaviour. But being a bitmap that should be fairly cheap. But since this involve bit manipulation operations, and clearing the bitmap can be done very efficiently clearing complete words in one instruction, I don't see a clear gain in this. I guess just benchmarks can have a final word on this, but I wouldn't expect much differences in performance for both approaches. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- More people die from a champagne-cork popping, than from poison spiders
Apr 22 2009
parent Leandro Lucarella <llucax gmail.com> writes:
Leandro Lucarella, el 22 de abril a las 16:08 me escribiste:
 Jason House, el 22 de abril a las 13:31 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.
The sweep phase don't touch mark bits. I can't possible understand why you should synchronize access to a (temporarily) read-only variable.
I'm mostly regurgitating what Sean Kelly said. I know that if the sweep deallocates at an object, the bits marking it become invalid. I don't disagree that reading the bits without a lock should be possible, but I also trust Sean's statement that it isn't easy. I'm trying to find something that doesn't require reworking or rewriting large parts of a garbage collector.
Sean's idea was, if I recall correctly, to move the sweep phase into the stop-the-world. My idea is different, and it doesn't need rewriting any large part of the collector. I think the change is pretty small (and can be easily applied to D1/Tango too). I'm looking at the Tango code right now, and you know what? The unmarking is already done when all threads are paused, so all you need for this to work is a method to expose the mark bit. This can be really easily done by extending the BlkAttr defined values (or using a reserved value if this should not be a standard attribute) and ensuring that the mark bit is set when new pools/bins are allocated. This patch (Tango) should do that (untested): ------>8------>8------>8------>8------>8------>8------>8------>8------>8------ diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d index a760465..fd21d44 100644 --- a/lib/gc/basic/gcx.d +++ b/lib/gc/basic/gcx.d -81,6 +81,7 private NO_SCAN = 0b0000_0010, NO_MOVE = 0b0000_0100, ALL_BITS = 0b1111_1111 + MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit } extern (C) void* rt_stackBottom(); -519,7 +520,14 class GC Pool *pool = gcx.findPool(p); assert(pool); - gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits); + uint biti = cast(size_t)(p - pool.baseAddr) / 16; + gcx.setBits(pool, biti, bits); + // Set the mark bit to maintain the invariant that live objects + // always have the mark bit set, except when the collector is in + // the mark phase (useful for weak reference implementations) + // This is not done in setBits() to keep this bit read-only for + // the user. + pool.mark.set(biti) } return p; } -2576,6 +2584,8 struct Gcx // if (pool.nomove.nbits && // pool.nomove.test(biti)) // bits |= BlkAttr.NO_MOVE; + if (pool.mark.test(biti)) + bits |= BlkAttr.MARK_BIT; return bits; } ------8<------8<------8<------8<------8<------8<------8<------8<------8<------
Woops! I already spotted a bug there =). The mark bit was only set if the malloc was done passing bits. The solution has a performance penalty of a "extra" pool lookup for malloc where attr == 0. This is the fixed patch: ------>8------>8------>8------>8------>8------>8------>8------>8------>8------ diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d index a760465..940bc71 100644 --- a/lib/gc/basic/gcx.d +++ b/lib/gc/basic/gcx.d -81,6 +81,7 private NO_SCAN = 0b0000_0010, NO_MOVE = 0b0000_0100, ALL_BITS = 0b1111_1111 + MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit } extern (C) void* rt_stackBottom(); -514,13 +515,18 class GC sentinel_init(p, size); gcx.log_malloc(p, size); - if (bits) - { - Pool *pool = gcx.findPool(p); - assert(pool); + Pool *pool = gcx.findPool(p); + assert(pool); - gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits); - } + uint biti = cast(size_t)(p - pool.baseAddr) / 16; + if (bits) + gcx.setBits(pool, biti, bits); + // Set the mark bit to maintain the invariant that live objects + // always have the mark bit set, except when the collector is in + // the mark phase (useful for weak reference implementations) + // This is not done in setBits() to keep this bit read-only for + // the user. + pool.mark.set(biti) return p; } -2576,6 +2582,8 struct Gcx // if (pool.nomove.nbits && // pool.nomove.test(biti)) // bits |= BlkAttr.NO_MOVE; + if (pool.mark.test(biti)) + bits |= BlkAttr.MARK_BIT; return bits; } ------8<------8<------8<------8<------8<------8<------8<------8<------8<------ -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Bulletproof... I Wish I Was
Apr 22 2009
prev sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.
I think you're missing something. Consider this sequence of events: - Object A is collected by the GC. (And its mark bit M is cleared) - Object B is allocated, and gets the memory formerly used by object A. Mark bit M is set. - A weak reference to A gets used; it checks mark bit M, which is set... In other words, the "this object was collected" state needs to be stored in the weak reference, not in the object that got collected (or anything tied to its memory location) because that memory may get reused.
Apr 22 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Frits van Bommel Wrote:

 Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.
I think you're missing something. Consider this sequence of events: - Object A is collected by the GC. (And its mark bit M is cleared) - Object B is allocated, and gets the memory formerly used by object A. Mark bit M is set. - A weak reference to A gets used; it checks mark bit M, which is set... In other words, the "this object was collected" state needs to be stored in the weak reference, not in the object that got collected (or anything tied to its memory location) because that memory may get reused.
That's where finalizers come in. They can mark the weak ref as invalid so that you don't care what was returned. In my (recent?) pseudo code, the weak ref was decoded, the mark bit was checked, and then the weak ref was decoded again. The final decoding is to solve the race with the finalizer. Even with such precautions, a lock free implementation has one more problem: what happens if the memory block was released to the os? That's a segfault :( if other data is checked to avoid such situations, that's likely where gc locks come into the picture... I'm pretty sure Sean said querying the attribute bits requires a lock. It may be for this reason. I still have to look at the code though...
Apr 22 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Jason House wrote:
 Frits van Bommel Wrote:
 
 Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.

 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.

 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).

 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.
I think you're missing something. Consider this sequence of events: - Object A is collected by the GC. (And its mark bit M is cleared) - Object B is allocated, and gets the memory formerly used by object A. Mark bit M is set. - A weak reference to A gets used; it checks mark bit M, which is set... In other words, the "this object was collected" state needs to be stored in the weak reference, not in the object that got collected (or anything tied to its memory location) because that memory may get reused.
That's where finalizers come in. They can mark the weak ref as invalid so that you don't care what was returned. In my (recent?) pseudo code, the weak ref was decoded, the mark bit was checked, and then the weak ref was decoded again. The final decoding is to solve the race with the finalizer.
Will this work if B was allocated in another thread, right after it gets resumed (i.e. possibly before the finalizer has a chance to run)? ... oh, maybe the GC lock gets held even after other threads get resumed, so the allocation will wait for finalizers to run? (just thought of that)
 Even with such precautions, a lock free implementation has one more problem:
what happens if the memory block was released to the os? That's a segfault :(
if other data is checked to avoid such situations, that's likely where gc locks
come into the picture... I'm pretty sure Sean said querying the attribute bits
requires a lock. It may be for this reason. I still have to look at the code
though...
You mean it would segfault on checking the mark bit? Can't the GC just say the mark bit was not set for any pointer not pointing into a currently-allocated pool? But yeah, checking if that address is part of an allocated pool may require holding the GC lock...
Apr 22 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Frits van Bommel, el 22 de abril a las 23:39 me escribiste:
I think this approach is both simple and efficient, it doesn't have races
at all (unless I'm missing something, of course). The only problem is it's
too tied to the current GC implementation. But I think any weak ref
implementation will, so it looks like weak ref really belong to the GC.
I think you're missing something. Consider this sequence of events: - Object A is collected by the GC. (And its mark bit M is cleared) - Object B is allocated, and gets the memory formerly used by object A. Mark bit M is set. - A weak reference to A gets used; it checks mark bit M, which is set... In other words, the "this object was collected" state needs to be stored in the weak reference, not in the object that got collected (or anything tied to its memory location) because that memory may get reused.
That's where finalizers come in. They can mark the weak ref as invalid so that you don't care what was returned. In my (recent?) pseudo code, the weak ref was decoded, the mark bit was checked, and then the weak ref was decoded again. The final decoding is to solve the race with the finalizer.
Will this work if B was allocated in another thread, right after it gets resumed (i.e. possibly before the finalizer has a chance to run)? ... oh, maybe the GC lock gets held even after other threads get resumed, so the allocation will wait for finalizers to run? (just thought of that)
That's exactly what happens. So I'm not seeing any flaws so far with my proposal.
Even with such precautions, a lock free implementation has one more
problem: what happens if the memory block was released to the os?
That's a segfault :( if other data is checked to avoid such situations,
that's likely where gc locks come into the picture... I'm pretty sure
Sean said querying the attribute bits requires a lock. It may be for
this reason. I still have to look at the code though...
You mean it would segfault on checking the mark bit? Can't the GC just say the mark bit was not set for any pointer not pointing into a currently-allocated pool? But yeah, checking if that address is part of an allocated pool may require holding the GC lock...
In the patch I posted this is already happening. The mark bit is returned as part of the BlkAttr belonging to a memory block. If the pointer passed to gc_getAttr() (or gc_query()) don't point to a GC allocated memory block, 0 is returned. I any case, I can't see how that could happen if there are still notifications before freeing the memory block in the sweep phase. The sequence is like this: 0) All live cells has the mark bit set 1) Acquire the GC lock 2) Stop the world 3) Clear mark bits 4) Mark 5) Start the world 6) Sweep (calling finalizing notification callbacks) 7) Release the GC lock Then, you are guaranteed to access to valid memory (mark bit or even the entire memory block) until the notification comes. When the weak reference receive the notification it invalidates the reference and don't rely anymore in the mark bit. But I just found a flaw in my patch (not the proposal), right now, as you say, the gc_query() and gc_getAttr() functions are acquiring the GC lock. Some non-locking interface should be exposed that calls to queryNoSync() and getAttrNoSync(). It should be safe for this limited usage as long as there are not concurrent calls to setAttr()/clrAttr() to that memory block (I think). -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Instead of doing a wash, I just keep buying underwear. My goal is to have over 360 pair. That way I only have to do wash once a year. -- George Constanza
Apr 22 2009
parent reply Jason House <jason.james.house gmail.com> writes:
I'm finally looking at the gc source code.  From what I can tell, memory is 
allocated in large chunks (pools), and divided into 16 byte chunks.  
Attributes are not all packed together like the BlkInfo of the spec implies, 
so some so access to that stuff is safer than I would have thought.

Here's a weak ref implementation that I just whipped together under the 
assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
haven't scoured the gc's code to ensure everything is legit.  I declared it 
as a class mostly because value semantics don't make much sense and I'm too 
lazy/tired to do the sample code as a 100% correct struct definition.

My two biggest concerns: 

1. Is the gc lock held while running a constructor? If not and it's needed 
to do what it needs, where can the hook be added to make the construction of 
a weak reference special?

2. Will the assignment to noscan get overwritten / hit a race condition?  It 
is possible to hide the pointer inside a size_t instead, but given the 
intimate tying to the GC such tricks just make the code tougher to read.

class weakRef(T) if (is(T==class))
{
    this(T _t){
        // set up tracking for the weakly referenced object
        t = _t;
        pool = findPool(t);
        rt_attachDisposeEvent(&t, &sweep);

        // tell the gc to not scan the weak reference
        weakRefPool = findPool(&this);
        weakRefPool.noscan[index(weakRefPool,&this)] = true;
    }

    T opCall(){
        T tmp = t; // force future conservative GC scan to find the pointer
        if ( (tmp is null) || (pool.mark[index] == false) ){
            t = null;
            return null;
        }
        tmp = t; // avoid race where t was replaced with new data
        return tmp;
    }

private:
    T t;
    Pool* pool;
  
    size_t index(Pool *p, byte *element){ return (element-p)/16; }
    size_t index(){ return index(pool,&t); }

    void sweep( Object o ){
        t = null;
        pool = null;
    }
}


Comments and criticism are welcome.







Leandro Lucarella wrote:

 Frits van Bommel, el 22 de abril a las 23:39 me escribiste:
I think this approach is both simple and efficient, it doesn't have
races at all (unless I'm missing something, of course). The only
problem is it's too tied to the current GC implementation. But I think
any weak ref implementation will, so it looks like weak ref really
belong to the GC.
I think you're missing something. Consider this sequence of events: - Object A is collected by the GC. (And its mark bit M is cleared) - Object B is allocated, and gets the memory formerly used by object A. Mark bit M is set. - A weak reference to A gets used; it checks mark bit M, which is set... In other words, the "this object was collected" state needs to be stored in the weak reference, not in the object that got collected (or anything tied to its memory location) because that memory may get reused.
That's where finalizers come in. They can mark the weak ref as invalid so that you don't care what was returned. In my (recent?) pseudo code, the weak ref was decoded, the mark bit was checked, and then the weak ref was decoded again. The final decoding is to solve the race with the finalizer.
Will this work if B was allocated in another thread, right after it gets resumed (i.e. possibly before the finalizer has a chance to run)? ... oh, maybe the GC lock gets held even after other threads get resumed, so the allocation will wait for finalizers to run? (just thought of that)
That's exactly what happens. So I'm not seeing any flaws so far with my proposal.
Even with such precautions, a lock free implementation has one more
problem: what happens if the memory block was released to the os?
That's a segfault :( if other data is checked to avoid such situations,
that's likely where gc locks come into the picture... I'm pretty sure
Sean said querying the attribute bits requires a lock. It may be for
this reason. I still have to look at the code though...
You mean it would segfault on checking the mark bit? Can't the GC just say the mark bit was not set for any pointer not pointing into a currently-allocated pool? But yeah, checking if that address is part of an allocated pool may require holding the GC lock...
In the patch I posted this is already happening. The mark bit is returned as part of the BlkAttr belonging to a memory block. If the pointer passed to gc_getAttr() (or gc_query()) don't point to a GC allocated memory block, 0 is returned. I any case, I can't see how that could happen if there are still notifications before freeing the memory block in the sweep phase. The sequence is like this: 0) All live cells has the mark bit set 1) Acquire the GC lock 2) Stop the world 3) Clear mark bits 4) Mark 5) Start the world 6) Sweep (calling finalizing notification callbacks) 7) Release the GC lock Then, you are guaranteed to access to valid memory (mark bit or even the entire memory block) until the notification comes. When the weak reference receive the notification it invalidates the reference and don't rely anymore in the mark bit. But I just found a flaw in my patch (not the proposal), right now, as you say, the gc_query() and gc_getAttr() functions are acquiring the GC lock. Some non-locking interface should be exposed that calls to queryNoSync() and getAttrNoSync(). It should be safe for this limited usage as long as there are not concurrent calls to setAttr()/clrAttr() to that memory block (I think).
Apr 22 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 22 de abril a las 21:26 me escribiste:
 I'm finally looking at the gc source code.  From what I can tell, memory is 
 allocated in large chunks (pools), and divided into 16 byte chunks.  
 
 Attributes are not all packed together like the BlkInfo of the spec implies, 
 so some so access to that stuff is safer than I would have thought.
I'm sorry you had to go through that, I think this could have save you a lot of time =) http://proj.llucax.com.ar/blog/dgc/blog/tag/understanding%20the%20current%20gc
 Here's a weak ref implementation that I just whipped together under the 
 assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
 haven't scoured the gc's code to ensure everything is legit.  I declared it 
 as a class mostly because value semantics don't make much sense and I'm too 
 lazy/tired to do the sample code as a 100% correct struct definition.
 
 My two biggest concerns: 
 
 1. Is the gc lock held while running a constructor? If not and it's needed 
 to do what it needs, where can the hook be added to make the construction of 
 a weak reference special?
I don't think it's held, I just can't see how it can be held because construction is not done in the GC. If you need to acquire the GC lock I think you can just do: synchronized (GC.gcLock) { // protected stuff } The GC lock is a static member of the GC struct in gcx.d.
 2. Will the assignment to noscan get overwritten / hit a race condition?
I don't think that should happen except if someone calls gc_setAttr/gc_getAttr().
 It is possible to hide the pointer inside a size_t instead, but given the
 intimate tying to the GC such tricks just make the code tougher to read.
Agree. Maybe you can overload the operator new and pass the flags directly at allocation time, but I don't know very well how this is plugged in the runtime/GC, but since it's a regular gc_malloc() call it should be thread-safe. Something like: class weakRef(T) if (is(T==class)) { new(size_t size) { return gc_malloc(size, FINALIZE | NO_SCAN); } this(T _t) { // set up tracking for the weakly referenced object t = _t; pool = findPool(t); rt_attachDisposeEvent(&t, &sweep); } // ... }
 class weakRef(T) if (is(T==class))
 {
     this(T _t){
         // set up tracking for the weakly referenced object
         t = _t;
         pool = findPool(t);
         rt_attachDisposeEvent(&t, &sweep);
 
         // tell the gc to not scan the weak reference
         weakRefPool = findPool(&this);
         weakRefPool.noscan[index(weakRefPool,&this)] = true;
You can do this by calling setAttr(&this, NO_SCAN).
     }
 
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){
pool.mark.test(index) is used in the GC code, I don't know if the array syntax works. And to use the mark bit trick, I think something in the lines of my patch is needed to ensure the mark bit is set for all live objects.
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data
I don't understand exactly what kind of race are you trying to avoid with this (and how this assignation can avoid any race if there is one).
         return tmp;
     }
 
 private:
     T t;
     Pool* pool;
   
     size_t index(Pool *p, byte *element){ return (element-p)/16; }
     size_t index(){ return index(pool,&t); }
 
     void sweep( Object o ){
         t = null;
         pool = null;
     }
 }
-- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Lo último que hay que pensar es que se desalinea la memoria Hay que priorizar como causa la idiotez propia Ya lo tengo asumido -- Pablete, filósofo contemporáneo desconocido
Apr 22 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 22 de abril a las 21:26 me escribiste:
 I'm finally looking at the gc source code.  From what I can tell, memory is 
 allocated in large chunks (pools), and divided into 16 byte chunks.  
 
 Attributes are not all packed together like the BlkInfo of the spec implies, 
 so some so access to that stuff is safer than I would have thought.
I'm sorry you had to go through that, I think this could have save you a lot of time =) http://proj.llucax.com.ar/blog/dgc/blog/tag/understanding%20the%20current%20gc
It really wasn't that bad of a way to familiarize myself. I had enough of an idea about how it coulf work already. It is helpful to compare what I expect in a gc verses what's actually in the gc. I appreciate the link and will make sure to read it.
 Here's a weak ref implementation that I just whipped together under the 
 assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
 haven't scoured the gc's code to ensure everything is legit.  I declared it 
 as a class mostly because value semantics don't make much sense and I'm too 
 lazy/tired to do the sample code as a 100% correct struct definition.
 
 My two biggest concerns: 
 
 1. Is the gc lock held while running a constructor? If not and it's needed 
 to do what it needs, where can the hook be added to make the construction of 
 a weak reference special?
I don't think it's held, I just can't see how it can be held because construction is not done in the GC. If you need to acquire the GC lock I think you can just do: synchronized (GC.gcLock) { // protected stuff } The GC lock is a static member of the GC struct in gcx.d.
Efficiency is a big concern of mine with this because I'm going to use it heavily. Locking twice is undesirable, but I may do it in an initial implementation. A lock on deref is unacceptable though.
 2. Will the assignment to noscan get overwritten / hit a race condition?
I don't think that should happen except if someone calls gc_setAttr/gc_getAttr().
 It is possible to hide the pointer inside a size_t instead, but given the
 intimate tying to the GC such tricks just make the code tougher to read.
Agree. Maybe you can overload the operator new and pass the flags directly at allocation time, but I don't know very well how this is plugged in the runtime/GC, but since it's a regular gc_malloc() call it should be thread-safe.
Interesting idea. I'll look into that.
 Something like:
 
 class weakRef(T) if (is(T==class))
 {
 	new(size_t size)
 	{
 		return gc_malloc(size, FINALIZE | NO_SCAN);
 	}
 	this(T _t)
 	{
 		// set up tracking for the weakly referenced object
 		t = _t;
 		pool = findPool(t);
 		rt_attachDisposeEvent(&t, &sweep);
 	}
 	// ...
 }
 
 
 class weakRef(T) if (is(T==class))
 {
     this(T _t){
         // set up tracking for the weakly referenced object
         t = _t;
         pool = findPool(t);
         rt_attachDisposeEvent(&t, &sweep);
 
         // tell the gc to not scan the weak reference
         weakRefPool = findPool(&this);
         weakRefPool.noscan[index(weakRefPool,&this)] = true;
You can do this by calling setAttr(&this, NO_SCAN).
True.
     }
 
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){
pool.mark.test(index) is used in the GC code, I don't know if the array syntax works. And to use the mark bit trick, I think something in the lines of my patch is needed to ensure the mark bit is set for all live objects.
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data
I don't understand exactly what kind of race are you trying to avoid with this (and how this assignation can avoid any race if there is one).
Consider this sequence: 1. GC runs mark phase and leaves object unmarked. 2. Weakref starts dereferencing and reads its hidden pointer. 3. Sweep calls finalizer and deallocates 4. A new object is allocated in the old location 5. Query for GC mark says we're ok 6. Weakref returns its old copy of the pointer 7. Pointer is used 8. Random behavior
         return tmp;
     }
 
 private:
     T t;
     Pool* pool;
   
     size_t index(Pool *p, byte *element){ return (element-p)/16; }
     size_t index(){ return index(pool,&t); }
 
     void sweep( Object o ){
         t = null;
         pool = null;
     }
 }
-- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Lo último que hay que pensar es que se desalinea la memoria Hay que priorizar como causa la idiotez propia Ya lo tengo asumido -- Pablete, filósofo contemporáneo desconocido
Apr 23 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 23 de abril a las 12:51 me escribiste:
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){
pool.mark.test(index) is used in the GC code, I don't know if the array syntax works. And to use the mark bit trick, I think something in the lines of my patch is needed to ensure the mark bit is set for all live objects.
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data
I don't understand exactly what kind of race are you trying to avoid with this (and how this assignation can avoid any race if there is one).
Consider this sequence: 1. GC runs mark phase and leaves object unmarked. 2. Weakref starts dereferencing and reads its hidden pointer.
Ok, I see now that this can happen because you first store the reference into the stack so it can be seen in future collections. What I'm not sure is if it's really necessary to do that step. What can go wrong with this simple approach? T opCall() { if (!pool.mark.test(index)) t = null; return t; } Weakref only reads the reference *if* the mark bit is set, in which case is safe to assume that the object is live. If the object is garbage, the mark bit is unset and in that case the reference is only *written* (the only race it could happen is the sweep() can be called but in that case the reference is written to null too, so there should be no problem with that, I guess... As long as the test are successful (t is not overwritten to null), the GC should be able to find the reference in the stack when the function returns. I'm probably missing something else...
 3. Sweep calls finalizer and deallocates
 4. A new object is allocated in the old location
 5. Query for GC mark says we're ok
 6. Weakref returns its old copy of the pointer
 7. Pointer is used
 8. Random behavior
-- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ----------------------------------------------------------------------------
Apr 23 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 23 de abril a las 12:51 me escribiste:
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){
pool.mark.test(index) is used in the GC code, I don't know if the array syntax works. And to use the mark bit trick, I think something in the lines of my patch is needed to ensure the mark bit is set for all live objects.
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data
I don't understand exactly what kind of race are you trying to avoid with this (and how this assignation can avoid any race if there is one).
Consider this sequence: 1. GC runs mark phase and leaves object unmarked. 2. Weakref starts dereferencing and reads its hidden pointer.
Ok, I see now that this can happen because you first store the reference into the stack so it can be seen in future collections.
Yeah.
 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }
What happens when a collection runs after test returns? You need to copy t onto the stack in order to be safe.
 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.
How do you check the mark bit? In the sample code, it used the reference... Admittedly, one could store the right info to avoid that, but it hardly matters.
 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...
You're not considering what can happen if the weak ref dereferencing thread stops and the rest of the world wreaks havoc on the state of the world. A collection could run after checking the mark. If one is careful with that, and a copy of t is kept on the stack, will the optimizer reuse it? Or will it reread the value? That matters if the finalizer runs. The mark bit check can fail from reallocation after deletion. I hope that helps...
 3. Sweep calls finalizer and deallocates
 4. A new object is allocated in the old location
 5. Query for GC mark says we're ok
 6. Weakref returns its old copy of the pointer
 7. Pointer is used
 8. Random behavior
-- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ----------------------------------------------------------------------------
Apr 23 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 23 de abril a las 16:48 me escribiste:
 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }
What happens when a collection runs after test returns? You need to copy t onto the stack in order to be safe.
You're right, I definitely underestimated the problem. Now I see that this sequence breaks that implementation: 1) WeakRef.opCall() is called (whick stores a reference to R) 2) The mark bit for R is tested and returns true 3) CPU time is given to other threads with reference to R, which are removed 4) Other thread triggers a collection and the world is stopped 5) R gets unmarked because there are no other references to it 6) The world is started 7) WeakRef.opCall() returns R when it should returned null The problem I see with your original proposal: T opCall(){ T tmp = t; // force future conservative GC scan to find the pointer if ( (tmp is null) || (pool.mark[index] == false) ){ t = null; return null; } tmp = t; // avoid race where t was replaced with new data return tmp; } Is that I guess is very likely that 'tmp' get optimized out. Would declaring it as volatile help? If not, I guess the only way to avoid that race is add a root manually to the GC (or mark the weakref again as no NO_SCAN). But both requires synchronization. =(
 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.
How do you check the mark bit? In the sample code, it used the reference... Admittedly, one could store the right info to avoid that, but it hardly matters.
Right.
 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...
You're not considering what can happen if the weak ref dereferencing thread stops and the rest of the world wreaks havoc on the state of the world. A collection could run after checking the mark. If one is careful with that, and a copy of t is kept on the stack, will the optimizer reuse it? Or will it reread the value? That matters if the finalizer runs. The mark bit check can fail from reallocation after deletion. I hope that helps...
Yes, thank you. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- - Que hacés, ratita? - Espero un ratito...
Apr 25 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 23 de abril a las 16:48 me escribiste:
 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }
What happens when a collection runs after test returns? You need to copy t onto the stack in order to be safe.
You're right, I definitely underestimated the problem. Now I see that this sequence breaks that implementation: 1) WeakRef.opCall() is called (whick stores a reference to R) 2) The mark bit for R is tested and returns true 3) CPU time is given to other threads with reference to R, which are removed 4) Other thread triggers a collection and the world is stopped 5) R gets unmarked because there are no other references to it 6) The world is started 7) WeakRef.opCall() returns R when it should returned null The problem I see with your original proposal: T opCall(){ T tmp = t; // force future conservative GC scan to find the pointer if ( (tmp is null) || (pool.mark[index] == false) ){ t = null; return null; } tmp = t; // avoid race where t was replaced with new data return tmp; } Is that I guess is very likely that 'tmp' get optimized out. Would declaring it as volatile help? If not, I guess the only way to avoid that race is add a root manually to the GC (or mark the weakref again as no NO_SCAN). But both requires synchronization. =(
I would use tango.core.atomic that I took from the Tango D2 branch. Another option is inline asm, but I'd have to research how to do that.
 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.
How do you check the mark bit? In the sample code, it used the reference... Admittedly, one could store the right info to avoid that, but it hardly matters.
Right.
 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...
You're not considering what can happen if the weak ref dereferencing thread stops and the rest of the world wreaks havoc on the state of the world. A collection could run after checking the mark. If one is careful with that, and a copy of t is kept on the stack, will the optimizer reuse it? Or will it reread the value? That matters if the finalizer runs. The mark bit check can fail from reallocation after deletion. I hope that helps...
Yes, thank you.
It's amazingly difficult to get right for something so "simple"
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------
 - Que hacés, ratita?
 - Espero un ratito...
Apr 25 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 25 de abril a las 14:19 me escribiste:
 Is that I guess is very likely that 'tmp' get optimized out. Would
 declaring it as volatile help? If not, I guess the only way to avoid that
 race is add a root manually to the GC (or mark the weakref again as no
 NO_SCAN). But both requires synchronization. =(
I would use tango.core.atomic that I took from the Tango D2 branch. Another option is inline asm, but I'd have to research how to do that.
I don't know how an atomic can help, I don't know much about atomics but I think they don't have to be pointer sized (different architectures provide different atomic types). And in any case you don't want tmp to be atomic, you just don't want it to be optimized out, I think.. I guess asm can never get optimized (right?) so I think that should do. BTW, that's wrong with volatile? I think it should work too.
 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...
Yes, thank you.
It's amazingly difficult to get right for something so "simple"
Indeed. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Wake from your sleep, the drying of your tears, Today we escape, we escape.
Apr 25 2009
parent Jason House <jason.james.house gmail.com> writes:
Leandro Lucarella Wrote:

 Jason House, el 25 de abril a las 14:19 me escribiste:
 Is that I guess is very likely that 'tmp' get optimized out. Would
 declaring it as volatile help? If not, I guess the only way to avoid that
 race is add a root manually to the GC (or mark the weakref again as no
 NO_SCAN). But both requires synchronization. =(
I would use tango.core.atomic that I took from the Tango D2 branch. Another option is inline asm, but I'd have to research how to do that.
I don't know how an atomic can help, I don't know much about atomics but I think they don't have to be pointer sized (different architectures provide different atomic types). And in any case you don't want tmp to be atomic, you just don't want it to be optimized out, I think..
Not tmp, the hidden pointer t. The read calls can't be optimized away, even with inlining because they use asm code.
 I guess asm can never get optimized (right?) so I think that should do.
Right. D2 treats all asm as volatile asm.
 BTW, that's wrong with volatile? I think it should work too.
The volatile keyword does not exist in D2. Declaring the t (reference) as volatile would work if it existed in D2.
 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...
Yes, thank you.
It's amazingly difficult to get right for something so "simple"
Indeed. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Wake from your sleep, the drying of your tears, Today we escape, we escape.
Apr 25 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Sean Kelly Wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.
Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even
though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.
Oh I see what you're saying. So this could happen: 4. The GC resume all threads 4a. Another thread derefs the WeakRef 4b. This thread checks to see if the GC has notified that the object is being collected. No notification, so it uses the object. 5. The GC scans its memory and finalizes the object 6. The other thread tries to use the object and explodes You're right, it sounds like you'd need to add a step between 4a and 4b where the WeakRef notifies the GC that the object it's using shouldn't be collected (if a collection is in progress). In fact, all you really need to do is acquire the GCs mutex for a moment. Once you've acquired it then you know that any collection in progress must be complete (assuming of course that the GC uses a mutex to prevent corruption). The horrible thing about all this is it makes using a multithreaded WeakRef horribly slow. Even if the GC facility were provided for what you want to do you're talking about acquiring a mutex in the GC for every deref operation. What a mess.
I was hoping for a lock-free query that would return garbage (without crashing) in the rare case that a race occurred. If I get garbage (and says the object is marked) then the finalizer has run when I recheck the weak ref. Do all druntime queries acquire the GC lock? I'm hoping to do about 100_000+ of these operations per second per core (and do other work too). As long as memory blocks are not released to the OS while all threads are unpaused, lock-free operations should be possible. As a side note, does druntime use multiple threads for the garbage collector? Scalability with Tango and a memory intensive application was horrible last time I tried renting an 8 core system. It may have been my fault, but the garbage collector is a convenient scape goat.
Apr 20 2009
next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Jason House, el 20 de abril a las 17:45 me escribiste:
 As a side note, does druntime use multiple threads for the garbage
 collector?
No, but I'm planning to add concurrency support to the GC. D1 and Posix OSs, though. Porting to D2 should be fairly easy because I'll use Tango's GC as base, which is almost the same as Druntime AFAIK. -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- careful to all animals (never washing spiders down the plughole), keep in contact with old friends (enjoy a drink now and then), will frequently check credit at (moral) bank (hole in the wall),
Apr 20 2009
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.
I was hoping for a lock-free query that would return garbage (without crashing) in the rare case that
a race occurred. If I get garbage (and says the object is marked) then the finalizer has run when I recheck the weak ref. Do all druntime queries acquire the GC lock? I'm hoping to do about 100_000+ of these operations per second per core (and do other work too). They all acquire the lock. And even with per-thread GCs, any operation against the shared GC will acquire its lock as well. I'm not terribly inclined to try and make a lock-free GC to avoid this for even simple queries.
 As long as memory blocks are not released to the OS while all threads are
unpaused, lock-free
operations should be possible. Blocks may be released to the OS at this time, but currently they will only be for pools that are completely empty. In practice this generally only happens for a pool allocated specifically for a single big allocation. As for lock-free GC stuff in general, there's too much state information involved to make me terribly inclined to give it a shot without some serious motivation.
 As a side note, does druntime use multiple threads for the garbage collector?
Scalability with Tango
and a memory intensive application was horrible last time I tried renting an 8 core system. It may have been my fault, but the garbage collector is a convenient scape goat. Currently, no. The Druntime GC is nearly identical to the Tango GC but for a few tweaks here and there. I'm eventually going to make a GC with per-thread heaps as an alternative, but I haven't found the time yet. The current GC is a convenient scapegoat, but it's also a valid one if the threads in this app churn through memory continuously. If nothing else the threads are almost definitely convoying on the GC mutex, not to mention the cost of "stop the world" collections. You might want to look into judicious use of GC.reserve() to help with this some. In apps that grow memory rather than just churn through it, you can see a pretty solid benefit from GC.reserve().
Apr 20 2009
prev sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Sean Kelly wrote:
 == Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.
Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even
though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.
Oh I see what you're saying. So this could happen: 4. The GC resume all threads 4a. Another thread derefs the WeakRef 4b. This thread checks to see if the GC has notified that the object is being collected. No notification, so it uses the object. 5. The GC scans its memory and finalizes the object 6. The other thread tries to use the object and explodes You're right, it sounds like you'd need to add a step between 4a and 4b where the WeakRef notifies the GC that the object it's using shouldn't be collected (if a collection is in progress). In fact, all you really need to do is acquire the GCs mutex for a moment. Once you've acquired it then you know that any collection in progress must be complete (assuming of course that the GC uses a mutex to prevent corruption). The horrible thing about all this is it makes using a multithreaded WeakRef horribly slow. Even if the GC facility were provided for what you want to do you're talking about acquiring a mutex in the GC for every deref operation. What a mess.
What if you added this: 3.5. Set GC.collectionInProgress to true. 4. The GC resumes all threads. 4a. Another thread derefs the WeakRef 4b. Before the WeakRef checks to see if the reference has been cleared, it spinlocks on GC.collectionInProgress. 5. The GC scans its memory and finalises the object. 5a. The GC clears GC.collectionInProgress. 6. The other thread unblocks, sees the WeakRef has been cleared and fails gracefully. -- Daniel
Apr 20 2009
prev sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-04-20 22:19:42 +0200, Jason House <jason.james.house gmail.com> said:

 Sean Kelly Wrote:
 
 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 
 If I understand you correctly, a weak reference library needs to query 
 the GC to see if an object is
marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc 
 interface. Given the delayed
finalization, I don't see any way around it. You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all. Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the following steps: 1. Pause all threads 2. Scan objects to clear marks 3. Scan objects to add marks 4. Resume all threads 5. Scan objects, if no mark then... 5a. Call object finalizer 5b. Free memory for later use Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen. I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.
I was thinking to add weak refs to tango core, and I also saw this race condition. The only clean way to fix it is to notify while threads are stopped, all the rest is really a cludge. Fawzi
Apr 20 2009
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 Daniel Keep wrote:
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.
I guess I shouldn't have assumed that the features of tango became part of druntime. I don't see any notification mechanism :(
Same as Tango: alias void delegate(Object) DEvent; extern (C) void rt_attachDisposeEvent(Object h, DEvent e); extern (C) void rt_detachDisposeEvent(Object h, DEvent e); I should probably expose these in core.runtime.
How are these events dispatched? I remember a discussion about race conditions if these events were fired off after all threads had been resumed; you could have the event zero out the weak ref AFTER something else had dereferenced it.
It's the responsibility of the person making the weak ref to make it thread-safe if that's a design consideration. If the cleanup is performed before the threads are restarted and the weak ref (or signal/slot mechanism) uses mutexes then a deadlock is possible. At least this approach provides the option for correct code to actually work.
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.
This is all tied into the finalizer. To do what you suggest the GC would have to perform two sweeps--one before resuming threads and one after.
If I understand you correctly, a weak reference library needs to query the GC to see if an object is marked for collection.
How do I do this? I think the WeakRef just has to either wrap the deref operation in a synchronized block if it's intended for multithreaded use or perform an atomic read and update of the reference. From my understanding, the problem Daniel Keep describes above is simply an issue of the referenced object vanishing at the same instant someone tries to access it through the WeakRef. The same thing could happen if someone explicitly deletes the object instead of letting the GC clean it up.
Apr 20 2009