digitalmars.D - D2 weak references

Jason House (2/2) Apr 15 2009 Does anyone have a weak reference library for D2?

Stewart Gordon (12/16) Apr 15 2009 Weak references don't fit well into D at the moment. It's been talked

Jason House (5/25) Apr 15 2009 I participated in the more recent of those threads.

Jason House (10/12) Apr 18 2009 I guess I shouldn't have assumed that the features of tango became part ...

Sean Kelly (6/13) Apr 19 2009 Same as Tango:

Daniel Keep (10/26) Apr 19 2009 How are these events dispatched? I remember a discussion about race

Sean Kelly (8/33) Apr 20 2009 It's the responsibility of the person making the weak ref to make it

Jason House (2/38) Apr 20 2009 If I understand you correctly, a weak reference library needs to query t...

Jason House (13/53) Apr 20 2009 Here's the implementation I'm currently thinking of:

Sean Kelly (8/23) Apr 20 2009 finalization, I don't see any way around it.

Jason House (11/35) Apr 20 2009 Maybe I misunderstood, but I interpreted earlier discussion in this thre...

Sean Kelly (23/40) Apr 20 2009 though the object may be valid in that window, there's no guarantee that...

Jarrett Billingsley (3/7) Apr 20 2009 Then take a hint from D2 and make them thread-local by default ;) But

Sean Kelly (2/9) Apr 20 2009 Right. These problems all go away if the object is not shared.
Jason House (2/11) Apr 20 2009 I'm pretty sure the GC can bypass thread locality for mark and sweep, so...

Sean Kelly (9/20) Apr 20 2009 Anything that stops the sweep from occurring in parallel with a thread u...

Michel Fortin (7/13) Apr 20 2009 Then can't you notify the weak refs, restart the world, and then

Leandro Lucarella (14/25) Apr 20 2009 I've not followed this thread very closely but I think that would imply

Jason House (12/35) Apr 21 2009 Is it possible for druntime to expose an option as to whether or not the...

grauzone (10/12) Apr 21 2009 But then you can only do boring stuff in the finalizer. Calling any C or...

Jason House (4/17) Apr 22 2009 I've never had this kind of issue with finalizers, so I have limited app...

Leandro Lucarella (28/32) Apr 22 2009 I think instead of a notification mechanism access to the mark bit can b...

Jason House (3/29) Apr 22 2009 Unfortunately, that is not efficient. The state used by the GC is in flu...

Leandro Lucarella (20/55) Apr 22 2009 The sweep phase don't touch mark bits. I can't possible understand why

Leandro Lucarella (13/35) Apr 22 2009 I forgot to mention that at allocation time, the mark bit should be set ...
Jason House (4/62) Apr 22 2009 Is that what it does? I realized recently that one could simple toggle a...

Leandro Lucarella (71/93) Apr 22 2009 Sean's idea was, if I recall correctly, to move the sweep phase into the

Leandro Lucarella (59/127) Apr 22 2009 Woops! I already spotted a bug there =). The mark bit was only set if th...

Frits van Bommel (9/29) Apr 22 2009 I think you're missing something. Consider this sequence of events:

Jason House (3/34) Apr 22 2009 That's where finalizers come in. They can mark the weak ref as invalid s...

Frits van Bommel (9/44) Apr 22 2009 Will this work if B was allocated in another thread, right after it gets...

Leandro Lucarella (36/74) Apr 22 2009 That's exactly what happens. So I'm not seeing any flaws so far with my

Jason House (49/126) Apr 22 2009 I'm finally looking at the gc source code. From what I can tell, memory...

Leandro Lucarella (49/103) Apr 22 2009 I'm sorry you had to go through that, I think this could have save you

Jason House (18/137) Apr 23 2009 It really wasn't that bad of a way to familiarize myself. I had enough o...

Leandro Lucarella (25/51) Apr 23 2009 Ok, I see now that this can happen because you first store the reference

Jason House (6/62) Apr 23 2009 What happens when a collection runs after test returns? You need to copy...

Leandro Lucarella (34/70) Apr 25 2009 You're right, I definitely underestimated the problem. Now I see that th...

Jason House (5/83) Apr 25 2009 I would use tango.core.atomic that I took from the Tango D2 branch. Anot...

Leandro Lucarella (16/35) Apr 25 2009 I don't know how an atomic can help, I don't know much about atomics but

Jason House (6/44) Apr 25 2009 Right. D2 treats all asm as volatile asm.

Jason House (4/48) Apr 20 2009 I was hoping for a lock-free query that would return garbage (without cr...

Leandro Lucarella (12/14) Apr 20 2009 No, but I'm planning to add concurrency support to the GC. D1 and Posix
Sean Kelly (29/38) Apr 20 2009 a race occurred. If I get garbage (and says the object is marked) then t...

Daniel Keep (12/55) Apr 20 2009 What if you added this:

Fawzi Mohamed (6/53) Apr 20 2009 I was thinking to add weak refs to tango core, and I also saw this race

Sean Kelly (8/46) Apr 20 2009 How do I do this?

Jason House <jason.james.house gmail.com> writes:

Does anyone have a weak reference library for D2?

Without that, hash tables and search trees don't mix. I'm hoping that a weak
reference library can be merged into druntime.

Apr 15 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Jason House wrote:
 Does anyone have a weak reference library for D2?
 
 Without that, hash tables and search trees don't mix. I'm hoping that 
 a weak reference library can be merged into druntime.

Weak references don't fit well into D at the moment.  It's been talked 
about before (albeit not in reference to D2 in particular):

http://www.digitalmars.com/d/archives/digitalmars/D/713.html
http://www.digitalmars.com/d/archives/digitalmars/D/Weak_references._69761.html

and probably others.

But I think that, if D gained weak references as a built-in feature it 
would work well and be useful.  A decent library implementation may 
otherwise be possible if D's GC API gains:
- an explicit pinning mechanism
- a means of creating listeners on the GC

Stewart.

Apr 15 2009

Jason House <jason.james.house gmail.com> writes:

Stewart Gordon Wrote:

 Jason House wrote:
 Does anyone have a weak reference library for D2?
 
 Without that, hash tables and search trees don't mix. I'm hoping that 
 a weak reference library can be merged into druntime.

 
 Weak references don't fit well into D at the moment.  It's been talked 
 about before (albeit not in reference to D2 in particular):
 
 http://www.digitalmars.com/d/archives/digitalmars/D/713.html
 http://www.digitalmars.com/d/archives/digitalmars/D/Weak_references._69761.html
 
 and probably others.

I participated in the more recent of those threads.

I feel like I have a great application for weak pointers. The best manual
memmory management of search trees and hashtables is to do mark and sweep. That
should sound awfully familiar...

 
 But I think that, if D gained weak references as a built-in feature it 
 would work well and be useful.  A decent library implementation may 
 otherwise be possible if D's GC API gains:
 - an explicit pinning mechanism
 - a means of creating listeners on the GC
 
 Stewart.

Everything is pinned at the moment and Tango's GC (and therefore druntime's GC)
has an explicit notification mechanism. The big advantage to having a weak ref
implementation in the runtime is that it'll evolve with the GC as well as pool
people's bug fixes / ports.

Apr 15 2009

Jason House <jason.james.house gmail.com> writes:

Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

I guess I shouldn't have assumed that the features of tango became part of 
druntime.  I don't see any notification mechanism :(

The best design I've come up with is to make a helper class 
weakReferencable(T) that will add a finalizer to fix any outstanding weak 
references.  I'd then use a double linked list of structs as weak 
references.  As long as collections are are single threaded, it should be 
pretty easy to make work.  Introducing a new type besides just a weak 
reference seems like a bit of a hack.  I'm hoping others will have 
suggestions on how to do that better.

Apr 18 2009

Sean Kelly <sean invisibleduck.org> writes:

Jason House wrote:
 Jason House wrote:
 
 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 
 I guess I shouldn't have assumed that the features of tango became part of 
 druntime.  I don't see any notification mechanism :(

Same as Tango:

alias void delegate(Object) DEvent;
extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

I should probably expose these in core.runtime.

Apr 19 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 I guess I shouldn't have assumed that the features of tango became
 part of druntime.  I don't see any notification mechanism :(

 
 Same as Tango:
 
 alias void delegate(Object) DEvent;
 extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
 extern (C) void rt_detachDisposeEvent(Object h, DEvent e);
 
 I should probably expose these in core.runtime.

How are these events dispatched?  I remember a discussion about race
conditions if these events were fired off after all threads had been
resumed; you could have the event zero out the weak ref AFTER something
else had dereferenced it.

If that's still a possibility, maybe you could add a case where if the
delegate's funcptr is null, it just assumes the context pointer points
to a void*.sizeof word and zeroes it out.  You could do that without
resuming threads.

  -- Daniel

Apr 19 2009

Sean Kelly <sean invisibleduck.org> writes:

Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 I guess I shouldn't have assumed that the features of tango became
 part of druntime.  I don't see any notification mechanism :(

 Same as Tango:

 alias void delegate(Object) DEvent;
 extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
 extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

 I should probably expose these in core.runtime.

 
 How are these events dispatched?  I remember a discussion about race
 conditions if these events were fired off after all threads had been
 resumed; you could have the event zero out the weak ref AFTER something
 else had dereferenced it.

It's the responsibility of the person making the weak ref to make it 
thread-safe if that's a design consideration.  If the cleanup is 
performed before the threads are restarted and the weak ref (or 
signal/slot mechanism) uses mutexes then a deadlock is possible.  At 
least this approach provides the option for correct code to actually work.

 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.

This is all tied into the finalizer.  To do what you suggest the GC 
would have to perform two sweeps--one before resuming threads and one after.

Apr 20 2009

Jason House <jason.james.house gmail.com> writes:

Sean Kelly Wrote:

 Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 I guess I shouldn't have assumed that the features of tango became
 part of druntime.  I don't see any notification mechanism :(

 Same as Tango:

 alias void delegate(Object) DEvent;
 extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
 extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

 I should probably expose these in core.runtime.

 
 How are these events dispatched?  I remember a discussion about race
 conditions if these events were fired off after all threads had been
 resumed; you could have the event zero out the weak ref AFTER something
 else had dereferenced it.

 
 It's the responsibility of the person making the weak ref to make it 
 thread-safe if that's a design consideration.  If the cleanup is 
 performed before the threads are restarted and the weak ref (or 
 signal/slot mechanism) uses mutexes then a deadlock is possible.  At 
 least this approach provides the option for correct code to actually work.
 
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.

 
 This is all tied into the finalizer.  To do what you suggest the GC 
 would have to perform two sweeps--one before resuming threads and one after.

If I understand you correctly, a weak reference library needs to query the GC
to see if an object is marked for collection. How do I do this?

Apr 20 2009

Jason House <jason.james.house gmail.com> writes:

Jason House Wrote:

 Sean Kelly Wrote:
 
 Daniel Keep wrote:
 
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 I guess I shouldn't have assumed that the features of tango became
 part of druntime.  I don't see any notification mechanism :(

 Same as Tango:

 alias void delegate(Object) DEvent;
 extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
 extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

 I should probably expose these in core.runtime.

 
 How are these events dispatched?  I remember a discussion about race
 conditions if these events were fired off after all threads had been
 resumed; you could have the event zero out the weak ref AFTER something
 else had dereferenced it.

 
 It's the responsibility of the person making the weak ref to make it 
 thread-safe if that's a design consideration.  If the cleanup is 
 performed before the threads are restarted and the weak ref (or 
 signal/slot mechanism) uses mutexes then a deadlock is possible.  At 
 least this approach provides the option for correct code to actually work.
 
 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.

 
 This is all tied into the finalizer.  To do what you suggest the GC 
 would have to perform two sweeps--one before resuming threads and one after.

 
 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is marked for collection. How do I do this?

Here's the implementation I'm currently thinking of:

weak ref construction:
Copy pointer into struct without casting
Call rt_attach_dispose_event

weak ref dereference:
copy value into temp var
** query gc to see if garbage **
If not garbage, Recopy value and return
If garbage set value = null

dispose of weakly referenced obj:
set value = null

The emphasized line is impossible according to the publicized gc interface.
Given the delayed finalization, I don't see any way around it.

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is


marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc interface.
Given the delayed

finalization, I don't see any way around it.

You're guaranteed that the GC will notify you when it finalizes
the referenced object, so you shouldn't have to query the GC
at all.  Just copy the reference into a GC-scannable location
and then if you haven't been notified by the GC that the object
was finalized the reference is good.

Apr 20 2009

Jason House <jason.james.house gmail.com> writes:

Sean Kelly Wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is


 marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc interface.
Given the delayed

 finalization, I don't see any way around it.
 
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.

Maybe I misunderstood, but I interpreted earlier discussion in this thread to
mean the gc did the following steps:
1. Pause all threads
2. Scan objects to clear marks
3. Scan objects to add marks
4. Resume all threads
5. Scan objects, if no mark then...
    5a. Call object finalizer
    5b. Free memory for later use

Assuming that is correct, any deref of a weak reference is unsafe between steps
4 and 5a. Even though the object may be valid in that window, there's no
guarantee that the reference won't be used after step 5b... This would lead to
random behavior and seg faults. No self-respecting weak reference library
should allow that to happen.

I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.

 Maybe I misunderstood, but I interpreted earlier discussion in this thread to
mean the gc did the

following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even

though the object may be valid in that window, there's no guarantee that the
reference won't be used
after step 5b... This would lead to random behavior and seg faults. No
self-respecting weak reference
library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.

Oh I see what you're saying.  So this could happen:

4. The GC resume all threads
4a. Another thread derefs the WeakRef
4b. This thread checks to see if the GC has notified that the
       object is being collected.  No notification, so it uses the object.
5. The GC scans its memory and finalizes the object
6. The other thread tries to use the object and explodes

You're right, it sounds like you'd need to add a step between 4a
and 4b where the WeakRef notifies the GC that the object it's using
shouldn't be collected (if a collection is in progress).  In fact, all you
really need to do is acquire the GCs mutex for a moment.  Once
you've acquired it then you know that any collection in progress
must be complete (assuming of course that the GC uses a mutex
to prevent corruption).

The horrible thing about all this is it makes using a multithreaded
WeakRef horribly slow.  Even if the GC facility were provided for
what you want to do you're talking about acquiring a mutex in
the GC for every deref operation.  What a mess.

Apr 20 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. =A0Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. =A0What a mess.

Then take a hint from D2 and make them thread-local by default ;)  But
provide a SharedWeakRef for the cases where you need it.

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jarrett Billingsley (jarrett.billingsley gmail.com)'s article
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. �Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. �What a mess.

 Then take a hint from D2 and make them thread-local by default ;)  But
 provide a SharedWeakRef for the cases where you need it.

Right.  These problems all go away if the object is not shared.

Apr 20 2009

Jason House <jason.james.house gmail.com> writes:

Jarrett Billingsley Wrote:

 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:
 
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. �Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. �What a mess.

 
 Then take a hint from D2 and make them thread-local by default ;)  But
 provide a SharedWeakRef for the cases where you need it.

I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread
local won't help.  Anything that stops the sweep from occurring in parallel
with a thread using the data bypasses this issue.

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Jarrett Billingsley Wrote:
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org> wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. �Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. �What a mess.

 Then take a hint from D2 and make them thread-local by default ;)  But
 provide a SharedWeakRef for the cases where you need it.

 I'm pretty sure the GC can bypass thread locality for mark and sweep, so
thread local won't help.

Anything that stops the sweep from occurring in parallel with a thread using
the data bypasses this
issue.

Yeah.  The unfortunate tradeoff is that finalizers are run while all these
app threads are suspended, and if these finalizers enter a synchronized
block they're likely to deadlock.  This happened enough with the Phobos
GC that I was motivated to make the change in the first place.  It's
unfortunate, but at least WeakRef is the only victim of the current
approach that I'm aware of.

Apr 20 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-04-20 18:31:28 -0400, Sean Kelly <sean invisibleduck.org> said:

 Yeah.  The unfortunate tradeoff is that finalizers are run while all these
 app threads are suspended, and if these finalizers enter a synchronized
 block they're likely to deadlock.  This happened enough with the Phobos
 GC that I was motivated to make the change in the first place.  It's
 unfortunate, but at least WeakRef is the only victim of the current
 approach that I'm aware of.

Then can't you notify the weak refs, restart the world, and then 
finalize the objects for real?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 20 2009

Leandro Lucarella <llucax gmail.com> writes:

Michel Fortin, el 20 de abril a las 18:53 me escribiste:
 On 2009-04-20 18:31:28 -0400, Sean Kelly <sean invisibleduck.org> said:
 
Yeah.  The unfortunate tradeoff is that finalizers are run while all these
app threads are suspended, and if these finalizers enter a synchronized
block they're likely to deadlock.  This happened enough with the Phobos
GC that I was motivated to make the change in the first place.  It's
unfortunate, but at least WeakRef is the only victim of the current
approach that I'm aware of.

 
 Then can't you notify the weak refs, restart the world, and then
 finalize the objects for real?

I've not followed this thread very closely but I think that would imply
that the sweep phase had to run with all threads suspended, which is a bad
idea unless you *really* enjoy pauses =)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Karma police
arrest this man,
he talks in maths,
he buzzes like a fridge,
he's like a detuned radio.

Apr 20 2009

Jason House <jason.james.house gmail.com> writes:

Sean Kelly wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jarrett Billingsley Wrote:
 On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean invisibleduck.org>
 wrote:

 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow. �Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation. �What a mess.

 Then take a hint from D2 and make them thread-local by default ;)  But
 provide a SharedWeakRef for the cases where you need it.

 I'm pretty sure the GC can bypass thread locality for mark and sweep, so
 thread local won't help.

 Anything that stops the sweep from occurring in parallel with a thread
 using the data bypasses this issue.
 
 Yeah.  The unfortunate tradeoff is that finalizers are run while all these
 app threads are suspended, and if these finalizers enter a synchronized
 block they're likely to deadlock.  This happened enough with the Phobos
 GC that I was motivated to make the change in the first place.  It's
 unfortunate, but at least WeakRef is the only victim of the current
 approach that I'm aware of.

Is it possible for druntime to expose an option as to whether or not the 
sweep phase / finalizers should run while all threads are suspended?

It'd really suck to implement some kind of mark and sweep garbage collection 
just to compensate for how the standard mark and sweep garbage collector is 
implemented.

Another alternative I came up with is notification when the GC clears or sets 
a mark on an object.  That would allow the weak ref to know that its 
information is suspect.  That approach would only required that the mark 
phase is done with all threads stopped.

If you like either of these ideas, I could attempt a patch to add the 
functionality.

Apr 21 2009

grauzone <none example.net> writes:

 Is it possible for druntime to expose an option as to whether or not the 
 sweep phase / finalizers should run while all threads are suspended?

But then you can only do boring stuff in the finalizer. Calling any C or 
library function must be avoided, because it can lead to deadlocks. At 
the same time, your code is called at arbitrary times, and anything you 
do will cause race conditions by default. It's like a UNIX signal 
handler: programming in it sucks so much.

Maybe the finalizer register function could take two callbacks: one 
called before threads are resumed, and one after threads are resumed. 
You could use the first callback to zero out the weak reference, while 
the second takes care of advanced functionality, like notifying someone 
that the reference has been collected.

Apr 21 2009

Jason House <jason.james.house gmail.com> writes:

grauzone Wrote:

 Is it possible for druntime to expose an option as to whether or not the 
 sweep phase / finalizers should run while all threads are suspended?

 
 But then you can only do boring stuff in the finalizer. Calling any C or 
 library function must be avoided, because it can lead to deadlocks. At 
 the same time, your code is called at arbitrary times, and anything you 
 do will cause race conditions by default. It's like a UNIX signal 
 handler: programming in it sucks so much.

I've never had this kind of issue with finalizers, so I have limited
appreciation for the argument. In the application I'm working on, I have no
need for any functionality in this area except weak reference support.
 
 Maybe the finalizer register function could take two callbacks: one 
 called before threads are resumed, and one after threads are resumed. 
 You could use the first callback to zero out the weak reference, while 
 the second takes care of advanced functionality, like notifying someone 
 that the reference has been collected.

Wouldn't that implicitly require two sweep phases? In my last post, I suggested
the possibiluty of a callback in the mark phase. At the moment, I'm thinking
one callback at the start of collection and another callback when marking a
weakly referenced object would do the trick. Any weakly referenced object not
marked after the last collection is invalid

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.

I think instead of a notification mechanism access to the mark bit can be
provided.

The mark phase is ran with all thread paused, so you don't need any
notifications in that phase because nobody can ask the weak ref object for
the underlaying pointer then.

When threads are resumed, the mark phase is complete, so the invariant of
all live objects having the mark bit set and all the garbage having the
bit unset should hold. If the GC guarantee that the mark bit is set
*always* when a object is live (the mark bit clearing should be done with
all threads stopped and that should be it), except only when the mark
phase is running, then, a weak ref implementation can look at the mark bit
to safely see if the reference is garbage or not (even before the sweep
phase find it and call the finalizer).

I think this approach is both simple and efficient, it doesn't have races
at all (unless I'm missing something, of course). The only problem is it's
too tied to the current GC implementation. But I think any weak ref
implementation will, so it looks like weak ref really belong to the GC.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
no longer afraid of the dark or midday shadows
nothing so ridiculously teenage and desperate,
nothing so childish - at a better pace,
slower and more calculated,
no chance of escape,

Apr 22 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.

 
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

Unfortunately, that is not efficient. The state used by the GC is in flux
during the sweep phase, so there is no easy lockless way to access that data.
Having to acquire a lock on every weakref dereference absolutely kills
performance.

I'm definitely open to hearing more ideas on how to efficiently implement.

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 22 de abril a las 11:32 me escribiste:
 Leandro Lucarella Wrote:
 
 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.

 
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

 
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.

The sweep phase don't touch mark bits. I can't possible understand why
you should synchronize access to a (temporarily) read-only variable.

Mark bits are only written when:
1) Clearing all the mark bits before marking (this is done with all
   threads running now, and this is what it should be changed to be done
   when all threads are suspended)
2) Marking (threads are suspended in the current implementation)

 I'm definitely open to hearing more ideas on how to efficiently
 implement.

The only performance penalty of my proposal would be incrementing the time
the world is stooped a bit (adding the marks cleaning phase there).
I think is a fair price to pay (being that mark bits are implemented using
bitmaps, it should be pretty fast resetting all bits to 0) for an easy and
safe weak references implementations.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
ELLA FUE INFIEL, PERO EX POLOLO PAGÓ
	-- TV CHILE

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Leandro Lucarella, el 22 de abril a las 14:19 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.

 
 The sweep phase don't touch mark bits. I can't possible understand why
 you should synchronize access to a (temporarily) read-only variable.
 
 Mark bits are only written when:
 1) Clearing all the mark bits before marking (this is done with all
    threads running now, and this is what it should be changed to be done
    when all threads are suspended)
 2) Marking (threads are suspended in the current implementation)
 
 I'm definitely open to hearing more ideas on how to efficiently
 implement.

 
 The only performance penalty of my proposal would be incrementing the time
 the world is stooped a bit (adding the marks cleaning phase there).
 I think is a fair price to pay (being that mark bits are implemented using
 bitmaps, it should be pretty fast resetting all bits to 0) for an easy and
 safe weak references implementations.

I forgot to mention that at allocation time, the mark bit should be set to
keep the invariant that all live data has the mark bit set. I don't think
this should impose a very big performance penalty either.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Ideas claras, libertad y patriotismo, para mejorar la calidad de vida;
claro que todo con mucha calidad, porque, joven, ideas claras, libertad
o patriotismo tiene cualquiera.
	-- Ricardo Vaporeso

Apr 22 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 22 de abril a las 11:32 me escribiste:
 Leandro Lucarella Wrote:
 
 Jason House, el 21 de abril a las 21:11 me escribiste:
 Another alternative I came up with is notification when the GC clears or sets 
 a mark on an object.  That would allow the weak ref to know that its 
 information is suspect.  That approach would only required that the mark 
 phase is done with all threads stopped.

 
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

 
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.

 
 The sweep phase don't touch mark bits. I can't possible understand why
 you should synchronize access to a (temporarily) read-only variable.

I'm mostly regurgitating what Sean Kelly said. I know that if the sweep
deallocates at an object, the bits marking it become invalid. I don't disagree
that reading the bits without a lock should be possible, but I also trust
Sean's statement that it isn't easy. I'm trying to find something that doesn't
require reworking or rewriting large parts of a garbage collector. 

 Mark bits are only written when:
 1) Clearing all the mark bits before marking (this is done with all
    threads running now, and this is what it should be changed to be done
    when all threads are suspended)
 2) Marking (threads are suspended in the current implementation)

Is that what it does? I realized recently that one could simple toggle a mark
bit and avoid a clearing phase altogether.

 
 I'm definitely open to hearing more ideas on how to efficiently
 implement.

 
 The only performance penalty of my proposal would be incrementing the time
 the world is stooped a bit (adding the marks cleaning phase there).
 I think is a fair price to pay (being that mark bits are implemented using
 bitmaps, it should be pretty fast resetting all bits to 0) for an easy and
 safe weak references implementations.
 
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------
 ELLA FUE INFIEL, PERO EX POLOLO PAGÓ
 	-- TV CHILE

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 22 de abril a las 13:31 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.

 
 The sweep phase don't touch mark bits. I can't possible understand why
 you should synchronize access to a (temporarily) read-only variable.

 
 I'm mostly regurgitating what Sean Kelly said. I know that if the sweep
 deallocates at an object, the bits marking it become invalid. I don't
 disagree that reading the bits without a lock should be possible, but
 I also trust Sean's statement that it isn't easy. I'm trying to find
 something that doesn't require reworking or rewriting large parts of
 a garbage collector.

Sean's idea was, if I recall correctly, to move the sweep phase into the
stop-the-world. My idea is different, and it doesn't need rewriting any
large part of the collector. I think the change is pretty small (and can
be easily applied to D1/Tango too).

I'm looking at the Tango code right now, and you know what? The unmarking
is already done when all threads are paused, so all you need for this to
work is a method to expose the mark bit. This can be really easily done by
extending the BlkAttr defined values (or using a reserved value if this
should not be a standard attribute) and ensuring that the mark bit is set
when new pools/bins are allocated.

This patch (Tango) should do that (untested):
------>8------>8------>8------>8------>8------>8------>8------>8------>8------
diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d
index a760465..fd21d44 100644
--- a/lib/gc/basic/gcx.d
+++ b/lib/gc/basic/gcx.d
   -81,6 +81,7    private
         NO_SCAN  = 0b0000_0010,
         NO_MOVE  = 0b0000_0100,
         ALL_BITS = 0b1111_1111
+        MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit
     }
 
     extern (C) void* rt_stackBottom();
   -519,7 +520,14    class GC
             Pool *pool = gcx.findPool(p);
             assert(pool);
 
-            gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits);
+           uint biti = cast(size_t)(p - pool.baseAddr) / 16;
+            gcx.setBits(pool, biti, bits);
+           // Set the mark bit to maintain the invariant that live objects
+           // always have the mark bit set, except when the collector is in
+           // the mark phase (useful for weak reference implementations)
+           // This is not done in setBits() to keep this bit read-only for
+           // the user.
+           pool.mark.set(biti)
         }
         return p;
     }
   -2576,6 +2584,8    struct Gcx
 //        if (pool.nomove.nbits &&
 //            pool.nomove.test(biti))
 //            bits |= BlkAttr.NO_MOVE;
+        if (pool.mark.test(biti))
+            bits |= BlkAttr.MARK_BIT;
         return bits;
     }
 
------8<------8<------8<------8<------8<------8<------8<------8<------8<------

This implementation use an "extension" (according to Druntime specs) bit
to return the mark bit when querying objects attributes. The mark bit is
kept read-only, but I'm not sure about that (there are already plenty of
ways to break the GC now, so I don't think security is a good enough
reason to ban the user from doing nasty things ;)

If you try it, please tell me how it went.

 Mark bits are only written when:
 1) Clearing all the mark bits before marking (this is done with all
    threads running now, and this is what it should be changed to be done
    when all threads are suspended)
 2) Marking (threads are suspended in the current implementation)

 
 Is that what it does? I realized recently that one could simple toggle
 a mark bit and avoid a clearing phase altogether.

Yes, but that would mean to touch all the mark bits in the sweep phase,
which can have bad cache behaviour. But being a bitmap that should be
fairly cheap. But since this involve bit manipulation operations, and
clearing the bitmap can be done very efficiently clearing complete words
in one instruction, I don't see a clear gain in this.

I guess just benchmarks can have a final word on this, but I wouldn't
expect much differences in performance for both approaches.


-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
More people die from a champagne-cork popping,
than from poison spiders

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Leandro Lucarella, el 22 de abril a las 16:08 me escribiste:
 Jason House, el 22 de abril a las 13:31 me escribiste:
 Unfortunately, that is not efficient. The state used by the GC is in
 flux during the sweep phase, so there is no easy lockless way to access
 that data. Having to acquire a lock on every weakref dereference
 absolutely kills performance.

 
 The sweep phase don't touch mark bits. I can't possible understand why
 you should synchronize access to a (temporarily) read-only variable.

 
 I'm mostly regurgitating what Sean Kelly said. I know that if the sweep
 deallocates at an object, the bits marking it become invalid. I don't
 disagree that reading the bits without a lock should be possible, but
 I also trust Sean's statement that it isn't easy. I'm trying to find
 something that doesn't require reworking or rewriting large parts of
 a garbage collector.

 
 Sean's idea was, if I recall correctly, to move the sweep phase into the
 stop-the-world. My idea is different, and it doesn't need rewriting any
 large part of the collector. I think the change is pretty small (and can
 be easily applied to D1/Tango too).
 
 I'm looking at the Tango code right now, and you know what? The unmarking
 is already done when all threads are paused, so all you need for this to
 work is a method to expose the mark bit. This can be really easily done by
 extending the BlkAttr defined values (or using a reserved value if this
 should not be a standard attribute) and ensuring that the mark bit is set
 when new pools/bins are allocated.
 
 This patch (Tango) should do that (untested):
 ------>8------>8------>8------>8------>8------>8------>8------>8------>8------
 diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d
 index a760465..fd21d44 100644
 --- a/lib/gc/basic/gcx.d
 +++ b/lib/gc/basic/gcx.d
    -81,6 +81,7    private
          NO_SCAN  = 0b0000_0010,
          NO_MOVE  = 0b0000_0100,
          ALL_BITS = 0b1111_1111
 +        MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit
      }
  
      extern (C) void* rt_stackBottom();
    -519,7 +520,14    class GC
              Pool *pool = gcx.findPool(p);
              assert(pool);
  
 -            gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits);
 +           uint biti = cast(size_t)(p - pool.baseAddr) / 16;
 +            gcx.setBits(pool, biti, bits);
 +           // Set the mark bit to maintain the invariant that live objects
 +           // always have the mark bit set, except when the collector is in
 +           // the mark phase (useful for weak reference implementations)
 +           // This is not done in setBits() to keep this bit read-only for
 +           // the user.
 +           pool.mark.set(biti)
          }
          return p;
      }
    -2576,6 +2584,8    struct Gcx
  //        if (pool.nomove.nbits &&
  //            pool.nomove.test(biti))
  //            bits |= BlkAttr.NO_MOVE;
 +        if (pool.mark.test(biti))
 +            bits |= BlkAttr.MARK_BIT;
          return bits;
      }
  
 ------8<------8<------8<------8<------8<------8<------8<------8<------8<------

Woops! I already spotted a bug there =). The mark bit was only set if the
malloc was done passing bits. The solution has a performance penalty of a
"extra" pool lookup for malloc where attr == 0.

This is the fixed patch:

------>8------>8------>8------>8------>8------>8------>8------>8------>8------
diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d
index a760465..940bc71 100644
--- a/lib/gc/basic/gcx.d
+++ b/lib/gc/basic/gcx.d
   -81,6 +81,7    private
         NO_SCAN  = 0b0000_0010,
         NO_MOVE  = 0b0000_0100,
         ALL_BITS = 0b1111_1111
+        MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit
     }
 
     extern (C) void* rt_stackBottom();
   -514,13 +515,18    class GC
         sentinel_init(p, size);
         gcx.log_malloc(p, size);
 
-        if (bits)
-        {
-            Pool *pool = gcx.findPool(p);
-            assert(pool);
+        Pool *pool = gcx.findPool(p);
+        assert(pool);
 
-            gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits);
-        }
+        uint biti = cast(size_t)(p - pool.baseAddr) / 16;
+        if (bits)
+            gcx.setBits(pool, biti, bits);
+        // Set the mark bit to maintain the invariant that live objects
+        // always have the mark bit set, except when the collector is in
+        // the mark phase (useful for weak reference implementations)
+        // This is not done in setBits() to keep this bit read-only for
+        // the user.
+        pool.mark.set(biti)
         return p;
     }
 
   -2576,6 +2582,8    struct Gcx
 //        if (pool.nomove.nbits &&
 //            pool.nomove.test(biti))
 //            bits |= BlkAttr.NO_MOVE;
+        if (pool.mark.test(biti))
+            bits |= BlkAttr.MARK_BIT;
         return bits;
     }
 
------8<------8<------8<------8<------8<------8<------8<------8<------8<------

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Bulletproof... I Wish I Was

Apr 22 2009

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

I think you're missing something. Consider this sequence of events:
- Object A is collected by the GC. (And its mark bit M is cleared)
- Object B is allocated, and gets the memory formerly used by object A. Mark
bit 
M is set.
- A weak reference to A gets used; it checks mark bit M, which is set...

In other words, the "this object was collected" state needs to be stored in the 
weak reference, not in the object that got collected (or anything tied to its 
memory location) because that memory may get reused.

Apr 22 2009

Jason House <jason.james.house gmail.com> writes:

Frits van Bommel Wrote:

 Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.
 
 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.
 
 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).
 
 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

 
 I think you're missing something. Consider this sequence of events:
 - Object A is collected by the GC. (And its mark bit M is cleared)
 - Object B is allocated, and gets the memory formerly used by object A. Mark
bit 
 M is set.
 - A weak reference to A gets used; it checks mark bit M, which is set...
 
 In other words, the "this object was collected" state needs to be stored in
the 
 weak reference, not in the object that got collected (or anything tied to its 
 memory location) because that memory may get reused.

That's where finalizers come in. They can mark the weak ref as invalid so that
you don't care what was returned. In my (recent?) pseudo code, the weak ref was
decoded, the mark bit was checked, and then the weak ref was decoded again. The
final decoding is to solve the race with the finalizer.

Even with such precautions, a lock free implementation has one more problem:
what happens if the memory block was released to the os? That's a segfault :(
if other data is checked to avoid such situations, that's likely where gc locks
come into the picture... I'm pretty sure Sean said querying the attribute bits
requires a lock. It may be for this reason. I still have to look at the code
though...

Apr 22 2009

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Jason House wrote:
 Frits van Bommel Wrote:
 
 Leandro Lucarella wrote:
 I think instead of a notification mechanism access to the mark bit can be
 provided.

 The mark phase is ran with all thread paused, so you don't need any
 notifications in that phase because nobody can ask the weak ref object for
 the underlaying pointer then.

 When threads are resumed, the mark phase is complete, so the invariant of
 all live objects having the mark bit set and all the garbage having the
 bit unset should hold. If the GC guarantee that the mark bit is set
 *always* when a object is live (the mark bit clearing should be done with
 all threads stopped and that should be it), except only when the mark
 phase is running, then, a weak ref implementation can look at the mark bit
 to safely see if the reference is garbage or not (even before the sweep
 phase find it and call the finalizer).

 I think this approach is both simple and efficient, it doesn't have races
 at all (unless I'm missing something, of course). The only problem is it's
 too tied to the current GC implementation. But I think any weak ref
 implementation will, so it looks like weak ref really belong to the GC.

 I think you're missing something. Consider this sequence of events:
 - Object A is collected by the GC. (And its mark bit M is cleared)
 - Object B is allocated, and gets the memory formerly used by object A. Mark
bit 
 M is set.
 - A weak reference to A gets used; it checks mark bit M, which is set...

 In other words, the "this object was collected" state needs to be stored in
the 
 weak reference, not in the object that got collected (or anything tied to its 
 memory location) because that memory may get reused.

 
 That's where finalizers come in. They can mark the weak ref as invalid so that
you don't care what was returned. In my (recent?) pseudo code, the weak ref was
decoded, the mark bit was checked, and then the weak ref was decoded again. The
final decoding is to solve the race with the finalizer.

Will this work if B was allocated in another thread, right after it gets
resumed 
(i.e. possibly before the finalizer has a chance to run)?

... oh, maybe the GC lock gets held even after other threads get resumed, so
the 
allocation will wait for finalizers to run? (just thought of that)

 Even with such precautions, a lock free implementation has one more problem:
what happens if the memory block was released to the os? That's a segfault :(
if other data is checked to avoid such situations, that's likely where gc locks
come into the picture... I'm pretty sure Sean said querying the attribute bits
requires a lock. It may be for this reason. I still have to look at the code
though...

You mean it would segfault on checking the mark bit? Can't the GC just say the 
mark bit was not set for any pointer not pointing into a currently-allocated
pool?
But yeah, checking if that address is part of an allocated pool may require 
holding the GC lock...

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Frits van Bommel, el 22 de abril a las 23:39 me escribiste:
I think this approach is both simple and efficient, it doesn't have races
at all (unless I'm missing something, of course). The only problem is it's
too tied to the current GC implementation. But I think any weak ref
implementation will, so it looks like weak ref really belong to the GC.

I think you're missing something. Consider this sequence of events:
- Object A is collected by the GC. (And its mark bit M is cleared)
- Object B is allocated, and gets the memory formerly used by object A. Mark
bit M is set.
- A weak reference to A gets used; it checks mark bit M, which is set...

In other words, the "this object was collected" state needs to be
stored in the weak reference, not in the object that got collected (or
anything tied to its memory location) because that memory may get
reused.

That's where finalizers come in. They can mark the weak ref as invalid
so that you don't care what was returned. In my (recent?) pseudo code,
the weak ref was decoded, the mark bit was checked, and then the weak
ref was decoded again. The final decoding is to solve the race with the
finalizer.

 
 Will this work if B was allocated in another thread, right after it gets
 resumed (i.e. possibly before the finalizer has a chance to run)?
 
 ... oh, maybe the GC lock gets held even after other threads get
 resumed, so the allocation will wait for finalizers to run? (just
 thought of that)

That's exactly what happens. So I'm not seeing any flaws so far with my
proposal.

Even with such precautions, a lock free implementation has one more
problem: what happens if the memory block was released to the os?
That's a segfault :( if other data is checked to avoid such situations,
that's likely where gc locks come into the picture... I'm pretty sure
Sean said querying the attribute bits requires a lock. It may be for
this reason. I still have to look at the code though...

 
 You mean it would segfault on checking the mark bit? Can't the GC just
 say the mark bit was not set for any pointer not pointing into
 a currently-allocated pool?  But yeah, checking if that address is part
 of an allocated pool may require holding the GC lock...

In the patch I posted this is already happening. The mark bit is returned
as part of the BlkAttr belonging to a memory block. If the pointer passed
to gc_getAttr() (or gc_query()) don't point to a GC allocated memory
block, 0 is returned.

I any case, I can't see how that could happen if there are still
notifications before freeing the memory block in the sweep phase.

The sequence is like this:

0) All live cells has the mark bit set
1) Acquire the GC lock
2) Stop the world
3) Clear mark bits
4) Mark
5) Start the world
6) Sweep (calling finalizing notification callbacks)
7) Release the GC lock

Then, you are guaranteed to access to valid memory (mark bit or even
the entire memory block) until the notification comes. When the weak
reference receive the notification it invalidates the reference and don't
rely anymore in the mark bit.

But I just found a flaw in my patch (not the proposal), right now, as you
say, the gc_query() and gc_getAttr() functions are acquiring the GC lock.
Some non-locking interface should be exposed that calls to queryNoSync()
and getAttrNoSync(). It should be safe for this limited usage as long as
there are not concurrent calls to setAttr()/clrAttr() to that memory
block (I think).

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Instead of doing a wash, I just keep buying underwear. My goal is to have over
360 pair. That way I only have to do wash once a year.
	-- George Constanza

Apr 22 2009

Jason House <jason.james.house gmail.com> writes:

I'm finally looking at the gc source code.  From what I can tell, memory is 
allocated in large chunks (pools), and divided into 16 byte chunks.  
Attributes are not all packed together like the BlkInfo of the spec implies, 
so some so access to that stuff is safer than I would have thought.

Here's a weak ref implementation that I just whipped together under the 
assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
haven't scoured the gc's code to ensure everything is legit.  I declared it 
as a class mostly because value semantics don't make much sense and I'm too 
lazy/tired to do the sample code as a 100% correct struct definition.

My two biggest concerns: 

1. Is the gc lock held while running a constructor? If not and it's needed 
to do what it needs, where can the hook be added to make the construction of 
a weak reference special?

2. Will the assignment to noscan get overwritten / hit a race condition?  It 
is possible to hide the pointer inside a size_t instead, but given the 
intimate tying to the GC such tricks just make the code tougher to read.

class weakRef(T) if (is(T==class))
{
    this(T _t){
        // set up tracking for the weakly referenced object
        t = _t;
        pool = findPool(t);
        rt_attachDisposeEvent(&t, &sweep);

        // tell the gc to not scan the weak reference
        weakRefPool = findPool(&this);
        weakRefPool.noscan[index(weakRefPool,&this)] = true;
    }

    T opCall(){
        T tmp = t; // force future conservative GC scan to find the pointer
        if ( (tmp is null) || (pool.mark[index] == false) ){
            t = null;
            return null;
        }
        tmp = t; // avoid race where t was replaced with new data
        return tmp;
    }

private:
    T t;
    Pool* pool;
  
    size_t index(Pool *p, byte *element){ return (element-p)/16; }
    size_t index(){ return index(pool,&t); }

    void sweep( Object o ){
        t = null;
        pool = null;
    }
}


Comments and criticism are welcome.







Leandro Lucarella wrote:

 Frits van Bommel, el 22 de abril a las 23:39 me escribiste:
I think this approach is both simple and efficient, it doesn't have
races at all (unless I'm missing something, of course). The only
problem is it's too tied to the current GC implementation. But I think
any weak ref implementation will, so it looks like weak ref really
belong to the GC.

I think you're missing something. Consider this sequence of events:
- Object A is collected by the GC. (And its mark bit M is cleared)
- Object B is allocated, and gets the memory formerly used by object A.
Mark bit M is set. - A weak reference to A gets used; it checks mark
bit M, which is set...

In other words, the "this object was collected" state needs to be
stored in the weak reference, not in the object that got collected (or
anything tied to its memory location) because that memory may get
reused.

That's where finalizers come in. They can mark the weak ref as invalid
so that you don't care what was returned. In my (recent?) pseudo code,
the weak ref was decoded, the mark bit was checked, and then the weak
ref was decoded again. The final decoding is to solve the race with the
finalizer.

 
 Will this work if B was allocated in another thread, right after it gets
 resumed (i.e. possibly before the finalizer has a chance to run)?
 
 ... oh, maybe the GC lock gets held even after other threads get
 resumed, so the allocation will wait for finalizers to run? (just
 thought of that)

 
 That's exactly what happens. So I'm not seeing any flaws so far with my
 proposal.
 
Even with such precautions, a lock free implementation has one more
problem: what happens if the memory block was released to the os?
That's a segfault :( if other data is checked to avoid such situations,
that's likely where gc locks come into the picture... I'm pretty sure
Sean said querying the attribute bits requires a lock. It may be for
this reason. I still have to look at the code though...

 
 You mean it would segfault on checking the mark bit? Can't the GC just
 say the mark bit was not set for any pointer not pointing into
 a currently-allocated pool?  But yeah, checking if that address is part
 of an allocated pool may require holding the GC lock...

 
 In the patch I posted this is already happening. The mark bit is returned
 as part of the BlkAttr belonging to a memory block. If the pointer passed
 to gc_getAttr() (or gc_query()) don't point to a GC allocated memory
 block, 0 is returned.
 
 I any case, I can't see how that could happen if there are still
 notifications before freeing the memory block in the sweep phase.
 
 The sequence is like this:
 
 0) All live cells has the mark bit set
 1) Acquire the GC lock
 2) Stop the world
 3) Clear mark bits
 4) Mark
 5) Start the world
 6) Sweep (calling finalizing notification callbacks)
 7) Release the GC lock
 
 Then, you are guaranteed to access to valid memory (mark bit or even
 the entire memory block) until the notification comes. When the weak
 reference receive the notification it invalidates the reference and don't
 rely anymore in the mark bit.
 
 But I just found a flaw in my patch (not the proposal), right now, as you
 say, the gc_query() and gc_getAttr() functions are acquiring the GC lock.
 Some non-locking interface should be exposed that calls to queryNoSync()
 and getAttrNoSync(). It should be safe for this limited usage as long as
 there are not concurrent calls to setAttr()/clrAttr() to that memory
 block (I think).

Apr 22 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 22 de abril a las 21:26 me escribiste:
 I'm finally looking at the gc source code.  From what I can tell, memory is 
 allocated in large chunks (pools), and divided into 16 byte chunks.  
 
 Attributes are not all packed together like the BlkInfo of the spec implies, 
 so some so access to that stuff is safer than I would have thought.

I'm sorry you had to go through that, I think this could have save you
a lot of time =)
http://proj.llucax.com.ar/blog/dgc/blog/tag/understanding%20the%20current%20gc

 Here's a weak ref implementation that I just whipped together under the 
 assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
 haven't scoured the gc's code to ensure everything is legit.  I declared it 
 as a class mostly because value semantics don't make much sense and I'm too 
 lazy/tired to do the sample code as a 100% correct struct definition.
 
 My two biggest concerns: 
 
 1. Is the gc lock held while running a constructor? If not and it's needed 
 to do what it needs, where can the hook be added to make the construction of 
 a weak reference special?

I don't think it's held, I just can't see how it can be held because
construction is not done in the GC.

If you need to acquire the GC lock I think you can just do:
synchronized (GC.gcLock) {
	// protected stuff
}
The GC lock is a static member of the GC struct in gcx.d.

 2. Will the assignment to noscan get overwritten / hit a race condition?

I don't think that should happen except if someone calls
gc_setAttr/gc_getAttr().

 It is possible to hide the pointer inside a size_t instead, but given the
 intimate tying to the GC such tricks just make the code tougher to read.

Agree. Maybe you can overload the operator new and pass the flags directly
at allocation time, but I don't know very well how this is plugged in the
runtime/GC, but since it's a regular gc_malloc() call it should be
thread-safe.

Something like:

class weakRef(T) if (is(T==class))
{
	new(size_t size)
	{
		return gc_malloc(size, FINALIZE | NO_SCAN);
	}
	this(T _t)
	{
		// set up tracking for the weakly referenced object
		t = _t;
		pool = findPool(t);
		rt_attachDisposeEvent(&t, &sweep);
	}
	// ...
}


 class weakRef(T) if (is(T==class))
 {
     this(T _t){
         // set up tracking for the weakly referenced object
         t = _t;
         pool = findPool(t);
         rt_attachDisposeEvent(&t, &sweep);
 
         // tell the gc to not scan the weak reference
         weakRefPool = findPool(&this);
         weakRefPool.noscan[index(weakRefPool,&this)] = true;

You can do this by calling setAttr(&this, NO_SCAN).

     }
 
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){

pool.mark.test(index) is used in the GC code, I don't know if the array
syntax works. And to use the mark bit trick, I think something in the
lines of my patch is needed to ensure the mark bit is set for all live
objects.

             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data

I don't understand exactly what kind of race are you trying to avoid with
this (and how this assignation can avoid any race if there is one).

         return tmp;
     }
 
 private:
     T t;
     Pool* pool;
   
     size_t index(Pool *p, byte *element){ return (element-p)/16; }
     size_t index(){ return index(pool,&t); }
 
     void sweep( Object o ){
         t = null;
         pool = null;
     }
 }



-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Lo último que hay que pensar es que se desalinea la memoria
Hay que priorizar como causa la idiotez propia
Ya lo tengo asumido
	-- Pablete, filósofo contemporáneo desconocido

Apr 22 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 22 de abril a las 21:26 me escribiste:
 I'm finally looking at the gc source code.  From what I can tell, memory is 
 allocated in large chunks (pools), and divided into 16 byte chunks.  
 
 Attributes are not all packed together like the BlkInfo of the spec implies, 
 so some so access to that stuff is safer than I would have thought.

 
 I'm sorry you had to go through that, I think this could have save you
 a lot of time =)
 http://proj.llucax.com.ar/blog/dgc/blog/tag/understanding%20the%20current%20gc

It really wasn't that bad of a way to familiarize myself. I had enough of an
idea about how it coulf work already. It is helpful to compare what I expect in
a gc verses what's actually in the gc. I appreciate the link and will make sure
to read it.


 
 Here's a weak ref implementation that I just whipped together under the 
 assumption it goes into druntime/src/gc/basic/gcx.d.  I think it is sane.  I 
 haven't scoured the gc's code to ensure everything is legit.  I declared it 
 as a class mostly because value semantics don't make much sense and I'm too 
 lazy/tired to do the sample code as a 100% correct struct definition.
 
 My two biggest concerns: 
 
 1. Is the gc lock held while running a constructor? If not and it's needed 
 to do what it needs, where can the hook be added to make the construction of 
 a weak reference special?

 
 I don't think it's held, I just can't see how it can be held because
 construction is not done in the GC.
 
 If you need to acquire the GC lock I think you can just do:
 synchronized (GC.gcLock) {
 	// protected stuff
 }
 The GC lock is a static member of the GC struct in gcx.d.

Efficiency is a big concern of mine with this because I'm going to use it
heavily. Locking twice is undesirable, but I may do it in an initial
implementation. A lock on deref is unacceptable though.


 
 2. Will the assignment to noscan get overwritten / hit a race condition?

 
 I don't think that should happen except if someone calls
 gc_setAttr/gc_getAttr().
 
 It is possible to hide the pointer inside a size_t instead, but given the
 intimate tying to the GC such tricks just make the code tougher to read.

 
 Agree. Maybe you can overload the operator new and pass the flags directly
 at allocation time, but I don't know very well how this is plugged in the
 runtime/GC, but since it's a regular gc_malloc() call it should be
 thread-safe.

Interesting idea. I'll look into that.



 Something like:
 
 class weakRef(T) if (is(T==class))
 {
 	new(size_t size)
 	{
 		return gc_malloc(size, FINALIZE | NO_SCAN);
 	}
 	this(T _t)
 	{
 		// set up tracking for the weakly referenced object
 		t = _t;
 		pool = findPool(t);
 		rt_attachDisposeEvent(&t, &sweep);
 	}
 	// ...
 }
 
 
 class weakRef(T) if (is(T==class))
 {
     this(T _t){
         // set up tracking for the weakly referenced object
         t = _t;
         pool = findPool(t);
         rt_attachDisposeEvent(&t, &sweep);
 
         // tell the gc to not scan the weak reference
         weakRefPool = findPool(&this);
         weakRefPool.noscan[index(weakRefPool,&this)] = true;

 
 You can do this by calling setAttr(&this, NO_SCAN).

True.

 
     }
 
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){

 
 pool.mark.test(index) is used in the GC code, I don't know if the array
 syntax works. And to use the mark bit trick, I think something in the
 lines of my patch is needed to ensure the mark bit is set for all live
 objects.
 
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data

 
 I don't understand exactly what kind of race are you trying to avoid with
 this (and how this assignation can avoid any race if there is one).

Consider this sequence:
1. GC runs mark phase and leaves object unmarked.
2. Weakref starts dereferencing and reads its hidden pointer.
3. Sweep calls finalizer and deallocates
4. A new object is allocated in the old location
5. Query for GC mark says we're ok
6. Weakref returns its old copy of the pointer
7. Pointer is used
8. Random behavior


 
         return tmp;
     }
 
 private:
     T t;
     Pool* pool;
   
     size_t index(Pool *p, byte *element){ return (element-p)/16; }
     size_t index(){ return index(pool,&t); }
 
     void sweep( Object o ){
         t = null;
         pool = null;
     }
 }

 
 
 
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------
 Lo último que hay que pensar es que se desalinea la memoria
 Hay que priorizar como causa la idiotez propia
 Ya lo tengo asumido
 	-- Pablete, filósofo contemporáneo desconocido

Apr 23 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 23 de abril a las 12:51 me escribiste:
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){

 
 pool.mark.test(index) is used in the GC code, I don't know if the array
 syntax works. And to use the mark bit trick, I think something in the
 lines of my patch is needed to ensure the mark bit is set for all live
 objects.
 
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data

 
 I don't understand exactly what kind of race are you trying to avoid with
 this (and how this assignation can avoid any race if there is one).

 
 Consider this sequence:
 1. GC runs mark phase and leaves object unmarked.
 2. Weakref starts dereferencing and reads its hidden pointer.

Ok, I see now that this can happen because you first store the reference
into the stack so it can be seen in future collections.

What I'm not sure is if it's really necessary to do that step. What can go
wrong with this simple approach?

     T opCall() {
         if (!pool.mark.test(index))
             t = null;
         return t;
     }

Weakref only reads the reference *if* the mark bit is set, in which case
is safe to assume that the object is live.

If the object is garbage, the mark bit is unset and in that case the
reference is only *written* (the only race it could happen is the sweep()
can be called but in that case the reference is written to null too, so
there should be no problem with that, I guess...

As long as the test are successful (t is not overwritten to null), the GC
should be able to find the reference in the stack when the function
returns.

I'm probably missing something else...

 3. Sweep calls finalizer and deallocates
 4. A new object is allocated in the old location
 5. Query for GC mark says we're ok
 6. Weakref returns its old copy of the pointer
 7. Pointer is used
 8. Random behavior

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------

Apr 23 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 23 de abril a las 12:51 me escribiste:
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){

 
 pool.mark.test(index) is used in the GC code, I don't know if the array
 syntax works. And to use the mark bit trick, I think something in the
 lines of my patch is needed to ensure the mark bit is set for all live
 objects.
 
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data

 
 I don't understand exactly what kind of race are you trying to avoid with
 this (and how this assignation can avoid any race if there is one).

 
 Consider this sequence:
 1. GC runs mark phase and leaves object unmarked.
 2. Weakref starts dereferencing and reads its hidden pointer.

 
 Ok, I see now that this can happen because you first store the reference
 into the stack so it can be seen in future collections.

Yeah.


 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }

What happens when a collection runs after test returns? You need to copy t onto
the stack in order to be safe.

 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.

How do you check the mark bit? In the sample code, it used the reference...
Admittedly, one could store the right info to avoid that, but it hardly matters.


 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...

You're not considering what can happen if the weak ref dereferencing thread
stops and the rest of the world wreaks havoc on the state of the world. A
collection could run after checking the mark. If one is careful with that, and
a copy of t is kept on the stack, will the optimizer reuse it? Or will it
reread the value? That matters if the finalizer runs. The mark bit check can
fail from reallocation after deletion.

I hope that helps...



 3. Sweep calls finalizer and deallocates
 4. A new object is allocated in the old location
 5. Query for GC mark says we're ok
 6. Weakref returns its old copy of the pointer
 7. Pointer is used
 8. Random behavior

 
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------

Apr 23 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 23 de abril a las 16:48 me escribiste:
 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }

 
 What happens when a collection runs after test returns? You need to copy
 t onto the stack in order to be safe.

You're right, I definitely underestimated the problem. Now I see that this
sequence breaks that implementation:

1) WeakRef.opCall() is called (whick stores a reference to R)
2) The mark bit for R is tested and returns true
3) CPU time is given to other threads with reference to R, which are
   removed
4) Other thread triggers a collection and the world is stopped
5) R gets unmarked because there are no other references to it
6) The world is started
7) WeakRef.opCall() returns R when it should returned null

The problem I see with your original proposal:

    T opCall(){
        T tmp = t; // force future conservative GC scan to find the pointer
        if ( (tmp is null) || (pool.mark[index] == false) ){
            t = null;
            return null;
        }
        tmp = t; // avoid race where t was replaced with new data
        return tmp;
    }

Is that I guess is very likely that 'tmp' get optimized out. Would
declaring it as volatile help? If not, I guess the only way to avoid that
race is add a root manually to the GC (or mark the weakref again as no
NO_SCAN). But both requires synchronization. =(

 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.

 
 How do you check the mark bit? In the sample code, it used the
 reference... Admittedly, one could store the right info to avoid that,
 but it hardly matters.

Right.

 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...

 
 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...

Yes, thank you.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
- Que hacés, ratita?
- Espero un ratito...

Apr 25 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 23 de abril a las 16:48 me escribiste:
 What I'm not sure is if it's really necessary to do that step. What can go
 wrong with this simple approach?
 
      T opCall() {
          if (!pool.mark.test(index))
              t = null;
          return t;
      }

 
 What happens when a collection runs after test returns? You need to copy
 t onto the stack in order to be safe.

 
 You're right, I definitely underestimated the problem. Now I see that this
 sequence breaks that implementation:
 
 1) WeakRef.opCall() is called (whick stores a reference to R)
 2) The mark bit for R is tested and returns true
 3) CPU time is given to other threads with reference to R, which are
    removed
 4) Other thread triggers a collection and the world is stopped
 5) R gets unmarked because there are no other references to it
 6) The world is started
 7) WeakRef.opCall() returns R when it should returned null
 
 The problem I see with your original proposal:
 
     T opCall(){
         T tmp = t; // force future conservative GC scan to find the pointer
         if ( (tmp is null) || (pool.mark[index] == false) ){
             t = null;
             return null;
         }
         tmp = t; // avoid race where t was replaced with new data
         return tmp;
     }
 
 Is that I guess is very likely that 'tmp' get optimized out. Would
 declaring it as volatile help? If not, I guess the only way to avoid that
 race is add a root manually to the GC (or mark the weakref again as no
 NO_SCAN). But both requires synchronization. =(

I would use tango.core.atomic that I took from the Tango D2 branch. Another
option is inline asm, but I'd have to research how to do that.

 
 Weakref only reads the reference *if* the mark bit is set, in which case
 is safe to assume that the object is live.

 
 How do you check the mark bit? In the sample code, it used the
 reference... Admittedly, one could store the right info to avoid that,
 but it hardly matters.

 
 Right.
 
 If the object is garbage, the mark bit is unset and in that case the
 reference is only *written* (the only race it could happen is the sweep()
 can be called but in that case the reference is written to null too, so
 there should be no problem with that, I guess...
 
 As long as the test are successful (t is not overwritten to null), the GC
 should be able to find the reference in the stack when the function
 returns.
 
 I'm probably missing something else...

 
 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...

 
 Yes, thank you.

It's amazingly difficult to get right for something so "simple"
 
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------
 - Que hacés, ratita?
 - Espero un ratito...

Apr 25 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 25 de abril a las 14:19 me escribiste:
 Is that I guess is very likely that 'tmp' get optimized out. Would
 declaring it as volatile help? If not, I guess the only way to avoid that
 race is add a root manually to the GC (or mark the weakref again as no
 NO_SCAN). But both requires synchronization. =(

 
 I would use tango.core.atomic that I took from the Tango D2 branch.
 Another option is inline asm, but I'd have to research how to do that.

I don't know how an atomic can help, I don't know much about atomics but
I think they don't have to be pointer sized (different architectures
provide different atomic types). And in any case you don't want tmp to be
atomic, you just don't want it to be optimized out, I think..

I guess asm can never get optimized (right?) so I think that should do.

BTW, that's wrong with volatile? I think it should work too.

 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...

 
 Yes, thank you.

 
 It's amazingly difficult to get right for something so "simple"

Indeed.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Wake from your sleep,
the drying of your tears,
Today we escape, we escape.

Apr 25 2009

Jason House <jason.james.house gmail.com> writes:

Leandro Lucarella Wrote:

 Jason House, el 25 de abril a las 14:19 me escribiste:
 Is that I guess is very likely that 'tmp' get optimized out. Would
 declaring it as volatile help? If not, I guess the only way to avoid that
 race is add a root manually to the GC (or mark the weakref again as no
 NO_SCAN). But both requires synchronization. =(

 
 I would use tango.core.atomic that I took from the Tango D2 branch.
 Another option is inline asm, but I'd have to research how to do that.

 
 I don't know how an atomic can help, I don't know much about atomics but
 I think they don't have to be pointer sized (different architectures
 provide different atomic types). And in any case you don't want tmp to be
 atomic, you just don't want it to be optimized out, I think..

Not tmp, the hidden pointer t. The read calls can't be optimized away, even
with inlining because they use asm code.


 I guess asm can never get optimized (right?) so I think that should do.

Right. D2 treats all asm as volatile asm.
 
 BTW, that's wrong with volatile? I think it should work too.

The volatile keyword does not exist in D2. Declaring the t (reference) as
volatile would work if it existed in D2.
 
 You're not considering what can happen if the weak ref dereferencing
 thread stops and the rest of the world wreaks havoc on the state of the
 world. A collection could run after checking the mark. If one is careful
 with that, and a copy of t is kept on the stack, will the optimizer
 reuse it? Or will it reread the value? That matters if the finalizer
 runs. The mark bit check can fail from reallocation after deletion.
 
 I hope that helps...

 
 Yes, thank you.

 
 It's amazingly difficult to get right for something so "simple"

 
 Indeed.
 
 -- 
 Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
 ----------------------------------------------------------------------------
 GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
 ----------------------------------------------------------------------------
 Wake from your sleep,
 the drying of your tears,
 Today we escape, we escape.

Apr 25 2009

Jason House <jason.james.house gmail.com> writes:

Sean Kelly Wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.

 Maybe I misunderstood, but I interpreted earlier discussion in this thread to
mean the gc did the

 following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even

 though the object may be valid in that window, there's no guarantee that the
reference won't be used
 after step 5b... This would lead to random behavior and seg faults. No
self-respecting weak reference
 library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.

 
 Oh I see what you're saying.  So this could happen:
 
 4. The GC resume all threads
 4a. Another thread derefs the WeakRef
 4b. This thread checks to see if the GC has notified that the
        object is being collected.  No notification, so it uses the object.
 5. The GC scans its memory and finalizes the object
 6. The other thread tries to use the object and explodes
 
 You're right, it sounds like you'd need to add a step between 4a
 and 4b where the WeakRef notifies the GC that the object it's using
 shouldn't be collected (if a collection is in progress).  In fact, all you
 really need to do is acquire the GCs mutex for a moment.  Once
 you've acquired it then you know that any collection in progress
 must be complete (assuming of course that the GC uses a mutex
 to prevent corruption).
 
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.

I was hoping for a lock-free query that would return garbage (without crashing)
in the rare case that a race occurred. If I get garbage (and says the object is
marked) then the finalizer has run when I recheck the weak ref.  Do all
druntime queries acquire the GC lock? I'm hoping to do about 100_000+ of these
operations per second per core (and do other work too).

As long as memory blocks are not released to the OS while all threads are
unpaused, lock-free operations should be possible.

As a side note, does druntime use multiple threads for the garbage collector?
Scalability with Tango and a memory intensive application was horrible last
time I tried renting an 8 core system. It may have been my fault, but the
garbage collector is a convenient scape goat.

Apr 20 2009

Leandro Lucarella <llucax gmail.com> writes:

Jason House, el 20 de abril a las 17:45 me escribiste:
 As a side note, does druntime use multiple threads for the garbage
 collector?

No, but I'm planning to add concurrency support to the GC. D1 and Posix
OSs, though. Porting to D2 should be fairly easy because I'll use Tango's
GC as base, which is almost the same as Druntime AFAIK.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
careful to all animals (never washing spiders down the plughole),
keep in contact with old friends (enjoy a drink now and then),
will frequently check credit at (moral) bank (hole in the wall),

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.

 I was hoping for a lock-free query that would return garbage (without
crashing) in the rare case that

a race occurred. If I get garbage (and says the object is marked) then the
finalizer has run when I
recheck the weak ref.  Do all druntime queries acquire the GC lock? I'm hoping
to do about 100_000+ of
these operations per second per core (and do other work too).

They all acquire the lock.  And even with per-thread GCs, any
operation against the shared GC will acquire its lock as well.
I'm not terribly inclined to try and make a lock-free GC to
avoid this for even simple queries.

 As long as memory blocks are not released to the OS while all threads are
unpaused, lock-free

operations should be possible.

Blocks may be released to the OS at this time, but currently
they will only be for pools that are completely empty.  In
practice this generally only happens for a pool allocated
specifically for a single big allocation.  As for lock-free GC
stuff in general, there's too much state information involved
to make me terribly inclined to give it a shot without some
serious motivation.

 As a side note, does druntime use multiple threads for the garbage collector?
Scalability with Tango

and a memory intensive application was horrible last time I tried renting an 8
core system. It may have
been my fault, but the garbage collector is a convenient scape goat.

Currently, no.  The Druntime GC is nearly identical to the
Tango GC but for a few tweaks here and there.  I'm eventually
going to make a GC with per-thread heaps as an alternative,
but I haven't found the time yet.

The current GC is a convenient scapegoat, but it's also a valid
one if the threads in this app churn through memory continuously.
If nothing else the threads are almost definitely convoying on the
GC mutex, not to mention the cost of "stop the world" collections.
You might want to look into judicious use of GC.reserve() to help
with this some.  In apps that grow memory rather than just churn
through it, you can see a pretty solid benefit from GC.reserve().

Apr 20 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Sean Kelly wrote:
 == Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.

 Maybe I misunderstood, but I interpreted earlier discussion in this thread to
mean the gc did the

 following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 Assuming that is correct, any deref of a weak reference is unsafe between
steps 4 and 5a. Even

 though the object may be valid in that window, there's no guarantee that the
reference won't be used
 after step 5b... This would lead to random behavior and seg faults. No
self-respecting weak reference
 library should allow that to happen.
 I hope I got something wrong and weak refs are easier to implement than I'm
currently thinking.

 
 Oh I see what you're saying.  So this could happen:
 
 4. The GC resume all threads
 4a. Another thread derefs the WeakRef
 4b. This thread checks to see if the GC has notified that the
        object is being collected.  No notification, so it uses the object.
 5. The GC scans its memory and finalizes the object
 6. The other thread tries to use the object and explodes
 
 You're right, it sounds like you'd need to add a step between 4a
 and 4b where the WeakRef notifies the GC that the object it's using
 shouldn't be collected (if a collection is in progress).  In fact, all you
 really need to do is acquire the GCs mutex for a moment.  Once
 you've acquired it then you know that any collection in progress
 must be complete (assuming of course that the GC uses a mutex
 to prevent corruption).
 
 The horrible thing about all this is it makes using a multithreaded
 WeakRef horribly slow.  Even if the GC facility were provided for
 what you want to do you're talking about acquiring a mutex in
 the GC for every deref operation.  What a mess.

What if you added this:

3.5. Set GC.collectionInProgress to true.
4. The GC resumes all threads.
4a. Another thread derefs the WeakRef
4b. Before the WeakRef checks to see if the reference has been cleared,
it spinlocks on GC.collectionInProgress.
5. The GC scans its memory and finalises the object.
5a. The GC clears GC.collectionInProgress.
6. The other thread unblocks, sees the WeakRef has been cleared and
fails gracefully.

  -- Daniel

Apr 20 2009

Fawzi Mohamed <fmohamed mac.com> writes:

On 2009-04-20 22:19:42 +0200, Jason House <jason.james.house gmail.com> said:

 Sean Kelly Wrote:
 
 == Quote from Jason House (jason.james.house gmail.com)'s article
 Jason House Wrote:
 
 If I understand you correctly, a weak reference library needs to query 
 the GC to see if an object is


 marked for collection. How do I do this?
 Here's the implementation I'm currently thinking of:
 weak ref construction:
 Copy pointer into struct without casting
 Call rt_attach_dispose_event
 weak ref dereference:
 copy value into temp var
 ** query gc to see if garbage **
 If not garbage, Recopy value and return
 If garbage set value = null
 dispose of weakly referenced obj:
 set value = null
 The emphasized line is impossible according to the publicized gc 
 interface. Given the delayed

 finalization, I don't see any way around it.
 
 You're guaranteed that the GC will notify you when it finalizes
 the referenced object, so you shouldn't have to query the GC
 at all.  Just copy the reference into a GC-scannable location
 and then if you haven't been notified by the GC that the object
 was finalized the reference is good.

 
 Maybe I misunderstood, but I interpreted earlier discussion in this 
 thread to mean the gc did the following steps:
 1. Pause all threads
 2. Scan objects to clear marks
 3. Scan objects to add marks
 4. Resume all threads
 5. Scan objects, if no mark then...
     5a. Call object finalizer
     5b. Free memory for later use
 
 Assuming that is correct, any deref of a weak reference is unsafe 
 between steps 4 and 5a. Even though the object may be valid in that 
 window, there's no guarantee that the reference won't be used after 
 step 5b... This would lead to random behavior and seg faults. No 
 self-respecting weak reference library should allow that to happen.
 
 I hope I got something wrong and weak refs are easier to implement than 
 I'm currently thinking.

I was thinking to add weak refs to tango core, and I also saw this race 
condition.
The only clean way to fix it is to notify while threads are stopped, 
all the rest is really a cludge.

Fawzi

Apr 20 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 Daniel Keep wrote:
 Sean Kelly wrote:
 Jason House wrote:
 Jason House wrote:

 Tango's GC (and therefore
 druntime's GC) has an explicit notification mechanism.

 I guess I shouldn't have assumed that the features of tango became
 part of druntime.  I don't see any notification mechanism :(

 Same as Tango:

 alias void delegate(Object) DEvent;
 extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
 extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

 I should probably expose these in core.runtime.

 How are these events dispatched?  I remember a discussion about race
 conditions if these events were fired off after all threads had been
 resumed; you could have the event zero out the weak ref AFTER something
 else had dereferenced it.

 It's the responsibility of the person making the weak ref to make it
 thread-safe if that's a design consideration.  If the cleanup is
 performed before the threads are restarted and the weak ref (or
 signal/slot mechanism) uses mutexes then a deadlock is possible.  At
 least this approach provides the option for correct code to actually work.

 If that's still a possibility, maybe you could add a case where if the
 delegate's funcptr is null, it just assumes the context pointer points
 to a void*.sizeof word and zeroes it out.  You could do that without
 resuming threads.

 This is all tied into the finalizer.  To do what you suggest the GC
 would have to perform two sweeps--one before resuming threads and one after.

 If I understand you correctly, a weak reference library needs to query the GC
to see if an object is marked for collection.

How do I do this?

I think the WeakRef just has to either wrap the deref operation in a
synchronized
block if it's intended for multithreaded use or perform an atomic read and
update
of the reference.  From my understanding, the problem Daniel Keep describes
above
is simply an issue of the referenced object vanishing at the same instant
someone
tries to access it through the WeakRef.  The same thing could happen if someone
explicitly deletes the object instead of letting the GC clean it up.

Apr 20 2009

D Programming

C/C++ Programming

Other

digitalmars.D - D2 weak references