
digitalmars.D - draft proposal for ref counting in D

reply Walter Bright <newshound2 digitalmars.com> writes:
This is an email conversation we had last summer. It's of general interest, so 
reposted here with permission. We didn't reach any conclusions, but there's a 
lot of good stuff in here, and it's particularly relevant to other recent
threads.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
This is based on n.g. discussions and ideas from you guys. I'll redo it as a DIP 
if it passes the smoke test from y'all.

----------------------------------------------------------------------
     Adding Reference Counting to D

D currently supports manual memory management and generalized GC. Unfortunately, 
the pausing and memory consumption inherent in GC is not acceptable for many 
programs, and manual memory management is error-prone, tedious, and unsafe. A 
third style, reference counting (RC), addresses this. Common implementations are 
COM's AddRef/Release, Objective-C's ARC, and C++'s shared_ptr<>.

None of these three schemes is guaranteed memory-safe; they all require the 
programmer to conform to a protocol (even O-C's ARC). Stepping outside of the 
protocol results in memory corruption. A D implementation must make it possible 
to use ref counted objects in code marked as @safe, although it will be 
necessary for the implementation of those objects to be unsafe.

Some aspects of a D implementation are inevitable:

1. Avoid any requirement for more pointer types. This would cause drastic 
increases in complexity for both the compiler and the user. It may make generic 
code much more difficult to write.

2. Decay of a ref-counted pointer to a non-ref-counted pointer is unsafe, and 
can only be allowed (in @safe code) in circumstances where it can be statically 
proven to be safe.

3. A ref counted object is inherently a reference type, not a value type.

4. The compiler needs to know about ref counted types.


==Proposal==

If a class contains the following methods, in either itself or a base class, it 
is an RC class:


     T AddRef();
     T Release();
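
For illustration, here is a minimal sketch of such a class (the name Widget and 
the intrusive count are just one possible choice; where the count lives and 
whether it is thread-safe is entirely up to the implementor):

     class Widget
     {
         private int count;    // one possible place for the count; it need not be intrusive

         Widget AddRef()
         {
             ++count;          // a shared implementation would use atomics instead
             return this;
         }

         Widget Release()
         {
             if (--count == 0)
             {
                 // free owned resources; the memory itself still lives on the GC heap
             }
             return this;
         }
     }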

An RC class is like a regular D class with these additional semantics:

1. In @safe code, casting (implicit or explicit) to a base class that does not 
have both AddRef() and Release() is an error.

2. Initialization of a class reference causes a call to AddRef().

3. Assignment to a class reference causes a call to Release() on its original 
value and AddRef() on the new value.

4. Null checks are done before calling any AddRef() or Release().

5. Upon scope exit of all RC variables or temporaries, a call to Release() is 
performed, analogously to the destruction of struct variables and temporaries.

6. If a class or struct contains RC fields, calls to Release() for those fields 
will be added to the destructor, and a destructor will be created if one doesn't 
exist already.

7. If a closure is created that contains RC fields, either a compile time error 
will be generated or a destructor will be created for it.

8. Explicit calls to AddRef/Release will not be allowed in @safe code.

9. A call to AddRef() will be added to any argument passed as a parameter.

10. Function returns have an AddRef() already done to the return value.

11. The compiler can elide any AddRef()/Release() calls it can prove are
redundant.

12. AddRef() is not called when passed as the implicit 'this' reference.

13. Taking the address of, or passing by reference, any fields of an RC object 
is not allowed in @safe code. Passing by reference an RC field is, however, allowed.

14. RC objects will still be allocated on the GC heap - this means that a normal 
GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically scanned for heap references with no additional action required by 
the user.


==Existing Code==

D COM objects already have AddRef() and Release(). This proposal should not 
break that code; it'll just mean that there will be extra AddRef()/Release() 
calls made. Calling AddRef()/Release() should never have been allowed in @safe 
code anyway.

Any other existing uses of AddRef()/Release() will break.

==Arrays==

Built-in arrays have no place to put a reference count. Ref counted arrays would 
hence become a library type, based on a ref counted class with overloaded 
operators for the array operations.

==Results==

This is a very flexible approach, allowing for support of general RC objects, as 
well as specific support for COM objects and Objective-C ARC. 
AddRef()/Release()'s implementation is entirely up to the user or library writer.

@safe code can be guaranteed to be memory safe, as long as AddRef()/Release() 
are correctly implemented.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

Looks like a good start.

A few things:

1. Proposal point 3, you need to AddRef first, THEN Release the original. Reason 
being, you could be assigning a reference the same value as before. In this 
case, if you decrement *first*, you will decrement the reference, which might 
reduce it to 0, and free the object before you increment. In most cases, the 
AddRef and Release are elided, so it's not bad if you do this.

I wonder if it's not a good idea to have a RefAssign function that takes two RC 
objects and does a check to see if they are the same before doing AddRef and 
Release to help with this issue. Calls where the compiler can prove they are the 
same value can be elided.
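
A rough sketch of that idea, assuming the hypothetical name refAssign and the 
AddRef()/Release() protocol from the proposal - the point being that the new 
value is retained before the old one is released, so assigning a reference to 
itself can never free the object:

     void refAssign(T)(ref T lhs, T rhs)
         if (is(T == class))          // assume T is an RC class
     {
         if (lhs is rhs)
             return;                  // same object: both calls can be skipped entirely
         if (rhs !is null)
             rhs.AddRef();            // retain the new value first...
         if (lhs !is null)
             lhs.Release();           // ...then release the old one
         lhs = rhs;
     }

Something like refAssign(a, b) would then stand in for a = b wherever the 
compiler cannot prove the calls redundant.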

2. AddRef and Release should be treated not as function calls, but as callables. 
That is, if for some reason AddRef and Release should be aliases, this does not 
detract from the solution. Only requirement should be that they cannot be UFCS, 
as that doesn't make any sense (one cannot add reference counting after the fact 
to an object).

I'm thinking of the Objective-C objects, whose functions are "release" and 
"retain". It would be good to use the names Objective-C coders are used to.

3. Objective-C ARC uses a mechanism called auto-release pools that help cut
down 
on the release/retain calls. It works like this:

@autoreleasepool {  // create a new pool
      NSString *str = [NSString stringWithFormat: "an int: %d", 1];
      @autoreleasepool { // create a new pool on the "pool stack"
            NSDate *date = [NSDate date];
            {
                NSDate *date2 = [NSDate date];
            }
      } // auto-release date and date2
} // auto-release str

In this way, it's not the pointer going out of scope that releases the object, 
it's the release pool going out of scope that releases the object. These release 
pools themselves are simply RC objects that call release on all their objects 
(in fact, prior to the @autoreleasepool directive, you had to manually create 
and destroy these pools).

The benefit of this model is that basically, inside a pool, you can move around 
auto-released objects at will, pass them into functions, return them from 
functions, etc, without having to retain or release them for each assignment. 
It's kind of like a mini-GC.
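
In D terms - and this is only a sketch of the concept, not something the 
proposal defines - such a pool is little more than a container that defers the 
Release() calls until it is drained:

     struct AutoReleasePool
     {
         private void delegate()[] pending;

         // schedule obj for a deferred Release() when the pool drains
         T add(T)(T obj)
         {
             if (obj !is null)
                 pending ~= delegate() { obj.Release(); };
             return obj;
         }

         void drain()
         {
             foreach (release; pending)
                 release();        // release everything the pool collected, all at once
             pending = null;
         }
     }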

It works by convention, that any function that returns an RC object:

if it's called 'new...' or 'init...' or 'alloc...' or 'copy...', then the object 
is assumed to be returned with its retain count incremented on behalf of the 
calling scope. This means, if you assign it to a member variable, for instance, 
you do not have to retain the object again, and if it goes out of scope, you 
must call release on it.

All other functions return 'auto-released' objects, or objects which have been 
queued in the latest auto release pool. The compiler knows this and can elide 
more of the releases and retains.

Would this be a mechanism that's worth putting in? I think it goes really well 
with something like TempAlloc.  I don't think we should use convention,
though...

-Steve
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
 Looks like a good start.

 A few things:

 1. Proposal point 3, you need to AddRef first, THEN Release the original. 
 Reason being, you could be assigning a reference the same value as before. In 
 this case, if you decrement *first*, you will decrement the reference, which 
 might reduce it to 0, and free the object before you increment. In most cases, 
 the AddRef and Release are elided, so it's not bad if you do this.

Yeah, I got that backwards, and I should know better.
 I wonder if it's not a good idea to have a RefAssign function that takes two 
 RC objects and does a check to see if they are the same before doing AddRef and 
 Release to help with this issue. Calls where the compiler can prove they are 
 the same value can be elided.

I'd like to see how far we get with just AddRef/Release first, and getting the 
semantics of them right first.
 2. AddRef and Release should be treated not as function calls, but as 
 callables. That is, if for some reason AddRef and Release should be aliases, 
 this does not detract from the solution.

Yes, just like the names for Ranges are used.
 Only requirement should be that they cannot be UFCS, as that doesn't make 
 any sense (one cannot add reference counting after the fact to an object).

That would be covered by disallowing explicit calls to AddRef/Release in @safe 
code.
 I'm thinking of the Objective-C objects, whose functions are "release" and 
 "retain". It would be good to use the names Objective-C coders are used to.

AddRef/Release are the COM names. It's trivial to have one wrap the other. I 
picked AddRef/Release because I'm familiar with their semantics, and am not with 
O-C.
 3. Objective-C ARC uses a mechanism called auto-release pools that help cut 
down on the release/retain calls. It works like this:
 @autoreleasepool {  // create a new pool
       NSString *str = [NSString stringWithFormat: "an int: %d", 1];
       @autoreleasepool { // create a new pool on the "pool stack"
             NSDate *date = [NSDate date];
             {
                 NSDate *date2 = [NSDate date];
             }
       } // auto-release date and date2
 } // auto-release str

 In this way, it's not the pointer going out of scope that releases the 
object, it's the release pool going out of scope that releases the object. These release pools themselves are simply RC objects that call release on all their objects (in fact, prior to the autoreleasepool directive, you had to manually create and destroy these pools).
 The benefit of this model is that basically, inside a pool, you can move 
around auto-released objects at will, pass them into functions, return them from functions, etc, without having to retain or release them for each assignment. It's kind of like a mini-GC.
 It works by convention, that any function that returns a RC object:

 if it's called 'new...' or 'init...' or 'alloc...' or 'copy...', then the 
object is assumed returned with it's retain count incremented on behalf of the calling scope. This means, if you assign it to a member variable, for instance, you do not have to retain the object again, and if it goes out of scope, you must call release on it.
 All other functions return 'auto-released' objects, or objects which have 
queued in the latest auto release pool. The compiler knows this and can elide more of the releases and retains.
 Would this be a mechanism that's worth putting in? I think it goes really 
 well with something like TempAlloc. I don't think we should use convention, 
 though...

I agree with not relying on convention. But also reserving the new*, init*, 
alloc* and copy* namespaces seems excessive for D.

As for autoreleasepool, it is relying on a convention that its fields are not 
leaking. I don't see how we can enforce this.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
I have overlooked addressing what happens when you pass an RC ref to a pure 
function. Is the pure function allowed to call AddRef()/Release()? Not sure.
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

That's a resounding yes.

Consider that this is allowed:

class X {}

struct S
{
    X foo;
    void setFoo(X newfoo) pure {foo = newfoo;}
}

If X is ref-counted, you HAVE to increment the ref count.

The only issue here is, ref counting may have to access global data.  But we 
already have exceptions for memory management, even for strong-pure functions.

-Steve

On Jun 25, 2013, at 4:48 PM, Walter Bright wrote:

 I have overlooked addressing what happens when you pass an RC ref to a pure 
function. Is the pure function allowed to call AddRef()/Release()? Not sure.
Oct 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:


On Jun 25, 2013, at 4:44 PM, Walter Bright wrote:

 On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
 Would this be a mechanism that's worth putting in? I think it goes really 
well with something like TempAlloc. I don't think we should use convention, though...
 I agree with not relying on convention. But also reserving the new*, init*, 
alloc* and copy* namespaces seems excessive for D.
 As for autoreleasepool, it is relying on a convention that its fields are not 
 leaking. I don't see how we can enforce this.

I don't think the autoreleasepool is relying on convention, it's simply giving 
the compiler a way to elide careful tracking of temporaries' reference counts.

It was definitely of more use when manual reference counting was done, because 
one only had to worry about retaining non-temporary data in that case. But the 
compiler can make the same optimizations (and does in the ARC version of 
Objective-C).

Consider the following code:

NSString *str = [NSString stringWithFormat: "%d", 5];

// translating to D, that would be something like:

NSString str = NSString.stringWithFormat("%d", 5);

stringWithFormat is a class method that gives you back a temporary string. You 
are not asserting ownership, you are just assigning to a variable.

Now, if you wanted to do some fancy stuff with str, we could do:

{
   NSString str2;
   {
      NSString str = NSString.stringWithFormat("%d", 5);
      if(condition) str2 = str;
      if(otherCondition)
      {
         NSString str3 = str;
         str = NSString.stringWithFormat("%d", 6);
      }
   }
   str2 = str;
}

Now, in all this mess, how is the compiler to sort out the AddRefs and Releases? 
Most likely, it will end up adding more than it needs to. But with autorelease 
pool, it's like you did this:

AutoReleasePool arp;
{
   NSString str2;
   {
      NSString str = NSString.stringWithFormat("%d", 5);
      arp.add(str);
      if(condition) str2 = str;
      if(otherCondition)
      {
         NSString str3 = str;
         str = NSString.stringWithFormat("%d", 6);
         arp.add(str);
      }
   }
   str2 = str;
}
arp.drain(); // releases both strings used, don't care what now-out-of-scope variables

Essentially, they are only retained when created, and because they go out of 
scope, they are no longer needed. The compiler can surmise that because the 
fields aren't leaving the scope, it doesn't have to retain them. If it does see 
that, it adds a retain. Then, it can release them all at once.

In fact, this could be done automatically, but you have to allocate a place to 
put these 'scheduled for release' things. In Cocoa, the main event loop has an 
auto release pool, and you can add them manually wherever you wish for more 
fine-grained memory management (that is, if you wanted to free objects before 
you left the event loop).

Note that in Objective-C, they use those naming conventions to determine whether 
an object is auto-released or not. But we could make sure it's *always* 
auto-released, as we don't have the historical requirements that Objective-C has.

The question is, does it make sense to use this technique to "lump together" 
deallocations instead of conservatively calling retain/release wherever you 
assign variables (like C++ shared_ptr)? And a further question is whether the 
compiler should pick those points, or whether they should be picked manually.

-Steve
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
If autoreleasepool is just a handy way to lump together Release() calls, then 
that is quite unnecessary if the compiler inserts calls to Release() 
automatically. If it is, instead, a promise that members of autoreleasepool do 
not leak references outside of that object, then this is very problematic for D 
to guarantee such - and guarantee it it must. I.e. it's "escape analysis" in 
another disguise.

I think the compiler should pick where to put the Release() calls, that is the 
whole point of ARC. If the compiler can do sufficient escape analysis to 
determine that the calls can be elided, so much the better.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On Jun 25, 2013, at 5:28 PM, Walter Bright wrote:

 If autoreleasepool is just a handy way to lump together Release() calls, then 
 that is quite unnecessary if the compiler inserts calls to Release() 
 automatically. If it is, instead, a promise that members of autoreleasepool do 
 not leak references outside of that object, then this is very problematic for D 
 to guarantee such - and guarantee it it must. I.e. it's "escape analysis" in 
 another disguise.
 I think the compiler should pick where to put the Release() calls, that is 
 the whole point of ARC. If the compiler can do sufficient escape analysis to 
 determine that the calls can be elided, so much the better.

I'm not sure exactly what is required for ARC to guarantee proper memory 
management (whether it requires flow-analysis or not), but it seems to work 
quite well for Objective-C. I think it helps minimize the expensive 
release/retain calls when you can just say "oh, someone else will clean that up 
later", just like you can with a GC.

It might be good for someone who knows the ARC eliding techniques that clang 
uses to explain how they work. We certainly shouldn't ignore those techniques.

-Steve
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 2:47 PM, Steven Schveighoffer wrote:
 I'm not sure exactly what is required for ARC to guarantee proper memory 
management (whether it requires flow-analysis or not), but it seems to work quite well for Objective-C. I think it helps minimize the expensive release/retain calls when you can just say "oh, someone else will clean that up later", just like you can with a GC.
 It might be good for someone who knows the ARC eliding techniques that clang 
uses to explain how they work. We certainly shouldn't ignore those techniques.

Also remember that O-C doesn't guarantee memory safety, so they are freed from 
some of the constraints we operate under. They can say "don't do that", we
can't.

C++ shared_ptr<> is memory safe as long as you don't escape a pointer - and no 
C++ compiler checks for that.

COM is also memory safe as long as you carefully follow the conventions - and 
again, no C++ compiler checks it.
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
 On 6/25/2013 2:47 PM, Steven Schveighoffer wrote:
 I'm not sure exactly what is required for ARC to guarantee proper memory 
management (whether it requires flow-analysis or not), but it seems to work quite well for Objective-C. I think it helps minimize the expensive release/retain calls when you can just say "oh, someone else will clean that up later", just like you can with a GC.
 It might be good for someone who knows the ARC eliding techniques that clang 
uses to explain how they work. We certainly shouldn't ignore those techniques.

 Also remember that O-C doesn't guarantee memory safety, so they are freed 
 from some of the constraints we operate under. They can say "don't do that", we 
 can't.

I'm not sure that it doesn't. At least when we are talking about object 
references. The only thing clang complains about is when you try to call any 
memory management manually, or if you disobey the naming conventions.

-Steve
Oct 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

Le 25-juin-2013 à 17:20, Steven Schveighoffer a écrit :

 On Jun 25, 2013, at 4:44 PM, Walter Bright wrote:

 On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
 Would this be a mechanism that's worth putting in? I think it goes really 
well with something like TempAlloc. I don't think we should use convention, though...
 I agree with not relying on convention. But also reserving the new*, init*, 
alloc* and copy* namespaces seems excessive for D.
 As for autoreleasepool, it is relying on a convention that its fields are 
not leaking. I don't see how we can enforce this.
 I don't think the autoreleasepool is relying on convention, it's simply 
 giving the compiler a way to elide careful tracking of temporaries' reference 
 counts.

Not at all. Autorelease pools were useful at a time before ARC so you wouldn't 
have to think of manually releasing every object that called functions returned 
to you. Instead, most functions would return autoreleased objects and you'd only 
have to retain those objects you were storing elsewhere.

Nowadays, with ARC, we still have them, but that's mostly for interoperability 
with already existing code. Most functions still return autoreleased objects 
because that's the convention, and breaking that convention would cause objects 
to be retained or released too many times. So we still need autorelease pools. 
But ARC is hard at work[^1] behind the scenes to reduce the number of those 
autoreleased objects.

So no, we shouldn't introduce autorelease pools to D... well, except maybe for 
the part where we want interoperability with Objective-C (because we have no 
choice).

And finally, there's nothing unsafe with autorelease pools as long as you don't 
keep an unretained reference to an autoreleased object when the pool drains. 
Making sure you have no unretained reference is ARC's job, so with ARC it should 
not be no problem.

[^1]: One clever trick ARC does: inside the implicit call to 
objc_returnAutoreleased it adds at the end of an autoreleasing function, the 
runtime checks to see if the return address points to an instruction that'll 
call objc_retain on that same pointer. If that's the case, it skips the 
autorelease and also skips objc_retain and goes to the next instruction 
directly. Of course if the convention was always to return objects retained, 
none of this would be needed. I saw that explained on a WWDC video a couple of 
years back.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 25, 2013, at 9:31 PM, Michel Fortin wrote:

 Le 25-juin-2013 à 17:20, Steven Schveighoffer a écrit :

 I don't think the autoreleasepool is relying on convention, it's simply 
giving the compiler a way to elide careful tracking of temporaries' reference counts.
 Not at all. Autorelease pools were useful at a time before ARC so you 
 wouldn't have to think of manually releasing every object that called functions 
 returned to you. Instead, most functions would return autoreleased objects and 
 you'd only have to retain those objects you were storing elsewhere.

Having used MRC, I appreciate what autoreleasepool did, but I thought of it 
being also as a kind of blanket way to allow the compiler to remove extra 
retains/releases in ARC.

Is it not advantageous to release a whole pool of objects vs. releasing them 
individually during execution? All releases and retains are atomic, so I figured 
one could do some optimization when it's all lumped together.

I find the autorelease pools very GC-like -- you don't have to worry who uses or 
forgets the reference, it's kept in memory until you don't need it.

Anyway, everything I know about Obj-C ARC I learned from my iOS 5 book :) So 
don't take me as an expert.

-Steve
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

Le 25-juin-2013 à 21:40, Steven Schveighoffer a écrit :

 On Jun 25, 2013, at 9:31 PM, Michel Fortin wrote:

 Not at all. Autorelease pools were useful at a time before ARC so you 
wouldn't have to think of releasing manually every object called functions were returned to you. Instead, most functions would return autoreleased object and you'd only have to retain those objects you were storing elsewhere.
 Having used MRC, I appreciate what autoreleasepool did, but I thought of it 
being also as a kind of blanket way to allow the compiler to remove extra retains/releases in ARC.
 Is it not advantageous to release a whole pool of objects vs. releasing them 
 individually during execution? All releases and retains are atomic, so I 
 figured one could do some optimization when it's all lumped together.

I haven't done any benchmarking, but I'd have to assume it is more advantageous 
to just return objects retained since Apple went to great lengths to make sure 
this can happen even when the convention is to return autoreleased. There's no 
question it also simplifies the compiler. It's much easier to reason about pairs 
of retain/release than retain/autorelease.
 I find the autorelease pools very GC-like -- you don't have to worry who uses 
 or forgets the reference, it's kept in memory until you don't need it.

The concept was truly great, no doubt about that.
Oct 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 6:31 PM, Michel Fortin wrote:
 And finally, there's nothing unsafe with autorelease pools as long as you 
 don't keep an unretained reference to an autoreleased object when the pool 
 drains.

Well, that's exactly the issue - an escaping reference.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 25, 2013, at 11:04 PM, Walter Bright wrote:

 On 6/25/2013 6:31 PM, Michel Fortin wrote:
 And finally, there's nothing unsafe with autorelease pools as long as you 
don't keep an unretained reference to an autoreleased object when the pool drains.
 Well, that's exactly the issue - an escaping reference.
Read the next sentence after your quote though: "Making sure you have no unretained reference is ARC's job, so with ARC it should not be no problem." So with ARC, it's not unsafe. I think that was the ultimate point. -Steve
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 8:08 PM, Steven Schveighoffer wrote:
 On Jun 25, 2013, at 11:04 PM, Walter Bright wrote:

 On 6/25/2013 6:31 PM, Michel Fortin wrote:
 And finally, there's nothing unsafe with autorelease pools as long as you 
don't keep an unretained reference to an autoreleased object when the pool drains.
 Well, that's exactly the issue - an escaping reference.
Read the next sentence after your quote though: "Making sure you have no unretained reference is ARC's job, so with ARC it
should not be no problem."
 So with ARC, it's not unsafe.  I think that was the ultimate point.
Yes, I read the next sentence. What it means to me is that autorelease pools add nothing when ARC is implemented - no optimizations, either. Either that or I don't understand how ARC figures out that there are no escaping references without doing things like runtime checks.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 26, 2013, at 12:10 AM, Walter Bright wrote:

 On 6/25/2013 8:08 PM, Steven Schveighoffer wrote:
 On Jun 25, 2013, at 11:04 PM, Walter Bright wrote:

 On 6/25/2013 6:31 PM, Michel Fortin wrote:
 And finally, there's nothing unsafe with autorelease pools as long as you 
don't keep an unretained reference to an autoreleased object when the pool drains.
 Well, that's exactly the issue - an escaping reference.
Read the next sentence after your quote though: "Making sure you have no unretained reference is ARC's job, so with ARC it
should not be no problem."
 So with ARC, it's not unsafe.  I think that was the ultimate point.
Yes, I read the next sentence. What it means to me is that autorelease pools
 add nothing when ARC is implemented - no optimizations, either. Either that or 
 I don't understand how ARC figures out that there are no escaping references 
 without doing things like runtime checks.

OK, it sounded like you were saying still that Objective-C with ARC didn't have 
memory safety. I'm no longer arguing about autorelease, I defer to Michel on 
that. I clearly didn't understand that autorelease doesn't provide any benefits 
for ARC, it's just there for compatibility.

-Steve
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 9:13 PM, Steven Schveighoffer wrote:
 OK, it sounded like you were saying still that Objective C with ARC didn't 
 have memory safety.

My initial first pass read of:

http://clang.llvm.org/docs/AutomaticReferenceCounting.html

indicates that ARC does not guarantee memory safety. There are a number of 
"undefined" behaviors when the O-C programmer does not follow the rules.
Oct 09 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

Where are the benchmarks that show that this is a good idea in 
some real situations? This is the essential first step.


 If a class contains the following methods, in either itself or 
 a base class, it is
 an RC class:


     T AddRef();
     T Release();
What if a programmer adds only one of those two? Currently if you add only part 
of the hashing protocol (or you bork a function signature) the compiler often 
gives no errors.

What are the plans for coalescing and optimizing away some reference counts 
updates?

Bye,
bearophile
Oct 09 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
 What are the plans for coalescing and optimizing away some 
 reference counts updates?
There is now a discussion and paper on optimizing a reference counter:

http://lambda-the-ultimate.org/node/4825

Bye,
bearophile
Oct 11 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/13 10:28 AM, bearophile wrote:
 What are the plans for coalescing and optimizing away some reference
 counts updates?
 There is now a discussion and paper on optimizing a reference counter:
 http://lambda-the-ultimate.org/node/4825

 Bye,
 bearophile

Yes, I think that's great work.

Andrei
Oct 11 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 22:31:34 UTC, Walter Bright wrote:
 ==Arrays==

 Built-in arrays have no place to put a reference count. Ref 
 counted arrays would hence
 become a library type, based on a ref counted class with 
 overloaded operators for
 the array operations.
You'll soon see the roadblock here with templates and type qualifiers. 
const(A!T) and A!const(T) have nothing to do with each other as far as the 
compiler is concerned, and no, we have no way around this ATM.
Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Updated incorporating Steven's suggestion, and some comments about 
shared/const/mutable/purity.

-------------------------------------------------------------

     Adding Reference Counting to D

D currently supports manual memory management and generalized GC. Unfortunately, 
the pausing and memory consumption inherent in GC is not acceptable for many 
programs, and manual memory management is error-prone, tedious, and unsafe. A 
third style, reference counting (RC), addresses this. Common implementations are 
COM's AddRef/Release, Objective-C's ARC, and C++'s shared_ptr<>.

None of these three schemes is guaranteed memory-safe; they all require the 
programmer to conform to a protocol (even O-C's ARC). Stepping outside of the 
protocol results in memory corruption. A D implementation must make it possible 
to use ref counted objects in code marked as @safe, although it will be 
necessary for the implementation of those objects to be unsafe.

Some aspects of a D implementation are inevitable:

1. Avoid any requirement for more pointer types. This would cause drastic 
increases in complexity for both the compiler and the user. It may make generic 
code much more difficult to write.

2. Decay of a ref-counted pointer to a non-ref-counted pointer is unsafe, and 
can only be allowed (in @safe code) in circumstances where it can be statically 
proven to be safe.

3. A ref counted object is inherently a reference type, not a value type.

4. The compiler needs to know about ref counted types.


==Proposal==

If a class contains the following methods, in either itself or a base class, it 
is an RC class:


     T AddRef();
     T Release();

An RC class is like a regular D class with these additional semantics:

1. In @safe code, casting (implicit or explicit) to a base class that does not 
have both AddRef() and Release() is an error.

2. Initialization of a class reference causes a call to AddRef().

3. Assignment to a class reference causes a call to AddRef() on the new value
followed by a call to Release() on its original value.

4. Null checks are done before calling any AddRef() or Release().

5. Upon scope exit of all RC variables or temporaries, a call to Release() is 
performed, analogously to the destruction of struct variables and temporaries.

6. If a class or struct contains RC fields, calls to Release() for those fields 
will be added to the destructor, and a destructor will be created if one doesn't 
exist already.

7. If a closure is created that contains RC fields, either a compile time error 
will be generated or a destructor will be created for it.

8. Explicit calls to AddRef/Release will not be allowed in @safe code.

9. A call to AddRef() will be added to any argument passed as a parameter.

10. Function returns have an AddRef() already done to the return value.

11. The compiler can elide any AddRef()/Release() calls it can prove are
redundant.

12. AddRef() is not called when passed as the implicit 'this' reference.

13. Taking the address of, or passing by reference, any fields of an RC object 
is not allowed in @safe code. Passing by reference an RC field is, however, allowed.

14. RC objects will still be allocated on the GC heap - this means that a normal 
GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically scanned for heap references with no additional action required by 
the user.

15. The class implementor will be responsible for deciding whether or not to 
support sharing. Casting to shared is already disallowed in @safe code, so this 
is only viable in system code.

16. RC objects cannot be const or immutable.

17. Can RC objects be arguments to pure functions?
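
To make rules 2, 3, 5, 9 and 10 concrete, here is roughly what the inserted 
calls would amount to for a hypothetical RC class C and a function C f() (in 
practice many of these calls would be elided under rule 11):

     void example(C r)    // rule 9: the caller has already done AddRef() on the argument
     {
         C s = r;         // rule 2: initialization inserts s.AddRef() (null-checked, rule 4)
         C t = f();       // rule 10: the return value arrives with an AddRef() already done
         s = t;           // rule 3: AddRef() on the new value, then Release() on the old one
     }                    // rule 5: scope exit inserts Release() for r, s, t and temporaries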

==Existing Code==

D COM objects already have AddRef() and Release(). This proposal should not 
break that code; it'll just mean that there will be extra AddRef()/Release() 
calls made. Calling AddRef()/Release() should never have been allowed in @safe 
code anyway.

Any other existing uses of AddRef()/Release() will break.

==Arrays==

Built-in arrays have no place to put a reference count. Ref counted arrays would 
hence become a library type, based on a ref counted class with overloaded 
operators for the array operations.

==Results==

This is a very flexible approach, allowing for support of general RC objects, as 
well as specific support for COM objects and Objective-C ARC. 
AddRef()/Release()'s implementation is entirely up to the user or library writer.

@safe code can be guaranteed to be memory safe, as long as AddRef()/Release() 
are correctly implemented.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:



While it's a start, this is hardly enough for Objective-C. Mostly for legacy 
reasons, most Objective-C methods return autoreleased objects (deferred release 
using an autorelease pool) based on a naming convention. Also, Objective-C 
objects can't be allocated from the D heap, so to avoid cycles we need weak 
pointers. More on Objective-C later.

While it's good that a direct call to AddRef/Release is forbidden in @safe code, 
I think it should be forbidden in @system code too. The reason is that if the 
compiler is inserting calls to these automatically and you're also adding your 
own explicitly in the same function, it becomes practically impossible to reason 
about the reference counts, short of looking at the assembly. Instead, I think 
you should create a @noarc attribute for functions: it'll prevent the compiler 
from inserting any of those calls, so it becomes the responsibility of the 
author to make those calls (which are then allowed). @noarc would be 
incompatible with @safe, obviously.

Finally, that's a nitpick but I wish you'd use function names that fit D better, 
such as opRetain and opRelease. Then you can add a "final void opRetain() { 
AddRef(); }" function to the IUnknown COM interface and we could do the same for 
Objective-C.
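
A sketch of that wrapper, assuming opRetain/opRelease were the names the 
compiler looked for (this is not the real core.sys.windows declaration, just the 
shape of the idea):

     interface IUnknown
     {
         uint AddRef();                          // existing COM protocol
         uint Release();

         final void opRetain()  { AddRef(); }    // D-flavoured names forwarding to COM's
         final void opRelease() { Release(); }
     }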



Objective-C is a special case. In Objective-C we need to know whether the 
returned object of a function is already retained or if it is deferred released 
(autoreleased). This is easily deduced from the naming convention. Occasionally, 
we might need to create autorelease pools too, but that can probably stay 
@system.

(Note: all this idea of autoreleased objects might sound silly, but it was a 
great help before ARC, and Objective-C ARC has to be compatible with legacy
code 
so it conforms to those conventions.)

You can easily implement ARC for COM using an implementation of ARC for 
Objective-C; the reverse is not true, because COM does not have this (old but 
still needed) concept of autorelease pools and deferred release, where you need 
to know at each function boundary whether returned values (including those 
returned by pointer arguments) are expected to be retained or not.

If I were you Walter, I would just not care about Objective-C idioms while 
implementing this feature at first. It'll have to be special cased anyway. 
Here's how I expect that'll be done:

What will need to be done later when adding Objective-C support is to add an 
internal "autoreleasedReturn" flag to a function that'll make codegen call 
"autorelease" in the callee when returning an object and "retain" in the caller 
where it receives an object from a function with that flag. Also, the same flag 
and behaviour is needed for out parameters (to mimick those cases where an 
object is returned by pointer). That flag will then be set automatically 
internally depending on the function name (only for Objective-C member 
functions), and it should be possible to override it explicitly with an 
attribute or a pragma of some sort. This is what Clang is doing, and we must 
match that to allow things to work.

Checking for null is redundant in the Objective-C case: that check is done by 
the runtime. That's of minor importance, but it might impact performance and 
should probably be special-cased in this case.



With Apple's implementation of reference counting (using global hash tables 
protected by spin locks), it is more efficient to update many counters in one 
operation. The codegen for Objective-C ARC upon assignement to a variable calls 
"objc_storeStrong(id *object, id value)", incrementing and decrementing the two 
counters presumably in one operation (as well as replacing the content of the 
variable pointed by the first argument with the new value).

Ideally, the codegen for Objective-C ARC in D would call the same functions so 
we have the same performance. This means that codegen should make a call 
"objc_retain" when first initializing a variable, "objc_storeStrong" when doing 
an assignment, and "objc_release" when destructing a variable.

As for returning autoreleased objects, there are two functions to choose from 
depending on whether the object needs to be retained at the same time or not. 
(In general, the object needs to be retained prior to autoreleasing if it comes 
from a variable not part of the function's stack frame.)

Here's Clang's documentation for how it implements ARC:
http://clang.llvm.org/docs/AutomaticReferenceCounting.html



Weak pointers are essential in order to break retain cycles in Objective-C
where 
there is no GC. They are implemented with the same kind of function calls as 
strong pointers. Unfortunately, Apple's Objective-C implementation won't sit 
well with D the way it works now.

Weak pointers are implemented in Objective-C by registering the address of the 
pointer with the runtime. This means that when a pointer is moved from one 
location to another, the runtime needs to be notified of that through a call to 
objc_moveWeak. This breaks one assumption of D: that you can move memory at will 
without calling anything.

While we could still implement a working weak pointer with a template struct, 
that struct would have to allocate a pointer on the heap (where it is guaranteed 
to not move) so it can store the true weak pointer recognized by the runtime. 
I'm not sure that would be acceptable, but at least it would work.
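
A sketch of that workaround (the names and the runtime bindings are hypothetical; 
the actual Objective-C runtime calls would have to be bound separately):

     struct ObjcWeak(T)
         if (is(T == class))
     {
         private T* slot;             // heap-allocated, so its address never moves

         this(T obj)
         {
             slot = new T[1].ptr;     // one stable slot on the GC heap
             // hypothetical binding: objc_initWeak(cast(void**) slot, cast(void*) obj);
             *slot = obj;             // placeholder; the runtime would manage the slot
         }

         T get()
         {
             // hypothetical binding: return cast(T) objc_loadWeakRetained(cast(void**) slot);
             return *slot;
         }
     }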



I feel like I should share some of my thoughts here about a broader use of 
reference counting in D.

First, we don't have to assume the reference counter has to be part of the 
object. Apple implements reference counting using global hash tables where the 
key is the address. It works very well.

If we added a hash table like this for all memory allocated from the GC, we'd 
just have to find the base address of any memory block to get to its reference 
counter. I know you were designing with only classes in mind, but I want to 
point out that it is possible to reference-count everything the GC allocates if 
we want to.

The downside is that every assignment to a pointer anywhere has to call a 
function. While this is some overhead, it is more predictable than overhead from 
a GC scan and would be preferred in some situations (games I guess). Another 
downside is that if you have an object retained by being present on the stack 
frame of a C function, it'd have to be explicitly retained from elsewhere.

As for pointers not pointing to GC memory, the generic addRef/release functions 
can ignore those pointers just like the GC ignores them today when it does its
scan.

Finally, cycles can still be reclaimed by having the GC scan for them. Those 
scans should be less frequent however since most of the memory can be reclaimed 
through reference counting.
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/25/2013 6:09 PM, Michel Fortin wrote:


 While its a start, this is hardly enough for Objective-C. Mostly for legacy 
reasons, most Objective-C methods return autoreleased objects (deferred release using an autorelease pool) based on a naming convention. Also, Objective-C objects can't be allocated from the D heap, so to avoid cycles we need weak pointers. More on Objective-C later.
 While it's good that a direct call to AddRef/Release is forbidden in @safe 
 code, I think it should be forbidden in @system code too. The reason is that if 
 the compiler is inserting calls to these automatically and you're also adding 
 your own explicitly in the same function, it becomes practically impossible to 
 reason about the reference counts, short of looking at the assembly. Instead, I 
 think you should create a @noarc attribute for functions: it'll prevent the 
 compiler from inserting any of those calls, so it becomes the responsibility of 
 the author to make those calls (which are then allowed). @noarc would be 
 incompatible with @safe, obviously.

It's a good point, but adding such an attribute to the function may be too 
coarse, for one, and may cause composition problems, for another. Maybe just 
disallowing it altogether is the best solution.
 Finally, that's a nitpick but I wish you'd use function names that fit D 
 better, such as opRetain and opRelease. Then you can add a "final void 
 opRetain() { AddRef(); }" function to the IUnknown COM interface and we could 
 do the same for Objective-C.

Makes sense.


 Objective-C is a special case. In Objective-C we need to know whether the 
 returned object of a function is already retained or if it is deferred released 
 (autoreleased). This is easily deduced from the naming convention. 
 Occasionally, we might need to create autorelease pools too, but that can 
 probably stay @system.
 (Note: all this idea of autoreleased objects might sound silly, but it was a 
great help before ARC, and Objective-C ARC has to be compatible with legacy code so it conforms to those conventions.)
 You can easily implement ARC for COM using an implementation of ARC for 
Objective-C, the reverse is not true because COM does not have this (old but still needed) concept of autorelease pools and deferred release where you need to know at each function boundary whether returned values (including those returned by pointer arguments) whether the object is expected to be retained or not.
 If I were you Walter, I would just not care about Objective-C idioms while 
 implementing this feature at first. It'll have to be special cased anyway. 
 Here's how I expect that'll be done:

From reading over that clang document, O-C ARC is far more complex than I'd 
anticipated. I think it is way beyond what we'd want in regular D. It also comes 
with all kinds of pointer and function annotations - something I strongly want 
to avoid.
 What will need to be done later when adding Objective-C support is to add an 
 internal "autoreleasedReturn" flag to a function that'll make codegen call 
 "autorelease" in the callee when returning an object and "retain" in the caller 
 where it receives an object from a function with that flag. Also, the same flag 
 and behaviour is needed for out parameters (to mimick those cases where an 
 object is returned by pointer). That flag will then be set automatically 
 internally depending on the function name (only for Objective-C member 
 functions), and it should be possible to override it explicitly with an 
 attribute or a pragma of some sort. This is what Clang is doing, and we must 
 match that to allow things to work.

I agree that this complexity should only be in O-C code.
 Checking for null is redundant in the Objective-C case: that check is done by 
the runtime. That's of minor importance, but it might impact performance and should probably special-cased in this case.


 With Apple's implementation of reference counting (using global hash tables 
protected by spin locks), it is more efficient to update many counters in one operation. The codegen for Objective-C ARC upon assignement to a variable calls "objc_storeStrong(id *object, id value)", incrementing and decrementing the two counters presumably in one operation (as well as replacing the content of the variable pointed by the first argument with the new value).
 Ideally, the codegen for Objective-C ARC in D would call the same functions 
so we have the same performance. This means that codegen should make a call "objc_retain" when first initializing a variable, "objc_storeStrong" when doing an assignment, and "objc_release" when destructing a variable.
 As for returning autoreleased objects, there are two functions to choose from 
depending on whether the object needs to be retained at the same time nor not. (In general, the object needs to be retained prior autoreleasing if it comes from a variable not part of the function's stack frame.)
 Here's Clang's documentation for how it implements ARC:
 http://clang.llvm.org/docs/AutomaticReferenceCounting.html



 Weak pointers are essential in order to break retain cycles in Objective-C 
where there is no GC. They are implemented with the same kind of function calls as strong pointers. Unfortunately, Apple's Objective-C implementation won't sit well with D the way it works now.
 Weak pointers are implemented in Objective-C by registering the address of 
the pointer with the runtime. This means that when a pointer is moved from one location to another, the need to be notified of that through a call to objc_moveWeak. This breaks one assumption of D that you can move memory at will without calling anything.
 While we could still implement a working weak pointer with a template struct, 
 that struct would have to allocate a pointer on the heap (where it is 
 guaranteed to not move) so it can store the true weak pointer recognized by the 
 runtime. I'm not sure that would be acceptable, but at least it would work.


 I feel like I should share some of my thoughts here about a broader use of 
reference counting in D.
 First, we don't have to assume the reference counter has to be part of the 
object. Apple implements reference counting using global hash tables where the key is the address. It works very well.
 If we added a hash table like this for all memory allocated from the GC, we'd 
 just have to find the base address of any memory block to get to its reference 
 counter. I know you were designing with only classes in mind, but I want to 
 point out that it is possible to reference-count everything the GC allocates if 
 we want to.

D would need manual, RC and GC to coexist peacefully.
 The downside is that every assignment to a pointer anywhere has to call a 
 function. While this is some overhead, it is more predictable than overhead 
 from a GC scan and would be preferred in some situations (games I guess). 
 Another downside is that if you have an object retained by being present on the 
 stack frame of a C function, it'd have to be explicitly retained from elsewhere.

Doesn't this make it impractical to mix vanilla C with D code? An important 
feature of D is this capability, without worrying about a "JNI" style interface.

As for D switching to a full refcounted GC for everything, I'm very hesitant for 
such a step. For one thing, reading the clang spec on all the various pointer 
and function annotations necessary is very off-putting.
 As for pointers not pointing to GC memory, the generic addRef/release 
functions can ignore those pointers just like the GC ignores them today when it does its scan.
 Finally, cycles can still be reclaimed by having the GC scan for them. Those 
scans should be less frequent however since most of the memory can be reclaimed through reference counting.

Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 25.06.2013 23:00, Walter Bright wrote:
 Updated incorporating Steven's suggestion, and some comments about
 shared/const/mutable/purity.

 -------------------------------------------------------------

      Adding Reference Counting to D
Cool. I didn't expect this to be tackled so soon.

[...]
 3. A ref counted object is inherently a reference type, not a value type.
As Michel also said, the reference count does not have to be inside the object 
itself, so we might want to allow reference counting on other types as well.
 4. The compiler needs to know about ref counted types.
I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.
 ==Proposal==

 If a class contains the following methods, in either itself or a base
 class, it is
 an RC class:


      T AddRef();
      T Release();
Is T typeof(this) here?

I don't think we should force linking this functionality with COM; the 
programmer can do this with a simple wrapper.
 An RC class is like a regular D class with these additional semantics:

 1. In  safe code, casting (implicit or explicit) to a base class that
 does not
 have both AddRef() and Release() is an error.

 2. Initialization of a class reference causes a call to AddRef().

 3. Assignment to a class reference causes a call to AddRef() on the new
 value
 followed by a call to Release() on its original value.
It might be common knowledge, but I want to point out that the usual COM 
implementation (atomic increment/decrement and free when the refcount goes down 
to 0) is not thread-safe for shared pointers. That means you either have to 
guard all reads and writes with a lock to make the full assignment atomic, or 
have to implement reference counting very differently (e.g. deferred reference 
counting).
 4. Null checks are done before calling any AddRef() or Release().

 5. Upon scope exit of all RC variables or temporaries, a call to
 Release() is performed,
 analogously to the destruction of struct variables and temporaries.

 6. If a class or struct contains RC fields, calls to Release() for those
 fields will
 be added to the destructor, and a destructor will be created if one
 doesn't exist already.

 7. If a closure is created that contains RC fields, either a compile
 time error will be
 generated or a destructor will be created for it.

 8. Explicit calls to AddRef/Release will not be allowed in  safe code.

 9. A call to AddRef() will be added to any argument passed as a parameter.

 10. Function returns have an AddRef() already done to the return value.

 11. The compiler can elide any AddRef()/Release() calls it can prove are
 redundant.

 12. AddRef() is not called when passed as the implicit 'this' reference.
Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?
 13. Taking the address of, or passing by reference, any fields of an RC
 object
 is not allowed in  safe code. Passing by reference an RC field is allowed.
Please note that this includes slices to fixed size arrays.
 14. RC objects will still be allocated on the GC heap - this means that
 a normal
 GC run will reap RC objects that are in a cycle, and RC objects will get
 automatically
 scanned for heap references with no additional action required by the user.

 15. The class implementor will be responsible for deciding whether or
 not to support
 sharing. Casting to shared is already disallowed in  safe code, so this
 is only
 viable in system code.

 16. RC objects cannot be const or immutable.
This is a bit of a downer. If the reference count is not within the object, this can be implemented.
 17. Can RC objects be arguments to pure functions?

 ==Existing Code==

 D COM objects already have AddRef() and Release(). This proposal should
 not break
 that code, it'll just mean that there will be extra AddRef()/Release
 calls made.
 Calling AddRef()/Release() should never have been allowed in  safe code
 anyway.

 Any other existing uses of AddRef()/Release() will break.

 ==Arrays==

 Built-in arrays have no place to put a reference count. Ref counted
 arrays would hence
 become a library type, based on a ref counted class with overloaded
 operators for
 the array operations.

 ==Results==

 This is a very flexible approach, allowing for support of general RC
 objects, as well
 as specific support for COM objects and Objective-C ARC.
 AddRef()/Release()'s implementation
 is entirely up to the user or library writer.

  safe code can be guaranteed to be memory safe, as long as
 AddRef()/Release() are correctly
 implemented.
I feel I'm hijacking this proposal, but the step to library defined read/write 
barriers seems pretty small. Make AddRef, Release and assignment free template 
functions, e.g.

void ptrConstruct(T, bool stackOrHeap)(T* adr, T p);
void ptrAssign(T, bool stackOrHeap)(T* adr, T p);
void ptrRelease(T, bool stackOrHeap)(T* adr);

and we are able to experiment with all kinds of sophisticated GC algorithms 
including RC. Eliding redundant addref/release pairs would need some extra 
support though, I read that LLVM does something like this, but I don't know how.
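
As a sketch of how an RC experiment might fill in those hooks (rcIncrement and 
rcDecrement are hypothetical helpers that locate and adjust a count for the GC 
block a pointer refers to):

     void ptrConstruct(T, bool stackOrHeap)(T* adr, T p)
     {
         rcIncrement(cast(void*) p);      // retain the newly stored value
         *adr = p;
     }

     void ptrAssign(T, bool stackOrHeap)(T* adr, T p)
     {
         rcIncrement(cast(void*) p);      // retain the new value first...
         rcDecrement(cast(void*) *adr);   // ...then release the old one
         *adr = p;
     }

     void ptrRelease(T, bool stackOrHeap)(T* adr)
     {
         rcDecrement(cast(void*) *adr);   // e.g. on scope exit
     }
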
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 As Michel also said, the reference count does not have to be inside the 
 object itself, so we might want to allow reference counting on other types as 
 well.

That opens the question of what is the point of other RC types? For example, C++ 
can throw any type - but it turns out that throwing anything but class types is 
largely pointless. My proposal does not specify where the count actually is - 
the two functions can be arbitrarily implemented.
 4. The compiler needs to know about ref counted types.
I imagine a few (constrained) templated functions for the different operations 
defined in the library could also do the job, though it might drown compilation 
speed. Also getting help from the optimizer to remove redundant calls will need 
some back doors.

I don't see how this can be done without specific compiler knowledge in a memory 
safe way.
      T AddRef();
      T Release();
Is T typeof(this) here?
T is not relevant to the proposal - it's up to the specific implementation of those functions.
 I don't think we should force linking this functionality with COM, the 
 programmer can do this with a simple wrapper.

Yeah, that's Michel's suggestion, and it's a good one.
 An RC class is like a regular D class with these additional semantics:

 1. In  safe code, casting (implicit or explicit) to a base class that
 does not
 have both AddRef() and Release() is an error.

 2. Initialization of a class reference causes a call to AddRef().

 3. Assignment to a class reference causes a call to AddRef() on the new
 value
 followed by a call to Release() on its original value.
It might be common knowledge, but I want to point out that the usual COM
implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).

Since the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.
 12. AddRef() is not called when passed as the implicit 'this' reference.
Isn't this unsafe if a member function is called through the last existing
reference and this reference is then cleared during execution of this member function or from another thread? No. The caller of the function still retains a reference in that thread.
 13. Taking the address of, or passing by reference, any fields of an RC
 object
 is not allowed in  safe code. Passing by reference an RC field is allowed.
Please note that this includes slices to fixed size arrays.
As I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.
 16. RC objects cannot be const or immutable.
This is a bit of a downer. If the reference count is not within the object,
this can be implemented. Also, an exception could be made for the AddRef()/Release() functions.
 I feel I'm hijacking this proposal, but the step to library defined 
read/write barriers seems pretty small. Make AddRef, Release and assignment free template functions, e.g.
 void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
 void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
 void ptrRelease(T,bool stackOrHeap)(T*adr);

 and we are able to experiment with all kinds of sophisticated GC algorithms 
including RC. Eliding redundant addref/release pairs would need some extra support though, I read that LLVM does something like this, but I don't know how.

It's pretty invasive into the code generation and performance, and could 
completely disrupt the C compatibility of D.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Jacob Carlborg wrote:

On Jun 25, 2013, at 11:00 PM, Walter Bright wrote:

 Updated incorporating Steven's suggestion, and some comments about
 shared/const/mutable/purity.
It's great that you're bringing this up. I think it's a good idea to add reference counting to D. Although I don't think I can add much to the discussion, and Michel has a lot better knowledge about Objective-C than I have. All I can say is that I really want it to be compatible with Objective-C.
Oct 09 2013
prev sibling parent zoujiaqing <zoujiaqing gmail.com> writes:
On Thursday, 10 October 2013 at 00:02:09 UTC, Walter Bright wrote:
 Updated incorporating Steven's suggestion, and some comments 
 about shared/const/mutable/purity.

 -------------------------------------------------------------

     Adding Reference Counting to D
Automatic Reference Counting (ARC) as an alternative to D's Garbage Collector? What is its state? I think ARC is a very good direction. Thanks. Ref link: https://wiki.dlang.org/Language_design_discussions#Automatic_Reference_Counting_.28ARC.29_as_an_alternative_to_D.27s_Garbage_Collector
Sep 18 2019
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

Le 26-juin-2013 à 5:38, Walter Bright  a écrit :

 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 As Michel also said, the reference count does not have to be inside the
object itself, so we might want to allow reference counting on other types as well.
 That opens the question of what is the point of other RC types? For example, 
C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.

RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the latter case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.
 I don't think we should force linking this functionality with COM, the 
programmer can do this with a simple wrapper.
 Yeah, that's Michel's suggestion, and it's a good one.
It could also be done with user attributes:

    void x_init(ref X x);
    void x_destroy(ref X x);
    void x_assign(ref X a, X b); // retains a, releases b

    arc!(x_init, x_destroy, x_assign) class X {}

This way you don't have to check for null in the generated code: the standalone function is in charge of that. Also, this "assign" function here provides good opportunity for optimization in the RC implementation because you can update the two reference counts in a single operation (in the case where they're stored at the same place and are protected by the same lock). It's an improvement over making two separate function calls.
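
A rough sketch of that combined update, assuming the counts live in one side table guarded by a single lock (all names here are hypothetical, and thread start-up and error handling are elided):

    import core.sync.mutex : Mutex;

    class X { }   // stand-in for an arc-managed class

    // Hypothetical side table: object address -> reference count.
    // One lock protects all entries, so the retain and the release
    // can share a single critical section.
    __gshared size_t[void*] rcCounts;
    __gshared Mutex rcLock;

    shared static this() { rcLock = new Mutex; }

    void x_init(ref X x)
    {
        rcLock.lock();
        scope (exit) rcLock.unlock();
        rcCounts[cast(void*) x] = 1;
    }

    void x_assign(ref X a, X b)   // retains a, releases b
    {
        rcLock.lock();
        scope (exit) rcLock.unlock();

        if (a !is null)
            ++rcCounts[cast(void*) a];           // entry was created by x_init
        if (b !is null && --rcCounts[cast(void*) b] == 0)
        {
            rcCounts.remove(cast(void*) b);
            destroy(b);                          // last reference: run the destructor
        }
    }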
 No. The caller of the function still retains a reference in that thread.
This should also apply to all function arguments. The caller is better placed than the callee to elide redundant AddRef/Release pairs, so it should be the one in charge of retaining the object for the callee.
 16. RC objects cannot be const or immutable.
This is a bit of a downer. If the reference count is not within the object,
this can be implemented.
 Also, an exception could be made for the AddRef()/Release() functions.
I'm not too fond of that idea.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:


Le 26-juin-2013 à 1:36, Walter Bright  a écrit :

 On 6/25/2013 6:09 PM, Michel Fortin wrote:
 Instead, I think you should create a  noarc attribute for functions: it'll 
prevent the compiler from inserting any of those calls so it becomes the responsibility of the author to make those calls (which are then allowed). noarc would be incompatible with safe, obviously.
 It's a good point, but adding such an attribute to the function may be too 
coarse, for one, and may cause composition problems, for another. Maybe just disallowing it altogether is the best solution. Well, it's mostly required to write runtime support functions. The attribute could be more obscure so people are less tempted to use it, but if you're going to implement the ref-counting code you'll need that.


 I feel like I should share some of my thoughts here about a broader use of 
reference counting in D.
 First, we don't have to assume the reference counter has to be part of the 
object. Apple implements reference counting using global hash tables where the key is the address. It works very well.
 If we added a hash table like this for all memory allocated from the GC, 
we'd just have to find the base address of any memory block to get to its reference counter. I know you were designing with only classes in mind, but I want to point out that it is possible to reference-count everything the GC allocates if we want to.
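
A rough sketch of that idea, assuming the count sits in a global table keyed by the base address of the owning GC block (GC.addrOf returns that base address; the table itself and the free-on-zero policy are made up, and nothing here is synchronized):

    import core.memory : GC;

    __gshared size_t[void*] blockRefs;   // hypothetical: block base address -> count

    void retain(void* p)
    {
        if (auto base = GC.addrOf(p))            // base of the block containing p
            ++blockRefs.require(base, 0);        // insert at 0 if missing, then bump
    }

    void release(void* p)
    {
        auto base = GC.addrOf(p);
        if (base is null) return;
        if (auto count = base in blockRefs)
        {
            if (--(*count) == 0)
            {
                blockRefs.remove(base);
                GC.free(base);                   // or leave it to a normal collection
            }
        }
    }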
 D would need manual, RC and GC to coexist peacefully.
The problem is how to make the three of those use the same codegen?

- Druntime could have a flag to disable/enable refcounting. It'd make the retain/release functions no-ops, but it'd not prevent the GC from reclaiming memory as it does today.

- Druntime could have a flag to disable/enable garbage collection (it already has). That'd prevent cycles from being collected, but you could use weak pointers to work around that or request a collection manually at the appropriate time.

- A noarc (or similar) attribute at the function level could be used to prevent the compiler from generating function calls on pointer assignments. You could make a whole module noarc if you want by adding " noarc:" at the top.

Here's the annoying thing: noarc is totally safe if reference counting is disabled and we rely entirely on the GC. noarc is unsafe when reference counting is enabled.
 The downside is that every assignment to a pointer anywhere has to call a 
function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situations (games, I guess). Another downside is that if an object is retained only by being present on the stack frame of a C function, it'd have to be explicitly retained from elsewhere.
 Doesn't this make it impractical to mix vanilla C with D code? An important 
feature of D is this capability, without worrying about a "JNI" style interface.

It's not very different than with the GC today. If you call a C function by giving it a ref-counted pointer argument, that memory block is guaranteed to live at least for that call's lifetime (because it is retained by the caller). So simple calls to C functions are not a problem. If the C function puts that pointer elsewhere you'll need to retain it some other way, but you have to do this with the GC too. If you're implementing a callback called from C you need to care about what you return because the caller's C code won't retain it, while with the GC you could manage if C code did not store that pointer outside of the stack. I think that's all you have to worry about.
 As for D switching to a full refcounted GC for everything, I'm very hesitant 
for such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.

Don't let Clang intimidate you. The Clang spec is about four to five times more complicated than needed because of autoreleased objects and because it supports weak pointers. Weak pointers can be implemented as struct templates (as long as we have noarc). And all those annotations are for special cases, when you need to break the rules. You don't use them when doing normal programming, well except for __weak.
Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 26.06.2013 11:38, Walter Bright wrote:
 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 I imagine a few (constrained) templated functions for the different
 operations defined in the library could also do the job, though it
 might drown compilation speed. Also getting help from the optimizer to
 remove redundant calls will need some back doors.
I don't see how this can be done without specific compiler knowledge in a memory safe way.
I currently don't see how it can be memory safe with this proposal.
 3. Assignment to a class reference causes a call to AddRef() on the new
 value
 followed by a call to Release() on its original value.
It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).
Since the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.
You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.
 12. AddRef() is not called when passed as the implicit 'this' reference.
Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?
No. The caller of the function still retains a reference in that thread.
Hmmm, I guess I misunderstand the proposal. Assume for example a refcounted class R and this code:

    class R : RefCounted
    {
        int _x;
        int readx() { return _x; }
    }

    int main()
    {
        R r = new R;
        return r.readx();
    }

According to 12. there is no refcounting going on when calling or executing readx. Ok, now what happens here:

    class R : RefCounted
    {
        int _x;
        int readx(C c)
        {
            c.r = null;   // "standard" rc deletes r here
            return _x;    // reads garbage
        }
    }

    class C
    {
        R r;
    }

    int main()
    {
        C c = new C;
        c.r = new R;
        return c.r.readx(c);
    }

This reads garbage or crashes if there is no reference counting going on when calling readx.
 13. Taking the address of, or passing by reference, any fields of an RC
 object
 is not allowed in  safe code. Passing by reference an RC field is
 allowed.
Please note that this includes slices to fixed size arrays.
As I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.
Just to clarify, I meant taking a slice of a static array that is a field of a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden?
 I feel I'm hijacking this proposal, but the step to library defined
 read/write barriers seems pretty small. Make AddRef, Release and
 assignment free template functions, e.g.

 void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
 void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
 void ptrRelease(T,bool stackOrHeap)(T*adr);

 and we are able to experiment with all kinds of sophisticated GC
 algorithms including RC. Eliding redundant addref/release pairs would
 need some extra support though, I read that LLVM does something like
 this, but I don't know how.
It's pretty invasive into the code generation and performance, and could completely disrupt the C compatibility of D.
I don't see a big difference between a free function and a member function call, though the template character of it might hurt compilation performance.

Two more notes:

- I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. Pre- and post-blit actions have to be taken if the struct contains pointers to refcounted objects.
 10. Function returns have an AddRef() already done to the return value.
- A refcounted reference returned from a function (including new) would have to be Released if the return value is ignored or only used as part of an expression.
Oct 09 2013
parent reply Robert Schadek <realburner gmx.de> writes:
On 10/10/2013 03:45 AM, Walter Bright wrote:
 Rainer Schuetze wrote:

 You have to put the lock around the pair of AddRef and Release, but if
 the compiler already splits this into two function calls, this cannot
 be done in the implementation.
I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
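
For example, the counter side of that could be a pair of atomic read-modify-write operations. This is a sketch of the AddRef/Release bodies only; as the follow-ups below point out, it says nothing about the safety of assigning to a shared reference:

    import core.atomic : atomicOp;

    class RCObject
    {
        private shared int refs = 1;

        typeof(this) AddRef()
        {
            atomicOp!"+="(refs, 1);              // lock-free increment
            return this;
        }

        typeof(this) Release()
        {
            if (atomicOp!"-="(refs, 1) == 0)     // atomicOp returns the new count
            {
                destroy(this);                   // last reference: finalize
                return null;
            }
            return this;
        }
    }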
Oct 10 2013
parent reply "inout" <inout gmail.com> writes:
On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek
wrote:
 On 10/10/2013 03:45 AM, Walter Bright wrote:
 Rainer Schuetze wrote:

 You have to put the lock around the pair of AddRef and 
 Release, but if
 the compiler already splits this into two function calls, this 
 cannot
 be done in the implementation.
I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
On shared objects, yes. Local objects need no atomics at all.
Oct 11 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 12.10.2013 04:16, inout wrote:
 On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek
 wrote:
 On 10/10/2013 03:45 AM, Walter Bright wrote:
 Rainer Schuetze wrote:

 You have to put the lock around the pair of AddRef and Release, but if
 the compiler already splits this into two function calls, this cannot
 be done in the implementation.
I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
On shared objects, yes. Local objects need no atomics at all.
Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion:

Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.

    shared(C) R = O;       ; refcnt of O is 1

in pseudo-assembly missing null-checks:

    Thread1 (R = P)        Thread2 (S = R)

                           mov ecx,[R]
                           ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt       ; refcnt of O now 0
    jnz done
    call delete_ebx
                           ; thread resumed
                           inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
Oct 11 2013
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-12 06:16:17 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 12.10.2013 04:16, inout wrote:
 On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek
 wrote:
 I would imagine the counter to be manipulated with atomic_add_and_fetch
 operations, so no locks are required.
On shared objects, yes. Local objects need no atomics at all.
Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion:

Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.

    shared(C) R = O;       ; refcnt of O is 1

in pseudo-assembly missing null-checks:

    Thread1 (R = P)        Thread2 (S = R)

                           mov ecx,[R]
                           ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt       ; refcnt of O now 0
    jnz done
    call delete_ebx
                           ; thread resumed
                           inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
I think you are mixing shared references with shared objects. Atomic increment/decrement will work fine for shared objects with unshared pointers to it. But if the pointer to the object is itself shared and available from two threads (as in your example) then you need to properly serialize reads and writes to that pointer (with a mutex or through other means). Of course, then you fall into the problem that in D you are not able to set different attributes to an object and to its reference. (Hint: const(Object)ref) -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 12 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 12.10.2013 14:16, Michel Fortin wrote:
 On 2013-10-12 06:16:17 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 12.10.2013 04:16, inout wrote:
 On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek
 wrote:
 I would imagine the counter to be manipulated with atomic_add_and_fetch
 operations, so no locks are required.
On shared objects, yes. Local objects need no atomics at all.
Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion:

Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.

    shared(C) R = O;       ; refcnt of O is 1

in pseudo-assembly missing null-checks:

    Thread1 (R = P)        Thread2 (S = R)

                           mov ecx,[R]
                           ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt       ; refcnt of O now 0
    jnz done
    call delete_ebx
                           ; thread resumed
                           inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
I think you are mixing shared references with shared objects. Atomic increment/decrement will work fine for shared objects with unshared pointers to it. But if the pointer to the object is itself shared and available from two threads (as in your example) then you need to properly serialize reads and writes to that pointer (with a mutex or through other means).
I agree, that's why I used the term "shared reference", too ;-)

If you are using only shared objects, but not shared references, you'll have to use message passing, which comes with its own set of synchronization operations that are not easily made lock-free.
 Of course, then you fall into the problem that in D you are not able to
 set different attributes to an object and to its reference. (Hint:
 const(Object)ref)
I think being able to distinguish these types is missing from the type system. It's even worse for the TypeInfo class describing the object, as it is used for both the reference and the object.
Oct 12 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-13 06:54:04 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 I agree, that's why I used the term "shared reference", too ;-)
 
 If you are using only shared objects, but not shared references, you'll 
 have to use message passing coming with its own set of synchronization 
 operations that are not easily made lock-free.
For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value:

1. read pointer value
2. if read value is -1 go to line 1 (spin)
3. compare and swap (previously read value <-> -1)
4. if failure go to line 1 (spin)
// now pointer is "locked", its value is -1 but we have a copy of the original
5. copy pointer locally or assign to it (and update counter)
6. write back pointer value atomically to replace the -1

No mutex, but there's a spin lock so it's not good if there's contention.

That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
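
A sketch of those steps in D, using core.atomic.cas and storing the pointer as raw bits so the all-ones pattern can play the role of -1. The counter updates are elided, and note that keeping the only pointer to GC memory in an integer would hide it from the GC, so a real version has to keep the target reachable some other way:

    import core.atomic : atomicLoad, atomicStore, cas, MemoryOrder;

    struct SpinSharedPtr
    {
        private shared size_t bits;               // the pointer value as an integer
        private enum size_t LOCKED = size_t.max;  // the "-1" sentinel

        // Steps 1-4: spin until we swap the current value for the sentinel.
        private size_t acquire()
        {
            for (;;)
            {
                size_t cur = atomicLoad!(MemoryOrder.acq)(bits);
                if (cur == LOCKED)
                    continue;                     // someone else holds it, spin
                if (cas(&bits, cur, LOCKED))
                    return cur;                   // slot is now ours
            }
        }

        // Step 5 (read side): copy the pointer, bump its counter, then step 6.
        void* load()
        {
            size_t cur = acquire();
            // ... increment the reference count of the pointed-to object here ...
            atomicStore!(MemoryOrder.rel)(bits, cur);   // write back, unlocking
            return cast(void*) cur;
        }

        // Step 5 (write side): install a new pointer, fix both counters, then step 6.
        void store(void* newPtr)
        {
            size_t old = acquire();
            // ... retain newPtr, release the object behind old here ...
            atomicStore!(MemoryOrder.rel)(bits, cast(size_t) newPtr);
        }
    }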
Oct 13 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 13.10.2013 13:48, Michel Fortin wrote:
 On 2013-10-13 06:54:04 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 I agree, that's why I used the term "shared reference", too ;-)

 If you are using only shared objects, but not shared references,
 you'll have to use message passing coming with its own set of
 synchronization operations that are not easily made lock-free.
For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value:

1. read pointer value
2. if read value is -1 go to line 1 (spin)
3. compare and swap (previously read value <-> -1)
4. if failure go to line 1 (spin)
// now pointer is "locked", its value is -1 but we have a copy of the original
5. copy pointer locally or assign to it (and update counter)
6. write back pointer value atomically to replace the -1

No mutex, but there's a spin lock so it's not good if there's contention.

That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing.
Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario.

At work, I use shared pointers all the time to pass information to a real-time audio thread. The scheme uses triple-buffering of pointers for a lock-free, safe transport from/to the real-time thread.

Not having to worry about this low-level locking stuff is one of the good aspects of garbage collecting.
Oct 14 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-14 17:57:18 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 13.10.2013 13:48, Michel Fortin wrote:
 
 For one of my projects I implemented a shared pointer like this. It uses
 the pointer value itself as a spin lock with the assumption that -1 is
 an invalid pointer value:
 
 1. read pointer value
 2. if read value is -1 go to line 1 (spin)
 3. compare and swap (previously read value <-> -1)
 4. if failure go to line 1 (spin)
 // now pointer is "locked", its value is -1 but we have a copy of the
 original
 5. copy pointer locally or assign to it (and update counter)
 6. write back pointer value atomically to replace the -1
 
 No mutex, but there's a spin lock so it's not good if there's contention.
 
 That said, I find it extremely rare to want a shared pointer that isn't
 already protected by a mutex alongside other variables, or that isn't
 propagated using some form of message passing.
Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario.
Spinning is good only when you very rarely expect contention, which is the case for me. The above code is used once per object the first time someone requests a weak pointer for it. Having contention for that just doesn't make sense in most usage patterns. But still, being curious I added a log message any time it actually has to spin, and I have yet to see that message once in my logs (which probably means aggressive enough unit tests are missing).

If you have a lot of read accesses and rarely write to the pointer you could instead try a read-write mutex with concurrent read access. In any case, there's no solution that will be ideal in all cases. Different situations ask for different trade-offs.
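
druntime's core.sync.rwmutex provides that kind of lock. A small sketch of guarding a rarely-replaced reference with it (shared/ownership qualifiers glossed over, class and field names made up):

    import core.sync.rwmutex : ReadWriteMutex;

    class Config { /* ... */ }

    // A slot holding a reference that is read often and replaced rarely.
    class ConfigSlot
    {
        private ReadWriteMutex guard;
        private Config current;

        this() { guard = new ReadWriteMutex; }

        Config get()
        {
            synchronized (guard.reader)   // any number of readers at once
                return current;
        }

        void set(Config c)
        {
            synchronized (guard.writer)   // writers get exclusive access
                current = c;              // a retain/release pair would go here under ARC
        }
    }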
 At work, I use shared pointers all the time to pass information to a 
 real time audio thread. The scheme uses triple-buffering of pointers 
 for a lock free safe transport from/to the real time thread.
 
 Not having to worry about these low-level locking stuff is one of the 
 good aspects about garbage collecting.
Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 14 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 14 October 2013 at 19:42:36 UTC, Michel Fortin wrote:
 Indeed. The current garbage collector makes it easy to have 
 shared pointers to shared objects. But the GC can also 
 interrupt real-time threads for an unpredictable duration, how 
 do you cope with that in a real-time thread?

 I know ARC isn't the ideal solution for all use cases. But 
 neither is the GC, especially for real-time applications. So, 
 which one would you recommend for a project having a real-time 
 audio thread?
If you don't want any pause, a concurrent GC is the way to go. This type of GC comes at the cost of increased memory usage (everything is a tradeoff) but it exists.
Oct 14 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-14 20:45:35 +0000, "deadalnix" <deadalnix gmail.com> said:

 On Monday, 14 October 2013 at 19:42:36 UTC, Michel Fortin wrote:
 Indeed. The current garbage collector makes it easy to have shared 
 pointers to shared objects. But the GC can also interrupt real-time 
 threads for an unpredictable duration, how do you cope with that in a 
 real-time thread?
 
 I know ARC isn't the ideal solution for all use cases. But neither is 
 the GC, especially for real-time applications. So, which one would you 
 recommend for a project having a real-time audio thread?
 If you don't want any pause, a concurrent GC is the way to go. This type of GC comes at the cost of increased memory usage (everything is a tradeoff) but it exists.
I'm not an expert in GCs, but as far as I know a concurrent GC also requires some bookkeeping to be done when assigning to pointers, similar to ARC, and also when moving pointers, unlike ARC. So it requires hooks in the codegen that will perform atomic operations, just like ARC. Unless of course you use "fork" and scan inside the child process… but that has its own set of problems on some platforms.

The only consensus we'll reach is that different projects have different needs. In theory being able to swap the GC for something else could bring everyone together. But to be able to replace the GC for another with a strategy different enough to matter (concurrent GC or ARC) you need the codegen to be different. So we can either:

1. make the codegen configurable -- which brings its own set of compatibility problems for compiled code but is good for experimentation, or
2. settle on something middle-ground performance-wise that can accommodate a couple of strategies, or
3. choose one GC strategy that ought to satisfy everybody and adapt the codegen to fit, or
4. keep everything as is.

We're stuck with 4 until something else is decided. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 14 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 14 October 2013 at 21:25:29 UTC, Michel Fortin wrote:
 I'm not an expert in GCs, but as far as I know a concurrent GC 
 also requires some bookkeeping to be done when assigning to 
 pointers, similar to ARC, and also when moving pointers, unlike 
 ARC. So it requires hooks in the codegen that will perform 
 atomic operations, just like ARC.
Usual strategies include:

- When you JIT, change the function itself to write pointers through a function that marks the old value as live.

- When AOT, always go through that function, which makes a test and marks the old value as live if this is done during a collection. This basically adds a read of a global and a test for each pointer write.

- Use the page protection mechanism and do regular writes. This can be done via fork, but also via remapping the GCed memory as COW. The tax is then more expensive, but you only pay it once per page you actually write and only when actually collecting.

The good news is that this tax is only required for objects that contain shared mutable pointers. In D, most data is thread local or immutable. D's type system is really friendly to concurrent GC, and we definitely should go in that direction.
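
As an illustration of the second (ahead-of-time) strategy, every pointer store could be routed through something like the following; the flag and the mark hook are hypothetical stand-ins for the collector's real interface:

    __gshared bool collectionInProgress;   // set/cleared by the collector

    // Hypothetical hook: put p on the collector's mark queue so it stays live.
    void gcMarkAlive(void* p) { /* ... */ }

    // The barrier the compiler would emit for every pointer write:
    // one read of a global and one test outside of collections.
    void writePointer(void** slot, void* newValue)
    {
        if (collectionInProgress)
        {
            void* old = *slot;
            if (old !is null)
                gcMarkAlive(old);          // keep the overwritten target scannable
        }
        *slot = newValue;
    }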
 The only consensus we'll reach is that different projects have 
 different needs. In theory being able to swap the GC for 
 something else could bring everyone together. But to be able to 
 replace the GC for another with a strategy different enough to 
 matter (concurrent GC or ARC) you need the codegen to be 
 different. So we can either:
ARC-like systems need a different codegen, but you can do this with regular codegen if you use page protection to detect writes.
 1. make the codegen configurable -- which brings its own set of 
 compatibility problems for compiled code but is good for 
 experimentation, or
Bad, we will end up having different incompatible binaries.
Oct 14 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-14 23:45:42 +0000, "deadalnix" <deadalnix gmail.com> said:

 On Monday, 14 October 2013 at 21:25:29 UTC, Michel Fortin wrote:
 I'm not an expert in GCs, but as far as I know a concurrent GC also 
 requires some bookkeeping to be done when assigning to pointers, 
 similar to ARC, and also when moving pointers, unlike ARC. So it 
 requires hooks in the codegen that will perform atomic operations, just 
 like ARC.
Usual strategies include:

- When you JIT, change the function itself to write pointers through a function that marks the old value as live.

- When AOT, always go through that function, which makes a test and marks the old value as live if this is done during a collection. This basically adds a read of a global and a test for each pointer write.

- Use the page protection mechanism and do regular writes. This can be done via fork, but also via remapping the GCed memory as COW. The tax is then more expensive, but you only pay it once per page you actually write and only when actually collecting.

The good news is that this tax is only required for objects that contain shared mutable pointers. In D, most data is thread local or immutable. D's type system is really friendly to concurrent GC, and we definitely should go in that direction.
 The only consensus we'll reach is that different projects have 
 different needs. In theory being able to swap the GC for something else 
 could bring everyone together. But to be able to replace the GC for 
 another with a strategy different enough to matter (concurrent GC or 
 ARC) you need the codegen to be different. So we can either:
 ARC-like systems need a different codegen, but you can do this with regular codegen if you use page protection to detect writes.
Very insightful. Thank you.
 1. make the codegen configurable -- which brings its own set of 
 compatibility problems for compiled code but is good for 
 experimentation, or
Bad, we will end up having different incompatible binaries.
So as I understand it, your plan would then be:

- use a concurrent GC with the page protection mechanism and COW to catch writes during collection
- add thread-local awareness to the GC so only shared and mutable memory needs COW

Makes sense. But adding thread-local awareness requires a couple of language changes. It won't work as long as people keep casting things around, so you need to fix a lot of cases where casts are needed. But otherwise, it seems like a good plan.

I'm a little wary of the cost of doing COW during collection. Obviously the GC isn't pausing the program per se, but it'll be slowing it down. By how much probably depends on what you're doing. At the very least the GC should allocate types with no pointers on separate pages from those with pointers.

Also, what are the calls required to implement page protection and COW on POSIX? I'd like to check whether those are allowed within the OS X and iOS sandbox. For instance fork() isn't allowed for sandboxed apps. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 14 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 15 October 2013 at 00:38:20 UTC, Michel Fortin wrote:
 So as I understand it, your plan would then be:

 - use concurrent GC using the page protection mechanism and COW 
 to catch writes during collection
 - add thread-local awareness to the GC so only shared and 
 mutable memory needs COW
Yes, I think this is what makes the most sense for D. If the GC allows explicit free, reference counting and other manual memory management techniques can be built on top of it.
 Makes sense. But adding thread-local awareness requires a 
 couple of language changes. It won't work as long as people 
 keep casting things around so you need to fix a lot of cases 
 where casts are needed.
Type qualifiers provide the necessary information. However, some practices (that are already mentioned as being undefined behavior) will become really unsafe. It also requires enriching the GC API, but users aren't supposed to use it directly :D
 I'm a little wary of the cost of doing COW during collection. 
 Obviously the GC isn't pausing the program per-see, but it'll 
 be slowing it down. By how much is probably dependent on what 
 you're doing. At the very least the GC should allocate types 
 with no pointers on separate pages from those with pointers.
Nothing is free, and indeed, trapping the write via memory protection is quite expensive. Hopefully, we can segregate objects that contain mutable pointers from others and only memory-protect these.

It will indeed cause trouble for code that mutates a large amount of shared pointers. I'd say that such code is probably asking for trouble in the first place, but as always, no silver bullet. I still think this solution is the one that fits D best.
 Also, what are the calls required to implement page protection 
 and COW on posix? I'd like to check whether those are allowed 
 within the OS X and iOS sandbox. For instance fork() isn't 
 allowed for sandboxed apps.
You need mmap, mprotect and all the signal handling machinery.
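
A minimal sketch of that machinery on POSIX, in D: protect a page, catch the fault, record the write, and unprotect so execution continues. The page size, flags and signal choice vary by platform (macOS may deliver SIGBUS rather than SIGSEGV), and doing real work inside a signal handler is glossed over here:

    import core.stdc.signal : SIGSEGV;
    import core.sys.posix.signal : SA_SIGINFO, sigaction, sigaction_t, siginfo_t;
    import core.sys.posix.sys.mman : mmap, mprotect, MAP_ANON, MAP_PRIVATE,
                                     PROT_READ, PROT_WRITE;

    enum pageSize = 4096;                  // real code would query the page size
    __gshared void* trackedPage;
    __gshared bool pageWasWritten;         // the "dirty" bit a collector would read

    // A write to the protected page lands here: remember it and make the
    // page writable again so the faulting store can be re-executed.
    extern (C) void onFault(int sig, siginfo_t* info, void* ctx)
    {
        pageWasWritten = true;
        mprotect(trackedPage, pageSize, PROT_READ | PROT_WRITE);
    }

    void main()
    {
        // One anonymous page standing in for a chunk of the GC heap.
        trackedPage = mmap(null, pageSize, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANON, -1, 0);

        sigaction_t sa;
        sa.sa_sigaction = &onFault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, null);

        // "Collection starts": make the page read-only to trap the next write.
        mprotect(trackedPage, pageSize, PROT_READ);

        (cast(int*) trackedPage)[0] = 42;  // faults once, handler re-enables writes

        assert(pageWasWritten);            // the collector now knows to rescan the page
    }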
Oct 14 2013
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> said:

 Also, what are the calls required to implement page protection and COW 
 on posix? I'd like to check whether those are allowed within the OS X 
 and iOS sandbox. For instance fork() isn't allowed for sandboxed apps.
You need mmap, mprotect and all the signal handling machinery.
mprotect is the one I'm worried about, as it lets you set the executable bit (among other things) which could be exploited to run arbitrary code. So I tested it and it seems to work fine on OS X inside the sandbox (including for setting the executable bit). I guess an executable with a reference to mprotect would probably also pass Apple's Mac App Store validation, but I haven't tested. mprotect isn't available at all with the iOS SDK. So making this collector work on iOS (and the iOS Simulator) would require a different codegen. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 14 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-15 05:11, Michel Fortin wrote:

 mprotect isn't available at all with the iOS SDK. So making this
 collector work on iOS (and the iOS Simulator) would require a different
 codegen.
I haven't tried compiling anything and I don't know if I'm looking in the correct file but this file: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS7.0.sdk/usr/include/sys/mman.h Does contain "mprotect". -- /Jacob Carlborg
Oct 15 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-15 07:28:16 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-15 05:11, Michel Fortin wrote:
 
 mprotect isn't available at all with the iOS SDK. So making this
 collector work on iOS (and the iOS Simulator) would require a different
 codegen.
I haven't tried compiling anything and I don't know if I'm looking in the correct file but this file: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS7.0.sdk/usr/include/sys/mman.h
 Does contain "mprotect".
You're right. Yes, it does exist. I was confused.

Not only does it exist, but it lets you set the executable bit. I find that depressing, since I'm pretty sure App Store apps are prevented from setting the executable bit, and I'd tend to think now that they're blocking it by checking for references to mprotect in the executable when submitting to the App Store. And by doing it this way they probably wouldn't be able to distinguish between setting the executable bit and making a page read-only.

Also, someone would need to check that Windows Phone apps and Windows 8-style (Metro) apps can access mprotect (or equivalent) too. They're sandboxed just as heavily and statically checked upon submission the same way. Could some game consoles out there block it too? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
prev sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> said:

 It will indeed cause trouble for code that mutate a large amount of 
 shared pointers. I'd say that such code is probably asking for trouble 
 in the first place, but as always, no silver bullet. I still think 
 solution is the one that fit D the best.
I think there's a small mistake in your phrasing, but it makes a difference. When the collector is running, it needs to know about any mutation of pointers to its shared memory pool, including pointers that are themselves thread-local but point to shared memory.

So COW will be trouble for code that mutates a large amount of **pages containing pointers to shared memory**. And this includes **pointers to immutable data** because immutable is implicitly shared. And this includes **pointers to const data** since those pointers might point to immutable (thus shared) memory.

So any memory page susceptible of containing pointers to shared memory would need to use COW during collection. Which means all the thread's stacks, and also all objects with a pointer to shared, immutable, and const data. At this point I think it is fair to approximate this to almost all memory that could contain pointers. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 15 October 2013 at 11:03:01 UTC, Michel Fortin wrote:
 On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> 
 said:

 It will indeed cause trouble for code that mutate a large 
 amount of shared pointers. I'd say that such code is probably 
 asking for trouble in the first place, but as always, no 
 silver bullet. I still think solution is the one that fit D 
 the best.
I think there's a small mistake in your phrasing, but it makes a difference. When the collector is running, it needs to know about any mutation for pointers to its shared memory pool, including pointers that are themselves thread-local but point to shared memory. So COW will be trouble for code that mutate a large amount of **pages containing pointers to shared memory**. And this which includes **pointers to immutable data** because immutable is implicitly shared. And this includes **pointers to const data** since those pointers might point to immutable (thus shared) memory.
No, that is the beauty of it :D

Consider you have pointers from TL -> shared -> immutable and TL -> immutable.

I'm not covering TL collection here (it seems obvious that it doesn't require stopping the world). So the starting point is that we have the roots in all TL heaps/stacks, and we want to collect shared/immutable without blocking the world.

The TL heap may get new pointers to the shared heap, but they can only come from the shared heap itself or new allocations. At this point, you consider every new allocation as live.

Reading a pointer from the shared heap and copying it to the TL heap isn't problematic in itself, but then we have a problem if this pointer is now updated in the shared heap, as the GC may never scan this pointer.

This is why you need to track pointer writes to the shared heap. The written value itself isn't important: it comes from either a new allocation that is live, or from somewhere else in the shared heap (so it will be scanned as we track writes).
 So any memory page susceptible of containing pointers to shared 
 memory would need to use COW during collection. Which means all 
 the thread's stacks, and also all objects with a pointer to 
 shared, immutable, and const data. At this point I think it is 
 fair to approximate this to almost all memory that could 
 contain pointers.
No, only the shared one, that is the beauty of the technique. Note that I'm not making this up myself; it is how GCs in the Caml family have worked for a while, and it has proven really efficient (in the Caml family, most data is either immutable or thread local, and the shared heap is typically small).
Oct 15 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-15 18:32:01 +0000, "deadalnix" <deadalnix gmail.com> said:

 No, that is the beauty of it :D
 
 Consider you have pointer from Tl -> shared -> immutable and TL -> immutable.
 
 I'm not covering TL collection here (It seem to be obvious that it 
 doesn't require to stop the world). So the starting point is that we 
 have the roots in all TL heaps/stacks, and we want to collect 
 shared/immutable without blocking the worlds.
 
 TL heap may get new pointers to the shared heap, but they can only come 
 from the shared heap itself or new allocations. At this point, you 
 consider every new allocations as live.
 
 Reading a pointer from the shared heap and copy it to the TL heap isn't 
 problematic in itself, but then we have a problem if this pointer is 
 now updated in the shared heap, as the GC may never scan this pointer.
 
 This is why you need to track pointer writes to the shared heap. The 
 write value itself isn't important : it come from either new alloc that 
 are live, or from somewhere else in the shared heap (so it will be 
 scanned as we track writes).
But you still have to scan the thread-local heaps and stacks to find pointers to shared memory. If you don't stop those threads, how do you know one of these threads isn't moving a pointer value during the scan from one place the GC still has to scan to another that the GC has just scanned, making it so the GC never sees the pointer? For instance:

    class A
    {
        // string is immutable and points to shared heap
        immutable(char)[] s;
    }

    A global;

    void func()
    {
        // GC scans the stack of this thread in the background
        // no reference to our string on the stack

        // moving string pointer to the stack
        // while the GC is running
        auto tmp = global.s;
        global.s = null;

        // GC scans the heap of this thread in the background
        // no reference to our string on the heap
        // (missed reference to our string now on the stack)
    }

The thread can move the pointer around while the GC is looking away and you'll end up with a pointer to a freed string. So you have to use COW for the thread's stack, or get notified of a pointer assignment somehow (or of a pointer move), or stop the thread while you're scanning its heap. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
parent "deadalnix" <deadalnix gmail.com> writes:
Yes and no.

You obviously need to scan TL heaps at some point. When doing so
you'll have a set of roots that allows you to scan the shared/immutable
heap. What is going on in the TL heap becomes irrelevant once you
have the roots. And getting the roots from the TL heap is another
problem altogether (any kind of GC can be used for that: stop the
world, where the world is thread local, or another concurrent GC;
the important point being that it is an independent problem).
Oct 16 2013
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 14.10.2013 21:42, schrieb Michel Fortin:
 On 2013-10-14 17:57:18 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 13.10.2013 13:48, Michel Fortin wrote:
 For one of my projects I implemented a shared pointer like this. It uses
 the pointer value itself as a spin lock with the assumption that -1 is
 an invalid pointer value:

 1. read pointer value
 2. if read value is -1 go to line 1 (spin)
 3. compare and swap (previously read value <-> -1)
 4. if failure go to line 1 (spin)
 // now pointer is "locked", its value is -1 but we have a copy of the
 original
 5. copy pointer locally or assign to it (and update counter)
 6. write back pointer value atomically to replace the -1

 No mutex, but there's a spin lock so it's not good if there's
 contention.

 That said, I find it extremely rare to want a shared pointer that isn't
 already protected by a mutex alongside other variables, or that isn't
 propagated using some form of message passing.
Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario.
Spinning is good only when you very rarely expect contention, which is the case for me. The above code is used once per object the first time someone requests a weak pointer for it. Having contention for that just doesn't make sense in most usage patterns. But still, being curious I added a log message any time it actually has to spin, and I have yet to see that message once in my logs (which probably means aggressive enough unit tests are missing).

If you have a lot of read accesses and rarely write to the pointer you could instead try a read-write mutex with concurrent read access. In any case, there's no solution that will be ideal in all cases. Different situations ask for different trade-offs.
 At work, I use shared pointers all the time to pass information to a
 real time audio thread. The scheme uses triple-buffering of pointers
 for a lock free safe transport from/to the real time thread.

 Not having to worry about these low-level locking stuff is one of the
 good aspects about garbage collecting.
Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread?
Well, if real time concurrent GC for Java systems is good enough for systems that control military missile systems, maybe it is good enough for real-time audio as well. -- Paulo
Oct 14 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 14.10.2013 22:59, schrieb Paulo Pinto:
 Well, if real time concurrent GC for Java systems is good enough for
 systems that control military missile systems, maybe it is good enough
 for real-time audio as well.

 --
 Paulo
The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook" which explains this in deep detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.
Oct 16 2013
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:

 The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook" which explains this in deep detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.

I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.
Oct 16 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-10-16 21:05, Sean Kelly wrote:

 I think the short version is that D being able to directly call C code is a
huge problem here.  Incremental GCs all rely on the GC being notified when
pointers are changed.  We might be able to manage it for SafeD, but then SafeD
would basically be its own language.
One need to be very careful with the memory when interfacing with C code today. What about having a function that notifies the GC that a pointer has been updated? -- /Jacob Carlborg
Oct 16 2013
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 16.10.2013 21:05, schrieb Sean Kelly:
 On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:
 The problem is not that there are no GCs around in other languages which
satisfy certain requirements. The problem is actually implementing them in D. I
suggest that you read "The Garbage Collection Handbook" which explains this in
deep detail. I'm currently reading it, and I might write an article about the
entire D GC issue once I'm done with it.
I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.
I think an even bigger problem is structs. Because if you need write barriers for pointers on the heap, you are going to have a problem with structs: you will never know whether they are located on the heap or the stack. Additionally, making the stack precisely scannable and adding GC points will require a lot of compiler support. And even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.
Oct 16 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 16, 2013, at 1:05 PM, Benjamin Thaut <code benjamin-thaut.de> wrote:

 Am 16.10.2013 21:05, schrieb Sean Kelly:
 On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:

 The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook" which explains this in deep detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.

 I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.

 I think an even bigger problem is structs. Because if you need write barriers for pointers on the heap, you are going to have a problem with structs: you will never know whether they are located on the heap or the stack. Additionally, making the stack precisely scannable and adding GC points will require a lot of compiler support. And even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.

Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
Oct 17 2013
parent reply "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 17 October 2013 at 17:11:06 UTC, Sean Kelly wrote:
 And even if this is doable in respect to DMD its going to be a 
 big problem for GDC or LDC to change the codegen.
Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
LLVM actually comes with a quite extensive GC support infrastructure: http://llvm.org/docs/GarbageCollection.html. As far as I'm aware, it is not widely used in terms of the "top-tier" LLVM projects, so there might be quite a bit of work involved in getting that to run.

David
Oct 17 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 17.10.2013 19:16, schrieb David Nadlinger:
 On Thursday, 17 October 2013 at 17:11:06 UTC, Sean Kelly wrote:
 And even if this is doable in respect to DMD its going to be a big
 problem for GDC or LDC to change the codegen.
Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
LLVM actually comes with a quite expensive GC support infrastructure: http://llvm.org/docs/GarbageCollection.html. As far as I'm aware, it is not widely used in terms of the "top-tier" LLVM projects, so there might be quite a bit of work involved in getting that to run. David
Uhhh, this sounds really good. They in fact have everything to implement a generational garbage collector. This would improve the D GC situation a lot.

But reading the part about the shadow stack really lowers my expectations. That's really something you don't want. The performance impact is going to be so big that it doesn't make sense to use the better GC in the first place. Kind Regards Benjamin Thaut
Oct 17 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 17 October 2013 at 17:28:17 UTC, Benjamin Thaut 
wrote:
 Am 17.10.2013 19:16, schrieb David Nadlinger:

 But reading the part about the shadow stack really lowers my 
 expectations. That's really something you don't want. The 
 performance impact is going to be so big that it doesn't 
 make sense to use the better GC in the first place.
There's always a tradeoff. If an app is very delay-sensitive, then making the app run slower in general might be worthwhile if the delay could be eliminated.
Oct 17 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 17.10.2013 19:47, schrieb Sean Kelly:
 On Thursday, 17 October 2013 at 17:28:17 UTC, Benjamin Thaut wrote:
 Am 17.10.2013 19:16, schrieb David Nadlinger:

 But reading the part about the shadow stack really lowers my
 expectations. That's really something you don't want. The performance
 impact is going to be so big that it doesn't make sense to use the
 better GC in the first place.
There's always a tradeoff. If an app is very delay-sensitive, then making the app run slower in general might be worthwhile if the delay could be eliminated.
Well, I just read it again, and it appears to me that the shadow stack is something they already have implemented and that can be used for "gc prototyping", but if you want you can write your own code generator plugin and generate your own stack maps.

It actually sounds more feasible to implement a generational GC with LLVM than with what we have in dmd. Kind Regards Benjamin Thaut
Oct 17 2013
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 16.10.2013 20:54, Benjamin Thaut wrote:
 On 14.10.2013 22:59, Paulo Pinto wrote:
 Well, if real time concurrent GC for Java systems is good enough for
 systems that control military missile systems, maybe it is good enough
 for real-time audio as well.

 --
 Paulo
The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook" which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.
I read it when it came out in the late 90s. My main focus areas at university were compiler design, distributed systems and graphics programming. Although, like many, I tend to do plain boring enterprise applications nowadays. -- Paulo
Oct 16 2013
prev sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 14.10.2013 21:42, Michel Fortin wrote:
 Indeed. The current garbage collector makes it easy to have shared
 pointers to shared objects. But the GC can also interrupt real-time
 threads for an unpredictable duration; how do you cope with that in a
 real-time thread?
The work I was talking about uses C++, not D, so there is no GC involved. The options I see for real-time threads in D are either a concurrent GC (which means read/write barriers for pointer accesses) or just excluding the real time thread from suspension by the GC. This forces the programmer to ensure that references in the real time thread are also found elsewhere. I'm not sure if this eliminates the benefits regarding locking, though.
 I know ARC isn't the ideal solution for all use cases. But neither is
 the GC, especially for real-time applications. So, which one would you
 recommend for a project having a real-time audio thread?
ARC doesn't work for real time threads anyway, because you are not allowed to deallocate if it can cause locks. It can only work if you defer reference counting into another thread through some buffering. Realistically I would currently recommend the approach above: exclude the thread from suspension, and keep references to used objects elsewhere. This is probably about as difficult as avoiding allocations/deallocations in C++, but harder to debug.
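
To make the "defer reference counting into another thread through some buffering" option concrete, here is a rough D sketch under stated assumptions: the real-time thread never frees anything, it only records releases that a low-priority thread later processes. The names and the mutex are mine; a production version would use a preallocated wait-free ring buffer so the real-time path neither locks nor allocates.

    import core.sync.mutex;

    __gshared Object[] deferred;   // releases recorded but not yet processed
    __gshared Mutex    deferredLock;

    shared static this() { deferredLock = new Mutex; }

    // Real-time thread: record the release instead of performing it.
    // (Appending may allocate; a real implementation would preallocate.)
    void deferRelease(Object o)
    {
        deferredLock.lock();
        scope(exit) deferredLock.unlock();
        deferred ~= o;
    }

    // Reclamation thread: take the batch and do the actual count decrements
    // and destruction outside the real-time thread.
    void drainDeferred()
    {
        Object[] batch;
        {
            deferredLock.lock();
            scope(exit) deferredLock.unlock();
            batch = deferred;
            deferred = null;
        }
        foreach (o; batch)
        {
            // decrement o's count here; destroy it when the count hits zero
        }
    }
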
Oct 15 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 12 October 2013 at 06:16:24 UTC, Rainer Schuetze 
wrote:
 in pseudo-assembly missing null-checks:

 Thread1 (R = P)        Thread2 (S = R)

                        mov ecx,[R]
                        ; thread suspended
You need a sequentially consistent write. You also need to increment the refcount BEFORE! This codegen is incorrect.
 mov eax,[P]
 inc [eax].refcnt
Same here.
 mov ebx,[R]
 mov [R],eax
 dec [ebx].refcnt      ; refcnt of O now 0
 jnz done
 call delete_ebx
                        ; thread resumed
                        inc [ecx].refcnt
 done:

 The increment on [ecx].refcnt modifies garbage.
This can be done atomically (even with an eventually consistent increment; you don't need full sequential consistency here, though you still need a fully sequentially consistent decrement).
Oct 12 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 12.10.2013 20:31, deadalnix wrote:
 On Saturday, 12 October 2013 at 06:16:24 UTC, Rainer Schuetze wrote:
 in pseudo-assembly missing null-checks:

 Thread1 (R = P)        Thread2 (S = R)

                        mov ecx,[R]
                        ; thread suspended
You need a sequentially consistent write. You also need to increment the refcount BEFORE! This codegen is incorrect.
How do you increment the counter without reading its address?
 mov eax,[P]
 inc [eax].refcnt
Same here.
Same here ;-)
 mov ebx,[R]
 mov [R],eax
 dec [ebx].refcnt      ; refcnt of O now 0
 jnz done
 call delete_ebx
                        ; thread resumed
                        inc [ecx].refcnt
 done:

 The increment on [ecx].refcnt modifies garbage.
This can be done atomically (even with an eventually consistent increment; you don't need full sequential consistency here, though you still need a fully sequentially consistent decrement).
According to the "Handbook of Garbage Collection" by Richard Jones eager lock-free reference counting can only be done with a cas2 operation modifying two seperate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately the CAS2 operation does not exist in most processors.
Oct 13 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 How do you increment the counter without reading its address?
I assumed that the reference count was in a struct with the data, and refcounted pointers point to it. In this case, if you remove the pointer via a sequentially consistent write (while keeping a local copy internally) and THEN decrement the counter, the other thread will access another object (or skip on a null check). Granted the read is sequentially consistent.
Oct 13 2013
parent Artur Skawina <art.08.09 gmail.com> writes:
On 10/13/13 11:19, deadalnix wrote:
 On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 How do you increment the counter without reading its address?
I assumed that the reference count was in a struct with the data, and refcounted pointers point to it. In this case, if you remove the pointer via a sequentially consistent write (while keeping a local copy internally) and THEN decrement the counter, the other thread will access another object (or skip on a null check). Granted the read is sequentially consistent.
No, if you have two (or more) threads concurrently accessing the object, it is possible that one thread reads the pointer, then sleeps before incrementing the count. Then another thread comes and /destroys/ the object, the memory is reused for something else, or even unmapped. Then the first thread wakes up and increments the counter, which is no longer there, causing a crash or data corruption. But this is only a problem for shared objects, which are accessed without any locking -- it's not a common case at all, and can be dealt with by simply taking a lock *before* reading the reference. (there are many much more complex solutions such as CAS2 or RCU based ones). artur
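
A small D sketch of the "take a lock *before* reading the reference" approach described here, assuming AddRef/Release as in the proposal. The lock belongs to the pointer slot, so no thread can release the object between reading the pointer and bumping its count. All names are illustrative.

    import core.atomic;
    import core.sync.mutex;

    class RC
    {
        shared int refcnt = 1;
        RC AddRef() { atomicOp!"+="(refcnt, 1); return this; }
        void Release() { if (atomicOp!"-="(refcnt, 1) == 0) { /* destroy */ } }
    }

    struct SharedSlot
    {
        private RC    value;
        private Mutex lock;   // protects the slot, not the object it points to

        this(RC v) { value = v; lock = new Mutex; }

        // Read and retain atomically with respect to writers of this slot.
        RC read()
        {
            lock.lock();
            scope(exit) lock.unlock();
            return value is null ? null : value.AddRef();
        }

        // Replace the stored reference; the old one is released under the lock.
        void write(RC newVal)
        {
            lock.lock();
            scope(exit) lock.unlock();
            auto old = value;
            value = newVal is null ? null : newVal.AddRef();
            if (old !is null) old.Release();
        }
    }
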
Oct 13 2013
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 According to the "Handbook of Garbage Collection" by Richard 
 Jones eager lock-free reference counting can only be done with 
 a cas2 operation modifying two separate locations atomically 
 (algorithm 18.2 "Eager reference counting with CompareAndSwap 
 is broken"). This might be the quoted paper: 
 http://scholr.ly/paper/2199608/lock-free-reference-counting

 Unfortunately the CAS2 operation does not exist in most 
 processors.
I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 13 2013
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 13.10.2013 17:15, Sean Kelly wrote:
 On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 According to the "Handbook of Garbage Collection" by Richard Jones
 eager lock-free reference counting can only be done with a cas2
 operation modifying two separate locations atomically (algorithm 18.2
 "Eager reference counting with CompareAndSwap is broken"). This might
 be the quoted paper:
 http://scholr.ly/paper/2199608/lock-free-reference-counting

 Unfortunately the CAS2 operation does not exist in most processors.
I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
I didn't read the paper, but I'd suspect that the paper refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.
Oct 13 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 13.10.2013 19:05, Sönke Ludwig wrote:
 On 13.10.2013 17:15, Sean Kelly wrote:
 On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 According to the "Handbook of Garbage Collection" by Richard Jones
 eager lock-free reference counting can only be done with a cas2
 operation modifying two separate locations atomically (algorithm 18.2
 "Eager reference counting with CompareAndSwap is broken"). This might
 be the quoted paper:
 http://scholr.ly/paper/2199608/lock-free-reference-counting

 Unfortunately the CAS2 operation does not exist in most processors.
I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
I didn't read the paper, but I'd suspect that the paper refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.
I haven't read it either, but AFAICT the cas2 operation is used to modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr, it uses cas operations on the reference counts. It has the same problem as described in my example, see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->" as it doesn't increment the reference count.
Oct 14 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 14 October 2013 at 17:43:33 UTC, Rainer Schuetze wrote:
 On 13.10.2013 19:05, Sönke Ludwig wrote:
 On 13.10.2013 17:15, Sean Kelly wrote:
 On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze 
 wrote:
 According to the "Handbook of Garbage Collection" by Richard 
 Jones
 eager lock-free reference counting can only be done with a 
 cas2
 operation modifying two separate locations atomically 
 (algorithm 18.2
 "Eager reference counting with CompareAndSwap is broken"). 
 This might
 be the quoted paper:
 http://scholr.ly/paper/2199608/lock-free-reference-counting

 Unfortunately the CAS2 operation does not exist in most 
 processors.
I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
I didn't read the paper, but I'd suspect that the paper refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.
I haven't read it either, but AFAICT the cas2 operation is used to modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr, it uses cas operations on the reference counts. It has the same problem as described in my example, see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->" as it doesn't increment the reference count.
I'm totally out of my depth here but can't you store the reference count adjacent to the pointer and use CMPXCHG16B
Oct 14 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 14 October 2013 at 17:50:01 UTC, John Colvin wrote:
 I'm totally out of my depth here but can't you store the 
 reference count adjacent to the pointer and use CMPXCHG16B
I think that can work if the refcount is stored with the pointer (ie. if a fat pointer is used) and not in the object. But that would defeat the whole point of the refcount :( It seems that Rainer Schuetze is right and that the implementation that boost::shared_ptr uses is not safe.
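
For illustration, here is what the CMPXCHG16B idea amounts to: pointer and count live side by side and are compared-and-swapped together. The double-width CAS is emulated with a lock here so the sketch runs anywhere; a real implementation would use the hardware instruction, and that part is an assumption about the target. As noted above, this only helps when the count travels with the pointer (a fat pointer), not when it lives in the object.

    import core.sync.mutex;

    struct CountedRef
    {
        Object obj;    // the pointer
        size_t count;  // its reference count, stored adjacent to it
    }

    __gshared Mutex cas2Lock;
    shared static this() { cas2Lock = new Mutex; }

    // Stand-in for a hardware double-width CAS (CMPXCHG16B on x86_64):
    // compares and swaps pointer and count as one unit.
    bool cas2(ref CountedRef here, CountedRef ifThis, CountedRef writeThis)
    {
        cas2Lock.lock();
        scope(exit) cas2Lock.unlock();
        if (here.obj is ifThis.obj && here.count == ifThis.count)
        {
            here = writeThis;
            return true;
        }
        return false;
    }

    // Retain whatever the slot currently holds, without the read/increment gap.
    bool tryRetain(ref CountedRef slot)
    {
        for (;;)
        {
            auto old = slot;                 // racy snapshot; cas2 validates it
            if (old.obj is null) return false;
            auto upd = CountedRef(old.obj, old.count + 1);
            if (cas2(slot, old, upd)) return true;
        }
    }
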
Oct 14 2013
prev sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 14.10.2013 19:50, John Colvin wrote:
 On Monday, 14 October 2013 at 17:43:33 UTC, Rainer Schuetze wrote:
 On 13.10.2013 19:05, Sönke Ludwig wrote:
 On 13.10.2013 17:15, Sean Kelly wrote:
 On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:
 According to the "Handbook of Garbage Collection" by Richard Jones
 eager lock-free reference counting can only be done with a cas2
 operation modifying two separate locations atomically (algorithm 18.2
 "Eager reference counting with CompareAndSwap is broken"). This might
 be the quoted paper:
 http://scholr.ly/paper/2199608/lock-free-reference-counting

 Unfortunately the CAS2 operation does not exist in most processors.
I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
I didn't read the paper, but I'd suspect that the paper refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.
I haven't read it either, but AFAICT the cas2 operation is used to modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr, it uses cas operations on the reference counts. It has the same problem as described in my example, see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->" as it doesn't increment the reference count.
I'm totally out of my depth here but can't you store the reference count adjacent to the pointer and use CMPXCHG16B
This might work for a single pointer, but not if you have multiple pointers to the same object.
Oct 14 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 26.06.2013 13:12, Michel Fortin wrote:
 On 26 June 2013 at 5:38, Walter Bright wrote:

 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 As Michel also said, the reference count does not have to be in
 inside the object itself, so we might want to allow reference
 counting on other types aswell.
That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.
RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the latter case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.
That means you have to maintain two reference counts if you have both ARC and COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 to prevent the object from being collected while external references exist.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 2013-06-26 at 17:42, Rainer Schuetze <r.sagitario gmx.de> wrote:

 On 26.06.2013 13:12, Michel Fortin wrote:
 On 26 June 2013 at 5:38, Walter Bright wrote:

 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 As Michel also said, the reference count does not have to be in
 inside the object itself, so we might want to allow reference
 counting on other types aswell.
That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.
RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the latter case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.
That means you have to maintain two reference counts if you have both ARC and
COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 to prevent the object from being collected while external references exist. Yes. The "external" reference count prevents the gc from releasing memory and the "internal" reference count keeps track of the number of references from gc-scanned memory. When both fall to zero, the memory is released immediately; when the external count falls to zero, the gc can collect memory if it isn't connected to any of its roots. Note that you could implement the external count as a separate hash table that'll include a gc-scanned pointer to the object. That pointer would keep the internal count above zero as long as it is present in the table. The external count doesn't need to be formally part of the gc data structure.
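
A D sketch of the "separate hash table with a gc-scanned pointer" variant mentioned here: the table's keys are ordinary GC-visible references, so any object with a non-zero external count stays reachable. Thread safety and the object's own internal count are omitted; the names are mine.

    // Keys are object references, so the GC sees them and keeps the objects alive.
    __gshared size_t[Object] externalCount;

    size_t externalAddRef(Object o)
    {
        if (auto p = o in externalCount)
            return ++(*p);
        externalCount[o] = 1;   // 0 -> 1: object becomes pinned via the table
        return 1;
    }

    size_t externalRelease(Object o)
    {
        auto p = o in externalCount;
        if (p is null) return 0;          // was not externally retained
        auto n = --(*p);
        if (n == 0)
            externalCount.remove(o);      // 1 -> 0: unpin; the GC may now collect it
        return n;
    }
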
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2013 4:15 AM, Michel Fortin wrote:
 Well, it's mostly required to write runtime support functions. The attribute 
could be more obscure so people are less tempted to use it, but if you're going to implement the ref-counting code you'll need that. I know the temptation is strong to create more attributes as an easy solution, but I'd really like to try hard to find other ways.
 D would need manual, RC and GC to coexist peacefully.
 The problem is how to make the three of those use the same codegen?

 - Druntime could have a flag to disable/enable refcounting. It'd make the 
retain/release functions no-ops, but it'd not prevent the GC from reclaiming memory as it does today.
 - Druntime could have a flag to disable/enable garbage collection (it already 
has). That'd prevent cycles from being collected, but you could use weak pointers to work around that or request a collection manually at the appropriate time.
 - A  noarc (or similar) attribute at the function level could be used to 
prevent the compiler from generating function calls on pointer assignments. You could make a whole module noarc if you want by adding " noarc:" at the top.
 Here's the annoying thing:  noarc is totally safe if reference counting is 
disabled and we rely entirely on the GC. noarc is unsafe when reference counting is enabled. I don't really understand your point. My proposal enables manual, RC and GC to coexist peacefully. The GC wouldn't even know about RC.
 The downside is that every assignment to a pointer anywhere has to call a 
function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situations (games I guess). Another downside is that if an object is retained only by being present on the stack frame of a C function, it'd have to be explicitly retained from elsewhere.
 Doesn't this make it impractical to mix vanilla C with D code? An important 
feature of D is this capability, without worrying about a "JNI" style interface.
 It's not very different than with the GC today.

 If you call a C function by giving it a ref-counted pointer argument, that 
memory block is guarantied to live at least for that call's lifetime (because it is retained by the caller). So simple calls to C functions are not a problem.
 If the C function puts that pointer elsewhere you'll need to retain it some 
other way, but you have to do this with the GC too. If you're implementing a callback called from C you need to care about what you return because the caller's C code won't retain it, while with the GC you could manage if C code did not store that pointer outside of the stack.
 I think that's all you have to worry about.
D (like C and C++) loves to manipulate pointers. Having to call a function every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.
 As for D switching to a full refcounted GC for everything, I'm very hesitant 
for such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.
 Don't let Clang intimidate you. The Clang spec is about four to five times 
more complicated than needed because of autoreleased objects and because it supports weak pointers. Weak pointers can be implemented as a struct templates (as long as we have noarc). And all those annotations are for special cases, when you need to break the rules. You don't use them when doing normal programming, well except for __weak.

Ok.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
 On 26.06.2013 11:38, Walter Bright wrote:
 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 I imagine a few (constrained) templated functions for the different
 operations defined in the library could also do the job, though it
 might drown compilation speed. Also getting help from the optimizer to
 remove redundant calls will need some back doors.
I don't see how this can be done without specific compiler knowledge in a memory safe way.
I currently don't see how it can be memory safe with this proposal.
I'm a little confused about what you are referring to here.
 3. Assignment to a class reference causes a call to AddRef() on the new
 value
 followed by a call to Release() on its original value.
It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).
Since the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.
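
For reference, a D sketch of the "usual COM implementation" under discussion: atomic increment/decrement and free at zero. This is the baseline whose safety for shared *pointers* (as opposed to a shared count) is being questioned in this subthread; the class and field names are illustrative.

    import core.atomic;

    class RCObject
    {
        private shared int refcnt = 1;

        RCObject AddRef()
        {
            atomicOp!"+="(refcnt, 1);
            return this;
        }

        void Release()
        {
            if (atomicOp!"-="(refcnt, 1) == 0)
            {
                // free here; only safe if no other thread can still be between
                // "read the pointer" and "AddRef" -- the race discussed in this thread
            }
        }
    }
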
You have to put the lock around the pair of AddRef and Release, but if the
compiler already splits this into two function calls, this cannot be done in the implementation. Why is it necessary to put a lock around the pair?
 12. AddRef() is not called when passed as the implicit 'this' reference.
Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?
No. The caller of the function still retains a reference in that thread.
Hmmm, I guess I misunderstand the proposal. Assume for example a refcounted
class R and this code
 class R : RefCounted
 {
     int _x;
     int readx() { return _x; }
 }
 int main()
 {
     R r = new R;
     return r.readx();
 }

 According to 12. there is no refcounting going on when calling or executing 
readx. Ok, now what happens here:
 class R : RefCounted
 {
     int _x;
     int readx(C c)
     {
         c.r = null; // "standard" rc deletes r here
         return _x;  // reads garbage
     }
 }
 class C
 {
     R r;
 }
 int main()
 {
     C c = new C;
     c.r = new R;
     return c.r.readx(c);
 }

 This reads garbage or crashes if there is no reference counting going on when 
calling readx. I think you're right. Grrrr!
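
To make the failure and the fix concrete, here is a hand-written D version of the guard the compiler would have to emit around such a call (my sketch, not the proposal's wording): the caller retains the receiver for the duration of the call, so clearing c.r inside readx can no longer free it.

    class R
    {
        int _x;
        int refs = 1;
        R AddRef() { ++refs; return this; }
        void Release() { if (--refs == 0) { /* destroy/free here */ } }

        int readx(C c)
        {
            auto old = c.r;
            c.r = null;                    // models "standard rc deletes r here"
            if (old !is null) old.Release();
            return _x;
        }
    }

    class C { R r; }

    int callReadx(C c)
    {
        auto tmp = c.r;              // take our own reference to the receiver
        tmp.AddRef();                // retain it across the call
        scope(exit) tmp.Release();   // matching release when the statement ends
        return tmp.readx(c);         // refs goes 2 -> 1 inside, never reaches 0
    }
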
 13. Taking the address of, or passing by reference, any fields of an RC
 object
 is not allowed in  safe code. Passing by reference an RC field is
 allowed.
Please note that this includes slices to fixed size arrays.
As I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.
Just to clarify, I meant taking a slice of a static array that is a field of
a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden? It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).
 I feel I'm hijacking this proposal, but the step to library defined
 read/write barriers seems pretty small. Make AddRef, Release and
 assignment free template functions, e.g.

 void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
 void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
 void ptrRelease(T,bool stackOrHeap)(T*adr);

 and we are able to experiment with all kinds of sophisticated GC
 algorithms including RC. Eliding redundant addref/release pairs would
 need some extra support though, I read that LLVM does something like
 this, but I don't know how.
It's pretty invasive into the code generation and performance, and could completely disrupt the C compatibility of D.
I don't see a big difference between a free function and a member function
call, though the template character of it might hurt compilation performance.
 Two more notes:

 - I'm not sure it is mentioned, but I think you have to describe what happens 
when copying a struct. pre- and post-blit actions have to be taken if the struct contains pointers to refcounted objects. Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.
 10. Function returns have an AddRef() already done to the return value.
- A refcounted reference returned from a function (including new) would have
to be Released if the return value is ignored or if only used as part of an expression.

That's right. Just as if a function returned a struct with a destructor.

The model I am basing this on is C++'s shared_ptr<>, which makes use of all the 
various rules around construction, assignment, and destruction. The wrinkles we 
have are:

1. memory safety
2. working with the GC
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 26 June 2013 at 19:48, Walter Bright wrote:

 D (like C and C++) loves to manipulate pointers. Having to call a function 
every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel. But at the same time, having a GC that stops the world at irregular intervals is worse for certain kinds of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class. And about pointer manipulation: for cases where you know for certain that the pointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call. The downside of optional support for language-wide reference counting is that it requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call. I'm not claiming this is the right solution. It's just an idea I wanted to mention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2013 6:33 PM, Michel Fortin wrote:
 On 26 June 2013 at 19:48, Walter Bright wrote:

 D (like C and C++) loves to manipulate pointers. Having to call a function 
every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.
 But at the same time, having a GC that stops the world at irregular intervals 
is worse for certain kinds of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class.
 And about pointer manipulation: for cases where you know for certain that the 
pointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call. I've never seen a scheme for "knows for certain" that did not involve extensive and intrusive pointer annotations, something we very much want to avoid. Pointer annotations work great in theory, but in practice no successful language I know of uses them (we'll see if Rust will become an exception).
 The downside of optional support for language-wide reference counting is that 
it requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call.
 I'm not claiming this is the right solution. It's just an idea I wanted to 
mention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.

This proposal is modeled after C++'s shared_ptr<T> in that it should have 
equivalent performance and capabilities. Since it has been well accepted into 
the C++ "best practices", I think we're on solid ground with it.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 27.06.2013 02:11, Walter Bright wrote:
 On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
 On 26.06.2013 11:38, Walter Bright wrote:
 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 I imagine a few (constrained) templated functions for the
 different operations defined in the library could also do the
 job, though it might drown compilation speed. Also getting help
 from the optimizer to remove redundant calls will need some
 back doors.
I don't see how this can be done without specific compiler knowledge in a memory safe way.
I currently don't see how it can be memory safe with this proposal.
I'm a little confused about what you are referring to here.
When preparing for dconf I read the "Garbage Collection Handbook" by Richard Jones, and it very much supported my suspicion that the usual reference counting cannot be both memory safe and high-performance.
 You have to put the lock around the pair of AddRef and Release, but
 if the compiler already splits this into two function calls, this
 cannot be done in the implementation.
Why is it necessary to put a lock around the pair?
To be more accurate, it is the assignment and the Release that have to be atomic, in addition to a concurrent read with AddRef. Imagine the reading thread is suspended while just having read the pointer, but not incremented the reference count yet. If an assignment with release and deletion is performed before the thread resumes, AddRef is called on garbage. IIRC you also have the GC handbook on your shelf. Check the chapters on RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap is broken".
 Just to clarify, I meant taking a slice of a static array that is a
  field of a refcounted class. Is it forbidden to have a field like
 this in a refcounted class or is taking the address through slicing
 forbidden?
It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).
Ok.
 Two more notes:

 - I'm not sure it is mentioned, but I think you have to describe
 what happens when copying a struct. pre- and post-blit actions have
 to be taken if the struct contains pointers to refcounted objects.
Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.
Ok, I tend to forget about the swapping to a temporary when assigning structs.
 The model I am basing this on is C++'s shared_ptr<>, which makes use
 of all the various rules around construction, assignment, and
 destruction. The wrinkles we have are:

 1. memory safety
This is the hard part.
 2. working with the GC
I don't think that the GC is getting in the way as long as it is mark-and-sweep. A fully reference counting GC is a different story and can be made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:


 Why is it necessary to put a lock around the pair?
 To be more accurate, it is the assignment and the Release that have to
 be atomic, in addition to a concurrent read with AddRef. Imagine the
 reading thread is suspended while just having read the pointer, but not
 incremented the reference count yet. If an assignment with release and
 deletion is performed before the thread resumes, AddRef is called on
 garbage.
On my way to work today, I figured that a safe but slow implementation can work, if the interface is not AddRef/Release, but

  class C
  {
      C readThis();
      void writeThis(ref C c);
  }

where the function can include the necessary locks, e.g.

  class C
  {
      int refcnt;

      C readThis()
      {
          synchronized(this)
          {
              refcnt++;
              return this;
          }
      }
      void writeThis(ref C c)
      {
          synchronized(c)
          {
              C x = c;
              c = this;
              if (--c.refcnt == 0)
                  delete c;
          }
      }
  }

Reading/Writing null (e.g. when constructing or destructing a reference) would have to be special cased; that would not be necessary with free functions.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:


On 27 June 2013 at 8:03, "Rainer Schuetze" <r.sagitario gmx.de> wrote:

 On my way to work today, I figured that a safe but slow implementation can 
work, if the interface is not AddRef/Release, but
 class C
 {
  C readThis();
  void writeThis(ref C c);
 }

 where the function can include the necessary locks, e.g.

 class C
 {
  int refcnt;

  C readThis()
  {
    synchronized(this)
    {
      refcnt++;
      return this;
    }
  }
  void writeThis(ref C c)
  {
    synchronized(c)
    {
       C x = c;
       c = this;
       if (--c.refcnt == 0)
         delete c;
    }
  }
 }
There's an error in this code. You must synchronize on the lock protecting the pointer, not on the lock at the other end of the pointer's value. Also, you only need to do this if the pointer pointing to the object is shared. If the pointer is thread-local, assignment does not need to be atomic. And if the object itself is thread-local, not even the reference counter needs to be atomic.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2013 11:28 PM, Rainer Schuetze wrote:
 On 27.06.2013 02:11, Walter Bright wrote:
 On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
 On 26.06.2013 11:38, Walter Bright wrote:
 On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
 I imagine a few (constrained) templated functions for the
 different operations defined in the library could also do the
 job, though it might drown compilation speed. Also getting help
 from the optimizer to remove redundant calls will need some
 back doors.
I don't see how this can be done without specific compiler knowledge in a memory safe way.
I currently don't see how it can be memory safe with this proposal.
I'm a little confused about what you are referring to here.
When preparing for dconf I read the "Garbage Collection Handbook" by Richard Jones, and it very much supported my suspicion that the usual reference counting cannot be both memory safe and high-performance.
I think that with the rules in the proposal we have, we can support it.
 You have to put the lock around the pair of AddRef and Release, but
 if the compiler already splits this into two function calls, this
 cannot be done in the implementation.
Why is it necessary to put a lock around the pair?
To be more accurate, it is the assignment and the Release that have to be atomic, in addition to a concurrent read with AddRef. Imagine the reading thread is suspended while just having read the pointer, but not incremented the reference count yet. If an assignment with release and
deletion is performed before the thread resumes, AddRef is called on garbage. I see. I'll have to think about that some more.
 IIRC you also have the GC handbook on your shelf. Check the
 chapters on RC, especially algorithm 18.2 "Eager reference counting with
 CompareAndSwap is broken".
I have the book, but it is the first edition and there's no chapter 18 in it :-(
 Just to clarify, I meant taking a slice of a static array that is a
  field of a refcounted class. Is it forbidden to have a field like
 this in a refcounted class or is taking the address through slicing
 forbidden?
It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).
Ok.
 Two more notes:

 - I'm not sure it is mentioned, but I think you have to describe
 what happens when copying a struct. pre- and post-blit actions have
 to be taken if the struct contains pointers to refcounted objects.
Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.
Ok, I tend to forget about the swapping to a temporary when assigning structs.
 The model I am basing this on is C++'s shared_ptr<>, which makes use
 of all the various rules around construction, assignment, and
 destruction. The wrinkles we have are:

 1. memory safety
This is the hard part.
 2. working with the GC
I don't think that the GC is getting in the way as long as it is mark-and-sweep. A fully reference counting GC is a different story and can be
made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.

I don't think we should do a fully ref counted GC anyway.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 27.06.2013 15:50, Michel Fortin wrote:
 On 27 June 2013 at 8:03, "Rainer Schuetze" <r.sagitario gmx.de> wrote:

 class C
 {
   C readThis();
   void writeThis(ref C c);
 }

 where the function can include the necessary locks, e.g.

 class C
 {
   int refcnt;

   C readThis()
   {
     synchronized(this)
     {
       refcnt++;
       return this;
     }
   }
   void writeThis(ref C c)
   {
     synchronized(c)
     {
        C x = c;
        c = this;
        if (--c.refcnt == 0)
          delete c;
     }
   }
 }
There's an error in this code. You must synchronize on the lock protecting the pointer, not on the lock at the other end of the pointer's value.
You're right (I was about to run to a meeting when writing this). Then, readThis will also need a reference to the pointer. Another more obvious bug is that it should read if (--x.refcnt == 0) delete x;
 Also, you only need to do this if the pointer pointing to the object
 is shared. If the pointer is thread-local, assignment does not need
 to be atomic. And if the object itself is thread-local, not even the
 reference counter need to be atomic.
True, these issues only happen with shared pointers. But remember that fields in shared objects are also shared. I also have a hard time imagining how the code above works with reading pointers that live in registers or writing to "references to registers". These are never shared, so they could have simpler implementations.
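
Pulling the corrections from this exchange together, here is one way the sketch could look with the fixes applied: the lock guards the pointer slot rather than the pointee, readThis operates on that slot, and the decrement is done on the old value (the "--x.refcnt" fix). This consolidation is mine, not code from the thread, and this variant retains the new value and releases the old rather than transferring the caller's count; register-resident references still need separate treatment, as noted.

    import core.sync.mutex;

    class C
    {
        int refcnt = 1;
        // payload ...
    }

    struct Slot   // a shared pointer location plus the lock that protects it
    {
        private C     value;
        private Mutex lock;

        this(C v) { value = v; lock = new Mutex; }

        C readThis()
        {
            lock.lock();
            scope(exit) lock.unlock();
            if (value !is null) value.refcnt++;
            return value;
        }

        void writeThis(C newVal)
        {
            lock.lock();
            scope(exit) lock.unlock();
            C x = value;
            value = newVal;
            if (newVal !is null) newVal.refcnt++;   // retain new, release old
            if (x !is null && --x.refcnt == 0)
            {
                // destroy/free x here
            }
        }
    }
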
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 27 June 2013 at 13:04, Walter Bright wrote:

 I don't think we should do a fully ref counted GC anyway.
Speaking of the GC, you should probably rethink this point:
 14. RC objects will still be allocated on the GC heap - this means that a
normal
 GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically
 scanned for heap references with no additional action required by the user.
If you allocate the object from the GC heap, the GC will collect it regardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work. If instead you allocate the object outside of the GC heap and your object contains pointers to the GC heap, you'll need to add roots to the GC for any pointer variable in the object. (This is what DMD/Objective-C currently does.) There's no way to detect cycles with that scheme, but it is simple. We could use a hybrid scheme with two reference counts: one for internal references that the GC can see and one for external references that the GC cannot see. The GC cannot collect an object if the external reference count is non-zero. If the external count is zero, it can collect the object if the internal reference count reaches zero or if it becomes unreachable from any root. This allows detection of cycles, as long as this cycle is only made of internal references. Care must be taken about incrementing/decrementing the right reference count depending on the context, which sounds tricky. Or we could use a somewhat less hybrid scheme where we have one reference count and the only thing it does is prevent objects from being deallocated. This can be implemented as one global hash table and you put all objects that have a non-zero reference count in that table. This hash table being scanned by the GC anything in it will never be collected. This will also detect internal cycles like the previous two-counter scheme, but it doesn't allow immediate deallocation as it waits for the GC to deallocate. (This is similar to how it worked in my defunct D/Objective-C bridge that did not rely on tweaking the compiler.)
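
A D sketch of the two-count hybrid described here, using core.memory's addRoot/removeRoot to pin the object while foreign code holds it. The field and function names are mine, and the bookkeeping is deliberately simplified (no thread safety, no cycle handling).

    import core.memory : GC;

    class Hybrid
    {
        int internalRefs;  // references from GC-scanned memory
        int externalRefs;  // references held by code the GC cannot see (e.g. COM)

        void addExternalRef()
        {
            if (externalRefs++ == 0)
                GC.addRoot(cast(void*) this);    // 0 -> 1: GC must not collect us
        }

        void releaseExternal()
        {
            if (--externalRefs == 0)
            {
                GC.removeRoot(cast(void*) this); // 1 -> 0: GC may collect again
                if (internalRefs == 0)
                {
                    // both counts are zero: free eagerly, or leave it to the GC
                }
            }
        }
    }
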
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/27/2013 11:38 AM, Michel Fortin wrote:
 On 27 June 2013 at 13:04, Walter Bright wrote:

 I don't think we should do a fully ref counted GC anyway.
Speaking of the GC, you should probably rethink this point:
 14. RC objects will still be allocated on the GC heap - this means that a
normal
 GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically
 scanned for heap references with no additional action required by the user.
If you allocate the object from the GC heap, the GC will collect it
regardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work. We already require that if you're going to pass a pointer to any GC allocated data to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap.
 If instead you allocate the object outside of the GC heap and your object 
contains pointers to the GC heap, you'll need to add roots to the GC for any pointer variable in the object. (This is what DMD/Objective-C currently does.) There's no way to detect cycles with that scheme, but it is simple. Yes, but that's a lot harder (and more error-prone) than simply requiring the programmer to retain a pointer as I outlined above.
 We could use a hybrid scheme with two reference counts: one for internal 
references that the GC can see and one for external references that the GC cannot see. The GC cannot collect an object if the external reference count is non-zero. If the external count is zero, it can collect the object if the internal reference count reaches zero or if it becomes unreachable from any root. This allows detection of cycles, as long as this cycle is only made of internal references. Care must be taken about incrementing/decrementing the right reference count depending on the context, which sounds tricky. That also seems far more complex than what I proposed.
 Or we could use a somewhat less hybrid scheme where we have one reference 
count and the only thing it does is prevent objects from being deallocated. This can be implemented as one global hash table and you put all objects that have a non-zero reference count in that table. This hash table being scanned by the GC anything in it will never be collected. This will also detect internal cycles like the previous two-counter scheme, but it doesn't allow immediate deallocation as it waits for the GC to deallocate. (This is similar to how it worked in my defunct D/Objective-C bridge that did not rely on tweaking the compiler.)

I'd really like to stick to the shared_ptr<T> model. (A global hash table also 
is not so simple when factoring in loading and unloading DLLs.) Of course, for 
the O-C bridge, you can implement it as required to be compatible with O-C.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 27 June 2013 at 15:32, Walter Bright wrote:

 On 6/27/2013 11:38 AM, Michel Fortin wrote:
 14. RC objects will still be allocated on the GC heap - this means that a 
normal
 GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically
 scanned for heap references with no additional action required by the user.
If you allocate the object from the GC heap, the GC will collect it
regardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work.
 We already require that if you're going to pass a pointer to any GC allocated 
data to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap. Perhaps it's just me, but I'd say if you need to anticipate the duration for which you need to keep the object alive when you pass it to some external code it completely defeats the purpose of said external code calling AddRef and Release. With the scheme you propose, reference counting would be useful inside D code as a way to deallocate some classes of objects early without waiting a GC scan. The GC can collect cycles for those objects. People passing COM objects to external code however should allocate those objects outside of the GC if they intend to pass the object to external code. They should also add member pointers as GC roots. Also, no cycle detection for those objects. If done right it could be made memory safe, but cycles will leak. Maybe that could work.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/27/2013 1:15 PM, Michel Fortin wrote:
 On 27 June 2013 at 15:32, Walter Bright wrote:

 On 6/27/2013 11:38 AM, Michel Fortin wrote:
 14. RC objects will still be allocated on the GC heap - this means that a 
normal
 GC run will reap RC objects that are in a cycle, and RC objects will get 
automatically
 scanned for heap references with no additional action required by the user.
If you allocate the object from the GC heap, the GC will collect it
regardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work.
 We already require that if you're going to pass a pointer to any GC 
allocated data to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap.
 Perhaps it's just me, but I'd say if you need to anticipate the duration for 
which you need to keep the object alive when you pass it to some external code it completely defeats the purpose of said external code calling AddRef and Release.
 With the scheme you propose, reference counting would be useful inside D code 
as a way to deallocate some classes of objects early without waiting a GC scan. The GC can collect cycles for those objects.
 People passing COM objects to external code however should allocate those 
objects outside of the GC if they intend to pass the object to external code. They should also add member pointers as GC roots. Also, no cycle detection for those objects. If done right it could be made memory safe, but cycles will leak.
 Maybe that could work.
Nothing about the proposal acts to prevent one from constructing COM objects any way they wish, including using malloc/free and managing it all themselves. All COM objects require is an implementation of the COM interface, which says nothing at all beyond having a pointer to an AddRef() and Release(). If you are building a COM object that is to be fired and forgotten into the void of unknown external code, I don't think there's any automated replacement for thinking carefully about it and constructing it accordingly. D's memory safety guarantees cannot, of course, cover unknown and unknowable external code.

What I'm trying to accomplish with this proposal is:

1. A way to do ref-counted memory allocation for specific objects
2. Do it in a guaranteed memory safe manner (at least for the user of those objects)
3. Do it in a way that does not interfere with people who want to use the GC or do manual memory management
4. Not impose penalties on non-refcounted code
5. Do it in a way that offers a similar performance and overhead profile to C++'s shared_ptr<T>
6. Do it in a way that makes it usable to construct COM objects, and work with NSObject's
7. Not require pointer annotations
8. Address the most common "why I can't use D" complaint

What I'm not trying to accomplish is:

1. Replacing all memory allocation in D with ref counting
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 27 June 2013 at 16:56, Walter Bright wrote:

 What I'm trying to accomplish with this proposal is:

 1. A way to do ref-counted memory allocation for specific objects
 2. Do it in a guaranteed memory safe manner (at least for the user of those 
objects)
 3. Do it in a way that does not interfere with people who want to use the GC 
or do manual memory management
 4. Not impose penalties on non-refcounted code
 5. Do it in a way that offers a similar performance and overhead profile to 
C++'s shared_ptr<T>
 6. Do it in a way that makes it usable to construct COM objects, and work 
with NSObject's
 7. Not require pointer annotations
 8. Address the most common "why I can't use D" complaint

 What I'm not trying to accomplish is:

 1. Replacing all memory allocation in D with ref counting
That list is great for limiting the scope of your DIP. Make sure you include it in the DIP. So if we return to the core of it, here are the problems that still need solving:

1. Depending on the reference counting scheme implemented, it might be more efficient to have a single operation for an assignment (retain a/release b). I think that should be allowed.
2. If the pointer variable is shared, assignment must be atomic (done under a lock, and it must always be the same lock for a given pointer, obviously).
3. If the pointer variable is shared, reading its value must be done atomically with a retain too.

Here's a suggestion for problem number 1 above:

	class MyObject
	{
		// user-implemented
		static void opRetain(MyObject var);  // must accept null
		static void opRelease(MyObject var); // must accept null

		// optional (showing default implementation below)
		// this can be made faster for some implementations of ref-counting
		// only call it for an assignment, not for constructing/destructing the pointer
		// (notably for Objective-C)
		static void opPtrAssign(ref MyObject var, MyObject newVal) {
			opRetain(newVal);
			opRelease(var);
			var = newVal;
		}
	}

This maps 1 on 1 to the underlying functions for Objective-C ARC. I don't have a solution for the shared case. We do in fact have a tail-shared problem here. If I have a shared(MyObject), the pointer is shared along with the object. When the pointer itself is shared, we need a lock to access it reliably and that can only be provided by the outer context. If we had a way to express tail-shared, then we could repeat the above three functions for tail-shared object pointers and it'd work reliably for that.
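
As a usage note, here is the lowering I read into that suggestion, with MyObject stubbed out just so the snippet stands alone:

    class MyObject
    {
        static void opRetain(MyObject var)  {}  // stubs; real ones as above
        static void opRelease(MyObject var) {}
        static void opPtrAssign(ref MyObject var, MyObject newVal)
        {
            opRetain(newVal);
            opRelease(var);
            var = newVal;
        }
    }

    void example(ref MyObject a, MyObject b)
    {
        // written:  a = b;
        // emitted:  MyObject.opPtrAssign(a, b);
        MyObject.opPtrAssign(a, b);
    }
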
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin Wrote:

On 27 June 2013 at 18:35, Michel Fortin wrote:

 	class MyObject
 	{
 		// user-implemented
 		static void opRetain(MyObject var);  // must accept null
 		static void opRelease(MyObject var); // must accept null

 		// optional (showing default implementation below)
 		// this can be made faster with for some implementations of ref-counting
 		// only call it for an assignment, not for constructing/destructing the
pointer
 		// (notably for Objective-C)
 		static void opPtrAssign(ref MyObject var, MyObject newVal) {
 			opRetain(newVal);
 			opRelease(var);
 			var = newVal;
 		}
 	}

 This maps 1 on 1 to the underlying functions for Objective-C ARC.
Actually, I made a small error in opRetain. To match Objective-C ARC it should 
return the retained object:

    static MyObject opRetain(MyObject var); // must accept null

and the default implementation for opPtrAssign would then become:

    static void opPtrAssign(ref MyObject var, MyObject newVal) {
        newVal = opRetain(newVal);
        opRelease(var);
        var = newVal;
    }

One reason is that Objective-C blocks (the equivalent of delegate literals) are 
stack allocated. If you call retain on a block, it'll make a copy on the heap and 
return that copy.

Another reason for opRetain to return the object is to enable tail-call 
optimization for cases like this one:

    NSObject x;

    NSObject getX()
    {
        return x; // D ARC should insert an implicit opRetain here
    }

Of course it doesn't necessarily need to work that way, but it'd certainly make 
it easier to integrate with Objective-C if it worked that way.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 25.06.2013 23:00, Walter Bright wrote:
 4. Null checks are done before calling any AddRef() or Release().
Here is another nitpick that needs to be addressed: as in the implementation of 
ComObject, invariants (and out contracts) must not be called when returning from 
Release if it is ok to actually delete the object.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 28.06.2013 00:35, Michel Fortin wrote:
 So if we return to the core of it, here's the problems that still
 need solving:

 1. Depending on the reference counting scheme implemented, it might
 be more efficient to have a single operation for an assignment
 (retain a/release b) operation. I think that should be allowed.
 2. If the pointer variable is shared assignment must be atomic (done
 under a lock, and it must always be the same lock for a given
 pointer, obviously).
 3. If the pointer variable is shared, reading its value must be done
 atomically with a retain too.
I just had an idea, maybe it is obvious and just distracts, but I thought it 
might be worth sharing:

Instead of defining methods on the class type, we could also redefine the 
reference type. The compiler detects a type declaration "reference_type" in the 
class declaration and replaces all references to that class with that type.

    class C
    {
        alias shared_ptr!C reference_type;
    }

    C c = new C;

is lowered to

    shared_ptr!C c = new C;

"new C" returns a shared_ptr!C as well.

It is then up to the implementation of shared_ptr to define what member functions 
to call for reference counting and to deal with proper shared semantics in 
assignments. It can also define whether opCall should increment the reference 
count or not. For most of the needed functionality, struct semantics work 
out-of-the-box.

Three immediate gotchas:

- In a class hierarchy, you would want to define the reference_type in the base 
class only, so maybe it has to be a template. I'm not sure implicit casting to 
the base class reference type and to interfaces can be implemented.

- The implementation of the shared_ptr template will have to be able to deal with 
the "raw" reference, so that might need some type modifier/annotation. I think 
this might also be true for the addRef/release version, if the implementation is 
not just working on the refcount, but is also calling other functions.

- To elide redundant reference counting, the compiler will need annotations here, 
too. Move semantics of structs might reduce the number of reference count 
operations already, though.
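A minimal sketch of a hand-written wrapper in the spirit of this reference_type 
idea (RefPtr and Widget are illustrative names, not part of D or druntime; a real 
lowering would be done by the compiler rather than written out like this):

    struct RefPtr(C)
    {
        private C obj;

        this(C o) { obj = o; retain(); }
        this(this) { retain(); }                  // copying the wrapper adds a reference
        ~this() { release(); }

        // copy-and-swap assignment: rhs arrives retained by the postblit and
        // its destructor releases whatever we swap into it
        void opAssign(RefPtr rhs)
        {
            auto tmp = obj;
            obj = rhs.obj;
            rhs.obj = tmp;
        }

        inout(C) get() inout { return obj; }      // "raw" access; the unsafe part

        private void retain()  { if (obj !is null) obj.AddRef(); }
        private void release() { if (obj !is null) obj.Release(); }
    }

    class Widget
    {
        int refs;                                 // starts at 0; the wrapper retains it
        Widget AddRef()  { ++refs; return this; }
        Widget Release() { --refs; return this; } // a real implementation frees at zero
    }

    void main()
    {
        auto p = RefPtr!Widget(new Widget);       // refs == 1
        {
            auto q = p;                           // postblit: refs == 2
            assert(p.get.refs == 2);
        }                                         // q destroyed: refs == 1
        assert(p.get.refs == 1);
    }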
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 28.06.2013 09:07, Walter Bright wrote:
 Will add to proposal.

 On 6/27/2013 11:27 PM, Rainer Schuetze wrote:
 On 25.06.2013 23:00, Walter Bright wrote:
 4. Null checks are done before calling any AddRef() or Release().
Here is another nitpick that needs to be addressed: As mentioned in the implementation of ComObject invariants (and out contracts) must not be called when returning from Release, if it is ok to actually delete the object.
Sorry to produce these drop by drop, but while writing the last mail, I noticed 
another issue to think about:

What happens if the class also implements interfaces? A reference of the 
interface type must do reference counting as well. So the interface must also 
define AddRef and Release. This is currently true for COM interfaces derived from 
IUnknown, but not for other interfaces.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 28 June 2013 at 4:55, Rainer Schuetze <r.sagitario gmx.de> wrote:

 What happens if the class also implements interfaces? A reference of the 
 interface type must do reference counting as well. So the interface must also 
 define AddRef and Release. This is currently true for COM-interfaces derived 
 from IUnknown, but not for other interfaces.

I would assume that if an object of a reference-counted class cannot be cast to 
its base non-reference-counted class, it'd be the same for casting to 
non-reference-counted interfaces.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 28.06.2013 13:37, Michel Fortin wrote:
 On 28 June 2013 at 4:55, Rainer Schuetze <r.sagitario gmx.de> wrote:

 What happens if the class also implements interfaces? A reference
 of the interface type must do reference counting as well. So the
 interface must also define AddRef and Release. This is currently
 true for COM-interfaces derived from IUnknown, but not for other
 interfaces.
I would assume if an object of a reference-counted class cannot be cast to its base non-reference-counted class that it'd be the same for casting to non-reference-counted interfaces.
Yes, that is probably the way to go. But that makes using protocols like COM 
difficult in safe mode.

We have already talked Walter into not linking reference counting to AddRef and 
Release, but if it is implemented with other methods, these cannot be added to 
the already existing COM interfaces. Passing or getting interface pointers 
to/from external code being unsafe sounds ok, but passing around these interface 
references in D code would be unsafe as well.

Adding aliases or non-virtual wrappers to the interface declaration to forward 
reference counting to AddRef/Release might help, but could also introduce 
ambiguities in a class that derives both from a reference-counted base class and 
an interface like this.
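A minimal sketch of the non-virtual forwarding idea (IMyUnknown and MyComObject 
are stand-ins for illustration, and opAddRef/opRelease are the hypothetical hook 
names discussed in this thread, not existing D or the real IUnknown):

    interface IMyUnknown
    {
        uint AddRef();
        uint Release();

        // final interface methods are non-virtual, so they add no vtable slots;
        // they just forward the hypothetical hook names to the COM slots
        final uint opAddRef()  { return AddRef(); }
        final uint opRelease() { return Release(); }
    }

    class MyComObject : IMyUnknown
    {
        private uint refs = 1;

        uint AddRef()  { return ++refs; }
        uint Release() { return --refs; }  // a real COM object frees itself at zero
    }

    void main()
    {
        IMyUnknown i = new MyComObject;
        assert(i.opAddRef() == 2);
        assert(i.opRelease() == 1);
    }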
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 25 June 2013 at 17:00, Walter Bright wrote:

 If a class contains the following methods, in either itself or a base class, 
it is
 an RC class:


    T AddRef();
    T Release();

 An RC class is like a regular D class with these additional semantics:

 1. In  safe code, casting (implicit or explicit) to a base class that does not
 have both AddRef() and Release() is an error.
I'm just realizing that this means safe code cannot call any member function of 
the non-reference-counted base class. This would require an implicit conversion 
of "this" to the base class.

 system code could, but I'd be extremely uneasy doing such calls unless I am the 
one in charge of that code and can make sure the base function will never store 
the (unretained) pointer somewhere it shouldn't, now and in the future. A misstep 
here and you get memory corruption. Seriously, I don't think  system code should 
allow implicit conversions to the base class; it should be explicit.

I am starting to doubt there is any value in inheriting the base ref-counted 
class from another class.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 1:47 AM, Rainer Schuetze wrote:
 On 28.06.2013 00:35, Michel Fortin wrote:
 So if we return to the core of it, here's the problems that still
 need solving:

 1. Depending on the reference counting scheme implemented, it might
 be more efficient to have a single operation for an assignment
 (retain a/release b) operation. I think that should be allowed.
 2. If the pointer variable is shared assignment must be atomic (done
 under a lock, and it must always be the same lock for a given
 pointer, obviously).
 3. If the pointer variable is shared, reading its value must be done
 atomically with a retain too.
I just had an idea, maybe it is obvious and just distracts, but I thought it
might be worth sharing:
 Instead of defining methods on the class type, we could also redefine the 
reference type. The compiler detects a type declaration "reference_type" in the class declaration and replaces all references to that class with that type.
 class C
 {
     alias shared_ptr!C reference_type;
 }

 C c = new C;

 is lowered to

 shared_ptr!C c = new C;

 "new C" returns a shared_ptr!C aswell.

 It is then up to the implementation of shared_ptr to define what member 
functions to call for reference counting and to deal with proper shared semantics in assignments. It can also define whether opCall should increment the reference count or not. For most of the needed functionality, struct semantics work out-of-the-box.
 2 immediate gotchas

 - In a class hierarchy, you would want to define the reference_type in the 
base class only, so maybe it has to be a template. I'm not sure implicite casting to base class reference type and interfaces can be implemented.
 - the implementation of the shared_ptr template will have to be able to deal 
with the "raw" reference, so that might need some type modifier/annotation. I think this might also be true for the addRef/release version, if the implementation is not just working on the refcount, but is also calling other functions.
 - To elide redundant reference counting, the compiler will need annotations 
here, too. Move semantics of structs might reduce the number of reference count operations already, though.

The main problem with this is the decay of a shared_ptr!C to a C. Once that 
happens, all the memory safety goes out the window.
Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 7:14 AM, Michel Fortin wrote:
 On 25 June 2013 at 17:00, Walter Bright wrote:

 If a class contains the following methods, in either itself or a base class, 
it is
 an RC class:


     T AddRef();
     T Release();

 An RC class is like a regular D class with these additional semantics:

 1. In  safe code, casting (implicit or explicit) to a base class that does not
 have both AddRef() and Release() is an error.
 I'm just realizing that this means safe code cannot call any member function
 of the non-reference-counted base class. This would require an implicit 
 conversion of "this" to the base class.

That's right.
  system code could, but it'd be extremely uneasy doing such calls unless I am 
 the one in charge of that code and can make sure the base function will never 
 store the (unretained) pointer somewhere it shouldn't now and in the future. A 
 misstep here and you get memory corruption. Seriously, I don't think  system 
 code should allow implicit conversions to the base class, it should be explicit.

It's a worthy point.
 I am starting to doubt there is any value in inheriting the base 
ref-counted-class from another class.

Oct 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 10 October 2013 at 02:19:02 UTC, Walter Bright wrote:
  system code could, but it'd be extremely uneasy doing such
calls unless I am the one in charge of that code and can make sure the base function will never store the (unretained) pointer somewhere it shouldn't now and in the future. An misstep here and you get memory corruption. Seriously, I don't think system code should allow implicit conversions to the base class, it should be explicit. It's a worthy point.
It means OOP is completely broken with that design.
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 7:21 PM, deadalnix wrote:
 It means OOP is completely broken with that design.
I know. The thread kind of petered out as we began to realize the obstacles with this approach. But it's important to have the conversation in the record so we don't have to go through it again.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 28.06.2013 21:50, Walter Bright wrote:
 The main problem with this is the decay of a shared_ptr!C to a C. Once
 that happens, all the memory safety goes out the window.
By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 1:11 PM, Rainer Schuetze wrote:
 On 28.06.2013 21:50, Walter Bright wrote:
 The main problem with this is the decay of a shared_ptr!C to a C. Once
 that happens, all the memory safety goes out the window.
By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to
shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.

"Decay" means it is converted to type C in order to call functions that take C 
as the 'this' pointer or C as a parameter. The problem is both type C and type 
shared_ptr!C will exist.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 28.06.2013 22:29, Walter Bright wrote:
 On 6/28/2013 1:11 PM, Rainer Schuetze wrote:
 On 28.06.2013 21:50, Walter Bright wrote:
 The main problem with this is the decay of a shared_ptr!C to a C. Once
 that happens, all the memory safety goes out the window.
By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.
"Decay" means it is converted to type C in order to call functions that take C as the 'this' pointer or C as a parameter. The problem is both type C and type shared_ptr!C will exist.
Any parameter of type C is also lowered to shared_ptr!C.

Calling a member function would go through opDot, which could also do reference 
counting for safety. Treating every explicit or implicit usage of "this" as a 
temporary shared_ptr!C might be overkill, so it could be restricted to assigning 
"this" to another reference (this includes passing it as an argument to another 
function or returning it from a function). My current ad-hoc rule: if "this" is 
not followed by a '.', it has to be lowered to construct shared_ptr!C(this).

Assuming the reference count is updated by shared_ptr!C.opDot, there will always 
be a thread-local reference while inside a member function (it must have been 
called through an external reference at least once). Other member functions of 
the same object can always be called without ref-counting, assuming that the 
object never gets destroyed through changing other references.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 2:03 PM, Rainer Schuetze wrote:
 On 28.06.2013 22:29, Walter Bright wrote:
 On 6/28/2013 1:11 PM, Rainer Schuetze wrote:
 On 28.06.2013 21:50, Walter Bright wrote:
 The main problem with this is the decay of a shared_ptr!C to a C. Once
 that happens, all the memory safety goes out the window.
By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.
"Decay" means it is converted to type C in order to call functions that take C as the 'this' pointer or C as a parameter. The problem is both type C and type shared_ptr!C will exist.
Any parameter of type C is also lowered to shared_ptr!C.
I don't see how lowering C to shared_ptr!C and lowering shared_ptr!C to C can work?
 Calling a member function would go through opDot, which could also do 
reference counting for safety. Treating every explicite or implicite usage of "this" as a temporary shared_ptr!C might be overkill, so it could be restricted to assigning "this" to another reference (this includes passing it as an argument to another function or returning it from a function). My current adhoc rule: if "this" is not followed by a '.', it has to be lowered to construct shared_ptr!C(this).
 Assuming the reference count is updated by shared_ptr!C.opDot, there will 
always be a thread local reference while inside a member function (it must have been called through an external reference at least once). Other member functions of the same object can always be called without ref-counting assuming that the object never gets destroyed through changing other references.

Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:

On 28 June 2013 at 17:03, Rainer Schuetze <r.sagitario gmx.de> wrote:

 Any parameter of type C is also lowered to shared_ptr!C.
    class C {}

People still constantly forget that C used as a type represents a *reference* to 
an object of class C, not the object itself. If you replace type C with 
shared_ptr!C, you must then replace it with shared_ptr!(shared_ptr!C) and so on; 
there's no end to it.

Also, I strongly doubt the compiler will be able to elide redundant calls to 
retain/release made within shared_ptr!C while still respecting normal struct 
semantics.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 6:42 PM, Michel Fortin wrote:
 On 28 June 2013 at 17:03, Rainer Schuetze <r.sagitario gmx.de> wrote:

 Any parameter of type C is also lowered to shared_ptr!C.
class C {} People still constantly forget that C used as a type represents a *reference*
to an object of class C, not the object itself. If you replace type C with shared_ptr!C, you must then replace it with shared_ptr!(shared_ptr!C) and so on; there's no end to it.
 Also, I strongly doubt the compiler will be able to elide redundant calls to 
retain/release made within shared_ptr!C while still respecting normal struct semantics.

Using some sort of shared_ptr!T was the original idea, but I could not figure out 
a reasonable way to make it memory safe without the compiler knowing about it. 
The easiest way to have the compiler know about it is to make it some sort of 
class type, not a struct type.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/27/2013 11:33 AM, Rainer Schuetze wrote:
 On 27.06.2013 19:04, Walter Bright wrote:
 IIRC you also have the GC handbook book on your shelf. Check the
 chapters on RC, especially algorithm 18.2 "Eager reference counting with
 CompareAndSwap is broken".
I have the book, but it is the first edition and there's no chapter 18 in it :-(
I can remove the dust from my scanner to copy the 3 mostly relevant pages and
send them to you.

I understand the issue (I think), but I can't think of a case where the ref 
count would be 1 when this happens.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/28/2013 1:55 AM, Rainer Schuetze wrote:
 What happens if the class also implements interfaces? A reference of the 
interface type must do reference counting as well. So the interface must also define AddRef and Release. This is currently true for COM-interfaces derived from IUnknown, but not for other interfaces.

Even implementing IUnknown is a problem, if we adopt the suggestion that opAddRef 
and opRelease be used as wrappers around AddRef and Release.

I think the simplest thing is to not allow ref counted classes to implement 
interfaces other than ones derived from IUnknown.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 29.06.2013 00:22, Walter Bright wrote:
 Any parameter of type C is also lowered to shared_ptr!C.
I don't see how lowering C to shared_ptr!C and lowering share_ptr!C to C can work?
I don't see why you would want to lower from shared_ptr!C to C. It's only inside 
shared_ptr where access to the non-lowered C is needed, e.g. by disabling the 
lowering inside shared_ptr. I was referring to "raw" references before, so the 
lowering would better be shared_ptr!(__raw(C)). But I agree, having the lowering 
include the original seems bad.

I realized a worse flaw with my proposal: it doesn't solve the assignment problem 
it was meant to. shared_ptr implemented as a struct does not have full control of 
the assignment, but is only called for the postblit and the destruction of the 
previous value. It has no way to put a lock around the full assignment. Still 
thinking too much in C++...

Sorry for the noise.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/29/2013 12:16 AM, Rainer Schuetze wrote:
 Sorry for the noise.
No problem. It's a complicated subject, and none of us can think of all the ramifications. That's why this thread exists.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:

On 29.06.2013 06:38, Walter Bright wrote:
 On 6/27/2013 11:33 AM, Rainer Schuetze wrote:
 On 27.06.2013 19:04, Walter Bright wrote:
 IIRC you also have the GC handbook book on your shelf. Check the
 chapters on RC, especially algorithm 18.2 "Eager reference counting
 with
 CompareAndSwap is broken".
I have the book, but it is the first edition and there's no chapter 18 in it :-(
I can remove the dust from my scanner to copy the 3 mostly relevant pages and send them to you.
I tried to scan it yesterday, but got a large black bar at the fold (don't know 
if this is the correct term) that erased the first inch of text. I would have to 
rip the book apart to get better results.
 I understand the issue (I think), but I can't think of a case where the
 ref count would be 1 when this happens.
Consider a global shared reference R that holds the last reference to an object 
O. One thread exchanges the reference with another reference P while another 
thread reads the reference into S.

    shared(C) R = O;      ; refcnt of O is 1

in pseudo-assembly missing null-checks:

    Thread1 (R = P)        Thread2 (S = R)

                           mov ecx,[R]
                           ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt      ; refcnt of O now 0
    jnz done
    call delete_ebx
                           ; thread resumed
                           inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
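For comparison, a minimal D-level sketch (my own illustration, not from the 
thread) of the "read atomically with a retain" rule applied to this example: the 
load of the shared reference and the retain happen inside one critical section, 
so the swap in the other thread cannot free the object between the two steps. 
Node, R, rLock and readRetained are illustrative names.

    import core.atomic : atomicLoad, atomicOp;
    import core.sync.mutex : Mutex;

    class Node
    {
        shared size_t refcnt = 1;
    }

    __gshared Node  R;      // the global shared reference from the example
    __gshared Mutex rLock;  // "always the same lock for a given pointer"

    shared static this()
    {
        R = new Node;
        rLock = new Mutex;
    }

    // S = R done safely: read and retain form one critical section
    Node readRetained()
    {
        rLock.lock();
        scope (exit) rLock.unlock();
        auto s = R;
        if (s !is null)
            atomicOp!"+="(s.refcnt, 1);
        return s;
    }

    void main()
    {
        auto s = readRetained();
        assert(atomicLoad(s.refcnt) == 2);
    }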
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Rainer Schuetze wrote:
On 29.06.2013 09:36, Rainer Schuetze wrote:
                         inc [ecx].refcnt
 done:
    mov [S], ecx

Just wanted to add that the book states that lock-free reference counting can be 
implemented with a cas2 operation modifying two separate locations atomically. 
Unfortunately this operation does not exist in most processors.

This might be the quoted paper:
http://scholr.ly/paper/2199608/lock-free-reference-counting
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/29/2013 12:36 AM, Rainer Schuetze wrote:
 On 29.06.2013 06:38, Walter Bright wrote:
 On 6/27/2013 11:33 AM, Rainer Schuetze wrote:
 On 27.06.2013 19:04, Walter Bright wrote:
 IIRC you also have the GC handbook book on your shelf. Check the
 chapters on RC, especially algorithm 18.2 "Eager reference counting
 with
 CompareAndSwap is broken".
I have the book, but it is the first edition and there's no chapter 18 in it :-(
I can remove the dust from my scanner to copy the 3 mostly relevant pages and send them to you.
 I tried to scan it yesterday, but got a large black bar at the fold (don't know 
 if this is the correct term) that erased the first inch of text. I would have to 
 rip the book apart to get better results.

Ah, don't rip up your book! (I cut the back off of mine and scanned it, but I no 
longer care to store the thousands of pounds of books I have anymore, and I like 
that my whole library fits on my laptop now!)
 I understand the issue (I think), but I can't think of a case where the
 ref count would be 1 when this happens.
Consider a global shared reference R that holds the last reference to an
object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.
 shared(C) R = O;      ; refcnt of O is 1

 in pseudo-assembly missing null-checks:

 Thread1 (R = P)        Thread2 (S = R)

                        mov ecx,[R]
                        ; thread suspended
 mov eax,[P]
 inc [eax].refcnt
 mov ebx,[R]
 mov [R],eax
 dec [ebx].refcnt      ; refcnt of O now 0
 jnz done
 call delete_ebx
                        ; thread resumed
                        inc [ecx].refcnt
 done:

 The increment on [ecx].refcnt modifies garbage.
Ok, I see. Let me think about it some more.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Jacob Carlborg:

On 29 jun 2013, at 06:42, Walter Bright wrote:

 I think the simplest thing is to not allow ref counted classes to implement 
 interfaces other than ones derived from IUnknown.

What about Objective-C interfaces?
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 29 June 2013 at 6:08, Jacob Carlborg wrote:

 On 29 jun 2013, at 06:42, Walter Bright wrote:

 I think the simplest thing is to not allow ref counted classes to implement 
interfaces other than ones derived from IUnknown.
 What about Objective-C interfaces?
Implementing ARC for Objective-C is going to require some more compiler support 
anyway (for autoreleased objects notably). Tweaking the compiler so it accepts 
Objective-C interfaces should just be a matter of tweaking the boolean expression 
that makes this check. "if (classdecl->objc && interfacedecl->objc) return true;" 
or something like that.

As for which function the compiler should call, it'll probably need to be 
special-cased for Objective-C in the compiler too. Here's the list of changes 
that'd be needed for Objective-C ARC:

== Retain ==
COM:  if (obj) obj->AddRef();
ObjC: obj = objc_retain(obj);
ObjC blocks: obj = objc_retainBlock(obj); // objc_retainBlock might do a copy

== Release ==
COM:  if (obj) obj->Release();
ObjC: objc_release(obj);

== Assignment ==
COM:  if (obj) obj->AddRef(); if (var) var->Release(); var = obj;
ObjC: objc_storeStrong(&var, obj);
ObjC blocks: obj = objc_retainBlock(obj); objc_release(var); var = obj;

As long as Walter implements D ARC in a way we can make the above substitutions, 
it shouldn't be too hard.

Then, support for autorelease is a matter of calling objc_autorelease on returned 
objects from autoreleasing functions, followed by objc_retain after the function 
call in the caller. We'll also have to check what ObjC ARC does for pointer 
write-backs to autoreleased variables and mimic that.

There's an optimized path for autoreleased return values that we should use, but 
I'd defer that to later. It involves a special no-op instruction to insert at the 
right place as a flag. Also, autoreleased returns should be eliminated when 
inlining.
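A minimal sketch of how those Objective-C runtime entry points could be declared 
and used from D (my own illustration; the ObjCId alias is a stand-in for 
Objective-C's id, and arcAssign is a hypothetical helper showing the Assignment 
row, not compiler output):

    // only meaningful when linking against the Objective-C runtime
    version (OSX)
    {
        alias ObjCId = void*;  // stand-in for Objective-C `id`

        extern (C) ObjCId objc_retain(ObjCId obj);
        extern (C) void   objc_release(ObjCId obj);
        extern (C) void   objc_storeStrong(ObjCId* location, ObjCId obj);

        // what `var = obj;` would lower to for an Objective-C object
        void arcAssign(ref ObjCId var, ObjCId obj)
        {
            objc_storeStrong(&var, obj);
        }
    }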
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Looks like this should go into the O-C DIP.


On 6/29/2013 5:38 AM, Michel Fortin wrote:
 Implementing ARC for Objective-C is going to require some more compiler 
support anyway (for autoreleased objects notably). Tweaking the compiler it so accepts Objective-C interfaces should just be a matter of tweaking the boolean expression that makes this check. "if (classdelc->objc && interfacedecl->objc) return true;" or something like that.
 As for which function the compiler should call, it'll probably need to be 
special-cased for Objective-C in the compiler too. Here's the list of changes that'd be needed for Objective-C ARC:
 == Retain ==
 COM:  if (obj) obj->AddRef();
 ObjC: obj = objc_retain(obj);
 ObjC blocks: obj = objc_retainBlock(obj); // objc_retainBlock might do a copy

 == Release ==
 COM:  if (obj) obj->Release();
 ObjC: objc_release(obj);

 == Assignment ==
 COM:  if (obj) obj->AddRef(); if (var) var->Release(); var = obj;
 ObjC: objc_storeStrong(&var, obj);
 ObjC blocks: obj = objc_retainBlock(obj); objc_release(var); var = obj;

 As long as Walter implements D ARC in a way we can make the above 
substitutions it shouldn't be too hard.
 Then, support for autorelease is a matter of calling objc_autorelease on 
returned objects from autoreleasing functions, followed by objc_retain after the function call in the caller. We'll also have to check what ObjC ARC does for pointer write-backs to autoreleased variables and mimick that.
 There's an optimized path for autoreleased return values that we should use, 
but I'd defer that to later. It involves a special no-op instruction to insert at the right place as a flag. Also, autoreleased returns should be eliminated when inlining.

Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jacob Carlborg:

On 29 jun 2013, at 22:24, Walter Bright wrote:

 Looks like this should go into the O-C DIP.
Shouldn't all this reference counting be in its own DIP?
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 1:50 AM, Jacob Carlborg wrote:
 On 29 jun 2013, at 22:24, Walter Bright wrote:

 Looks like this should go into the O-C DIP.
Shouldn't all this reference counting be in its own DIP?
Yes. It's not ready yet, though.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 25 June 2013 at 17:00, Walter Bright wrote:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
 exist already.

Another thing to note is that the above is dangerous if the destructor is called 
from the GC and RC objects are allocated from GC memory. Referenced objects might 
already have been destroyed and you'll be calling Release() on them. This will 
happen when the GC releases a cycle.
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 12:35 PM, Michel Fortin wrote:
 On 25 June 2013 at 17:00, Walter Bright wrote:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
exist already.
 Another thing to note that the above is dangerous if the destructor is called 
from the GC and RC objects are allocated from GC memory. Referenced objects might already have been destroyed and you'll be calling Release() on them. This will happen when the GC releases a cycle.

Amended as:

6. If a class or struct contains RC fields, calls to Release() for those fields 
will be added to the destructor, and a destructor will be created if one doesn't 
exist already. Release() implementations should take care to not destroy objects 
that are already destroyed, which can happen if the objects are allocated on the 
GC heap and the GC removes a cycle of refcounted objects.
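A minimal sketch of one way a Release() implementation could follow that advice 
(the zero-check convention here is an assumption of mine, not something the 
proposal prescribes):

    class Counted
    {
        private size_t refs = 1;

        Counted AddRef()
        {
            ++refs;
            return this;
        }

        Counted Release()
        {
            if (refs == 0)
                return this;    // the GC is already tearing this object down
            if (--refs == 0)
            {
                // run the object's own cleanup exactly once;
                // the memory itself is reclaimed by whoever allocated it
            }
            return this;
        }
    }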
Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 2013-06-30 at 16:32, Walter Bright wrote:

 Amended as:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
exist already.
 Release() implementations should take care to not destroy objects that are 
already destroyed,
 which can happen if the objects are allocated on the GC heap and the GC 
removes a cycle of
 refcounted objects.
Good advice. But... how do you implement that? For one thing, I doubt there's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 3:05 PM, Michel Fortin wrote:
 On 2013-06-30 at 16:32, Walter Bright wrote:

 Amended as:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
exist already.
 Release() implementations should take care to not destroy objects that are 
already destroyed,
 which can happen if the objects are allocated on the GC heap and the GC 
removes a cycle of
 refcounted objects.
Good advice. But... how do you implement that? For one thing, I doubt there's
an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.

The GC doesn't actually delete anything while it is doing a collection cycle.
So 
the refcount could simply be checked.
Oct 09 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 30, 2013, at 6:11 PM, Walter Bright wrote:

 On 6/30/2013 3:05 PM, Michel Fortin wrote:
 On 2013-06-30 at 16:32, Walter Bright wrote:

 Amended as:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
exist already.
 Release() implementations should take care to not destroy objects that are 
already destroyed,
 which can happen if the objects are allocated on the GC heap and the GC 
removes a cycle of
 refcounted objects.
Good advice. But... how do you implement that? For one thing, I doubt
there's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.

 The GC doesn't actually delete anything while it is doing a collection cycle. 
 So the refcount could simply be checked.

AFAIK, this isn't a requirement of the GC. May want to add it. I have bad 
experiences with trying to second-guess the GC and when it actually kills the 
object.

Note, if this is the case, then inc/dec refcount cannot depend on the vtable, 
since that is zeroed. I'm wondering if the GC shouldn't set the RC to size_t.max 
when destructing, or even just +1 it, to ensure the ref count destructor doesn't 
accidentally free it before the reaper does.

-Steve
Oct 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 30 June 2013 at 18:11, Walter Bright wrote:

 On 6/30/2013 3:05 PM, Michel Fortin wrote:
 On 2013-06-30 at 16:32, Walter Bright wrote:

 Amended as:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one doesn't 
exist already.
 Release() implementations should take care to not destroy objects that are 
already destroyed,
 which can happen if the objects are allocated on the GC heap and the GC 
removes a cycle of
 refcounted objects.
Good advice. But... how do you implement that? For one thing, I doubt
there's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.
 The GC doesn't actually delete anything while it is doing a collection cycle. 
 So the refcount could simply be checked.

... checked and decremented, and if it reaches zero in the thread the GC is 
currently running, then it doesn't have to delete the object as, in theory, it 
should be destructed as part of the same run. Ok, I get it now.

You should add a requirement that the reference counter be atomic, because the GC 
can run in any thread and you still need to decrement the counters of referenced 
objects in the destructor.

Honestly, I think it'd be much easier if the runtime provided its own base object 
you could use for reference counting with the GC to collect cycles. The provided 
implementation could rely on internal details of the GC since both would be part 
of druntime. There isn't much room for alternate implementations when the GC is 
involved anyway.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 4:35 PM, Michel Fortin wrote:
 On 30 June 2013 at 18:11, Walter Bright wrote:

 On 6/30/2013 3:05 PM, Michel Fortin wrote:
 On 2013-06-30 at 16:32, Walter Bright wrote:

 Amended as:

 6. If a class or struct contains RC fields, calls to Release() for those 
fields will
 be added to the destructor, and a destructor will be created if one 
doesn't exist already.
 Release() implementations should take care to not destroy objects that are 
already destroyed,
 which can happen if the objects are allocated on the GC heap and the GC 
removes a cycle of
 refcounted objects.
Good advice. But... how do you implement that? For one thing, I doubt
there's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.
 The GC doesn't actually delete anything while it is doing a collection 
cycle. So the refcount could simply be checked.
 ... checked and decremented, and if it reaches zero in the thread the GC is 
currently running then it doesn't have to delete the object as, in theory, it should be destructed as part of the same run. Ok, I get it now.
 You should add a requirement that the reference counter be atomic because the 
 GC can run in any thread and you still need to decrement counters of referenced 
 objects in destructor.

I very much want to avoid requiring atomic counts - it's a major performance 
penalty. Note that if the GC is reaping a cycle, nobody else is referencing the 
object, so this should not be an issue.
 Honestly, I think it'd be much easier if the runtime provided its own base 
object you could use for reference counting with the GC to collect cycles. The provided implementation could rely on internal details of the GC since both would be part of druntime. There isn't much room for alternate implementations when the GC is involved anyway.

Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:

 I very much want to avoid requiring atomic counts - it's a major performance 
 penalty. Note that if the GC is reaping a cycle, nobody else is referencing the 
 object, so this should not be an issue.

I think you didn't understand what Michel was saying.

Take for example:

    A->B->C->A

This is a cycle. Imagine that nobody else is pointing at A, B or C. Fine. The GC 
starts to collect this cycle. But let's say that D is not being collected *AND* B 
has a reference to D.

B could be getting destroyed in one thread, and decrementing D's reference count, 
while someone else in another thread is incrementing/decrementing D's reference 
count.

I agree that RC optimally is thread-local. But if you involve the GC, then ref 
incs and decs have to be atomic. I don't think this is that bad. iOS on ARM, 
which has terrible atomic primitives, uses atomic reference counts.

If you do NOT involve the GC and are careful about cycles, then you could 
potentially have an RC solution that does not require atomics. But that would 
have to be a special case, with the danger of having cycles.

-Steve
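For reference, a minimal sketch of what atomic increment/decrement would look 
like with core.atomic, as opposed to the plain ++/-- the proposal hopes to keep 
(SharedCounted is an illustrative name, not part of the proposal):

    import core.atomic : atomicOp;

    class SharedCounted
    {
        private shared size_t refs = 1;

        void addRef()
        {
            atomicOp!"+="(refs, 1);            // atomic: safe from any thread
        }

        void release()
        {
            if (atomicOp!"-="(refs, 1) == 0)   // atomicOp returns the new value
            {
                // last reference gone; safe to run teardown exactly once
            }
        }
    }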
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 30 June 2013 at 20:25, Steven Schveighoffer wrote:

 A->B->C->A

 this is a cycle.  Imagine that nobody else is pointing at A, B or C.  Fine. 
The GC starts to collect this cycle.
 But let's say that D is not being collected *AND* B has a reference to D.

 B could be getting destroyed in one thread, and decrementing D's reference 
count, while someone else in another thread is incrementing/decrementing D's reference count.
 I agree that RC optimally is thread-local.  But if you involve the GC, then 
 ref incs and decs have to be atomic.

Exactly what I was trying to explain. Thanks.
 I don't think this is that bad.  iOS on ARM which has terrible atomic 
 primitives uses atomic reference counts.

Moreover, iOS uses a single spinlock to protect a global hash table containing 
all reference counts.
 If you do NOT involve the GC and are careful about cycles, then you could 
 potentially have an RC solution that does not require atomics. But that would 
 have to be a special case, with the danger of having cycles.

Not involving the GC is quite difficult: you need to be absolutely sure you have 
no pointer pointing to that thread-local ref-counted object anywhere in the GC 
heap. Unfortunately, there's no way to guarantee statically what is part of the 
GC heap and what is not, so any non-atomic reference counter is not safe.
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:
On Jun 30, 2013, at 10:08 PM, Michel Fortin wrote:

 On 30 June 2013 at 20:25, Steven Schveighoffer wrote:
 I don't think this is that bad.  iOS on ARM which has terrible atomic 
primitives uses atomic reference counts.
 Moreover iOS uses a single spinlock to protect a global hash table containing 
 all reference counts.

Hearing this, I actually find it amazing how well it works :)
 If you do NOT involve the GC and are careful about cycles, then you could 
potentially have a RC solution that does not require atomics. But that would have to be a special case, with the danger of having cycles.
 Not involving the GC is quite difficult: you need to be absolutely sure you 
 have no pointer pointing to that thread-local ref-counted object anywhere in 
 the GC-heap. Unfortunately, there's no way to guaranty statically what is part 
 of the GC heap and what is not, so any non-atomic reference counter is not safe.

This is true. I was thinking of a garbage-collected RC object referring to an RC 
object, I wasn't thinking of a fully GC object referring to an RC object.

In terms of pure functions and possibly a nogc attribute, this might be a 
possibility. Maybe at some point we have a nogcref attribute we attach to 
specific *types* so the compiler prevents you from storing any references to that 
type in the GC.

I think it is important to reserve the possibility for having cases where RC 
inc/dec is not atomic. Especially where we have D's type system identifying what 
is shared and what is not. Especially when there is the possibility for 
thread-local GCs.

-Steve
Oct 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:
 On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:

 I very much want to avoid requiring atomic counts - it's a major performance 
penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.
 I think you didn't understand what Michel was saying.

 Take for example:

 A->B->C->A

 this is a cycle.  Imagine that nobody else is pointing at A, B or C.  Fine. 
The GC starts to collect this cycle.
 But let's say that D is not being collected *AND* B has a reference to D.

 B could be getting destroyed in one thread, and decrementing D's reference 
count, while someone else in another thread is incrementing/decrementing D's reference count.
 I agree that RC optimally is thread-local.  But if you involve the GC, then 
 ref incs and decs have to be atomic.

This is actually a problem right now with the GC, as destructors may be run in 
another thread than the one they belong in. The situation you describe is not 
worse or better than that, it's the same thing. The solution is to run the 
destructors in the same thread the objects belong in.
 I don't think this is that bad.  iOS on ARM which has terrible atomic 
 primitives uses atomic reference counts.

It's bad. ARM is not the only processor out there.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jun 30, 2013, at 10:26 PM, Walter Bright wrote:

 On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:
 On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:

 I very much want to avoid requiring atomic counts - it's a major 
performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.
 I think you didn't understand what Michel was saying.

 Take for example:

 A->B->C->A

 this is a cycle.  Imagine that nobody else is pointing at A, B or C.  Fine. 
The GC starts to collect this cycle.
 But let's say that D is not being collected *AND* B has a reference to D.

 B could be getting destroyed in one thread, and decrementing D's reference 
count, while someone else in another thread is incrementing/decrementing D's reference count.
 I agree that RC optimally is thread-local.  But if you involve the GC, then 
ref incs and decs have to be atomic.
 This is actually a problem right now with the GC, as destructors may be run 
 in another thread than they belong in. The situation you describe is not worse 
 or better than that, it's the same thing. The solution is to run the destructors 
 in the same thread the objects belong in.

I think that's a tall order presently. For instance, on Linux, the threads are 
all stopped using a signal. It's a very bad idea to run destructors in a signal 
handler.

What it seems like you are saying is that a prerequisite for ref counting is to 
have thread-local GC working. If that is the case, we need to start a 
thread-local GC "thread" before this goes any further.
 I don't think this is that bad.  iOS on ARM which has terrible atomic 
primitives uses atomic reference counts.
 It's bad. ARM is not the only processor out there.
Pragmatically, I think if D targets x86 variants and ARM, it is well-situated in 
the mainstream of existing devices. Yes, it would be nice if it could target 
other, more obscure platforms, but if we are talking about ref counting working 
poorly on those, I don't think we are any worse off than today.

Note that we can keep the options open, and implement atomic RC now without many 
headaches.

-Steve
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:
 On Jun 30, 2013, at 10:26 PM, Walter Bright wrote:

 On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:
 On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:

 I very much want to avoid requiring atomic counts - it's a major 
performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.
 I think you didn't understand what Michel was saying.

 Take for example:

 A->B->C->A

 this is a cycle.  Imagine that nobody else is pointing at A, B or C.  Fine. 
The GC starts to collect this cycle.
 But let's say that D is not being collected *AND* B has a reference to D.

 B could be getting destroyed in one thread, and decrementing D's reference 
count, while someone else in another thread is incrementing/decrementing D's reference count.
 I agree that RC optimally is thread-local.  But if you involve the GC, then 
ref incs and decs have to be atomic.
 This is actually a problem right now with the GC, as destructors may be run 
in another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in.
 I think that's a tall order presently.  For instance, on linux, the threads 
are all stopped using a signal. It's a very bad idea to run destructors in a signal handler.
 What it seems like you are saying is that a prerequisite for ref counting is 
 to have thread-local GC working. If that is the case, we need to start a 
 thread-local GC "thread" before this goes any further.

Not really. This doesn't make anything worse. Also, the proposed solution to this 
issue is to post the "destruct" list to the appropriate thread, and that thread 
runs it the next time it calls the GC.
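A minimal sketch of that "post the destruct list to the owning thread" idea (my 
own illustration; DestructQueue and its methods are hypothetical, not a druntime 
API):

    import core.sync.mutex : Mutex;

    final class DestructQueue
    {
        private Object[] pending;
        private Mutex lock;

        this() { lock = new Mutex; }

        // called from whichever thread happened to run the collection
        void post(Object o)
        {
            lock.lock();
            scope (exit) lock.unlock();
            pending ~= o;
        }

        // called by the owning thread the next time it touches the GC
        void drain()
        {
            Object[] work;
            {
                lock.lock();
                scope (exit) lock.unlock();
                work = pending;
                pending = null;
            }
            foreach (o; work)
                destroy(o);   // run the destructor in the owner's own context
        }
    }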
 I don't think this is that bad.  iOS on ARM which has terrible atomic 
primitives uses atomic reference counts.
 It's bad. ARM is not the only processor out there.
Pragmatically, I think if D targets x86 variants and ARM, it is well-situated
in the mainstream of existing devices. Yes, it would be nice if it could target other obscure platforms, but if we are talking ref counting works poorly on those, I don't think we are any worse off than today. Note that we can keep the options open, and implement atomic RC now without many headaches.

We don't need to require atomic RC for these.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jul 1, 2013, at 3:11 AM, Walter Bright wrote:

 On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:
 I think that's a tall order presently.  For instance, on linux, the threads 
are all stopped using a signal. It's a very bad idea to run destructors in a signal handler.
 What it seems like you are saying is that a prerequisite for ref counting is 
to have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further.
 Not really. This doesn't make anything worse. Also, the proposed solution to 
 this issue is to post the "destruct" list to the appropriate thread, and that 
 thread runs it next time it calls the GC.

I really urge you to make this a separate project. It's not trivial. Logically, 
it's sound, but the implementation will be very difficult. I also think Sean (and 
probably others) should be involved in that discussion.
 Pragmatically, I think if D targets x86 variants and ARM, it is 
well-situated in the mainstream of existing devices. Yes, it would be nice if it could target other obscure platforms, but if we are talking ref counting works poorly on those, I don't think we are any worse off than today. Note that we can keep the options open, and implement atomic RC now without many headaches.

 We don't need to require atomic RC for these.
I didn't say that. I said we could implement atomic RC without any changes to the 
GC, and worry about optimizing with non-atomic RC later. As long as we make it 
*possible*.

-Steve
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/1/2013 6:08 AM, Steven Schveighoffer wrote:
 On Jul 1, 2013, at 3:11 AM, Walter Bright wrote:

 On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:
 I think that's a tall order presently.  For instance, on linux, the threads 
are all stopped using a signal. It's a very bad idea to run destructors in a signal handler.
 What it seems like you are saying is that a prerequisite for ref counting 
is to have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further.
 Not really. This doesn't make anything worse. Also, the proposed solution to 
this issue is to post the "destruct" list to the appropriate thread, and that thread runs it next time it calls the GC.
 I really urge you to make this a separate project.  It's not trivial. 
 Logically, it's sound, but the implementation will be very difficult. I also 
 think Sean (and probably others) should be involved for that discussion.

Make what a separate project? The destruction of objects by the GC in local 
threads? It already is not part of the ref counting proposal.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:

On Jul 1, 2013, at 12:17 PM, Walter Bright wrote:

 I really urge you to make this a separate project.  It's not trivial. 
Logically, it's sound, but the implementation will be very difficult. I also think Sean (and probably others) should be involved for that discussion.
 Make what a separate project? The destruction of objects by the GC in local 
threads? It already is not part of the ref counting proposal.


As far as I can tell, the ref counting proposal is not viable without it, as 
long as you insist on non-atomic RC increments and decrements.  How can it 
possibly not be a prerequisite to this, and therefore part of the proposal?

Unless you are saying now that atomic ref counting is OK?

I'm going by your previous statement:

 I very much want to avoid requiring atomic counts - it's a major performance 
 penalty.

-Steve
Oct 09 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 04:35, Walter Bright wrote:
 Steven Schveighoffer wrote:

 On Jul 1, 2013, at 12:17 PM, Walter Bright wrote:

  >> I really urge you to make this a separate project.  It's not
 trivial. Logically, it's sound, but the implementation will be very
 difficult.  I also think Sean (and probably others) should be involved
 for that discussion.
  >
  > Make what a separate project? The destruction of objects by the GC in
 local threads? It already is not part of the ref counting proposal.
  >


 As far as I can tell, the ref counting proposal is not viable without
 it, as long as you insist on non-atomic RC increments and decrements.
 How can it possibly not be a prerequisite to this, and therefore part of
 the proposal?

 Unless you are saying now that atomic ref counting is OK?

 I'm going by your previous statement:

  > I very much want to avoid requiring atomic counts - it's a major
 performance penalty.


 -Steve
Is this the last email in the conversation? In that case I think you should 
clearly mark that with a post.

-- 
/Jacob Carlborg
Oct 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 11:52 PM, Jacob Carlborg wrote:
 Is this the last email in the conversation?
Yes, I posted them in chronological order.
Oct 10 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Michel Fortin wrote:
On 30 June 2013 at 22:26, Walter Bright wrote:

 This is actually a problem right now with the GC, as destructors may be run 
 in another thread than they belong in. The situation you describe is not worse 
 or better than that, it's the same thing. The solution is to run the destructors 
 in the same thread the objects belong in.

Indeed. Maybe that could work. How ironic that we can't implement RC efficiently 
because of the GC.

That said, it strongly favors having a base RC object implementation in druntime, 
where it can be kept in sync with the GC.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/30/2013 7:47 PM, Michel Fortin wrote:
 On 30 June 2013 at 22:26, Walter Bright wrote:

 This is actually a problem right now with the GC, as destructors may be run 
in another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in.
 Indeed. Maybe that could work. How ironic that we can't implement RC 
efficiently because of the GC.
 That said, it strongly favors having a base RC object implementation in 
druntime, where it can be kept in sync with the GC.

The GC doesn't need to know about it.
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
And that's the last email in the original thread!
Oct 10 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2013 12:31 AM, Walter Bright wrote:
 And that's the last email in the original thread!
Durn, no it isn't.
Oct 10 2013
prev sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 10 October 2013 at 02:28:13 UTC, Walter Bright wrote:
 Steven Schveighoffer wrote:

 On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:

  > I very much want to avoid requiring atomic counts - it's a major
  > performance penalty. Note that if the GC is reaping a cycle, nobody else is
  > referencing the object, so this should not be an issue.

 I think you didn't understand what Michel was saying. Take for example:

 A->B->C->A

 This is a cycle. Imagine that nobody else is pointing at A, B or C. Fine. The
 GC starts to collect this cycle. But let's say that D is not being collected
 *AND* B has a reference to D. B could be getting destroyed in one thread, and
 decrementing D's reference count, while someone else in another thread is
 incrementing/decrementing D's reference count.

 I agree that RC optimally is thread-local. But if you involve the GC, then ref
 incs and decs have to be atomic.
I think this ties into the requirement that after the GC collects thread-local objects, they must be finalized by the thread that owns them (assuming it's still alive). What's missing is some way to track what thread owns an object. This isn't super difficult to add in the simple case, but if we allow thread-local objects to be transferred between threads, then the transferral of ownership has to be communicated to the GC. Assuming for the moment that's not a problem though, I think RC updates could be non-atomic for thread-local data.
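
To make the tradeoff in that exchange concrete, here is a rough D sketch of 
the two kinds of count update (the helper names and layout are assumptions for 
illustration only; they are not the calls the compiler would actually generate 
under the proposal):

import core.atomic;

// Hand-written stand-in for AddRef/Release, showing atomic vs. non-atomic
// count updates. All names here are hypothetical.
class Counted
{
    // Shared, atomic count: needed as soon as a finalizer running on the
    // GC's thread (B's destructor releasing D in the example above) can
    // touch the count while user code on another thread also does.
    private shared size_t refs = 1;

    void addRef()  { atomicOp!"+="(refs, 1); }
    void release() { if (atomicOp!"-="(refs, 1) == 0) { /* destroy/free */ } }

    // Thread-local alternative: a plain increment/decrement is cheaper,
    // but only valid if the object, all of its owners, and whichever
    // thread finalizes it are provably the same thread.
    private size_t localRefs = 1;
    void addRefLocal()  { ++localRefs; }
    void releaseLocal() { if (--localRefs == 0) { /* destroy/free */ } }
}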
Oct 11 2013
prev sibling parent reply =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
I've made a short roundup of the different features/requirements that 
have been mentioned (I may have forgotten some and added some), as this 
thread has a complexity that presumably makes it quite hard to follow 
for most readers. I have also attached an estimated priority for each 
requirement, based on the discussion and my own experiences.

  - Memory safety [very important but also very difficult/limiting]
      Disallow escaping uncounted references to reference counted
      memory. Keywords: pure, scope, isolated/owned
  - COM compatible [important]
      Needs to support internal reference counting using
      AddRef/Release, while obeying the calling convention
  - Objective-C compatible [important]
      Weak references, manual memory management and autorelease pools are
      some keywords here
  - Handle reference cycles [nice to have]
      Requires GC memory for storing the instances
  - Weak pointers [critical]
      Only briefly mentioned, but critical for many data structures
      (e.g. graphs or caches); requires external reference counting
  - Not require two separate counts for COM [mildly important]
      Using GC memory would require a second reference count for the
      D side of things, which is not desirable for multiple reasons
  - Support type qualifiers [critical]
      All usual type qualifiers and conversions should work as expected.
      This is not possible in a pure template based solution.
  - Support non-class types [nice to have]
      Although slightly limiting, classes provide a convenient
      abstraction and will arguably capture >90% of use cases just fine
  - Support referencing fields/slices [nice to have]
      Letting references to members escape in a safe way would greatly
      increase the flexibility, but ties it tightly to the GC
  - Allow manual memory management [critical]
      For COM/Obj-C and any situation where external code is to take
      responsibility for the ref counting, this needs to be an option
  - Not require new annotations [important]
      Getting this to work without introducing new keywords/syntax is
      strongly preferred by Walter
  - Safe shared(RefCountedType) variables [important]
      There needs to be a way to make shared references thread-safe
  - Support OOP reasonably well [important]
      The usual upcasts and downcasts should work, and calling
      functions of base classes needs to be safe

Please mention any points that I have forgotten so that we have some 
kind of unit test against which proposed designs can be checked.
Oct 12 2013
next sibling parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
To support most of the requirements we need to offer some control over 
the reference type. Forcing the reference to be a pointer to the class 
instance precludes proper weak references and makes thread-safety difficult.

Rainer Schütze's proposal [1] looked promising, but didn't quite work 
out. However, by going a bit further, I think this approach can be fixed 
and will provide all the flexibility needed to implement solutions that 
can satisfy any of those requirements.

The basic idea is the same: Any reference to a class that is recognized 
as being reference counted is replaced by a struct that performs the 
reference counting using RAII (e.g. std.typecons.RefCounted). This 
allows any reference counting scheme to be implemented 
(internal/external, support weak refs or not, global counter table, GC 
memory or malloc, etc.).
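
For concreteness, the kind of RAII wrapper meant here might look roughly like 
the following hand-written sketch (an external, malloc'd count, no weak 
references, no thread safety; this is only an illustration of the idea, not 
std.typecons.RefCounted and not the generated code discussed below):

import core.stdc.stdlib : free, malloc;

// Minimal external-count RAII wrapper; all names are illustrative only.
struct RC(T) if (is(T == class))
{
    private T obj;
    private size_t* count;

    this(T obj)
    {
        this.obj = obj;
        count = cast(size_t*) malloc(size_t.sizeof);
        *count = 1;
    }

    this(this)          // postblit: every copy bumps the count
    {
        if (count) ++*count;
    }

    ~this()             // scope exit: drop the count, destroy at zero
    {
        if (count && --*count == 0)
        {
            destroy(obj);   // run the class destructor deterministically
            free(count);
        }
    }

    // Forward member access to the wrapped object.
    alias get this;
    @property inout(T) get() inout { return obj; }
}

Copy construction, assignment and scope exit of such a wrapper give roughly 
the AddRef/Release behaviour the proposal wants the compiler to insert 
automatically; the point of the scheme below is to let the class choose which 
wrapper is used as its reference type.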

TL;DR Let some code speak instead of a full blown spec first:

struct RefCounted(T) { /* ... */ }

// @referenceType!RefCounted
class C {
	// the presence of a C.ReferenceType template makes the class a
	// reference counted class
	// Note: A @referenceType UDA as above, defined in object.d,
	//       could be a cleaner/more explicit alternative
	alias ReferenceType = RefCounted;

	// C is now internally renamed to __ref_C to avoid ambiguities
	// and "C" itself is an alias for ReferenceType!__ref_C
	pragma(msg, C.stringof); // "RefCounted!__ref_C"

	void method()
	{
		// The "this" pointer, too, is of the reference counted
		// type. Caveats: how to handle private fields? COM call
		// convention?
		pragma(msg, typeof(this)); // "RefCounted!__ref_C"

		// Alternative: allow only pure functions to avoid
		// escaping references

		// Another alternative is to make 'scope' powerful
		// enough and use that:
		pragma(msg, typeof(this)); // "scope __ref_C"
	}
}

// The reference type itself is never const/immutable
// to enable reference counting for qualified types
C c; // -> RefCounted!__ref_C
const(C) d; // -> RefCounted!(const(__ref_C))

// To support the usual implicit conversions, some substitution is
// needed, since we have no implicit cast support for UDTs in the
// language
d = c; // d = typeof(d)(c)
       // or
       // d = typeof(d).implicitCastFrom(c)
       // or
       // d = typeof(c).implicitCastTo!(typeof(d))

// shared, however, is applied to the reference count itself (and
// transitively to the object) to force thread-safety - or rather, to
// avoid accidental use of unsafe implementations for shared references
shared(const(C)) e; // -> shared(RefCounted!(const(__ref_C)))

---

Caveat: Changing the "this" pointer from '__ref_C' to 
'RefCounted!__ref_C' has implications for the calling convention, which 
need to be taken into account when COM objects are involved. A simple 
COMPtr-like struct that only contains the target pointer may be enough 
here, though.
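
A COMPtr-like wrapper of that kind might look roughly like this (sketched over 
a hand-declared AddRef/Release interface to stay self-contained; a real 
implementation would use IUnknown from core.sys.windows and its 
extern(Windows) calling convention):

// Stand-in interface; hypothetical, for illustration only.
interface RefCountedObject
{
    uint AddRef();
    uint Release();
}

// Thin wrapper: it holds only the target pointer, matching the layout
// of a plain reference.
struct COMPtr(T) if (is(T : RefCountedObject))
{
    private T ptr;

    this(T p)  { ptr = p; if (ptr) ptr.AddRef(); }
    this(this) { if (ptr) ptr.AddRef(); }
    ~this()    { if (ptr) ptr.Release(); }

    void opAssign(COMPtr rhs)
    {
        // copy-and-swap: rhs already holds an extra reference (postblit),
        // and it releases our old pointer when it goes out of scope.
        auto tmp = ptr;
        ptr = rhs.ptr;
        rhs.ptr = tmp;
    }

    alias get this;
    @property inout(T) get() inout { return ptr; }
}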

Also, to guarantee memory safety, some additional measures need to be 
taken to avoid escaping plain references to refcounted memory. One 
solution is the use of isolated/owned types; another is to make 'scope' 
not only check for shallow reference escaping, but also check for 
escaping of references to fields (a similar thing, but it behaves 
differently). Both combined would of course be ideal. I think this is an 
issue that is mostly orthogonal to the refcount topic. See also the 
corresponding thread [2].

[1]: 
http://forum.dlang.org/thread/l34lei$255v$1 digitalmars.com?page=5#post-l352nk:242g3b:246:40digitalmars.com
[2]: http://forum.dlang.org/thread/kluaojijixhwigoujeip forum.dlang.org
Oct 12 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 12 October 2013 at 14:03:26 UTC, Sönke Ludwig wrote:
 I've made short roundup of the different features/requirements 
 that have been mentioned (may have forgotten some and added 
 some) as this thread has a complexity that presumably makes it 
 quite hard to follow for most readers. I have also attached an 
 estimated priority for each requirement based on the discussion 
 and my own experiences.

  - Memory safety [very important but also very 
 difficult/limiting]
      Disallow escaping uncounted references to reference counted
      memory. Keywords: pure, scope, isolated/owned
Here we have the first missing piece :D I do think this is mandatory.
  - Objective-C compatible [important]
      Weak references, manual memory management and autorelease 
 pools are
      some keywords here
Can someone explain to me what an autorelease pool is?
  - Handle reference cycles [nice to have]
      Requires GC memory for storing the instances
The good news is that we can (and IMO should) layer ref counting on top of the GC. This is the only way to make both work nicely together and have a safety net against leaks.
  - Weak pointers [critical]
      Only briefly mentioned, but critical for many data 
 structures
      (e.g. graphs or caches) requires external reference 
 counting
Easy for RefCounted, but it becomes tricky for GC stuff.
  - Not require two separate counts for COM [mildly important]
      Using GC memory would require a second reference count for 
 the
      D side of things, which is not desirable for multiple 
 reasons
Can you explain that one a bit more? Especially how it requires two counts.
  - Support type qualifiers [critical]
      All usual type qualifiers and conversions should work as 
 expected.
      This is not possible in a pure template based solution.
Here we have the second missing piece. We need a way to tail-qualify templates.
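
What "tail-qualifying a template" would buy us, roughly (a sketch; RefCounted 
here is just an illustrative wrapper, not a concrete proposal):

struct RefCounted(T) { T payload; size_t* count; /* RAII machinery elided */ }

class C { int x; }

void example()
{
    // What the scheme needs ("tail const"): the wrapped object is const,
    // but the wrapper itself - and its count - stay mutable.
    RefCounted!(const C) tail;

    // What applying const from the outside gives today: the whole struct
    // is const, so the reference can never be rebound and the count can
    // never be updated.
    const(RefCounted!C) head;

    // And there is currently no implicit conversion from RefCounted!C to
    // RefCounted!(const C); expressing that conversion is the missing
    // language piece.
}
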
  - Not require new annotations [important]
      Getting this to work without introducing new 
 keywords/syntax is
      strongly preferred by Walter
We have two missing pieces to make that work (from a language perspective; a lot of runtime/compiler magic is also required).
  - Support OOP reasonably well [important]
      The usual up and down casts should work and calling 
 functions of
      base classes needs to be safe
I see no way to make that work without increasing the scope of scope (huhuhu :P)
 Please mention any points that I have forgotten so that we have 
 some kind of unit test against which proposed designs can be 
 checked.
You did an excellent job here. I guess we have two missing pieces (and an incomplete one in the name of scope) to get sorted out, and then we can get something really nice here!
Oct 12 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-13 01:15:49 +0000, "deadalnix" <deadalnix gmail.com> said:

 Can someone explain me what an autorelease pool is ?
A basic concept in Objective-C to make manual reference counting bearable. As 
things move to *automatic* reference counting, autorelease pools are becoming 
less important, but they remain there for backward compatibility, and 
autoreleased objects must be handled correctly by ARC following the existing 
conventions.

The concept is to have functions return autoreleased objects, i.e. objects 
pending release. Each time you autorelease an object, instead of the counter 
being decremented immediately, the object gets added to the autorelease pool, 
and the pool decrements the counter later when it gets drained. So when the 
caller gets an autoreleased object, it doesn't have to decrement the counter 
once it stops using the object as a temporary; the object will be cleaned up 
automatically later, generally at the next iteration of the event loop. You 
only need to retain the object if you're storing it somewhere other than a 
local variable.

So this is what made manual reference counting bearable in Objective-C. 
Autorelease pool support is only useful and needed for correctly implementing 
ARC for Objective-C object types.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
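
To put the concept in D terms, a very loose sketch (this is not how 
Objective-C actually implements it; 'Release' stands in for whatever decrement 
the RC scheme uses):

// Hypothetical illustration of an autorelease pool: a container of
// deferred Release() calls, drained at a well-defined point such as the
// end of an event-loop iteration.
interface Releasable { void Release(); }

struct AutoreleasePool
{
    private Releasable[] pending;

    // Instead of decrementing the count immediately, park the object here.
    void autorelease(Releasable obj) { pending ~= obj; }

    // Every deferred decrement happens now.
    void drain()
    {
        foreach (obj; pending)
            obj.Release();
        pending.length = 0;
    }
}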
Oct 12 2013