digitalmars.D - RCArray is unsafe

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (58/58) Mar 01 2015 Walter posted an example implementation of a reference counted

Andrei Alexandrescu (2/24) Mar 01 2015 Thanks for pointing this out - it's a real problem. -- Andrei
Zach the Mystic (5/32) Mar 01 2015 See:

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (4/39) Mar 01 2015 That's not applicable here though. The compiler doesn't know

Zach the Mystic (3/6) Mar 01 2015 Why can't structs also implement opAddRef/Release? Why is it
Zach the Mystic (5/12) Mar 01 2015 Maybe the compiler can detect a split-pass and insert the

Walter Bright (4/21) Mar 01 2015 The trouble seems to happen when there are two references to the same ob...

Michel Fortin (16/21) Mar 01 2015 That's actually not enough. You'll have to block access to global

Zach the Mystic (2/19) Mar 01 2015 Globals to impures, that is.
Walter Bright (2/3) Mar 01 2015 Hmm. That's not so easy to solve.

Zach the Mystic (7/11) Mar 02 2015 But consider this. It's only an impure function which might alias
Michel Fortin (34/39) Mar 03 2015 Let's see. The problem is that 'ref' variables get invalidated by some

Michel Fortin (8/43) Mar 03 2015 And giving it some more thought, @refbreaking also has the interesting

Zach the Mystic (10/27) Mar 02 2015 What's the difference between that and this:

Zach the Mystic (2/5) Mar 02 2015 I meant: T* t = &ta[0];

Walter Bright (4/14) Mar 02 2015 Thinking about it, there are many other ways this can happen. At the mom...

Andrei Alexandrescu (6/23) Mar 02 2015 I have a solution (as in working implementation), but also a deadline

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (4/32) Mar 02 2015 "I have discovered a marvellous solution, but this post is too

Walter Bright (14/16) Mar 02 2015 Fortunately, Fermat (er, Andrei) was able to pass along his dazz idea to...

Andrei Alexandrescu (6/25) Mar 02 2015 And since an RCArray may undergo several assignments during its lifetime...

Walter Bright (2/6) Mar 02 2015 Right, I had neglected to add that.
Andrei Alexandrescu (5/34) Mar 02 2015 We may apply the same principle (memory may only be freed during
Ivan Timokhin (17/38) Mar 04 2015 Excuse me if I miss something obvious, but:

Andrei Alexandrescu (2/18) Mar 04 2015 That's a problem, thanks very much for pointing it out. -- Andrei

Steven Schveighoffer (6/27) Mar 04 2015 Again, I think this is an issue with the expectation of RCArray. You

Zach the Mystic (8/12) Mar 04 2015 What you need is a special RCSlave type, which is reference

Zach the Mystic (13/25) Mar 04 2015 Another solution is to get compiler help. If you know the
Zach the Mystic (62/74) Mar 05 2015 A way to do this is to have a core RCData type which has the

deadalnix (6/6) Mar 05 2015 Kind of OT, but your train of thought is very difficult to follow

Zach the Mystic (7/13) Mar 05 2015 Okay. I seem to be mixing my more well-thought out ideas with

deadalnix (4/6) Mar 05 2015 I'd like you to expose what is pretty settled down in your mind

Andrei Alexandrescu (5/33) Mar 04 2015 Yah, this is a fork in the road: either we solve this with DIP25 +

Zach the Mystic (10/13) Mar 04 2015 The third solution is to keep track of lifetimes, recognize

Zach the Mystic (18/31) Mar 04 2015 I'm sorry, I just realized this proposal is too complicated, and

sclytrack (55/75) Mar 05 2015 struct PileInfo "Red" TOP of PILE
Nick Treleaven (29/44) Mar 09 2015 We could prevent reassignment of arr while val is alive, but perhaps

Zach the Mystic (2/8) Mar 02 2015 Add me to "we". I'm dazzed! :-)
Zach the Mystic (4/8) Mar 02 2015 RcArray, a struct, already does this. You wouldn't delete in a

Andrei Alexandrescu (4/13) Mar 02 2015 Probably not.

Zach the Mystic (4/13) Mar 02 2015 I guess you also mean opAssigns -- they would manipulate

Zach the Mystic (2/5) Mar 02 2015 Nevermind. I was a minute too soon with my post!

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (12/33) Mar 03 2015 Clever :-) But there are two things:

Andrei Alexandrescu (4/14) Mar 03 2015 Agreed.

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (3/19) Mar 03 2015 No, I can't, because it isn't true. I was confused...

Andrei Alexandrescu (2/15) Mar 03 2015

Nick Treleaven (6/9) Mar 03 2015 Maybe the compiler could enforce that opRelease is truly @safe - i.e.

Zach the Mystic (3/6) Mar 03 2015 I was dazzed, but I'm not anymore. I wrote my concern here:

Andrei Alexandrescu (4/10) Mar 03 2015 There's a misunderstanding here. The object being assigned keeps a

Zach the Mystic (14/20) Mar 03 2015 So you need an extra pointer per instance? Isn't that a big price

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (11/33) Mar 03 2015 All instances need to carry a pointer to refcount anyway, so the

Andrei Alexandrescu (7/37) Mar 03 2015 No, the cool thing about freelists is they use free memory to chain

Walter Bright (4/11) Mar 03 2015 I don't think the payload memory can be repurposed to be a free list unt...
ixid (1/3) Mar 03 2015 Isn't this testes-driven development?

Zach the Mystic (9/19) Mar 03 2015 If the last RcType is a global, will the list ever get freed at

Andrei Alexandrescu (4/10) Mar 03 2015 No. Making a global variable of a reference counted type would be poor

deadalnix (4/19) Mar 03 2015 Considering glopbal here means thread local, I think it does make

Andrei Alexandrescu (8/22) Mar 03 2015 Yah, or define your type to be single-assignment (probably an emerging

Zach the Mystic (18/45) Mar 03 2015 Isn't allocating and collecting a freelist also overhead?

Andrei Alexandrescu (7/49) Mar 03 2015 No. I don't have time now for a proof of concept and it seems everybody

Zach the Mystic (10/33) Mar 03 2015 Okay.

Zach the Mystic (16/19) Mar 03 2015 I think that part of the fear of my idea is that I want structs
deadalnix (18/22) Mar 04 2015 I don't think that it is fine.

Walter Bright (5/7) Mar 04 2015 The complexity of a free list doesn't remotely compare to that of adding...

bearophile (5/7) Mar 04 2015 A sound complete ownership system is the only good enough

Paolo Invernizzi (4/11) Mar 04 2015 +1
No (2/9) Mar 04 2015 +1
Walter Bright (4/9) Mar 04 2015 How do you type an an array of pointers with different owners?

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (5/8) Mar 04 2015 You mean an array of pointers to objects with different owners?
bearophile (10/11) Mar 04 2015 "Sound" doesn't mean it should be able to do everything. It will
"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (17/28) Mar 04 2015 There will be no combinatorial explosion. In our latest proposal

Andrei Alexandrescu (2/5) Mar 04 2015 Looking forward to it! -- Andrei

ponce (13/24) Mar 04 2015 You define a clear owner for everything so that this never

Walter Bright (2/4) Mar 04 2015 C++ doesn't have ownership annotations and no checkable notion of a clea...

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (5/12) Mar 04 2015 The standard library + type system provides mechanisms for clear

deadalnix (8/19) Mar 04 2015 Either the array is owning its elements or is is borrowing them.

ponce (4/11) Mar 04 2015 For that matters, I'd be happy with an unsound incomplete unsafe

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (2/4) Mar 04 2015 You already have that :^)

deadalnix (6/13) Mar 04 2015 A free list does not work as the data can be live. You cannot

deadalnix (3/18) Mar 04 2015 And come on, DIP25, DIP69 and DIP74 to get there. Come on, that

Walter Bright (2/4) Mar 04 2015 DIP69 has pretty much been abandoned.

Walter Bright (5/10) Mar 04 2015 I don't see why.

deadalnix (8/18) Mar 04 2015 What I wanted to illustrate is that it is not a free list (which

Andrei Alexandrescu (20/22) Mar 04 2015 Actually you can, which is a bit surprising. Consider:

Zach the Mystic (6/13) Mar 04 2015 My reference safety system has ownership built in, more-or-less

Zach the Mystic (56/79) Mar 04 2015 To me, DIP25 is just the first step towards an ownership system.

Zach the Mystic (2/5) Mar 04 2015 ...will not need `@noscope` either.
Zach the Mystic (8/20) Mar 04 2015 Actually, I think you convinced me before that `static` (or

Zach the Mystic (4/16) Mar 04 2015 Well, I don't want to make any enemies. I thought that once the

Zach the Mystic (9/24) Mar 02 2015 So with Andrei's solution, will s.array ever get freed, since s

Zach the Mystic (5/11) Mar 02 2015 auto y = new RcStruct;

Manu via Digitalmars-d (16/34) Mar 03 2015 My immediate impression on this problem:

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (23/28) Mar 03 2015 Yes, the sane thing to do is to improve the general type system

Paulo Pinto (7/36) Mar 03 2015 You just reminded me of ParaSail. Here is the latest version of

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (2/5) Mar 03 2015 Thanks! Section «5. Related Work» was a fun read. :-)

Zach the Mystic (4/26) Mar 03 2015 This is my position too.

Manu via Digitalmars-d (13/34) Mar 03 2015 Well you can't get to a subcomponent if not through it's owner.

Zach the Mystic (7/25) Mar 04 2015 Yeah, or you could mimic such a reference by wrapping the call in

Steven Schveighoffer (9/34) Mar 02 2015 This is an odd example, how does one take a ref to an RCArray element

Walter Bright (4/7) Mar 02 2015 It's seems that to solve the problem, it has to be borrowed, in that a s...

Walter Bright (9/16) Mar 02 2015 I looked at how Rust does it. I was hoping it was something clever, but ...

Andrei Alexandrescu (2/3) Mar 02 2015 There is! There is! -- Andrei
deadalnix (10/20) Mar 02 2015 I do think you are confusing how Rust does it. In Rust, borrowing

weaselcat (3/26) Mar 02 2015 I don't think you were advocating for this but,

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (2/4) Mar 03 2015 We're working on it ;-)

Walter Bright (16/19) Mar 02 2015 May I refer you to:

weaselcat (6/31) Mar 02 2015 That's actually very old(0.6 - nearly 2 years ago.) I believe

Walter Bright (2/7) Mar 02 2015 Thanks for the info. But I don't see how Marc's example is prevented by ...

weaselcat (8/19) Mar 02 2015 It wouldn't be compilable(it borrows the same object mutably
deadalnix (4/15) Mar 02 2015 After moving resources, the previous owner can no longer be used.

Walter Bright (2/3) Mar 02 2015 How does that work with the example presented by Marc?

weaselcat (4/8) Mar 02 2015 He couldn't pass s and a member of s because s is borrowed as

weaselcat (3/12) Mar 02 2015 Er, I was speaking in terms of Rust of course(it has no

weaselcat (4/17) Mar 02 2015 http://dpaste.com/30WAMVQ

Walter Bright (2/9) Mar 02 2015 A pointer to s could be obtained otherwise and passed.

Seo Sanghyeon (4/16) Mar 03 2015 No. Rust compiler forbids you from obtaining the pointer
Zach the Mystic (10/22) Mar 03 2015 Under normal circumstances, if the pointer to s is an lvalue, the

deadalnix (31/35) Mar 02 2015 I'm not sure what you don't understand. The comment in the sample

Walter Bright (4/31) Mar 02 2015 What if 'a' is a field of a struct in some non-trivial data structure? H...

weaselcat (3/56) Mar 02 2015 Borrowing 'a' from a struct would make the parent struct

Walter Bright (2/4) Mar 03 2015 Right, now consider that struct is a leaf in a complex graph of data str...

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (12/18) Mar 03 2015 Then you still cannot have more than one mutable reference to the

Andrei Alexandrescu (6/24) Mar 03 2015 Well... the bigger problem is that it's relying on a convention. The

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (8/46) Mar 03 2015 To avoid misunderstandings: It is in reply to a sub-thread where

Walter Bright (5/8) Mar 03 2015 My understanding of Rust is that many library types rely on unsafe code ...

deadalnix (7/13) Mar 03 2015 Rust has several pointer types, so what is gonna happen depend on

deadalnix (3/28) Mar 02 2015 Yes, this page is obsolete, but it is still very interesting as

Zach the Mystic (6/13) Mar 02 2015 It seems like the most common use case for passing a global to a

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (6/33) Mar 02 2015 Yes, it's a classical aliasing problem. Of course, if the

Sativa (467/477) Mar 02 2015 1 > struct T {

Biker (2/490) Mar 03 2015 tldr; ``I don't know what I'm talking about.``

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

Walter posted an example implementation of a reference counted 
array [1], that utilizes the features introduced in DIP25 [2]. 
Then, in the threads about reference counted objects, several 
people posted examples [3, 4] that broke the suggested 
optimization of eliding `opAddRef()`/`opRelease()` calls in 
certain situations.

A weakness of the same kind affects DIP25, too. The core of the 
problem is borrowing (ref return as in DIP25), combined with 
manual (albeit hidden) memory management. An example to 
illustrate:

     struct T {
         void doSomething();
     }
     struct S {
         RCArray!T array;
     }
     void main() {
         auto s = S(RCArray!T([T()])); // s.array's refcount is 
now 1
         foo(s, s.array[0]);           // pass by ref
     }
     void foo(ref S s, ref T T) {
         s.array = RCArray!T([]);      // drop the old s.array
         t.doSomething();              // oops, t is gone
     }

Any suggestions how to deal with this? As far as I can see, there 
are the following options:

1) Make borrowing (return ref)  system. This would defeat the 
purpose of DIP25.

2) Disallow (by convention) borrowing for refcounted objects. 
Again, this would make DIP25 pointless, and strictly speaking, 
anything that relies on convention cannot be  safe. And there's 
no guarantee that it doesn't affect other things besides RC.

3) Introduce a full linear type system. A _very_ large and 
invasive change, and probably cumbersome to work with.

4) Live with it. Accept that it's not possible to get the last 
bit of  safe-ty without extreme and unjustifiable costs. Make it 
 safe nevertheless, and formulate usage guides that people are 
expected to follow.

5) Make `RCArray` a special type whose purpose is known to the 
compiler, and implement complicated checks to verify  safe-ty. 
Again, kind of defeats the purpose, and adds complexity to the 
language and implementation.

6) Restrict borrowing to situations where it's  safe. Or better, 
allow it everywhere, but make it  system where necessary. I think 
problems can only happen at function boundaries (what happens 
inside a function can be checked statically), but I'd have to 
think about it more.

7) Anything else? Are there some small details (in whatever part 
of the language) that can be adjusted to get us additional 
guarantees?

Option 6) currently appears the most promising to me.

Comments?

[1] http://forum.dlang.org/thread/mcg8qq$1mbr$1 digitalmars.com
[2] http://wiki.dlang.org/DIP25
[4] 
http://forum.dlang.org/post/ilfwmeobprkcorpiqoui forum.dlang.org
[5] http://forum.dlang.org/post/mcqk4s$6qb$1 digitalmars.com

Mar 01 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/1/15 7:44 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 Walter posted an example implementation of a reference counted array
 [1], that utilizes the features introduced in DIP25 [2]. Then, in the
 threads about reference counted objects, several people posted examples
 [3, 4] that broke the suggested optimization of eliding
 `opAddRef()`/`opRelease()` calls in certain situations.

 A weakness of the same kind affects DIP25, too. The core of the problem
 is borrowing (ref return as in DIP25), combined with manual (albeit
 hidden) memory management. An example to illustrate:

      struct T {
          void doSomething();
      }
      struct S {
          RCArray!T array;
      }
      void main() {
          auto s = S(RCArray!T([T()])); // s.array's refcount is now 1
          foo(s, s.array[0]);           // pass by ref
      }
      void foo(ref S s, ref T T) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }

Thanks for pointing this out - it's a real problem. -- Andrei

Mar 01 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
 Walter posted an example implementation of a reference counted 
 array [1], that utilizes the features introduced in DIP25 [2]. 
 Then, in the threads about reference counted objects, several 
 people posted examples [3, 4] that broke the suggested 
 optimization of eliding `opAddRef()`/`opRelease()` calls in 
 certain situations.

 A weakness of the same kind affects DIP25, too. The core of the 
 problem is borrowing (ref return as in DIP25), combined with 
 manual (albeit hidden) memory management. An example to 
 illustrate:

     struct T {
         void doSomething();
     }
     struct S {
         RCArray!T array;
     }
     void main() {
         auto s = S(RCArray!T([T()])); // s.array's refcount is 
 now 1
         foo(s, s.array[0]);           // pass by ref
     }
     void foo(ref S s, ref T T) {
         s.array = RCArray!T([]);      // drop the old s.array
         t.doSomething();              // oops, t is gone
     }

 Any suggestions how to deal with this? As far as I can see, 
 there are the following options:

See:
http://forum.dlang.org/post/bghjqvvrdcfqmoiyyuqz forum.dlang.org
...and:
http://forum.dlang.org/post/cviwlkugnothraubcfgy forum.dlang.org

Mar 01 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Sunday, 1 March 2015 at 16:54:15 UTC, Zach the Mystic wrote:
 On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
 Walter posted an example implementation of a reference counted 
 array [1], that utilizes the features introduced in DIP25 [2]. 
 Then, in the threads about reference counted objects, several 
 people posted examples [3, 4] that broke the suggested 
 optimization of eliding `opAddRef()`/`opRelease()` calls in 
 certain situations.

 A weakness of the same kind affects DIP25, too. The core of 
 the problem is borrowing (ref return as in DIP25), combined 
 with manual (albeit hidden) memory management. An example to 
 illustrate:

    struct T {
        void doSomething();
    }
    struct S {
        RCArray!T array;
    }
    void main() {
        auto s = S(RCArray!T([T()])); // s.array's refcount is 
 now 1
        foo(s, s.array[0]);           // pass by ref
    }
    void foo(ref S s, ref T T) {
        s.array = RCArray!T([]);      // drop the old s.array
        t.doSomething();              // oops, t is gone
    }

 Any suggestions how to deal with this? As far as I can see, 
 there are the following options:

 See:
 http://forum.dlang.org/post/bghjqvvrdcfqmoiyyuqz forum.dlang.org
 ...and:
 http://forum.dlang.org/post/cviwlkugnothraubcfgy forum.dlang.org

That's not applicable here though. The compiler doesn't know 
we're doing reference counting, so it cannot insert 
AddRef/Release calls.

Mar 01 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Sunday, 1 March 2015 at 17:48:15 UTC, Marc Schütz wrote:
 That's not applicable here though. The compiler doesn't know 
 we're doing reference counting, so it cannot insert 
 AddRef/Release calls.

Why can't structs also implement opAddRef/Release? Why is it 
restricted to classes?

Mar 01 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Sunday, 1 March 2015 at 17:48:15 UTC, Marc Schütz wrote:
 See:
 http://forum.dlang.org/post/bghjqvvrdcfqmoiyyuqz forum.dlang.org
 ...and:
 http://forum.dlang.org/post/cviwlkugnothraubcfgy forum.dlang.org

 That's not applicable here though. The compiler doesn't know 
 we're doing reference counting, so it cannot insert 
 AddRef/Release calls.

Maybe the compiler can detect a split-pass and insert the 
appropriate calls to the correlate opAdd/opRelease functions for 
structs according to the same rules as for classes. Isn't 
split-passing a rare business in general?

Mar 01 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/1/2015 7:44 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 A weakness of the same kind affects DIP25, too. The core of the problem is
 borrowing (ref return as in DIP25), combined with manual (albeit hidden) memory
 management. An example to illustrate:

      struct T {
          void doSomething();
      }
      struct S {
          RCArray!T array;
      }
      void main() {
          auto s = S(RCArray!T([T()])); // s.array's refcount is now 1
          foo(s, s.array[0]);           // pass by ref
      }
      void foo(ref S s, ref T T) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }

The trouble seems to happen when there are two references to the same object 
passed to a function. I.e. there can be only one "borrowed" ref at a time.

I'm thinking this could be statically disallowed in  safe code.

Mar 01 2015

Michel Fortin <michel.fortin michelf.com> writes:

On 2015-03-01 19:21:57 +0000, Walter Bright said:

 The trouble seems to happen when there are two references to the same 
 object passed to a function. I.e. there can be only one "borrowed" ref 
 at a time.
 
 I'm thinking this could be statically disallowed in  safe code.

That's actually not enough. You'll have to block access to global 
variables too:

	S s;

	void main() {
		s.array = RCArray!T([T()]);   // s.array's refcount is now 1
		foo(s.array[0]);           // pass by ref
	}
	void foo(ref T t) {
		s.array = RCArray!T([]);      // drop the old s.array
		t.doSomething();              // oops, t is gone
	}


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 01 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Sunday, 1 March 2015 at 20:51:35 UTC, Michel Fortin wrote:
 On 2015-03-01 19:21:57 +0000, Walter Bright said:

 The trouble seems to happen when there are two references to 
 the same object passed to a function. I.e. there can be only 
 one "borrowed" ref at a time.
 
 I'm thinking this could be statically disallowed in  safe code.

 That's actually not enough. You'll have to block access to 
 global variables too:

 	S s;

 	void main() {
 		s.array = RCArray!T([T()]);   // s.array's refcount is now 1
 		foo(s.array[0]);           // pass by ref
 	}
 	void foo(ref T t) {
 		s.array = RCArray!T([]);      // drop the old s.array
 		t.doSomething();              // oops, t is gone
 	}

Globals to impures, that is.

Mar 01 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to global variables
too:

Hmm. That's not so easy to solve.

Mar 01 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 05:57:35 UTC, Walter Bright wrote:
 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to 
 global variables too:

 Hmm. That's not so easy to solve.

But consider this. It's only an impure function which might alias 
a global. And since you already have access to the global in the 
impure function, there might be less incentive in general to pass 
it through a function. Other than that, you're stuck with a 
theoretical " impure!varName" function attribute, for example, 
which tells the caller which globals are accessed.

Mar 02 2015

Michel Fortin <michel.fortin michelf.com> writes:

On 2015-03-02 05:57:12 +0000, Walter Bright said:

 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to global 
 variables too:

 
 Hmm. That's not so easy to solve.

Let's see. The problem is that 'ref' variables get invalidated by some 
operations. Perhaps we could just tell the compiler that doing this or 
that will makes 'ref' variables unsafe after that point. Let's try 
adding a  refbreaking function attribute, and apply it to 
RCArray.opAssign:

	S s;

	void main() {
		s.array = RCArray!T([T()]);
		foo(s.array[0]);
	}
	void foo(ref T t) {
		t.doSomething(); // all is fine
		s.array = RCArray!T([]); // here, RCArray.opAssign would be labeled 
 refbreaking
		t.doSomething(); // cannot use 'ref' variable after doing a 
refbreaking operation
	}

Also, the above shouldn't compile anyway because  refbreaking would 
need to be transitive, and it follows that `foo` would need to be 
 refbreaking too:

	void foo(ref T t)  refbreaking {
		...
	}

which in turn means that `main` too needs to be  refbreaking.

So what needs to be  refbreaking? Anything that might deallocate. This 
includes `opRelease` if it deallocates when the counter reaches zero. 
Although you could implement `opRelease` in a way that sends the memory 
block to an autorelease pool of some kind, in which case draining the 
autorelease pool at a later point would be  refbreaking.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 03 2015

Michel Fortin <michel.fortin michelf.com> writes:

On 2015-03-03 22:39:12 +0000, Michel Fortin said:

 Let's see. The problem is that 'ref' variables get invalidated by some 
 operations. Perhaps we could just tell the compiler that doing this or 
 that will makes 'ref' variables unsafe after that point. Let's try 
 adding a  refbreaking function attribute, and apply it to 
 RCArray.opAssign:
 
 	S s;
 
 	void main() {
 		s.array = RCArray!T([T()]);
 		foo(s.array[0]);
 	}
 	void foo(ref T t) {
 		t.doSomething(); // all is fine
 		s.array = RCArray!T([]); // here, RCArray.opAssign would be labeled 
  refbreaking
 		t.doSomething(); // cannot use 'ref' variable after doing a 
 refbreaking operation
 	}
 
 Also, the above shouldn't compile anyway because  refbreaking would 
 need to be transitive, and it follows that `foo` would need to be 
  refbreaking too:
 
 	void foo(ref T t)  refbreaking {
 		...
 	}
 
 which in turn means that `main` too needs to be  refbreaking.
 
 So what needs to be  refbreaking? Anything that might deallocate. This 
 includes `opRelease` if it deallocates when the counter reaches zero. 
 Although you could implement `opRelease` in a way that sends the memory 
 block to an autorelease pool of some kind, in which case draining the 
 autorelease pool at a later point would be  refbreaking.

And giving it some more thought,  refbreaking also has the interesting 
property that any pair of opAddRef/opRelease with no  refbreaking call 
between them can be elided safely.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Sunday, 1 March 2015 at 20:51:35 UTC, Michel Fortin wrote:
 On 2015-03-01 19:21:57 +0000, Walter Bright said:

 The trouble seems to happen when there are two references to 
 the same object passed to a function. I.e. there can be only 
 one "borrowed" ref at a time.
 
 I'm thinking this could be statically disallowed in  safe code.

 That's actually not enough. You'll have to block access to 
 global variables too:

 	S s;

 	void main() {
 		s.array = RCArray!T([T()]);   // s.array's refcount is now 1
 		foo(s.array[0]);           // pass by ref
 	}
 	void foo(ref T t) {
 		s.array = RCArray!T([]);      // drop the old s.array
 		t.doSomething();              // oops, t is gone
 	}

What's the difference between that and this:

void fun() {
   T[] ta = [T()].dup;
   T* t = ta[0];
   delete ta; // or however you do it
   *t = ...;
}

Why is this a parameter passing issue and not a "you kept a 
sub-reference to a deleted chunk" issue?

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 15:22:33 UTC, Zach the Mystic wrote:
 void fun() {
   T[] ta = [T()].dup;
   T* t = ta[0];

I meant: T* t = &ta[0];

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to global variables
too:

      S s;

      void main() {
          s.array = RCArray!T([T()]);   // s.array's refcount is now 1
          foo(s.array[0]);           // pass by ref
      }
      void foo(ref T t) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }

Thinking about it, there are many other ways this can happen. At the moment,
I'm 
stuck thinking of a solution other than requiring foo() to be pure. Anyone have 
ideas?

Mar 02 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/2/15 12:37 PM, Walter Bright wrote:
 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to global
 variables too:

      S s;

      void main() {
          s.array = RCArray!T([T()]);   // s.array's refcount is now 1
          foo(s.array[0]);           // pass by ref
      }
      void foo(ref T t) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }

 Thinking about it, there are many other ways this can happen. At the
 moment, I'm stuck thinking of a solution other than requiring foo() to
 be pure. Anyone have ideas?

I have a solution (as in working implementation), but also a deadline 
that's staring me in the face so this will have to wait a couple more days.

I also have a bit of an attack to const/immutable support but that's 
less good (allows too much freedom).


Andrei

Mar 02 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Monday, 2 March 2015 at 21:00:54 UTC, Andrei Alexandrescu 
wrote:
 On 3/2/15 12:37 PM, Walter Bright wrote:
 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to 
 global
 variables too:

     S s;

     void main() {
         s.array = RCArray!T([T()]);   // s.array's refcount 
 is now 1
         foo(s.array[0]);           // pass by ref
     }
     void foo(ref T t) {
         s.array = RCArray!T([]);      // drop the old s.array
         t.doSomething();              // oops, t is gone
     }

 Thinking about it, there are many other ways this can happen. 
 At the
 moment, I'm stuck thinking of a solution other than requiring 
 foo() to
 be pure. Anyone have ideas?

 I have a solution (as in working implementation), but also a 
 deadline that's staring me in the face so this will have to 
 wait a couple more days.

 I also have a bit of an attack to const/immutable support but 
 that's less good (allows too much freedom).

"I have discovered a marvellous solution, but this post is too 
short to describe it."

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 1:09 PM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 "I have discovered a marvellous solution, but this post is too short to
describe
 it."

Fortunately, Fermat (er, Andrei) was able to pass along his dazz idea to me
this 
afternoon, before he expired to something he had to attend to. We were both in 
the dumps about this problem last night, but I think he solved it.

His insight was that the deletion of the payload occurred before the end of the 
lifetime of the RC object, and that this was the source of the problem. If the 
deletion of the payload occurs during the destructor call, rather than the 
postblit, then although the ref count of the payload goes to zero, it doesn't 
actually get deleted.

I.e. the postblit manipulates the ref count, but does NOT do payload deletions. 
The destructor checks the ref count, if it is zero, THEN it does the payload 
deletion.

Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)

Unless, of course, we missed something obvious.

Mar 02 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/2/15 2:57 PM, Walter Bright wrote:
 On 3/2/2015 1:09 PM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>"
 wrote:
 "I have discovered a marvellous solution, but this post is too short
 to describe
 it."

 Fortunately, Fermat (er, Andrei) was able to pass along his dazz idea to
 me this afternoon, before he expired to something he had to attend to.
 We were both in the dumps about this problem last night, but I think he
 solved it.

 His insight was that the deletion of the payload occurred before the end
 of the lifetime of the RC object, and that this was the source of the
 problem. If the deletion of the payload occurs during the destructor
 call, rather than the postblit, then although the ref count of the
 payload goes to zero, it doesn't actually get deleted.

 I.e. the postblit manipulates the ref count, but does NOT do payload
 deletions. The destructor checks the ref count, if it is zero, THEN it
 does the payload deletion.

 Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)

 Unless, of course, we missed something obvious.

And since an RCArray may undergo several assignments during its lifetime 
(thus potentially needing to free several chunks of memory), the arrays 
to be destroyed will be kept in a freelist-style structure. Destructor 
walks the freelist and frees the chunks.

Andrei

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 3:22 PM, Andrei Alexandrescu wrote:
 And since an RCArray may undergo several assignments during its lifetime (thus
 potentially needing to free several chunks of memory), the arrays to be
 destroyed will be kept in a freelist-style structure. Destructor walks the
 freelist and frees the chunks.

Right, I had neglected to add that.

Mar 02 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/2/15 3:22 PM, Andrei Alexandrescu wrote:
 On 3/2/15 2:57 PM, Walter Bright wrote:
 On 3/2/2015 1:09 PM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>"
 wrote:
 "I have discovered a marvellous solution, but this post is too short
 to describe
 it."

 Fortunately, Fermat (er, Andrei) was able to pass along his dazz idea to
 me this afternoon, before he expired to something he had to attend to.
 We were both in the dumps about this problem last night, but I think he
 solved it.

 His insight was that the deletion of the payload occurred before the end
 of the lifetime of the RC object, and that this was the source of the
 problem. If the deletion of the payload occurs during the destructor
 call, rather than the postblit, then although the ref count of the
 payload goes to zero, it doesn't actually get deleted.

 I.e. the postblit manipulates the ref count, but does NOT do payload
 deletions. The destructor checks the ref count, if it is zero, THEN it
 does the payload deletion.

 Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)

 Unless, of course, we missed something obvious.

 And since an RCArray may undergo several assignments during its lifetime
 (thus potentially needing to free several chunks of memory), the arrays
 to be destroyed will be kept in a freelist-style structure. Destructor
 walks the freelist and frees the chunks.

We may apply the same principle (memory may only be freed during 
destruction) to DIP74. If we figure a cheap way to keep a freelist of 
objects to be deallocated, we can borrow multiple times without a 
reference count add. But I didn't think this through yet. -- Andrei

Mar 02 2015

Ivan Timokhin <timokhin.iv gmail.com> writes:

Excuse me if I miss something obvious, but:

    void main()
    {
        auto arr = RCArray!int([0]);
        foo(arr, arr[0]);
    }

    void foo(ref RCArray!int arr, ref int val)
    {
        {
            auto copy = arr; //arr's (and copy's) reference counts are both 2
            arr = RCArray!int([]); // There is another owner, so arr 
                                   // forgets about the old payload
        } // Last owner of the array ('copy') gets destroyed and happily
          // frees the payload.
        val = 3; // Oops.
    }

On Mon, Mar 02, 2015 at 03:22:52PM -0800, Andrei Alexandrescu wrote:
 On 3/2/15 2:57 PM, Walter Bright wrote:
 His insight was that the deletion of the payload occurred before the end
 of the lifetime of the RC object, and that this was the source of the
 problem. If the deletion of the payload occurs during the destructor
 call, rather than the postblit, then although the ref count of the
 payload goes to zero, it doesn't actually get deleted.

 I.e. the postblit manipulates the ref count, but does NOT do payload
 deletions. The destructor checks the ref count, if it is zero, THEN it
 does the payload deletion.

 Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)

 Unless, of course, we missed something obvious.

 
 And since an RCArray may undergo several assignments during its lifetime 
 (thus potentially needing to free several chunks of memory), the arrays 
 to be destroyed will be kept in a freelist-style structure. Destructor 
 walks the freelist and frees the chunks.
 
 Andrei

Mar 04 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/4/15 12:55 AM, Ivan Timokhin wrote:
 Excuse me if I miss something obvious, but:

      void main()
      {
          auto arr = RCArray!int([0]);
          foo(arr, arr[0]);
      }

      void foo(ref RCArray!int arr, ref int val)
      {
          {
              auto copy = arr; //arr's (and copy's) reference counts are both 2
              arr = RCArray!int([]); // There is another owner, so arr
                                     // forgets about the old payload
          } // Last owner of the array ('copy') gets destroyed and happily
            // frees the payload.
          val = 3; // Oops.
      }

That's a problem, thanks very much for pointing it out. -- Andrei

Mar 04 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 3/4/15 10:42 AM, Andrei Alexandrescu wrote:
 On 3/4/15 12:55 AM, Ivan Timokhin wrote:
 Excuse me if I miss something obvious, but:

      void main()
      {
          auto arr = RCArray!int([0]);
          foo(arr, arr[0]);
      }

      void foo(ref RCArray!int arr, ref int val)
      {
          {
              auto copy = arr; //arr's (and copy's) reference counts
 are both 2
              arr = RCArray!int([]); // There is another owner, so arr
                                     // forgets about the old payload
          } // Last owner of the array ('copy') gets destroyed and happily
            // frees the payload.
          val = 3; // Oops.
      }

 That's a problem, thanks very much for pointing it out. -- Andrei

Again, I think this is an issue with the expectation of RCArray. You 
cannot *save* a ref to an array element, only a ref to the array itself, 
because you lose control over the reference count.

I don't think arr[0] should correctly bind to foo's second argument.

-Steve

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven Schveighoffer 
wrote:
 Again, I think this is an issue with the expectation of 
 RCArray. You cannot *save* a ref to an array element, only a 
 ref to the array itself, because you lose control over the 
 reference count.

What you need is a special RCSlave type, which is reference 
counted not to the type of its *own* data, but to its parent's. 
In this case, a RCArraySlave!(T) holds data of type T, but a 
pointer to an RCArray, which it decrements when it gets 
destroyed. This could get expensive, with an extra pointer per 
instance than a regular T, but it would probably be safe.

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 18:05:52 UTC, Zach the Mystic wrote:
 On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven 
 Schveighoffer wrote:
 Again, I think this is an issue with the expectation of 
 RCArray. You cannot *save* a ref to an array element, only a 
 ref to the array itself, because you lose control over the 
 reference count.

 What you need is a special RCSlave type, which is reference 
 counted not to the type of its *own* data, but to its parent's. 
 In this case, a RCArraySlave!(T) holds data of type T, but a 
 pointer to an RCArray, which it decrements when it gets 
 destroyed. This could get expensive, with an extra pointer per 
 instance than a regular T, but it would probably be safe.

Another solution is to get compiler help. If you know the 
lifetime of a sub-reference `p.t` to be shorter than of its Rc'd 
parent `p`, the compiler can wrap its `p.t's lifetime in an 
addRef/release cycle for P. This works in calling a function:

fun(p, p.t);

Let's say that you know that `p.t` won't escape (a different 
question). The compiler doesn't need to know about `p.t` to wrap 
the whole function like this:

p.opAddRef(); // or equivalent
fun(p, p.t);
p.opRelease();

It just needs to know that `p.t's lifetime is shorter than `p's.

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 18:05:52 UTC, Zach the Mystic wrote:
 On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven 
 Schveighoffer wrote:
 Again, I think this is an issue with the expectation of 
 RCArray. You cannot *save* a ref to an array element, only a 
 ref to the array itself, because you lose control over the 
 reference count.

 What you need is a special RCSlave type, which is reference 
 counted not to the type of its *own* data, but to its parent's. 
 In this case, a RCArraySlave!(T) holds data of type T, but a 
 pointer to an RCArray, which it decrements when it gets 
 destroyed. This could get expensive, with an extra pointer per 
 instance than a regular T, but it would probably be safe.

A way to do this is to have a core RCData type which has the 
count itself and the chunk of memory the count refers to in type 
ambiguous form:

struct RCData {
   int count;
   // the point is that RCData can be type ambiguous
   void[] chunk;

   this(size_t size)
   {
     chunk = new void[size];
     count = 0;
   }
   void addRef() {
     ++count;
   }
   void decRef() {
     if (count && --count == 0)
       delete chunk;
   }
}

Over top of that you create a basic element type which refcounts 
an RCData rather than itself:

struct RCType(E) {
   E element;
   RCType* data;

   this(this)
   {
     data.addRef();
   }

   ~this()
   {
     data.decRef();
   }
   [...etc...]
}

Then you have an RCArray which returns RCType elements when 
indexed rather than naked types:

struct RCArray(E) {

   E[] array;
   private RCData* data;

   RCElement!E opIndex(size_t i) return
   {
     return RCElement!E(array[start + i], data);
   }

   this(E[] a)
   {
     data = new RCData(a * sizeof(a));
     array = cast(E[]) data.chunk;

   }

   this(this)
   {
     data.addRef();
   }

   ~this()
   {
     data.decRef();
   }

   //...
}

This might work. The idea is to only leak references to types 
which also have pointers to the original data.

Mar 05 2015

"deadalnix" <deadalnix gmail.com> writes:

Kind of OT, but your train of thought is very difficult to follow 
the way you are communicating (ie by updating on previous post by 
answering to yourself).

Could you post some more global overview at some point, so one 
does not need to gather various information for various posts 
please ?

Mar 05 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Thursday, 5 March 2015 at 18:41:31 UTC, deadalnix wrote:
 Kind of OT, but your train of thought is very difficult to 
 follow the way you are communicating (ie by updating on 
 previous post by answering to yourself).

 Could you post some more global overview at some point, so one 
 does not need to gather various information for various posts 
 please ?

Okay. I seem to be mixing my more well-thought out ideas with 
ideas I get on the spur of the moment. Then they come out in a 
jumble. I have to confess that a lot of my ideas just pop into my 
head.

Did you want me to talk about how I would do ownership with my 
reference safety system?

Mar 05 2015

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 5 March 2015 at 20:47:55 UTC, Zach the Mystic wrote:
 Did you want me to talk about how I would do ownership with my 
 reference safety system?

I'd like you to expose what is pretty settled down in your mind 
so we can have a base to understand the in flux part of the ideas 
:)

Mar 05 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/4/15 9:22 AM, Steven Schveighoffer wrote:
 On 3/4/15 10:42 AM, Andrei Alexandrescu wrote:
 On 3/4/15 12:55 AM, Ivan Timokhin wrote:
 Excuse me if I miss something obvious, but:

      void main()
      {
          auto arr = RCArray!int([0]);
          foo(arr, arr[0]);
      }

      void foo(ref RCArray!int arr, ref int val)
      {
          {
              auto copy = arr; //arr's (and copy's) reference counts
 are both 2
              arr = RCArray!int([]); // There is another owner, so arr
                                     // forgets about the old payload
          } // Last owner of the array ('copy') gets destroyed and
 happily
            // frees the payload.
          val = 3; // Oops.
      }

 That's a problem, thanks very much for pointing it out. -- Andrei

 Again, I think this is an issue with the expectation of RCArray. You
 cannot *save* a ref to an array element, only a ref to the array itself,
 because you lose control over the reference count.

 I don't think arr[0] should correctly bind to foo's second argument.

Yah, this is a fork in the road: either we solve this with DIP25 + 
implementation, or we add stricter static checking disallowing two lent 
references to data in the same scope.

Andrei

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 18:17:41 UTC, Andrei Alexandrescu 
wrote:
 Yah, this is a fork in the road: either we solve this with 
 DIP25 + implementation, or we add stricter static checking 
 disallowing two lent references to data in the same scope.

The third solution is to keep track of lifetimes, recognize 
refcounted types for structs the same as suggested for classes in 
DIP74, and wrap the lifetime of the subreference `t.s` in an 
opAdd/Release cycle for `t`, as illustrated in my other reply. 
You could have the compiler recognize a refcounted struct by 
simply declaring "void opAddRef();" and "void opRelease();", with 
the compiler automatically aliasing them to "this(this)" and 
"~this".

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 19:22:25 UTC, Zach the Mystic wrote:
 On Wednesday, 4 March 2015 at 18:17:41 UTC, Andrei Alexandrescu 
 wrote:
 Yah, this is a fork in the road: either we solve this with 
 DIP25 + implementation, or we add stricter static checking 
 disallowing two lent references to data in the same scope.

 The third solution is to keep track of lifetimes, recognize 
 refcounted types for structs the same as suggested for classes 
 in DIP74, and wrap the lifetime of the subreference `t.s` in an 
 opAdd/Release cycle for `t`, as illustrated in my other reply. 
 You could have the compiler recognize a refcounted struct by 
 simply declaring "void opAddRef();" and "void opRelease();", 
 with the compiler automatically aliasing them to "this(this)" 
 and "~this".

I'm sorry, I just realized this proposal is too complicated, and 
it wouldn't even work.

I think stricter static checking in  safe code is the way to go. 
When passing a global RC type to an impure, or duplicating the 
same RC reference variable in a function call, it's unsafe. The 
workaround is to make copies and use them:

static RcType s; // global
RcType c;

// Instead of:
func(s);
func(c, c);

// ...do this:
auto tmp = s; // get stack reference
func(tmp);
auto d = c; // copy Rc'd type
func(c, d);

Expensive, perhaps, but safe.

Mar 04 2015

"sclytrack" <sclytrack fake.com> writes:

On Wednesday, 4 March 2015 at 08:56:20 UTC, Ivan Timokhin wrote:
 Excuse me if I miss something obvious, but:

     void main()
     {
         auto arr = RCArray!int([0]);
         foo(arr, arr[0]);
     }

     void foo(ref RCArray!int arr, ref int val)
     {
         {
             auto copy = arr; //arr's (and copy's) reference 
 counts are both 2
             arr = RCArray!int([]); // There is another owner, 
 so arr
                                    // forgets about the old 
 payload
         } // Last owner of the array ('copy') gets destroyed 
 and happily
           // frees the payload.
         val = 3; // Oops.
     }



struct PileInfo		"Red" 	TOP of PILE
{
	int refCount;
	RCDataBlock * top;
	RCDataBlock * bottom;
}

struct RCDataBlock	"Blue"	
{
	PileInfo * pileInfo;
	RCDataBlock * next;
	Array * payload;	//Actual Data.
}

struct RCArray
{
	RCDataBlock * block;
}


RCArray a,b,c,d;	//all different piles

a = b;
b = c;
d = a;			//makes them one single pile.


What if you pile them up. Blue cubes which contain the data.
And a Red cube containing the reference count equal to
the sum of all references to the blue cube of the same pile.
Basically a pile of blue cubes with a red cube on top.



1) RCArray a,b,c,d;		//------------

[1]
  x <-a

[1]
  x <-b

[1]
  x <-c

[1]
  x <-d


2) a = b;	//------------

[2]
  x <-[old b1]
  x <-a,b

[1]
  x <-c

[1]
  x <-d

3) b = c;	//------------


[3]
  x <-b,c
  x <-[old b1]
  x <-a,[old b2]

[1]
  x <-d


4) d=a;		//------------

[4]
  x <-b,c
  x <-[old b1]
  x <-[old a1],[old b2]
  x <-d,a

Mar 05 2015

Nick Treleaven <ntrel-pub mybtinternet.com> writes:

On 04/03/2015 08:55, Ivan Timokhin wrote:
      void main()
      {
          auto arr = RCArray!int([0]);
          foo(arr, arr[0]);
      }

      void foo(ref RCArray!int arr, ref int val)
      {
          {
              auto copy = arr; //arr's (and copy's) reference counts are both 2
              arr = RCArray!int([]); // There is another owner, so arr
                                     // forgets about the old payload
          } // Last owner of the array ('copy') gets destroyed and happily
            // frees the payload.
          val = 3; // Oops.
      }

We could prevent reassignment of arr while val is alive, but perhaps 
still allow mutation of existing elements through arr.

I've changed Marc Schütz's example to explore this:

void main() {
     auto a = RCArray!int([5]); // a's refcount is now 1
     foo(a, a[0]); // pass by ref
}
void foo(ref RCArray!int a, ref int t) {
     a[0] = 4; // ok
     a = RCArray!int([]); // drop the old a
     t = 3; // oops, t is gone
}

I think Rust would enforce that either a *or* t can be mutably borrowed 
at once (and for a, t can't even be immutably-borrowed). Without 
designing a system, in theory foo is OK if a is const, but that prevents 
a[0] = 4. This could be allowed as long as a is not reassigned (i.e. a 
is head-const).

To support RCDynamicArray that supports appending and resizing, these 
operations also need to be excluded, whilst potentially allowing 
existing elements to be mutated.

(I've seen Andrei's solution for the above example, but it doesn't work 
for Ivan Timokhin's code, and for the former it can be non-ideal).

Perhaps a parameter attribute similar to head-const (but somehow 
configurable by the container author) could enforce this. The author of 
foo would need to use this attribute for a. The container could mark the 
safe mutation operations with this attribute.

Just an idea. I haven't considered other types of container, maybe it 
would help those also.

Mar 09 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 I.e. the postblit manipulates the ref count, but does NOT do 
 payload deletions. The destructor checks the ref count, if it 
 is zero, THEN it does the payload deletion.

 Pretty dazz idea, dontcha think? And DIP25 still stands 
 unscathed :-)

 Unless, of course, we missed something obvious.

Add me to "we". I'm dazzed! :-)

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 His insight was that the deletion of the payload occurred 
 before the end of the lifetime of the RC object, and that this 
 was the source of the problem. If the deletion of the payload 
 occurs during the destructor call, rather than the postblit,

RcArray, a struct, already does this. You wouldn't delete in a 
postblit anyway would you? Do you need opRelease and ~this to be 
separate for structs too?

Mar 02 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/2/15 4:01 PM, Zach the Mystic wrote:
 On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 His insight was that the deletion of the payload occurred before the
 end of the lifetime of the RC object, and that this was the source of
 the problem. If the deletion of the payload occurs during the
 destructor call, rather than the postblit,

 RcArray, a struct, already does this. You wouldn't delete in a postblit
 anyway would you?

Deletion occurs in opAssign.

 Do you need opRelease and ~this to be separate for
 structs too?

Probably not.


Andrei

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 His insight was that the deletion of the payload occurred 
 before the end of the lifetime of the RC object, and that this 
 was the source of the problem. If the deletion of the payload 
 occurs during the destructor call, rather than the postblit, 
 then although the ref count of the payload goes to zero, it 
 doesn't actually get deleted.

 I.e. the postblit manipulates the ref count, but does NOT do 
 payload deletions. The destructor checks the ref count, if it 
 is zero, THEN it does the payload deletion.

I guess you also mean opAssigns -- they would manipulate 
refcounts too right? In fact, they would be the primary means of 
decrementing the refcount *apart* from the destructor, right?

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 00:05:50 UTC, Zach the Mystic wrote:
 I guess you also mean opAssigns -- they would manipulate 
 refcounts too right? In fact, they would be the primary means 
 of decrementing the refcount *apart* from the destructor, right?

Nevermind. I was a minute too soon with my post!

Mar 02 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 On 3/2/2015 1:09 PM, "Marc =?UTF-8?B?U2Now7x0eiI=?= 
 <schuetzm gmx.net>" wrote:
 "I have discovered a marvellous solution, but this post is too 
 short to describe
 it."

 Fortunately, Fermat (er, Andrei) was able to pass along his 
 dazz idea to me this afternoon, before he expired to something 
 he had to attend to. We were both in the dumps about this 
 problem last night, but I think he solved it.

 His insight was that the deletion of the payload occurred 
 before the end of the lifetime of the RC object, and that this 
 was the source of the problem. If the deletion of the payload 
 occurs during the destructor call, rather than the postblit, 
 then although the ref count of the payload goes to zero, it 
 doesn't actually get deleted.

 I.e. the postblit manipulates the ref count, but does NOT do 
 payload deletions. The destructor checks the ref count, if it 
 is zero, THEN it does the payload deletion.

 Pretty dazz idea, dontcha think? And DIP25 still stands 
 unscathed :-)

 Unless, of course, we missed something obvious.

Clever :-) But there are two things:

The object is still accessible after its refcount went to zero, 
and can therefore potentially be resurrected. Probably not a 
problem, but needs to be taken into account, in particular in 
with respect to the freelist. That's tricky, because an object 
can be released and resurrected several times, and care must be 
taken that it will not end up in the freelist and get destroyed 
multiple times. And that hasn't even touched on thread-safety yet.

The bigger problem is that it's relying on a convention. The RC 
wrapper needs to be constructed in a particular way that's easy 
to get wrong and that the compiler has no way to check for us.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 5:05 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 The object is still accessible after its refcount went to zero, and can
 therefore potentially be resurrected. Probably not a problem, but needs
 to be taken into account, in particular in with respect to the freelist.
 That's tricky, because an object can be released and resurrected several
 times, and care must be taken that it will not end up in the freelist
 and get destroyed multiple times. And that hasn't even touched on
 thread-safety yet.

Could you please give an example?

 The bigger problem is that it's relying on a convention. The RC wrapper
 needs to be constructed in a particular way that's easy to get wrong and
 that the compiler has no way to check for us.

Agreed.


Andrei

Mar 03 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Tuesday, 3 March 2015 at 15:01:02 UTC, Andrei Alexandrescu 
wrote:
 On 3/3/15 5:05 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= 
 <schuetzm gmx.net>" wrote:
 The object is still accessible after its refcount went to 
 zero, and can
 therefore potentially be resurrected. Probably not a problem, 
 but needs
 to be taken into account, in particular in with respect to the 
 freelist.
 That's tricky, because an object can be released and 
 resurrected several
 times, and care must be taken that it will not end up in the 
 freelist
 and get destroyed multiple times. And that hasn't even touched 
 on
 thread-safety yet.

 Could you please give an example?

No, I can't, because it isn't true. I was confused...

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 9:30 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 On Tuesday, 3 March 2015 at 15:01:02 UTC, Andrei Alexandrescu wrote:
 On 3/3/15 5:05 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>"
 wrote:
 The object is still accessible after its refcount went to zero, and can
 therefore potentially be resurrected. Probably not a problem, but needs
 to be taken into account, in particular in with respect to the freelist.
 That's tricky, because an object can be released and resurrected several
 times, and care must be taken that it will not end up in the freelist
 and get destroyed multiple times. And that hasn't even touched on
 thread-safety yet.

 Could you please give an example?

 No, I can't, because it isn't true. I was confused...

<silently wipes sweat off forehead>

Mar 03 2015

Nick Treleaven <ntrel-pub mybtinternet.com> writes:

On 03/03/2015 13:05, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" 
wrote:
 The bigger problem is that it's relying on a convention. The RC wrapper
 needs to be constructed in a particular way that's easy to get wrong and
 that the compiler has no way to check for us.

Maybe the compiler could enforce that opRelease is truly  safe - i.e. 
doesn't call any  trusted code. Although there would still need to be an 
escape hatch for doing unsafe ref-counting without the overhead of a 
free list.

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 Pretty dazz idea, dontcha think? And DIP25 still stands 
 unscathed :-)

 Unless, of course, we missed something obvious.

I was dazzed, but I'm not anymore. I wrote my concern here:

http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 7:38 AM, Zach the Mystic wrote:
 On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
 Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)

 Unless, of course, we missed something obvious.

 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

There's a misunderstanding here. The object being assigned keeps a 
trailing list of past values and defers their deallocation to 
destruction. -- Andrei

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu 
wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned 
 keeps a trailing list of past values and defers their 
 deallocation to destruction. -- Andrei

So you need an extra pointer per instance? Isn't that a big price 
to pay? Is the only problem we're still trying to solve aliasing 
which is not recognized as such and therefore doesn't bump the 
refcounter like it should? An extra pointer would be overkill for 
that. Isn't it better to just recognize the aliasing when it 
happens?

As far as taking the address of an RcArray element, the type of 
which element is not itself Rc'ed, it's a different problem. The 
only thing I've been able to come up with is maybe to create a 
wrapper type within RcArray for the individual elements, and have 
that type do refcounting on the parent instead of itself, if 
that's possible.

Mar 03 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Tuesday, 3 March 2015 at 17:00:17 UTC, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu 
 wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned 
 keeps a trailing list of past values and defers their 
 deallocation to destruction. -- Andrei

 So you need an extra pointer per instance? Isn't that a big 
 price to pay?

All instances need to carry a pointer to refcount anyway, so the 
freelist could just be stored next to the refcount. The idea of 
creating that list, however, is more worrying, because it again 
involves allocations. It can get arbitrarily long.

 Is the only problem we're still trying to solve aliasing which 
 is not recognized as such and therefore doesn't bump the 
 refcounter like it should? An extra pointer would be overkill 
 for that. Isn't it better to just recognize the aliasing when 
 it happens?

 As far as taking the address of an RcArray element, the type of 
 which element is not itself Rc'ed, it's a different problem. 
 The only thing I've been able to come up with is maybe to 
 create a wrapper type within RcArray for the individual 
 elements, and have that type do refcounting on the parent 
 instead of itself, if that's possible.

No, Andrei's proposed solution would take care of that. On 
assignment to RCArray, if the refcount goes to zero, the old 
array is put onto the cleanup list. But there can still be 
borrowed references to it's elements. However, these can never 
outlive the RCArray, therefore it's safe to destroy all of the 
arrays in the cleanup list in the destructor.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 9:40 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 On Tuesday, 3 March 2015 at 17:00:17 UTC, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned keeps a
 trailing list of past values and defers their deallocation to
 destruction. -- Andrei

 So you need an extra pointer per instance? Isn't that a big price to pay?

 All instances need to carry a pointer to refcount anyway, so the
 freelist could just be stored next to the refcount. The idea of creating
 that list, however, is more worrying, because it again involves
 allocations. It can get arbitrarily long.

No, the cool thing about freelists is they use free memory to chain 
things together.

Somebody please write the code already. With no code we sit forever on 
our testes speculating.

 Is the only problem we're still trying to solve aliasing which is not
 recognized as such and therefore doesn't bump the refcounter like it
 should? An extra pointer would be overkill for that. Isn't it better
 to just recognize the aliasing when it happens?

 As far as taking the address of an RcArray element, the type of which
 element is not itself Rc'ed, it's a different problem. The only thing
 I've been able to come up with is maybe to create a wrapper type
 within RcArray for the individual elements, and have that type do
 refcounting on the parent instead of itself, if that's possible.

 No, Andrei's proposed solution would take care of that. On assignment to
 RCArray, if the refcount goes to zero, the old array is put onto the
 cleanup list. But there can still be borrowed references to it's
 elements. However, these can never outlive the RCArray, therefore it's
 safe to destroy all of the arrays in the cleanup list in the destructor.

Yes, that's exactly right, thanks for putting it so clearly.


Andrei

Mar 03 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/3/2015 9:44 AM, Andrei Alexandrescu wrote:
 On 3/3/15 9:40 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 All instances need to carry a pointer to refcount anyway, so the
 freelist could just be stored next to the refcount. The idea of creating
 that list, however, is more worrying, because it again involves
 allocations. It can get arbitrarily long.

 No, the cool thing about freelists is they use free memory to chain things
 together.

I don't think the payload memory can be repurposed to be a free list until the 
destructor runs, because the whole problem is the payload may still be live to 
some references to it.

Mar 03 2015

"ixid" <adamsibson hotmail.com> writes:

 Somebody please write the code already. With no code we sit 
 forever on our testes speculating.

Isn't this testes-driven development?

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 17:40:59 UTC, Marc Schütz wrote:
 All instances need to carry a pointer to refcount anyway, so 
 the freelist could just be stored next to the refcount. The 
 idea of creating that list, however, is more worrying, because 
 it again involves allocations. It can get arbitrarily long.

If the last RcType is a global, will the list ever get freed at 
all?

 No, Andrei's proposed solution would take care of that. On 
 assignment to RCArray, if the refcount goes to zero, the old 
 array is put onto the cleanup list. But there can still be 
 borrowed references to it's elements. However, these can never 
 outlive the RCArray, therefore it's safe to destroy all of the 
 arrays in the cleanup list in the destructor.

Wouldn't you need a lifetime system for this? A global, for 
example, couldn't borrow safely. I'm all in favor of an 
ownership/borrowing system, but that would be for a different 
DIP, right? It seems like taking the address of a sub-element of 
an RcType is inherently unsafe, since it separates the memory 
from the refcount.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 10:22 AM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 17:40:59 UTC, Marc Schütz wrote:
 All instances need to carry a pointer to refcount anyway, so the
 freelist could just be stored next to the refcount. The idea of
 creating that list, however, is more worrying, because it again
 involves allocations. It can get arbitrarily long.

 If the last RcType is a global, will the list ever get freed at all?

No. Making a global variable of a reference counted type would be poor 
design.


Andrei

Mar 03 2015

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 3 March 2015 at 18:49:43 UTC, Andrei Alexandrescu 
wrote:
 On 3/3/15 10:22 AM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 17:40:59 UTC, Marc Schütz wrote:
 All instances need to carry a pointer to refcount anyway, so 
 the
 freelist could just be stored next to the refcount. The idea 
 of
 creating that list, however, is more worrying, because it 
 again
 involves allocations. It can get arbitrarily long.

 If the last RcType is a global, will the list ever get freed 
 at all?

 No. Making a global variable of a reference counted type would 
 be poor design.


 Andrei

Considering glopbal here means thread local, I think it does make 
sense to use global for various forms of cache.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 9:00 AM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned keeps a
 trailing list of past values and defers their deallocation to
 destruction. -- Andrei

 So you need an extra pointer per instance?

Yah, or define your type to be single-assignment (probably an emerging 
idiom). You can massage the extra pointer with other data thus reducing 
its cost.

 Isn't that a big price to
 pay? Is the only problem we're still trying to solve aliasing which is
 not recognized as such and therefore doesn't bump the refcounter like it
 should? An extra pointer would be overkill for that. Isn't it better to
 just recognize the aliasing when it happens?

It's all tradeoffs. This has runtime overhead. A static analysis would 
have the challenges of being permissive enough, cheap enough, not add 
notational overhead, etc. etc.


Andrei

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 18:48:36 UTC, Andrei Alexandrescu 
wrote:
 On 3/3/15 9:00 AM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu 
 wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned 
 keeps a
 trailing list of past values and defers their deallocation to
 destruction. -- Andrei

 So you need an extra pointer per instance?

 Yah, or define your type to be single-assignment (probably an 
 emerging idiom). You can massage the extra pointer with other 
 data thus reducing its cost.

 Isn't that a big price to
 pay? Is the only problem we're still trying to solve aliasing 
 which is
 not recognized as such and therefore doesn't bump the 
 refcounter like it
 should? An extra pointer would be overkill for that. Isn't it 
 better to
 just recognize the aliasing when it happens?

 It's all tradeoffs. This has runtime overhead.

Isn't allocating and collecting a freelist also overhead?

 A static analysis would have the challenges of being permissive 
 enough, cheap enough, not add notational overhead, etc. etc.

It's certainly permissive: you can do anything, and compiler 
wraps uncertain operations with add/release cycles automatically. 
These are: passing a global as a mutable reference to an impure 
function; aliasing the same variable in two parameters with 
itself. The unoptimized lowerings would be:

{
   auto tmp = myGlobal; // bump count
   impureFun(myGlobal);
}  // tmp destroyed, --count

{
   auto tmp2 = c; // bump count
   fun(c, c);
} // --count

The only addition is an optimization where the compiler elides 
the assignments and calls the add/release cycles directly.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 12:35 PM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 18:48:36 UTC, Andrei Alexandrescu wrote:
 On 3/3/15 9:00 AM, Zach the Mystic wrote:
 On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu wrote:
 I was dazzed, but I'm not anymore. I wrote my concern here:

 http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww forum.dlang.org

 There's a misunderstanding here. The object being assigned keeps a
 trailing list of past values and defers their deallocation to
 destruction. -- Andrei

 So you need an extra pointer per instance?

 Yah, or define your type to be single-assignment (probably an emerging
 idiom). You can massage the extra pointer with other data thus
 reducing its cost.

 Isn't that a big price to
 pay? Is the only problem we're still trying to solve aliasing which is
 not recognized as such and therefore doesn't bump the refcounter like it
 should? An extra pointer would be overkill for that. Isn't it better to
 just recognize the aliasing when it happens?

 It's all tradeoffs. This has runtime overhead.

 Isn't allocating and collecting a freelist also overhead?

No. I don't have time now for a proof of concept and it seems everybody 
wants to hypothesize about code that doesn't exist instead of writing 
code and then discussing it.

 A static analysis would have the challenges of being permissive
 enough, cheap enough, not add notational overhead, etc. etc.

 It's certainly permissive: you can do anything, and compiler wraps
 uncertain operations with add/release cycles automatically. These are:
 passing a global as a mutable reference to an impure function; aliasing
 the same variable in two parameters with itself. The unoptimized
 lowerings would be:

 {
    auto tmp = myGlobal; // bump count
    impureFun(myGlobal);
 }  // tmp destroyed, --count

 {
    auto tmp2 = c; // bump count
    fun(c, c);
 } // --count

 The only addition is an optimization where the compiler elides the
 assignments and calls the add/release cycles directly.

Do you have something reviewable, or just your own past posts?

For the time being I want to move forward with DIP25 and deferred freeing.


Andrei

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 21:37:20 UTC, Andrei Alexandrescu 
wrote:
 On 3/3/15 12:35 PM, Zach the Mystic wrote:
 Isn't allocating and collecting a freelist also overhead?

 No. I don't have time now for a proof of concept and it seems 
 everybody wants to hypothesize about code that doesn't exist 
 instead of writing code and then discussing it.

Okay.

 The unoptimized lowerings would be:

 {
   auto tmp = myGlobal; // bump count
   impureFun(myGlobal);
 }  // tmp destroyed, --count

 {
   auto tmp2 = c; // bump count
   fun(c, c);
 } // --count

 The only addition is an optimization where the compiler elides 
 the
 assignments and calls the add/release cycles directly.

 Do you have something reviewable, or just your own past posts?

Just my own past posts. My suggestion is based on the compiler 
doing all the work. I don't know how it could be tested without 
hacking the compiler.

 For the time being I want to move forward with DIP25 and 
 deferred freeing.

That's fine. I like DIP25. It's a start towards stronger safety 
guarantees. While I'm pretty sure the runtime costs of my 
proposal are lower than yours, they do require compiler hacking, 
which means they can wait.

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic wrote:
 Just my own past posts. My suggestion is based on the compiler 
 doing all the work. I don't know how it could be tested without 
 hacking the compiler.

I think that part of the fear of my idea is that I want structs 
to get some of the behavior suggested in DIP74 for classes, i.e. 
the compiler inserts calls to opAddRef/opRelease on its own at 
certain times. Since structs only have postblits and destructors, 
there's no canonical way to call them as separate functions. The 
behavior I'm suggesting would only be good if you had a 
refcounted type, which means it's superfluous if not harmful to 
insert it "just because" in other types of structs.

If it turns out that some of the behavior desirable for 
refcounted classes is useful for structs too, it may be necessary 
to hint to the complier that a struct is indeed of the refcounted 
type. For example, "void opAddRef();" and "void opRelease();" 
could be specially recognized, with no definitions even permitted 
(error on attempt), implying "alias opAddRef this(this);", "alias 
opRelease ~this;".

Mar 03 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic wrote:
 That's fine. I like DIP25. It's a start towards stronger safety 
 guarantees. While I'm pretty sure the runtime costs of my 
 proposal are lower than yours, they do require compiler 
 hacking, which means they can wait.

I don't think that it is fine.

At this point we need to :
  - Not free anything as long as something is alive.
  - Can't recycle memory.
  - Keep track of allocated chunk to be able to free them (ie 
implementing malloc on top of malloc).

It means that RC is attached to an ever growing arena. Code that 
would manipulate RCArray and append to it on a regular manner 
must expect some impressive memory consumption.

Even if we manage to do this in phobos (I'm sure we can) it is 
pretty much guaranteed at this point that noone else will, at 
least safely. The benefit is reduced because of the bookeeping 
that need to be done for memory to be freed in addition to 
reference count themselves.


its simplicity. I assume at this point that we have empirical 
evidence that this is NOT the case.

Mar 04 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/4/2015 12:13 AM, deadalnix wrote:

I
 assume at this point that we have empirical evidence that this is NOT the case.

The complexity of a free list doesn't remotely compare to that of adding an 
ownership system.

Besides, we expect to provide sample code for how to do this that can be pretty 
much cut&pasted by people implementing their own RC types.

Mar 04 2015

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

A sound complete ownership system is the only good enough 
solution for D. That's my opinion.

Bye,
bearophile

Mar 04 2015

"Paolo Invernizzi" <paolo.invernizzi no.address> writes:

On Wednesday, 4 March 2015 at 09:16:22 UTC, bearophile wrote:
 Walter Bright:

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's my opinion.

 Bye,
 bearophile

+1

---
Paolo

Mar 04 2015

"No" <no no.no> writes:

On Wednesday, 4 March 2015 at 09:16:22 UTC, bearophile wrote:
 Walter Bright:

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's my opinion.

 Bye,
 bearophile

+1

Mar 04 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/4/2015 1:16 AM, bearophile wrote:
 Walter Bright:
 The complexity of a free list doesn't remotely compare to that of adding an
 ownership system.

 A sound complete ownership system is the only good enough solution for D.
That's
 my opinion.

How do you type an an array of pointers with different owners?

How do you deal with the combinatoric explosion of template instantiations with 
all those different ownership types?

Mar 04 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Wednesday, 4 March 2015 at 10:50:54 UTC, Walter Bright wrote:

 How do you type an an array of pointers with different owners?

You mean an array of pointers to objects with different owners? 
The array is the owner of the objects it points to. Or else it 
should be  trusted?

 How do you deal with the combinatoric explosion of template 
 instantiations with all those different ownership types?

Type erasure.

Mar 04 2015

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 How do you type an an array of pointers with different owners?

"Sound" doesn't mean it should be able to do everything. It will 
be just an approximated model. It means it's going to forbid some 
valid code, just like every type system. You use some  system 
code to work around some of those limitations. And if the design 
is good, such islands of unsafety are located inside Phobos 
constructs that leak 
(http://en.wikipedia.org/wiki/Leaky_abstraction ) very little.

Bye,
bearophile

Mar 04 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Wednesday, 4 March 2015 at 10:50:54 UTC, Walter Bright wrote:
 On 3/4/2015 1:16 AM, bearophile wrote:
 Walter Bright:
 The complexity of a free list doesn't remotely compare to 
 that of adding an
 ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's
 my opinion.

 How do you type an an array of pointers with different owners?

scope T*[] array;

 How do you deal with the combinatoric explosion of template 
 instantiations with all those different ownership types?

There will be no combinatorial explosion. In our latest proposal 
(well, currently only in our minds), `scope` is a storage class, 
not a type modifier. For templates, which parameters are scope 
and which aren't depends only on the code inside the templates. 
There will be literally no additional instantiations because of 
different ownership.

I'm already halfway through the inference algorithm. I'm still 
looking for a way to formalize the detection of borrowing with 
aliasing (the problem started in the OP of this thread), in order 
to make those situations  system. I try to do it in a way that 
will also make deadalnix's "isolated islands" idea easy to 
implement later on.

I hope to finish writing up a proposal in the next days. Really, 
it is _much_ simpler than the previous one, and it will require 
almost no manual annotations.

Mar 04 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/4/15 6:02 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 I hope to finish writing up a proposal in the next days. Really, it is
 _much_ simpler than the previous one, and it will require almost no
 manual annotations.

Looking forward to it! -- Andrei

Mar 04 2015

"ponce" <contact gam3sfrommars.fr> writes:

On Wednesday, 4 March 2015 at 10:50:54 UTC, Walter Bright wrote:
 On 3/4/2015 1:16 AM, bearophile wrote:
 Walter Bright:
 The complexity of a free list doesn't remotely compare to 
 that of adding an
 ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's
 my opinion.

 How do you type an an array of pointers with different owners?

You define a clear owner for everything so that this never 
happens.
That's what we do in C++ and shared_ptr is can be avoided as much 
as we like.

 How do you deal with the combinatoric explosion of template 
 instantiations with all those different ownership types?

What we do in C++: with unsafety. Taking raw pointer in function 
parameters and function returns.

Mark in comments assumptions about which lifetime exceed what.

Modern C++ has no such combinatoric explosion. Instead of a 
std::vector of pointers, you know have a std::vector of 
std::uinque_ptr.

It still is quite easy, in the very least easier that doing 
deterministic destructon in D.

Mar 04 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/4/2015 6:27 AM, ponce wrote:
 You define a clear owner for everything so that this never happens.
 That's what we do in C++ and shared_ptr is can be avoided as much as we like.

C++ doesn't have ownership annotations and no checkable notion of a clear owner.

Mar 04 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Wednesday, 4 March 2015 at 17:06:53 UTC, Walter Bright wrote:
 On 3/4/2015 6:27 AM, ponce wrote:
 You define a clear owner for everything so that this never 
 happens.
 That's what we do in C++ and shared_ptr is can be avoided as 
 much as we like.

 C++ doesn't have ownership annotations and no checkable notion 
 of a clear owner.

The standard library + type system provides mechanisms for clear 
ownership, but it does not check borrowing. Borrowing is 
potentially unsafe, and the programmer knows it, but it is not a 
big deal.

Mar 04 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 4 March 2015 at 10:50:54 UTC, Walter Bright wrote:
 On 3/4/2015 1:16 AM, bearophile wrote:
 Walter Bright:
 The complexity of a free list doesn't remotely compare to 
 that of adding an
 ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's
 my opinion.

 How do you type an an array of pointers with different owners?

Either the array is owning its elements or is is borrowing them. 
There is no array of element with different owners.

 How do you deal with the combinatoric explosion of template 
 instantiations with all those different ownership types?

It is better than the combinatoric explosion of the code I have 
to write to support the various scheme proposed here (here RC and 
GC are 100% separet worlds).

As long as RC require bookkeeping, the codegen must be different 
anyway.

Mar 04 2015

"ponce" <contact gam3sfrommars.fr> writes:

On Wednesday, 4 March 2015 at 09:16:22 UTC, bearophile wrote:
 Walter Bright:

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

 A sound complete ownership system is the only good enough 
 solution for D. That's my opinion.

 Bye,
 bearophile

For that matters, I'd be happy with an unsound incomplete unsafe 
ownership system :)

scoped ownership feels "right", while RC and GC feel heavyweight.

Mar 04 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Wednesday, 4 March 2015 at 14:21:24 UTC, ponce wrote:
 For that matters, I'd be happy with an unsound incomplete 
 unsafe ownership system :)

You already have that :^)

Mar 04 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 4 March 2015 at 09:06:01 UTC, Walter Bright wrote:
 On 3/4/2015 12:13 AM, deadalnix wrote:

 its simplicity. I
 assume at this point that we have empirical evidence that this 
 is NOT the case.

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

A free list does not work as the data can be live. You cannot 
reuse it to maintain the free list. You need to maintain metadata 
about allocation in another structure.

Also you cannot free anything until all the refcount are to 0. 
This RC system will only cut it for short lived entities.

Mar 04 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 4 March 2015 at 10:03:13 UTC, deadalnix wrote:
 On Wednesday, 4 March 2015 at 09:06:01 UTC, Walter Bright wrote:
 On 3/4/2015 12:13 AM, deadalnix wrote:

 was its simplicity. I
 assume at this point that we have empirical evidence that 
 this is NOT the case.

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

 A free list does not work as the data can be live. You cannot 
 reuse it to maintain the free list. You need to maintain 
 metadata about allocation in another structure.

 Also you cannot free anything until all the refcount are to 0. 
 This RC system will only cut it for short lived entities.

And come on, DIP25, DIP69 and DIP74 to get there. Come on, that 
is not even simpler than the alternative.

Mar 04 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/4/2015 2:06 AM, deadalnix wrote:
 And come on, DIP25, DIP69 and DIP74 to get there. Come on, that is not even
 simpler than the alternative.

DIP69 has pretty much been abandoned.

Mar 04 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/4/2015 2:03 AM, deadalnix wrote:
 A free list does not work as the data can be live.

It is a "to free" list.

 You need to maintain metadata about allocation in
 another structure.

I don't see why.

 Also you cannot free anything until all the refcount are to 0.

Right.

 This RC system will only cut it for short lived entities.

Not a problem if they aren't constantly reassigning the value.

Mar 04 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 4 March 2015 at 10:49:06 UTC, Walter Bright wrote:
 On 3/4/2015 2:03 AM, deadalnix wrote:
 A free list does not work as the data can be live.

 It is a "to free" list.

What I wanted to illustrate is that it is not a free list (which 
use the freed storage to maintain its state) as you cannot 
recycle storage.

 You need to maintain metadata about allocation in
 another structure.

 I don't see why.

Isn't it what the "to free" list is ?

 Also you cannot free anything until all the refcount are to 0.

 Right.

 This RC system will only cut it for short lived entities.

 Not a problem if they aren't constantly reassigning the value.

Having pool of object you recycle is a common strategy to reduce 
allocations when performance is critical. It is for instance 
incompatible with RC as proposed.

Mar 04 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/4/15 2:03 AM, deadalnix wrote:
 A free list does not work as the data can be live. You cannot reuse it
 to maintain the free list.

Actually you can, which is a bit surprising. Consider:

RCArray!int arr;
arr.length = 10;
arr[5] = 42;
fun(arr, arr[5]);
...
void fun(ref RCArray!int a, ref int b) {
     assert(b == 42);
     a = a.init; // array goes to freelist
     a.length = 10; // array may reuse the previous one!
     assert(b == 42);
}

Now the interesting here thing is, the last assert should be allowed to 
fail. The code is still safe because there is no change of type, and the 
use of b after the array is gone is a bug in the application anyway 
because the parent structure is out of existence.

This makes code potentially tighter on memory usage but potentially more 
difficult to debug.


Andrei

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 09:06:01 UTC, Walter Bright wrote:
 On 3/4/2015 12:13 AM, deadalnix wrote:

 its simplicity. I
 assume at this point that we have empirical evidence that this 
 is NOT the case.

 The complexity of a free list doesn't remotely compare to that 
 of adding an ownership system.

My reference safety system has ownership built in, more-or-less 
for free:

http://forum.dlang.org/post/offurllmuxjewizxedab forum.dlang.org

See also my reply to deadalnix:

http://forum.dlang.org/post/oyaoibmwybzfkhhufpow forum.dlang.org

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 08:13:33 UTC, deadalnix wrote:
 On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic 
 wrote:
 That's fine. I like DIP25. It's a start towards stronger 
 safety guarantees. While I'm pretty sure the runtime costs of 
 my proposal are lower than yours, they do require compiler 
 hacking, which means they can wait.

 I don't think that it is fine.

 At this point we need to :
  - Not free anything as long as something is alive.
  - Can't recycle memory.
  - Keep track of allocated chunk to be able to free them (ie 
 implementing malloc on top of malloc).

 It means that RC is attached to an ever growing arena. Code 
 that would manipulate RCArray and append to it on a regular 
 manner must expect some impressive memory consumption.

 Even if we manage to do this in phobos (I'm sure we can) it is 
 pretty much guaranteed at this point that noone else will, at 
 least safely. The benefit is reduced because of the bookeeping 
 that need to be done for memory to be freed in addition to 
 reference count themselves.


 its simplicity. I assume at this point that we have empirical 
 evidence that this is NOT the case.

To me, DIP25 is just the first step towards an ownership system. 
The only language additions you need to it are "out!" parameters, 
to track escapes to other parameters, "static" parameters 
(previously called "noscope"), to say that the parameter won't be 
copied to a global, and one more function attribute (for which I 
can reuse "noscope" as  noscope) which says the return value will 
nto be allocated on the heap. All of these will be rare, as they 
aim to target the exceptional cases rather than the norms 
("scope" would be the norm. Hence " noscope" to target the rare 
cases):

Examples:

T* fun(return T* a, T* b, T**c);

This signature would indicate complete ownership transferred from 
`a` to the return value, since only `a` can be returned (see why 
below)

T* gun(return out!b T* a, T** b);

`a` is declared to be copied both to the return value and to `b`. 
Therefore it is not owned. (If you're following my previous 
definition of `out!` in DIP71, you'll notice I moved `out!` to 
the source parameter rather than the target, but the point is the 
same.)

T* hun(return T* a)  noscope {
   if(something)
     return a;
   else return new T;
}

Again, no ownership. If you *might* return a heap or global, the 
function must be marked  noscope (Again I've readapted the word 
to a new meaning from dIP71. I'm using `static` now for 
`noscope's original meaning.)

Another example:

T* jun(return static T* a) {
   static T* t;
   t = a;
   return a;
}

Again, no ownership, because of the `static` parameter attribute. 
In a previous post, you suggested that such an attribute was 
unnecessary, but an ownership system would require that a given 
parameter `a` which was returned, not also be copied to a global 
at the same time. So `static` tells the compiler this, and thus 
cancels ownership.

My point is that DIP25's `return` parameters are the beginning of 
an ownership system. An option to specify that the function 
*will* return a given `return` parameter as opposed to *might* 
return it is the only thing needed. Hence the additions named 
above. (Also, `pure` functions will need no `static` parameter 
attributes, and functions both `pure` and ` nogc` will not need )

With the exception of some minor cosmetic changes, all this is 
in, or at least hinted at, in my previously posted Reference 
Safety System:

http://forum.dlang.org/post/offurllmuxjewizxedab forum.dlang.org

The only thing which bears reiterating is that with better 
attribute inference, the whole system becomes invisible for most 
uses.

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 17:13:13 UTC, Zach the Mystic wrote:
 (Also, `pure` functions will need no `static` parameter 
 attributes, and functions both `pure` and ` nogc` will not need 
 )

...will not need ` noscope` either.

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 17:13:13 UTC, Zach the Mystic wrote:
 Another example:

 T* jun(return static T* a) {
   static T* t;
   t = a;
   return a;
 }

 Again, no ownership, because of the `static` parameter 
 attribute. In a previous post, you suggested that such an 
 attribute was unnecessary, but an ownership system would 
 require that a given parameter `a` which was returned, not also 
 be copied to a global at the same time. So `static` tells the 
 compiler this, and thus cancels ownership.

Actually, I think you convinced me before that `static` (or 
`noscope`) parameters wouldn't carry their weight. Instead, 
copying a parameter reference to a global variable is unsafe by 
default. Wrap it in a ` trusted` lambda if you know what you're 
doing. (Trusted lambdas are assumed to copy no reference 
parameters.) In this way, you can assume ownership. Any unsafe 
global escapes are just ignored. ???

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 08:13:33 UTC, deadalnix wrote:
 On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic 
 wrote:
 That's fine. I like DIP25. It's a start towards stronger 
 safety guarantees. While I'm pretty sure the runtime costs of 
 my proposal are lower than yours, they do require compiler 
 hacking, which means they can wait.

 I don't think that it is fine.

 At this point we need to :
  - Not free anything as long as something is alive.
  - Can't recycle memory.
  - Keep track of allocated chunk to be able to free them (ie 
 implementing malloc on top of malloc).

Well, I don't want to make any enemies. I thought that once the 
compiler was hacked people could just change their 
deferred-freeing code.

Mar 04 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 20:37:46 UTC, Walter Bright wrote:
 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to 
 global variables too:

     S s;

     void main() {
         s.array = RCArray!T([T()]);   // s.array's refcount is 
 now 1
         foo(s.array[0]);           // pass by ref
     }
     void foo(ref T t) {
         s.array = RCArray!T([]);      // drop the old s.array
         t.doSomething();              // oops, t is gone
     }


So with Andrei's solution, will s.array ever get freed, since s 
is a global? I guess it *should* never get freed, since s is a 
global and it will always exist as a reference.

Which makes me think about a bigger problem... when you opAssign, 
don't you redirect the variable to a different instance? Won't 
the destructor then destroy *that* instance (or not destroy it, 
since it just got a +1 count) instead of the one most recently 
decremented? How does it hold onto the instance to be destroyed?

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 01:23:24 UTC, Zach the Mystic wrote:
 Which makes me think about a bigger problem... when you 
 opAssign, don't you redirect the variable to a different 
 instance? Won't the destructor then destroy *that* instance (or 
 not destroy it, since it just got a +1 count) instead of the 
 one most recently decremented? How does it hold onto the 
 instance to be destroyed?

auto y = new RcStruct;
y = null;

y's old RcStruct gets decremented to zero, but who owns it now? 
Whose destructor ever gets run on it?

Mar 02 2015

Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 3 March 2015 at 06:37, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 3/1/2015 12:51 PM, Michel Fortin wrote:
 That's actually not enough. You'll have to block access to global
 variables too:

      S s;

      void main() {
          s.array = RCArray!T([T()]);   // s.array's refcount is now 1
          foo(s.array[0]);           // pass by ref
      }
      void foo(ref T t) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }


 Thinking about it, there are many other ways this can happen. At the moment,
 I'm stuck thinking of a solution other than requiring foo() to be pure.
 Anyone have ideas?

My immediate impression on this problem:

s.array[0] is being passed to foo from main. s does not belong to main
(is global), and main does not hold have a reference to s.array.
Shouldn't main just need to inc/dec array around the call to foo when
passing un-owned references down the call tree.
It seems to me that there always needs to be a reference _somewhere_
on the stack for anything being passed down the call tree (unless the
function is pure). Seems simplest to capture a stack ref at the top
level, then as it's received as arguments to each callee, it's
effectively owned by those functions and they don't need to worry
anymore.

So, passing global x to some function; inc/dec x around the function
call that it's passed to...? Then the stack has its own reference, and
the global reference can go away safely.

Mar 03 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:
 So, passing global x to some function; inc/dec x around the 
 function
 call that it's passed to...? Then the stack has its own 
 reference, and
 the global reference can go away safely.

Yes, the sane thing to do is to improve the general type system 
and general optimizer. The special casing D is going for will 
lead to no good. New non-trivial special case features just 
punches more holes in the type system. Which is the opposite of 
what is needed to bring D to a stable state.

By trivial I mean syntax sugar, which always is ok, but a type 
system should be general with no special casing in it.

What you need to do is to a way to implement smart pointers as 
non-ref-capable types, and the ability to do moves to transfer 
ownership up the call stack, and good general opimizations in the 
backends for it. AFAIK the current type system is too weak to 
enforce it. The type system also lacks head-const-ref, which 
often is needed for safe "manual optimization"?

The special casing effort is largely wasted because one cannot 
have efficient ARC without whole program/module optimization 
anyway. Swift ARC does better when optimizing larger units. With 
whole program optimization, stronger typing and smart inlining 
the RC performance issues can be reduced more efficiently.

It would be better to focus on ways to tighten the type system 
and how to utilize stronger typing for optimization of larger 
units (whole module/program). Special casing in the language 
makes optimization algorithms harder to write. Long term evil.

Mar 03 2015

"Paulo Pinto" <pjmlp progtools.org> writes:

On Tuesday, 3 March 2015 at 08:59:08 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:
 So, passing global x to some function; inc/dec x around the 
 function
 call that it's passed to...? Then the stack has its own 
 reference, and
 the global reference can go away safely.

 Yes, the sane thing to do is to improve the general type system 
 and general optimizer. The special casing D is going for will 
 lead to no good. New non-trivial special case features just 
 punches more holes in the type system. Which is the opposite of 
 what is needed to bring D to a stable state.

 By trivial I mean syntax sugar, which always is ok, but a type 
 system should be general with no special casing in it.

 What you need to do is to a way to implement smart pointers as 
 non-ref-capable types, and the ability to do moves to transfer 
 ownership up the call stack, and good general opimizations in 
 the backends for it. AFAIK the current type system is too weak 
 to enforce it. The type system also lacks head-const-ref, which 
 often is needed for safe "manual optimization"?

 The special casing effort is largely wasted because one cannot 
 have efficient ARC without whole program/module optimization 
 anyway. Swift ARC does better when optimizing larger units. 
 With whole program optimization, stronger typing and smart 
 inlining the RC performance issues can be reduced more 
 efficiently.

 It would be better to focus on ways to tighten the type system 
 and how to utilize stronger typing for optimization of larger 
 units (whole module/program). Special casing in the language 
 makes optimization algorithms harder to write. Long term evil.

You just reminded me of ParaSail. Here is the latest version of 
their pointer free programming paper.

https://drive.google.com/file/d/0B6Vq5QaY4U7ubm5qVkFpMEtmN2s/view?pli=1


--
Paulo

Mar 03 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Tuesday, 3 March 2015 at 19:50:46 UTC, Paulo Pinto wrote:
 You just reminded me of ParaSail. Here is the latest version of 
 their pointer free programming paper.

 https://drive.google.com/file/d/0B6Vq5QaY4U7ubm5qVkFpMEtmN2s/view?pli=1

Thanks!  Section «5. Related Work» was a fun read. :-)

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:
 My immediate impression on this problem:

 s.array[0] is being passed to foo from main. s does not belong 
 to main
 (is global), and main does not hold have a reference to s.array.
 Shouldn't main just need to inc/dec array around the call to 
 foo when
 passing un-owned references down the call tree.
 It seems to me that there always needs to be a reference 
 _somewhere_
 on the stack for anything being passed down the call tree 
 (unless the
 function is pure). Seems simplest to capture a stack ref at the 
 top
 level, then as it's received as arguments to each callee, it's
 effectively owned by those functions and they don't need to 
 worry
 anymore.

 So, passing global x to some function; inc/dec x around the 
 function
 call that it's passed to...? Then the stack has its own 
 reference, and
 the global reference can go away safely.

This is my position too.

There is another problem being discussed now, however, having to 
do with references to non-rc'd subcomponents of an Rc'd type.

Mar 03 2015

Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 4 March 2015 at 01:07, Zach the Mystic via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:
 My immediate impression on this problem:

 s.array[0] is being passed to foo from main. s does not belong to main
 (is global), and main does not hold have a reference to s.array.
 Shouldn't main just need to inc/dec array around the call to foo when
 passing un-owned references down the call tree.
 It seems to me that there always needs to be a reference _somewhere_
 on the stack for anything being passed down the call tree (unless the
 function is pure). Seems simplest to capture a stack ref at the top
 level, then as it's received as arguments to each callee, it's
 effectively owned by those functions and they don't need to worry
 anymore.

 So, passing global x to some function; inc/dec x around the function
 call that it's passed to...? Then the stack has its own reference, and
 the global reference can go away safely.


 This is my position too.

 There is another problem being discussed now, however, having to do with
 references to non-rc'd subcomponents of an Rc'd type.

Well you can't get to a subcomponent if not through it's owner.
If the question is about passing RC objects members to functions, then
the solution is the same as above, the stack needs a reference to the
parent before it can pass a pointer to it's member down the line for
the same reasons.
The trouble then is what if that member pointer escapes? Well I'd
imagine that it needs to be a scope pointer (I think we all agree RC
relies on scope). So a raw pointer to some member of an RC object must
be scope(*). That it can't escape, combined with knowledge that the
stack has a reference to it's owner, guarantees that it won't
disappear.

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Wednesday, 4 March 2015 at 07:50:50 UTC, Manu wrote:
 Well you can't get to a subcomponent if not through it's owner.
 If the question is about passing RC objects members to 
 functions, then
 the solution is the same as above, the stack needs a reference 
 to the
 parent before it can pass a pointer to it's member down the 
 line for
 the same reasons.

Yeah, or you could mimic such a reference by wrapping the call in 
an addRef/release cycle, as a performance optimization.

 The trouble then is what if that member pointer escapes? Well 
 I'd
 imagine that it needs to be a scope pointer (I think we all 
 agree RC
 relies on scope). So a raw pointer to some member of an RC 
 object must
 be scope(*).

I have a whole Reference Safety System which doesn't need 
explicit scope because it incorporates it implicitly:

http://forum.dlang.org/post/offurllmuxjewizxedab forum.dlang.org

 That it can't escape, combined with knowledge that the
 stack has a reference to it's owner, guarantees that it won't
 disappear.

I think you and I are on the same page.

Mar 04 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 3/1/15 2:21 PM, Walter Bright wrote:
 On 3/1/2015 7:44 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>"
 wrote:
 A weakness of the same kind affects DIP25, too. The core of the
 problem is
 borrowing (ref return as in DIP25), combined with manual (albeit
 hidden) memory
 management. An example to illustrate:

      struct T {
          void doSomething();
      }
      struct S {
          RCArray!T array;
      }
      void main() {
          auto s = S(RCArray!T([T()])); // s.array's refcount is now 1
          foo(s, s.array[0]);           // pass by ref
      }
      void foo(ref S s, ref T T) {
          s.array = RCArray!T([]);      // drop the old s.array
          t.doSomething();              // oops, t is gone
      }


This is an odd example, how does one take a ref to an RCArray element 
without the machinery to retain the array? I would think that RCArray[x] 
would return something that isn't passable to a function. Or am I 
missing something?

 The trouble seems to happen when there are two references to the same
 object passed to a function. I.e. there can be only one "borrowed" ref
 at a time.

Not exactly. Note that we are taking by reference S, which is NOT 
reference counted. So you are passing indirectly a reference to an RC 
object. You aren't "borrowing" that reference.

-Steve

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 12:37 PM, Steven Schveighoffer wrote:
 Not exactly. Note that we are taking by reference S, which is NOT reference
 counted. So you are passing indirectly a reference to an RC object. You aren't
 "borrowing" that reference.

It's seems that to solve the problem, it has to be borrowed, in that a static 
check will be needed that what it points to cannot be modified/deleted through 
other references to the same data.

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 12:42 PM, Walter Bright wrote:
 On 3/2/2015 12:37 PM, Steven Schveighoffer wrote:
 Not exactly. Note that we are taking by reference S, which is NOT reference
 counted. So you are passing indirectly a reference to an RC object. You aren't
 "borrowing" that reference.

 It's seems that to solve the problem, it has to be borrowed, in that a static
 check will be needed that what it points to cannot be modified/deleted through
 other references to the same data.


I looked at how Rust does it. I was hoping it was something clever, but it 
isn't. A copy is made of the object to which a borrowed reference is taken - in 
other words, the reference count is incremented.

For D structs, that means, if there's a postblit, a copy must be made. For D
ref 
counted classes, a ref count increment must be done.

I was hoping to avoid that, but apparently there's no way.

There are cases where this can be avoided, like calling pure functions. Another 
win for pure functions!

Mar 02 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/2/15 12:53 PM, Walter Bright wrote:
 I was hoping to avoid that, but apparently there's no way.

There is! There is! -- Andrei

Mar 02 2015

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 2 March 2015 at 20:54:20 UTC, Walter Bright wrote:
 I looked at how Rust does it. I was hoping it was something 
 clever, but it isn't. A copy is made of the object to which a 
 borrowed reference is taken - in other words, the reference 
 count is incremented.

 For D structs, that means, if there's a postblit, a copy must 
 be made. For D ref counted classes, a ref count increment must 
 be done.

 I was hoping to avoid that, but apparently there's no way.

 There are cases where this can be avoided, like calling pure 
 functions. Another win for pure functions!

I do think you are confusing how Rust does it. In Rust, borrowing 
makes the source and borrowed reference immutable by default. So 
by default the problem do not occurs.

You can also borrow a mutable reference, in which case the source 
is disabled for the duration of the borrowing. That makes it 
impossible to borrow a mutable reference twice.

The above mentioned scenario cannot happen at all in Rust, by 
design. As a result, there is solution for the problem, as the 
problem cannot occur in the first place.

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Monday, 2 March 2015 at 21:12:20 UTC, deadalnix wrote:
 On Monday, 2 March 2015 at 20:54:20 UTC, Walter Bright wrote:
 I looked at how Rust does it. I was hoping it was something 
 clever, but it isn't. A copy is made of the object to which a 
 borrowed reference is taken - in other words, the reference 
 count is incremented.

 For D structs, that means, if there's a postblit, a copy must 
 be made. For D ref counted classes, a ref count increment must 
 be done.

 I was hoping to avoid that, but apparently there's no way.

 There are cases where this can be avoided, like calling pure 
 functions. Another win for pure functions!

 I do think you are confusing how Rust does it. In Rust, 
 borrowing makes the source and borrowed reference immutable by 
 default. So by default the problem do not occurs.

 You can also borrow a mutable reference, in which case the 
 source is disabled for the duration of the borrowing. That 
 makes it impossible to borrow a mutable reference twice.

 The above mentioned scenario cannot happen at all in Rust, by 
 design. As a result, there is solution for the problem, as the 
 problem cannot occur in the first place.

I don't think you were advocating for this but,
+1 for a borrow system similar to Rust.

Mar 02 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Monday, 2 March 2015 at 22:42:44 UTC, weaselcat wrote:
 I don't think you were advocating for this but,
 +1 for a borrow system similar to Rust.

We're working on it ;-)

Mar 03 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 1:12 PM, deadalnix wrote:
 I do think you are confusing how Rust does it. In Rust, borrowing makes the
 source and borrowed reference immutable by default. So by default the problem
do
 not occurs.

May I refer you to:

http://static.rust-lang.org/doc/0.6/tutorial-borrowed-ptr.html#borrowing-managed-boxes-and-rooting

"Again the lifetime of y is L, the remainder of the function body. But there is 
a crucial difference: suppose x were to be reassigned during the lifetime L? If 
the compiler isn't careful, the managed box could become unrooted, and would 
therefore be subject to garbage collection. A heap box that is unrooted is one 
such that no pointer values in the heap point to it. It would violate memory 
safety for the box that was originally assigned to x to be garbage-collected, 
since a non-heap pointer---y---still points into it.
[...]
For this reason, whenever an & expression borrows the interior of a managed box 
stored in a mutable location, the compiler inserts a temporary that ensures
that 
the managed box remains live for the entire lifetime. [...] This process is 
called rooting."


It's possible that this is an old and obsolete document, but it's what I found.

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Monday, 2 March 2015 at 22:51:03 UTC, Walter Bright wrote:
On 3/2/2015 1:12 PM, deadalnix wrote:
I do think you are confusing how Rust does it. In Rust,
borrowing makes the
source and borrowed reference immutable by default. So by
default the problem do
not occurs.

May I refer you to:

http://static.rust-lang.org/doc/0.6/tutorial-borrowed-ptr.html#borrowing-managed-boxes-and-rooting

"Again the lifetime of y is L, the remainder of the function
body. But there is a crucial difference: suppose x were to be
reassigned during the lifetime L? If the compiler isn't
careful, the managed box could become unrooted, and would
therefore be subject to garbage collection. A heap box that is
unrooted is one such that no pointer values in the heap point
to it. It would violate memory safety for the box that was
originally assigned to x to be garbage-collected, since a
non-heap pointer---y---still points into it.
[...]
For this reason, whenever an & expression borrows the interior
of a managed box stored in a mutable location, the compiler
inserts a temporary that ensures that the managed box remains
live for the entire lifetime. [...] This process is called
rooting."

It's possible that this is an old and obsolete document, but
it's what I found.

That's actually very old(0.6 - nearly 2 years ago.) I believe
rooting was dropped from the language completely.

http://doc.rust-lang.org/book/ownership.html

also, rust by example chapter 17-19 cover this
http://rustbyexample.com/move.html

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 3:47 PM, weaselcat wrote:
 That's actually very old(0.6 - nearly 2 years ago.) I believe rooting was
 dropped from the language completely.

 http://doc.rust-lang.org/book/ownership.html

 also, rust by example chapter 17-19 cover this
 http://rustbyexample.com/move.html

Thanks for the info. But I don't see how Marc's example is prevented by this.

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Tuesday, 3 March 2015 at 00:08:10 UTC, Walter Bright wrote:
 On 3/2/2015 3:47 PM, weaselcat wrote:
 That's actually very old(0.6 - nearly 2 years ago.) I believe 
 rooting was
 dropped from the language completely.

 http://doc.rust-lang.org/book/ownership.html

 also, rust by example chapter 17-19 cover this
 http://rustbyexample.com/move.html

 Thanks for the info. But I don't see how Marc's example is 
 prevented by this.

It wouldn't be compilable(it borrows the same object mutably 
twice)
I think so, anyways. I'm not that big on rust.


     During a borrow, the owner may NOT (a) mutate the resource, 
or (b) mutably lend the resource.
     During a mutable borrow, the owner may NOT (a) access the 
resource, or (b) lend the resource.

Mar 02 2015

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 3 March 2015 at 00:08:10 UTC, Walter Bright wrote:
 On 3/2/2015 3:47 PM, weaselcat wrote:
 That's actually very old(0.6 - nearly 2 years ago.) I believe 
 rooting was
 dropped from the language completely.

 http://doc.rust-lang.org/book/ownership.html

 also, rust by example chapter 17-19 cover this
 http://rustbyexample.com/move.html

 Thanks for the info. But I don't see how Marc's example is 
 prevented by this.

After moving resources, the previous owner can no longer be used.

That means you cannot borrow twice, which mean you can't get 
yourself into the mentioned situation.

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be used.

How does that work with the example presented by Marc?

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

He couldn't pass s and a member of s because s is borrowed as 
mutable.
He would have to pass both as immutable.

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Tuesday, 3 March 2015 at 02:04:15 UTC, weaselcat wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

 He couldn't pass s and a member of s because s is borrowed as 
 mutable.
 He would have to pass both as immutable.

Er, I was speaking in terms of Rust of course(it has no 
distinction between const and immutable like D afaik)

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Tuesday, 3 March 2015 at 02:10:23 UTC, weaselcat wrote:
 On Tuesday, 3 March 2015 at 02:04:15 UTC, weaselcat wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

 He couldn't pass s and a member of s because s is borrowed as 
 mutable.
 He would have to pass both as immutable.

 Er, I was speaking in terms of Rust of course(it has no 
 distinction between const and immutable like D afaik)

http://dpaste.com/30WAMVQ

I hope this helps explain it and I didn't butcher it too bad : )
Sorry for the triple reply.

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 6:04 PM, weaselcat wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be used.

 How does that work with the example presented by Marc?

 He couldn't pass s and a member of s because s is borrowed as mutable.
 He would have to pass both as immutable.

A pointer to s could be obtained otherwise and passed.

Mar 02 2015

"Seo Sanghyeon" <sanxiyn gmail.com> writes:

On Tuesday, 3 March 2015 at 05:12:15 UTC, Walter Bright wrote:
 On 3/2/2015 6:04 PM, weaselcat wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

 He couldn't pass s and a member of s because s is borrowed as 
 mutable.
 He would have to pass both as immutable.

 A pointer to s could be obtained otherwise and passed.

No. Rust compiler forbids you from obtaining the pointer
otherwise. Namely, it is a compile time error to make an alias of
a mutably borrowed pointer.

Mar 03 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Tuesday, 3 March 2015 at 05:12:15 UTC, Walter Bright wrote:
 On 3/2/2015 6:04 PM, weaselcat wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

 He couldn't pass s and a member of s because s is borrowed as 
 mutable.
 He would have to pass both as immutable.

 A pointer to s could be obtained otherwise and passed.

Under normal circumstances, if the pointer to s is an lvalue, the 
refcount will be bumped when it is taken.

Isn't the only problem now aliasing something (i.e. a global) 
invisibly through a parameter? This is easily solved -- when 
passing a global reference, or duplicating a variable in the same 
call, wrap the call in an add/release cycle. This preserves the 
alias for the duration of the call.

Or are we also talking about taking the address of a non-rc'd 
subcomponent of an rc'd struct?

Mar 03 2015

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

I'm not sure what you don't understand. The comment in the sample 
code seems clear to me.

// "Move" `a` into `b`
// Here's what happens under the hood: the pointer `a` gets 
copied (*not*
// the data on the heap, just its address) into `b`. Now both are 
pointers
// to the *same* heap allocated data. But now, `b` *owns* the heap
// allocated data; `b` is now in charge of freeing the memory in 
the heap.
let b = a;

Now you have a variable b that owns the data. a is not usable 
anymore.

// After the previous move, `a` can no longer be used
// Error! `a` can no longer access the data, because it no longer 
owns the
// heap memory
//println!("a contains: {}", a);
// TODO ^ Try uncommenting this line

As explained here, if a is used, this is an error.

In the example a is move to b. If you had let b = &a, then b 
would borrow from a. The same way, a would not be usable anymore. 
The difference with move is that, when b is destroyed, in the 
first case memory is freed. in the second case, memory is not 
freed and a is "reenabled".

It is either possible to borrow once and disable who you borrow 
from, or borrow multiple time and everything become immutable for 
the time you borrow.

This makes the problems mentioned in this thread effectively 
impossible happen.

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 9:27 PM, deadalnix wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be used.

 How does that work with the example presented by Marc?

 I'm not sure what you don't understand. The comment in the sample code seems
 clear to me.

 // "Move" `a` into `b`
 // Here's what happens under the hood: the pointer `a` gets copied (*not*
 // the data on the heap, just its address) into `b`. Now both are pointers
 // to the *same* heap allocated data. But now, `b` *owns* the heap
 // allocated data; `b` is now in charge of freeing the memory in the heap.
 let b = a;

 Now you have a variable b that owns the data. a is not usable anymore.

 // After the previous move, `a` can no longer be used
 // Error! `a` can no longer access the data, because it no longer owns the
 // heap memory
 //println!("a contains: {}", a);
 // TODO ^ Try uncommenting this line

 As explained here, if a is used, this is an error.

 In the example a is move to b. If you had let b = &a, then b would borrow from
 a. The same way, a would not be usable anymore. The difference with move is
 that, when b is destroyed, in the first case memory is freed. in the second
 case, memory is not freed and a is "reenabled".

 It is either possible to borrow once and disable who you borrow from, or borrow
 multiple time and everything become immutable for the time you borrow.

 This makes the problems mentioned in this thread effectively impossible happen.

What if 'a' is a field of a struct in some non-trivial data structure? How is 
the compiler going to statically keep track of which field instances in this 
structure are not usable?

Mar 02 2015

"weaselcat" <weaselcat gmail.com> writes:

On Tuesday, 3 March 2015 at 05:40:59 UTC, Walter Bright wrote:
 On 3/2/2015 9:27 PM, deadalnix wrote:
 On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:
 On 3/2/2015 4:40 PM, deadalnix wrote:
 After moving resources, the previous owner can no longer be 
 used.

 How does that work with the example presented by Marc?

 I'm not sure what you don't understand. The comment in the 
 sample code seems
 clear to me.

 // "Move" `a` into `b`
 // Here's what happens under the hood: the pointer `a` gets 
 copied (*not*
 // the data on the heap, just its address) into `b`. Now both 
 are pointers
 // to the *same* heap allocated data. But now, `b` *owns* the 
 heap
 // allocated data; `b` is now in charge of freeing the memory 
 in the heap.
 let b = a;

 Now you have a variable b that owns the data. a is not usable 
 anymore.

 // After the previous move, `a` can no longer be used
 // Error! `a` can no longer access the data, because it no 
 longer owns the
 // heap memory
 //println!("a contains: {}", a);
 // TODO ^ Try uncommenting this line

 As explained here, if a is used, this is an error.

 In the example a is move to b. If you had let b = &a, then b 
 would borrow from
 a. The same way, a would not be usable anymore. The difference 
 with move is
 that, when b is destroyed, in the first case memory is freed. 
 in the second
 case, memory is not freed and a is "reenabled".

 It is either possible to borrow once and disable who you 
 borrow from, or borrow
 multiple time and everything become immutable for the time you 
 borrow.

 This makes the problems mentioned in this thread effectively 
 impossible happen.

 What if 'a' is a field of a struct in some non-trivial data 
 structure? How is the compiler going to statically keep track 
 of which field instances in this structure are not usable?

Borrowing 'a' from a struct would make the parent struct 
immutable during the borrow scope of 'a', I believe.

Mar 02 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/2/2015 9:58 PM, weaselcat wrote:
 Borrowing 'a' from a struct would make the parent struct immutable during the
 borrow scope of 'a', I believe.

Right, now consider that struct is a leaf in a complex graph of data structures.

Mar 03 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Tuesday, 3 March 2015 at 09:05:46 UTC, Walter Bright wrote:
 On 3/2/2015 9:58 PM, weaselcat wrote:
 Borrowing 'a' from a struct would make the parent struct 
 immutable during the
 borrow scope of 'a', I believe.

 Right, now consider that struct is a leaf in a complex graph of 
 data structures.

Then you still cannot have more than one mutable reference to the 
entire graph. Because that is impractical, Rust uses unsafe (i.e. 
 trusted in D speak) accessors that "cast away" the ownership, 
but do so in a way that doesn't violate the guarantees.

For example, the type system doesn't allow you to get mutable 
references to the left and right children of a binary tree node. 
But there can be an accessor method that internally does some 
unsafe magic to return a tuple with mutable references to them, 
annotated with the information that they are mutably borrowed 
from the node. Both child refs are mutable, and the parent node 
is inaccessible as long as they exist.

Mar 03 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/3/15 5:45 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 On Tuesday, 3 March 2015 at 09:05:46 UTC, Walter Bright wrote:
 On 3/2/2015 9:58 PM, weaselcat wrote:
 Borrowing 'a' from a struct would make the parent struct immutable
 during the
 borrow scope of 'a', I believe.

 Right, now consider that struct is a leaf in a complex graph of data
 structures.

 Then you still cannot have more than one mutable reference to the entire
 graph. Because that is impractical, Rust uses unsafe (i.e.  trusted in D
 speak) accessors that "cast away" the ownership, but do so in a way that
 doesn't violate the guarantees.

 For example, the type system doesn't allow you to get mutable references
 to the left and right children of a binary tree node. But there can be
 an accessor method that internally does some unsafe magic to return a
 tuple with mutable references to them, annotated with the information
 that they are mutably borrowed from the node. Both child refs are
 mutable, and the parent node is inaccessible as long as they exist.

Well... the bigger problem is that it's relying on a convention. The 
accessor method needs to be constructed in a particular way that's easy 
to get wrong and that the compiler has no way to check for us.

:o)


Andrei

Mar 03 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Tuesday, 3 March 2015 at 15:03:41 UTC, Andrei Alexandrescu 
wrote:
 On 3/3/15 5:45 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= 
 <schuetzm gmx.net>" wrote:
 On Tuesday, 3 March 2015 at 09:05:46 UTC, Walter Bright wrote:
 On 3/2/2015 9:58 PM, weaselcat wrote:
 Borrowing 'a' from a struct would make the parent struct 
 immutable
 during the
 borrow scope of 'a', I believe.

 Right, now consider that struct is a leaf in a complex graph 
 of data
 structures.

 Then you still cannot have more than one mutable reference to 
 the entire
 graph. Because that is impractical, Rust uses unsafe (i.e. 
  trusted in D
 speak) accessors that "cast away" the ownership, but do so in 
 a way that
 doesn't violate the guarantees.

 For example, the type system doesn't allow you to get mutable 
 references
 to the left and right children of a binary tree node. But 
 there can be
 an accessor method that internally does some unsafe magic to 
 return a
 tuple with mutable references to them, annotated with the 
 information
 that they are mutably borrowed from the node. Both child refs 
 are
 mutable, and the parent node is inaccessible as long as they 
 exist.

 Well... the bigger problem is that it's relying on a 
 convention. The accessor method needs to be constructed in a 
 particular way that's easy to get wrong and that the compiler 
 has no way to check for us.

 :o)

To avoid misunderstandings: It is in reply to a sub-thread where 
Walter asked about how Rust's type system works. This is an 
example for Rust, not for D.

Therefore, your reply isn't really valid. In Rust, it is an 
escape hatch from a fundamentally safe type system, whereas in D 
it would be a necessary convention to make usage of RC safe.

Mar 03 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 3/3/2015 9:19 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net>" wrote:
 Therefore, your reply isn't really valid. In Rust, it is an escape hatch from a
 fundamentally safe type system, whereas in D it would be a necessary convention
 to make usage of RC safe.

My understanding of Rust is that many library types rely on unsafe code under 
the hood to make them work. Rust does have an 'unsafe' mechanism to support
this.

But that's ok. The idea is it must be possible to create a type with a safe 
interface. The internals don't have to be safe.

Mar 03 2015

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 3 March 2015 at 09:05:46 UTC, Walter Bright wrote:
 On 3/2/2015 9:58 PM, weaselcat wrote:
 Borrowing 'a' from a struct would make the parent struct 
 immutable during the
 borrow scope of 'a', I believe.

 Right, now consider that struct is a leaf in a complex graph of 
 data structures.

Rust has several pointer types, so what is gonna happen depend on 
the pointer type. However, either the entry in the graph you have 
owns 'a', in which case it can be disabled as only one owner of a 
exists, so there is no problem tracking it, or the thing is 
borrowed, in which case you can have multiple path to 'a' but you 
cannot borrow yourself (unless it is immutable).

Mar 03 2015

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 2 March 2015 at 22:51:03 UTC, Walter Bright wrote:
 On 3/2/2015 1:12 PM, deadalnix wrote:
 I do think you are confusing how Rust does it. In Rust, 
 borrowing makes the
 source and borrowed reference immutable by default. So by 
 default the problem do
 not occurs.

 May I refer you to:

 http://static.rust-lang.org/doc/0.6/tutorial-borrowed-ptr.html#borrowing-managed-boxes-and-rooting

 "Again the lifetime of y is L, the remainder of the function 
 body. But there is a crucial difference: suppose x were to be 
 reassigned during the lifetime L? If the compiler isn't 
 careful, the managed box could become unrooted, and would 
 therefore be subject to garbage collection. A heap box that is 
 unrooted is one such that no pointer values in the heap point 
 to it. It would violate memory safety for the box that was 
 originally assigned to x to be garbage-collected, since a 
 non-heap pointer---y---still points into it.
 [...]
 For this reason, whenever an & expression borrows the interior 
 of a managed box stored in a mutable location, the compiler 
 inserts a temporary that ensures that the managed box remains 
 live for the entire lifetime. [...] This process is called 
 rooting."


 It's possible that this is an old and obsolete document, but 
 it's what I found.

Yes, this page is obsolete, but it is still very interesting as 
it is a viable solution for us.

Mar 02 2015

"Zach the Mystic" <reachzach gggmail.com> writes:

On Monday, 2 March 2015 at 20:54:20 UTC, Walter Bright wrote:
 On 3/2/2015 12:42 PM, Walter Bright wrote:
 For D structs, that means, if there's a postblit, a copy must 
 be made. For D ref counted classes, a ref count increment must 
 be done.

 I was hoping to avoid that, but apparently there's no way.

 There are cases where this can be avoided, like calling pure 
 functions. Another win for pure functions!

It seems like the most common use case for passing a global to a 
parameter is to process that global in a pure way. You already 
have access to it in an impure function, so you could just access 
it directly. The good news is that from within the passed-to 
function, no further calls will treat it as global.

Mar 02 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Sunday, 1 March 2015 at 19:22:06 UTC, Walter Bright wrote:
 On 3/1/2015 7:44 AM, "Marc =?UTF-8?B?U2Now7x0eiI=?= 
 <schuetzm gmx.net>" wrote:
 A weakness of the same kind affects DIP25, too. The core of 
 the problem is
 borrowing (ref return as in DIP25), combined with manual 
 (albeit hidden) memory
 management. An example to illustrate:

     struct T {
         void doSomething();
     }
     struct S {
         RCArray!T array;
     }
     void main() {
         auto s = S(RCArray!T([T()])); // s.array's refcount is 
 now 1
         foo(s, s.array[0]);           // pass by ref
     }
     void foo(ref S s, ref T T) {
         s.array = RCArray!T([]);      // drop the old s.array
         t.doSomething();              // oops, t is gone
     }

 The trouble seems to happen when there are two references to 
 the same object passed to a function. I.e. there can be only 
 one "borrowed" ref at a time.

 I'm thinking this could be statically disallowed in  safe code.

Yes, it's a classical aliasing problem. Of course, if the 
references can't possible alias because of their types, it's ok. 
However, for non-pure functions, it's always (?) unsafe, because 
they have access to all kinds of global variables. Too bad we 
don't have pure by default :-(

Mar 02 2015

"Sativa" <Sativa Indica.org> writes:

On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
 Walter posted an example implementation of a reference counted 
 array [1], that utilizes the features introduced in DIP25 [2]. 
 Then, in the threads about reference counted objects, several 
 people posted examples [3, 4] that broke the suggested 
 optimization of eliding `opAddRef()`/`opRelease()` calls in 
 certain situations.

 A weakness of the same kind affects DIP25, too. The core of the 
 problem is borrowing (ref return as in DIP25), combined with 
 manual (albeit hidden) memory management. An example to 
 illustrate:


1 >     struct T {
2 >         void doSomething();
3 >     }
4 >     struct S {
5 >         RCArray!T array;
7 >     }
8 >     void main() {
9 >         auto s = S(RCArray!T([T()])); // s.array's refcount is
10> now 1
11>         foo(s, s.array[0]);           // pass by ref

         ++  s.array[0] = myt;             // Would also be invalid
12>     }
13>     void foo(ref S s, ref T T) {
14>         s.array = RCArray!T([]);      // drop the old s.array
15>         t.doSomething();              // oops, t is gone
16>     }



1. Assignment to RC types is not the same as assignments to 
non-RC types.
2. Allocation of RC types is more than just allocating of non-RC 
types.

Using the above example:


    a. Decrements the current RC of s.array(here it is null so no 
RC is used)
    b. Increments the ref count of the allocated RCArray by one.


behavior and not technically required in this case(just pass S). 
Of course, we can't expect such behavior to creep up in complex 
code.


the following:
    a. Decrements the current RC of s.array(here was 1, so now it 
is 0)
    b. Increments the current RC of the newly allocated RC array 
and assigns it to s.array.


of 0(which we can assume to then be unallocated), line 15 becomes 
a run-time error.

---

How to fix?

It seems that the only way to make it work well is to use sort of 
"lazy" ref counting.

Here references are created at the start of functions and 
decremented at the end of functions... rather that in place(which 
then causes problems with things that happen after it).

But this can't work as in the ++ I added. Here it would solve 
your problem but not fix the a generalization of it.

---

Is there a solution? Essentially no. Only at the end of the 
program could we possibly have the last function release all RC 
counted arrays. Since it is the end of the program it is the only 
time we can know that no more RC types will be used.

The problem is analogous to the multiple-inheritance problem.

---

Flow analysis can only help so far(it won't help with dynamically 
allocated types that are allocated depending on some "random" 
factor.

---

Note that since t is pointing to a part of S, technically we 
can't release s until t is released.

So, effectively, S has to be ref counted too! And it's ref count 
must check check T's ref count. If it is non-zero then the it 
can't release itself(or the reference to it's array).

Also, something eventually has to release S(or which is then 
s.array). When? Well, since t depends on S, it is when t is 
released. So we would naturally have to inform t that it should 
too release s when it is released.

----

So, if that sort of makes sense I'll explain the solution: It 
might be a bit fuzzy but I'll try to make things very clear. [I 
will also sometimes conflate the usage of concrete objects and 
their corresponding abstract types. e.g., "T with t". It should 
be clear context which is meant. e.g., If I say "T is released", 
I mean that the object with type T was released. When they need 
to be distinguished between I will do so(mainly in the code 
examples]


A *reference counted* type, from here on designated as RCT, is a 
type that only allows itself to be released to the heap when 
nothing depends on it. [Obviously if anything depends on it when 
after it is released will fail to be fulfilled]

If a RCT does not contains any references to any RCT's we will 
call it a free RCT or FRCT. Such types behave in a simple way. 
They can be free'ed completely immediately. Since FRCT's do not 
contain any references to other RCT's all their references are 
simple references that can be free'ed immediately.

We require that if a type contains a reference to a RCT then it 
too is a RCT. [the relation ContainsRCT(T1,T2) ==> 
ContainsRCT(T2,T1) ==> Both T1 and T2 are either RCT's or not 
RCT's] This isn't a big deal but without it we end up requiring 
all types to be RCT's. Here we allow some types to be not be 
RCT's.


In code,

class T { }   // a standard non-reference counted type.
               // It can't use any RCT's as references. It is your 
basic D type of class.

rcclass FRCT  // rcclass means reference counted class. It is 
identical to your basic
               // D type of class but has different keyword 
designator(which is why we call them free).
{
   int x;
   float[] y
   // etc, FRCT does not point to any RCT's
}

rcclass X { }  // Just another RCT

rcclass RCT
{
     FRCT myFRCT;
     int y;
     RCT foo(ref barRCT);
     X x;
     // etc Basically this is a class that can include other RCT's
}

----

Ultimately we could simply use the basic D class but we would 
have to designate every type as an RCT type... even the basic 
types. I'm trying to avoid some basic issues of that method that 
would derail the discussion. [But for those that want to know: 
one could simply treat all objects in d as rcclass types describe 
above, even basic types and such, but optimize out the reference 
counting part in in the compiler for those basic and FRCT 
types... this then would simply lead back to the method I'm using 
to describe it but simply some of the details were hidden in the 
"simple method" by the compiler]


So, once you have such a type system. The way to deal with the 
complex relationships that exist are to know them:

1. A RCT have the following relationships between it's containing 
types. We will denote T to be a non-RCT type, FRCT to be an FRCT, 
and RCT to be an RCT.

      RCT can contain T's, FRCT's, and/or, RCT's.

      In Code:

      rcclass RCT
      {
          T t;
          FRCT myFRCT;
          RCT myRCT;
      }


In this case, which is the only case we will talk about w.l.o.g), 
when an RCT is released it must attempt to release each of it's 
references... in example above: t, myFRCT, and myRCT. Since t is 
a simple type, it can be released immediately. No RCT's point to 
it because if they did, then T would be an RCT and it isn't. 
Similarly myFRCT can easily be released because it is just a 
complex of things like T's. In this case though there is 
something slightly different about it which is the referencing 
counting part. Essentially it's contained types are simple types 
but it still has a reference counting aspect for itself.


Basically T's and FRCT's can free themselves when they need to. 
This is because they know how to free each reference they contain 
and it can't cause referential problems simply because they do 
not contain such references. This by no means they won't cause 
other parts of code to access invalid references. A pointer to 
such a member of a class can easily try and access the member 
after the object has been freed. But this is not our concern 
because we are only dealing with RCT's. Our goal isn't to have 
everything be RCT's. If that was the case then the T and FRCT 
cases would not be needed since we won't use them. e.g., We might 
want the GC to handle non-RCT and some other method for FRCT's. 
This allows one to build on current D system. But one must 
remember that RCT's only guarantee valid referencing when they 
access other RCT's.


The last sentence above directs where we are going with this: 
Since we have RCT we can make sure they know how to communicate 
with each other. (we are creating a metatype here)

By having the proper communication system here it is possible for 
the types to know when they need to be released.
In fact, it is not hard and several methods can be used, even 
simultaneously.

Axiom: Every RCT has the ability to have any other RCT that 
contains a reference to it, to be able to release it. Also, vice 
versa!

[This requirement essentially allows the very final reference to 
know that it needs to release other dependencies it had and 
allows it to not only clean up itself but everything else it 
needs to clean up]

Once you have the above requirement of RCT's(The implement dealt 
with at a different time), you can lazily release objects in the 
background without issue(fragmentation of the heap is a harder 
matter but still has a solution because of the above requirement).

How do they communicate:

Your example will do:

     rcstruct T
     {
         void doSomething();
     }

     rcstruct S
     {
         T[] array;
     }

     void main()
     {
         auto s = S(new T());  // Here when the compiler creates 
the array of T's S knows T is a RFC.
                               // In this case it can get at the 
hidden communications layer of T and
                               // inform T that it is using it. 
Hence at this point
                               // Both S and T can communicate 
with each other! (not just blind inc/dec of an int
         auto t = s.array[0];

         foo(s, s.array[0]);   // nothing special except foo much 
create an added reference to the two argments
                               // This can be optimized out as 
long as foo returns once for every time it
                               // is called. In this case, At the 
start of foo, the counters will be inc'ed
                               // and at the end they will be 
dec'ed. Always a zero sum outside the function call.
         t.doSomething();      // Skip to foo, come back when done 
with foo! Great, t is still around!
                               // (RC is still zero at this point!)
     }

     void foo(ref S s, ref T t)
     {
         s.array = new T[];    // Here we are reassigning the 
array. No problem! We simply inc the ref count on the new array 
and decment it on the old. Just like normal ARC'ing! The only 
different is that t is not immediately release! Even though it 
has a RC count of 0. Think of it as lazy ARC'ing... or LARC'ing! 
;)
         t.doSomething();      // Great, even though t has an RC 
count of zero, it hasn't been released yet!
     }



So, what is this LARCING?

Essentially instead of s simply dumping t in foo, it informs t 
that it is no longer using it. This sets the RC = 0 !!BUT!! t 
does not free itself immediately! If it does, it leads to the 
problem you have given and also to the extended problem I have 
shown above(calling t.doSomething in main).


If t sticks around though, we are not releasing memory and hence 
what is the point?


Well, as usual this is an engineering problem. We have a few 
cases.


1. The compiler can prove that a RCT goes out of scope and is not 
used any more. In this case t can release itself at the end of 
the function where it is provable that it is no longer used. (In 
your example it is at the end of foo ad in my example it is at 
the end of main) Flow analysis, simple brute force checking, etc 
are used here. Most cases in program will fall under this 
category. Local variables that are not referenced to by global 
variables are proven to be releasable at the end of the function 
call. Hence, many cases will work as simple ARC'ing and not 
require LARC'ing.


2. The compiler cannot prove when a variable is release. This is 
when the type itself has to decide when it is not being used. 
(this the last part of the problem and your example falls under 
it)


In this case, the communication between S and T is crucial. When 
s decides it no longer need t, it informs t. t has a list of 
every reference that is using it(this can just be a pointer to 
the ref
count of the RCT). Having this list, while potentially weighty, 
lets t know when all the references to it are not needed any 
more(sort of chains or links between any other RCT and t have 
been broken).


Once this is done it is a simple matter of t trying to release 
itself every time it goes out of scope.

      a. At some point t will release itself
      b. t will never be able to release itself(this is generally 
due to program error in which a reference was made to t without 
removing the reference when it was done). Note this is why if 
every type is an RCT, prevents dangling(e.g., we would have full 
LARC'ing... or FLARC'ing.). FLARC'ing may not be optimal 
though(all pointers would have to be RCT which would probably be 
quiet inefficient. Also interdependencies are handled easily:


          s.t = t;
		 t.s = s;
          while (!isFree(t) && !isFree(s))
          {
		     s.t = null;
			 t.s = null;
              if (t.refCount == 0)
                 s.release();
              if (s.refCount == 0)
                 t.release();

          }

          The reason that t and s are released in LARC'ing is 
because at the 2nd time through the loop s.release() knows that t 
no longer depend and can release itself immediately(because t.s = 
null set s's *t entry* to 0 on it so now s knows that t doesn't 
depend on it). This is just a prototype that can be used to show 
that any level of nesting between types will eventually resolve 
all releases properly.
		
      c. Alternatively, a GC like memory manager can scan through 
the types and release all types who's reference count = 0 and 
who's dependencies are all invalid(nothing depends on it). It can 
do this in the background without issue and can compact 
consecutively release objects without issue. i.e., the GC knows 
what part of the heap belonged to objects there were ref counted 
and if it is proven they have any references then it can mark 
them as free'ed memory. Consecutive free'ed objects in memory can 
be combined since they never will be used again. This can be done 
behind in parallel and is sort of minor defragmentation. In fact, 
analogous to defragmenting a drive. If you know a file is not 
being used then you can defragment it in the background and move 
the various parts around without issue because you know it is not 
being used. Once it is being uses, you run into issues with 
having to stop the threads that are using that part of the heap. 
(I think this limitation could ultimately be removed for most 
practical cases.)
	
	 In this case the GC would have to have a list of all objects of 
RCT's. Essentially the GC is given to the task of LARC'ing in the 
background.
	

	
Finally, which such an structural approach(Essentially 
implementing the reference list, which is a random access list 
containing all the references(pointers to all classes that use 
the type))  to ref counting becomes pretty stable and in probably 
99%+ of program usage, statically proven to release memory(step a 
above, The compiler inserts many static releases that are 
released immediately and many types will be simple types that 
don't need LARC'ing).


The two issues with this, as always, are speed and memory usage. 
I think it should rather fast(most releases are direct and just 
release themselves since they don't have references. On top of 
that the remaining are mostly FRCT's which can be released nearly 
immediately(they can, but don't until they are informed it is ok).


For the full blown RCT types, the way they release things is in a 
hierarchical manner which can create avalanches of changes. If 
many objects are waiting for one object to be released, then when 
it does, they all could be released immediately. But this chain 
reaction is simply lazily recursive and eventually terminates.


The only real problem is how to deal with un-managed references 
to RCT types. These are references(say pointers) to RCT's that 
may not be "smart"(inform the RFC that it is no longer being 
refered to. A simple pointer to a class that doesn't do any 
communication is the prototypical problem.


The only solution seems to be to just accept it. It is a 
necessary evil. Allow it but require the program to know the 
intricate details. Such pointers are needed for speed and don't 
inherently cause any problems. It's only when they are abused can 
issues creep in.


If one, say, had RCT pointers that are used for some types of 
problems and simple pointers when they are needed, it becomes 
pretty marginal of an issue. (if you are doing driver programming 
and have to have speed and direct access then use the simple 
pointers but it is your just to know if your references are 
working as expected.


In many cases, RCT pointers can be used in many of the cases 
which make further minimize the singular problem case(that of 
non-informative referencing). (We could call such types "Peeping 
Toms"... but as long as they only peek it is ok(not by law, in 
programming ;)).


At this, imagine that such a system was implemented. It would 
provide almost complete automatic memory management coverage with 
releasing resources as they are generally released in the program 
flow. In fact, with a GC(don't you just hate it when your garbage 
man doesn't pick up your trash sometimes? specially if he never 
picks it up) that essentially works parallel, most of the actual 
releasing can be done in the background without ever stopping the 
program. The benefits are: Better real-time behavior, Some 
programs might never need to compact the heap(in fact, if 
intelligent heap allocation is used, probably most programs could 
benefit from it), and debugging is much easier(Essentially more 
like manual management for most objects... just the compiler 
handles for us, but we know when, most likely, an object goes out 
of scope it will be released). Seg faults due to invalid memory 
access will almost be reduced to 0(since the corner case... the 
problem case of simple references is not used as much as all the 
other ways to use objects).


The downside? There really isn't one except memory usage! One can 
come up with special cases that simply don't work well under 
LARC'ing but because it allows one to use non-LARC'ing types, the 
programmer will just have to manage it on is own or a mixed 
combination.


Another great benefit is performance. One may think with all the 
all the extra allocation and deallocation code that the end's of 
functions would be slowed down(do to objects being lazily 
released). This is not necessarily the case because it is 
LARC'ing, the releases can be done later... say outside a 
performant piece of the code. It can even be faster than directly 
releasing it because that could be slow in and of itself, which, 
again, with LARC'ing, we can chose a better time to release the 
object or do it in the background.


Unfortunately, if I'm not mistaken, with FLARC'ing, all classes 
would essentially double in size. This is because if every 
reference the type has is a RCT then it would essentially need to 
store a pointer to the object as to provide a link to it. With 
limited LARC'ing I think one could probably achieve a good 
balance. (for large classes and/or a large number of objects, one 
could just avoid LARC'ing to prevent the overhead and doing it 
manually or with GC)

Some codetalk(tm) of how the interaction goes:

      Object s is being initialized. It is a RFC type. It has, as 
a member, a object x that is also a RFC type.  In the course of s 
being initialized, it initializes x. We'll assume it is not null.
	 To do this s creates the object for x then informs x that it 
has a reference to it(we could have multiple types of references 
for further optimization if we want). e.g.,
	
	     x = new X;
	     x.CreateReference(s);
		 s.CreateReferenec(x); // We use directionless binding 
references
		
		
     Now, suppose at some point s is being released. At the end of 
scope the compiler inserts
	
	     s.Release();
		
     S now tries to release itself. It checks it's ref count and 
the list of all references it has.
	
	     e.g.,
		 refCount = 0;
		 refList = {}
		
		 then in s.Release()
		
			if (s.refCount == 0 && s.refList.IsEmpty)
			{
				GC.RemoveRCT(s); // GC.RemoveRCT is not necessary needed but 
using it hear for expressive purposes.
				s.Free();
			}
			else
				GC.AddRCT(s);   // Here, since the compiler potentially isn't 
going insert another s.Release() outside the function we have to 
let the GC handle it
			
		 in this case s free's itself immediately.
		
		 If either refCount != 0 or refList != empty, the GC is given 
control(it is made aware that S needs to be periodically checked 
to see if it can be released.(the GC simply does the
		 if (s.refCount == 0 && s.refList.IsEmpty) s.Free(); in a 
parallel thread. If s.refCount == 0 and s.refList is empty at 
some point then the GC ends up free'ing s.
		
		 Also, the compiler may be able to insert another s.Release() 
in the outer scope of the function all if it releases that it s 
was probably not released before. This doesn't hurt anything 
except an extra check that might be wasted(profiling could remove 
this check or even insert it).
		
		
		 The only way s can't be released is if it's refCount and 
refList are not empty. This is "danling" references and can only 
happen when mixing non-RCT's with RCT's.
		
		 So we know how s should release it self but how does t know 
that s no longer needs it? (so t can release itself at some point)
		
		 well, this has to do when x is assigned to. e.g.,
		
		 s.x = new X;
		 // Hidden, we have s update it's ref list and inform both the 
hold x and the new x what is going on.
		
		 here the assignment will need to remove the old x from refList 
and add the new x. This keeps the the list up to date. (there is 
a big corner case here that I believe would never occur but will 
have to think about it more)
		
		
		
Any ways, I'll leave it at that for now!

Mar 02 2015

"Biker" <biker outlaw.net> writes:

On Tuesday, 3 March 2015 at 07:17:11 UTC, Sativa wrote:
 On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
 Walter posted an example implementation of a reference counted 
 array [1], that utilizes the features introduced in DIP25 [2]. 
 Then, in the threads about reference counted objects, several 
 people posted examples [3, 4] that broke the suggested 
 optimization of eliding `opAddRef()`/`opRelease()` calls in 
 certain situations.

 A weakness of the same kind affects DIP25, too. The core of 
 the problem is borrowing (ref return as in DIP25), combined 
 with manual (albeit hidden) memory management. An example to 
 illustrate:


 1 >     struct T {
 2 >         void doSomething();
 3 >     }
 4 >     struct S {
 5 >         RCArray!T array;
 7 >     }
 8 >     void main() {
 9 >         auto s = S(RCArray!T([T()])); // s.array's refcount 
 is
 10> now 1
 11>         foo(s, s.array[0]);           // pass by ref

         ++  s.array[0] = myt;             // Would also be 
 invalid
 12>     }
 13>     void foo(ref S s, ref T T) {
 14>         s.array = RCArray!T([]);      // drop the old 
 s.array
 15>         t.doSomething();              // oops, t is gone
 16>     }



 1. Assignment to RC types is not the same as assignments to 
 non-RC types.
 2. Allocation of RC types is more than just allocating of 
 non-RC types.

 Using the above example:


    a. Decrements the current RC of s.array(here it is null so 
 no RC is used)
    b. Increments the ref count of the allocated RCArray by one.


 behavior and not technically required in this case(just pass 
 S). Of course, we can't expect such behavior to creep up in 
 complex code.


 the following:
    a. Decrements the current RC of s.array(here was 1, so now 
 it is 0)
    b. Increments the current RC of the newly allocated RC array 
 and assigns it to s.array.


 count of 0(which we can assume to then be unallocated), line 15 
 becomes a run-time error.

 ---

 How to fix?

 It seems that the only way to make it work well is to use sort 
 of "lazy" ref counting.

 Here references are created at the start of functions and 
 decremented at the end of functions... rather that in 
 place(which then causes problems with things that happen after 
 it).

 But this can't work as in the ++ I added. Here it would solve 
 your problem but not fix the a generalization of it.

 ---

 Is there a solution? Essentially no. Only at the end of the 
 program could we possibly have the last function release all RC 
 counted arrays. Since it is the end of the program it is the 
 only time we can know that no more RC types will be used.

 The problem is analogous to the multiple-inheritance problem.

 ---

 Flow analysis can only help so far(it won't help with 
 dynamically allocated types that are allocated depending on 
 some "random" factor.

 ---

 Note that since t is pointing to a part of S, technically we 
 can't release s until t is released.

 So, effectively, S has to be ref counted too! And it's ref 
 count must check check T's ref count. If it is non-zero then 
 the it can't release itself(or the reference to it's array).

 Also, something eventually has to release S(or which is then 
 s.array). When? Well, since t depends on S, it is when t is 
 released. So we would naturally have to inform t that it should 
 too release s when it is released.

 ----

 So, if that sort of makes sense I'll explain the solution: It 
 might be a bit fuzzy but I'll try to make things very clear. [I 
 will also sometimes conflate the usage of concrete objects and 
 their corresponding abstract types. e.g., "T with t". It should 
 be clear context which is meant. e.g., If I say "T is 
 released", I mean that the object with type T was released. 
 When they need to be distinguished between I will do so(mainly 
 in the code examples]


 A *reference counted* type, from here on designated as RCT, is 
 a type that only allows itself to be released to the heap when 
 nothing depends on it. [Obviously if anything depends on it 
 when after it is released will fail to be fulfilled]

 If a RCT does not contains any references to any RCT's we will 
 call it a free RCT or FRCT. Such types behave in a simple way. 
 They can be free'ed completely immediately. Since FRCT's do not 
 contain any references to other RCT's all their references are 
 simple references that can be free'ed immediately.

 We require that if a type contains a reference to a RCT then it 
 too is a RCT. [the relation ContainsRCT(T1,T2) ==> 
 ContainsRCT(T2,T1) ==> Both T1 and T2 are either RCT's or not 
 RCT's] This isn't a big deal but without it we end up requiring 
 all types to be RCT's. Here we allow some types to be not be 
 RCT's.


 In code,

 class T { }   // a standard non-reference counted type.
               // It can't use any RCT's as references. It is 
 your basic D type of class.

 rcclass FRCT  // rcclass means reference counted class. It is 
 identical to your basic
               // D type of class but has different keyword 
 designator(which is why we call them free).
 {
   int x;
   float[] y
   // etc, FRCT does not point to any RCT's
 }

 rcclass X { }  // Just another RCT

 rcclass RCT
 {
     FRCT myFRCT;
     int y;
     RCT foo(ref barRCT);
     X x;
     // etc Basically this is a class that can include other 
 RCT's
 }

 ----

 Ultimately we could simply use the basic D class but we would 
 have to designate every type as an RCT type... even the basic 
 types. I'm trying to avoid some basic issues of that method 
 that would derail the discussion. [But for those that want to 
 know: one could simply treat all objects in d as rcclass types 
 describe above, even basic types and such, but optimize out the 
 reference counting part in in the compiler for those basic and 
 FRCT types... this then would simply lead back to the method 
 I'm using to describe it but simply some of the details were 
 hidden in the "simple method" by the compiler]


 So, once you have such a type system. The way to deal with the 
 complex relationships that exist are to know them:

 1. A RCT have the following relationships between it's 
 containing types. We will denote T to be a non-RCT type, FRCT 
 to be an FRCT, and RCT to be an RCT.

      RCT can contain T's, FRCT's, and/or, RCT's.

      In Code:

      rcclass RCT
      {
          T t;
          FRCT myFRCT;
          RCT myRCT;
      }


 In this case, which is the only case we will talk about 
 w.l.o.g), when an RCT is released it must attempt to release 
 each of it's references... in example above: t, myFRCT, and 
 myRCT. Since t is a simple type, it can be released 
 immediately. No RCT's point to it because if they did, then T 
 would be an RCT and it isn't. Similarly myFRCT can easily be 
 released because it is just a complex of things like T's. In 
 this case though there is something slightly different about it 
 which is the referencing counting part. Essentially it's 
 contained types are simple types but it still has a reference 
 counting aspect for itself.


 Basically T's and FRCT's can free themselves when they need to. 
 This is because they know how to free each reference they 
 contain and it can't cause referential problems simply because 
 they do not contain such references. This by no means they 
 won't cause other parts of code to access invalid references. A 
 pointer to such a member of a class can easily try and access 
 the member after the object has been freed. But this is not our 
 concern because we are only dealing with RCT's. Our goal isn't 
 to have everything be RCT's. If that was the case then the T 
 and FRCT cases would not be needed since we won't use them. 
 e.g., We might want the GC to handle non-RCT and some other 
 method for FRCT's. This allows one to build on current D 
 system. But one must remember that RCT's only guarantee valid 
 referencing when they access other RCT's.


 The last sentence above directs where we are going with this: 
 Since we have RCT we can make sure they know how to communicate 
 with each other. (we are creating a metatype here)

 By having the proper communication system here it is possible 
 for the types to know when they need to be released.
 In fact, it is not hard and several methods can be used, even 
 simultaneously.

 Axiom: Every RCT has the ability to have any other RCT that 
 contains a reference to it, to be able to release it. Also, 
 vice versa!

 [This requirement essentially allows the very final reference 
 to know that it needs to release other dependencies it had and 
 allows it to not only clean up itself but everything else it 
 needs to clean up]

 Once you have the above requirement of RCT's(The implement 
 dealt with at a different time), you can lazily release objects 
 in the background without issue(fragmentation of the heap is a 
 harder matter but still has a solution because of the above 
 requirement).

 How do they communicate:

 Your example will do:

     rcstruct T
     {
         void doSomething();
     }

     rcstruct S
     {
         T[] array;
     }

     void main()
     {
         auto s = S(new T());  // Here when the compiler creates 
 the array of T's S knows T is a RFC.
                               // In this case it can get at the 
 hidden communications layer of T and
                               // inform T that it is using it. 
 Hence at this point
                               // Both S and T can communicate 
 with each other! (not just blind inc/dec of an int
         auto t = s.array[0];

         foo(s, s.array[0]);   // nothing special except foo 
 much create an added reference to the two argments
                               // This can be optimized out as 
 long as foo returns once for every time it
                               // is called. In this case, At 
 the start of foo, the counters will be inc'ed
                               // and at the end they will be 
 dec'ed. Always a zero sum outside the function call.
         t.doSomething();      // Skip to foo, come back when 
 done with foo! Great, t is still around!
                               // (RC is still zero at this 
 point!)
     }

     void foo(ref S s, ref T t)
     {
         s.array = new T[];    // Here we are reassigning the 
 array. No problem! We simply inc the ref count on the new array 
 and decment it on the old. Just like normal ARC'ing! The only 
 different is that t is not immediately release! Even though it 
 has a RC count of 0. Think of it as lazy ARC'ing... or 
 LARC'ing! ;)
         t.doSomething();      // Great, even though t has an RC 
 count of zero, it hasn't been released yet!
     }



 So, what is this LARCING?

 Essentially instead of s simply dumping t in foo, it informs t 
 that it is no longer using it. This sets the RC = 0 !!BUT!! t 
 does not free itself immediately! If it does, it leads to the 
 problem you have given and also to the extended problem I have 
 shown above(calling t.doSomething in main).


 If t sticks around though, we are not releasing memory and 
 hence what is the point?


 Well, as usual this is an engineering problem. We have a few 
 cases.


 1. The compiler can prove that a RCT goes out of scope and is 
 not used any more. In this case t can release itself at the end 
 of the function where it is provable that it is no longer used. 
 (In your example it is at the end of foo ad in my example it is 
 at the end of main) Flow analysis, simple brute force checking, 
 etc are used here. Most cases in program will fall under this 
 category. Local variables that are not referenced to by global 
 variables are proven to be releasable at the end of the 
 function call. Hence, many cases will work as simple ARC'ing 
 and not require LARC'ing.


 2. The compiler cannot prove when a variable is release. This 
 is when the type itself has to decide when it is not being 
 used. (this the last part of the problem and your example falls 
 under it)


 In this case, the communication between S and T is crucial. 
 When s decides it no longer need t, it informs t. t has a list 
 of every reference that is using it(this can just be a pointer 
 to the ref
 count of the RCT). Having this list, while potentially weighty, 
 lets t know when all the references to it are not needed any 
 more(sort of chains or links between any other RCT and t have 
 been broken).


 Once this is done it is a simple matter of t trying to release 
 itself every time it goes out of scope.

      a. At some point t will release itself
      b. t will never be able to release itself(this is 
 generally due to program error in which a reference was made to 
 t without removing the reference when it was done). Note this 
 is why if every type is an RCT, prevents dangling(e.g., we 
 would have full LARC'ing... or FLARC'ing.). FLARC'ing may not 
 be optimal though(all pointers would have to be RCT which would 
 probably be quiet inefficient. Also interdependencies are 
 handled easily:


          s.t = t;
 		 t.s = s;
          while (!isFree(t) && !isFree(s))
          {
 		     s.t = null;
 			 t.s = null;
              if (t.refCount == 0)
                 s.release();
              if (s.refCount == 0)
                 t.release();

          }

          The reason that t and s are released in LARC'ing is 
 because at the 2nd time through the loop s.release() knows that 
 t no longer depend and can release itself immediately(because 
 t.s = null set s's *t entry* to 0 on it so now s knows that t 
 doesn't depend on it). This is just a prototype that can be 
 used to show that any level of nesting between types will 
 eventually resolve all releases properly.
 		
      c. Alternatively, a GC like memory manager can scan 
 through the types and release all types who's reference count = 
 0 and who's dependencies are all invalid(nothing depends on 
 it). It can do this in the background without issue and can 
 compact consecutively release objects without issue. i.e., the 
 GC knows what part of the heap belonged to objects there were 
 ref counted and if it is proven they have any references then 
 it can mark them as free'ed memory. Consecutive free'ed objects 
 in memory can be combined since they never will be used again. 
 This can be done behind in parallel and is sort of minor 
 defragmentation. In fact, analogous to defragmenting a drive. 
 If you know a file is not being used then you can defragment it 
 in the background and move the various parts around without 
 issue because you know it is not being used. Once it is being 
 uses, you run into issues with having to stop the threads that 
 are using that part of the heap. (I think this limitation could 
 ultimately be removed for most practical cases.)
 	
 	 In this case the GC would have to have a list of all objects 
 of RCT's. Essentially the GC is given to the task of LARC'ing 
 in the background.
 	

 	
 Finally, which such an structural approach(Essentially 
 implementing the reference list, which is a random access list 
 containing all the references(pointers to all classes that use 
 the type))  to ref counting becomes pretty stable and in 
 probably 99%+ of program usage, statically proven to release 
 memory(step a above, The compiler inserts many static releases 
 that are released immediately and many types will be simple 
 types that don't need LARC'ing).


 The two issues with this, as always, are speed and memory 
 usage. I think it should rather fast(most releases are direct 
 and just release themselves since they don't have references. 
 On top of that the remaining are mostly FRCT's which can be 
 released nearly immediately(they can, but don't until they are 
 informed it is ok).


 For the full blown RCT types, the way they release things is in 
 a hierarchical manner which can create avalanches of changes. 
 If many objects are waiting for one object to be released, then 
 when it does, they all could be released immediately. But this 
 chain reaction is simply lazily recursive and eventually 
 terminates.


 The only real problem is how to deal with un-managed references 
 to RCT types. These are references(say pointers) to RCT's that 
 may not be "smart"(inform the RFC that it is no longer being 
 refered to. A simple pointer to a class that doesn't do any 
 communication is the prototypical problem.


 The only solution seems to be to just accept it. It is a 
 necessary evil. Allow it but require the program to know the 
 intricate details. Such pointers are needed for speed and don't 
 inherently cause any problems. It's only when they are abused 
 can issues creep in.


 If one, say, had RCT pointers that are used for some types of 
 problems and simple pointers when they are needed, it becomes 
 pretty marginal of an issue. (if you are doing driver 
 programming and have to have speed and direct access then use 
 the simple pointers but it is your just to know if your 
 references are working as expected.


 In many cases, RCT pointers can be used in many of the cases 
 which make further minimize the singular problem case(that of 
 non-informative referencing). (We could call such types 
 "Peeping Toms"... but as long as they only peek it is ok(not by 
 law, in programming ;)).


 At this, imagine that such a system was implemented. It would 
 provide almost complete automatic memory management coverage 
 with releasing resources as they are generally released in the 
 program flow. In fact, with a GC(don't you just hate it when 
 your garbage man doesn't pick up your trash sometimes? 
 specially if he never picks it up) that essentially works 
 parallel, most of the actual releasing can be done in the 
 background without ever stopping the program. The benefits are: 
 Better real-time behavior, Some programs might never need to 
 compact the heap(in fact, if intelligent heap allocation is 
 used, probably most programs could benefit from it), and 
 debugging is much easier(Essentially more like manual 
 management for most objects... just the compiler handles for 
 us, but we know when, most likely, an object goes out of scope 
 it will be released). Seg faults due to invalid memory access 
 will almost be reduced to 0(since the corner case... the 
 problem case of simple references is not used as much as all 
 the other ways to use objects).


 The downside? There really isn't one except memory usage! One 
 can come up with special cases that simply don't work well 
 under LARC'ing but because it allows one to use non-LARC'ing 
 types, the programmer will just have to manage it on is own or 
 a mixed combination.


 Another great benefit is performance. One may think with all 
 the all the extra allocation and deallocation code that the 
 end's of functions would be slowed down(do to objects being 
 lazily released). This is not necessarily the case because it 
 is LARC'ing, the releases can be done later... say outside a 
 performant piece of the code. It can even be faster than 
 directly releasing it because that could be slow in and of 
 itself, which, again, with LARC'ing, we can chose a better time 
 to release the object or do it in the background.


 Unfortunately, if I'm not mistaken, with FLARC'ing, all classes 
 would essentially double in size. This is because if every 
 reference the type has is a RCT then it would essentially need 
 to store a pointer to the object as to provide a link to it. 
 With limited LARC'ing I think one could probably achieve a good 
 balance. (for large classes and/or a large number of objects, 
 one could just avoid LARC'ing to prevent the overhead and doing 
 it manually or with GC)

 Some codetalk(tm) of how the interaction goes:

      Object s is being initialized. It is a RFC type. It has, 
 as a member, a object x that is also a RFC type.  In the course 
 of s being initialized, it initializes x. We'll assume it is 
 not null.
 	 To do this s creates the object for x then informs x that it 
 has a reference to it(we could have multiple types of 
 references for further optimization if we want). e.g.,
 	
 	     x = new X;
 	     x.CreateReference(s);
 		 s.CreateReferenec(x); // We use directionless binding 
 references
 		
 		
     Now, suppose at some point s is being released. At the end 
 of scope the compiler inserts
 	
 	     s.Release();
 		
     S now tries to release itself. It checks it's ref count and 
 the list of all references it has.
 	
 	     e.g.,
 		 refCount = 0;
 		 refList = {}
 		
 		 then in s.Release()
 		
 			if (s.refCount == 0 && s.refList.IsEmpty)
 			{
 				GC.RemoveRCT(s); // GC.RemoveRCT is not necessary needed 
 but using it hear for expressive purposes.
 				s.Free();
 			}
 			else
 				GC.AddRCT(s);   // Here, since the compiler potentially 
 isn't going insert another s.Release() outside the function we 
 have to let the GC handle it
 			
 		 in this case s free's itself immediately.
 		
 		 If either refCount != 0 or refList != empty, the GC is given 
 control(it is made aware that S needs to be periodically 
 checked to see if it can be released.(the GC simply does the
 		 if (s.refCount == 0 && s.refList.IsEmpty) s.Free(); in a 
 parallel thread. If s.refCount == 0 and s.refList is empty at 
 some point then the GC ends up free'ing s.
 		
 		 Also, the compiler may be able to insert another s.Release() 
 in the outer scope of the function all if it releases that it s 
 was probably not released before. This doesn't hurt anything 
 except an extra check that might be wasted(profiling could 
 remove this check or even insert it).
 		
 		
 		 The only way s can't be released is if it's refCount and 
 refList are not empty. This is "danling" references and can 
 only happen when mixing non-RCT's with RCT's.
 		
 		 So we know how s should release it self but how does t know 
 that s no longer needs it? (so t can release itself at some 
 point)
 		
 		 well, this has to do when x is assigned to. e.g.,
 		
 		 s.x = new X;
 		 // Hidden, we have s update it's ref list and inform both 
 the hold x and the new x what is going on.
 		
 		 here the assignment will need to remove the old x from 
 refList and add the new x. This keeps the the list up to date. 
 (there is a big corner case here that I believe would never 
 occur but will have to think about it more)
 		
 		
 		
 Any ways, I'll leave it at that for now!

tldr; ``I don't know what I'm talking about.``

Mar 03 2015

D Programming

C/C++ Programming

Other

digitalmars.D - RCArray is unsafe