digitalmars.D - Newbie initial comments on D language

digitalmars.D - Newbie initial comments on D language - scope

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

The 'scope' mechanism for RAII is a nice compromise for an intractable
general problem in GC languages, and I can see Walter Bright has
possibly been influenced by other GC languages that have sought to
address this issue. But a couple of areas seem really dubious to me.

The first is the necessity of using an already scoped class by repeating
the 'scope' declaration when creating an object of that class. Since the
class itself has already been declared with the 'scope' keyword it seems
absolutely redundant that the user of an object of the class must repeat
'scope' in his usage of that object. Surely the compiler is smart enough
to know that the class is a 'scope' class and will generate the
necessary code to automatically call the destructor of the class when it
goes out of scope. In fact the user of this class via an instantiated
object should not even care if it is a scoped class or not, so having to
say it is again seems doubly wrong, although allowable.

The second is that an object of a 'scope' class can only be instantiated
as a local variable of a function. That pretty much destroys the usage
of a 'scope' class ( aka a class encapsulating a resource which should
be released as soon as it is no longer referenced ) to the most narrow
of usages and means that nobody will bother actually creating such a
class for using RAII in D. Surely a 'scope' class should be instantiable
anywhere, with the obvious candidate being as a data member in an
enclosing class, which itself may or may not be scoped.

The usage of the 'scope' keyword still would have a very important
function if it is designated to force RAII on an instantiated object
which would not ordinarily be scoped. This could occur most naturally
when the programmer is creating any container which may have scoped
objects in it, including the D versions of static and dynamic arrays and
associative arrays. In this way both the class designer can implement
RAII in his class and the programmer can implement RAII on their objects
independently of each other, with both having the necessary control to
solve the resource problem as far as the idea of a scope allows.

Jan 28 2008

Sean Kelly <sean f4.ca> writes:

Edward Diener wrote:
 The 'scope' mechanism for RAII is a nice compromise for an intractable
 general problem in GC languages, and I can see Walter Bright has
 possibly been influenced by other GC languages that have sought to
 address this issue. But a couple of areas seem really dubious to me.
 
 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

I disagree.  If the "scope" were not present at the point of
declaration, I think it would be too easy for a maintainer of the code
to screw up and return the scoped value.  Requiring "scope" is akin to
C++ having different declaration semantics for dynamic and static types.

 The second is that an object of a 'scope' class can only be instantiated
 as a local variable of a function. That pretty much destroys the usage
 of a 'scope' class ( aka a class encapsulating a resource which should
 be released as soon as it is no longer referenced ) to the most narrow
 of usages and means that nobody will bother actually creating such a
 class for using RAII in D. Surely a 'scope' class should be instantiable
 anywhere, with the obvious candidate being as a data member in an
 enclosing class, which itself may or may not be scoped.

Walter has mentioned in the past that he was considering doing exactly
this, but like many other things I think it's been on the back-burner
while const was sorted out in 2.0.


Sean

Jan 28 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Sean Kelly wrote:
 Edward Diener wrote:
 The 'scope' mechanism for RAII is a nice compromise for an intractable
 general problem in GC languages, and I can see Walter Bright has
 possibly been influenced by other GC languages that have sought to
 address this issue. But a couple of areas seem really dubious to me.

 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

 
 I disagree.  If the "scope" were not present at the point of
 declaration, I think it would be too easy for a maintainer of the code
 to screw up and return the scoped value.  Requiring "scope" is akin to
 C++ having different declaration semantics for dynamic and static types.

I do not understand what you mean by "return the scoped value". If in D 
I write:

scope class Foo { ... }

then why should I have to write, when declaring an instance of the class:

scope Foo g = new Foo();

as opposed to just:

Foo g = new Foo();

The compiler knows that Foo is a scoped class, so there is no need for 
the programmer to repeat it in the object declaration.

 
 The second is that an object of a 'scope' class can only be instantiated
 as a local variable of a function. That pretty much destroys the usage
 of a 'scope' class ( aka a class encapsulating a resource which should
 be released as soon as it is no longer referenced ) to the most narrow
 of usages and means that nobody will bother actually creating such a
 class for using RAII in D. Surely a 'scope' class should be instantiable
 anywhere, with the obvious candidate being as a data member in an
 enclosing class, which itself may or may not be scoped.

 
 Walter has mentioned in the past that he was considering doing exactly
 this, but like many other things I think it's been on the back-burner
 while const was sorted out in 2.0.

I understand.

Jan 28 2008

Jesse Phillips <jessekphillips gmail.com> writes:

On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:

 
 I do not understand what you mean by "return the scoped value". If in D
 I write:
 
 scope class Foo { ... }
 
 then why should I have to write, when declaring an instance of the
 class:
 
 scope Foo g = new Foo();
 
 as opposed to just:
 
 Foo g = new Foo();
 
 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

He is referring to when you have:

scope class Foo {}

Foo doThings() {
   Foo cats = new Foo();

   return cats;
}

cats no longer exists after return.

Jan 28 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Jesse Phillips wrote:
 On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:
 
 I do not understand what you mean by "return the scoped value". If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 
 He is referring to when you have:
 
 scope class Foo {}
 
 Foo doThings() {
    Foo cats = new Foo();
 
    return cats;
 }
 
 cats no longer exists after return.

Yes, I can see that. My own idea of a 'scope' class in a GC environment, 
one that completely solves the RAII conundrum, is that one should be 
able to pass around an object of that class and when the last reference 
to that object goes out of scope the destructor is immediately called.

That is very much like what boost::shared_ptr<T> offers for C++ in a 
language which does not have GC, but it is probably harder to implement 
in a GC language where such checks are ordinarily not made when an 
object goes out of scope.

Given that the 'scope' class can not be passed around when it leaves the 
block in which it is created, the above would lead to an error. But I do 
not see how that affects my initial observation that one should not have 
to specify the 'scope' keyword on an object of a 'scope' class when 
declaring such an object, unless I misunderstand the use of 'scope' in 
that situation. Are you saying that without specifying 'scope' for an 
object of a 'scope' class the object does not behave as a 'scope' object 
and that therefore the above example you give does not destroy the 
object when the doThings function exits ? If that is the case, then I 
missed the ramifications of using 'scope' when applied to object 
declarations themselves.

Jan 28 2008

BCS <ao pathlink.com> writes:

Reply to Edward,

 Yes, I can see that. My own idea of a 'scope' class in a GC
 environment, one that completely solves the RAII conundrum, is that
 one should be able to pass around an object of that class and when the
 last reference to that object goes out of scope the destructor is
 immediately called.
 

IIRC scope just destructs the class on exiting the scope. With some more 
overhead it might be possible to do better, but it would be much more complex. 
I'd say "not now".
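
To make the "destructs on exiting the scope" behaviour concrete, here is a minimal sketch. The names Resource, events and useResource are made up for illustration and do not come from the thread, and the example puts 'scope' on the declaration only, which is the form I would expect a current compiler to accept.

```d
import std.stdio : writeln;

// `events` is a hypothetical log, used only to make the ordering visible.
string[] events;

// `Resource` is an illustrative class, not one from the thread.
class Resource
{
    this()  { events ~= "acquire"; }
    ~this() { events ~= "release"; }
}

void useResource()
{
    // `scope` at the declaration: the destructor runs when this block exits.
    scope r = new Resource();
    events ~= "work";
    // Returning `r` from here is the mistake discussed above.
}

void main()
{
    useResource();
    events ~= "after";
    writeln(events);
}
```

The point of the sketch is the ordering: "release" is recorded before "after", so the destructor has already run by the time useResource returns, without waiting for a collection cycle.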

Jan 28 2008

Leandro Lucarella <llucax gmail.com> writes:

Edward Diener, on January 28 at 23:07, you wrote me:
 Jesse Phillips wrote:
 On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:
 I do not understand what you mean by "return the scoped value". If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 He is referring to when you have:

 scope class Foo {}

 Foo doThings() {
    Foo cats = new Foo();
    return cats;
 }

 cats no longer exists after return.

 
 Yes, I can see that. My own idea of a 'scope' class in a GC environment, one
 that completely solves the RAII conundrum, is that one should be able to pass
 around an object of that class and when the last reference to that object goes
 out of scope the destructor is immediately called.
 
 That is very much like what boost::shared_ptr<T> offers for C++ in a language
 which does not have GC, but it is probably harder to implement in a GC language
 where such checks are ordinarily not made when an object goes out of scope.

Exactly, that's a reference count, another way of doing *GC*, and that has
a lot of other complexities (like circular dependencies) and overhead (the
counting itself). Scope is much simpler, and as the name says, it destroys
an object when it's out of scope, just like C++ does with any object
allocated in the stack.

Anyway, I think I remember Walter saying that he wanted to add to 2.0 the
tools necessary to smoothly implement reference counting.


Jan 29 2008

Sean Kelly <sean f4.ca> writes:

Edward Diener wrote:
 Jesse Phillips wrote:
 On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:

 I do not understand what you mean by "return the scoped value". If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 He is referring to when you have:

 scope class Foo {}

 Foo doThings() {
    Foo cats = new Foo();

    return cats;
 }

 cats no longer exists after return.

 
 Yes, I can see that. My own idea of a 'scope' class in a GC environment,
 one that completely solves the RAII conundrum, is that one should be
 able to pass around an object of that class and when the last reference
 to that object goes out of scope the destructor is immediately called.
 
 That is very much like what boost::shared_ptr<T> offers for C++ in a
 language which does not have GC, but it is probably harder to implement
 in a GC language where such checks are ordinarily not made when an
 object goes out of scope.

Someone may already have mentioned this, but 2.0 will eventually get
copy and destruction semantics for structs so it should be possible to
create a fairly decent smart pointer in D as well.
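
As a rough idea of what such a smart pointer could look like once structs have copy and destruction hooks, here is a sketch. Counted, Connection and released are hypothetical names, the copy hook is written as a postblit (this(this)), and the count is not thread-safe, so treat it as an illustration of the idea rather than a finished design.

```d
// `released` is a hypothetical counter used only to observe destruction.
int released;

// Illustrative resource-owning class.
class Connection
{
    ~this() { ++released; }
}

// A minimal, non-thread-safe reference-counted handle for class objects.
struct Counted(T) if (is(T == class))
{
    private T payload;
    private size_t* count;

    this(T obj)
    {
        payload = obj;
        count = new size_t;
        *count = 1;
    }

    // Postblit: runs on the new copy after a bitwise copy was made.
    this(this)
    {
        if (count) ++*count;
    }

    ~this()
    {
        // The last handle going away runs the class destructor immediately.
        if (count && --*count == 0)
            destroy(payload);
    }

    size_t refs() const { return count ? *count : 0; }
}

void main()
{
    {
        auto a = Counted!Connection(new Connection);
        {
            auto b = a;               // second handle to the same object
            assert(a.refs() == 2);
        }                             // b is gone, one handle left
        assert(released == 0);
    }                                 // last handle gone
    assert(released == 1);
}
```

Unlike a 'scope' declaration, the handle can be copied around freely, and the destructor runs when the last copy is destroyed.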


Sean

Jan 29 2008

"Janice Caron" <caron800 googlemail.com> writes:

On Jan 29, 2008 1:50 AM, Edward Diener
<eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

What you're suggesting is "semantic sugar" - allowing the compiler to
save us a bit of typing. Sometimes, that can be a good thing. Here,
however, I don't think it would be. You see, while the /compiler/
knows that Foo is RAII (you're right about that), future maintainers
of the function might not. Forcing the use of the keyword makes the
code a bit more readable.

Here's another way of looking at it: The right hand side of the
statement is evaluated /first/. Then it is assigned to the lvalue. So,
when the RHS is evaluated (new Foo()) it returns a value whose type is
"scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the
assignment fails. Allowing the semantic sugar would be like
"storage-class-deduction", which would open up a really huge can of
worms, and we almost certainly don't want to go there (at least not
before const has settled down).

Jan 28 2008

"Rioshin an'Harthen" <rharth75 hotmail.com> writes:

"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.33.1201592955.5260.digitalmars-d puremagic.com...
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 What you're suggesting is "semantic sugar" - allowing the compiler to
 save us a bit of typing. Sometimes, that can be a good thing. Here,
 however, I don't think it would be. You see, while the /compiler/
 knows that Foo is RAII (you're right about that), future maintainers
 of the function might not. Forcing the use of the keyword makes the
 code a bit more readable.

I'll go somewhat deeper into this than Janice did.


place to note what parameters are ref and what parameters are out. The 
compiler of course already knows which parameter is using which convention: 

better to require

    len += o.foo(bar, ref baz, out foobar);

than

    len += o.foo(bar, baz, foobar);

because it documents the call better. Just by reading the call, it is 
immediately known that baz is passed as a reference, which might change baz, 
and that foobar will be used to return an additional value from the method. 
No need to look up the definition of o.foo anywhere.

The same reasoning applies to forcing scoped classes to be marked as such at 
the point of declaration in D. It requires for each instance the typing of 
an extra keyword, but the readability of the code increases by such a factor 
that typing that extra keyword doesn't bother me.

Now, you're reading through code that somebody else has written, having 
inherited the maintaining of that code. There's a bad bug in it that has to 
be fixed and quickly, because it's stopping the project from being 
finished - and it was supposed to be ready a few weeks ago. You've managed 
to track the bug to a certain function and look upon the code:

    Foo f = new Foo;
    // do whatever with f
    return f;

Suppose D doesn't require that scope in front of Foo. You have to check the 
Foo class, and you see the scope in front of the declaration. However, when 
D requires the scope keyword, the code looks like:

    scope Foo f = new Foo;
    // do whatever with f
    return f;

Bling! We see the error immediately, and have saved the need to actually 
look for where the Foo class is defined, which could be in the middle of a 
multi-kloc file that you couldn't have guessed from the ton of imports at 
the top of the file this function resides in.

Jan 29 2008

"Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:

On Tue, 29 Jan 2008 08:39:46 -0000, Rioshin an'Harthen
<rharth75 hotmail.com> wrote:

 I'll go somewhat deeper into this than Janice did.

 place to note what parameters are ref and what parameters are out. The
 compiler of course already knows which parameter is using which

 better to require

     len += o.foo(bar, ref baz, out foobar);

 than

     len += o.foo(bar, baz, foobar);

 because it documents the call better. Just by reading the call, it is
 immediately known that baz is passed as a reference, which might change
 baz, and that foobar will be used to return an additional value from the
 method. No need to look up the definition of o.foo anywhere.

Unless you need to know what it does and what the pre and post conditions
are. The idea of deliberately programming blind does not appeal to me.
Extra documentation as a style option, on the other hand, sounds reasonable.

 The same reasoning applies to forcing scoped classes to be marked as
 such at the point of declaration in D. It requires for each instance the
 typing of an extra keyword, but the readability of the code increases by
 such a factor that typing that extra keyword doesn't bother me.

 Now, you're reading through code that somebody else has written, having
 inherited the maintaining of that code. There's a bad bug in it that has
 to be fixed and quickly, because it's stopping the project from being
 finished - and it was supposed to be ready a few weeks ago. You've
 managed to track the bug to a certain function and look upon the code:

     Foo f = new Foo;
     // do whatever with f
     return f;

 Suppose D doesn't require that scope in front of Foo. You have to check
 the Foo class, and you see the scope in front of the declaration.
 However, when D requires the scope keyword, the code looks like:

     scope Foo f = new Foo;
     // do whatever with f
     return f;

 Bling! We see the error immediately, and have saved the need to actually
 look for where the Foo class is defined, which could be in the middle of
 a multi-kloc file that you couldn't have guessed from the ton of imports
 at the top of the file this function resides in.

Any halfway decent compiler should report that as a semantic error and
refuse to compile it. If the error message is specific enough, repeating the
keyword or not makes no odds. As Janice says, type (storage class) deduction
may be a more serious can of worms.

Jan 29 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 
 What you're suggesting is "semantic sugar" - allowing the compiler to
 save us a bit of typing. Sometimes, that can be a good thing. Here,
 however, I don't think it would be. You see, while the /compiler/
 knows that Foo is RAII (you're right about that), future maintainers
 of the function might not. Forcing the use of the keyword makes the
 code a bit more readable.
 
 Here's another way of looking at it: The right hand side of the
 statement is evaluated /first/. Then it is assigned to the lvalue. So,
 when the RHS is evaluated (new Foo()) it returns a value whose type is
 "scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the
 assignment fails. Allowing the semantic sugar would be like
 "storage-class-deduction", which would open up a really huge can of
 worms, and we almost certainly don't want to go there (at least not
 before const has settled down).

I see it exactly the other way. It is semantic sugar to force a 
programmer to specify that the instantiation of a scope class creates a 
scope object, just for the sake of making future maintainers feel 
better. One should not really care that a class is a scope class. It 
should just work to release the resource it encompasses when it goes out 
of scope by having its destructor called.

In a GC environment memory is just another resource. The user of objects 
does not worry about memory being released as appropriate. Why should he 
have to worry about other resources being released as appropriate ?

Understand that I am not saying that the user of an object of a scope 
class can not benefit from knowing, if he chooses, that the class is a 
scope class. Part of my suggestion about the keyword 'scope' in D is 
that when used it should force any object, even not normally scoped, to 
be destroyed when it goes out of scope. In this way the programmer can 
force a container of scoped objects to be destroyed immediately when it 
goes out of scope even though the container is not a scoped type.

Jan 29 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

Edward Diener wrote:
 Janice Caron wrote:
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the 
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

 What you're suggesting is "semantic sugar" - allowing the compiler to
 save us a bit of typing. Sometimes, that can be a good thing. Here,
 however, I don't think it would be. You see, while the /compiler/
 knows that Foo is RAII (you're right about that), future maintainers
 of the function might not. Forcing the use of the keyword makes the
 code a bit more readable.

 Here's another way of looking at it: The right hand side of the
 statement is evaluated /first/. Then it is assigned to the lvalue. So,
 when the RHS is evaluated (new Foo()) it returns a value whose type is
 "scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the
 assignment fails. Allowing the semantic sugar would be like
 "storage-class-deduction", which would open up a really huge can of
 worms, and we almost certainly don't want to go there (at least not
 before const has settled down).

 
 I see it exactly the other way. It is semantic sugar to force a 
 programmer to specify that the instantiation of a scope class creates a 
 scope object, just for the sake of making future maintainers feel 
 better. One should not really care that a class is a scope class. It 
 should just work to release the resource it encompasses when it goes out 
 of scope by having its destructor called.
 
 In a GC environment memory is just another resource. The user of objects 
 does not worry about memory being released as appropriate. Why should he 
 have to worry about other resources being released as appropriate ?
 
 Understand that I am not saying that the user of an object of a scope 
 class can not benefit from knowing, if he chooses, that the class is a 
 scope class. Part of my suggestion about the keyword 'scope' in D is 
 that when used it should force any object, even not normally scoped, to 
 be destroyed when it goes out of scope. In this way the programmer can 
 force a container of scoped objects to be destroyed immediately when it 
 goes out of scope even though the container is not a scoped type.

My opinion is that the current rules make scope classes of very limited 
use.  Regular classes can still be used 'scope'd if desired, so all 
you're doing by declaring the class itself scope is removing the users' 
freedom to choose.

The only reason to use it is if you have a class that absolutely must 
not persist longer than one stack frame.  If you don't have such a 
requirement then you might as well not bother with scope classes.  Let 
users decide whether they want it to be scope or not.
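
A small sketch of what I mean by letting users decide. Buffer, destroyed, localUse and escapingUse are invented names for illustration. The class itself is an ordinary class, and each caller picks whether a given instance is scoped.

```d
// `destroyed` is a hypothetical counter used only to observe destructor calls.
int destroyed;

// An ordinary class: nothing in its declaration forces scoped use.
class Buffer
{
    bool open = true;
    ~this() { open = false; ++destroyed; }
}

void localUse()
{
    // This caller opts in to deterministic destruction.
    scope b = new Buffer;
    assert(b.open);
}   // destructor runs here

Buffer escapingUse()
{
    // This caller lets the object outlive the stack frame.
    auto b = new Buffer;
    return b;
}

void main()
{
    localUse();
    assert(destroyed == 1);

    auto kept = escapingUse();
    assert(kept.open);          // still alive after the function returned
    assert(destroyed == 1);     // only the scoped instance has been destroyed
}
```

Declaring the class itself as scope would rule out the second function entirely, which is the loss of freedom I am describing.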

--bb

Jan 29 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 1/30/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 In a GC environment memory is just another resource. The user of objects
 does not worry about memory being released as appropriate. Why should he
 have to worry about other resources being released as appropriate ?

The GC only collects /memory/, not /resources/.

You make a class "scope" if it needs to clean up non-memory resources
(e.g. close a file, terminate a network connection, etc).

Jan 29 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On 1/30/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 In a GC environment memory is just another resource. The user of objects
 does not worry about memory being released as appropriate. Why should he
 have to worry about other resources being released as appropriate ?

 
 The GC only collects /memory/, not /resources/.

Memory is a subset of resources. It is, in most programmers' minds, the
most important subset, but it is still just another resource.
 
 You make a class "scope" if it needs to clean up non-memory resources
 (e.g. close a file, terminate a network connection, etc).

Yes, I do understand that fully.

Jan 30 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 Understand that I am not saying that the user of an object of a scope 
 class can not benefit from knowing, if he chooses, that the class is a 
 scope class. Part of my suggestion about the keyword 'scope' in D is 
 that when used it should force any object, even not normally scoped, to 
 be destroyed when it goes out of scope. In this way the programmer can 
 force a container of scoped objects to be destroyed immediately when it 
 goes out of scope even though the container is not a scoped type.

You both have good points, but the problem with scoped classes is that
there are semantic problems with them if they are not carefully used.

I'm working on adding destructors to structs, which I'm thinking should 
completely supplant scoped classes. RAII is a much more natural fit with 
structs than it ever will be for classes.
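
A minimal sketch of what struct-based RAII can look like, assuming the struct destructor syntax (`~this()`) that D later adopted. The `Handle` type, its `open` counter, and `openInside` are hypothetical names used only for illustration. Copying is disabled so that exactly one owner releases the resource. It can be run with `dmd -unittest -main -run file.d`.

```d
// Hypothetical resource handle, for illustration only.
struct Handle
{
    static int open;          // handles currently open (thread-local)
    private string name;
    private bool live;

    this(string name)
    {
        this.name = name;
        live = true;
        ++open;               // acquire the resource
    }

    @disable this(this);      // forbid copies: a single owner per resource

    ~this()
    {
        if (live)             // skip default-initialized handles
        {
            --open;           // release the resource deterministically
            live = false;
        }
    }
}

int openInside()
{
    auto h = Handle("log");
    return Handle.open;       // h is destroyed when the function returns
}

unittest
{
    assert(openInside() == 1);
    assert(Handle.open == 0);
}
```

The release happens at the closing brace of the enclosing scope, without any help from the garbage collector.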

Jan 30 2008

Extrawurst <spam extrawurst.org> writes:

Walter Bright schrieb:
 I'm working on adding destructors to structs

That's good news. I am looking forward to it.

Jan 30 2008

"Craig Black" <cblack ara.com> writes:

 I'm working on adding destructors to structs, which I'm thinking should 
 completely supplant scoped classes. RAII is a much more natural fit with 
 structs than it ever will be for classes.

Very good Walter!!  What about copy semantics for structs?

-Craig

Jan 30 2008

Walter Bright <newshound1 digitalmars.com> writes:

Craig Black wrote:
 I'm working on adding destructors to structs, which I'm thinking should 
 completely supplant scoped classes. RAII is a much more natural fit with 
 structs than it ever will be for classes.

 
 Very good Walter!!  What about copy semantics for structs?

You cannot do destructors for value objects without copy constructors 
and assignment overloads.

Jan 30 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 Understand that I am not saying that the user of an object of a scope 
 class can not benefit from knowing, if he chooses, that the class is a 
 scope class. Part of my suggestion about the keyword 'scope' in D is 
 that when used it should force any object, even not normally scoped, 
 to be destroyed when it goes out of scope. In this way the programmer 
 can force a container of scoped objects to be destroyed immediately 
 when it goes out of scope even though the container is not a scoped type.

 
 You both have good points, but the problem with scoped classes is that
 there are semantic problems with them if they are not carefully used.
 
 I'm working on adding destructors to structs, which I'm thinking should 
 completely supplant scoped classes. RAII is a much more natural fit with 
 structs than it ever will be for classes.

Please reconsider that decision, especially in the light of the 
restrictions to structs in D which classes do not have. You would 
essentially be saying that any class designer, who would want to 
incorporate deterministic destruction in his class because of a need to 
free a resource upon class destruction, is constrained in D to using a 
struct rather than a class.

In that case why bother, since structs are so much less than a class in 
features. You might just as well say "I did not want the challenge of 
RAII in D, a GC language, so I will just kill it this way." If you 
really don't want RAII in D, which simply and fairly enough means you 
want the release of resources in your GC environment to always be done 
manually, just don't implement it at all. That is much more 
straightforward than attempting to support but doing it in such a way 
that makes it impossible for a class designer to implement it.

The current 'scope' keyword for RAII is very limited. My OP was not 
questioning that but objecting to the redundant way it had to be used 
even in that environment. However, making it more limiting rather than 
less limiting just kills it entirely IMO, where you should be seeking to 
go in exactly the opposite direction.

Jan 30 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 Walter Bright wrote:
 I'm working on adding destructors to structs, which I'm thinking 
 should completely supplant scoped classes. RAII is a much more natural 
 fit with structs than it ever will be for classes.

 
 Please reconsider that decision, especially in the light of the 
 restrictions to structs in D which classes do not have. You would 
 essentially be saying that any class designer, who would want to 
 incorporate deterministic destruction in his class because of a need to 
 free a resource upon class destruction, is constrained in D to using a 
 struct rather than a class.
 
 In that case why bother, since structs are so much less than a class in 
 features. You might just as well say "I did not want the challenge of 
 RAII in D, a GC language, so I will just kill it this way." If you 
 really don't want RAII in D, which simply and fairly enough means you 
 want the release of resources in your GC environment to always be done 
 manually, just don't implement it at all. That is much more 
 straightforward than attempting to support but doing it in such a way 
 that makes it impossible for a class designer to implement it.
 
 The current 'scope' keyword for RAII is very limited. My OP was not 
 questioning that but objecting to the redundant way it had to be used 
 even in that environment. However, making it more limiting rather than 
 less limiting just kills it entirely IMO, where you should be seeking to 
 go in exactly the opposite direction.

Maybe you're selling structs short :-). With RAII structs, to make an 
RAII class, one could create a wrapper struct template:

struct Wrapper(C)
{
     C c;
     ~this()
     {
         delete c;
     }
}

This is oversimplified, as there would also need to be a mechanism to 
forward operations from the struct to the class C, copy constructors, 
etc., but I think it is conceptually sound.
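
A slightly fuller sketch of the same wrapper idea, under stated assumptions: `delete` has since been deprecated in D, so this version calls `destroy` instead, and it disables copying rather than implementing copy semantics. The `Connection` class, its `live` counter, and `liveInside` are hypothetical. Forwarding to the wrapped object is deliberately left out.

```d
// Hypothetical class that owns a non-memory resource.
class Connection
{
    static int live;          // connections currently open
    this()  { ++live; }
    ~this() { --live; }
}

struct Wrapper(C) if (is(C == class))
{
    private C c;

    this(C c) { this.c = c; }

    @disable this(this);      // no copies, so only one destructor call

    ~this()
    {
        if (c !is null)
        {
            destroy(c);       // run C's destructor now; memory is left to the GC
            c = null;
        }
    }
}

int liveInside()
{
    auto w = Wrapper!Connection(new Connection);
    return Connection.live;   // w, and with it the Connection, dies on return
}

unittest
{
    assert(liveInside() == 1);
    assert(Connection.live == 0);
}
```

This is only safe as written when the wrapper itself lives on the stack; a wrapper stored in GC memory would need more care.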

Such wrapper structs could, for example, be written to reference count 
their argument equivalently to the C++ shared_ptr<>.

From another point of view, an RAII type is fundamentally a value type,
whereas classes are fundamentally reference types. By using the wrapper 
approach to impart some value (i.e. RAII) semantics to a reference type, 
the operations on that reference type can be carefully controlled by the 
wrapper designer to prevent such problems as references escaping the 
scope - solutions which are problematic to put in the core language.

I'm not sure what you mean by saying that structs are so much less than 
classes. Structs aren't a subset of classes, they are a fundamentally 
different animal - a value type, as opposed to a class, which is a 
reference type. C++ does not distinguish between the two, leaving 
serious problems such as the "slicing problem", the virtual destructor 
problem, and trying to prevent users of your C++ class from using it as 
a value when it should be a reference or vice versa.

Behaviors which are a natural fit for reference types include 
inheritance, polymorphism, and gc (including ref counted gc).

Behaviors which are a natural fit for value types are scoped allocation, 
RAII, non-virtual functions.

Jan 30 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 Walter Bright wrote:
 I'm working on adding destructors to structs, which I'm thinking 
 should completely supplant scoped classes. RAII is a much more 
 natural fit with structs than it ever will be for classes.

 Please reconsider that decision, especially in the light of the 
 restrictions to structs in D which classes do not have. You would 
 essentially be saying that any class designer, who would want to 
 incorporate deterministic destruction in his class because of a need 
 to free a resource upon class destruction, is constrained in D to 
 using a struct rather than a class.

 In that case why bother, since structs are so much less than a class 
 in features. You might just as well say "I did not want the challenge 
 of RAII in D, a GC language, so I will just kill it this way." If you 
 really don't want RAII in D, which simply and fairly enough means you 
 want the release of resources in your GC environment to always be done 
 manually, just don't implement it at all. That is much more 
 straightforward than attempting to support but doing it in such a way 
 that makes it impossible for a class designer to implement it.

 The current 'scope' keyword for RAII is very limited. My OP was not 
 questioning that but objecting to the redundant way it had to be used 
 even in that environment. However, making it more limiting rather than 
 less limiting just kills it entirely IMO, where you should be seeking 
 to go in exactly the opposite direction.

 
 Maybe you're selling structs short :-). With RAII structs, to make an 
 RAII class, one could create a wrapper struct template:
 
 struct Wrapper(C)
 {
     C c;
     ~this()
     {
         delete c;
     }
 }
 
 This is oversimplified, as there would also need to be a mechanism to 
 forward operations from the struct to the class C, copy constructors, 
 etc., but I think it is conceptually sound.
 
 Such wrapper structs could, for example, be written to reference count 
 their argument equivalently to the C++ shared_ptr<>.

OK, I see where you are going.

 
  From another point of view, an RAII type is fundamentally a value type, 
 whereas classes are fundamentally reference types. By using the wrapper 
 approach to impart some value (i.e. RAII) semantics to a reference type, 
 the operations on that reference type can be carefully controlled by the 
 wrapper designer to prevent such problems as references escaping the 
 scope - solutions which are problematic to put in the core language.
 
 I'm not sure what you mean by saying that structs are so much less than 
 classes. Structs aren't a subset of classes, they are a fundamentally 
 different animal - a value type, as opposed to a class, which is a 
 reference type. C++ does not distinguish between the two, leaving 
 serious problems such as the "slicing problem", the virtual destructor 
 problem, and trying to prevent users of your C++ class from using it as 
 a value when it should be a reference or vice versa.
 
 Behaviors which are a natural fit for reference types include 
 inheritance, polymorphism, and gc (including ref counted gc).
 
 Behaviors which are a natural fit for value types are scoped allocation, 
 RAII, non-virtual functions.

You have sold me.

When you said 'struct' I was not thinking in terms of a template class, 
ala boost::shared_ptr<T>, but instead of the limitations of 'struct' in 
D as opposed to a class, which tells me that in D a struct is a C++ POD. 
I have not read about templates yet in D so a struct that is a template 
class and wraps an object which is the actual type was beyond my 
thinking. Your idea is absolutely right and I was wrong to criticize it 
without understanding what you meant.

Now that I see where you are going, and you mentioned forwarding in your 
description above, I thought of how boost::shared_ptr does it and I
realized that the C++ 'operator ->' is the key. So I immediately looked 
for the equivalent in D, which would allow this to happen also, which 
would be an op function for the 'operator .'. But I could not find this 
operator supported in the D 1.0 docs. My suggestion then, if you are 
going to make the idea above work, is that you need to support an op 
function for the 'operator .' and then forwarding into the wrapped 
object would be simple and automatic no matter what functionality the 
wrapped object had.
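
One way such forwarding can be sketched in D 2 is with `opDispatch`, which the compiler tries when a member is not found on the struct itself. This is an illustration, not the mechanism planned in the thread; `Proxy`, `Counter`, and `bumpTwice` are hypothetical names, and the sketch only forwards method calls, not field access.

```d
// Hypothetical wrapped class.
class Counter
{
    private int n;
    void bump() { ++n; }
    int value() const { return n; }
}

struct Proxy(C) if (is(C == class))
{
    private C target;

    this(C target) { this.target = target; }

    // p.member(args) is rewritten to target.member(args)
    // for any member that Proxy itself does not define.
    auto opDispatch(string member, Args...)(auto ref Args args)
    {
        return mixin("target." ~ member ~ "(args)");
    }
}

int bumpTwice()
{
    auto p = Proxy!Counter(new Counter);
    p.bump();
    p.bump();
    return p.value();
}

unittest
{
    assert(bumpTwice() == 2);
}
```

`alias target this;` would be shorter, but it also lets the raw class reference convert out of the wrapper implicitly, which gives up the control over escaping references that the wrapper is meant to provide.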

Jan 30 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 Now that I see where you are going, and you mentioned forwarding in your 
 description above, I thought of how boost::shared_ptr does it and I
 realized that the C++ 'operator ->' is the key. So I immediately looked 
 for the equivalent in D, which would allow this to happen also, which 
 would be an op function for the 'operator .'. But I could not find this 
 operator supported in the D 1.0 docs. My suggestion then, if you are 
 going to make the idea above work, is that you need to support an op 
 function for the 'operator .' and then forwarding into the wrapped 
 object would be simple and automatic no matter what functionality the 
 wrapped object had.

I agree that a method for forwarding is needed to complete the job. I 
plan on working on that after the RAII stuff is working.

Jan 31 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 Now that I see where you are going, and you mentioned forwarding in 
 your description above, I thought of how boost::shared_ptr does it and
 I realized that the C++ 'operator ->' is the key. So I immediately 
 looked for the equivalent in D, which would allow this to happen also, 
 which would be an op function for the 'operator .'. But I could not 
 find this operator supported in the D 1.0 docs. My suggestion then, if 
 you are going to make the idea above work, is that you need to support 
 an op function for the 'operator .' and then forwarding into the 
 wrapped object would be simple and automatic no matter what 
 functionality the wrapped object had.

 
 I agree that a method for forwarding is needed to complete the job. I 
 plan on working on that after the RAII stuff is working.

Thinking about this further, why not go all the way and just provide 
automatic support for all 'scope' objects as RAII constructs with 
reference counted destruction. If you did that D would be the first GC 
language to have a transparent mechanism for handling deterministic 
destruction.

What you are saying is that you want to allow a template struct to be a 
wrapper for 'scope' classes. Your idea is that when the object of that 
template class gets created the reference count is set to 1, as the 
object of that template class gets copied or assigned to another object 
of the same type the reference count goes up, when the object is destructed
the reference count goes down and, if the reference count goes to 0, the 
wrapped GC class gets destroyed.

Obviously in D you are tracking whenever a struct goes out of scope in 
order to call the struct's destructor. Just as obviously you must allow 
some copy constructor and assignment processing for a struct whenever it 
gets copied or assigned, in order to increment the reference count.
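
The scheme described above can be sketched as follows, assuming D's postblit (`this(this)`) as the copy hook; newer D also offers copy constructors. `Counted`, `Resource`, and `trace` are hypothetical names, and the count is not thread-safe.

```d
// Hypothetical class that owns a resource.
class Resource
{
    static int live;              // resources currently held
    this()  { ++live; }
    ~this() { --live; }
}

struct Counted(C) if (is(C == class))
{
    private C obj;
    private int* count;           // shared by all copies of this wrapper

    this(C obj)
    {
        this.obj = obj;
        count = new int;
        *count = 1;               // creation: the count starts at 1
    }

    this(this)                    // postblit: runs after every copy
    {
        if (count) ++*count;
    }

    ~this()
    {
        if (count && --*count == 0)
            destroy(obj);         // last owner gone: destroy the wrapped object
    }

    int refs() const { return count ? *count : 0; }
}

int[] trace()
{
    int[] seen;
    {
        auto a = Counted!Resource(new Resource);
        seen ~= a.refs();         // one owner
        {
            auto b = a;           // copy: count goes up
            seen ~= a.refs();
        }                         // b destroyed: count goes down
        seen ~= a.refs();
        seen ~= Resource.live;    // still alive while a exists
    }                             // a destroyed: count hits 0
    seen ~= Resource.live;
    return seen;
}

unittest
{
    assert(trace() == [1, 2, 1, 1, 0]);
}
```

Assignment needs no extra code here: for a struct with a postblit or destructor, the compiler-generated assignment copies the right-hand side and destroys the old left-hand value.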

If this plan is workable you could do the exact same thing at the 
compiler level when dealing with a 'scope' object.

You could allow 'scope' to be specified at the class level, as you are 
now doing, or at the object level, as you are now insisting on doing 
when creating objects of 'scope' classes even though it is redundant ( 
the initial reason for my OP ). But instead of being redundant, as it is 
now, you could allow it as a way of saying that the end-user wants a 
particular object to use scoping, ie. to be deterministically destroyed 
when the last reference to the object goes out of scope. In this way 
both the class designer, who knows if he may need his class to be 
'scope' because he knows if he needs to release a resource, and the 
object user, who may need control to 'scope' for containers which 
themselves are not 'scope' type but which may contain 'scope' types, 
have full control of 'scope'.

Voila ! You now have a full GC language in which both the class 
designer, via a 'scope' class, and the object creator, via a 'scope' 
object, have complete control over the destruction of objects. Yours
would be the first GC language to really solve the problem of objects 
encapsulating non-memory resources being destroyed deterministically 
when references to the object are no longer being used. All other GC 
languages just gloss over the problem or maintain that it occurs so 
rarely there is no need for anything but manual release of non-memory 
resources ( via try/catch and specialized Dispose/Close ) or semi-manual 
methods such as your current very limited use of 'scope'.

I hear you saying, "No I don't want to be the first GC language to solve 
this problem especially as Java, .Net, Python, Ruby, et al. just pretend 
it does not exist or is unimportant for practical programming and 
besides, it is difficult to solve and I have lots of other, better 
things to do, and finally few people will know or give me credit for it 
anyway." But somewhere, some day, someone is going to point out this 
flaw in GC and a solution, as I have described, will be implemented, and 
then everyone will say, "why did we not think of this sooner". And some 
bright person will say, "you know Walter Bright solved this years ago 
with D."

Feb 02 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 Thinking about this further, why not go all the way and just provide 
 automatic support for all 'scope' objects as RAII constructs with 
 reference counted destruction. If you did that D would be the first GC 
 language to have a transparent mechanism for handling deterministic 
 destruction.

You read my mind <g>. But I wanted to see first how well proxy objects
would work; they may not need the extra help.

 What you are saying is that you want to allow a template struct to be a 
 wrapper for 'scope' classes. Your idea is that when the object of that 
 template class gets created the reference count is set to 1, as the 
 object of that template class gets copied or assigned to another object 
 of the same type the reference count goes up, when the object is destructed
 the reference count goes down and, if the reference count goes to 0, the 
 wrapped GC class gets destroyed.

Yes.

 Obviously in D you are tracking whenever a struct goes out of scope in 
 order to call the struct's destructor. Just as obviously you must allow 
 some copy constructor and assignment processing for a struct whenever it 
 gets copied or assigned, in order to increment the reference count.

Yes.

 If this plan is workable you could do the exact same thing at the 
 compiler level when dealing with a 'scope' object.

Yes.

 You could allow 'scope' to be specified at the class level, as you are 
 now doing, or at the object level, as you are now insisting on doing 
 when creating objects of 'scope' classes even though it is redundant ( 
 the initial reason for my OP ). But instead of being redundant, as it is 
 now, you could allow it as a way of saying that the end-user wants a 
 particular object to use scoping, ie. to be deterministically destroyed 
 when the last reference to the object goes out of scope. In this way 
 both the class designer, who knows if he may need his class to be 
 'scope' because he knows if he needs to release a resource, and the 
 object user, who may need control to 'scope' for containers which 
 themselves are not 'scope' type but which may contain 'scope' types, 
 have full control of 'scope'.
 
 Voila ! You now have a full GC language in which both the class 
 designer, via a 'scope' class, and the object creator, via a 'scope' 
 object, have complete control over the destruction of objects. Yours
 would be the first GC language to really solve the problem of objects 
 encapsulating non-memory resources being destroyed deterministically 
 when references to the object are no longer being used. All other GC 
 languages just gloss over the problem or maintain that it occurs so 
 rarely there is no need for anything but manual release of non-memory 
 resources ( via try/catch and specialized Dispose/Close ) or semi-manual 
 methods such as your current very limited use of 'scope'.
 
 I hear you saying, "No I don't want to be the first GC language to solve 
 this problem especially as Java, .Net, Python, Ruby, et al.  just pretend
 it does not exist or is unimportant for practical programming and 
 besides, it is difficult to solve and I have lots of other, better 
 things to do, and finally few people will know or give me credit for it 
 anyway." But somewhere, some day, someone is going to point out this 
 flaw in GC and a solution, as I have described, will be implemented, and 
 then everyone will say, "why did we not think of this sooner". And some 
 bright person will say, "you know Walter Bright solved this years ago 
 with D."

We did think of it over a year ago, and have been laying the groundwork 
for it (wanted to get the const madness done first). This will enable D 
to be, as you say, the first language to support the triumvirate of 
explicit, automatic, and ref counted memory allocation on an equal footing.

The only question is whether the proxy struct will be easy enough to use 
to not need extra core language support:

      scope C c;

vs.:

      Scope!(C) c;

One major consideration arguing for it to be a library feature is 
multithreading. Doing locked reference counts is slow, and needed only 
for a minority of objects. It should be selectable when you allocate the 
object whether you need it multithreaded or not.
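
A rough sketch of how a library could make that choice selectable at compile time, assuming `atomicOp` from `core.atomic` for the thread-safe case. `RefCount` and its `threadSafe` flag are hypothetical names, not an existing library API.

```d
import core.atomic : atomicOp;

// Hypothetical counter policy: atomic only when asked for.
struct RefCount(bool threadSafe)
{
    static if (threadSafe)
        private shared int n = 1;   // updated with atomic read-modify-write
    else
        private int n = 1;          // plain increment, single thread only

    void retain()
    {
        static if (threadSafe)
            atomicOp!"+="(n, 1);
        else
            ++n;
    }

    // Returns true when the last reference has been released.
    bool release()
    {
        static if (threadSafe)
            return atomicOp!"-="(n, 1) == 0;
        else
            return --n == 0;
    }
}

bool lastReleased(bool threadSafe)()
{
    RefCount!threadSafe rc;         // starts at 1
    rc.retain();                    // 2
    const first = rc.release();     // back to 1: not the last reference
    const second = rc.release();    // 0: last reference released
    return !first && second;
}

unittest
{
    assert(lastReleased!true());
    assert(lastReleased!false());
}
```

Code that never shares the object across threads would instantiate `RefCount!false` and pay nothing for the atomic operations.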

Feb 02 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic
increment and atomic decrement instead. (I implemented a ref-counted
template in C++ once, and that's what I did. It seemed to work).

Locking the indirect object itself - that's another matter! However,
one would imagine that multithreaded code which shares objects will
already have a mutex system in place, because that would have been
needed even without ref counting.

Feb 02 2008

Walter Bright <newshound1 digitalmars.com> writes:

Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

 
 Surely you don't need to lock the reference count? You can use atomic
 increment and atomic decrement instead. (I implemented a ref-counted
 template in C++ once, and that's what I did. It seemed to work).

Doing atomic inc/dec *is* locking. The LOCK CPU instruction is there, 
but it's mighty slow.

Feb 02 2008

Sean Kelly <sean f4.ca> writes:

Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

 
 Surely you don't need to lock the reference count? You can use atomic
 increment and atomic decrement instead. (I implemented a ref-counted
 template in C++ once, and that's what I did. It seemed to work).

A locked operation on x86 takes something like 80ns, which is far from
cheap.  Though I think a cleverly implemented algorithm may be able to
avoid the use of 'lock' altogether (Boost's does, IIRC).


Sean

Feb 02 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Sean Kelly Wrote:

 A locked operation on x86 takes something like 80ns, which is far from
 cheap.  Though I think a cleverly implemented algorithm may be able to
 avoid the use of 'lock' altogether (Boost's does, IIRC).

Boost uses InterlockedIncrement() etc. under Windows and
lock inc dword ptr [esi] and such otherwise.  That's what Walter's
talking about.

Feb 03 2008

Sean Kelly <sean f4.ca> writes:

Sergey Gromov wrote:
 Sean Kelly Wrote:
 
 A locked operation on x86 takes something like 80ns, which is far from
 cheap.  Though I think a cleverly implemented algorithm may be able to
 avoid the use of 'lock' altogether (Boost's does, IIRC).

 
 Boost uses InterlockedIncrement() etc. under Windows and
 lock inc dword ptr [esi] and such otherwise.  That's what Walter's
 talking about.

Ah, for some reason I thought they used a sort of spinlock.


Sean

Feb 03 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

 
 Surely you don't need to lock the reference count? You can use atomic
 increment and atomic decrement instead. (I implemented a ref-counted
 template in C++ once, and that's what I did. It seemed to work).
 
 Locking the indirect object itself - that's another matter! However,
 one would imagine that multithreaded code which shares objects will
 already have a mutex system in place, because that would have been
 needed even without ref counting.

Exactly! The responsibility for object access in a multi-threaded
system is the end-user's, not that of D, and the only low-level
responsibility of D in such a scheme is to manipulate the reference count
atomically.

Feb 02 2008

"David Wilson" <dw botanicus.net> writes:

On 2/2/08, Janice Caron <caron800 googlemail.com> wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

 Surely you don't need to lock the reference count? You can use atomic
 increment and atomic decrement instead. (I implemented a ref-counted
 template in C++ once, and that's what I did. It seemed to work).

 Locking the indirect object itself - that's another matter! However,
 one would imagine that multithreaded code which shares objects will
 already have a mutex system in place, because that would have been
 needed even without ref counting.

There are still effects on the CPU cache when using atomic
instructions - namely, every CPU in a multi-CPU system must be
instructed to flush any cache line that would be affected by the
operation. As far as I know this is asserted in x86 via the "LOCK"
instruction prefix.


David.

Feb 02 2008

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Walter Bright wrote:
 The only question is whether the proxy struct will be easy enough to use 
 to not need extra core language support:
 
      scope C c;
 
 vs.:
 
      Scope!(C) c;

The latter won't be as nice as-is, since in the first case you can omit 
the C, and have it inferred.
Unless you accept Scope!(auto), I don't see how you could do this.

Even then, you need both 'Scope' and 'auto', but I personally don't have 
a problem with dropping type inference for declarations that don't 
include 'auto' and having 'auto' be automatically replaced with the 
inferred type (even in template arguments).
That wouldn't be backwards-compatible though, so you might need to
keep allowing non-'auto' automatic inference as well. Currently 'auto' 
is also allowed in non-final-attribute position, and that would be 
inconsistent as well if it's kept.

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 Voila ! You now have a full GC language in which both the class 
 designer, via a 'scope' class, and the object creator, via a 'scope' 
 object, have complete control over the destruction of objects. Yours
 would be the first GC language to really solve the problem of objects 
 encapsulating non-memory resources being destroyed deterministically 
 when references to the object are no longer being used. All other GC 
 languages just gloss over the problem or maintain that it occurs so 
 rarely there is no need for anything but manual release of non-memory 
 resources ( via try/catch and specialized Dispose/Close ) or 
 semi-manual methods such as your current very limited use of 'scope'.

 I hear you saying, "No I don't want to be the first GC language to 
 solve this problem especially as Java, .Net, Python, Ruby, et al. just
 pretend it does not exist or is unimportant for practical programming and
 besides, it is difficult to solve and I have lots of other, better 
 things to do, and finally few people will know or give me credit for 
 it anyway." But somewhere, some day, someone is going to point out 
 this flaw in GC and a solution, as I have described, will be 
 implemented, and then everyone will say, "why did we not think of this 
 sooner". And some bright person will say, "you know Walter Bright 
 solved this years ago with D."

 
 We did think of it over a year ago, and have been laying the groundwork 
 for it (wanted to get the const madness done first). This will enable D 
 to be, as you say, the first language to support the triumvirate of 
 explicit, automatic, and ref counted memory allocation on an equal footing.

That would be superb !

 
 The only question is whether the proxy struct will be easy enough to use 
 to not need extra core language support:
 
      scope C c;
 
 vs.:
 
      Scope!(C) c;

I would like to see:

C c = new C(...);

and if C is a 'scope' class it is handled exactly the same as:

scope C c = new C(...);

if C is not a scope class. In both cases the 'c' object gets reference 
counted and treated as such when it is copied, assigned, and leaves a D 
scope in order to have its destructor called immediately when there are 
no more references to it. In other words I do not like the imposition of 
having to treat C as a struct which the form of:

scope C c;

implies nor of having to specify 'scope' if the class itself has already 
been marked as 'scope' in the class definition.

The idea in my mind is essentially that 'scope' classes automatically
define an object that, when used exactly as a normal GC object,
automatically calls the destructor of the object just as soon as the
last reference to it goes out of scope. In this sense the user neither
knows nor cares whether the object encapsulates a resource or not and uses
an object of such a class just as he would use any other GC object.
Similarly the user can force an object to be a 'scope' object through
the second syntax given above, but from then on he treats the object just
as he would any other GC object.

My point of view is quite simply that the core D language should make the
syntax for using 'scope' objects as indistinguishable from using
normal GC objects as possible. Having a different way just to instantiate
an object of a 'scope' class does not do that, as it removes the
transparency of their use.

 
 One major consideration arguing for it to be a library feature is 
 multithreading. Doing locked reference counts is slow, and needed only 
 for a minority of objects. It should be selectable when you allocate the 
 object whether you need it multithreaded or not.

Please take a look at atomic operations, which I am sure you already
know about. I believe boost::shared_ptr<T> in its latest incarnation is
using them to increment and decrement the reference count without the
usual time penalty which you mention.

Feb 02 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically
 define an object that, when used exactly as a normal GC object,
 automatically calls the destructor of the object just as soon as the
 last reference to it goes out of scope. In this sense the user neither
 knows nor cares whether the object encapsulates a resource or not and uses
 an object of such a class just as he would use any other GC object.
 Similarly the user can force an object to be a 'scope' object through
 the second syntax given above, but from then on he treats the object just
 as he would any other GC object.

The problem is:
	scope C c;
	C d;
	d = c;
and now d is no longer properly ref-counted. The 'scopeness' of an 
object must therefore be part of its type. The assignment d=c would have 
to be illegal.
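
The problem can be sketched in C++, with std::shared_ptr standing in for a ref-counted 'scope' reference and a raw pointer standing in for a plain one. This is an analogy only, not D semantics; the function names are hypothetical. Once the handle is copied into a variable whose static type does not count, the count no longer reflects every reference.

```cpp
#include <cassert>
#include <memory>

struct C { int value = 0; };

// The alias has a non-counting static type, so the count misses it.
long countAfterRawAlias() {
    std::shared_ptr<C> c = std::make_shared<C>(); // counted handle, count == 1
    C* d = c.get();                               // plain reference: count is not updated
    d->value = 42;                                // d is usable, but nobody tracks it
    return c.use_count();
}

// The alias has the counted static type, so the count sees it.
long countAfterCountedAlias() {
    std::shared_ptr<C> c = std::make_shared<C>();
    std::shared_ptr<C> d = c;                     // same static type: count is updated
    return d.use_count();
}
```

Here countAfterRawAlias() reports one owner although two references exist, which is the "no longer properly ref-counted" situation described above.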


 Please take a look at atomic operations, which I am sure you already 
 know about. I believe boost::shared_ptr<T> in its latest incarnation is 
  using it to increment and decrement the reference count without the 
 usual time penalty which you mention.

There has been a lot of work done improving atomic operations. That's 
one reason for making it a library feature - the library stuff can be 
improved without having to change the compiler.

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically 
 define an object that, when used exactly as a normal GC object, 
 automatically calls the destructor of the object just as soon as the 
 last reference to it goes out of scope. In this sense the user neither 
 knows nor cares whether the object encapsulates a resource or not and 
 uses an object of such a class just as he would use any other GC 
 object. Similarly the user can force an object to be a 'scope' object 
 through the second syntax given above, but from then on he treats the 
 object just as he would any other GC object.

 
 The problem is:
     scope C c;
     C d;
     d = c;
 and now d is no longer properly ref-counted. The 'scopeness' of an 
 object must therefore be part of its type. The assignment d=c would have 
 to be illegal.

If C is a scope class, then it is fine and the normal reference counting 
occurs when d=c; occurs, n'est-ce pas ? If C is not a scope class, then the

C d;

says that d is not a scoped object so that when:

d = c;

occurs, all that happens is that the reference count is not updated for this 
assignment and d going out of scope does nothing. In this latter case it 
is the responsibility of the end user, since he is scoping at the object 
level, to do the correct thing for whatever he wants.

IMO the 'scope' at the compiler level is part of the object, with the 
only difference being that objects of a 'scope' type are automatically 
'scope' without the user of the object specifying it.

If you want the lesser benefit of 'scope' being only part of the type, 
then you take away from the end user the ability to create a 'scope' 
object of a type which is not 'scope'. This may be easier for you, the 
compiler writer, but it means that containers of objects, which may hold 
objects of 'scope' type but are not 'scope' type themselves, do not get 
the benefit of reference counted deterministic destruction.

 
 
 Please take a look at atomic operations, which I am sure you already 
 know about. I believe boost::shared_ptr<T> in its latest incarnation 
 is  using it to increment and decrement the reference count without 
 the usual time penalty which you mention.

 
 There has been a lot of work done improving atomic operations. That's 
 one reason for making it a library feature - the library stuff can be 
 improved without having to change the compiler.

As long as it is transparent to the end user you should do it in 
whatever way you deem best.

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically 
 define an object that, when used exactly as a normal GC object, 
 automatically calls the destructor of the object just as soon as the 
 last reference to it goes out of scope. In this sense the user neither 
 knows nor cares whether the object encapsulates a resource or not and 
 uses an object of such a class just as he would use any other GC 
 object. Similarly the user can force an object to be a 'scope' object 
 through the second syntax given above, but from then on he treats the 
 object just as he would any other GC object.

 
 The problem is:
     scope C c;
     C d;
     d = c;
 and now d is no longer properly ref-counted. The 'scopeness' of an 
 object must therefore be part of its type. The assignment d=c would have 
 to be illegal.

My brain was not working correctly earlier, so let me correct myself.

In your above, if the c object is 'scope', whether it is because the C 
class is 'scope' or, as in your example, you specify 'scope' on the 
object ( which in current D is the same thing as saying that the C class 
is 'scope' ) then the assignment to another object makes that object 
'scope' automatically. This is yet another reason why 'scope' at the 
compiler should be tracked at the object level, not at the class level. 
The canonical situation is:

class C { ... }
scope class D : C { ... }

scope ( redundant IMO ) D  d = new D(...);
C c = d;

Clearly c, whose polymorphic type is a D, has to be 'scope'.

Feb 02 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 In your above, if the c object is 'scope', whether it is because the C 
 class is 'scope' or, as in your example, you specify 'scope' on the 
 object ( which in current D is the same thing as saying that the C class 
 is 'scope' ) then the assignment to another object makes that object 
 'scope' automatically. This is yet another reason why 'scope' at the 
 compiler should be tracked at the object level, not at the class level. 
 The canonical situation is:
 
 class C { ... }
 scope class D : C { ... }
 
 scope ( redundant IMO ) D  d = new D(...);
 C c = d;
 
 Clearly c, whose polymorphic type is a D, has to be 'scope'.

Let's look at it by analogy to 'const'. Implicitly converting a const D 
to its base class will produce a const C, not a C. A const C cannot be 
assigned to a C.

I think it should work similarly with scope, and that like const, it 
should be part of the type system (a proxy struct would accomplish 
that). Making it a dynamic part of the object would exact a heavy cost 
for most objects which don't need it.
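
The const analogy can be checked directly in C++. The sketch below uses two hypothetical classes C and D and std::shared_ptr as a stand-in for a scope-typed reference; it is an illustration of the analogy, not of anything D does. The qualifier survives the derived-to-base conversion and cannot be dropped implicitly.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct C { virtual ~C() = default; };
struct D : C {};

// const is carried through the derived-to-base conversion ...
static_assert(std::is_convertible<const D*, const C*>::value,
              "const D* converts to const C*");
// ... and cannot be silently dropped along the way.
static_assert(!std::is_convertible<const D*, C*>::value,
              "const D* does not convert to C*");
static_assert(!std::is_convertible<const C*, C*>::value,
              "const C* does not convert to C*");

// A counted handle behaves the same way: the upcast stays counted ...
static_assert(std::is_convertible<std::shared_ptr<D>, std::shared_ptr<C>>::value,
              "shared_ptr<D> converts to shared_ptr<C>");
// ... and does not implicitly decay to an uncounted pointer.
static_assert(!std::is_convertible<std::shared_ptr<D>, C*>::value,
              "shared_ptr<D> does not convert to C*");

// The upcast handle shares the same count as the original.
long countAfterUpcast() {
    std::shared_ptr<D> d = std::make_shared<D>();
    std::shared_ptr<C> c = d;
    return c.use_count();
}
```

All of these checks happen at compile time, which is the point of the analogy: no per-object runtime flag is consulted.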

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 In your above, if the c object is 'scope', whether it is because the C 
 class is 'scope' or, as in your example, you specify 'scope' on the 
 object ( which in current D is the same thing as saying that the C 
 class is 'scope' ) then the assignment to another object makes that 
 object 'scope' automatically. This is yet another reason why 'scope' 
 at the compiler should be tracked at the object level, not at the 
 class level. The canonical situation is:

 class C { ... }
 scope class D : C { ... }

 scope ( redundant IMO ) D  d = new D(...);
 C c = d;

 Clearly c, whose polymorphic type is a D, has to be 'scope'.

 
 Let's look at it by analogy to 'const'. Implicitly converting a const D 
 to its base class will produce a const C, not a C. A const C cannot be 
 assigned to a C.
 
 I think it should work similarly with scope, and that like const, it 
 should be part of the type system (a proxy struct would accomplish 
 that). Making it a dynamic part of the object would exact a heavy cost 
 for most objects which don't need it.

Your analogy to C++'s 'const' is a bad one.

The C++ 'const' refers to a quality of the object while the D 'scope' 
refers to a quality of the type. There is no equivalent of 'const' in 
C++ which refers to the type. Once we say that a type is 'scope' in D we 
should no longer have to say that an object of that type is 'scope'. An 
object of that type should be 'scope' automatically and the user of that 
object should not care or even need to know. In C++ the user of an 
object specifically says it is 'const' to set the quality of the object 
to something ( one can not change the object ). Your analogy is mixing 
apples and oranges. These are different things.

What I am saying is that an object whose type is 'scope' is treated 
magically by the compiler in that the compiler is now doing reference 
counting on it and calling its destructor when the last reference goes 
out of scope. Furthermore as that object gets reference assigned the 
reference count is manipulated and whatever object is specified in that 
reference assignment, as long as it is allowable by the compiler by the 
rules of D, takes part in the 'scope' magic. In a polymorphic language 
this means that you should associate 'scope' with the dynamic type of 
the object, not its static type, and how you decide to do that is up to 
you. Think of it as wrapping a boost::shared_ptr around the object and 
for every object to which you legally assign/copy it a boost::shared_ptr 
gets wrapped around that object.
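
The shared_ptr picture can be made concrete with a short C++ sketch. The classes C and D and the counter parameter are hypothetical; D's destructor stands in for releasing a resource. A base-typed handle assigned from a derived-typed one joins the same count, and the derived destructor runs when the last handle goes away.

```cpp
#include <cassert>
#include <memory>

struct C {
    virtual ~C() = default;
};

struct D : C {
    explicit D(int& releases) : releases_(releases) {}
    ~D() override { ++releases_; }   // stands in for releasing a resource
    int& releases_;
};

// Returns how many times ~D ran, or -1 if it ran too early.
int releasesAfterLastReference() {
    int releases = 0;
    {
        std::shared_ptr<C> c;
        {
            std::shared_ptr<D> d = std::make_shared<D>(releases);
            c = d;                     // base-typed handle joins the same count
        }                              // d is gone, c still keeps the D alive
        if (releases != 0) return -1;  // the D must not be destroyed yet
    }                                  // last handle gone: ~D runs here
    return releases;
}
```

Note that in this sketch the counting is attached to the static type of the handles (shared_ptr), not discovered from the dynamic type of the object.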

I agree this adds some overhead, but so what. Using boost::shared_ptr 
also imposes overhead and a whole generation of programmers have somehow 
survived the extra x bytes per object in an age where physical memory 
is in the gigabytes and virtual memory in 64 bit systems in the quadrabytes.

My added suggestion is that when applying the 'scope' keyword to the 
object, and not the type, this essentially means that the compiler now 
treats that object as 'scope' even though the type is not 'scope'. I 
will call this object 'scope' injection. My suggestion for this is based 
solely on the practical consequences:

1) Allow the instantiator of a type to have 'scope' control over the 
object even when the designer of the type does not specify it as a 
'scope' type. The user may know something about using the type at 
run-time that the class designer can not know, makes optional, or even 
disregards.

2) Following from the above, the most obvious practical cases occur when 
the type is created from a template class/struct, when the type is some 
built-in language container which can hold polymorphic objects of some 
base class type, and when the type embeds an object of 'scope' class 
type which may only be used in a corner case so that the designer of the 
type leaves scoping up to the user.

In other words the flexibility of control would be wonderful and, I 
believe, often necessary. Having both the class designer be able to 
'scope' the type and the end user be able to 'scope' an object of any 
type ( which has a destructor ) is the ultimate ideal. If you wanted to 
go even further you could allow the end-user to 'unscope' an object of a 
'scope' type when instantiating the object, even though the benefits of 
doing this seem to be practically negligible.

I view your choices as, from most desirable to least desirable:

1) Keep track of the objects themselves at run-time to see if they are 
'scope' or not. This allows object 'scope' injection and reference 
assignment to objects whose static type is not 'scope' to make them 'scope'.

2) Keep track of the dynamic type of the objects themselves in order to 
see whether the dynamic type of the object is 'scope'. This does not 
allow object 'scope' injection, but does allow reference assignment to 
objects whose static type is not 'scope' since you are only considering 
the dynamic type of the object and not the static type to determine if 
the object should be 'scope'.

3) Keep track of only the static type of the object. This does not allow 
object 'scope' injection nor even reference assignment to objects whose 
static type is not 'scope'.

I view choice 3 as pretty poor and would really like to see choice 1 
rather than choice 2 for practical reasons.

I hope this at least gives you food for further thought.

Feb 02 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 Your analogy to C++'s 'const' is a bad one.
 
 The C++ 'const' refers to a quality of the object while the D 'scope' 
 refers to a quality of the type. There is no equivalent of 'const' in 
 C++ which refers to the type.

'const' in C++ is very much a characteristic of the type of the object. 
It pervades the semantics of the type. It's easy to envision a scheme of 
const where the const-ness is controlled by a bit in the runtime 
instantiation of the object, and any mutating operations would first 
check that bit.

C++ avoids the overhead of that by having a static typing system, and 
such 'bits' become part of the compile-time information (i.e. the type) 
rather than the run-time information.

 Once we say that a type is 'scope' in D we 
 should no longer have to say that an object of that type is 'scope'. An 
 object of that type should be 'scope' automatically and the user of that 
 object should not care or even need to know. In C++ the user of an 
 object specifically says it is 'const' to set the quality of the object 
 to something ( one can not change the object ). Your analogy is mixing 
 apples and oranges. These are different things.

I don't believe they are different at all. Consider also languages that 
have no static types - the types are determined at runtime (Javascript 
is an example). What a language chooses to specify about an object at 
compile-time vs run-time is a spectrum with various tradeoffs, not an 
apples-oranges with a sharp dividing line.

Certainly, there is plenty of debate about static typing vs dynamic 
typing. D is a statically typed language primarily for performance 
reasons - a dynamically typed language can run 100x slower.


 What I am saying is that an object whose type is 'scope' is treated 
 magically by the compiler in that the compiler is now doing reference 
 counting on it and calling its destructor when the last reference goes 
 out of scope. Furthermore as that object gets reference assigned the 
 reference count is manipulated and whatever object is specified in that 
 reference assignment, as long as it is allowable by the compiler by the 
 rules of D, takes part in the 'scope' magic. In a polymorphic language 
 this means that you should associate 'scope' with the dynamic type of 
 the object, not its static type, and how you decide to do that is up to 
 you. Think of it as wrapping a boost::shared_ptr around the object and 
 for every object to which you legally assign/copy it a boost::shared_ptr 
 gets wrapped around that object.

Pulling on that string leads us to every object having scope semantics, 
because that machinery will have to exist and be checked at runtime for 
every object.

 I agree this adds some overhead, but so what.

And there lies the crux of our disagreement. My experience with memory 
allocation is that ref counting is appropriate for scarce resources, and 
gc is appropriate for abundant resources (i.e. memory). We both agree 
that gc is a poor choice for scarce resources, and I'm going to argue 
that rc is a poor choice for abundant resources.


 Using boost::shared_ptr 
 also imposes overhead and a whole generation of programmers have somehow 
 survived the extra x bytes per object in an age where physical memory 
 is in the gigabytes and virtual memory in 64 bit systems in the 
 quadrabytes.

There is a large push to add gc to C++. (rc has disadvantages besides 
using more memory - the overhead to allocate two objects instead of one, 
and the overhead of doing the inc/dec/test. A further disadvantage is 
you cannot do array slicing with rc without adding substantially more 
overhead - memory and runtime.)


 My added suggestion is that when applying the 'scope' keyword to the 
 object, and not the type, this essentially means that the compiler now 
 treats that object as 'scope' even though the type is not 'scope'. I 
 will call this object 'scope' injection. My suggestion for this is based 
 solely on the practical consequences:
 
 1) Allow the instantiator of a type to have 'scope' control over the 
 object even when the designer of the type does not specify it as a 
 'scope' type. The user may know something about using the type at 
 run-time that the class designer can not know, makes optional, or even 
 disregards.

I agree with you that the scopeness of an object is best determined by 
the user, not the class designer. But this doesn't preclude making scope 
part of the type any more than the user adding 'const' precludes it.


 2) Following from the above, the most obvious practical cases occur when 
 the type is created from a template class/struct, when the type is some 
 built-in language container which can hold polymorphic objects of some 
 base class type, and when the type embeds an object of 'scope' class 
 type which may only be used in a corner case so that the designer of the 
 type leaves scoping up to the user.
 
 In other words the flexibility of control would be wonderful and, I 
 believe, often necessary. Having both the class designer be able to 
 'scope' the type and the end user be able to 'scope' an object of any 
 type ( which has a destructor ) is the ultimate ideal. If you wanted to 
 go even further you could allow the end-user to 'unscope' an object of a 
 'scope' type when instantiating the object, even though the benefits of 
 doing this seem to be practically negligible.
 
 I view your choices as, from most desirable to least desirable:
 
 1) Keep track of the objects themselves at run-time to see if they are 
 'scope' or not. This allows object 'scope' injection and reference 
 assignment to objects whose static type is not 'scope' to make them 
 'scope'.
 
 2) Keep track of the dynamic type of the objects themselves in order to 
 see whether the dynamic type of the object is 'scope'. This does not 
 allow object 'scope' injection, but does allow reference assignment to 
 objects whose static type is not 'scope' since you are only considering 
 the dynamic type of the object and not the static type to determine if 
 the object should be 'scope'.
 
 3) Keep track of only the static type of the object. This does not allow 
 object 'scope' injection nor even reference assignment to objects whose 
 static type is not 'scope'.

I believe that adding scope to the type allows for scope 'injection' as 
you defined it. But you're right in that (3) does not allow an object to 
be dynamically retyped as scope, though it could be 'wrapped' at runtime 
with a proxy struct that is itself statically scoped.
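
A minimal C++ sketch of such a proxy, assuming a hypothetical Connection class whose designer gave it no RAII of its own. ScopedProxy is an illustrative name, not an existing library type. The scoping lives entirely in the static type of the wrapper, so the wrapped class needs no runtime flag.

```cpp
#include <cassert>

// Ordinary class with no built-in RAII; someone has to call close().
struct Connection {
    bool open = true;
    void close() { open = false; }
};

// Hypothetical proxy: cleanup is a property of the wrapper's static type.
template <typename T>
class ScopedProxy {
public:
    explicit ScopedProxy(T& target) : target_(&target) {}
    ~ScopedProxy() { target_->close(); }   // deterministic cleanup at scope exit

    // Non-copyable, so the cleanup runs exactly once.
    ScopedProxy(const ScopedProxy&) = delete;
    ScopedProxy& operator=(const ScopedProxy&) = delete;

    T* operator->() const { return target_; }

private:
    T* target_;
};

// The user of Connection, not its designer, decides to scope it.
bool closedAtScopeExit() {
    Connection conn;
    {
        ScopedProxy<Connection> guard(conn);
        if (!guard->open) return false;    // still open while the proxy lives
    }                                      // proxy destroyed: close() runs here
    return !conn.open;
}
```

This is the "injection" case: the decision to scope is made where the object is used, while the compiler still resolves everything statically.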

 I view choice 3 as pretty poor and would really like to see choice 1 
 rather than choice 2 for practical reasons.
 
 I hope this at least gives you food for further thought.

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 Your analogy to C++'s 'const' is a bad one.

 The C++ 'const' refers to a quality of the object while the D 'scope' 
 refers to a quality of the type. There is no equivalent of 'const' in 
 C++ which refers to the type.

 
 'const' in C++ is very much a characteristic of the type of the object. 
 It pervades the semantics of the type. It's easy to envision a scheme of 
 const where the const-ness is controlled by a bit in the runtime 
 instantiation of the object, and any mutating operations would first 
 check that bit.

In your example justifying treating 'scope' as C++ treats 'const' the 
'const' is attached to the object upon instantiation of it. My point is 
that 'scope' is attached to the type by the class designer. To me these 
are conceptually two different things. That is why I said that your 
example is a poor analogy. I should have made that clearer by my argument.

We can argue the concept similarity/dissimilarity between 'const' and 
'scope' all day without getting much of anywhere. I was simply arguing 
against your assertion, using 'const' as an example, of not allowing a 
'scope' object to be assigned to an object that is not 'scope'. Clearly 
in the polymorphic world of D, where a base class may not be 'scope' 
while a derived class may be 'scope', such a treatment can not be right, 
since the basis of polymorphism is to assign to a base class object a 
derived class reference.

 
 C++ avoids the overhead of that by having a static typing system, and 
 such 'bits' become part of the compile-time information (i.e. the type) 
 rather than the run-time information.
 
 Once we say that a type is 'scope' in D we should no longer have to 
 say that an object of that type is 'scope'. An object of that type 
 should be 'scope' automatically and the user of that object should not 
 care or even need to know. In C++ the user of an object specifically 
 says it is 'const' to set the quality of the object to something ( one 
 can not change the object ). Your analogy is mixing apples and 
 oranges. These are different things.

 
 I don't believe they are different at all. Consider also languages that 
 have no static types - the types are determined at runtime (Javascript 
 is an example). What a language chooses to specify about an object at 
 compile-time vs run-time is a spectrum with various tradeoffs, not an 
 apples-oranges with a sharp dividing line.
 
 Certainly, there is plenty of debate about static typing vs dynamic 
 typing. D is a statically typed language primarily for performance 
 reasons - a dynamically typed language can run 100x slower.

I am fully cognizant of a dynamically typed language since I program in 
Python also. I agree there is no fixed dividing line. But the difference 
between static typing and dynamic typing is well defined in a statically 
typed language like D. My argument was that for 'scope' to be really 
effective it needs to consider the dynamic type at run-time and not just 
the static type as it exists at compile time.

 
 
 What I am saying is that an object whose type is 'scope' is treated 
 magically by the compiler in that the compiler is now doing reference 
 counting on it and calling its destructor when the last reference goes 
 out of scope. Furthermore as that object gets reference assigned the 
 reference count is manipulated and whatever object is specified in 
 that reference assignment, as long as it is allowable by the compiler 
 by the rules of D, takes part in the 'scope' magic. In a polymorphic 
 language this means that you should associate 'scope' with the dynamic 
 type of the object, not its static type, and how you decide to do that 
 is up to you. Think of it as wrapping a boost::shared_ptr around the 
 object and for every object to which you legally assign/copy it a 
 boost::shared_ptr gets wrapped around that object.

 
 Pulling on that string leads us to every object having scope semantics, 
 because that machinery will have to exist and be checked at runtime for 
 every object.

When the scope changes in D you need to make sure that any 'scope' 
object is treated appropriately.

But I do not see why you think that every object must therefore have 
scope semantics. Inserting compile time code when a scope changes to 
treat 'scope' objects in a special way does not mean to me that every 
object must have scope semantics. Perhaps you mean that the execution 
slows down a GC system too much to do that for every object.

If that is the case then I agree that RAII can not be done in a GC 
language in the terms in which I have defined it, although it probably 
can be done in lesser terms, as 'scope' currently exists in D.

 
 I agree this adds some overhead, but so what.

 
 And there lies the crux of our disagreement. My experience with memory 
 allocation is that ref counting is appropriate for scarce resources, and 
 gc is appropriate for abundant resources (i.e. memory). We both agree 
 that gc is a poor choice for scarce resources, and I'm going to argue 
 that rc is a poor choice for abundant resources.

You have already won that argument as I fully agree to what you say 
above. But I have no idea why you say that it is the crux of our 
disagreement. Care to elaborate ?

 
 
 Using boost::shared_ptr also imposes overhead and a whole generation of 
 programmers have somehow survived the extra x bytes per object in an 
 age where physical memory is in the gigabytes and virtual memory in 
 64 bit systems in the quadrabytes.

 
 There is a large push to add gc to C++. (rc has disadvantages besides 
 using more memory - the overhead to allocate two objects instead of one, 
 and the overhead of doing the inc/dec/test. A further disadvantage is 
 you cannot do array slicing with rc without adding substantially more 
 overhead - memory and runtime.)
 
 
 My added suggestion is that when applying the 'scope' keyword to the 
 object, and not the type, this essentially means that the compiler now 
 treats that object as 'scope' even though the type is not 'scope'. I 
 will call this object 'scope' injection. My suggestion for this is 
 based solely on the practical consequences:

 1) Allow the instantiator of a type to have 'scope' control over the 
 object even when the designer of the type does not specify it as a 
 'scope' type. The user may know something about using the type at 
 run-time that the class designer can not know, makes optional, or even 
 disregards.

 
 I agree with you that the scopeness of an object is best determined by 
 the user, not the class designer. But this doesn't preclude making scope 
 part of the type any more than the user adding 'const' precludes it.

No, I do not think that the scopeness of an object is best determined by 
the user and not the class designer. In fact I feel very strongly the 
opposite. The class designer knows whether his class has RAII or not and 
in the vast majority of cases the end user should not know or care.

My argument for scope injection is based purely on the practical 
consideration that there are types which can not possibly know whether 
they are to be used with RAII or not. Template classes/structs which are 
containers are the most obvious as well as built-in arrays. That is why 
besides the ability for the class designer to specify 'scope' the end 
user should be able to do it at object creation time also.
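
The container case can be sketched in C++ with a std::vector of std::shared_ptr handles. Resource and its live counter are hypothetical names used only for illustration. The container type itself knows nothing about RAII; the user opts in by choosing a counted element type when creating it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical resource that tracks how many instances are alive.
struct Resource {
    explicit Resource(int& live) : live_(live) { ++live_; }
    ~Resource() { --live_; }
    int& live_;
};

// Returns how many resources are still alive after the container is gone.
int liveAfterContainerDies() {
    int live = 0;
    std::shared_ptr<Resource> survivor;
    {
        std::vector<std::shared_ptr<Resource>> items;
        for (int i = 0; i < 3; ++i)
            items.push_back(std::make_shared<Resource>(live));
        survivor = items[0];   // one element is also referenced from outside
    }                          // vector destroyed: the other two are released now
    return live;               // only the externally referenced one remains
}
```

The two unreferenced elements are destroyed deterministically when the container dies, while the one still referenced elsewhere lives on.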

 
 
 2) Following from the above, the most obvious practical cases occur 
 when the type is created from a template class/struct, when the type 
 is some built-in language container which can hold polymorphic objects 
 of some base class type, and when the type embeds an object of 'scope' 
 class type which may only be used in a corner case so that the 
 designer of the type leaves scoping up to the user.

 In other words the flexibility of control would be wonderful and, I 
 believe, often necessary. Having both the class designer be able to 
 'scope' the type and the end user be able to 'scope' an object of any 
 type ( which has a destructor ) is the ultimate ideal. If you wanted 
 to go even further you could allow the end-user to 'unscope' an object 
 of a 'scope' type when instantiating the object, even though the 
 benefits of doing this seem to be practically negligible.

 I view your choices as, from most desirable to least desirable:

 1) Keep track of the objects themselves at run-time to see if they 
 are 'scope' or not. This allows object 'scope' injection and reference 
 assignment to objects whose static type is not 'scope' to make them 
 'scope'.

 2) Keep track of the dynamic type of the objects themselves in order 
 to see whether the dynamic type of the object is 'scope'. This does 
 not allow object 'scope' injection, but does allow reference 
 assignment to objects whose static type is not 'scope' since you are 
 only considering the dynamic type of the object and not the static 
 type to determine if the object should be 'scope'.

 3) Keep track of only the static type of the object. This does not 
 allow object 'scope' injection nor even reference assignment to 
 objects whose static type is not 'scope'.

 
 I believe that adding scope to the type allows for scope 'injection' as 
 you defined it. But you're right in that (3) does not allow an object to 
 be dynamically retyped as scope, though it could be 'wrapped' at runtime 
 with a proxy struct that is itself statically scoped.

Whether one does 'scope' injection using the 'scope' keyword on the 
object when it is declared or by using the equivalent of a 
boost::shared_ptr construct in D is of little practical matter to me.
This is purely syntax so that if D could silently translate 'scope' to 
such a boost::shared_ptr construct that would be better IMO because it 
would unite such a treatment under the same concept with the 'scope' 
keyword as it applies to a class.

The crux of my argument against 3) above is simply that the end user 
will not and should not be expected to know that an object is of a 
'scope' type.

// In a module

class C { ... }
scope class D : C { ... }

// In the end user's code

C d = new D(...);

Under 3) the d object is not 'scope', because its static type is not 
'scope' even though its dynamic type is 'scope'. This can not be right 
IMO. Requiring the user to have knowledge that D is a 'scope' class 
negates a great deal of the transparency of having RAII in a GC language.

I can understand your feeling that the above should be:

class C { ... }
scope class D : C { ... }

// In the end user's code

scope C d = new D(...); // End user is required here to specify scope

This may make things much easier for the compiler, but it requires the 
end user knowledge of 'scope', which has been specified at the class 
level, to be applied at the syntax level. Intuitively I feel the 
compiler can figure this out, and that 'scope' should largely be totally 
transparent to the end user above at the syntax level.

I do agree that the end user should "know" that a class is 'scope' 
(RAII) by reading the documentation of that class. This is useful for scope 
injection for container objects and for the end user designing his own 
class as 'scope' when an object of a 'scope' class is a data member.

Feb 03 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-03 08:20:32 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check if 
you're dealing with a reference-counted object each time you copy a 
reference to that object to see if the reference count needs 
adjusting. This is significant overhead over the "just copy the 
pointer" thing you can do in a GC. Basically, just checking this will 
increase by two or three times the time it takes to copy an object 
reference... I can see why Walter doesn't want that.

Besides, the overhead of actually checking the type of the class will be 
approximately the same as doing the reference counting. Given this, 
it's much better to always just do the reference counting than checking 
dynamically if it's needed.
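
The three ways of copying a reference compared above can be sketched in C++ as follows. The Object layout, with its per-object counted flag, is a hypothetical stand-in for "check the dynamic type"; no timing claims are made here, only what work each copy has to do.

```cpp
#include <cassert>

// Hypothetical object header: a runtime flag plus a reference count.
struct Object {
    bool counted;   // "is this object's dynamic type a scope type?"
    long refs;
};

// Dynamic scheme: every reference copy has to test the flag first.
inline Object* copyRefChecked(Object* o) {
    if (o->counted) ++o->refs;   // a branch on every copy, counted or not
    return o;
}

// Static scheme, counted type: the compiler already knows, so no test.
inline Object* copyRefCounted(Object* o) {
    ++o->refs;
    return o;
}

// Static scheme, plain GC type: just copy the pointer.
inline Object* copyRefPlain(Object* o) {
    return o;
}
```

With static typing, each variable uses either the second or the third function, chosen at compile time; only the dynamic scheme pays for the test on every copy of every object.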


 class C { ... }
 scope class D : C { ... }
 
 [...]
 
 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

Well, if the compiler is to be able to distinguish scope at compile 
time, then it needs a scope flag (either explicit or implicit) on each 
variable. This is exactly what Walter has proposed to do. He prefers 
the explicit route because going implicit isn't going to work in too 
many cases. For instance, let's have a function that returns a C:

	C makeOne() {
		if (/* random stuff here */)
			return new C;
		else
			return new D;
	}

Now let's call the function:

	C c = makeOne();

How can you know at compile time if the returned object of that 
function call is scoped or not? You can't, and therefore the compiler 
would need to add code to check if the returned object is scope or not, 
with a significant overhead, each time you assign a C.

If however you make scope known at compile time:

	scope C makeOne() {
		if (/* random stuff here */)
			return new C;
		else
			return new D;
	}

	scope C c = makeOne();

Now the compiler knows it must generate reference counting code for 
this assignment, and for any subsequent assignment of this type, and 
it won't have to generate code to dynamically check the "scopeness" 
everywhere you use a C.

If makeOne returns a C, it'll simply be scope too, which is more 
overhead than having a garbage-collected C, but, as I said earlier, not 
necessarily less than checking dynamically if it should be reference 
counted.

Perhaps Walter can confirm that the above code makes sense given what 
he intends to do, but I believe it does.
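
To make the "scope known at compile time" idea concrete, here is a rough C++ analogue. It is only a sketch under stated assumptions: std::shared_ptr stands in for a reference-counted "scope C" handle, the `alive` counter and the `useOne` helper are made up for the demonstration, and none of this is meant to describe what Walter will actually implement in D.

```cpp
#include <cassert>
#include <memory>

int alive = 0;   // hypothetical counter of live objects, for the demo only

// Stand-ins for the C and D classes in the example above.
struct C {
    C() { ++alive; }
    virtual ~C() { --alive; }
};
struct D : C {};

// The return type plays the role of "scope C": every handle of this static
// type is reference-counted, whatever the dynamic type turns out to be.
std::shared_ptr<C> makeOne(bool wantD) {
    if (wantD)
        return std::make_shared<D>();
    return std::make_shared<C>();
}

// Returns the number of live objects seen inside the scope.
int useOne(bool wantD) {
    std::shared_ptr<C> c = makeOne(wantD);  // one owner
    std::shared_ptr<C> copy = c;            // two owners, no dynamic type check
    assert(c.use_count() == 2);
    return alive;                           // one object, two handles
}   // both handles go away here and the object is destroyed deterministically
```

The counting code is chosen from the static type of the handle alone, so a plain C coming out of makeOne pays for counting too, which is the trade-off described above.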

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 03 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program 
 in Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' 
 to be really effective it needs to consider the dynamic type at 
 run-time and not just the static type as it exist at compile time.

 
 Considering the dynamic type at runtime means you need to check if 
 you're dealing with a reference-counted object each time you copy a 
 reference to that object to see if it the reference count needs 
 adjusting. This is significant overhead over the "just copy the pointer" 
 thing you can do in a GC. Basically, just checking this will increase by 
 two or three times the time it take to copy an object reference... I can 
 see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the 
compiler statically checking the type of an object or dynamically 
checking the type of an object, and the run-time costs involved.

Yet clearly D already has to implement code when scopes come to an end 
in order to destroy stack-based objects, since structs (user-defined 
value types) are already supported and can have destructors. So the 
added overhead goes from having to identify structs which must have 
their destructor called at the end of each scope to having to also 
identify 'scope' objects which must have their reference count 
decremented at the end of each scope and have their destructor called if 
the reference count reaches 0. The only difference I see, aside from the 
run-time overhead, is the actual identification for a greater set 
of objects.

 
 Beside, the overhead of actually checking the type of the class will be 
 approximativly the same as doing the reference counting. Given this, 
 it's much better to always just do the reference counting than checking 
 dynamically if it's needed.
 
 
 class C { ... }
 scope class D : C { ... }

 [...]

 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

 
 Well, if the compiler is to be able to distinguish scope at compile 
 time, then it needs a scope flag (either explicit or implicit) on each 
 variable. This is exactly what Walter has proposed to do. He prefers the 
 explicit route because going implicit isn't going to work in too many 
 cases. For instance, let's have a function that returns a C:
 
     C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }
 
 Now let's call the function:
 
     C c = makeOne();
 
 How can you know at compile time if the returned object of that function 
 call is scoped or not? You can't, and therfore the compiler would need 
 to add code to check if the returned object is scope or not, with a 
 significant overhead, each time you assign a C.
 
 If however you make scope known at compile time:
 
     scope C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }
 
     scope C c = makeOne();
 
 Now the compiler knows it must generate reference counting code for the 
 following assignment, and any subsequent assignment of this type, and it 
 won't have to generate code to dynamically everywhere you use a C check 
 the "scopeness".

Would you agree that all you are doing here is specifically telling the 
compiler that an object is 'scope' when it is created rather than having 
the compiler figure it out for itself by querying the dynamic type of 
the object at creation time?

If you do, then a much simpler, and to the point, example would be based 
on my initial OP:

scope class C { ... }

scope C c = new C(...);

I specified that the scope keyword for creating the object is redundant. 
The compiler can figure it out. The major difference in opinion is that 
I think the compiler should figure it out from the dynamic type of the 
object at run-time and not from the static type of the object.

If Walter decides that creating code which at run-time determines the 
dynamic type of an object in order to implement RAII in D is too much 
overhead, I will understand. But I do not think it will be a solution for 
RAII in GC in my own understanding of what this should entail.

Feb 03 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-03 10:42:03 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exist at compile time.

 
 Considering the dynamic type at runtime means you need to check if 
 you're dealing with a reference-counted object each time you copy a 
 reference to that object to see if it the reference count needs 
 adjusting. This is significant overhead over the "just copy the 
 pointer" thing you can do in a GC. Basically, just checking this will 
 increase by two or three times the time it take to copy an object 
 reference... I can see why Walter doesn't want that.

 
 I am not knowledgable about the actual low-level difference between the 
 compiler statically checking the type of an object or dynamically 
 checking the type of an object, and the run-time costs involved.
 
 Yet clearly D already has to implement code when scopes come to an end 
 in order to destroy stack-based objects, since structs ( user-define 
 value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an 
explicit call to the destructor at the end of the scope. The scope 
object cannot exist outside the scope, and thus no reference counting 
is needed in the way it's implemented currently.
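
As an illustration of that "destructor call at the end of the scope" strategy, here is a small C++ analogue. It is a sketch, not D's actual implementation; the `Resource` type and the log string are hypothetical and only there to make the order of events visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical resource type; it appends to a log so the order is visible.
struct Resource {
    std::string& log;
    explicit Resource(std::string& l) : log(l) { log += "acquire;"; }
    ~Resource() { log += "release;"; }
};

std::string runScope() {
    std::string log;
    {
        Resource r(log);   // lives on the stack and cannot escape this block
        log += "work;";
    }                      // the compiler emits the destructor call here
    assert(log == "acquire;work;release;");
    log += "after;";
    return log;
}
```

Nothing is counted and nothing is checked at run time: the compiler knows statically where the object dies, which is why this approach is cheap but cannot follow a reference that outlives the block.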

 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to also 
 identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know 
exactly the type of the struct at that time. Classes are polymorphic, 
so it'd be a costly runtime check to know that, and that check is 
almost as costly as doing the reference counting itself. Given that, 
you should probably not bother at runtime and decide at compile time to 
just treat any class which has the potential to be a scope class as if 
it were one and actually do the reference counting.
 
 
 
 Beside, the overhead of actually checking the type of the class will be 
 approximativly the same as doing the reference counting. Given this, 
 it's much better to always just do the reference counting than checking 
 dynamically if it's needed.
 
 
 class C { ... }
 scope class D : C { ... }
 
 [...]
 
 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

 
 Well, if the compiler is to be able to distinguish scope at compile 
 time, then it needs a scope flag (either explicit or implicit) on each 
 variable. This is exactly what Walter has proposed to do. He prefers 
 the explicit route because going implicit isn't going to work in too 
 many cases. For instance, let's have a function that returns a C:
 
     C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }
 
 Now let's call the function:
 
     C c = makeOne();
 
 How can you know at compile time if the returned object of that 
 function call is scoped or not? You can't, and therfore the compiler 
 would need to add code to check if the returned object is scope or not, 
 with a significant overhead, each time you assign a C.
 
 If however you make scope known at compile time:
 
     scope C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }
 
     scope C c = makeOne();
 
 Now the compiler knows it must generate reference counting code for the 
 following assignment, and any subsequent assignment of this type, and 
 it won't have to generate code to dynamically everywhere you use a C 
 check the "scopeness".

 
 Would you agree that all you are doing here is specifically telling the 
 compiler that an object is 'scope' when it is created rather than 
 having the compiler figure it out for itself by querying the dynamic 
 type of the object at creation time ?

The compiler doesn't know what happens within every function call. So 
it can only check at runtime whether the function returned a C or a D.

 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:
 
 scope class C { ... }
 
 scope C c = new C(...);
 
 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

You're perfectly right: it is redundant in *this* case, and you could 
have the compiler implicitly understand that C is a scope class in 
*this* case. But consider this example:

	Object o;
	if (/* random value */)
		o = new C; // C is a scope class
	else
		o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* 
later create a C object and assign it to o, or should line 3 give an 
error since the type Object isn't scope and C must only be assigned as 
scope? I'd say it should be an error.

This however could be made legal without too much difficulty:

	scope Object o;
	if (/* random value */)
		o = new C; // C is a scope class
	else
		o = new Object; // Object is the base class of C but isn't scope

Basically, you're declaring a scope Object. While Object isn't 
necessarily a scope class, you are telling the compiler to treat it as 
scope, and thus an instance of C, which must be scope, *can* be put in 
this variable. If o wasn't scope, it'd be an error to put an instance 
of a scope class in it.

But there are still many holes in this scheme in which scope now means 
reference-counted. Take this example:

	class A {
		void doSomething() {
			globalReferences ~= this;
		}
	}
	scope class B : A { }

	A[] globalReferences;

	// Scope could be made implicit here, but it's irrelevant to my example
	scope B b = new B;
	b.doSomething();

This last statement would call A.doSomething, which would put a 
non-scoped reference in globalReferences, which would fail to retain 
the object. There are two ways around that: ignore the problem and let 
the programmer handle these cases (basically, that is what 
boost::shared_ptr would do in such a situation), or introduce a new 
keyword to decorate parameters for functions that do not keep any 
reference beyond their own call, so that you don't need to duplicate 
all your functions for a scope and a non-scope parameter (much like const 
is the middle ground between mutable and invariant).

(Sidenote: this keyword could be useful to implement something like 
"unique" as it was discussed in another thread, as it'd allow functions 
to be called with a unique parameter and guarantee that no external 
references are kept after the call, thus preserving uniqueness.)
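
For what the "ignore the problem" route looks like in practice, here is a C++ sketch of the same hole using std::shared_ptr (the standard-library descendant of boost::shared_ptr). The names mirror the D example above; the `destroyed` flag and the `leakReference` helper are hypothetical additions for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct A;
std::vector<A*> globalReferences;   // non-counted references, like A[] above
bool destroyed = false;             // hypothetical flag, for the demo only

struct A {
    virtual ~A() { destroyed = true; }
    void doSomething() { globalReferences.push_back(this); }  // raw 'this' escapes
};
struct B : A {};

// Returns how many owners the count knew about while b was alive.
long leakReference() {
    std::shared_ptr<B> b = std::make_shared<B>();
    b->doSomething();
    assert(globalReferences.back() == b.get());  // the global now points at *b
    return b.use_count();   // the raw pointer in globalReferences is not counted
}   // the count drops to zero here and the object is destroyed
```

After leakReference returns, globalReferences still holds a pointer to an object that no longer exists. The sketch never dereferences it, but real code would, and that is exactly the case a "does not keep a reference" parameter decoration would let the compiler reject.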


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 04 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I program 
 in Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined 
 in a statically typed language like D. My argument was that for 
 'scope' to be really effective it needs to consider the dynamic type 
 at run-time and not just the static type as it exist at compile time.

 Considering the dynamic type at runtime means you need to check if 
 you're dealing with a reference-counted object each time you copy a 
 reference to that object to see if it the reference count needs 
 adjusting. This is significant overhead over the "just copy the 
 pointer" thing you can do in a GC. Basically, just checking this will 
 increase by two or three times the time it take to copy an object 
 reference... I can see why Walter doesn't want that.

 I am not knowledgable about the actual low-level difference between 
 the compiler statically checking the type of an object or dynamically 
 checking the type of an object, and the run-time costs involved.

 Yet clearly D already has to implement code when scopes come to an end 
 in order to destroy stack-based objects, since structs ( user-define 
 value types ) are already supported and can have destructors.

 
 Yes, and this is implemented in a simple and naive way: by adding an 
 explicit call to the destructor at the end of the scope. The scope 
 object cannot exist outside the scope, and thus no reference counting is 
 needed in the way it's implemented currently.

The reference counting would be implemented only for a 'scope' object. 
The main overhead at the end of each scope is going through all 
the objects to determine which is a 'scope' object. Perhaps this is too 
expensive, but it would at least be interesting to see if it is or not.

 
 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to 
 also identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

 
 Well, identifying structs can be done at compile time since you know 
 exactly the type of the struct at that time. Classes are polymorphic, so 
 it'd be a costly runtime check to know that, and that check is almost as 
 costly as doing the reference counting itself. Given that, you should 
 probably not bother at runtime and decide at compile time to just treat 
 any class which has the potential to be a scope class as if it were one 
 and actually do the reference counting.

Your point is well taken, but I still would like to see if the check for 
a 'scope' object would be that expensive. It could be as easy as 
checking an extra 'int' for reference counting for each object and 
seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).

 Beside, the overhead of actually checking the type of the class will 
 be approximativly the same as doing the reference counting. Given 
 this, it's much better to always just do the reference counting than 
 checking dynamically if it's needed.


 class C { ... }
 scope class D : C { ... }

 [...]

 This may make things much easier for the compiler, but it requires 
 the end user knowledge of 'scope', which has been specified at the 
 class level, to be applied at the syntax level. Intuitively I feel 
 the compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

 Well, if the compiler is to be able to distinguish scope at compile 
 time, then it needs a scope flag (either explicit or implicit) on 
 each variable. This is exactly what Walter has proposed to do. He 
 prefers the explicit route because going implicit isn't going to work 
 in too many cases. For instance, let's have a function that returns a C:

     C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }

 Now let's call the function:

     C c = makeOne();

 How can you know at compile time if the returned object of that 
 function call is scoped or not? You can't, and therfore the compiler 
 would need to add code to check if the returned object is scope or 
 not, with a significant overhead, each time you assign a C.

 If however you make scope known at compile time:

     scope C makeOne() {
         if (/* random stuff here */)
             return new C;
         else
             return new D;
     }

     scope C c = makeOne();

 Now the compiler knows it must generate reference counting code for 
 the following assignment, and any subsequent assignment of this type, 
 and it won't have to generate code to dynamically everywhere you use 
 a C check the "scopeness".

 Would you agree that all you are doing here is specifically telling 
 the compiler that an object is 'scope' when it is created rather than 
 having the compiler figure it out for itself by querying the dynamic 
 type of the object at creation time ?

 
 The compiler isn't knowleadgeable of what happens whithin every function 
 call. So it can only check at runtime if the function returned at C or a D.

Fully agreed.

 
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:

 scope class C { ... }

 scope C c = new C(...);

 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

 
 You're prefectly right: it is redundent in *this* case, and you could 
 have the compiler implicitly understand that C is a scope class in 
 *this* case. But consider this example:
 
     Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Now, should o be automatically reference-counted because you *could* 
 later create a C object and assing it to o, or should line 3 gives an 
 error since the type Object isn't scope and C must only be assigned as 
 scope? I'd say it should be an error.

I say it should be a 'scope' object. The dynamic type of o is that of a 
'scope' class.

 
 This however could be made legal without too much difficulty:
 
     scope Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Basically, you're declaring a scope Object. While Object isn't 
 necessarly a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put in 
 this variable. If o wasn't scope, it'd be an error to put an instance of 
 a scope class in it.

But then the end-user is required to know that C is a scope class. I 
do not think that should be necessary.

The whole point of 'scope' ( RAII ) in GC is that, for the most part, an 
end-user should instantiate and use 'scope' classes just as he would 
normal GC classes, with the language taking care to automatically 
destruct an object of a 'scope' class just as soon as the last reference 
to that object goes out of scope.

 
 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B { }
 
     A[] globalReferences;
 
     scope B b = new B; // Scope could be made implicit here, but it's 
 irrelevant to my example
     b.doSomething();
 
 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain the 
 object. There are two ways around that: ignore the problem and let the 
 programmer handle these cases (basically, that is what boost::shared_ptr 
 would do in such a situation), or introduce a new keyword to decorate 
 parameters for functions that do not keep any reference beyound their 
 own call so that you don't need to duplicate all your functions for a 
 scope and non-scope parameter (much like const is the middle ground 
 between mutable and invariant).

No, A.doSomething would put a 'scoped' reference in a non-scope array. 
However if we specify 'scope A[] globalReferences;' we can solve that 
problem.

Of course we may not control the declaration of 'A[] globalReferences;'. 
I acknowledge that.

Feb 04 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-04 22:43:21 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exist at compile time.

 
 Considering the dynamic type at runtime means you need to check if 
 you're dealing with a reference-counted object each time you copy a 
 reference to that object to see if it the reference count needs 
 adjusting. This is significant overhead over the "just copy the 
 pointer" thing you can do in a GC. Basically, just checking this will 
 increase by two or three times the time it take to copy an object 
 reference... I can see why Walter doesn't want that.

 
 I am not knowledgable about the actual low-level difference between the 
 compiler statically checking the type of an object or dynamically 
 checking the type of an object, and the run-time costs involved.
 
 Yet clearly D already has to implement code when scopes come to an end 
 in order to destroy stack-based objects, since structs ( user-define 
 value types ) are already supported and can have destructors.

 
 Yes, and this is implemented in a simple and naive way: by adding an 
 explicit call to the destructor at the end of the scope. The scope 
 object cannot exist outside the scope, and thus no reference counting 
 is needed in the way it's implemented currently.

 
 The reference counting would only be implemented for a 'scope' object 
 only. The main overhead at the end of each scope is going through all 
 the objects to determine which is a 'scope' object. Perhaps this is too 
 expensive, but it would at least be interesting to see if it is or not.
 
 
 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to also 
 identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

 
 Well, identifying structs can be done at compile time since you know 
 exactly the type of the struct at that time. Classes are polymorphic, 
 so it'd be a costly runtime check to know that, and that check is 
 almost as costly as doing the reference counting itself. Given that, 
 you should probably not bother at runtime and decide at compile time to 
 just treat any class which has the potential to be a scope class as if 
 it were one and actually do the reference counting.

 
 Your point is well taken, but I still would like to see if the check 
 for a 'scope' object would be that expensive. It could be as easy as 
 checking an extra 'int' for reference counting for each object and 
 seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).

Basically, you need to:

1. Load the object's pointer in a register
2. Load the "scope" flag from memory by offsetting the object's pointer
3. Branch depending on that flag:
   a. if not scope, go to 4.
   b. if scope, do whatever is needed to increment the reference count 
      atomically, then go to 4.
4. Write the pointer to its new location.

That's a lot of extra work you'd have to do at every copy of an 
object's pointer to perform that check. That branch operation could 
become very expensive if the processor can't predict it right, and 
loading from an additional, possibly far away, memory block could mean 
missing the memory cache more often too.

Steps 1 and 4 are all you need if you don't care about scope.
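
Here is roughly what those steps look like when written out, as a C++ sketch. The `Object` header layout, the `isScope` flag and the function names are all assumptions made for illustration. The sketch also covers only the increment for the new reference; releasing whatever the destination pointed to before would add yet more work.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object header: a flag saying whether the object is "scope",
// plus a count that is only maintained when the flag is set.
struct Object {
    bool isScope;
    std::atomic<int> refCount{0};
    explicit Object(bool scope) : isScope(scope) {}
};

// Plain GC-style copy: steps 1 and 4 only.
inline void copyPlain(Object*& dst, Object* src) {
    dst = src;
}

// Dynamically checked copy: steps 1 to 4.
inline void copyChecked(Object*& dst, Object* src) {
    if (src != nullptr && src->isScope)   // steps 2 and 3: load the flag, branch
        src->refCount.fetch_add(1, std::memory_order_relaxed);  // step 3b
    dst = src;                            // step 4
}

// Returns the count of a scope object after two checked copies.
int demoCopies() {
    Object scoped(true);
    Object* a = nullptr;
    Object* b = nullptr;
    copyChecked(a, &scoped);
    copyChecked(b, a);
    assert(a == b);
    return scoped.refCount.load();
}
```

Even in this stripped-down form, copyChecked touches the object's memory and branches on every copy, while copyPlain is a single store.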


 The compiler isn't knowleadgeable of what happens whithin every 
 function call. So it can only check at runtime if the function returned 
 at C or a D.

 
 Fully agreed.
 
 
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:
 
 scope class C { ... }
 
 scope C c = new C(...);
 
 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

 
 You're prefectly right: it is redundent in *this* case, and you could 
 have the compiler implicitly understand that C is a scope class in 
 *this* case. But consider this example:
 
     Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Now, should o be automatically reference-counted because you *could* 
 later create a C object and assing it to o, or should line 3 gives an 
 error since the type Object isn't scope and C must only be assigned as 
 scope? I'd say it should be an error.

 
 I say it should be a 'scope' object. The dynamic type of o is that of a 
 'scope' class.

Hum, dynamic scope typing again? If you had that it'd work, sure, but 
since we surely won't have that this isn't an option.


 This however could be made legal without too much difficulty:
 
     scope Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Basically, you're declaring a scope Object. While Object isn't 
 necessarly a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put in 
 this variable. If o wasn't scope, it'd be an error to put an instance 
 of a scope class in it.

 
 But then the end-user is required to know that the C is a scope class. 
 I do not think that should be necessary.

Perhaps not, I don't have a strong opinion on that. But I firmly believe 
scope should be enforced statically, not dynamically, and that's what 
I'm arguing for.

 The whole point of 'scope' ( RAII ) in GC is that, for the most part, 
 an end-user should instantiate and use 'scope' classes just as he would 
 normal GC classes, with the language taking care to automatically 
 destruct an object of a 'scope' class just as soon as the last 
 reference to that object goes out of scope.

Well, perhaps there's a solution that would do what you want while 
still keeping it compile-time only. It's some sort of compromise. Take 
these three classes:

	class A {}
	scope class B : A {}
	scope class C : B {}

B and C are scope, A isn't. Now, what if writing "B" was equivalent to 
writing "scope B" (since B is scope) and "C" was equivalent to writing 
"scope C". Obviously, writing "A" wouldn't be equivalent to "scope A" 
(because A is not scope). Then you could have:

	A a1 = new A;
	A a2 = new B; // illegal: B is scope, cannot be assigned to non-scope A
	scope A a3 = new B; // legal: B is scope and scope A is (explicitly) scope

	B b1 = new B;
	B b2 = new C; // legal: C is scope and B is (implicitly) scope
	scope B b3 = new C; // same as above

That would mean that you'd only have to explicitly write scope if you're 
using the non-scope base class as a type to hold a reference to your 
scope object.
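
The same compromise can be mimicked in C++ with a compile-time type selection, which may help show that no run-time check is involved. This is only an analogy: `ScopeTag`, `Ref`, `ScopeRef` and `implicitScopeDemo` are invented names, and std::shared_ptr stands in for whatever counting scheme D would use.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical marker: a class opts in to "scope" by deriving from ScopeTag.
struct ScopeTag {};

struct A { virtual ~A() = default; };
struct B : A, ScopeTag {};   // like "scope class B : A"
struct C : B {};             // inherits scope from B

template <class T>
constexpr bool isScope = std::is_base_of<ScopeTag, T>::value;

// Ref<T> is what writing the bare class name would mean under the proposal:
// counted for scope classes, a plain pointer otherwise.
template <class T>
using Ref = typename std::conditional<isScope<T>, std::shared_ptr<T>, T*>::type;

// ScopeRef<T> corresponds to writing "scope T" explicitly.
template <class T>
using ScopeRef = std::shared_ptr<T>;

static_assert(std::is_same<Ref<A>, A*>::value, "A is not scope");
static_assert(std::is_same<Ref<B>, std::shared_ptr<B>>::value, "B is implicitly scope");
static_assert(std::is_same<Ref<C>, std::shared_ptr<C>>::value, "C inherits scope from B");

long implicitScopeDemo() {
    ScopeRef<A> a3 = std::make_shared<B>();  // legal: explicit "scope A"
    Ref<B> b2 = std::make_shared<C>();       // legal: B is implicitly scope
    // Ref<A> a2 = std::make_shared<B>();    // would not compile: A* cannot hold it
    assert(a3.use_count() == 1);
    return b2.use_count();
}
```

The commented-out line is the counterpart of the illegal `A a2 = new B;` above: the mismatch is caught by the type system at compile time, not by inspecting the object.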


 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B { }
 
     A[] globalReferences;
 
     scope B b = new B; // Scope could be made implicit here, but it's 
 irrelevant to my example
     b.doSomething();
 
 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain 
 the object. There are two ways around that: ignore the problem and let 
 the programmer handle these cases (basically, that is what 
 boost::shared_ptr would do in such a situation), or introduce a new 
 keyword to decorate parameters for functions that do not keep any 
 reference beyound their own call so that you don't need to duplicate 
 all your functions for a scope and non-scope parameter (much like const 
 is the middle ground between mutable and invariant).

 
 No, A.doSomething would put a 'scoped' reference in a non-scope array. 
 However if we specify 'scope A[] globalReferences;' we can solve that 
 problem.

Sure, you're solving the problem nicely. But how does the compiler 
find out there's a problem in the first place? It needs to know that 
the this parameter is scope, and thus the member function should be 
decorated scope (just like you'd do with invariant). So you'd need to 
duplicate every member function so that it can be used either as scope 
or non-scope, and that's not very interesting unless you can declare 
that the function does not need to know if the parameter is typed scope 
or not (just like const means you don't know if it's invariant or 
mutable).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 04 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:

 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
    a. if not scope, go to 4.
    b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.

Not quite. Consider an assignment

    A a;
    B b;
    a = b;

If I may be so bold as to rewrite this in C++, for clarity, that would
look like:

    A* pa;
    B* pb;
    pa = pb;

That's the situation currently. Assignment of classes is very fast.
/But/, this is what it would change to under your scheme:

    A* pa;
    B* pb;
    if (pa->refCount)
    {
        if (pb->refCount)
        {
            atomic { if (--pb->refCount == 0) delete pb; }
            atomic { ++pa->refCount; }
            pa = pb;
        }
        else
        {
            throw new RefCountException();
        }
    }
    else
    {
        if (pb->refCount)
        {
             throw new RefCountException();
        }
        else
        {
            pa = pb;
        }
    }

And that's just for /ordinary/ assignment. That looks like a
phenomenal overhead to me. Now just /imagine/ how complicated it gets
if you've overloaded opAssign in various complicated ways. (e.g.
structs assigned from classes, classes assigned from structs, etc.)

I think I'd rather not have that overhead added to every single class
assignment.

Feb 05 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-05 04:09:31 -0500, "Janice Caron" <caron800 googlemail.com> said:

 On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:
 
 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
 a. if not scope, go to 4.
 b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.

 
 Not quite.

Well, the algorithm above checks for the presence of a scope flag on 
the object to only reference-count objects which are flagged scope at 
runtime. And I didn't bother to elaborate the code in case 3.b., doing the 
actual reference counting, because it's not that important.

But now I realise I forgot to check for the scope flag on the second 
object and failed to check for null pointers. Basically, what I should 
have written is this (now in D with a special atomic statement):

	A a;
	B b;
	if (a && a.isScope)
	{
		debug if (a.refCount == 0)
			throw new RefCountException();
		atomic { --a.refCount; }
		if (a.refCount == 0)
			delete a;
	}
	if (b && b.isScope)
	{
		atomic { ++b.refCount; }
	}
	a = b;

As you can see, there are four conditions that *must* be evaluated 
whether or not the object is scope at runtime (a, a.isScope, b, 
b.isScope), two of them requiring dereferencing the object and loading 
some of its memory (for each object's scope flag). This is the real 
drawback in Edward's proposal as the rest of the code wouldn't be 
executed if the isScope flag is set to false. Since branching is often 
an expensive operation on processors because of the instruction 
pipeline, doing the assignment with a runtime isScope flag would be 
much slower.
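
As a rough illustration of the checks described above, here is a minimal C++ sketch (not D, and not what any compiler actually emits). The names `Obj`, `isScope`, `refCount` and `destroyed` are hypothetical, the counts are plain ints so it is single-threaded only, and the count is raised before the old one is dropped so that self-assignment stays safe.

```cpp
#include <cassert>

// Hypothetical object header: every object carries a runtime flag and a count.
struct Obj {
    bool isScope = false;  // assumed per-object runtime flag
    int refCount = 0;      // only meaningful when isScope is true
};

int destroyed = 0;  // counts destructions, for illustration only

// Drops one reference; both tests run even for ordinary, uncounted objects.
void release(Obj* p) {
    if (p && p->isScope) {
        assert(p->refCount > 0);
        if (--p->refCount == 0) {
            ++destroyed;
            delete p;
        }
    }
}

// Adds one reference; again the null test and the flag test always run.
void retain(Obj* p) {
    if (p && p->isScope) ++p->refCount;
}

// What "a = b" would have to expand to when scope-ness is a runtime flag.
void assign(Obj*& a, Obj* b) {
    retain(b);   // retain first so that assigning an object to itself is safe
    release(a);
    a = b;       // without the flag, this line is the whole assignment
}
```

Even when neither object is scope, `assign` still performs two null tests and two flag loads before the pointer copy, which is the cost being discussed.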

Janice, your code demonstrates how to do reference counting, but I don't 
see a scope flag anywhere.


 That's the situation currently. Assignment of classes is very fast.
 /But/, this is what it would change to under your scheme:
 
     A* pa;
     B* pb;
     if (pa->refCount)
     {
         if (pb->refCount)
         {

Shouldn't you check for null too? That'd be:

	if (pa && pa->refCount)
	{
		if (pb && pb->refCount)
		{

             atomic { if (--pb->refCount == 0) delete pb; }
             atomic { ++pa->refCount; }
             pa = pb;
         }
         else
         {
             throw new RefCountException();
         }
     }
     else
     {
         if (pb->refCount)
         {

And again:

		if (pb && pb->refCount)
		{

              throw new RefCountException();
         }
         else
         {
             pa = pb;
         }
     }
 
 And that's just for /ordinary/ assignment. That looks like a
 phenomenal overhead to me.
 
 Now just /imagine/ how complicated it gets
 if you've overloaded opAssign in various complicated ways. (e.g.
 structs assigned from classes, classes assigned from structs, etc.)
 
 I think I'd rather not have that overhead added to every single class
 assignment.

Well, that's the idea of a reference-counted object: you add this 
overhead to make sure the object is destroyed as soon as it can. It can 
be useful in many cases, when a class holds a scarce resource for 
instance. But you're completely right to not want this overhead for 
regular objects, and that's why an object being reference-counted must 
be a compile-time property, not decided by a flag evaluated at runtime.
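
To make "compile-time property" concrete, here is a small C++ sketch under assumptions of my own: `Plain`, `Resource`, `Counted` and `liveResources` are hypothetical names, and the counter is not thread-safe. Whether a handle is counted is decided by its static type, so copying a plain pointer stays a bare pointer copy.

```cpp
#include <cassert>
#include <type_traits>

int liveResources = 0;  // number of live Resource objects, for illustration

struct Plain {};        // ordinary object: handles are raw pointers

struct Resource {       // hypothetical class holding a scarce resource
    Resource() { ++liveResources; }
    ~Resource() { --liveResources; }
};

// Counted handle: the counting code exists only where this type is used.
template <typename T>
class Counted {
    struct Box { T value; int count = 1; };
    Box* box_;
    void drop() {
        assert(box_->count > 0);
        if (--box_->count == 0) delete box_;
    }
public:
    Counted() : box_(new Box) {}
    Counted(const Counted& o) : box_(o.box_) { ++box_->count; }
    Counted& operator=(const Counted& o) {
        ++o.box_->count;  // retain before release, so self-assignment is safe
        drop();
        box_ = o.box_;
        return *this;
    }
    ~Counted() { drop(); }
    int count() const { return box_->count; }
};

// The compiler knows statically which copies are cheap and which are counted.
static_assert(std::is_trivially_copyable<Plain*>::value,
              "raw pointer copy is a plain bit copy");
static_assert(!std::is_trivially_copyable<Counted<Resource>>::value,
              "counted handle copy runs counting code");
```

No runtime flag is tested anywhere: code that only handles `Plain*` contains no counting logic at all.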

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 05 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:

 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
    a. if not scope, go to 4.
    b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.

 
 Not quite. Consider an assignment
 
     A a;
     B b;
     a = b;
 
 If I may be so bold as to rewrite this in C++, for clarity, that would
 look like:
 
     A* pa;
     B* pb;
     pa = pb;
 
 That's the situation currently. Assignment of classes is very fast.
 /But/, this is what it would change to under your scheme:
 
     A* pa;
     B* pb;
     if (pa->refCount)
     {
         if (pb->refCount)
         {
             atomic { if (--pb->refCount == 0) delete pb; }
             atomic { ++pa->refCount; }
             pa = pb;
         }
         else
         {
             throw new RefCountException();
         }
     }
     else
     {
         if (pb->refCount)
         {
              throw new RefCountException();
         }
         else
         {
             pa = pb;
         }
     }

pb being a 'scope' object does not matter and its reference count does 
not get adjusted downward just because its object reference is assigned 
to another object.

A reference counted 'scope' object means there is a single reference 
count ( probably somewhere out in GC memory ) for all references to a 
particular object. When a new reference to that object is created, 
either through assignment, as above, or passing the object reference, 
the single reference count is incremented. So you can throw out a good 
deal of your imagined code above.
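
The single shared count described here behaves much like C++'s `std::shared_ptr`, which is only an analogue and not the proposed D mechanism. A minimal sketch, with `File`, `countAfterTwoCopies` and `seenByCallee` as hypothetical names:

```cpp
#include <cassert>
#include <memory>

struct File {            // hypothetical resource-owning class
    static int open;     // number of live File objects
    File() { ++open; }
    ~File() { --open; }
};
int File::open = 0;

// Each new reference made by copying bumps the one shared count.
long countAfterTwoCopies() {
    std::shared_ptr<File> a = std::make_shared<File>();
    std::shared_ptr<File> b = a;  // second reference to the same object
    assert(b.use_count() == 2);
    std::shared_ptr<File> c = b;  // third reference
    return a.use_count();
}

// Passing the reference by value also creates a new reference.
long seenByCallee(std::shared_ptr<File> p) {
    return p.use_count();
}
```

When the last of `a`, `b` and `c` goes out of scope the count reaches zero and the `File` destructor runs immediately, which is the deterministic behaviour being asked for.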

 
 And that's just for /ordinary/ assignment. That looks like a
 phenomenal overhead to me. Now just /imagine/ how complicated it gets
 if you've overloaded opAssign in various complicated ways. (e.g.
 structs assigned from classes, classes assigned from structs, etc.)
 
 I think I'd rather not have that overhead added to every single class
 assignment.

You would no doubt claim that even if your code above were correct and 
much simpler. As soon as people are against an idea they find the 
necessary reasons to denigrate it based on such spurious thought. In 
computer programming the favorite claim for such thought is always the 
logic and supposed overhead of implementing anything.

Feb 05 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 06/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 pb being a 'scope' object does not matter and its reference count does
 not get adjusted downward just because its object reference is assigned
 to another object.

You're right. That was a mistake. I meant

             atomic { if (--pa->refCount == 0) delete pa; }
             atomic { ++pb->refCount; }

The reference count of the value previously held by a /does/ get
adjusted downwards. I wrote the last post without paying attention to
the details, but the complexity doesn't go away when you write it
properly. Honest.


 So you can throw out a good
 deal of your imagined code above.

and replace it with correct code which is just as complicated, yes.


 You would no doubt claim that even if your code above were correct and
 much simpler.

*NEVER* tell me what you think I would or would not claim. Only I get
to speak for me. If anyone else does it, I start calling strawman.
Quote me verbatim by all means, but /do not/ put words into my mouth.


 As soon as people are against an idea they find the
 necessary reasons to denigrate it based on such spurious thought.

Are you implying that I, personally, am guilty of "finding reasons to
denigrate" your idea because of "spurious thought"? I ask because you
use the generic word "people", but the context of your sentence could
be taken to imply that you are talking about me, personally. Please
clarify. I don't take well to personal attacks.


 In
 computer programming the favorite claim for such thought is always the
 logic and supposed overhead of implementing anything.

I speak from experience. I have implemented reference counting in a
multithreaded environment. I have earned the right to discuss
implementation details, and to explain what the /actual/ (not
supposed) overhead really is.

Feb 05 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 06/02/2008, Janice Caron <caron800 googlemail.com> wrote:
 You're right. That was a mistake. I meant

             atomic { if (--pa->refCount == 0) delete pa; }
             atomic { ++pb->refCount; }


And just to be really clear, real code would need extra checks, above
and beyond the simplified version I wrote above. In fact, it would be
more like

        if (pa != null && pa->refCount != 0)
        {
            if (pb != null && pb->refCount != 0)
            {
                atomic { ++pb->refCount; }
                atomic { if (--pa->refCount == 0) delete pa; }

etc. It's important to do the increment before the decrement because
otherwise you run the risk of crashing if (pa == pb). And if you store
refCount in too small a variable, you also would need to be concerned
about ++pb->refCount wrapping. Microsoft VC++'s implementation of
std::string, for example, contains a one-byte reference count, and
they have some complicated code in there which interprets 0xFF
differently so that things don't fall over.
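
The ordering point can be shown with a small C++ sketch. `Node`, `release`, `assignSafe` and `deleted` are hypothetical names, and the counter is a plain int, so this ignores the atomicity a multithreaded version would need.

```cpp
#include <cassert>

struct Node { int refCount = 1; };  // a new Node starts with one reference

int deleted = 0;  // counts deletions, for illustration only

// Drops one reference and frees the node when none remain.
void release(Node* p) {
    if (p) {
        assert(p->refCount > 0);
        if (--p->refCount == 0) {
            ++deleted;
            delete p;
        }
    }
}

// Safe order: count the new reference before dropping the old one.
void assignSafe(Node*& a, Node* b) {
    if (b) ++b->refCount;
    release(a);
    a = b;
}
```

With the opposite order, `a = a` on an object whose count is 1 would drop the count to 0, delete the object, and then increment a count in freed memory.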

But that's using your suggested convention that (refCount == 0)
implies "not reference counted". In my real life implementation,
ref-countedness was a compile-time property, not a runtime property,
so I didn't have to do all of those tests. I think compile-time
ref-countedness is a very good idea, and has many fantastic
advantages. I just think it's a very bad idea to make it a runtime
property.

Feb 05 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 05/02/2008, Janice Caron <caron800 googlemail.com> wrote:
 /But/, this is what it would change to under your scheme:
 <snip>

...and that code had a bug in it! (Who else spotted it?) It can fail
with the assignment

    a = a;

It can also fail if either a or b is null.

See, this is harder than you think.

Feb 05 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Michel Fortin wrote:
 On 2008-02-04 22:43:21 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I 
 program in Python also. I agree there is no fixed dividing line. 
 But the difference between static typing and dynamic typing is 
 well defined in a statically typed language like D. My argument 
 was that for 'scope' to be really effective it needs to consider 
 the dynamic type at run-time and not just the static type as it 
 exists at compile time.

 Considering the dynamic type at runtime means you need to check if 
 you're dealing with a reference-counted object each time you copy a 
 reference to that object to see if the reference count needs 
 adjusting. This is significant overhead over the "just copy the 
 pointer" thing you can do in a GC. Basically, just checking this 
 will increase by two or three times the time it takes to copy an 
 object reference... I can see why Walter doesn't want that.

 I am not knowledgeable about the actual low-level difference between 
 the compiler statically checking the type of an object or 
 dynamically checking the type of an object, and the run-time costs 
 involved.

 Yet clearly D already has to implement code when scopes come to an 
 end in order to destroy stack-based objects, since structs ( 
 user-defined value types ) are already supported and can have 
 destructors.

 Yes, and this is implemented in a simple and naive way: by adding an 
 explicit call to the destructor at the end of the scope. The scope 
 object cannot exist outside the scope, and thus no reference counting 
 is needed in the way it's implemented currently.

 The reference counting would be implemented for a 'scope' object 
 only. The main overhead at the end of each scope is going through all 
 the objects to determine which is a 'scope' object. Perhaps this is 
 too expensive, but it would at least be interesting to see if it is or 
 not.

 So the added overhead goes from having to identify structs which 
 must have their destructor called at the end of each scope to having 
 to also identify 'scope' objects which must have their reference 
 count decremented at the end of each scope and have their destructor 
 called if the reference count reaches 0.

 Well, identifying structs can be done at compile time since you know 
 exactly the type of the struct at that time. Classes are polymorphic, 
 so it'd be a costly runtime check to know that, and that check is 
 almost as costly as doing the reference counting itself. Given that, 
 you should probably not bother at runtime and decide at compile time 
 to just treat any class which has the potential to be a scope class 
 as if it were one and actually do the reference counting.

 Your point is well taken, but I still would like to see if the check 
 for a 'scope' object would be that expensive. It could be as easy as 
 checking an extra 'int' for reference counting for each object and 
 seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).

 
 Basically, you need to:
 
 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
   a. if not scope, go to 4.
   b. if scope, do whatever is needed to increment the reference count 
 atomically, then go to 4
 4. Write the pointer to its new location.
 
 That's a lot of extra work you'd have to do at every copy of an object's 
 pointer to perform that check. That branch operation could become very 
 expensive if the processor can't predict it right, and loading from an 
 additional, possibly far away, memory block could mean missing the 
 memory cache more often too.
 
 1 and 4 is all you need if you don't care about scope.

I love it when people such as you carry on about all the work that must 
be done to implement X. Implementing any new feature in any language 
takes work. There are NO free rides. But that never means that the new 
feature should not be done. Who cares if some program is slowed down by 
some number of microseconds each time if the feature makes a better and 
much easier programming paradigm work which otherwise could only be 
handled in a clumsy and inefficient manner.

 
 
 The compiler isn't knowledgeable of what happens within every 
 function call. So it can only check at runtime if the function 
 returned a C or a D.

 Fully agreed.

 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:

 scope class C { ... }

 scope C c = new C(...);

 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type 
 of the object.

 You're perfectly right: it is redundant in *this* case, and you could 
 have the compiler implicitly understand that C is a scope class in 
 *this* case. But consider this example:

     Object o;
     if (/* random value */)
         o = new C; // C is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope

 Now, should o be automatically reference-counted because you *could* 
 later create a C object and assign it to o, or should line 3 give an 
 error since the type Object isn't scope and C must only be assigned 
 as scope? I'd say it should be an error.

 I say it should be a 'scope' object. The dynamic type of o is that of 
 a 'scope' class.

 
 Hum, dynamic scope typing again? If you had that it'd work, sure, but 
 since we surely won't have that this isn't an option.

A brilliant conclusion. You decide that "we surely won't have that" so 
it will not work. Another candidate for a course in predicate logic 101 
shows up.

 
 
 This however could be made legal without too much difficulty:

     scope Object o;
     if (/* random value */)
         o = new C; // C is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope

 Basically, you're declaring a scope Object. While Object isn't 
 necessarily a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put 
 in this variable. If o wasn't scope, it'd be an error to put an 
 instance of a scope class in it.

 But then the end-user is required to know that the C is a scope class. 
 I do not think that should be necessary.

 
 Perhaps not, I don't have a strong opinion on that. But I firmly believe 
 scope should be enforced statically, not dynamically, and that's what 
 I'm arguing for.

I understand your argument based on the simplicity of the solution, and 
the relative speed of the code compared to the alternative of 
determining 'scope' at run-time. I respect your argument but I think 
that it is an incomplete solution from the end-user's perspective 
because he must be aware of the 'scope'-ness of the objects he uses and 
notate the objects accordingly. I think this is an imposition although I 
could live with it. But I would like to see the dynamic solution at 
least attempted.

 
 The whole point of 'scope' ( RAII ) in GC is that, for the most part, 
 an end-user should instantiate and use 'scope' classes just as he 
 would normal GC classes, with the language taking care to 
 automatically destruct an object of a 'scope' class just as soon as 
 the last reference to that object goes out of scope.

 
 Well, perhaps there's a solution that would do what you want while still 
 keeping it compile-time only. It's some sort of compromise. Take these 
 three classes:
 
     class A {}
     scope class B : A {}
     scope class C : B {}
 
 B and C are scope, A isn't. Now, what if writing "B" was equivalent to 
 writing "scope B" (since B is scope) and "C" was equivalent to writing 
 "scope C". Obviously, writing "A" wouldn't be equivalent to "scope A" 
 (because A is not scope). Then you could have:
 
     A a1 = new A;
     A a2 = new B; // illegal: B is scope, cannot be assigned to non-scope A
     scope A a3 = new B; // legal: B is scope and scope A is (explicitly) scope
 
     B b1 = new B;
     B b2 = new C; // legal: C is scope and B is (implicitly) scope
     scope B b3 = new C; // same as above
 
 That would mean that you'd only have to explicitly write scope if you're 
 using the non-scope base class as a type to hold a reference to your 
 scope object.

Yes, I understand your example completely.

 
 
 But there are still many holes in this scheme in which scope now 
 means reference-counted. Take this example:

     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }

     A[] globalReferences;

     // Scope could be made implicit here, but it's irrelevant to my example
     scope B b = new B;
     b.doSomething();

 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain 
 the object. There are two ways around that: ignore the problem and 
 let the programmer handle these cases (basically, that is what 
 boost::shared_ptr would do in such a situation), or introduce a new 
 keyword to decorate parameters for functions that do not keep any 
 reference beyond their own call so that you don't need to duplicate 
 all your functions for a scope and non-scope parameter (much like 
 const is the middle ground between mutable and invariant).

 No, A.doSomething would put a 'scoped' reference in a non-scope array. 
 However if we specify 'scope A[] globalReferences;' we can solve that 
 problem.

 
 Sure, you're solving the problem nicely. But how does the compiler find 
 out there's a problem in the first place? It needs to know that the this 
 parameter is scope, and thus the member function should be decorated 
 scope (just like you'd do with invariant). So you'd need to duplicate 
 every member function so that it can be used either as scope or 
 non-scope, and that's not very interesting unless you can declare that 
 the function does not need to know if the parameter is typed scope or 
 not (just like const means you don't know if it's invariant or mutable).

I am lost about what you are saying above. Member functions have nothing 
to do with 'scope'.

Feb 05 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-05 23:45:42 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 
 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }
 
     A[] globalReferences;
 
     // Scope could be made implicit here, but it's irrelevant to my example
     scope B b = new B;
     b.doSomething();



 
 I am lost about what you are saying above. Member functions have 
 nothing to do with 'scope'.

The thing is that if we have a static reference-counting type modifier 
(the scope keyword in this case), the compiler has to emit code to 
increment and decrement the reference count each time we add or remove 
a reference to a scope object. To do that, it has to know when 
compiling a function whether or not the object's type is scope.

In the above example, class A has a doSomething function which adds a 
reference to itself to some global variable. Since 'this' is of type A 
(not scope A) in the doSomething function, no code is added to 
reference-count the object when assigning it to the global variable. 
Hence, if you could call doSomething on a scope A, and then remove all 
other references to A, A's reference count would become zero and A 
would be deleted despite it being still referenced.

(Having a B class derived from A just makes the thing harder to spot. 
It's basically the same thing as having a scope A object though.)
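
The failure mode can be reproduced with a C++ analogue, offered only as a sketch: `std::shared_ptr` stands in for a counted scope reference, the raw `this` pointer stands in for the uncounted reference, and `alive` and `leavesDanglingReference` are hypothetical helpers. The dangling pointer is inspected by count only and never dereferenced.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct A;
std::vector<A*> globalReferences;  // uncounted references, as in the example

struct A {
    static int alive;  // number of live A (or derived) objects
    A() { ++alive; }
    virtual ~A() { --alive; }
    // 'this' is an uncounted reference here, so no count is taken.
    void doSomething() { globalReferences.push_back(this); }
};
int A::alive = 0;

struct B : A {};

// Returns true if the global list outlived the object it refers to.
bool leavesDanglingReference() {
    globalReferences.clear();
    {
        std::shared_ptr<B> b = std::make_shared<B>();
        assert(A::alive == 1);
        b->doSomething();
    }  // last counted reference is gone, so the object is destroyed here
    return A::alive == 0 && globalReferences.size() == 1;
}
```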

The obvious solution is this:

    class A {
        void doSomething() {
            globalReferences ~= this;
        }
        scope void doSomething() { // scope is an attribute of the function here
            globalReferences ~= this;
        }
    }

where one doSomething has no code to maintain the reference counter and 
the other version has. The first version would be called when the 
compiler has a non-scope A while the second would be called for a scope 
A.

That essentially means that you couldn't call a scope function on a 
non-scope object and vice-versa. Each function therefore needs to be 
duplicated, with a scope and a non-scope variant, just in case it puts 
a reference to the object somewhere that'll still exist after the 
function call.

There is an obvious solution to that problem though: as the D source 
code for the two member functions is the same, the compiler could just 
compile the two variants from the same source. Unfortunately, the 
compiler would *always* have to generate two symbols for each member 
function, even if no reference is put elsewhere so the scope function 
can be reached when calling from elsewhere. (Remember that when doing a 
function call the compiler doesn't know what happens inside the 
function, and it can't just guess the scope function doesn't exist.)

    class A {
        // automatically generates the two functions from the example above
        void doSomething() {
            globalReferences ~= this;
        }
    }

If we had a new keyword to tell in the function signature that we won't 
keep the reference somewhere else, that it'll be completely 
forgotten after the call, then we could avoid generating two functions 
needlessly for most member functions. Let's call this keyword "amnesic":

    class A {
        amnesic void doSomething() {
            // illegal, amnesic reference to this put outside function scope
            globalReferences ~= this;
        }
        amnesic void doSomethingElse() {
            // only one generated function
        }
    }

It's basically the same pattern as for invariant and non-invariant 
methods (you can't call an invariant method on a mutable object; you 
can't call a mutable method on an invariant object; both work with 
const). Here, you have regular methods, scope methods, and amnesic 
methods can work with both scope and non-scope objects.
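
A loose C++ analogue of the "amnesic" idea is a function that takes a plain reference and, by convention, keeps nothing after it returns. This is only a sketch: C++ cannot enforce the promise the way the hypothetical keyword would, and `Widget`, `peek` and `sumBoth` are made-up names.

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 0; };

// Borrowing function: by convention it keeps no reference after returning.
// C++ does not check this; the hypothetical "amnesic" keyword would.
int peek(const Widget& w) { return w.value; }

// One function body serves both an uncounted and a counted object.
int sumBoth() {
    Widget plain;
    plain.value = 1;
    std::shared_ptr<Widget> counted = std::make_shared<Widget>();
    counted->value = 2;
    int sum = peek(plain) + peek(*counted);
    assert(counted.use_count() == 1);  // the call took no counted reference
    return sum;
}
```

Because `peek` never touches a reference count, only one version of it needs to be compiled.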

I'm going a little off-topic now, but a new keyword such as this could 
be useful for creating invariant objects too.

Basically, while an amnesic function guarantees there are no more 
references to the object after the function call than there were 
before, an amnesic constructor could guarantee uniqueness of the 
reference after the creation. This means the created object could 
become invariant if that was the caller's intent:

    class A {
        amnesic this() {
            // legal: no reference given to the outside world
        }
        amnesic this() {
            // illegal, amnesic reference to this put outside constructor's scope
            globalReferences ~= this;
        }
    }

	A a1 = new A;
	invariant A a2 = new A;

(Others have talked about "unique" as a keyword, but "unique" isn't 
very useful because it describes the state of a reference at a certain 
point in time, not a property of a variable or a type.)

You could also have amnesic parameters to functions that would guarantee 
the function doesn't keep a reference after the call.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 07 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-02-05 23:45:42 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 I love it when people such as you carry on about all the work that must 
 be done to implement X. Implementing any new feature in any language 
 takes work. There are NO free rides. But that never means that the new 
 feature should not be done.

If by "work that must be done" you mean find a solution with no 
overhead, then good luck. I'm telling you it'll be difficult to 
implement, I'm telling you it's going to remove one of the biggest 
advantages of using a garbage collector by producing more code to deal 
with reference counts and object allocation/deallocation, and this code 
will consume time.

In other words, I'm arguing about the (undesirable) end result, not the 
work it'd take to get there.


 Who cares if some program is slowed down by some number of 
 microseconds each time if the feature makes a better and much easier 
 programming paradigm work which otherwise could only be handled in a 
 clumsy and inefficient manner.

Well, you have a valid point that often -- though not always -- a 
feature that, at the sacrifice of some runtime performance, helps 
programmers save time is a good thing. I'll have to admit I'm not too 
convinced it'd be so helpful there, but that's not really what is 
concerning me.

One of D's goals is performance. The thing with performance in a 
program is that you don't need it, except at a few critical places 
where it is of the utmost importance. By forcing the 
reference-counting code everywhere, it's going to end up at many places 
where it's not needed, and some of these places will be those 
time-critical parts.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 07 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 In your example justifying treating 'scope' as C++ treats 'const' the 
 'const' is attached to the object upon instantiation of it.

Pedantically, it is not attached to the object. It is attached to the 
*type* of the object. The bits in the object do not change, and there is 
no way at runtime to examine the bits of the object to determine if it 
is const or not. The "const-ness" is purely a compile time attribute, 
i.e. it's part of the static type.
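
The same holds for C++ `const`, and a short sketch makes it checkable. `Point` and `sameBits` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Point { int x; int y; };

// const-ness is visible to the compiler through the type...
static_assert(std::is_const<const Point>::value, "const is part of the type");
static_assert(!std::is_const<Point>::value, "plain Point is not const");
// ...but it adds no storage to the object.
static_assert(sizeof(const Point) == sizeof(Point), "const adds no bits");

// Compares the raw bytes of a non-const and a const object.
bool sameBits() {
    Point a{1, 2};
    const Point b{1, 2};
    assert(sizeof a == sizeof b);
    return std::memcmp(&a, &b, sizeof(Point)) == 0;
}
```

Nothing in the bytes of `b` records that it is const, so no runtime test could recover that fact.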


 My point is 
 that 'scope' is attached to the type by the class designer. To me these 
 are conceptually two different things. That is why I said that your 
 example is a poor analogy. I should have made that clearer by my argument.

Whether the 'scope' is attached to the class definition or the variable 
definition is a separate and orthogonal issue. As I understand it, our 
difference is if an object can, at run time, be distinguished as being 
scope or not, and should this be tested at runtime at each place where 
assignment, copy construction, and scope exit happens.

I.e., should 'scope' be a part of the type of the object, or a dynamic 
part of the runtime representation of an object? Both are technically 
implementable.


 I was simply arguing 
 against your assertion, using 'const' as an example, of not allowing a 
 'scope' object to be assigned to an object that is not 'scope'.

Ok.


 Clearly 
 in the polymorphic world of D, where a base class may not be 'scope' 
 while a derived class may be 'scope', such a treatment can not be right, 
 since the basis of polymorphism is to assign to a base class object a 
 derived class reference.

Right. This is a reason why 'scope' for classes may need to be 
eventually deprecated.


 But I do not see why you think that every object must therefore have 
 scope semantics.

It will be required as any user could declare an object instance as 
'scope', and so any separately compiled code must anticipate that.


 You have already won that argument as I fully agree to what you say 
 above. But I have no idea why you say that it is the crux of our 
 disagreement. Care to elaborate ?

It's just that if any object could be scoped based on a runtime test, 
then you've got to insert that test at every assignment, copy 
construction, and scope exit. You've got all the overhead of RC.

 No, I do not think that the scopeness of an object is best determined by 
 the user and not the class designer. In fact I feel very strongly the 
 opposite. The class designer knows whether his class has RAII or not and 
 in the vast majority of cases the end user should not know or care.

This is a very interesting issue. I've been slowly coming to the 
opposite conclusion that issues of where an object is created (and that 
includes scope) should be the purview of the object user. C++ and D have 
class specific allocators, but that might be a mistake.


 My argument for scope injection is based purely on the practical 
 considerations that there are types which can not possibly know if it is 
 to be used with RAII or not. Template classes/structs which are 
 containers are the most obvious as well as built-in arrays. That is why 
 besides the ability for the class designer to specify 'scope' the end 
 user should be able to do it at object creation time also.

Then you have the problem that all generated code that manipulates any 
object must insert all the rc machinery for that object, just in case 
some user somewhere instantiates it as 'scope'.

Feb 04 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 My point is that 'scope' is attached to the type by the class 
 designer. To me these are conceptually two different things. That is 
 why I said that your example is a poor analogy. I should have made 
 that clearer by my argument.

 
 Whether the 'scope' is attached to the class definition or the variable 
 definition is a separate and orthogonal issue. As I understand it, our 
 difference is if an object can, at run time, be distinguished as being 
 scope or not, and should this be tested at runtime at each place where 
 assignment, copy construction, and scope exit happens.

Yes, this is important in support of 'scope' at run-time. For each 
object you would need to determine if it is 'scope' in the cases cited 
above and take the appropriate action if it is. The easiest way, 
although perhaps the slowest, is to find out if the dynamic type is a 
'scope' type at the junctures you mentioned. How you do that at the 
compiler level you best would know. One possibility that comes to mind, 
but perhaps erroneous because too simple, is to add an int reference 
count to every object and then set it to 1 whenever a 'scope' object ( 
an object specified as 'scope' or having a 'scope class' type ) is 
created or when a reference to a 'scope' object is legally assigned to 
any object. This goes along with my feeling that you will need to keep 
track of 'scope' objects by attaching your mechanism to the object at 
run-time and not rely merely on something in the 'scope' class type, and 
that it is the dynamic type of the object and not the static type which 
matters.

 
 I.e., should 'scope' be a part of the type of the object, or a dynamic 
 part of the runtime representation of an object? Both are technically 
 implementable.

My vote is the second. See above.

 
 
 I was simply arguing against your assertion, using 'const' as an 
 example, of not allowing a 'scope' object to be assigned to an object 
 that is not 'scope'.

 
 Ok.
 
 
 Clearly in the polymorphic world of D, where a base class may not be 
 'scope' while a derived class may be 'scope', such a treatment can not 
 be right, since the basis of polymorphism is to assign to a base class 
 object a derived class reference.

 
 Right. This is a reason why 'scope' for classes may need to be 
 eventually deprecated.

I think this is very wrong.

 
 
 But I do not see why you think that every object must therefore have 
 scope semantics.

 
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

I agree in the sense that every object may need to carry an extra 
reference count with it even though it will not be used for the vast 
majority of objects, which will be GC. I do not view this as an issue.

 
 
 You have already won that argument as I fully agree to what you say 
 above. But I have no idea why you say that it is the crux of our 
 disagreement. Care to elaborate ?

 
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.

Yes, agreed. There will be overhead to deal with 'scope' objects. 
However you already have some overhead dealing with stack variables, and 
so has C++ for its existence at the end of each scope and it sure does 
not make C++ slower than most GC systems.

 
 No, I do not think that the scopeness of an object is best determined 
 by the user and not the class designer. In fact I feel very strongly 
 the opposite. The class designer knows whether his class has RAII or 
 not and in the vast majority of cases the end user should not know or 
 care.

 
 This is a very interesting issue. I've been slowly coming to the 
 opposite conclusion that issues of where an object is created (and that 
 includes scope) should be the purview of the object user. C++ and D have 
 class specific allocators, but that might be a mistake.

I can not say too strongly that if RAII, via 'scope', is to work in D or 
any other GC language, the end-user should be as oblivious as possible 
to it working automatically. This means that class designer, who surely 
must know whether objects of their class need RAII, tells the compiler 
that his type is 'scope' and the end-user proceeds to use objects of 
that type just as if he would use normal GC objects.

Otherwise you are creating a bifurcated system which does the end-user 
no good. Not only must the end user know something in advance about the 
inner workings of a class ( that it needs RAII ) when the class designer 
already knows it, but he must also use a separate notation to deal with 
objects of that class.

 
 
 My argument for scope injection is based purely on the practical 
 considerations that there are types which can not possibly know if it 
 is to be used with RAII or not. Template classes/structs which are 
 containers are the most obvious as well as built-in arrays. That is 
 why besides the ability for the class designer to specify 'scope' the 
 end user should be able to do it at object creation time also.

 
 Then you have the problem that all generated code that manipulates any 
 object must insert all the rc machinery for that object, just in case 
 some user somewhere instantiates it as 'scope'.

It needs to have inserted for it the mechanism which determines whether 
that object is a 'scope' object or not. It probably needs the extra int 
for possible reference counting. Other than that I do not see what other 
machinery is needed for normal GC objects.

If we are really still in the age, with vtables and alignment padding 
and god knows what else a compiler writer needs per object to correctly 
do his work, where another 4 bytes of int is considered prohibitory, 
then I give up the whole idea <g>.

Feb 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

 I agree in the sense that every object may need to carry an extra 
 reference count with it even though it will not be used for the vast 
 majority of objects, which will be GC. I do not view this as an issue.

It's a very serious issue, as it essentially negates much of the 
advantage of general gc. For one example, you'll have to give up 
interior pointers.

 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.

 Yes, agreed. There will be overhead to deal with 'scope' objects. 

It will be needed for *every* gc object, too. And not just the 
allocation for the reference count, the test has to be executed every time.

 However you already have some overhead dealing with stack variables, and 
 so has C++ for its existence at the end of each scope and it sure does 
 not make C++ slower than most GC systems.

If reference counting worked that well, there would be no push to add gc 
to C++0x.


 I can not say too strongly that if RAII, via 'scope', is to work in D or 
 any other GC language, the end-user should be as oblivious as possible 
 to it working automatically. This means that class designer, who surely 
 must know whether objects of their class need RAII, tells the compiler 
 that his type is 'scope' and the end-user proceeds to use objects of 
 that type just as if he would use normal GC objects.
 
 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about the 
 inner workings of a class ( that it needs RAII ) when the class designer 
 already knows it, but he must also use a separate notation to deal with 
 objects of that class.

For those cases, all the class designer needs to do is present to the 
user the struct wrapper for the class, not the class itself.


 Then you have the problem that all generated code that manipulates any 
 object must insert all the rc machinery for that object, just in case 
 some user somewhere instantiates it as 'scope'.

 
 It needs to have inserted for it the mechanism which determines whether 
 that object is a 'scope' object or not. It probably needs the extra int 
 for possible reference counting. Other than that I do not see what other 
 machinery is needed for normal GC objects.

Consider:

void foo(C c) { C d = c; }

foo() has no idea if c is ref counted or gc. Therefore, it has to check 
every time, at run time. All the machinery has to be there, just in case.
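
As a rough illustration of that cost, here is a minimal sketch of what the compiler might have to emit for foo() under the runtime-test scheme. The rc field, the destroyed flag, and the retain/release helpers are assumptions made up for this example (rc == 0 is taken to mean "ordinary GC object"); none of this is actual D compiler or runtime behaviour.

```d
// Hypothetical object layout: rc == 0 marks an ordinary GC object,
// rc > 0 marks a 'scope' (reference-counted) object.
class C
{
    uint rc;          // assumed per-object count field
    bool destroyed;   // stands in for running the destructor
}

// What would have to be emitted for every `C d = c;`
void retain(C c)
{
    if (c !is null && c.rc != 0)   // the test runs on every copy
        ++c.rc;
}

// What would have to be emitted when a reference leaves scope
void release(C c)
{
    if (c !is null && c.rc != 0 && --c.rc == 0)
        c.destroyed = true;
}

void foo(C c)
{
    C d = c;
    retain(d);                 // cannot be skipped: c might be counted
    scope (exit) release(d);   // likewise at scope exit
    // ... body of foo ...
}
```

Even when c is a plain GC object, both branches are still evaluated on every call, and that is the overhead being objected to.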

 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to correctly 
 do his work, where another 4 bytes of int is considered prohibitory, 
 then I give up the whole idea <g>.

It's not just another 4 bytes.

Feb 05 2008

Christopher Wright <dhasenan gmail.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

 I agree in the sense that every object may need to carry an extra 
 reference count with it even though it will not be used for the vast 
 majority of objects, which will be GC. I do not view this as an issue.

 
 It's a very serious issue, as it essentially negates much of the 
 advantage of general gc. For one example, you'll have to give up 
 interior pointers.
 
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.

 Yes, agreed. There will be overhead to deal with 'scope' objects. 

 
 It will be needed for *every* gc object, too. And not just the 
 allocation for the reference count, the test has to be executed every time.
 
 However you already have some overhead dealing with stack variables, 
 and so has C++ for its existence at the end of each scope and it sure 
 does not make C++ slower than most GC systems.

 
 If reference counting worked that well, there would be no push to add gc 
 to C++0x.
 
 
 I can not say too strongly that if RAII, via 'scope', is to work in D 
 or any other GC language, the end-user should be as oblivious as 
 possible to it working automatically. This means that class designer, 
 who surely must know whether objects of their class need RAII, tells 
 the compiler that his type is 'scope' and the end-user proceeds to use 
 objects of that type just as if he would use normal GC objects.

 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about 
 the inner workings of a class ( that it needs RAII ) when the class 
 designer already knows it, but he must also use a separate notation to 
 deal with objects of that class.

 
 For those cases, all the class designer needs to do is present to the 
 user the struct wrapper for the class, not the class itself.
 
 
 Then you have the problem that all generated code that manipulates 
 any object must insert all the rc machinery for that object, just in 
 case some user somewhere instantiates it as 'scope'.

 It needs to have inserted for it the mechanism which determines 
 whether that object is a 'scope' object or not. It probably needs the 
 extra int for possible reference counting. Other than that I do not 
 see what other machinery is needed for normal GC objects.

 
 Consider:
 
 void foo(C c) { C d = c; }
 
 foo() has no idea if c is ref counted or gc. Therefore, it has to check 
 every time, at run time. All the machinery has to be there, just in case.
 
 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to 
 correctly do his work, where another 4 bytes of int is considered 
 prohibitory, then I give up the whole idea <g>.

 
 It's not just another 4 bytes.

You'd have to outlaw:
T a;
scope(T) b = a;

You'd also have to outlaw:
scope(T) a;
T b = a;

This would be more obvious with a wrapper struct than a storage class.
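
A small sketch of why a wrapper struct makes this obvious, assuming a hypothetical Scoped!(U) wrapper standing in for scope(T). Because the wrapper is a distinct type with no implicit conversion, both assignments are rejected by the ordinary type rules, with no special case in the language:

```d
class T {}

// Hypothetical wrapper standing in for `scope(T)`.
struct Scoped(U)
{
    private U payload;
    // No `alias payload this` and no constructor taking a U,
    // so nothing converts implicitly in either direction.
}

// T -> Scoped!T is rejected at compile time
static assert(!__traits(compiles, { T a; Scoped!T b = a; }));

// Scoped!T -> T is rejected at compile time
static assert(!__traits(compiles, { Scoped!T a; T b = a; }));
```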

Feb 07 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Walter Bright wrote:
 Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

 I agree in the sense that every object may need to carry an extra 
 reference count with it even though it will not be used for the vast 
 majority of objects, which will be GC. I do not view this as an issue.

 
 It's a very serious issue, as it essentially negates much of the 
 advantage of general gc. For one example, you'll have to give up 
 interior pointers.

I do not follow what having a reference count for an object has to do 
with giving up interior pointers.

 
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.

 Yes, agreed. There will be overhead to deal with 'scope' objects. 

 
 It will be needed for *every* gc object, too. And not just the 
 allocation for the reference count, the test has to be executed every time.

The test for a reference count is executed whenever you need to do 
something if the object is a 'scope' object which you would not do for a 
non-scoped object. Perhaps this is what you mean by "every time". I have 
these testing "times" as assignment/copy a reference and exiting a 
scope. When instantiating an object no "test" need be made since the 
compiler always knows when an object is 'scope' or not when it is 
created ( 'scope sometype someobject' notation or sometype has a 'scope 
class' notation).

 
 However you already have some overhead dealing with stack variables, 
 and so has C++ for its existence at the end of each scope and it sure 
 does not make C++ slower than most GC systems.

 
 If reference counting worked that well, there would be no push to add gc 
 to C++0x.

No one ever said that reference counting solved all memory problems as 
opposed to GC. The most obvious usage for GC which I know, over and 
above reference counting, is cross-referenced objects.

 
 
 I can not say too strongly that if RAII, via 'scope', is to work in D 
 or any other GC language, the end-user should be as oblivious as 
 possible to it working automatically. This means that class designer, 
 who surely must know whether objects of their class need RAII, tells 
 the compiler that his type is 'scope' and the end-user proceeds to use 
 objects of that type just as if he would use normal GC objects.

 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about 
 the inner workings of a class ( that it needs RAII ) when the class 
 designer already knows it, but he must also use a separate notation to 
 deal with objects of that class.

 
 For those cases, all the class designer needs to do is present to the 
 user the struct wrapper for the class, not the class itself.

Sure, but then there becomes a different notation for dealing with 
specific classes, which nullifies the whole point of being able to 
specify an RAII type ( via 'scope class' in D ).

 
 
 Then you have the problem that all generated code that manipulates 
 any object must insert all the rc machinery for that object, just in 
 case some user somewhere instantiates it as 'scope'.

 It needs to have inserted for it the mechanism which determines 
 whether that object is a 'scope' object or not. It probably needs the 
 extra int for possible reference counting. Other than that I do not 
 see what other machinery is needed for normal GC objects.

 
 Consider:
 
 void foo(C c) { C d = c; }
 
 foo() has no idea if c is ref counted or gc. Therefore, it has to check 
 every time, at run time. All the machinery has to be there, just in case.

I agree.

 
 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to 
 correctly do his work, where another 4 bytes of int is considered 
 prohibitory, then I give up the whole idea <g>.

 
 It's not just another 4 bytes.

I meant that memory-wise it is just 4 bytes. Of course it is extra 
programming from the language's point of view.

Let me try to make the case for RAII in D via 'scope' once again, by 
presenting the technical details as I see it, and then you will no doubt 
choose what you think best. If I am really far off please tell me about 
it, otherwise there is little reason for me to try to argue and present 
my idea further as you will do what you think best, and I appreciate 
that you have heard me out.

First, the situations when RAII processing occurs:

1) A 'scope' object is instantiated. The internal reference count, 
however you choose to implement it, is set to 1.

2) A 'scope' object's reference is assigned/copied to another object. If 
the 'scope' object is not a null reference, the reference count is 
incremented.

3) A 'scope' object's reference is changed through assignment. If the 
old reference is not a null reference, the old reference's reference 
count is decremented and if it is 0, the old object is destructed ( its 
destructor is called ) and its memory is released ( the latter may 
happen later through GC for all I know ).

4) A 'scope' object reaches the end of its scope. Processing then 
occurs exactly as it does in 3).
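
The four situations above can be sketched as plain functions. This is only an illustration under assumptions: the Res class, its rc and destroyed fields, and the helper names are invented for the example, and destroyed merely stands in for calling the destructor.

```d
// Hypothetical counted object.
class Res
{
    uint rc;
    bool destroyed;
}

// 1) Instantiation: the count starts at 1.
Res create()
{
    auto r = new Res;
    r.rc = 1;
    return r;
}

// 2) Copying a non-null reference increments the count.
Res copyRef(Res src)
{
    if (src !is null) ++src.rc;
    return src;
}

// 3) and 4) Dropping a reference decrements the count;
// at zero the object is destructed.
void dropRef(Res old)
{
    if (old !is null && --old.rc == 0)
        old.destroyed = true;
}

// 3) Assignment: take the new reference first, then release the old,
// so that self-assignment cannot destroy the object.
void assignRef(ref Res target, Res src)
{
    auto old = target;
    target = copyRef(src);
    dropRef(old);
}
```

In this form every reference is known to be counted, which corresponds to the static-type approach; no test for "is this a scope object" appears anywhere.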

There are two ways of dealing with the identification of a 'scope' object.

The first way is through its static type, where the compiler always 
knows the static type of an object and can generate the correct code in 
each of the 4 instances above for a 'scope' object, and ignore any 
changes to the way that normal non-scope objects are treated. This is 
the easiest way from the compiler's perspective and no doubt the 
fastest. There is no penalty for normal non-scope GC objects and only 
the 'scope' object undergoes special, slower processing. I still have 
hope that if you see fit to go this way that you will allow the user to 
identify a 'scope' object either by the 'scope' keyword applied to the 
instantiated object itself or by the 'scope' keyword applied to the 
class type of the object. I say that because I can not conceive of a 
compiler that could not figure out that an object was 'scope' because 
its class type was 'scope'.

The second way is by examining its dynamic type at run-time and 
generating code to take the appropriate action. This second way is 
harder for the compiler to do and no doubt slower, although how much 
slower is something which could only be pragmatically measured by you 
with D. With this second way, every object must be tested in each of the 
4 cases above to determine if it is a 'scope' object and to take the 
appropriate action if it is. Obviously 4) above is the potential killer 
as far as this goes because it would mean testing every reference at the 
end of each scope, just in case one or more of them is a 'scope' object 
and needs its end of scope processing. In the other three cases one is 
dealing with a single object in a well-defined, if general, situation so 
the overhead would be much less. This second way is obviously much 
better from the end-user's point of view, which does not mean it is 
practically a better solution by any means.

My only practical argument with all those who are certain that this 
second way would be an unnecessary imposition on all the users of normal 
GC objects, and want to regale me with code absolutely "proving" a 
priori their case, is that once an object is determined to be normal GC 
there is nothing further that needs be done for that object which would 
not have been done otherwise. Of course there is overhead for 
determining this in the cases above, especially with 4).

For this second way I have presented the extra reference count field, 
attached internally to all objects, as a way of determining if the 
object is 'scope' when doing 2), 3), or 4), with the proviso that when 
doing 1) the value for all normal GC objects of this field would be set 
to 0. If this is an entirely impractical solution, I am sure that if you 
decide to pursue the possibility of the second way, just to see if it 
can be done and what is the practical penalty in doing it, you will find 
a better scheme.

Feb 07 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 2/3/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 There is no equivalent of 'const' in
 C++ which refers to the type.

Of course there is. For example:

    typedef char const * PCSZ;

Feb 02 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On 2/3/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 There is no equivalent of 'const' in
 C++ which refers to the type.

 
 Of course there is. For example:
 
     typedef char const * PCSZ;

I think a simpler example for you to cite would have been:

typedef char const PCSZ;

But again that does not change the type of 'char'. It simply creates a 
type alias.

In arguing for 'scope' at the type level, as in 'scope class C { ... }' 
I was arguing that 'scope' applies some attribute to the type of C. The 
class designer is making a conscious decision in designing his type in 
order to tell the compiler that all objects of his type need 
deterministic destruction in a system which normally implements 
non-deterministic destruction via GC. That says something about the type 
per se.

In specifying 'const char ch' the end user is making a conscious 
decision in telling the compiler that a specific object must not change 
after it has been initialized to some value. It has nothing to do with 
the type of the object per se.

Feb 03 2008

"Janice Caron" <caron800 googlemail.com> writes:

On 03/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 But again that does not change the type of 'char'.

I hope you're not implying that I said it did.

 In arguing for 'scope' at the type level, as in 'scope class C { ... }'
 I was arguing that 'scope' applies some attribute to the type of C. The
 class designer is making a conscious decision in designing his type in
 order to tell the compiler that all objects of his type need
 deterministic destruction in a system which normally implements
 non-deterministic destruction via GC. That says something about the type
 per se.

Having actually implemented reference counting in C++, I know how it
works. For any type T (doesn't matter if it's a class, a struct, or
just an int), you make two additional types. The first of these just
adds a reference counter:

    class Countable(T)
    {
        uint refCount;
        T val;
    }

and the second is what is exposed to the end user

    struct RefCounted(T)
    {
        Countable!(T) val;
        /* and appropriate ref-counting functions */
        /* and appropriate forwarding functions */
    }

It should be clear that there is no way to cast a RefCounted!(T) to a
T. If you want the following to compile:

    RefCounted!(C) c = whatever;
    C d;
    d = c;

Then the only way to do that would be to have RefCounted!(C) be
implicitly castable to C, presumably by having opImplicitCast() return
c.val.val. This would be disastrous. The moment you allow that,
suddenly you have uncounted references running around. It would then
be possible to do the following:

    C d;
    {
        RefCounted!(C) c = new RefCounted!(C)();
        d = c;
    }
    /* Whoops! d now points to a destructed object! */

Instead, what you'd /really/ want the class designer to do is
something like this:

    struct PrivateFileHandle
    {
        private this(string filename) { /*...*/ }
        ~this() { /*...*/ }
        /* other functions as appropriate */
    }
    alias RefCounted!(PrivateFileHandle) FileHandle;

That way, the caller only has to declare

    FileHandle fh;

but gets it ref-counted. (And there is no way for it not to be refcounted).

If we go the way of "scope C" meaning "RefCounted!(C)" under the hood,
then whether or not you'll need the word "scope" depends on whether
you want to refer to the refcounted type or the underlying type.
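
For readers who want the placeholders filled in, here is one possible sketch of the Countable/RefCounted pair using a struct postblit and destructor. The make factory, the count and get members, and the field name box are assumptions added for illustration; this is not the Phobos type of the same name, and it ignores threading and structs that live on the GC heap.

```d
class Countable(T)
{
    uint refCount;
    T val;
}

struct RefCounted(T)
{
    private Countable!T box;

    // Hypothetical factory: starts a new counted payload at 1.
    static RefCounted make(T initial)
    {
        RefCounted r;
        r.box = new Countable!T;
        r.box.val = initial;
        r.box.refCount = 1;
        return r;
    }

    // Postblit: runs after a copy, so both handles share one count.
    this(this)
    {
        if (box !is null) ++box.refCount;
    }

    // Last handle gone: destruct the payload deterministically.
    ~this()
    {
        if (box !is null && --box.refCount == 0)
            destroy(box.val);   // runs T's destructor, if it has one
    }

    uint count() const { return box is null ? 0 : box.refCount; }
    ref T get() { return box.val; }
}
```

A handle created with RefCounted!int.make(7) has a count of 1; copying it into an inner block raises the count to 2, and leaving the block brings it back to 1. Since destroy resets val to T.init, a payload type with a destructor should tolerate being finalized again in that state.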

Feb 03 2008

Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:

Janice Caron wrote:
 On 03/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 But again that does not change the type of 'char'.

 
 I hope you're not implying that I said it did.
 
 In arguing for 'scope' at the type level, as in 'scope class C { ... }'
 I was arguing that 'scope' applies some attribute to the type of C. The
 class designer is making a conscious decision in designing his type in
 order to tell the compiler that all objects of his type need
 deterministic destruction in a system which normally implements
 non-deterministic destruction via GC. That says something about the type
 per se.

 
 Having actually implemented reference counting in C++, I know how it
 works. For any type T (doesn't matter if it's a class, a struct, or
 just an int), you make two additional types. The first of these just
 adds a reference counter:
 
     class Countable(T)
     {
         uint refCount;
         T val;
     }
 
 and the second is what is exposed to the end user
 
     struct RefCounted(T)
     {
         Countable!(T) val;
         /* and appropriate ref-counting functions */
         /* and appropriate forwarding functions */
     }
 
 It should be clear that there is no way to cast a RefCounted!(T) to a
 T. If you want the following to compile:
 
     RefCounted!(C) c = whatever;
     C d;
     d = c;
 
 Then the only way to do that would be to have RefCounted!(C) be
 implicitly castable to C, presumably by having opImplicitCast() return
 c.val.val. This would be disastrous. The moment you allow that,
 suddenly you have uncounted references running around. It would then
 be possible to do the following:
 
     C d;
     {
         RefCounted!(C) c = new RefCounted!(C)();
         d = c;
     }
     /* Whoops! d now points to a destructed object! */
 
 Instead, what you'd /really/ want the class designer to do is
 something like this:
 
     struct PrivateFileHandle
     {
         private this(string filename) { /*...*/ }
         ~this() { /*...*/ }
         /* other functions as appropriate */
     }
     alias RefCounted!(PrivateFileHandle) FileHandle;
 
 That way, the caller only has to declare
 
     FileHandle fh;
 
 but gets it ref-counted. (And there is no way for it not to be refcounted).
 
 If we go the way of "scope C" meaning "RefCounted!(C)" under the hood,
 then whether or not you'll need the word "scope" depends on whether
 you want to refer to the refcounted type or the underlying type.

I am saying that when the type is 'scope class C' the compiler 
automatically wraps every instantiated object of that type as a 
referenced counted object. While I am not uninterested in the details of 
doing that, as you present it above, I am more interested that it be 
done for the correct situations in order to provide RAII in GC. I do not 
think the end user should have to redundantly specify the object as:

scope C c = new C(...);

when C is already a scope class. The compiler can figure out that C is a 
scope class and silently treat c as a scope object.

As for assigning a reference of a scope object to a non-scope object I 
do understand the problems as outlined above. Thanks for the 
explanation. I still think the compiler can figure out in:

class B {...}
scope class C : B {...}

B b = new C(...);

that the b object upon instantiation needs to be wrapped as a scoped 
object without having to write instead:

scope B b = new C(...);

but I do realize now that with

B b;
b = new C(...);

it would be difficult, perhaps impossible, to somehow switch b from a 
non-scoped object to a scoped one since the object has been wrapped or 
not when it was created. So I do now accept that the second form should 
be illegal.

I know others may think it may be nit-picking that I do not think that 
when the dynamic type, upon creation of an object, can be ascertained as 
a scope class the end user should have to specify the 'scope' keyword 
again in the declaration of the object. But I think it is really 
important for RAII to be as transparent as possible from the end user's 
point of view, while also allowing the end user to create scope 
objects as necessary also when the actual class type of the object 
created is not scope.

Feb 03 2008

Jason House <jason.james.house gmail.com> writes:

Edward Diener Wrote:
 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

It's already been clarified by others that scope is different than a reference
counted object that's deleted immediately when no more references to it exist. 
The post that follows is under the assumption that this applies to the use of
scope as defined in the D language.

RAII tends to require very specific usage semantics.  Because of this alternate
behavior (and requirements on usage), it makes complete sense to mark the
variable as scope when used.  I don't expect the addition or removal of the
scope property of a class to be something that would not require code changes
in other places.

One of the appeals of d to me is that it aims to reduce coding errors.  This
repeat of the use of scope feels like it's an attempt to keep scope usage both
clear and correct.

Jan 29 2008
