
digitalmars.D - Newbie initial comments on D language - scope

reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
The 'scope' mechanism for RAII is a nice compromise for an intractable
general problem in GC languages, and I can see Walter Bright has
possibly been influenced by other GC languages that have sought to
address this issue. But a couple of areas seem really dubious to me.

The first is the necessity of using an already scoped class by repeating
the 'scope' declaration when creating an object of that class. Since the
class itself has already been declared with the 'scope' keyword it seems
absolutely redundant that the user of an object of the class must repeat
'scope' in his usage of that object. Surely the compiler is smart enough
to know that the class is a 'scope' class and will generate the
necessary code to automatically call the destructor of the class when it
goes out of scope. In fact the user of this class via an instantiated
object should not even care if it is a scoped class or not, so having to
say it is again seems doubly wrong, although allowable.

The second is that an object of a 'scope' class can only be instantiated
as a local variable of a function. That pretty much destroys the usage
of a 'scope' class ( aka a class encapsulating a resource which should
be released as soon as it is no longer referenced ) to the most narrow
of usages and means that nobody will bother actually creating such a
class for using RAII in D. Surely a 'scope' class should be instantiable
anywhere, with the obvious candidate being as a data member in an
enclosing class, which itself may or may not be scoped.

The usage of the 'scope' keyword still would have a very important
function if it is designated to force RAII on an instantiated object
which would not ordinarily be scoped. This could occur most naturally
when the programmer is creating any container which may have scoped
objects in it, including the D versions of static and dynamic arrays and
associative arrays. In this way both the class designer can implement
RAII in his class and the programmer can implement RAII on their objects
independently of each other, with both having the necessary control to
solve the resource problem as far as the idea of a scope allows.
Jan 28 2008
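To make the behaviour under discussion concrete, here is a minimal sketch of a scoped local, assuming a D2 compiler that accepts `scope` on a local class reference. `Resource`, `events`, `useResource` and `demo` are names invented for illustration. The class is an ordinary one, since class-level `scope` is, as far as I know, deprecated in later compilers.

```d
// Sketch only: Resource, events, useResource and demo are illustrative names.
string[] events;   // module-level (thread-local) record of what happened, in order

class Resource
{
    this()  { events ~= "acquire"; }
    ~this() { events ~= "release"; }
}

void useResource()
{
    // The keyword on the declaration requests destruction at block exit.
    scope Resource r = new Resource();
    events ~= "work";
}   // r's destructor runs here, before the caller resumes

string[] demo()
{
    events = null;
    useResource();
    events ~= "after call";
    return events;
}
```

Calling `demo()` from a `main` function should show the release recorded before the caller continues, which is the deterministic destruction the thread is about.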
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Edward Diener wrote:
 The 'scope' mechanism for RAII is a nice compromise for an intractable
 general problem in GC languages, and I can see Walter Bright has
 possibly been influenced by other GC languages that have sought to
 address this issue. But a couple of areas seem really dubious to me.
 
 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

I disagree. If the "scope" were not present at the point of declaration, I think it would be too easy for a maintainer of the code to screw up and return the scoped value. Requiring "scope" is akin to C++ having different declaration semantics for dynamic and static types.
 The second is that an object of a 'scope' class can only be instantiated
 as a local variable of a function. That pretty much destroys the usage
 of a 'scope' class ( aka a class encapsulating a resource which should
 be released as soon as it is no longer referenced ) to the most narrow
 of usages and means that nobody will bother actually creating such a
 class for using RAII in D. Surely a 'scope' class should be instantiable
 anywhere, with the obvious candidate being as a data member in an
 enclosing class, which itself may or may not be scoped.

Walter has mentioned in the past that he was considering doing exactly this, but like many other things I think it's been on the back-burner while const was sorted out in 2.0.

Sean
Jan 28 2008
next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Sean Kelly wrote:
 Edward Diener wrote:
 The 'scope' mechanism for RAII is a nice compromise for an intractable
 general problem in GC languages, and I can see Walter Bright has
 possibly been influenced by other GC languages that have sought to
 address this issue. But a couple of areas seem really dubious to me.

 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

I disagree. If the "scope" were not present at the point of declaration, I think it would be too easy for a maintainer of the code to screw up and return the scoped value. Requiring "scope" is akin to C++ having different declaration semantics for dynamic and static types.

I do not understand what you mean by "return the scoped value". If in D I write:

scope class Foo { ... }

then why should I have to write, when declaring an instance of the class:

scope Foo g = new Foo();

as opposed to just:

Foo g = new Foo();

The compiler knows that Foo is a scoped class, so there is no need for the programmer to repeat it in the object declaration.
 
 The second is that an object of a 'scope' class can only be instantiated
 as a local variable of a function. That pretty much destroys the usage
 of a 'scope' class ( aka a class encapsulating a resource which should
 be released as soon as it is no longer referenced ) to the most narrow
 of usages and means that nobody will bother actually creating such a
 class for using RAII in D. Surely a 'scope' class should be instantiable
 anywhere, with the obvious candidate being as a data member in an
 enclosing class, which itself may or may not be scoped.

Walter has mentioned in the past that he was considering doing exactly this, but like many other things I think it's been on the back-burner while const was sorted out in 2.0.

I understand.
Jan 28 2008
next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Jesse Phillips wrote:
 On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:
 
 I do not understand what you mean by "return the scoped value". If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

He is referring to when you have:

scope class Foo {}

Foo doThings() {
    Foo cats = new Foo();
    return cats;
}

cats no longer exists after return.

Yes, I can see that. My own idea of a 'scope' class in a GC environment, one that completely solves the RAII conundrum, is that one should be able to pass around an object of that class and when the last reference to that object goes out of scope the destructor is immediately called. That is very much like what boost::shared_ptr<T> offers for C++ in a language which does not have GC, but it is probably harder to implement in a GC language where such checks are ordinarily not made when an object goes out of scope.

Given that an object of a 'scope' class cannot be passed out of the block in which it is created, the above would lead to an error. But I do not see how that affects my initial observation that one should not have to specify the 'scope' keyword on an object of a 'scope' class when declaring such an object, unless I misunderstand the use of 'scope' in that situation.

Are you saying that without specifying 'scope' for an object of a 'scope' class the object does not behave as a 'scope' object, and that therefore the example you give does not destroy the object when the doThings function exits? If that is the case, then I missed the ramifications of using 'scope' on object declarations themselves.
Jan 28 2008
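The "return the scoped value" hazard can be observed without actually returning a dangling reference. The sketch below assumes a D2 compiler. `live`, `liveInsideScope` and `demo` are hypothetical names, and the offending function is left commented out because it is precisely the bug being discussed (a compiler may well reject it outright).

```d
// Sketch only: live, liveInsideScope and demo are illustrative names.
int live;   // number of Foo objects currently alive

class Foo
{
    this()  { ++live; }
    ~this() { --live; }
}

// The pattern from the thread, written with an explicit `scope`. It would hand
// the caller a reference to an object destroyed on the way out of the function.
// Foo doThings() { scope Foo cats = new Foo(); return cats; }

int liveInsideScope()
{
    scope Foo cats = new Foo();
    return live;    // evaluated while cats is still alive
}                   // cats is destroyed here

bool demo()
{
    live = 0;
    immutable inside = liveInsideScope();
    // One object existed inside the function; none survives the return.
    return inside == 1 && live == 0;
}
```

Since nothing survives the return, any reference handed back from such a function would point at an already destroyed object, which is the maintenance trap Sean describes.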
next sibling parent BCS <ao pathlink.com> writes:
Reply to Edward,

 Yes, I can see that. My own idea of a 'scope' class in a GC
 environment, one that completely solves the RAII conundrum, is that
 one should be able to pass around an object of that class and when the
 last reference to that object goes out of scope the destructor is
 immediately called.
 

IIRC scope just destructs the class on exiting the scope. With some more overhead it might be possible to do better, but it would be much more complex. I'd say "not now".
Jan 28 2008
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Edward Diener, on January 28 at 23:07, you wrote me:
 Jesse Phillips wrote:
On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:
I do not understand what you mean by "return the scoped value". If in D
I write:

scope class Foo { ... }

then why should I have to write, when declaring an instance of the
class:

scope Foo g = new Foo();

as opposed to just:

Foo g = new Foo();

The compiler knows that Foo is a scoped class, so there is no need for
the programmer to repeat it in the object declaration.

scope class Foo {}

Foo doThings() {
    Foo cats = new Foo();
    return cats;
}

cats no longer exists after return.

Yes, I can see that. My own idea of a 'scope' class in a GC environment, one that completely solves the RAII conundrum, is that one should be able to pass around an object of that class and when the last reference to that object goes out of scope the destructor is immediately called. That is very much like what boost::shared_ptr<T> offers for C++ in a language which does not have GC, but it is probably harder to implement in a GC language where such checks are ordinarily not made when an object goes out of scope.

Exactly, that's a reference count, another way of doing *GC*, and that has a lot of other complexities (like circular dependencies) and overhead (the counting itself). Scope is much simpler, and as the name says, it destroys an object when it's out of scope, just like C++ does with any object allocated on the stack.

Anyway, I think I remember Walter saying that he wanted to add to 2.0 the tools necessary to smoothly implement reference counting.

--
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
1 cigarette takes away 5 minutes of a person's life
Jan 29 2008
prev sibling parent Sean Kelly <sean f4.ca> writes:
Edward Diener wrote:
 Jesse Phillips wrote:
 On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:

 I do not understand what you mean by "return the scoped value". If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

He is referring to when you have:

scope class Foo {}

Foo doThings() {
    Foo cats = new Foo();
    return cats;
}

cats no longer exists after return.

Yes, I can see that. My own idea of a 'scope' class in a GC environment, one that completely solves the RAII conundrum, is that one should be able to pass around an object of that class and when the last reference to that object goes out of scope the destructor is immediately called. That is very much like what boost::shared_ptr<T> offers for C++ in a language which does not have GC, but it is probably harder to implement in a GC language where such checks are ordinarily not made when an object goes out of scope.

Someone may already have mentioned this, but 2.0 will eventually get copy and destruction semantics for structs so it should be possible to create a fairly decent smart pointer in D as well.

Sean
Jan 29 2008
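For readers on a current toolchain, the standard library's `std.typecons.RefCounted` is one struct-based smart pointer of the kind anticipated here. The following is a small sketch of how its count moves, assuming a recent Phobos. `demo` and `counts` are illustrative names, and the payload is a plain `int` to keep the example short.

```d
import std.typecons : RefCounted;

// Sketch only: records the reference count at three points in time.
int[] demo()
{
    int[] counts;
    auto a = RefCounted!int(5);                          // one owner
    counts ~= cast(int) a.refCountedStore.refCount;
    {
        auto b = a;                                      // copying adds an owner
        counts ~= cast(int) a.refCountedStore.refCount;
    }                                                    // b goes away here
    counts ~= cast(int) a.refCountedStore.refCount;
    return counts;
}
```

The payload is released when the last copy is destroyed. Reference cycles are not detected by this scheme, which is the complexity Leandro mentions.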
prev sibling next sibling parent "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Tue, 29 Jan 2008 08:39:46 -0000, Rioshin an'Harthen <rharth75 hotmail.com> wrote:

 I'll go somewhat deeper into this than Janice did.

 This looks like it is a related issue to C# requiring the method-calling
 place to note what parameters are ref and what parameters are out. The
 compiler of course already knows which parameter is using which
 convention: normal, ref, out... But the designers of C# think it's
 better to require

     len += o.foo(bar, ref baz, out foobar);

 than

     len += o.foo(bar, baz, foobar);

 because it documents the call better. Just by reading the call, it is
 immediately known that baz is passed as a reference, which might change
 baz, and that foobar will be used to return an additional value from the
 method. No need to look up the definition of o.foo anywhere.

ns are. The idea of deliberately programming blind does not appeal to me. Extra documentation as a style option, on the other hand, sounds reasonable.

 The same reasoning applies to forcing scoped classes to be marked as
 such at the point of declaration in D. It requires for each instance the
 typing of an extra keyword, but the readability of the code increases by
 such a factor that typing that extra keyword doesn't bother me.

 Now, you're reading through code that somebody else has written, having
 inherited the maintaining of that code. There's a bad bug in it that has
 to be fixed and quickly, because it's stopping the project from being
 finished - and it was supposed to be ready a few weeks ago. You've
 managed to track the bug to a certain function and look upon the code:

     Foo f = new Foo;
     // do whatever with f
     return f;

 Suppose D doesn't require that scope in front of Foo. You have to check
 the Foo class, and you see the scope in front of the declaration.
 However, when D requires the scope keyword, the code looks like:

     scope Foo f = new Foo;
     // do whatever with f
     return f;

 Bling! We see the error immediately, and have saved the need to actually
 look for where the Foo class is defined, which could be in the middle of
 a multi-kloc file that you couldn't have guessed from the ton of imports
 at the top of the file this function resides in.

Any halfway decent compiler should report that as a semantic error and refuse to compile it. If the error message is specific enough, repeating the keyword or not makes no odds. As Janice says, type (storage class) deduction may be a more serious can of worms.
Jan 29 2008
prev sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

What you're suggesting is "semantic sugar" - allowing the compiler to save us a bit of typing. Sometimes, that can be a good thing. Here, however, I don't think it would be. You see, while the /compiler/ knows that Foo is RAII (you're right about that), future maintainers of the function might not. Forcing the use of the keyword makes the code a bit more readable. Here's another way of looking at it: The right hand side of the statement is evaluated /first/. Then it is assigned to the lvalue. So, when the RHS is evaluated (new Foo()) it returns a value whose type is "scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the assignment fails. Allowing the semantic sugar would be like "storage-class-deduction", which would open up a really huge can of worms, and we almost certainly don't want to go there (at least not before const has settled down).

I see it exactly the other way. It is semantic sugar to force a programmer to specify that the instantiation of a scope class creates a scope object, just for the sake of making future maintainers feel better.

One should not really care that a class is a scope class. It should just work to release the resource it encompasses when it goes out of scope by having its destructor called. In a GC environment memory is just another resource. The user of objects does not worry about memory being released as appropriate. Why should he have to worry about other resources being released as appropriate?

Understand that I am not saying that the user of an object of a scope class cannot benefit from knowing, if he chooses, that the class is a scope class. Part of my suggestion about the keyword 'scope' in D is that when used it should force any object, even not normally scoped, to be destroyed when it goes out of scope. In this way the programmer can force a container of scoped objects to be destroyed immediately when it goes out of scope even though the container is not a scoped type.
Jan 29 2008
next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Edward Diener wrote:
 Janice Caron wrote:
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the 
 class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

What you're suggesting is "semantic sugar" - allowing the compiler to save us a bit of typing. Sometimes, that can be a good thing. Here, however, I don't think it would be. You see, while the /compiler/ knows that Foo is RAII (you're right about that), future maintainers of the function might not. Forcing the use of the keyword makes the code a bit more readable. Here's another way of looking at it: The right hand side of the statement is evaluated /first/. Then it is assigned to the lvalue. So, when the RHS is evaluated (new Foo()) it returns a value whose type is "scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the assignment fails. Allowing the semantic sugar would be like "storage-class-deduction", which would open up a really huge can of worms, and we almost certainly don't want to go there (at least not before const has settled down).

I see it exactly the other way. It is semantic sugar to force a programmer to specify that the instantiation of a scope class creates a scope object, just for the sake of making future maintainers feel better. One should not really care that a class is a scope class. It should just work to release the resource it encompasses when it goes out of scope by having its destructor called. In a GC environment memory is just another resource. The user of objects does not worry about memory being released as appropriate. Why should he have to worry about other resources being released as appropriate ? Understand that I am not saying that the user of an object of a scope class can not benefit from knowing, if he chooses, that the class is a scope class. Part of my suggestion about the keyword 'scope' in D is that when used it should force any object, even not normally scoped, to be destroyed when it goes out of scope. In this way the programmer can force a container of scoped objects to be destroyed immediately when it goes out of scope even though the container is not a scoped type.

My opinion is that the current rules make scope classes of very limited use. Regular classes can still be used 'scope'd if desired, so all you're doing by declaring the class itself scope is removing the users' freedom to choose. The only reason to use it is if you have a class that absolutely must not persist longer than one stack frame. If you don't have such a requirement then you might as well not bother with scope classes. Let users decide whether they want it to be scope or not.

--bb
Jan 29 2008
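The "let users decide" position can be illustrated with an ordinary class whose caller opts in to scope-bound lifetime. This sketch assumes a recent Phobos and uses `std.typecons.scoped`, a library helper that postdates this thread. `Connection`, `released` and `demo` are hypothetical names.

```d
import std.typecons : scoped;

// Sketch only: Connection, released and demo are illustrative names.
int released;   // how many Connection destructors have run

class Connection    // an ordinary class, not marked scope
{
    ~this() { ++released; }
}

int demo()
{
    released = 0;
    {
        auto c = scoped!Connection();   // this caller chooses scope-bound lifetime
    }                                   // the destructor runs at this brace
    return released;
}
```

Another caller remains free to write `new Connection` and leave the lifetime to the garbage collector, since the class itself imposes nothing.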
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 Understand that I am not saying that the user of an object of a scope 
 class can not benefit from knowing, if he chooses, that the class is a 
 scope class. Part of my suggestion about the keyword 'scope' in D is 
 that when used it should force any object, even not normally scoped, to 
 be destroyed when it goes out of scope. In this way the programmer can 
 force a container of scoped objects to be destroyed immediately when it 
 goes out of scope even though the container is not a scoped type.

You both have good points, but the problem with scoped classes is that there are semantic problems with them if they are not carefully used. I'm working on adding destructors to structs, which I'm thinking should completely supplant scoped classes. RAII is a much more natural fit with structs than it ever will be for classes.
Jan 30 2008
next sibling parent Extrawurst <spam extrawurst.org> writes:
Walter Bright schrieb:
 I'm working on adding destructors to structs

That's good news. I am looking forward to it.
Jan 30 2008
prev sibling next sibling parent reply "Craig Black" <cblack ara.com> writes:
 I'm working on adding destructors to structs, which I'm thinking should 
 completely supplant scoped classes. RAII is a much more natural fit with 
 structs than it ever will be for classes.

Very good Walter!! What about copy semantics for structs?

-Craig
Jan 30 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
Craig Black wrote:
 I'm working on adding destructors to structs, which I'm thinking should 
 completely supplant scoped classes. RAII is a much more natural fit with 
 structs than it ever will be for classes.

Very good Walter!! What about copy semantics for structs?

You cannot do destructors for value objects without copy constructors and assignment overloads.
Jan 30 2008
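Why a destructor on a value type drags copy handling along with it can be seen in a small sketch. It assumes a D2 compiler with struct postblits (`this(this)`), the copy hook that D2 later provided. `Buffer`, `liveBuffers` and `demo` are names made up for illustration. Without the postblit, the two copies would share one pointer and both destructors would free it.

```d
import core.stdc.stdlib : malloc, free;

// Sketch only: Buffer, liveBuffers and demo are illustrative names.
int liveBuffers;   // outstanding allocations

struct Buffer
{
    private int* data;

    this(int value)
    {
        data = cast(int*) malloc(int.sizeof);
        *data = value;
        ++liveBuffers;
    }

    // Postblit: runs after the bitwise copy, so the copy gets its own memory.
    this(this)
    {
        if (data is null) return;
        auto fresh = cast(int*) malloc(int.sizeof);
        *fresh = *data;
        data = fresh;
        ++liveBuffers;
    }

    ~this()
    {
        if (data is null) return;
        free(data);
        data = null;
        --liveBuffers;
    }
}

int[] demo()
{
    liveBuffers = 0;
    int peak;
    {
        auto a = Buffer(7);
        auto b = a;             // postblit gives b a separate allocation
        peak = liveBuffers;
    }                           // both destructors run, each freeing its own memory
    return [peak, liveBuffers];
}
```

Assignment is covered here by the `opAssign` the compiler generates for structs that have a postblit or destructor. A hand-written overload would be needed only for different semantics.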
prev sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 Understand that I am not saying that the user of an object of a scope 
 class can not benefit from knowing, if he chooses, that the class is a 
 scope class. Part of my suggestion about the keyword 'scope' in D is 
 that when used it should force any object, even not normally scoped, 
 to be destroyed when it goes out of scope. In this way the programmer 
 can force a container of scoped objects to be destroyed immediately 
 when it goes out of scope even though the container is not a scoped type.

You both have good points, but the problem with scoped classes is that there are semantic problems with them if they are not carefully used. I'm working on adding destructors to structs, which I'm thinking should completely supplant scoped classes. RAII is a much more natural fit with structs than it ever will be for classes.

Please reconsider that decision, especially in the light of the restrictions to structs in D which classes do not have. You would essentially be saying that any class designer, who would want to incorporate deterministic destruction in his class because of a need to free a resource upon class destruction, is constrained in D to using a struct rather than a class. In that case why bother, since structs are so much less than a class in features. You might just as well say "I did not want the challenge of RAII in D, a GC language, so I will just kill it this way."

If you really don't want RAII in D, which simply and fairly enough means you want the release of resources in your GC environment to always be done manually, just don't implement it at all. That is much more straightforward than attempting to support it but doing so in a way that makes it impossible for a class designer to implement it.

The current 'scope' keyword for RAII is very limited. My OP was not questioning that but objecting to the redundant way it had to be used even in that environment. However, making it more limiting rather than less limiting just kills it entirely IMO, where you should be seeking to go in exactly the opposite direction.
Jan 30 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 Walter Bright wrote:
 I'm working on adding destructors to structs, which I'm thinking 
 should completely supplant scoped classes. RAII is a much more natural 
 fit with structs than it ever will be for classes.

Please reconsider that decision, especially in the light of the restrictions to structs in D which classes do not have. You would essentially be saying that any class designer, who would want to incorporate deterministic destruction in his class because of a need to free a resource upon class destruction, is constrained in D to using a struct rather than a class. In that case why bother, since structs are so much less than a class in features. You might just as well say "I did not want the challenge of RAII in D, a GC language, so I will just kill it this way." If you really don't want RAII in D, which simply and fairly enough means you want the release of resources in your GC environment to always be done manually, just don't implement it at all. That is much more straightforward than attempting to support but doing it in such a way that makes it impossible for a class designer to implement it. The current 'scope' keyword for RAII is very limited. My OP was not questioning that but objecting to the redundant way it had to be used even in that environment. However, making it more limiting rather than less limiting just kills it entirely IMO, where you should be seeking to go in exactly the opposite direction.

Maybe you're selling structs short :-). With RAII structs, to make an RAII class, one could create a wrapper struct template:

    struct Wrapper(C)
    {
        C c;
        ~this() { delete c; }
    }

This is oversimplified, as there would also need to be a mechanism to forward operations from the struct to the class C, copy constructors, etc., but I think it is conceptually sound. Such wrapper structs could, for example, be written to reference count their argument equivalently to the C++ shared_ptr<>.

From another point of view, an RAII type is fundamentally a value type, whereas classes are fundamentally reference types. By using the wrapper approach to impart some value (i.e. RAII) semantics to a reference type, the operations on that reference type can be carefully controlled by the wrapper designer to prevent such problems as references escaping the scope - solutions which are problematic to put in the core language.

I'm not sure what you mean by saying that structs are so much less than classes. Structs aren't a subset of classes, they are a fundamentally different animal - a value type, as opposed to a class, which is a reference type. C++ does not distinguish between the two, leaving serious problems such as the "slicing problem", the virtual destructor problem, and trying to prevent users of your C++ class from using it as a value when it should be a reference or vice versa.

Behaviors which are a natural fit for reference types include inheritance, polymorphism, and gc (including ref counted gc).

Behaviors which are a natural fit for value types are scoped allocation, RAII, non-virtual functions.
Jan 30 2008
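A runnable variation on the wrapper idea might look like the following. It is a sketch under stated assumptions, not Walter's eventual design: it targets a recent D2 compiler, where `delete` is, as far as I am aware, no longer accepted, so `destroy` stands in for it. Copying is simply disabled rather than implemented, and `Session`, `destroyed` and `demo` are hypothetical names.

```d
// Sketch only: Session, destroyed and demo are illustrative names.
int destroyed;   // how many Session destructors have run

class Session
{
    string name;
    this(string name) { this.name = name; }
    ~this() { ++destroyed; }
    string greet() { return "hello from " ~ name; }
}

struct Wrapper(C) if (is(C == class))
{
    C c;

    this(C c) { this.c = c; }

    @disable this(this);    // no copies, so exactly one owner runs the destructor

    ~this()
    {
        if (c !is null)
        {
            destroy(c);     // stands in for `delete c` from the post
            c = null;
        }
    }
}

string demo()
{
    destroyed = 0;
    string seen;
    {
        auto w = Wrapper!Session(new Session("db"));
        seen = w.c.greet();
    }                       // the wrapper's destructor destroys the Session here
    return seen;
}
```

Disabling the postblit is the bluntest way to keep references from being duplicated. A reference-counting wrapper would implement the copy hook instead.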
parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 Walter Bright wrote:
 I'm working on adding destructors to structs, which I'm thinking 
 should completely supplant scoped classes. RAII is a much more 
 natural fit with structs than it ever will be for classes.

Please reconsider that decision, especially in the light of the restrictions to structs in D which classes do not have. You would essentially be saying that any class designer, who would want to incorporate deterministic destruction in his class because of a need to free a resource upon class destruction, is constrained in D to using a struct rather than a class. In that case why bother, since structs are so much less than a class in features. You might just as well say "I did not want the challenge of RAII in D, a GC language, so I will just kill it this way." If you really don't want RAII in D, which simply and fairly enough means you want the release of resources in your GC environment to always be done manually, just don't implement it at all. That is much more straightforward than attempting to support but doing it in such a way that makes it impossible for a class designer to implement it. The current 'scope' keyword for RAII is very limited. My OP was not questioning that but objecting to the redundant way it had to be used even in that environment. However, making it more limiting rather than less limiting just kills it entirely IMO, where you should be seeking to go in exactly the opposite direction.

Maybe you're selling structs short :-). With RAII structs, to make an RAII class, one could create a wrapper struct template:

    struct Wrapper(C)
    {
        C c;
        ~this() { delete c; }
    }

This is oversimplified, as there would also need to be a mechanism to forward operations from the struct to the class C, copy constructors, etc., but I think it is conceptually sound. Such wrapper structs could, for example, be written to reference count their argument equivalently to the C++ shared_ptr<>.

OK, I see where you are going.
 
  From another point of view, an RAII type is fundamentally a value type, 
 whereas classes are fundamentally reference types. By using the wrapper 
 approach to impart some value (i.e. RAII) semantics to a reference type, 
 the operations on that reference type can be carefully controlled by the 
 wrapper designer to prevent such problems as references escaping the 
 scope - solutions which are problematic to put in the core language.
 
 I'm not sure what you mean by saying that structs are so much less than 
 classes. Structs aren't a subset of classes, they are a fundamentally 
 different animal - a value type, as opposed to a class, which is a 
 reference type. C++ does not distinguish between the two, leaving 
 serious problems such as the "slicing problem", the virtual destructor 
 problem, and trying to prevent users of your C++ class from using it as 
 a value when it should be a reference or vice versa.
 
 Behaviors which are a natural fit for reference types include 
 inheritance, polymorphism, and gc (including ref counted gc).
 
 Behaviors which are a natural fit for value types are scoped allocation, 
 RAII, non-virtual functions.

You have sold me. When you said 'struct' I was not thinking in terms of a template class, ala boost::shared_ptr<T>, but instead of the limitations of 'struct' in D as opposed to a class, which tells me that in D a struct is a C++ POD. I have not read about templates yet in D so a struct that is a template class and wraps an object which is the actual type was beyond my thinking. Your idea is absolutely right and I was wrong to criticize it without understanding what you meant.

Now that I see where you are going, and you mentioned forwarding in your description above, I thought of how boost::shared_ptr does it and I realized that the C++ 'operator ->' is the key. So I immediately looked for the equivalent in D, which would allow this to happen also, which would be an op function for the 'operator .'. But I could not find this operator supported in the D 1.0 docs.

My suggestion then, if you are going to make the idea above work, is that you need to support an op function for the 'operator .' and then forwarding into the wrapped object would be simple and automatic no matter what functionality the wrapped object had.
Jan 30 2008
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 Now that I see where you are going, and you mentioned forwarding in your 
 description above, I thought of how boost::shared_ptr does it and I 
 realized that the C++ 'operator ->' is the key. So I immediately looked 
 for the equivalent in D, which would allow this to happen also, which 
 would be an op function for the 'operator .'. But I could not find this 
 operator supported in the D 1.0 docs. My suggestion then, if you are 
 going to make the idea above work, is that you need to support an op 
 function for the 'operator .' and then forwarding into the wrapped 
 object would be simple and automatic no matter what functionality the 
 wrapped object had.

I agree that a method for forwarding is needed to complete the job. I plan on working on that after the RAII stuff is working.
Jan 31 2008
parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 Now that I see where you are going, and you mentioned forwarding in 
 your description above, I thought of how boost::shared_ptr does it and 
 I realized that the C++ 'operator ->' is the key. So I immediately 
 looked for the equivalent in D, which would allow this to happen also, 
 which would be an op function for the 'operator .'. But I could not 
 find this operator supported in the D 1.0 docs. My suggestion then, if 
 you are going to make the idea above work, is that you need to support 
 an op function for the 'operator .' and then forwarding into the 
 wrapped object would be simple and automatic no matter what 
 functionality the wrapped object had.

I agree that a method for forwarding is needed to complete the job. I plan on working on that after the RAII stuff is working.

Thinking about this further, why not go all the way and just provide automatic support for all 'scope' objects as RAII constructs with reference counted destruction. If you did that D would be the first GC language to have a transparent mechanism for handling deterministic destruction.

What you are saying is that you want to allow a template struct to be a wrapper for 'scope' classes. Your idea is that when the object of that template class gets created the reference count is set to 1, as the object of that template class gets copied or assigned to another object of the same type the reference count goes up, when the object is destructed the reference count goes down and, if the reference count goes to 0, the wrapped GC class gets destroyed.

Obviously in D you are tracking whenever a struct goes out of scope in order to call the struct's destructor. Just as obviously you must allow some copy constructor and assignment processing for a struct whenever it gets copied or assigned, in order to increment the reference count.

If this plan is workable you could do the exact same thing at the compiler level when dealing with a 'scope' object.

You could allow 'scope' to be specified at the class level, as you are now doing, or at the object level, as you are now insisting on doing when creating objects of 'scope' classes even though it is redundant ( the initial reason for my OP ). But instead of being redundant, as it is now, you could allow it as a way of saying that the end-user wants a particular object to use scoping, i.e. to be deterministically destroyed when the last reference to the object goes out of scope. In this way both the class designer, who knows if he may need his class to be 'scope' because he knows if he needs to release a resource, and the object user, who may need control of 'scope' for containers which themselves are not 'scope' types but which may contain 'scope' types, have full control of 'scope'.

Voila ! You now have a full GC language in which both the class designer, via a 'scope' class, and the object creator, via a 'scope' object, have complete control over the destruction of objects. Yours would be the first GC language to really solve the problem of objects encapsulating non-memory resources being destroyed deterministically when references to the object are no longer being used. All other GC languages just gloss over the problem or maintain that it occurs so rarely there is no need for anything but manual release of non-memory resources ( via try/catch and specialized Dispose/Close ) or semi-manual methods such as your current very limited use of 'scope'.

I hear you saying, "No I don't want to be the first GC language to solve this problem especially as Java, .Net, Python, Ruby, et al. just pretend it does not exist or is unimportant for practical programming and besides, it is difficult to solve and I have lots of other, better things to do, and finally few people will know or give me credit for it anyway." But somewhere, some day, someone is going to point out this flaw in GC and a solution, as I have described, will be implemented, and then everyone will say, "why did we not think of this sooner". And some bright person will say, "you know Walter Bright solved this years ago with D."
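
The counting scheme described above (start at 1, go up on copy or assignment, go down on destruction, destroy at 0) can be sketched in C++ as follows. This is an illustration under stated assumptions, not the planned D design: RefCounted, Resource and the test function are hypothetical names, the count is a plain non-atomic long, and exception safety is ignored.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource that records when its destructor runs.
struct Resource {
    explicit Resource(bool* flag) : destroyed(flag) {}
    ~Resource() { *destroyed = true; }
    bool* destroyed;
};

template <typename T>
class RefCounted {
public:
    explicit RefCounted(T* p) : ptr_(p), count_(new long(1)) {}   // count starts at 1
    RefCounted(const RefCounted& o) : ptr_(o.ptr_), count_(o.count_) { ++*count_; }
    RefCounted& operator=(RefCounted o) {      // copy-and-swap: the by-value copy has
        std::swap(ptr_, o.ptr_);               // already bumped the new count, and o's
        std::swap(count_, o.count_);           // destructor drops the old one
        return *this;
    }
    ~RefCounted() {
        if (--*count_ == 0) { delete ptr_; delete count_; }       // last reference gone
    }
    long use_count() const { return *count_; }
    T* operator->() const { return ptr_; }
private:
    T* ptr_;
    long* count_;
};

bool destroyed_only_after_last_reference() {
    bool destroyed = false;
    {
        RefCounted<Resource> a(new Resource(&destroyed));
        {
            RefCounted<Resource> b = a;                  // count 1 -> 2
            if (a.use_count() != 2) return false;
        }                                                // b gone: 2 -> 1
        if (destroyed || a.use_count() != 1) return false;
    }                                                    // a gone: 1 -> 0, ~Resource runs
    return destroyed;
}
```

Everything the sketch relies on (a destructor run at end of scope, plus hooks on copy and assignment) is exactly the struct machinery listed above.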
Feb 02 2008
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 Thinking about this further, why not go all the way and just provide 
 automatic support for all 'scope' objects as RAII constructs with 
 reference counted destruction. If you did that D would be the first GC 
 language to have a transparent mechanism for handling deterministic 
 destruction.

You read my mind <g>. But I wanted to see how well proxy objects would work first; they may not need the extra help.
 What you are saying is that you want to allow a template struct to be a 
 wrapper for 'scope' classes. Your idea is that when the object of that 
 template class gets created the reference count is set to 1, as the 
 object of that template class gets copied or assigned to another object 
 of the same type the reference goes up, when the object is destructed 
 the reference count goes down and, if the reference count goes to 0, the 
 wrapped GC class gets destroyed.

Yes.
 Obviously in D you are tracking whenever a struct goes out of scope in 
 order to call the struct's destructor. Just as obviously you must allow 
 some copy constructor and assignment processing for a struct whenever it 
 gets copied or assigned, in order to increment the reference count.

Yes.
 If this plan is workable you could do the exact same thing at the 
 compiler level when dealing with a 'scope' object.

Yes.
 You could allow 'scope' to be specified at the class level, as you are 
 now doing, or at the object level, as you are now insisting on doing 
 when creating objects of 'scope' classes even though it is redundant ( 
 the initial reason for my OP ). But instead of being redundant, as it is 
 now, you could allow it as a way of saying that the end-user wants a 
 particular object to use scoping, ie. to be deterministically destroyed 
 when the last reference to the object goes out of scope. In this way 
 both the class designer, who knows if he may need his class to be 
 'scope' because he knows if he needs to release a resource, and the 
 object user, who may need control to 'scope' for containers which 
 themselves are not 'scope' type but which may contain 'scope' types, 
 have full control of 'scope'
 
 Voila ! You now have a full GC language in which both the class 
 designer, via a 'scope' class, and the object creator, via a 'scope' 
 object, has complete control over the destruction of objects. Yours 
 would be the first GC language to really solve the problem of objects 
 encapsulating non-memory resources being destroyed deterministically 
 when references to the object are no longer being used. All other GC 
 languages just gloss over the problem or maintain that it occurs so 
 rarely there is no need for anything but manual release of non-memory 
 resources ( via try/catch and specialized Dispose/Close ) or semi-manual 
 methods such as your current very limited use of 'scope'.
 
 I hear you saying, "No I don't want to be the first GC language to solve 
 this problem especially as Java, .Net, Python, Ruby, et al.  just pretend
 it does not exist or is unimportant for practical programming and 
 besides, it is difficult to solve and I have lots of other, better 
 things to do, and finally few people will know or give me credit for it 
 anyway." But somewhere, some day, someone is going to point out this 
 flaw in GC and a solution, as I have described, will be implemented, and 
 then everyone will say, "why did we not think of this sooner". And some 
 bright person will say, "you know Walter Bright solved this years ago 
 with D."

We did think of it over a year ago, and have been laying the groundwork for it (wanted to get the const madness done first). This will enable D to be, as you say, the first language to support the triumvirate of explicit, automatic, and ref counted memory allocation on an equal footing.

The only question is whether the proxy struct will be easy enough to use to not need extra core language support:

    scope C c;

vs.:

    Scope!(C) c;

One major consideration arguing for it to be a library feature is multithreading. Doing locked reference counts is slow, and needed only for a minority of objects. It should be selectable when you allocate the object whether you need it multithreaded or not.
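
One way a library could make the locked/unlocked choice selectable per object is a counter policy passed as a template parameter. The following C++ sketch only illustrates that idea; PlainCount, AtomicCount, Shared and demo_counts are hypothetical names, and assignment is left out for brevity.

```cpp
#include <atomic>
#include <cassert>

// Two interchangeable counter policies (hypothetical names).
struct PlainCount {                       // cheap, single-threaded only
    long n = 1;
    void inc() { ++n; }
    bool dec_is_zero() { return --n == 0; }
};
struct AtomicCount {                      // may be shared between threads
    std::atomic<long> n{1};
    void inc() { n.fetch_add(1, std::memory_order_relaxed); }
    // fetch_sub returns the previous value, so 1 means the count is now 0.
    bool dec_is_zero() { return n.fetch_sub(1, std::memory_order_acq_rel) == 1; }
};

template <typename T, typename Count = PlainCount>
class Shared {
public:
    explicit Shared(T* p) : ptr_(p), count_(new Count) {}
    Shared(const Shared& o) : ptr_(o.ptr_), count_(o.count_) { count_->inc(); }
    Shared& operator=(const Shared&) = delete;   // omitted to keep the sketch short
    ~Shared() {
        if (count_->dec_is_zero()) { delete ptr_; delete count_; }
    }
    T& operator*() const { return *ptr_; }
private:
    T* ptr_;
    Count* count_;
};

int demo_counts() {
    Shared<int> local(new int(20));                  // default: unlocked count
    Shared<int, AtomicCount> shared(new int(22));    // opt in to the atomic count
    Shared<int, AtomicCount> copy = shared;
    return *local + *copy;
}
```

Only objects declared with the atomic policy pay for the locked instruction; the default stays a plain increment.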
Feb 02 2008
next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic increment and atomic decrement instead. (I implemented a ref-counted template in C++ once, and that's what I did. It seemed to work).

Doing atomic inc/dec *is* locking. The LOCK CPU instruction is there, but it's mighty slow.
Feb 02 2008
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic increment and atomic decrement instead. (I implemented a ref-counted template in C++ once, and that's what I did. It seemed to work).

A locked operation on x86 takes something like 80ns, which is far from cheap. Though I think a cleverly implemented algorithm may be able to avoid the use of 'lock' altogether (Boost's does, IIRC).

Sean
Feb 02 2008
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sean Kelly Wrote:

 A locked operation on x86 takes something like 80ns, which is far from
 cheap.  Though I think a cleverly implemented algorithm may be able to
 avoid the use of 'lock' altogether (Boost's does, IIRC).

Boost uses InterlockedIncrement() etc. under Windows and lock inc dword ptr [esi] and such otherwise. That's what Walter's talking about.
Feb 03 2008
parent Sean Kelly <sean f4.ca> writes:
Sergey Gromov wrote:
 Sean Kelly Wrote:
 
 A locked operation on x86 takes something like 80ns, which is far from
 cheap.  Though I think a cleverly implemented algorithm may be able to
 avoid the use of 'lock' altogether (Boost's does, IIRC).

Boost uses InterlockedIncrement() etc. under Windows and lock inc dword ptr [esi] and such otherwise. That's what Walter's talking about.

Ah, for some reason I thought they used a sort of spinlock.

Sean
Feb 03 2008
prev sibling next sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 The only question is whether the proxy struct will be easy enough to use 
 to not need extra core language support:
 
      scope C c;
 
 v.s.:
 
      Scope!(C) c;

The latter won't be as nice as-is, since in the first case you can omit the C, and have it inferred. Unless you accept Scope!(auto), I don't see how you could do this. Even then, you need both 'Scope' and 'auto', but I personally don't have a problem with dropping type inference for declarations that don't include 'auto' and having 'auto' be automatically replaced with the inferred type (even in template arguments). That wouldn't be backwards-compatible though, so you might need to keep allowing non-'auto' automatic inference as well. Currently 'auto' is also allowed in non-final-attribute position, and that would be inconsistent as well if it's kept.
Feb 02 2008
prev sibling next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 Voila ! You now have a full GC language in which both the class 
 designer, via a 'scope' class, and the object creator, via a 'scope' 
 object, has complete control over the destruction of objects. Yours 
 would be the first GC language to really solve the problem of objects 
 encapsulating non-memory resources being destroyed deterministically 
 when references to the object are no longer being used. All other GC 
 languages just gloss over the problem or maintain that it occurs so 
 rarely there is no need for anything but manual release of non-memory 
 resources ( via try/catch and specialized Dispose/Close ) or 
 semi-manual methods such as your current very limited use of 'scope'.

 I hear you saying, "No I don't want to be the first GC language to 
 solve this problem especially as Java, .Net, Python, Ruby, et al.  
 just pretend
 it does not exist or is unimportant for practical programming and 
 besides, it is difficult to solve and I have lots of other, better 
 things to do, and finally few people will know or give me credit for 
 it anyway." But somewhere, some day, someone is going to point out 
 this flaw in GC and a solution, as I have described, will be 
 implemented, and then everyone will say, "why did we not think of this 
 sooner". And some bright person will say, "you know Walter Bright 
 solved this years ago with D."

We did think of it over a year ago, and have been laying the groundwork for it (wanted to get the const madness done first). This will enable D to be, as you say, the first language to support the triumvirate of explicit, automatic, and ref counted memory allocation on an equal footing.

That would be superb !
 
 The only question is whether the proxy struct will be easy enough to use 
 to not need extra core language support:
 
      scope C c;
 
 v.s.:
 
      Scope!(C) c;

I would like to see:

    C c = new C(...);

and if C is a 'scope' class it is handled exactly the same as:

    scope C c = new C(...);

if C is not a scope class. In both cases the 'c' object gets reference counted and treated as such when it is copied, assigned, and leaves a D scope in order to have its destructor called immediately when there are no more references to it.

In other words I do not like the imposition of having to treat C as a struct, which the form:

    scope C c;

implies, nor of having to specify 'scope' if the class itself has already been marked as 'scope' in the class definition.

The idea in my mind is essentially that 'scope' classes automatically define an object that, when used exactly as a normal GC object, automatically calls the destructor of the object just as soon as the last reference to it goes out of scope. In this sense the user neither knows nor cares whether the object encapsulates a resource or not and uses an object of such a class just as he would use any other GC object. Similarly the user can force an object to be a 'scope' object through the second syntax given above, but from then on he treats the object just as he would any other GC object.

My point of view is quite simply that the core D language should make the syntax for using 'scope' objects as indistinguishable from using normal GC objects as possible. Having a different way just to instantiate an object of a 'scope' class does not do that, as it removes the transparency of their use.
 
 One major consideration arguing for it to be a library feature is 
 multithreading. Doing locked reference counts is slow, and needed only 
 for a minority of objects. It should be selectable when you allocate the 
 object whether you need it multithreaded or not.

Please take a look at atomic operations, which I am sure you already know about. I believe boost::shared_ptr<T> in its latest incarnation is using them to increment and decrement the reference count without the usual time penalty which you mention.
Feb 02 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically 
 define an object that, when used exactly as a normal GC object, 
 automatically calls the destructor of the object just as soon as the 
 last reference to it goes out of scope. In this sense the user neither 
 knows nor cares whether the object encapsulates a resource or not and uses 
 an object of such a class just as he would use any other GC object. 
 Similarly the user can force an object to be a 'scope' object through 
 the second syntax given above, but from then on he treats the object just 
 as he would any other GC object.

The problem is:

    scope C c;
    C d;
    d = c;

and now d is no longer properly ref-counted. The 'scopeness' of an object must therefore be part of its type. The assignment d=c would have to be illegal.
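
The same rule can be seen in C++, where the count lives in the handle type. The sketch below uses std::shared_ptr purely as a stand-in for a counted 'scope' reference (an analogy, not the D design); C and count_after_copy are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct C {};   // hypothetical class

// Because the count is part of the handle's type, a counted handle cannot be
// assigned to a plain pointer, so an uncounted alias cannot be made by accident.
static_assert(!std::is_assignable<C*&, std::shared_ptr<C> >::value,
              "C* d; d = c; is rejected when c is a counted handle");
static_assert(std::is_assignable<std::shared_ptr<C>&, std::shared_ptr<C> >::value,
              "counted-to-counted assignment is allowed");

long count_after_copy() {
    std::shared_ptr<C> c = std::make_shared<C>();
    std::shared_ptr<C> d;
    d = c;                    // both sides are counted, so the count stays correct
    return c.use_count();     // 2
}
```

The compiler can reject the bad assignment only because the two declarations have different static types, which is the argument for making 'scope' part of the type.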
 Please take a look at atomic operations, which I am sure you already 
 know about. I believe boost::shared_ptr<T> in its latest incarnation is 
  using it to increment and decrement the reference count without the 
 usual time penalty which you mention.

There has been a lot of work done improving atomic operations. That's one reason for making it a library feature - the library stuff can be improved without having to change the compiler.
Feb 02 2008
next sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically 
 define an object that, when used exactly as a normal GC object, 
 automatically calls the destructor of the object just as soon as the 
 last reference to it goes out of scope. In this sense the user neither 
 knows nor cares whether the object encapsulates a resource or not and 
 uses an object of such a class just as he would use any other GC 
 object. Similarly the user can force an object to be a 'scope' object 
 through the second syntax given above, but from then on he treats the 
 object just as he would any other GC object.

The problem is:

    scope C c;
    C d;
    d = c;

and now d is no longer properly ref-counted. The 'scopeness' of an object must therefore be part of its type. The assignment d=c would have to be illegal.

If C is a scope class, then it is fine and the normal reference counting occurs when d=c; executes, n'est-ce pas ?

If C is not a scope class, then the declaration C d; says that d is not a scoped object, so that when d = c; executes, all that happens is that the reference count is not updated for this assignment and d going out of scope does nothing. In this latter case it is the responsibility of the end user, since he is scoping at the object level, to do the correct thing for whatever he wants.

IMO the 'scope' at the compiler level is part of the object, with the only difference being that objects of a 'scope' type are automatically 'scope' without the user of the object specifying it.

If you want the lesser benefit of 'scope' being only part of the type, then you take away from the end user the ability to create a 'scope' object of a type which is not 'scope'. This may be easier for you, the compiler writer, but it means that containers of objects, which may hold objects of 'scope' type but are not 'scope' type themselves, do not get the benefit of reference counted deterministic destruction.
 
 
 Please take a look at atomic operations, which I am sure you already 
 know about. I believe boost::shared_ptr<T> in its latest incarnation 
 is  using it to increment and decrement the reference count without 
 the usual time penalty which you mention.

There has been a lot of work done improving atomic operations. That's one reason for making it a library feature - the library stuff can be improved without having to change the compiler.

As long as it is transparent to the end user you should do it in whatever way you deem best.
Feb 02 2008
prev sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 The idea in my mind is essentially that 'scope' classes automatically 
 define an object that, when used exactly as a normal GC object, 
 automatically calls the destructor of the object just as soon as the 
 last reference to it goes out of scope. In this sense the user neither 
 knows nor cares whether the object encapsulates a resource or not and 
 uses an object of such a class just as he would use any other GC 
 object. Similarly the user can force an object to be a 'scope' object 
 through the second syntax given above, but from then on he treats the 
 object just as he would any other GC object.

The problem is:

    scope C c;
    C d;
    d = c;

and now d is no longer properly ref-counted. The 'scopeness' of an object must therefore be part of its type. The assignment d=c would have to be illegal.

My brain was not working correctly earlier, so let me correct myself.

In your above, if the c object is 'scope', whether it is because the C class is 'scope' or, as in your example, you specify 'scope' on the object ( which in current D is the same thing as saying that the C class is 'scope' ) then the assignment to another object makes that object 'scope' automatically. This is yet another reason why 'scope' at the compiler should be tracked at the object level, not at the class level. The canonical situation is:

    class C { ... }
    scope class D : C { ... }

    scope ( redundant IMO ) D d = new D(...);
    C c = d;

Clearly c, whose polymorphic type is a D, has to be 'scope'.
Feb 02 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 In your above, if the c object is 'scope', whether it is because the C 
 class is 'scope' or, as in your example, you specify 'scope' on the 
 object ( which in current D is the same thing as saying that the C class 
 is 'scope' ) then the assignment to another object makes that object 
 'scope' automatically. This is yet another reason why 'scope' at the 
 compiler should be tracked at the object level, not at the class level. 
 The canonical situation is:
 
 class C { ... }
 scope class D : C { ... }
 
 scope ( redundant IMO ) D  d = new D(...);
 C c = d;
 
 Clearly c, whose polymorphical type is a D, has to be 'scope'.

Let's look at it by analogy to 'const'. Implicitly converting a const D to its base class will produce a const C, not a C. A const C cannot be assigned to a C. I think it should work similarly with scope, and that like const, it should be part of the type system (a proxy struct would accomplish that). Making it a dynamic part of the object would exact a heavy cost for most objects which don't need it.
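
The const rule used in the analogy can be checked mechanically. The sketch below expresses it with C++ pointers to const, which is an assumption made for illustration (D's const differs in detail); C, D and const_is_kept are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

struct C {};
struct D : C {};

// Converting to the base class keeps the qualifier ...
static_assert(std::is_convertible<const D*, const C*>::value,
              "const D converts to const C");
// ... and the qualifier cannot be dropped implicitly, with or without the upcast.
static_assert(!std::is_convertible<const D*, C*>::value,
              "const D does not convert to plain C");
static_assert(!std::is_convertible<const C*, C*>::value,
              "const C does not convert to plain C");

bool const_is_kept() {
    return std::is_convertible<const D*, const C*>::value
        && !std::is_convertible<const D*, C*>::value;
}
```

All three checks are settled at compile time; nothing is stored in the object. A 'scope' qualifier carried in the type would be enforced the same way.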
Feb 02 2008
next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 In your above, if the c object is 'scope', whether it is because the C 
 class is 'scope' or, as in your example, you specify 'scope' on the 
 object ( which in current D is the same thing as saying that the C 
 class is 'scope' ) then the assignment to another object makes that 
 object 'scope' automatically. This is yet another reason why 'scope' 
 at the compiler should be tracked at the object level, not at the 
 class level. The canonical situation is:

 class C { ... }
 scope class D : C { ... }

 scope ( redundant IMO ) D  d = new D(...);
 C c = d;

 Clearly c, whose polymorphical type is a D, has to be 'scope'.

Let's look at it by analogy to 'const'. Implicitly converting a const D to its base class will produce a const C, not a C. A const C cannot be assigned to a C. I think it should work similarly with scope, and that like const, it should be part of the type system (a proxy struct would accomplish that). Making it a dynamic part of the object would exact a heavy cost for most objects which don't need it.

Your analogy to C++'s 'const' is a bad one.

The C++ 'const' refers to a quality of the object while the D 'scope' refers to a quality of the type. There is no equivalent of 'const' in C++ which refers to the type. Once we say that a type is 'scope' in D we should no longer have to say that an object of that type is 'scope'. An object of that type should be 'scope' automatically and the user of that object should not care or even need to know. In C++ the user of an object specifically says it is 'const' to set the quality of the object to something ( one can not change the object ). Your analogy is mixing apples and oranges. These are different things.

What I am saying is that an object whose type is 'scope' is treated magically by the compiler in that the compiler is now doing reference counting on it and calling its destructor when the last reference goes out of scope. Furthermore as that object gets reference assigned the reference count is manipulated and whatever object is specified in that reference assignment, as long as it is allowable by the compiler by the rules of D, takes part in the 'scope' magic. In a polymorphic language this means that you should associate 'scope' with the dynamic type of the object, not its static type, and how you decide to do that is up to you. Think of it as wrapping a boost::shared_ptr around the object and for every object to which you legally assign/copy it a boost::shared_ptr gets wrapped around that object.

I agree this adds some overhead, but so what. Using boost::shared_ptr also imposes overhead and a whole generation of programmers have somehow survived the extra x bytes per object in an age where physical memory is in the gigabytes and virtual memory in 64 bit systems in the quadrabytes.

My added suggestion is that when applying the 'scope' keyword to the object, and not the type, this essentially means that the compiler now treats that object as 'scope' even though the type is not 'scope'. I will call this object 'scope' injection. My suggestion for this is based solely on the practical consequences:

1) Allow the instantiator of a type to have 'scope' control over the object even when the designer of the type does not specify it as a 'scope' type. The user may know something about using the type at run-time that the class designer can not know, makes optional, or even disregards.

2) Following from the above, the most obvious practical cases occur when the type is created from a template class/struct, when the type is some built-in language container which can hold polymorphic objects of some base class type, and when the type embeds an object of 'scope' class type which may only be used in a corner case so that the designer of the type leaves scoping up to the user.

In other words the flexibility of control would be wonderful and, I believe, often necessary. Having both the class designer be able to 'scope' the type and the end user be able to 'scope' an object of any type ( which has a destructor ) is the ultimate ideal. If you wanted to go even further you could allow the end-user to 'unscope' an object of a 'scope' type when instantiating the object, even though the benefits of doing this seem to be practically negligible.

I view your choices as, from most desirable to least desirable:

1) Keep track of the objects themselves at run-time to see if they are 'scope' or not. This allows object 'scope' injection and reference assignment to objects whose static type is not 'scope' to make them 'scope'.

2) Keep track of the dynamic type of the objects themselves in order to see whether the dynamic type of the object is 'scope'. This does not allow object 'scope' injection, but does allow reference assignment to objects whose static type is not 'scope' since you are only considering the dynamic type of the object and not the static type to determine if the object should be 'scope'.

3) Keep track of only the static type of the object. This does not allow object 'scope' injection nor even reference assignment to objects whose static type is not 'scope'.

I view choice 3 as pretty poor and would really like to see choice 1 rather than choice 2 for practical reasons.

I hope this at least gives you food for further thought.
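
The "wrap a shared_ptr around it" picture above can be shown with a short C++ sketch. It uses std::shared_ptr, the standard-library descendant of boost::shared_ptr, and hypothetical classes C and D; note that both references are themselves counted handles here, which is the part the thread is disagreeing about.

```cpp
#include <cassert>
#include <memory>

struct C { virtual ~C() {} };
struct D : C {
    explicit D(int* counter) : destroyed(counter) {}
    ~D() override { ++*destroyed; }     // records that the destructor ran
    int* destroyed;
};

int base_handle_keeps_object_alive() {
    int destroyed = 0;
    std::shared_ptr<C> c;
    {
        std::shared_ptr<D> d = std::make_shared<D>(&destroyed);
        c = d;                          // the base-class handle joins the same count
        assert(c.use_count() == 2);
    }                                   // d is gone; the object lives on through c
    int while_c_alive = destroyed;      // still 0 at this point
    c.reset();                          // last reference dropped: ~D runs now
    return while_c_alive * 10 + destroyed;
}
```

The function returns 1: the destructor had not run while c was alive and ran exactly once when c let go. The count follows the object rather than the static type of the handle, at the price of every participating reference being a counted one.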
Feb 02 2008
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 Your analogy to C++'s 'const' is a bad one.
 
 The C++ 'const' refers to a quality of the object while the D 'scope' 
 refers to a quality of the type. There is no equivalent of 'const' in 
 C++ which refers to the type.

'const' in C++ is very much a characteristic of the type of the object. It pervades the semantics of the type. It's easy to envision a scheme of const where the const-ness is controlled by a bit in the runtime instantiation of the object, and any mutating operations would first check that bit. C++ avoids the overhead of that by having a static typing system, and such 'bits' become part of the compile-time information (i.e. the type) rather than the run-time information.
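
To illustrate the two ends of that spectrum, here is a C++ sketch of the runtime-bit scheme next to the static one. RuntimeConstInt and runtime_check_fires are hypothetical names, and the static case is shown only as a comment because it would not compile.

```cpp
#include <cassert>
#include <stdexcept>

// Runtime scheme: every object carries a flag and every mutation checks it.
class RuntimeConstInt {
public:
    RuntimeConstInt(int v, bool read_only) : value_(v), read_only_(read_only) {}
    void set(int v) {
        if (read_only_) throw std::logic_error("object is read-only");
        value_ = v;
    }
    int get() const { return value_; }
private:
    int value_;
    bool read_only_;    // the per-object "const bit"
};

bool runtime_check_fires() {
    RuntimeConstInt x(1, true);
    try {
        x.set(2);                       // detected only when this line executes
    } catch (const std::logic_error&) {
        return x.get() == 1;            // the value was left untouched
    }
    return false;
}

// Static scheme: the same mistake is rejected by the compiler and costs
// neither storage nor a branch at run time.
//   const int y = 1;
//   y = 2;   // compile error
```

The runtime version pays for an extra member and a test on every write, for every object, whether or not it is ever read-only.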
 Once we say that a type is 'scope' in D we 
 should no longer have to say that an object of that type is 'scope'. An 
 object of that type should be 'scope' automatically and the user of that 
 object should not care or even need to know. In C++ the user of an 
 object specifically says it is 'const' to set the quality of the object 
 to something ( one can not change the object ). Your analogy is mixing 
 apples and oranges. These are different things.

I don't believe they are different at all. Consider also languages that have no static types - the types are determined at runtime (JavaScript is an example). What a language chooses to specify about an object at compile-time vs run-time is a spectrum with various tradeoffs, not an apples-oranges with a sharp dividing line. Certainly, there is plenty of debate about static typing vs dynamic typing. D is a statically typed language primarily for performance reasons - a dynamically typed language can run 100x slower.
 What I am saying is that an object whose type is 'scope' is treated 
 magically by the compiler in that the compiler is now doing reference 
 counting on it and calling its destructor when the last reference goes 
 out of scope. Furthermore as that object gets reference assigned the 
 reference count is manipulated and whatever object is specified in that 
 reference assignment, as long as it is allowable by the compiler by the 
 rules of D, takes part in the 'scope' magic. In a polymorphic language 
 this means that you should associate 'scope' with the dynamic type of 
 the object, not its static type, and how you decide to do that is up to 
 you. Think of it as wrapping a boost::shared_ptr around the object and 
 for every object to which you legally assign/copy it a boost::shared_ptr 
 gets wrapped around that object.

Pulling on that string leads us to every object having scope semantics, because that machinery will have to exist and be checked at runtime for every object.
 I agree this adds some overhead, but so what.

And there lies the crux of our disagreement. My experience with memory allocation is that ref counting is appropriate for scarce resources, and gc is appropriate for abundant resources (i.e. memory). We both agree that gc is a poor choice for scarce resources, and I'm going to argue that rc is a poor choice for abundant resources.
 Using boost::shared_ptr 
 also imposes overhead and whole generation of programmers have somehow 
 survived the extra x bytes per object in an age where physical memory 
 is in the gigabytes and virtual memory in 64 bit systems in the 
 quadrabytes.

There is a large push to add gc to C++. (rc has disadvantages besides using more memory - the overhead to allocate two objects instead of one, and the overhead of doing the inc/dec/test. A further disadvantage is you cannot do array slicing with rc without adding substantially more overhead - memory and runtime.)
 My added suggestion is that when applying the 'scope' keyword to the 
 object, and not the type, this essentially means that the compiler now 
 treats that object as 'scope' even though the type is not 'scope'. I 
 will call this object 'scope' injection. My suggestion for this is based 
 solely on the practical consequences:
 
 1) Allow the instantiator of a type to have 'scope' control over the 
 object even when the designer of the type does not specify it as a 
 'scope' type. The user may know something about using the type at 
 run-time that the class designer can not know, makes optional, or even 
 disregards.

I agree with you that the scopeness of an object is best determined by the user, not the class designer. But this doesn't preclude making scope part of the type any more than the user adding 'const' precludes it.
 2) Following from the above, the most obvious practical cases occur when 
 the type is created from a template class/struct, when the type is some 
 built-in language container which can hold polymorphic objects of some 
 base class type, and when the type embeds an object of 'scope' class 
 type which may only be used in a corner case so that the designer of the 
 type leaves scoping up to the user.
 
 In other words the flexibility of control would be wonderful and, I 
 believe, often necessary. Having both the class designer be able to 
 'scope' the type and the end user be able to 'scope' an object of any 
 type ( which has a destructor ) is the ultimate ideal. If you wanted to 
 go even further you could allow the end-user to 'unscope' an object of a 
 'scope' type when instantiating the object, even though the benefits of 
 doing this seem to be practically negligible.
 
 I view your choices as, from most desirable to least desirable:
 
 1) Keep track of the objects themselves at run-time to see if they are 
 'scope' or not. This allows object 'scope' injection and reference 
 assignment to objects whose static type is not 'scope' to make them 
 'scope'.
 
 2) Keep track of the dynamic type of the objects themselves in order to 
 see whether the dynamic type of the object is 'scope'. This does not 
 allow object 'scope' injection, but does allow reference assignment to 
 objects whose static type is not 'scope' since you are only considering 
 the dynamic type of the object and not the static type to determine if 
 the object should be 'scope'.
 
 3) Keep track of only the static type of the object. This does not allow 
 object 'scope' injection nor even reference assignment to objects whose 
 static type is not 'scope'.

I believe that adding scope to the type allows for scope 'injection' as you defined it. But you're right in that (3) does not allow an object to be dynamically retyped as scope, though it could be 'wrapped' at runtime with a proxy struct that is itself statically scoped.
 I view choice 3 as pretty poor and would really like to see choice 1 
 rather than choice 2 for practical reasons.
 
 I hope this at least gives you food for further thought.

Feb 02 2008
Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 Your analogy to C++'s 'const' is a bad one.

 The C++ 'const' refers to a quality of the object while the D 'scope' 
 refers to a quality of the type. There is no equivalent of 'const' in 
 C++ which refers to the type.

'const' in C++ is very much a characteristic of the type of the object. It pervades the semantics of the type. It's easy to envision a scheme of const where the const-ness is controlled by a bit in the runtime instantiation of the object, and any mutating operations would first check that bit.

In your example justifying treating 'scope' as C++ treats 'const' the 'const' is attached to the object upon instantiation of it. My point is that 'scope' is attached to the type by the class designer. To me these are conceptually two different things. That is why I said that your example is a poor analogy. I should have made that clearer by my argument. We can argue the concept similarity/dissimilarity between 'const' and 'scope' all day without getting much of anywhere. I was simply arguing against your assertion, using 'const' as an example, of not allowing a 'scope' object to be assigned to an object that is not 'scope'. Clearly in the polymorphic world of D, where a base class may not be 'scope' while a derived class may be 'scope', such a treatment can not be right, since the basis of polymorphism is to assign to a base class object a derived class reference.
 
 C++ avoids the overhead of that by having a static typing system, and 
 such 'bits' become part of the compile-time information (i.e. the type) 
 rather than the run-time information.
 
 Once we say that a type is 'scope' in D we should no longer have to 
 say that an object of that type is 'scope'. An object of that type 
 should be 'scope' automatically and the user of that object should not 
 care or even need to know. In C++ the user of an object specifically 
 says it is 'const' to set the quality of the object to something ( one 
 can not change the object ). Your analogy is mixing apples and 
 oranges. These are different things.

I don't believe they are different at all. Consider also languages that have no static types - the types are determined at runtime (Javascript is an example). What a language chooses to specify about an object at compile-time vs run-time is a spectrum with various tradeoffs, not an apples-oranges with a sharp dividing line. Certainly, there is plenty of debate about static typing vs dynamic typing. D is a statically typed language primarily for performance reasons - a dynamically typed language can run 100x slower.

I am fully cognizant of dynamically typed languages since I program in Python also. I agree there is no fixed dividing line. But the difference between static typing and dynamic typing is well defined in a statically typed language like D. My argument was that for 'scope' to be really effective it needs to consider the dynamic type at run-time and not just the static type as it exists at compile time.
 
 
 What I am saying is that an object whose type is 'scope' is treated 
 magically by the compiler in that the compiler is now doing reference 
 counting on it and calling its destructor when the last reference goes 
 out of scope. Furthermore as that object gets reference assigned the 
 reference count is manipulated and whatever object is specified in 
 that reference assignment, as long as it is allowable by the compiler 
 by the rules of D, takes part in the 'scope' magic. In a polymorphic 
 language this means that you should associate 'scope' with the dynamic 
 type of the object, not its static type, and how you decide to do that 
 is up to you. Think of it as wrapping a boost::shared_ptr around the 
 object and for every object to which you legally assign/copy it a 
 boost::shared_ptr gets wrapped around that object.

Pulling on that string leads us to every object having scope semantics, because that machinery will have to exist and be checked at runtime for every object.

When the scope changes in D you need to make sure that any 'scope' object is treated appropriately. But I do not see why you think that every object must therefore have scope semantics. Inserting compile time code when a scope changes to treat 'scope' objects in a special way does not mean to me that every object must have scope semantics. Perhaps you mean that the execution slows down a GC system too much to do that for every object. If that is the case then I agree that RAII can not be done in a GC language in the terms in which I have defined it, although it probably can be done in lesser terms, as 'scope' currently exists in D.
 
 I agree this adds some overhead, but so what.

And there lies the crux of our disagreement. My experience with memory allocation is that ref counting is appropriate for scarce resources, and gc is appropriate for abundant resources (i.e. memory). We both agree that gc is a poor choice for scarce resources, and I'm going to argue that rc is a poor choice for abundant resources.

You have already won that argument as I fully agree to what you say above. But I have no idea why you say that it is the crux of our disagreement. Care to elaborate ?
 
 
 Using boost::shared_ptr also imposes overhead, and a whole generation of 
 programmers has somehow survived the extra x bytes per object in an 
 age where physical memory is in the gigabytes and virtual memory on 
 64-bit systems in the exabytes.

There is a large push to add gc to C++. (rc has disadvantages besides using more memory - the overhead to allocate two objects instead of one, and the overhead of doing the inc/dec/test. A further disadvantage is you cannot do array slicing with rc without adding substantially more overhead - memory and runtime.)
 My added suggestion is that when applying the 'scope' keyword to the 
 object, and not the type, this essentially means that the compiler now 
 treats that object as 'scope' even though the type is not 'scope'. I 
 will call this object 'scope' injection. My suggestion for this is 
 based solely on the practical consequences:

 1) Allow the instantiator of a type to have 'scope' control over the 
 object even when the designer of the type does not specify it as a 
 'scope' type. The user may know something about using the type at 
 run-time that the class designer can not know, makes optional, or even 
 disregards.

I agree with you that the scopeness of an object is best determined by the user, not the class designer. But this doesn't preclude making scope part of the type any more than the user adding 'const' precludes it.

No, I do not think that the scopeness of an object is best determined by the user and not the class designer. In fact I feel very strongly the opposite. The class designer knows whether his class has RAII or not, and in the vast majority of cases the end user should not know or care. My argument for scope injection is based purely on the practical consideration that there are types which can not possibly know whether they are to be used with RAII or not. Template classes/structs which are containers are the most obvious, as well as built-in arrays. That is why, besides the ability for the class designer to specify 'scope', the end user should be able to do it at object creation time also.
 
 
 2) Following from the above, the most obvious practical cases occur 
 when the type is created from a template class/struct, when the type 
 is some built-in language container which can hold polymorphic objects 
 of some base class type, and when the type embeds an object of 'scope' 
 class type which may only be used in a corner case so that the 
 designer of the type leaves scoping up to the user.

 In other words the flexibility of control would be wonderful and, I 
 believe, often necessary. Having both the class designer be able to 
 'scope' the type and the end user be able to 'scope' an object of any 
 type ( which has a destructor ) is the ultimate ideal. If you wanted 
 to go even further you could allow the end-user to 'unscope' an object 
 of a 'scope' type when instantiating the object, even though the 
 benefits of doing this seem to be practically negligible.

 I view your choices as, from most desirable to least desirable:

 1) Keep track of the objects themselves at run-time to see if they 
 are 'scope' or not. This allows object 'scope' injection and reference 
 assignment to objects whose static type is not 'scope' to make them 
 'scope'.

 2) Keep track of the dynamic type of the objects themselves in order 
 to see whether the dynamic type of the object is 'scope'. This does 
 not allow object 'scope' injection, but does allow reference 
 assignment to objects whose static type is not 'scope' since you are 
 only considering the dynamic type of the object and not the static 
 type to determine if the object should be 'scope'.

 3) Keep track of only the static type of the object. This does not 
 allow object 'scope' injection nor even reference assignment to 
 objects whose static type is not 'scope'.

I believe that adding scope to the type allows for scope 'injection' as you defined it. But you're right in that (3) does not allow an object to be dynamically retyped as scope, though it could be 'wrapped' at runtime with a proxy struct that is itself statically scoped.

Whether one does 'scope' injection using the 'scope' keyword on the object when it is declared or by using the equivalent of a boost::shared_ptr construct in D is of little practical matter to me. This is purely syntax, so if D could silently translate 'scope' to such a boost::shared_ptr construct that would be better IMO, because it would unite such a treatment under the same concept with the 'scope' keyword as it applies to a class.

The crux of my argument against 3) above is simply that the end user will not and should not be expected to know that an object is of a 'scope' type.

    // In a module
    class C { ... }
    scope class D : C { ... }

    // In the end user's code
    C d = new D(...);

Under 3) the d object is not 'scope', because its static type is not 'scope' even though its dynamic type is 'scope'. This can not be right IMO. Requiring the user to have knowledge that D is a 'scope' class negates a great deal of the transparency of having RAII in a GC language.

I can understand your feeling that the above should be:

    class C { ... }
    scope class D : C { ... }

    // In the end user's code
    scope C d = new D(...); // End user is required here to specify scope

This may make things much easier for the compiler, but it requires the end user's knowledge of 'scope', which has been specified at the class level, to be applied at the syntax level. Intuitively I feel the compiler can figure this out, and that 'scope' should largely be totally transparent to the end user at the syntax level.

I do agree that the end user should "know" that a class is 'scope' (RAII) by reading the documentation of that class. This is useful for scope injection for container objects and for the end user designing his own class as 'scope' when an object of a 'scope' class is a data member.
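The static/dynamic distinction in the example above can be seen in a few lines of present-day D. This is only a sketch: the 2008 `scope class` syntax is left out (the comment marks where it would go), and the class bodies are empty placeholders.

```d
class C {}
class D : C {}   // in the example above, this is the class declared 'scope'

void main()
{
    C d = new D();

    // Static type: what the compiler sees for the variable d.
    static assert(is(typeof(d) == C));

    // Dynamic type: what the object actually is, known only at run time.
    assert(typeid(d) == typeid(D));
    assert(cast(D) d !is null);
}
```

Under choice 3) only the first fact is available to the compiler, which is why `d` would not be treated as 'scope' there.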
Feb 03 2008
Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-03 08:20:32 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that. Besides, the overhead of actually checking the type of the class will be approximately the same as doing the reference counting. Given this, it's much better to always just do the reference counting than to check dynamically whether it's needed.
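To make the comparison concrete, here is a rough sketch of the three ways a reference copy could be compiled. Everything in it is hypothetical: `Obj`, its `isScope` flag and `refs` counter, and the three `copy*` helpers are invented for illustration and are not part of D's object layout or of any proposal in this thread. It shows only the extra work on each path and makes no claim about actual timings.

```d
// Hypothetical object header; neither field exists in D's real classes.
class Obj
{
    bool isScope;  // assumption: per-object "this is a scope object" flag
    int  refs;     // assumption: intrusive reference count
}

// (a) Plain GC reference copy: just copy the pointer.
void copyGC(ref Obj dst, Obj src)
{
    dst = src;
}

// (b) Dynamic check on every copy, then count only if needed.
void copyChecked(ref Obj dst, Obj src)
{
    if (src !is null && src.isScope) ++src.refs;
    if (dst !is null && dst.isScope) --dst.refs; // destructor call at zero omitted
    dst = src;
}

// (c) Always count, with no check of "scopeness".
void copyCounted(ref Obj dst, Obj src)
{
    if (src !is null) ++src.refs;
    if (dst !is null) --dst.refs;                // destructor call at zero omitted
    dst = src;
}

void main()
{
    auto plain = new Obj;          // isScope == false
    auto counted = new Obj;
    counted.isScope = true;

    Obj r;
    copyGC(r, plain);
    assert(plain.refs == 0);

    copyChecked(r, plain);         // not scope: count untouched
    assert(plain.refs == 0);

    copyChecked(r, counted);       // scope: count goes up
    assert(counted.refs == 1);

    copyCounted(r, plain);         // always counts, on both sides
    assert(plain.refs == 1 && counted.refs == 0);
}
```

Path (b) carries a test and a conditional update on both sides of every copy, while path (c) drops the test and pays only for the update, which is the trade-off described above.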
 class C { ... }
 scope class D : C { ... }
 
 [...]
 
 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

Well, if the compiler is to be able to distinguish scope at compile time, then it needs a scope flag (either explicit or implicit) on each variable. This is exactly what Walter has proposed to do. He prefers the explicit route because going implicit would fail in too many cases. For instance, let's have a function that returns a C:

    C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

Now let's call the function:

    C c = makeOne();

How can you know at compile time if the returned object of that function call is scoped or not? You can't, and therefore the compiler would need to add code to check if the returned object is scope or not, with a significant overhead, each time you assign a C. If however you make scope known at compile time:

    scope C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

    scope C c = makeOne();

Now the compiler knows it must generate reference counting code for the following assignment, and any subsequent assignment of this type, and it won't have to generate code to dynamically check the "scopeness" everywhere you use a C. If makeOne returns a C, it'll simply be scope too, which is more overhead than having a garbage-collected C, but, as I said earlier, not necessarily more than checking dynamically whether it should be reference counted.

Perhaps Walter can confirm that the above code makes sense given what he intends to do, but I believe it does.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Feb 03 2008
Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program 
 in Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' 
 to be really effective it needs to consider the dynamic type at 
 run-time and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the compiler statically checking the type of an object or dynamically checking the type of an object, and the run-time costs involved. Yet clearly D already has to implement code when scopes come to an end in order to destroy stack-based objects, since structs ( user-defined value types ) are already supported and can have destructors. So the added overhead goes from having to identify structs which must have their destructor called at the end of each scope to having to also identify 'scope' objects which must have their reference count decremented at the end of each scope and have their destructor called if the reference count reaches 0. The only difference I see, aside from the run-time overhead, is that a greater set of objects must be identified.
 
 Besides, the overhead of actually checking the type of the class will be 
 approximately the same as doing the reference counting. Given this, 
 it's much better to always just do the reference counting than checking 
 dynamically if it's needed.
 
 
 class C { ... }
 scope class D : C { ... }

 [...]

 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

Well, if the compiler is to be able to distinguish scope at compile time, then it needs a scope flag (either explicit or implicit) on each variable. This is exactly what Walter has proposed to do. He prefers the explicit route because going implicit isn't going to work in too many cases. For instance, let's have a function that returns a C:

    C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

Now let's call the function:

    C c = makeOne();

How can you know at compile time if the returned object of that function call is scoped or not? You can't, and therefore the compiler would need to add code to check if the returned object is scope or not, with a significant overhead, each time you assign a C. If however you make scope known at compile time:

    scope C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

    scope C c = makeOne();

Now the compiler knows it must generate reference counting code for the following assignment, and any subsequent assignment of this type, and it won't have to generate code to dynamically check the "scopeness" everywhere you use a C.

Would you agree that all you are doing here is specifically telling the compiler that an object is 'scope' when it is created rather than having the compiler figure it out for itself by querying the dynamic type of the object at creation time ? If you do, then a much simpler, and to the point, example would be based on my initial OP:

    scope class C { ... }
    scope C c = new C(...);

I specified that the scope keyword for creating the object is redundant. The compiler can figure it out. The major difference in opinion is that I think the compiler should figure it out from the dynamic type of the object at run-time and not from the static type of the object. If Walter decides that creating code which at run-time determines the dynamic type of an object in order to implement RAII in D is too much overhead, I will understand. But I do not think it will be a solution for RAII in GC in my own understanding of what this should entail.
Feb 03 2008
Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-03 10:42:03 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the compiler statically checking the type of an object or dynamically checking the type of an object, and the run-time costs involved. Yet clearly D already has to implement code when scopes come to an end in order to destroy stack-based objects, since structs ( user-defined value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an explicit call to the destructor at the end of the scope. The scope object cannot exist outside the scope, and thus no reference counting is needed in the way it's implemented currently.
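As a small illustration of that "destructor call at the end of the scope" behaviour, here is a sketch using a struct in present-day D. The `Resource` type and the `destroyed` counter are invented for the example; the point is only that the cleanup is placed statically by the compiler and involves no reference count.

```d
int destroyed = 0;   // counts destructor runs (hypothetical bookkeeping)

struct Resource
{
    int id;
    ~this() { ++destroyed; }   // stands in for releasing a scarce resource
}

void main()
{
    {
        auto r = Resource(1);
        assert(destroyed == 0);   // still alive inside the block
    }   // the compiler inserts the call to r's destructor here
    assert(destroyed == 1);
}
```

Because `r` cannot outlive the block, the compiler knows exactly where to put the call, which is why no counting is needed in this form.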
 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to also 
 identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know exactly the type of the struct at that time. Classes are polymorphic, so it'd be a costly runtime check to know that, and that check is almost as costly as doing the reference counting itself. Given that, you should probably not bother at runtime and decide at compile time to just treat any class which has the potential to be a scope class as if it were one and actually do the reference counting.
 
 
 
 Besides, the overhead of actually checking the type of the class will be 
 approximately the same as doing the reference counting. Given this, 
 it's much better to always just do the reference counting than checking 
 dynamically if it's needed.
 
 
 class C { ... }
 scope class D : C { ... }
 
 [...]
 
 This may make things much easier for the compiler, but it requires the 
 end user knowledge of 'scope', which has been specified at the class 
 level, to be applied at the syntax level. Intuitively I feel the 
 compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

Well, if the compiler is to be able to distinguish scope at compile time, then it needs a scope flag (either explicit or implicit) on each variable. This is exactly what Walter has proposed to do. He prefers the explicit route because going implicit isn't going to work in too many cases. For instance, let's have a function that returns a C:

    C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

Now let's call the function:

    C c = makeOne();

How can you know at compile time if the returned object of that function call is scoped or not? You can't, and therefore the compiler would need to add code to check if the returned object is scope or not, with a significant overhead, each time you assign a C. If however you make scope known at compile time:

    scope C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

    scope C c = makeOne();

Now the compiler knows it must generate reference counting code for the following assignment, and any subsequent assignment of this type, and it won't have to generate code to dynamically check the "scopeness" everywhere you use a C.

Would you agree that all you are doing here is specifically telling the compiler that an object is 'scope' when it is created rather than having the compiler figure it out for itself by querying the dynamic type of the object at creation time ?

The compiler doesn't know what happens within every function call. So it can only check at runtime whether the function returned a C or a D.
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:
 
 scope class C { ... }
 
 scope C c = new C(...);
 
 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

You're perfectly right: it is redundant in *this* case, and you could have the compiler implicitly understand that C is a scope class in *this* case. But consider this example:

    Object o;
    if (/* random value */)
        o = new C; // C is a scope class
    else
        o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* later create a C object and assign it to o, or should line 3 give an error since the type Object isn't scope and C must only be assigned as scope? I'd say it should be an error. This however could be made legal without too much difficulty:

    scope Object o;
    if (/* random value */)
        o = new C; // C is a scope class
    else
        o = new Object; // Object is the base class of C but isn't scope

Basically, you're declaring a scope Object. While Object isn't necessarily a scope class, you are telling the compiler to treat it as scope, and thus an instance of C, which must be scope, *can* be put in this variable. If o wasn't scope, it'd be an error to put an instance of a scope class in it.

But there are still many holes in this scheme in which scope now means reference-counted. Take this example:

    class A {
        void doSomething() { globalReferences ~= this; }
    }
    scope class B : A { }

    A[] globalReferences;

    scope B b = new B; // Scope could be made implicit here, but it's irrelevant to my example
    b.doSomething();

This last statement would call A.doSomething, which would put a non-scoped reference into globalReferences, which would fail to retain the object. There are two ways around that: ignore the problem and let the programmer handle these cases (basically, that is what boost::shared_ptr would do in such a situation), or introduce a new keyword to decorate parameters for functions that do not keep any reference beyond their own call, so that you don't need to duplicate all your functions for a scope and non-scope parameter (much like const is the middle ground between mutable and invariant).
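The hole described above can be reproduced today with a hand-written counting handle. The sketch below is an assumption-laden stand-in: `Counted` is a hypothetical wrapper playing the role of the proposed reference-counted 'scope' (it is not a library type), and the `destroyed` flag stands in for actually running the destructor so the example stays memory-safe.

```d
class A
{
    bool destroyed;   // stand-in for "destructor has run"
    void doSomething() { globalReferences ~= this; }   // raw reference escapes
}

A[] globalReferences;

// Hypothetical counting handle, playing the role of a reference-counted 'scope'.
struct Counted
{
    A obj;
    int* count;

    this(A o) { obj = o; count = new int; *count = 1; }
    this(this) { if (count) ++*count; }                // copying the handle adds a reference
    ~this()
    {
        if (count && --*count == 0)
            obj.destroyed = true;                      // last counted reference gone
    }
}

void main()
{
    {
        auto b = Counted(new A);
        b.obj.doSomething();   // stores an uncounted reference in the global array
    }   // count drops to zero here and the object is "destroyed"

    assert(globalReferences.length == 1);
    assert(globalReferences[0].destroyed);   // the global still refers to a dead object
}
```

The count never sees the reference taken through `this`, so the object is finalized while the global array still points at it. That is the case the first option leaves to the programmer and the second option would rule out with a parameter annotation.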
(Sidenote: this keyword could be useful to implement something like "unique" as it was discussed in another thread, as it'd allow functions to be called with a unique parameter and guarantee that no external references are kept after the call, thus preserving uniqueness.)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Feb 04 2008
Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I program 
 in Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined 
 in a statically typed language like D. My argument was that for 
 'scope' to be really effective it needs to consider the dynamic type 
 at run-time and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the compiler statically checking the type of an object or dynamically checking the type of an object, and the run-time costs involved. Yet clearly D already has to implement code when scopes come to an end in order to destroy stack-based objects, since structs ( user-defined value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an explicit call to the destructor at the end of the scope. The scope object cannot exist outside the scope, and thus no reference counting is needed in the way it's implemented currently.

The reference counting would be implemented for 'scope' objects only. The main overhead at the end of each scope is going through all the objects to determine which ones are 'scope' objects. Perhaps this is too expensive, but it would at least be interesting to see whether it is or not.
 
 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to 
 also identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know exactly the type of the struct at that time. Classes are polymorphic, so it'd be a costly runtime check to know that, and that check is almost as costly as doing the reference counting itself. Given that, you should probably not bother at runtime and decide at compile time to just treat any class which has the potential to be a scope class as if it were one and actually do the reference counting.

Your point is well taken, but I still would like to see if the check for a 'scope' object would be that expensive. It could be as easy as checking an extra 'int' for reference counting for each object and seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).
 Besides, the overhead of actually checking the type of the class will 
 be approximately the same as doing the reference counting. Given 
 this, it's much better to always just do the reference counting than 
 checking dynamically if it's needed.


 class C { ... }
 scope class D : C { ... }

 [...]

 This may make things much easier for the compiler, but it requires 
 the end user knowledge of 'scope', which has been specified at the 
 class level, to be applied at the syntax level. Intuitively I feel 
 the compiler can figure this out, and that 'scope' should largely be 
 totally transparent to the end user above at the syntax level.

Well, if the compiler is to be able to distinguish scope at compile time, then it needs a scope flag (either explicit or implicit) on each variable. This is exactly what Walter has proposed to do. He prefers the explicit route because going implicit isn't going to work in too many cases. For instance, let's have a function that returns a C:

    C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

Now let's call the function:

    C c = makeOne();

How can you know at compile time if the returned object of that function call is scoped or not? You can't, and therefore the compiler would need to add code to check if the returned object is scope or not, with a significant overhead, each time you assign a C. If however you make scope known at compile time:

    scope C makeOne() {
        if (/* random stuff here */)
            return new C;
        else
            return new D;
    }

    scope C c = makeOne();

Now the compiler knows it must generate reference counting code for the following assignment, and any subsequent assignment of this type, and it won't have to generate code to dynamically check the "scopeness" everywhere you use a C.

Would you agree that all you are doing here is specifically telling the compiler that an object is 'scope' when it is created rather than having the compiler figure it out for itself by querying the dynamic type of the object at creation time ?

The compiler isn't knowledgeable of what happens within every function call. So it can only check at runtime if the function returned a C or a D.

Fully agreed.
 
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:

 scope class C { ... }

 scope C c = new C(...);

 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

You're perfectly right: it is redundant in *this* case, and you could have the compiler implicitly understand that C is a scope class in *this* case. But consider this example:

    Object o;
    if (/* random value */)
        o = new C; // C is a scope class
    else
        o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* later create a C object and assign it to o, or should line 3 give an error since the type Object isn't scope and C must only be assigned as scope? I'd say it should be an error.

I say it should be a 'scope' object. The dynamic type of o is that of a 'scope' class.
 
 This however could be made legal without too much difficulty:
 
     scope Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Basically, you're declaring a scope Object. While Object isn't 
 necessarily a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put in 
 this variable. If o wasn't scope, it'd be an error to put an instance of 
 a scope class in it.

But then the end-user is required to know that the C is a scope class. I do not think that should be necessary. The whole point of 'scope' ( RAII ) in GC is that, for the most part, an end-user should instantiate and use 'scope' classes just as he would normal GC classes, with the language taking care to automatically destruct an object of a 'scope' class just as soon as the last reference to that object goes out of scope.
 
 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }
 
     A[] globalReferences;
 
     scope B b = new B; // Scope could be made implicit here, but it's irrelevant to my example
     b.doSomething();
 
 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain the 
 object. There are two ways around that: ignore the problem and let the 
 programmer handle these cases (basically, that is what boost::shared_ptr 
 would do in such a situation), or introduce a new keyword to decorate 
 parameters for functions that do not keep any reference beyond their 
 own call so that you don't need to duplicate all your functions for a 
 scope and non-scope parameter (much like const is the middle ground 
 between mutable and invariant).

No, A.doSomething would put a 'scoped' reference in a non-scope array. However if we specify 'scope A[] globalReferences;' we can solve that problem. Of course we may not control the declaration of 'A[] globalReferences;'. I acknowledge that.
Feb 04 2008
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-04 22:43:21 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 I am fully cognizant of a dynamically typed language since I program in 
 Python also. I agree there is no fixed dividing line. But the 
 difference between static typing and dynamic typing is well defined in 
 a statically typed language like D. My argument was that for 'scope' to 
 be really effective it needs to consider the dynamic type at run-time 
 and not just the static type as it exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the compiler statically checking the type of an object or dynamically checking the type of an object, and the run-time costs involved. Yet clearly D already has to implement code when scopes come to an end in order to destroy stack-based objects, since structs ( user-defined value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an explicit call to the destructor at the end of the scope. The scope object cannot exist outside the scope, and thus no reference counting is needed in the way it's implemented currently.

Reference counting would be implemented only for 'scope' objects. The main overhead at the end of each scope is going through all the objects to determine which are 'scope' objects. Perhaps this is too expensive, but it would at least be interesting to see if it is or not.
 
 So the added overhead goes from having to identify structs which must 
 have their destructor called at the end of each scope to having to also 
 identify 'scope' objects which must have their reference count 
 decremented at the end of each scope and have their destructor called 
 if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know exactly the type of the struct at that time. Classes are polymorphic, so it'd be a costly runtime check to know that, and that check is almost as costly as doing the reference counting itself. Given that, you should probably not bother at runtime and decide at compile time to just treat any class which has the potential to be a scope class as if it were one and actually do the reference counting.

Your point is well taken, but I still would like to see if the check for a 'scope' object would be that expensive. It could be as easy as checking an extra 'int' for reference counting for each object and seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).

Basically, you need to:

1. Load the object's pointer in a register
2. Load the "scope" flag from memory by offsetting the object's pointer
3. Branch depending on that flag:
   a. if not scope, go to 4.
   b. if scope, do whatever is needed to increment the reference count atomically, then go to 4
4. Write the pointer to its new location.

That's a lot of extra work you'd have to do at every copy of an object's pointer to perform that check. That branch operation could become very expensive if the processor can't predict it right, and loading from an additional, possibly far away, memory block could mean missing the memory cache more often too. Steps 1 and 4 are all you need if you don't care about scope.
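
The four steps can be sketched in D roughly as follows. This is only an illustration under stated assumptions: the `Obj` class, its `isScope` flag and its `refCount` field are hypothetical (D's `Object` has no such members), and `copyRef` is a made-up helper standing in for what the compiler would emit on every reference copy.

```d
import core.atomic : atomicLoad, atomicOp;
import std.stdio : writeln;

// Hypothetical layout: D's Object carries neither of these fields.
class Obj {
    bool isScope;         // the flag read in step 2
    shared int refCount;  // only maintained when isScope is true
}

// Copy a reference following steps 1-4 (step 1, loading the pointer,
// is the parameter passing itself).
void copyRef(ref Obj dst, Obj src) {
    // Steps 2 and 3: load the flag through the pointer, then branch.
    // The null test is an extra guard that is not in the list.
    if (src !is null && src.isScope)
        atomicOp!"+="(src.refCount, 1); // step 3.b: atomic increment
    dst = src;                          // step 4: store the pointer
}

void main() {
    auto plain = new Obj;       // ordinary GC object
    auto counted = new Obj;     // object flagged scope at runtime
    counted.isScope = true;

    Obj a, b;
    copyRef(a, plain);
    copyRef(b, counted);

    // Print both counts: only the flagged object was touched.
    writeln(atomicLoad(plain.refCount), " ", atomicLoad(counted.refCount));
}
```

Without the flag, `copyRef` would shrink to the single store `dst = src;`, which is the "1 and 4 only" case.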
 The compiler isn't knowledgeable of what happens within every 
 function call. So it can only check at runtime if the function returned 
 a C or a D.

Fully agreed.
 
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:
 
 scope class C { ... }
 
 scope C c = new C(...);
 
 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type of 
 the object.

You're perfectly right: it is redundant in *this* case, and you could have the compiler implicitly understand that C is a scope class in *this* case. But consider this example:

    Object o;
    if (/* random value */)
        o = new C; // C is a scope class
    else
        o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* later create a C object and assign it to o, or should line 3 give an error since the type Object isn't scope and C must only be assigned as scope? I'd say it should be an error.

I say it should be a 'scope' object. The dynamic type of o is that of a 'scope' class.

Hum, dynamic scope typing again? If you had that it'd work, sure, but since we surely won't have that this isn't an option.
 This however could be made legal without too much difficulty:
 
     scope Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope
 
 Basically, you're declaring a scope Object. While Object isn't 
 necessarily a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put in 
 this variable. If o wasn't scope, it'd be an error to put an instance 
 of a scope class in it.

But then the end-user is required to know that the C is a scope class. I do not think that should be necessary.

Perhaps not, I don't have a strong opinion on that. But I firmly believe scope should be enforced statically, not dynamically, and that's what I'm arguing for.
 The whole point of 'scope' ( RAII ) in GC is that, for the most part, 
 an end-user should instantiate and use 'scope' classes just as he would 
 normal GC classes, with the language taking care to automatically 
 destruct an object of a 'scope' class just as soon as the last 
 reference to that object goes out of scope.

Well, perhaps there's a solution that would do what you want while still keeping it compile-time only. It's some sort of compromise. Take these three classes:

    class A {}
    scope class B : A {}
    scope class C : B {}

B and C are scope, A isn't. Now, what if writing "B" was equivalent to writing "scope B" (since B is scope) and "C" was equivalent to writing "scope C"? Obviously, writing "A" wouldn't be equivalent to "scope A" (because A is not scope). Then you could have:

    A a1 = new A;
    A a2 = new B; // illegal: B is scope, cannot be assigned to non-scope A
    scope A a3 = new B; // legal: B is scope and scope A is (explicitly) scope

    B b1 = new B;
    B b2 = new C; // legal: C is scope and B is (implicitly) scope
    scope B b3 = new C; // same as above

That would mean that you'd only have to explicitly write scope if you're using the non-scope base class as a type to hold a reference to your scope object.
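
The assignment rules above can be checked entirely at compile time. Here is a rough model of them in present-day D, offered only as a sketch: the proposed `scope class` syntax is replaced by a hypothetical `@ScopeClass` attribute, and `varIsScope` and `canHold` are invented names, not anything the compiler provides.

```d
import std.stdio : writeln;
import std.traits : hasUDA;

// Hypothetical marker standing in for the proposed 'scope class'.
struct ScopeClass {}

class A {}
@ScopeClass class B : A {}
@ScopeClass class C : B {}

// A variable of static type Var counts as scope if its class is marked,
// or if the declaration says 'scope' explicitly.
enum bool varIsScope(Var, bool explicitScope) =
    explicitScope || hasUDA!(Var, ScopeClass);

// An object of class Obj may be stored in the variable when the types
// convert and a scope class never lands in a non-scope variable.
enum bool canHold(Var, bool explicitScope, Obj) =
    is(Obj : Var) &&
    (!hasUDA!(Obj, ScopeClass) || varIsScope!(Var, explicitScope));

static assert( canHold!(A, false, A)); // A a1 = new A;
static assert(!canHold!(A, false, B)); // A a2 = new B;        illegal
static assert( canHold!(A, true,  B)); // scope A a3 = new B;  legal
static assert( canHold!(B, false, B)); // B b1 = new B;
static assert( canHold!(B, false, C)); // B b2 = new C;        legal
static assert( canHold!(B, true,  C)); // scope B b3 = new C;  same as above

void main() {
    writeln("all six declarations were classified at compile time");
}
```

Every decision is made from the static types alone, so no flag has to be loaded or tested when the program runs.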
 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }
 
     A[] globalReferences;
 
     scope B b = new B; // Scope could be made implicit here, but it's irrelevant to my example
     b.doSomething();
 
 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain 
 the object. There are two ways around that: ignore the problem and let 
 the programmer handle these cases (basically, that is what 
 boost::shared_ptr would do in such a situation), or introduce a new 
 keyword to decorate parameters for functions that do not keep any 
 reference beyond their own call so that you don't need to duplicate 
 all your functions for a scope and non-scope parameter (much like const 
 is the middle ground between mutable and invariant).

No, A.doSomething would put a 'scoped' reference in a non-scope array. However if we specify 'scope A[] globalReferences;' we can solve that problem.

Sure, you're solving the problem nicely. But how does the compiler find out there's a problem in the first place? It needs to know that the this parameter is scope, and thus the member function should be decorated scope (just like you'd do with invariant). So you'd need to duplicate every member function so that it can be used either as scope or non-scope, and that's not very interesting unless you can declare that the function does not need to know if the parameter is typed scope or not (just like const means you don't know if it's invariant or mutable).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Feb 04 2008
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-05 04:09:31 -0500, "Janice Caron" <caron800 googlemail.com> said:

 On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:
 
 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offseting the object's pointer
 3. Branch depending on that flag:
 a. if not scope, go to 4.
 b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.

Not quite.

Well, the algorithm above checks for the presence of a scope flag on the object to only reference-count objects which are flagged scope at runtime. And I didn't bother to elaborate the code in case 3.b., doing the actual reference counting, because it's not that important. But now I realise I forgot to check for the scope flag on the second object and failed to check for null pointers. Basically, what I should have written is this (now in D with a special atomic statement):

    A a;
    B b;

    if (a && a.isScope) {
        debug if (a.refCount == 0) throw new RefCountException();
        atomic { --a.refCount; }
        if (a.refCount == 0) delete a;
    }
    if (b && b.isScope) {
        atomic { ++b.refCount; }
    }
    a = b;

As you can see, there are four conditions that *must* be evaluated whether or not the object is scope at runtime (a, a.isScope, b, b.isScope), two of them requiring dereferencing the object and loading some of its memory (for each object's scope flag). This is the real drawback in Edward's proposal, as the rest of the code wouldn't be executed if the isScope flag is set to false. Since branching is often an expensive operation on processors because of the instruction pipeline, doing the assignment with a runtime isScope flag would be much slower.

Janice, your code demonstrates how to do reference counting, but I don't see a scope flag anywhere.
 That's the situation currently. Assignment of classes is very fast.
 /But/, this is what it would change to under your scheme:
 
     A* pa;
     B* pb;
     if (pa->refCount)
     {
         if (pb->refCount)
         {

Shouldn't you check for null too? That'd be:

    if (pa && pa->refCount)
    {
        if (pb && pb->refCount)
        {
             atomic { if (--pb->refCount == 0) delete pb; }
             atomic { ++pa->refCount; }
             pa = pb;
         }
         else
         {
             throw new RefCountException();
         }
     }
     else
     {
         if (pb->refCount)
         {

And again:

        if (pb && pb->refCount)
        {
              throw new RefCountException();
         }
         else
         {
             pa = pb;
         }
     }
 
 And that's just for /ordinary/ assignment. That looks like a
 phenomenal overhead to me.
 
 Now just /imagine/ how complicated it gets
 if you've overloaded opAssign in various complicated ways. (e.g
 structs assigned from classes, classes assigned from structs, etc.)
 
 I think I'd rather not have that overhead added to every single class
 assignment.

Well, that's the idea of a reference-counted object: you add this overhead to make sure the object is destroyed as soon as it can be. It can be useful in many cases, when a class holds a scarce resource for instance. But you're completely right to not want this overhead for regular objects, and that's why an object being reference-counted must be a compile-time property, not decided by a runtime-evaluatable flag.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Feb 05 2008
prev sibling next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Michel Fortin wrote:
 On 2008-02-04 22:43:21 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:
 
 Michel Fortin wrote:
 On 2008-02-03 10:42:03 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 On 2008-02-03 08:20:32 -0500, Edward Diener 
 <eddielee_no_spam_here tropicsoft.com> said:

 I am fully cognizant of a dynamically typed language since I 
 program in Python also. I agree there is no fixed dividing line. 
 But the difference between static typing and dynamic typing is 
 well defined in a statically typed language like D. My argument 
 was that for 'scope' to be really effective it needs to consider 
 the dynamic type at run-time and not just the static type as it 
 exists at compile time.

Considering the dynamic type at runtime means you need to check whether you're dealing with a reference-counted object each time you copy a reference to that object, to see if the reference count needs adjusting. This is significant overhead over the "just copy the pointer" thing you can do in a GC. Basically, just checking this will increase by two or three times the time it takes to copy an object reference... I can see why Walter doesn't want that.

I am not knowledgeable about the actual low-level difference between the compiler statically checking the type of an object or dynamically checking the type of an object, and the run-time costs involved. Yet clearly D already has to implement code when scopes come to an end in order to destroy stack-based objects, since structs ( user-defined value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an explicit call to the destructor at the end of the scope. The scope object cannot exist outside the scope, and thus no reference counting is needed in the way it's implemented currently.

Reference counting would be implemented only for 'scope' objects. The main overhead at the end of each scope is going through all the objects to determine which are 'scope' objects. Perhaps this is too expensive, but it would at least be interesting to see if it is or not.
 So the added overhead goes from having to identify structs which 
 must have their destructor called at the end of each scope to having 
 to also identify 'scope' objects which must have their reference 
 count decremented at the end of each scope and have their destructor 
 called if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know exactly the type of the struct at that time. Classes are polymorphic, so it'd be a costly runtime check to know that, and that check is almost as costly as doing the reference counting itself. Given that, you should probably not bother at runtime and decide at compile time to just treat any class which has the potential to be a scope class as if it were one and actually do the reference counting.

Your point is well taken, but I still would like to see if the check for a 'scope' object would be that expensive. It could be as easy as checking an extra 'int' for reference counting for each object and seeing whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).

Basically, you need to:

1. Load the object's pointer in a register
2. Load the "scope" flag from memory by offsetting the object's pointer
3. Branch depending on that flag:
   a. if not scope, go to 4.
   b. if scope, do whatever is needed to increment the reference count atomically, then go to 4
4. Write the pointer to its new location.

That's a lot of extra work you'd have to do at every copy of an object's pointer to perform that check. That branch operation could become very expensive if the processor can't predict it right, and loading from an additional, possibly far away, memory block could mean missing the memory cache more often too. Steps 1 and 4 are all you need if you don't care about scope.

I love it when people such as you carry on about all the work that must be done to implement X. Implementing any new feature in any language takes work. There are NO free rides. But that never means that the new feature should not be done. Who cares if some program is slowed down by some number of microseconds each time if the feature makes a better and much easier programming paradigm work which otherwise could only be handled in a clumsy and inefficient manner?
 
 
 The compiler isn't knowledgeable of what happens within every 
 function call. So it can only check at runtime if the function 
 returned a C or a D.

Fully agreed.
 If you do, then a much simpler, and to the point, example would be 
 based on my initial OP:

 scope class C { ... }

 scope C c = new C(...);

 I specified that the scope keyword for creating the object is 
 redundant. The compiler can figure it out. The major difference in 
 opinion is that I think the compiler should figure it out from the 
 dynamic type of the object at run-time and not from the static type 
 of the object.

You're perfectly right: it is redundant in *this* case, and you could have the compiler implicitly understand that C is a scope class in *this* case. But consider this example:

    Object o;
    if (/* random value */)
        o = new C; // C is a scope class
    else
        o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* later create a C object and assign it to o, or should line 3 give an error since the type Object isn't scope and C must only be assigned as scope? I'd say it should be an error.

I say it should be a 'scope' object. The dynamic type of o is that of a 'scope' class.

Hum, dynamic scope typing again? If you had that it'd work, sure, but since we surely won't have that this isn't an option.

A brilliant conclusion. You decide that "we surely won't have that" so it will not work. Another candidate for a course in predicate logic 101 shows up.
 
 
 This however could be made legal without too much difficulty:

     scope Object o;
     if (/* random value */)
         o = new C; // c is a scope class
     else
         o = new Object; // Object is the base class of C but isn't scope

 Basically, you're declaring a scope Object. While Object isn't 
 necessarily a scope class, you are telling the compiler to treat it as 
 scope, and thus an instance of C, which must be scope, *can* be put 
 in this variable. If o wasn't scope, it'd be an error to put an 
 instance of a scope class in it.

But then the end-user is required to know that the C is a scope class. I do not think that should be necessary.

Perhaps not, I don't have a strong opinion on that. But I firmly believe scope should be enforced statically, not dynamically, and that's what I'm arguing for.

I understand your argument based on the simplicity of the solution, and the relative speed of the code compared to the alternative of determining 'scope' at run-time. I respect your argument but I think that it is an incomplete solution from the end-user's perspective because he must be aware of the 'scope'-ness of the objects he uses and notate the objects accordingly. I think this is an imposition although I could live with it. But I would like to see the dynamic solution at least attempted.
 
 The whole point of 'scope' ( RAII ) in GC is that, for the most part, 
 an end-user should instantiate and use 'scope' classes just as he 
 would normal GC classes, with the language taking care to 
 automatically destruct an object of a 'scope' class just as soon as 
 the last reference to that object goes out of scope.

Well, perhaps there's a solution that would do what you want while still keeping it compile-time only. It's some sort of compromise. Take these three classes:

    class A {}
    scope class B : A {}
    scope class C : B {}

B and C are scope, A isn't. Now, what if writing "B" was equivalent to writing "scope B" (since B is scope) and "C" was equivalent to writing "scope C"? Obviously, writing "A" wouldn't be equivalent to "scope A" (because A is not scope). Then you could have:

    A a1 = new A;
    A a2 = new B; // illegal: B is scope, cannot be assigned to non-scope A
    scope A a3 = new B; // legal: B is scope and scope A is (explicitly) scope

    B b1 = new B;
    B b2 = new C; // legal: C is scope and B is (implicitly) scope
    scope B b3 = new C; // same as above

That would mean that you'd only have to explicitly write scope if you're using the non-scope base class as a type to hold a reference to your scope object.

Yes, I understand your example completely.
 
 
 But there are still many holes in this scheme in which scope now 
 means reference-counted. Take this example:

     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }

     A[] globalReferences;

     scope B b = new B; // Scope could be made implicit here, but it's irrelevant to my example
     b.doSomething();

 This last statement would call A.doSomething which would put a 
 non-scoped reference to globalReferences, which would fail to retain 
 the object. There are two ways around that: ignore the problem and 
 let the programmer handle these cases (basically, that is what 
 boost::shared_ptr would do in such a situation), or introduce a new 
 keyword to decorate parameters for functions that do not keep any 
 reference beyond their own call so that you don't need to duplicate 
 all your functions for a scope and non-scope parameter (much like 
 const is the middle ground between mutable and invariant).

No, A.doSomething would put a 'scoped' reference in a non-scope array. However if we specify 'scope A[] globalReferences;' we can solve that problem.

Sure, you're solving the problem nicely. But how does the compiler find out there's a problem in the first place? It needs to know that the this parameter is scope, and thus the member function should be decorated scope (just like you'd do with invariant). So you'd need to duplicate every member function so that it can be used either as scope or non-scope, and that's not very interesting unless you can declare that the function does not need to know if the parameter is typed scope or not (just like const means you don't know if it's invariant or mutable).

I am lost about what you are saying above. Member functions have nothing to do with 'scope'.
Feb 05 2008
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-05 23:45:42 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 Michel Fortin wrote:
 
 But there are still many holes in this scheme in which scope now means 
 reference-counted. Take this example:
 
     class A {
         void doSomething() {
             globalReferences ~= this;
         }
     }
     scope class B : A { }
 
     A[] globalReferences;
 
     scope B b = new B; // Scope could be made implicit here, but it's irrelevant to my example
     b.doSomething();



I am lost about what you are saying above. Member functions have nothing to do with 'scope'.

The thing is that if we have a static reference-counting type modifier (the scope keyword in this case), the compiler has to emit code to increment and decrement the reference count each time we add or remove a reference to a scope object. To do that, it has to know when compiling a function whether or not the object's type is scope.

In the above example, class A has a doSomething function which adds a reference to itself to some global variable. Since 'this' is of type A (not scope A) in the doSomething function, no code is added to reference-count the object when assigning it to the global variable. Hence, if you could call doSomething on a scope A, and then remove all other references to A, A's reference count would become zero and A would be deleted despite it being still referenced. (Having a B class derived from A just makes the thing harder to spot. It's basically the same thing as having a scope A object though.)

The obvious solution is this:

    class A {
        void doSomething() {
            globalReferences ~= this;
        }
        scope void doSomething() { // scope is an attribute of the function here
            globalReferences ~= this;
        }
    }

where one doSomething has no code to maintain the reference counter and the other version has. The first version would be called when the compiler has a non-scope A while the second would be called for a scope A. That essentially means that you couldn't call a scope function on a non-scope object and vice-versa. Each function therefore needs to be duplicated, with a scope and a non-scope variant, just in case it puts a reference to the object somewhere that'll still exist after the function call.

There is an obvious solution to that problem though: as the D source code for the two member functions is the same, the compiler could just compile the two variants from the same source.
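
Generating two variants from one source body can be imitated today with a template parameter. The following is only a sketch of the idea, not what a compiler would actually emit: `CountedBase`, the hand-maintained `refCount` field and the `isScope` template parameter are all hypothetical stand-ins for the proposed built-in counting.

```d
import std.stdio : writeln;

// Hypothetical base class with a hand-maintained reference count.
class CountedBase {
    int refCount;
}

A[] globalReferences;

class A : CountedBase {
    // One source body, two instantiations:
    // doSomething!false (no counting code) and doSomething!true (counting code).
    void doSomething(bool isScope)() {
        static if (isScope)
            ++refCount;            // the reference stored below is counted
        globalReferences ~= this;  // same statement in both variants
    }
}

void main() {
    auto plain = new A;            // ordinary GC object, never counted
    plain.doSomething!false();

    auto counted = new A;
    counted.refCount = 1;          // stands for the local 'scope' reference
    counted.doSomething!true();

    writeln(plain.refCount, " ", counted.refCount);
}
```

Calling the `!false` variant on the counted object would store an uncounted reference, which is exactly the premature-deletion hazard described above.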
Unfortunately, the compiler would *always* have to generate two symbols for each member function, even if no reference is put elsewhere, so the scope function can be reached when calling from elsewhere. (Remember that when doing a function call the compiler doesn't know what happens inside the function, and it can't just guess the scope function doesn't exist.)

    class A {
        // automatically generates the two functions from the example above
        void doSomething() {
            globalReferences ~= this;
        }
    }

If we had a new keyword to tell in the function signature that we won't keep the reference somewhere else, that it'll be completely forgotten after the call, then we could avoid generating two functions needlessly for most member functions. Let's call this keyword "amnesic":

    class A {
        amnesic void doSomething() {
            globalReferences ~= this; // illegal, amnesic reference to this put outside function scope
        }
        amnesic void doSomethingElse() {
            // only one generated function
        }
    }

It's basically the same pattern as for invariant and non-invariant methods (you can't call an invariant method on a mutable object; you can't call a mutable method on an invariant object; both work with const). Here, you have regular methods, scope methods, and amnesic methods that can work with both scope and non-scope objects.

I'm going a little off-topic now, but a new keyword such as this could be useful for creating invariant objects too. Basically, while an amnesic function guarantees there are no more references to the object after the function call than there were before, an amnesic constructor could guarantee uniqueness of the reference after the creation.
This means the created object could become invariant if that was the caller's intent:

    class A {
        amnesic this() {
            // legal: no reference given to the outside world
        }
        amnesic this() {
            globalReferences ~= this; // illegal, amnesic reference to this put outside constructor's scope
        }
    }

    A a1 = new A;
    invariant A a2 = new A;

(Others have talked about "unique" as a keyword, but "unique" isn't very useful because it describes the state of a reference at a certain point in time, not a property of a variable or a type.)

You could also have amnesic parameters to functions that would guarantee the function doesn't keep a reference after the call.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 07 2008
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2008-02-05 23:45:42 -0500, Edward Diener 
<eddielee_no_spam_here tropicsoft.com> said:

 I love it when people such as you carry on about all the work that must 
 be done to implement X. Implementing any new feature in any language 
 takes work. There are NO free rides. But that never means that the new 
 feature should not be done.

If by "work that must be done" you mean find a solution with no overhead, then good luck. I'm telling you it'll be difficult to implement, I'm telling you it's going to remove one of the biggest advantages of using a garbage collector by producing more code to deal with reference counts and object allocation/deallocation, and this code will consume time. In other words, I'm arguing about the (undesirable) end result, not the work it'd take to get there.
 Who cares if some program is slowed down by some number of 
 microseconds each time if the feature makes a better and much easier 
 programming paradigm work which otherwise could only be handled in a 
 clumsy and inefficient manner.

Well, you have a valid point that often -- though not always -- a feature that, at the sacrifice of some runtime performance, helps programmers save time is a good thing. I'll have to admit I'm not too convinced it'd be so helpful there, but that's not really what is concerning me.

One of D's goals is performance. The thing with performance in a program is that you don't need it, except at a few critical places where it is of the utmost importance. By forcing the reference-counting code everywhere, it's going to end up at many places where it's not needed, and some of these places will be those time-critical parts.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 07 2008
prev sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:

 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
    a. if not scope, go to 4.
    b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.
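
For readers who want to see the four steps as code, here is a minimal C++ sketch. The `Object` layout, the `isScope` flag, and the `storeReference` helper are hypothetical names chosen for illustration, and the null check is an addition of this sketch. Like the list itself, it covers only the store and does nothing about whatever reference the slot held before.

```cpp
#include <atomic>

// Hypothetical object header: a flag saying whether the object is 'scope',
// plus an atomically updated count.
struct Object {
    bool isScope;
    std::atomic<int> refCount;
    explicit Object(bool scope) : isScope(scope), refCount(0) {}
};

// Steps 1 to 4 from the list: load the pointer, load the flag, branch,
// then write the pointer to its new location.
inline void storeReference(Object*& slot, Object* obj) {
    if (obj != nullptr && obj->isScope) {  // steps 2 and 3
        obj->refCount.fetch_add(1);        // step 3b: atomic increment
    }
    slot = obj;                            // step 4
}
```

Even when the object is not 'scope', the flag load and the branch are still executed on every store, which is the cost being argued about in this thread.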

Not quite. Consider an assignment

    A a;
    B b;
    a = b;

If I may be so bold as to rewrite this in C++, for clarity, that would look like:

    A* pa;
    B* pb;
    pa = pb;

That's the situation currently. Assignment of classes is very fast. /But/, this is what it would change to under your scheme:

    A* pa;
    B* pb;
    if (pa->refCount)
    {
        if (pb->refCount)
        {
            atomic { if (--pb->refCount == 0) delete pb; }
            atomic { ++pa->refCount; }
            pa = pb;
        }
        else
        {
            throw new RefCountException();
        }
    }
    else
    {
        if (pb->refCount)
        {
            throw new RefCountException();
        }
        else
        {
            pa = pb;
        }
    }

pb being a 'scope' object does not matter and its reference count does not get adjusted downward just because its object reference is assigned to another object. A reference counted 'scope' object means there is a single reference count ( probably somewhere out in GC memory ) for all references to a particular object. When a new reference to that object is created, either through assignment, as above, or passing the object reference, the single reference count is incremented. So you can throw out a good deal of your imagined code above.
 
 And that's just for /ordinary/ assignment. That looks like a
 phenomenal overhead to me. Now just /imagine/ how complicated it gets
 if you've overloaded opAssign in various complicated ways. (e.g.
 structs assigned from classes, classes assigned from structs, etc.)
 
 I think I'd rather not have that overhead added to every single class
 assignment.

You would no doubt claim that even if your code above were correct and much simpler. As soon as people are against an idea they find the necessary reasons to denigrate it based on such spurious thought. In computer programming the favorite claim for such thought is always the logic and supposed overhead of implementing anything.
Feb 05 2008
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 In your example justifying treating 'scope' as C++ treats 'const' the 
 'const' is attached to the object upon instantiation of it.

Pedantically, it is not attached to the object. It is attached to the *type* of the object. The bits in the object do not change, and there is no way at runtime to examine the bits of the object to determine if it is const or not. The "const-ness" is purely a compile time attribute, i.e. it's part of the static type.
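
A small C++ sketch of this point, under the assumption that C++ `const` is a fair stand-in for what is being described: the qualifier changes the static type but not the bytes of the object. The helper name `sameBits` is made up for the example.

```cpp
#include <cstring>
#include <type_traits>

// 'const int' and 'int' are distinct static types...
static_assert(!std::is_same<const int, int>::value,
              "const is part of the static type");
// ...but objects of the two types have the same size.
static_assert(sizeof(const int) == sizeof(int),
              "no extra bits are stored for const");

// Compares the raw bytes of two ints; nothing in those bytes records
// whether the variable they came from was declared const.
inline bool sameBits(const int& a, const int& b) {
    return std::memcmp(&a, &b, sizeof(int)) == 0;
}
```

Given `const int a = 42;` and `int b = 42;`, `sameBits(a, b)` holds, so a runtime inspection of the object cannot recover the qualifier.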
 My point is 
 that 'scope' is attached to the type by the class designer. To me these 
 are conceptually two different things. That is why I said that your 
 example is a poor analogy. I should have made that clearer by my argument.

Whether the 'scope' is attached to the class definition or the variable definition is a separate and orthogonal issue. As I understand it, our difference is if an object can, at run time, be distinguished as being scope or not, and should this be tested at runtime at each place where assignment, copy construction, and scope exit happens. I.e., should 'scope' be a part of the type of the object, or a dynamic part of the runtime representation of an object? Both are technically implementable.
 I was simply arguing 
 against your assertion, using 'const' as an example, of not allowing a 
 'scope' object to be assigned to an object that is not 'scope'.

Ok.
 Clearly 
 in the polymorphic world of D, where a base class may not be 'scope' 
 while a derived class may be 'scope', such a treatment can not be right, 
 since the basis of polymorphism is to assign to a base class object a 
 derived class reference.

Right. This is a reason why 'scope' for classes may need to be eventually deprecated.
 But I do not see why you think that every object must therefore have 
 scope semantics.

It will be required as any user could declare an object instance as 'scope', and so any separately compiled code must anticipate that.
 You have already won that argument as I fully agree to what you say 
 above. But I have no idea why you say that it is the crux of our 
 disagreement. Care to elaborate ?

It's just that if any object could be scoped based on a runtime test, that then you've got to insert that test at every assignment, copy construction, and scope exit. You've got all the overhead of RC.
 No, I do not think that the scopeness of an object is best determined by 
 the user and not the class designer. In fact I feel very strongly the 
 opposite. The class designer knows whether his class has RAII or not and 
 in the vast majority of cases the end user should not know or care.

This is a very interesting issue. I've been slowly coming to the opposite conclusion that issues of where an object is created (and that includes scope) should be the purview of the object user. C++ and D have class specific allocators, but that might be a mistake.
 My argument for scope injection is based purely on the practical 
 considerations that there are types which can not possibly know if it is 
 to be used with RAII or not. Template classes/structs which are 
 containers are the most obvious as well as built-in arrays. That is why 
 besides the ability for the class designer to specify 'scope' the end 
 user should be able to do it at object creation time also.

Then you have the problem that all generated code that manipulates any object must insert all the rc machinery for that object, just in case some user somewhere instantiates it as 'scope'.
Feb 04 2008
parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 My point is that 'scope' is attached to the type by the class 
 designer. To me these are conceptually two different things. That is 
 why I said that your example is a poor analogy. I should have made 
 that clearer by my argument.

Whether the 'scope' is attached to the class definition or the variable definition is a separate and orthogonal issue. As I understand it, our difference is if an object can, at run time, be distinguished as being scope or not, and should this be tested at runtime at each place where assignment, copy construction, and scope exit happens.

Yes, this is important in support of 'scope' at run-time. For each object you would need to determine if it is 'scope' in the cases cited above and take the appropriate action if it is. The easiest way, although perhaps the slowest, is to find out if the dynamic type is a 'scope' type at the junctures you mentioned. How you do that at the compiler level you best would know.

One possibility that comes to mind, but perhaps erroneous because too simple, is to add an int reference count to every object and then set it to 1 whenever a 'scope' object ( an object specified as 'scope' or having a 'scope' class type ) is created or when a reference to a 'scope' object is legally assigned to any object. This goes along with my feeling that you will need to keep track of 'scope' objects by attaching your mechanism to the object at run-time and not rely merely on something in the 'scope' class type, and that it is the dynamic type of the object and not the static type which matters.
 
 I.e., should 'scope' be a part of the type of the object, or a dynamic 
 part of the runtime representation of an object? Both are technically 
 implementable.

My vote is the second. See above.
 
 
 I was simply arguing against your assertion, using 'const' as an 
 example, of not allowing a 'scope' object to be assigned to an object 
 that is not 'scope'.

Ok.
 Clearly in the polymorphic world of D, where a base class may not be 
 'scope' while a derived class may be 'scope', such a treatment can not 
 be right, since the basis of polymorphism is to assign to a base class 
 object a derived class reference.

Right. This is a reason why 'scope' for classes may need to be eventually deprecated.

I think this is very wrong.
 
 
 But I do not see why you think that every object must therefore have 
 scope semantics.

It will be required as any user could declare an object instance as 'scope', and so any separately compiled code must anticipate that.

I agree in the sense that every object may need to carry an extra reference count with it even though it will not be used for the vast majority of objects, which will be GC. I do not view this as an issue.
 
 
 You have already won that argument as I fully agree to what you say 
 above. But I have no idea why you say that it is the crux of our 
 disagreement. Care to elaborate ?

It's just that if any object could be scoped based on a runtime test, that then you've got to insert that test at every assignment, copy construction, and scope exit. You've got all the overhead of RC.

Yes, agreed. There will be overhead to deal with 'scope' objects. However you already have some overhead dealing with stack variables, and so has C++ for its existence at the end of each scope and it sure does not make C++ slower than most GC systems.
 
 No, I do not think that the scopeness of an object is best determined 
 by the user and not the class designer. In fact I feel very strongly 
 the opposite. The class designer knows whether his class has RAII or 
 not and in the vast majority of cases the end user should not know or 
 care.

This is a very interesting issue. I've been slowly coming to the opposite conclusion that issues of where an object is created (and that includes scope) should be the purview of the object user. C++ and D have class specific allocators, but that might be a mistake.

I can not say too strongly that if RAII, via 'scope', is to work in D or any other GC language, the end-user should be as oblivious as possible to it working automatically. This means that the class designer, who surely must know whether objects of their class need RAII, tells the compiler that his type is 'scope' and the end-user proceeds to use objects of that type just as if he would use normal GC objects.

Otherwise you are creating a bifurcated system which does the end-user no good. Not only must the end user know something in advance about the inner workings of a class ( that it needs RAII ) when the class designer already knows it, but he must also use a separate notation to deal with objects of that class.
 
 
 My argument for scope injection is based purely on the practical 
 considerations that there are types which can not possibly know if it 
 is to be used with RAII or not. Template classes/structs which are 
 containers are the most obvious as well as built-in arrays. That is 
 why besides the ability for the class designer to specify 'scope' the 
 end user should be able to do it at object creation time also.

Then you have the problem that all generated code that manipulates any object must insert all the rc machinery for that object, just in case some user somewhere instantiates it as 'scope'.

It needs to have inserted for it the mechanism which determines whether that object is a 'scope' object or not. It probably needs the extra int for possible reference counting. Other than that I do not see what other machinery is needed for normal GC objects. If we are really still in the age, with vtables and alignment padding and god knows what else a compiler writer needs per object to correctly do his work, where another 4 bytes of int is considered prohibitory, then I give up the whole idea <g>.
Feb 04 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

I agree in the sense that every object may need to carry an extra reference count with it even though it will not be used for the vast majority of objects, which will be GC. I do not view this as an issue.

It's a very serious issue, as it essentially negates much of the advantage of general gc. For one example, you'll have to give up interior pointers.
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.


It will be needed for *every* gc object, too. And not just the allocation for the reference count, the test has to be executed every time.
 However you already have some overhead dealing with stack variables, and 
 so has C++ for its existence at the end of each scope and it sure does 
 not make C++ slower than most GC systems.

If reference counting worked that well, there would be no push to add gc to C++0x.
 I can not say too strongly that if RAII, via 'scope', is to work in D or 
 any other GC language, the end-user should be as oblivious as possible 
 to it working automatically. This means that class designer, who surely 
 must know whether objects of their class need RAII, tells the compiler 
 that his type is 'scope' and the end-user proceeds to use objects of 
 that type just as if he would use normal GC objects.
 
 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about the 
 inner workings of a class ( that it needs RAII ) when the class designer 
 already knows it, but he must also use a separate notation to deal with 
 objects of that class.

For those cases, all the class designer needs to do is present to the user the struct wrapper for the class, not the class itself.
 Then you have the problem that all generated code that manipulates any 
 object must insert all the rc machinery for that object, just in case 
 some user somewhere instantiates it as 'scope'.

It needs to have inserted for it the mechanism which determines whether that object is a 'scope' object or not. It probably needs the extra int for possible reference counting. Other than that I do not see what other machinery is needed for normal GC objects.

Consider:

    void foo(C c) { C d = c; }

foo() has no idea if c is ref counted or gc. Therefore, it has to check every time, at run time. All the machinery has to be there, just in case.
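
To make "all the machinery has to be there" concrete, here is a hedged C++ sketch of what a separately compiled foo() might have to contain if scopeness were a runtime property. It assumes the convention floated elsewhere in this thread that a count of 0 marks an ordinary GC object. The `testsExecuted` counter is purely an instrument added for this example so the cost can be observed.

```cpp
// Hypothetical layout: refCount == 0 marks an ordinary GC object,
// anything else marks a 'scope' (reference-counted) object.
struct C {
    int refCount = 0;
};

// Counts how many runtime tests ran, so the overhead is visible.
int testsExecuted = 0;

void foo(C* c) {
    // "C d = c;" copies a reference, so a test is needed here.
    C* d = c;
    ++testsExecuted;
    if (d != nullptr && d->refCount != 0) {
        ++d->refCount;
    }

    // d goes out of scope, so a second test is needed here.
    ++testsExecuted;
    if (d != nullptr && d->refCount != 0) {
        --d->refCount;  // cannot reach 0 here: the caller still holds c
    }
}
```

Both tests run whether the argument is a GC object or a counted one. Only the increments and decrements are skipped in the GC case.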
 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to correctly 
 do his work, where another 4 bytes of int is considered prohibitory, 
 then I give up the whole idea <g>.

It's not just another 4 bytes.
Feb 05 2008
next sibling parent Christopher Wright <dhasenan gmail.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

I agree in the sense that every object may need to carry an extra reference count with it even though it will not be used for the vast majority of objects, which will be GC. I do not view this as an issue.

It's a very serious issue, as it essentially negates much of the advantage of general gc. For one example, you'll have to give up interior pointers.
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.


It will be needed for *every* gc object, too. And not just the allocation for the reference count, the test has to be executed every time.
 However you already have some overhead dealing with stack variables, 
 and so has C++ for its existence at the end of each scope and it sure 
 does not make C++ slower than most GC systems.

If reference counting worked that well, there would be no push to add gc to C++0x.
 I can not say too strongly that if RAII, via 'scope', is to work in D 
 or any other GC language, the end-user should be as oblivious as 
 possible to it working automatically. This means that class designer, 
 who surely must know whether objects of their class need RAII, tells 
 the compiler that his type is 'scope' and the end-user proceeds to use 
 objects of that type just as if he would use normal GC objects.

 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about 
 the inner workings of a class ( that it needs RAII ) when the class 
 designer already knows it, but he must also use a separate notation to 
 deal with objects of that class.

For those cases, all the class designer needs to do is present to the user the struct wrapper for the class, not the class itself.
 Then you have the problem that all generated code that manipulates 
 any object must insert all the rc machinery for that object, just in 
 case some user somewhere instantiates it as 'scope'.

It needs to have inserted for it the mechanism which determines whether that object is a 'scope' object or not. It probably needs the extra int for possible reference counting. Other than that I do not see what other machinery is needed for normal GC objects.

Consider:

    void foo(C c) { C d = c; }

foo() has no idea if c is ref counted or gc. Therefore, it has to check every time, at run time. All the machinery has to be there, just in case.
 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to 
 correctly do his work, where another 4 bytes of int is considered 
 prohibitory, then I give up the whole idea <g>.

It's not just another 4 bytes.

You'd have to outlaw:

    T a;
    scope(T) b = a;

You'd also have to outlaw:

    scope(T) a;
    T b = a;

This would be more obvious with a wrapper struct than a storage class.
Feb 07 2008
prev sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Walter Bright wrote:
 Edward Diener wrote:
 It will be required as any user could declare an object instance as 
 'scope', and so any separately compiled code must anticipate that.

I agree in the sense that every object may need to carry an extra reference count with it even though it will not be used for the vast majority of objects, which will be GC. I do not view this as an issue.

It's a very serious issue, as it essentially negates much of the advantage of general gc. For one example, you'll have to give up interior pointers.

I do not follow what having a reference count for an object has to do with giving up interior pointers.
 
 It's just that if any object could be scoped based on a runtime test, 
 that then you've got to insert that test at every assignment, copy 
 construction, and scope exit. You've got all the overhead of RC.


It will be needed for *every* gc object, too. And not just the allocation for the reference count, the test has to be executed every time.

The test for a reference count is executed whenever you need to do something if the object is a 'scope' object which you would not do for a non-scoped object. Perhaps this is what you mean by "every time". I have these testing "times" as assignment/copy a reference and exiting a scope. When instantiating an object no "test" need be made since the compiler always knows when an object is 'scope' or not when it is created ( 'scope sometype someobject' notation or sometype has a 'scope class' notation).
 
 However you already have some overhead dealing with stack variables, 
 and so has C++ for its existence at the end of each scope and it sure 
 does not make C++ slower than most GC systems.

If reference counting worked that well, there would be no push to add gc to C++0x.

No one ever said that reference counting solved all memory problems as opposed to GC. The most obvious usage for GC which I know, over and above reference counting, is cross-referenced objects.
 
 
 I can not say too strongly that if RAII, via 'scope', is to work in D 
 or any other GC language, the end-user should be as oblivious as 
 possible to it working automatically. This means that class designer, 
 who surely must know whether objects of their class need RAII, tells 
 the compiler that his type is 'scope' and the end-user proceeds to use 
 objects of that type just as if he would use normal GC objects.

 Otherwise you are creating a bifurcated system which does the end-user 
 no good. Not only must the end user know something in advance about 
 the inner workings of a class ( that it needs RAII ) when the class 
 designer already knows it, but he must also use a separate notation to 
 deal with objects of that class.

For those cases, all the class designer needs to do is present to the user the struct wrapper for the class, not the class itself.

Sure, but then there becomes a different notation for dealing with specific classes, which nullifies the whole point of being able to specify an RAII type ( via 'scope class' in D ).
 
 
 Then you have the problem that all generated code that manipulates 
 any object must insert all the rc machinery for that object, just in 
 case some user somewhere instantiates it as 'scope'.

It needs to have inserted for it the mechanism which determines whether that object is a 'scope' object or not. It probably needs the extra int for possible reference counting. Other than that I do not see what other machinery is needed for normal GC objects.

Consider:

    void foo(C c) { C d = c; }

foo() has no idea if c is ref counted or gc. Therefore, it has to check every time, at run time. All the machinery has to be there, just in case.

I agree.
 
 If we are really still in the age, with vtables and alignment padding 
 and god knows what else a compiler writer needs per object to 
 correctly do his work, where another 4 bytes of int is considered 
 prohibitory, then I give up the whole idea <g>.

It's not just another 4 bytes.

I meant that memory-wise it is just 4 bytes. Of course it is extra programming from the language's point of view.

Let me try to make the case for RAII in D via 'scope' once again, by presenting the technical details as I see it, and then you will no doubt choose what you think best. If I am really far off please tell me about it, otherwise there is little reason for me to try to argue and present my idea further as you will do what you think best, and I appreciate that you have heard me out.

First, the situations when RAII processing occurs:

1) A 'scope' object is instantiated. The internal reference count, however you choose to implement it, is set to 1.
2) A 'scope' object's reference is assigned/copied to another object. If the 'scope' object is not a null reference, the reference count is incremented.
3) A 'scope' object's reference is changed through assignment. If the old reference is not a null reference, the old reference's reference count is decremented and if it is 0, the old object is destructed ( its destructor is called ) and its memory is released ( the latter may happen later through GC for all I know ).
4) A 'scope' object reaches the end of its scope. Processing then occurs exactly as it does in 3).

There are two ways of dealing with the identification of a 'scope' object.

The first way is through its static type, where the compiler always knows the static type of an object and can generate the correct code in each of the 4 instances above for a 'scope' object, and ignore any changes to the way that normal non-scope objects are treated. This is the easiest way from the compiler's perspective and no doubt the fastest. There is no penalty for normal non-scope GC objects and only the 'scope' object undergoes special, slower processing.
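
The four situations, handled the first way (by static type), can be sketched in C++ roughly as follows. This is only an illustration under assumptions: `Resource`, `ScopeRef`, and the `destroyed` counter are names invented here, the count is stored inside the object, and nothing is thread-safe. It is not a proposal for how D should implement it.

```cpp
// Counts destructor runs so the behaviour can be observed (illustration only).
static int destroyed = 0;

// Hypothetical 'scope' class with an embedded reference count.
struct Resource {
    int refCount = 0;
    ~Resource() { ++destroyed; }
};

// Handle whose static type tells the compiler the object is counted.
class ScopeRef {
    Resource* p_ = nullptr;
    static void retain(Resource* p) { if (p) ++p->refCount; }
    static void release(Resource* p) { if (p && --p->refCount == 0) delete p; }
public:
    ScopeRef() = default;
    // Case 1: instantiation, the count goes from 0 to 1.
    explicit ScopeRef(Resource* fresh) : p_(fresh) { retain(p_); }
    // Case 2: the reference is copied, the count is incremented.
    ScopeRef(const ScopeRef& other) : p_(other.p_) { retain(p_); }
    // Case 3: the reference is changed; the old object is released.
    ScopeRef& operator=(const ScopeRef& other) {
        retain(other.p_);  // increment first so self-assignment is safe
        release(p_);
        p_ = other.p_;
        return *this;
    }
    // Case 4: end of scope, processed exactly like case 3.
    ~ScopeRef() { release(p_); }
    int count() const { return p_ ? p_->refCount : 0; }
};
```

Code that only ever handles plain `Resource*` values pays nothing here. The counting lives entirely in `ScopeRef`, which is the "no penalty for normal non-scope GC objects" property of the first way.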
I still have hope that if you see fit to go this way that you will allow the user to identify a 'scope' object either by the 'scope' keyword applied to the instantiated object itself or by the 'scope' keyword applied to the class type of the object. I say that because I can not conceive of a compiler that could not figure out that an object was 'scope' because its class type was 'scope'. The second way is by examining its dynamic type at run-time and generating code to take the appropriate action. This second way is harder for the compiler to do and no doubt slower, although how much slower is something which could only be pragmatically measured by you with D. With this second way, every object must be tested in each of the 4 cases above to determine if it is a 'scope' object and to take the appropriate action if it is. Obviously 4) above is the potential killer as far as this goes because it would mean testing every reference at the end of each scope, just in case one or more of them is a 'scope' object and needs its end of scope processing. In the other three cases one is dealing with a single object in a well-defined, if general, situation so the overhead would be much less. This second way is obviously much better from the end-user's point of view, which does not mean it is practically a better solution by any means. My only practical argument with all those who are certain that this second way would be an unnecessary imposition on all the users of normal GC objects, and want to regale me with code absolutely "proving" a priori their case, is that once an object is determined to be normal GC there is nothing further that needs be done for that object which would not have been done otherwise. Of course there is overhead for determining this in the cases above, especially with 4). 
For this second way I have presented the extra reference count field, attached internally to all objects, as a way of determining if the object is 'scope' when doing 2), 3), or 4), with the proviso that when doing 1) the value for all normal GC objects of this field would be set to 0. If this is an entirely impractical solution, I am sure that if you decide to pursue the possibility of the second way, just to see if it can be done and what is the practical penalty in doing it, you will find a better scheme.
Feb 07 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 06/02/2008, Janice Caron <caron800 googlemail.com> wrote:
 You're right. That was a mistake. I meant

             atomic { if (--pa->refCount == 0) delete pa; }
             atomic { ++pb->refCount; }


And just to be really clear, real code would need extra checks, above and beyond the simplified version I wrote above. In fact, it would be more like

    if (pa != null && pa->refCount != 0)
    {
        if (pb != null && pb->refCount != 0)
        {
            atomic { ++pb->refCount; }
            atomic { if (--pa->refCount == 0) delete pa; }

etc. It's important to do the increment before the decrement because otherwise you run the risk of crashing if (pa == pb).

And if you store refCount in too small a variable, you also would need to be concerned about ++pb->refCount wrapping. Microsoft VC++'s implementation of std::string, for example, contains a one-byte reference count, and they have some complicated code in there which interprets 0xFF differently so that things don't fall over.

But that's using your suggested convention that (refCount == 0) implies "not reference counted". In my real life implementation, ref-countedness was a compile-time property, not a runtime property, so I didn't have to do all of those tests.

I think compile-time ref-countedness is a very good idea, and has many fantastic advantages. I just think it's a very bad idea to make it a runtime property.
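
The wrapping concern can be illustrated with a generic "sticky" one-byte counter in C++. This is a sketch of the general technique only, not a reproduction of any vendor's std::string code. The names `SmallCount` and `kPinned`, and the choice to leak a pinned object rather than free it, are assumptions of this example.

```cpp
#include <cstdint>

// Value at which the counter sticks. 0xFF is an assumption for this sketch.
constexpr std::uint8_t kPinned = 0xFF;

struct SmallCount {
    std::uint8_t n = 1;  // one owner at creation

    // Adds a reference; once the counter reaches kPinned it stays there
    // instead of wrapping around to 0.
    void inc() {
        if (n != kPinned) ++n;
    }

    // Drops a reference; returns true when the last one is gone.
    // A pinned counter is never released, trading a leak for safety.
    bool dec() {
        if (n == kPinned) return false;
        return --n == 0;
    }
};
```

Every increment and decrement now carries an extra comparison, which is the kind of additional cost a too-small counter brings with it.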
Feb 05 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 06/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 pb being a 'scope' object does not matter and its reference count does
 not get adjusted downward just because its object reference is assigned
 to another object.

You're right. That was a mistake. I meant
             atomic { if (--pa->refCount == 0) delete pa; }
             atomic { ++pb->refCount; }

The reference count of the value previously held by a /does/ get adjusted downwards. I wrote the last post without paying attention to the details, but the complexity doesn't go away when you write it properly. Honest.
 So you can throw out a good
 deal of your imagined code above.

and replace it with correct code which is just as complicated, yes.
 You would no doubt claim that even if your code above were correct and
 much simpler.

*NEVER* tell me what you think I would or would not claim. Only I get to speak for me. If anyone else does it, I start calling strawman. Quote me verbatim by all means, but /do not/ put words into my mouth.
 As soon as people are against an idea they find the
 necessary reasons to denigrate it based on such spurious thought.

Are you implying that I, personally, am guilty of "finding reasons to denigrate" your idea because of "spurious thought"? I ask because you use the generic word "people", but the context of your sentence could be taken to imply that you are talking about me, personally. Please clarify. I don't take well to personal attacks.
 In
 computer programming the favorite claim for such thought is always the
 logic and supposed overhead of implementing anything.

I speak from experience. I have implemented reference counting in a multithreaded environment. I have earned the right to discuss implementation details, and to explain what the /actual/ (not supposed) overhead really is.
Feb 05 2008
prev sibling next sibling parent reply Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On 2/3/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 There is no equivalent of 'const' in
 C++ which refers to the type.

Of course there is. For example: typedef char const * PCSZ;

I think a simpler example for you to cite would have been: typedef char const PCSZ; But again that does not change the type of 'char'. It simply creates a type alias. In arguing for 'scope' at the type level, as in 'scope class C { ... }' I was arguing that 'scope' applies some attribute to the type of C. The class designer is making a conscious decision in designing his type in order to tell the compiler that all objects of his type need deterministic destruction in a system which normally implements non-deterministic destruction via GC. That says something about the type per se. In specifying 'const char ch' the end user is making a conscious decision in telling the compiler that a specific object must not change after it has been initialized to some value. It has nothing to do with the type of the object per se.
Feb 03 2008
parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On 03/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 But again that does not change the type of 'char'.

I hope you're not implying that I said it did.
 In arguing for 'scope' at the type level, as in 'scope class C { ... }'
 I was arguing that 'scope' applies some attribute to the type of C. The
 class designer is making a conscious decision in designing his type in
 order to tell the compiler that all objects of his type need
 deterministic destruction in a system which normally implements
 non-deterministic destruction via GC. That says something about the type
 per se.

Having actually implemented reference counting in C++, I know how it works. For any type T (doesn't matter if it's a class, a struct, or just an int), you make two additional types. The first of these just adds a reference counter:

    class Countable(T)
    {
        uint refCount;
        T val;
    }

and the second is what is exposed to the end user:

    struct RefCounted(T)
    {
        Countable!(T) val;
        /* and appropriate ref-counting functions */
        /* and appropriate forwarding functions */
    }

It should be clear that there is no way to cast a RefCounted!(T) to a T. If you want the following to compile:

    RefCounted!(C) c = whatever;
    C d;
    d = c;

then the only way to do that would be to have RefCounted!(C) be implicitly castable to C, presumably by having opImplicitCast() return c.val.val. This would be disastrous. The moment you allow that, suddenly you have uncounted references running around. It would then be possible to do the following:

    C d;
    {
        RefCounted!(C) c = new RefCounted!(C)();
        d = c;
    }
    /* Whoops! d now points to a destructed object! */

Instead, what you'd /really/ want the class designer to do is something like this:

    struct PrivateFileHandle
    {
        private this(string filename) { /*...*/ }
        ~this() { /*...*/ }
        /* other functions as appropriate */
    }
    alias RefCounted!(PrivateFileHandle) FileHandle;

That way, the caller only has to declare

    FileHandle fh;

but gets it ref-counted. (And there is no way for it not to be refcounted.) If we go the way of "scope C" meaning "RefCounted!(C)" under the hood, then whether or not you'll need the word "scope" depends on whether you want to refer to the refcounted type or the underlying type.

I am saying that when the type is 'scope class C' the compiler automatically wraps every instantiated object of that type as a reference-counted object. While I am not uninterested in the details of doing that, as you present it above, I am more interested that it be done for the correct situations in order to provide RAII in GC. I do not think the end user should have to redundantly specify the object as:

    scope C c = new C(...);

when C is already a scope class. The compiler can figure out that C is a scope class and silently treat c as a scope object.

As for assigning a reference of a scope object to a non-scope object, I do understand the problems as outlined above. Thanks for the explanation. I still think the compiler can figure out in:

    class B {...}
    scope class C : B {...}
    B b = new C(...);

that the b object upon instantiation needs to be wrapped as a scoped object without having to write instead:

    scope B b = new C(...);

but I do realize now that with

    B b;
    b = new C(...);

it would be difficult, perhaps impossible, to somehow switch b from a non-scoped object to a scoped one, since the object has been wrapped or not when it was created. So I do now accept that the second form should be illegal.

I know others may think it nit-picking, but I do not think the end user should have to specify the 'scope' keyword again in the declaration of an object when its dynamic type, upon creation, can be ascertained to be a scope class. I think it is really important for RAII to be as transparent as possible from the end user's point of view, while still allowing the end user to create scope objects as necessary when the actual class type of the object created is not scope.
Feb 03 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 05/02/2008, Janice Caron <caron800 googlemail.com> wrote:
 /But/, this is what it would change to under your scheme:
 <snip>

...and that code had a bug in it! (Who else spotted it?) It can fail with the assignment a = a; It can also fail if either a or b is null. See, this is harder than you think.
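
For readers following the two failure cases named here (self-assignment and null operands), a minimal C++ sketch of a counted-pointer assignment that tolerates both is shown below. The names `Counted`, `retain`, `release` and `assign` are hypothetical and do not come from the thread, and the `destroyed` flag exists only so the effect can be observed. The sketch assumes only the count is shared between threads; concurrent writes to the same pointer variable would still need external synchronization.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical intrusively counted object (not from the thread).
struct Counted {
    std::atomic<unsigned> refCount{0};
    bool* destroyed;  // set to true by the destructor, for observation only
    explicit Counted(bool* flag) : destroyed(flag) {}
    ~Counted() { *destroyed = true; }
};

// Add one reference; a null pointer is simply ignored.
inline void retain(Counted* p) {
    if (p != nullptr) p->refCount.fetch_add(1, std::memory_order_relaxed);
}

// Drop one reference and delete the object when the last one goes away.
inline void release(Counted* p) {
    if (p != nullptr && p->refCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete p;
    }
}

// dst = src. Retaining the new value before releasing the old one keeps
// `a = a` safe: the count never reaches zero in between.
inline void assign(Counted*& dst, Counted* src) {
    retain(src);
    Counted* old = dst;
    dst = src;
    release(old);
}
```

Ordering the increment before the decrement is the usual way to avoid a special case for self-assignment; the null checks are the extra branches the post alludes to.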
Feb 05 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 05/02/2008, Michel Fortin <michel.fortin michelf.com> wrote:
 Basically, you need to:

 1. Load the object's pointer in a register
 2. Load the "scope" flag from memory by offsetting the object's pointer
 3. Branch depending on that flag:
    a. if not scope, go to 4.
    b. if scope, do whatever is needed to increment the reference count
 atomically, then go to 4
 4. Write the pointer to its new location.

Not quite. Consider an assignment

    A a;
    B b;
    a = b;

If I may be so bold as to rewrite this in C++, for clarity, that would look like:

    A* pa;
    B* pb;
    pa = pb;

That's the situation currently. Assignment of classes is very fast. /But/, this is what it would change to under your scheme:

    A* pa;
    B* pb;
    if (pa->refCount)
    {
        if (pb->refCount)
        {
            atomic { if (--pb->refCount == 0) delete pb; }
            atomic { ++pa->refCount; }
            pa = pb;
        }
        else
        {
            throw new RefCountException();
        }
    }
    else
    {
        if (pb->refCount)
        {
            throw new RefCountException();
        }
        else
        {
            pa = pb;
        }
    }

And that's just for /ordinary/ assignment. That looks like a phenomenal overhead to me. Now just /imagine/ how complicated it gets if you've overloaded opAssign in various complicated ways. (e.g. structs assigned from classes, classes assigned from structs, etc.) I think I'd rather not have that overhead added to every single class assignment.
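
For comparison, the four quoted steps taken on their own might look like the C++ sketch below. `Object`, `isScope` and `assignReference` are hypothetical names, and the layout (a flag stored in the object next to an atomic count) is an assumption made for illustration. The sketch deliberately stops where the quoted list stops: it does not release the value the destination held before, which is the part this reply argues cannot be left out.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object header: a "scope" flag plus a count that is only
// meaningful when the flag is set.
struct Object {
    bool isScope;
    std::atomic<unsigned> refCount{0};
    explicit Object(bool scoped) : isScope(scoped) {}
};

// Steps 1-4 of the quoted list for `dst = src`.
inline void assignReference(Object*& dst, Object* src) {
    // Steps 1-3: look at the source object and branch on its flag.
    if (src != nullptr && src->isScope) {
        // Step 3b: atomically count the new reference.
        src->refCount.fetch_add(1, std::memory_order_relaxed);
    }
    // Step 4: write the pointer to its new location.
    dst = src;
}
```

With this shape a non-scope object pays one load and one branch per assignment, while a scope object additionally pays one atomic increment.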
Feb 05 2008
prev sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic increment and atomic decrement instead. (I implemented a ref-counted template in C++ once, and that's what I did. It seemed to work). Locking the indirect object itself - that's another matter! However, one would imagine that multithreaded code which shares objects will already have a mutex system in place, because that would have been needed even without ref counting.

Exactly! The responsibility for object access in a multi-threading system is the end user's, not that of D, and the only low-level responsibility of D in such a scheme is to manipulate the reference count atomically.
Feb 02 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 03/02/2008, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 But again that does not change the type of 'char'.

I hope you're not implying that I said it did.
 In arguing for 'scope' at the type level, as in 'scope class C { ... }'
 I was arguing that 'scope' applies some attribute to the type of C. The
 class designer is making a conscious decision in designing his type in
 order to tell the compiler that all objects of his type need
 deterministic destruction in a system which normally implements
 non-deterministic destruction via GC. That says something about the type
 per se.

Having actually implemented reference counting in C++, I know how it works. For any type T (doesn't matter if it's a class, a struct, or just an int), you make two additional types. The first of these just adds a reference counter:

    class Countable(T)
    {
        uint refCount;
        T val;
    }

and the second is what is exposed to the end user:

    struct RefCounted(T)
    {
        Countable!(T) val;
        /* and appropriate ref-counting functions */
        /* and appropriate forwarding functions */
    }

It should be clear that there is no way to cast a RefCounted!(T) to a T. If you want the following to compile:

    RefCounted!(C) c = whatever;
    C d;
    d = c;

then the only way to do that would be to have RefCounted!(C) be implicitly castable to C, presumably by having opImplicitCast() return c.val.val. This would be disastrous. The moment you allow that, suddenly you have uncounted references running around. It would then be possible to do the following:

    C d;
    {
        RefCounted!(C) c = new RefCounted!(C)();
        d = c;
    }
    /* Whoops! d now points to a destructed object! */

Instead, what you'd /really/ want the class designer to do is something like this:

    struct PrivateFileHandle
    {
        private this(string filename) { /*...*/ }
        ~this() { /*...*/ }
        /* other functions as appropriate */
    }
    alias RefCounted!(PrivateFileHandle) FileHandle;

That way, the caller only has to declare

    FileHandle fh;

but gets it ref-counted. (And there is no way for it not to be refcounted.) If we go the way of "scope C" meaning "RefCounted!(C)" under the hood, then whether or not you'll need the word "scope" depends on whether you want to refer to the refcounted type or the underlying type.
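
A rough C++ rendering of the two-type arrangement described in this post is sketched below. It is an illustration under stated assumptions, not the implementation the author refers to: the count is a plain integer (single-threaded only), the `make` factory and `useCount` accessor are hypothetical additions, and `PrivateFileHandle` is reduced to a stub that bumps a counter when "closed" so the behaviour can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

template <typename T>
class RefCounted {
    // The inner type that pairs the value with its reference counter.
    struct Countable {
        std::size_t refCount = 1;
        T val;
        template <typename... Args>
        explicit Countable(Args&&... args) : val(std::forward<Args>(args)...) {}
    };
    Countable* box_;

    explicit RefCounted(Countable* box) : box_(box) {}
    void release() {
        if (box_ != nullptr && --box_->refCount == 0) delete box_;
    }

public:
    // Hypothetical factory: the only way to create the first reference.
    template <typename... Args>
    static RefCounted make(Args&&... args) {
        return RefCounted(new Countable(std::forward<Args>(args)...));
    }
    RefCounted(const RefCounted& other) : box_(other.box_) { ++box_->refCount; }
    RefCounted& operator=(const RefCounted& other) {
        ++other.box_->refCount;  // retain first, so self-assignment is harmless
        release();
        box_ = other.box_;
        return *this;
    }
    ~RefCounted() { release(); }

    T* operator->() const { return &box_->val; }  // forwarding access only
    std::size_t useCount() const { return box_->refCount; }
};

// Stub resource: "closing" just increments a counter supplied by the caller.
struct PrivateFileHandle {
    int* closedCount;
    explicit PrivateFileHandle(int* counter) : closedCount(counter) {}
    ~PrivateFileHandle() { ++*closedCount; }
    PrivateFileHandle(const PrivateFileHandle&) = delete;
    PrivateFileHandle& operator=(const PrivateFileHandle&) = delete;
};

using FileHandle = RefCounted<PrivateFileHandle>;

// No conversion from the counted handle to a raw pointer is provided.
static_assert(!std::is_convertible<FileHandle, PrivateFileHandle*>::value,
              "a FileHandle must not decay to an uncounted pointer");
```

Strictly speaking `operator->` still hands out a temporary raw pointer, so the guarantee here is only that no uncounted reference can be obtained through an implicit conversion.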
Feb 03 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 2/3/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 There is no equivalent of 'const' in
 C++ which refers to the type.

Of course there is. For example: typedef char const * PCSZ;
Feb 02 2008
prev sibling next sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Janice Caron wrote:
 On 1/30/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 In a GC environment memory is just another resource. The user of objects
 does not worry about memory being released as appropriate. Why should he
 have to worry about other resources being released as appropriate ?

The GC only collects /memory/, not /resources/.

Memory is a subset of resource. It is, in most programmers' minds, the most important subset, but it is still just another resource.
 
 You make a class "scope" if it needs to clean up non-memory resources
 (e.g. close a file, terminate a network connection, etc).

Yes, I do understand that fully.
Jan 30 2008
prev sibling parent "David Wilson" <dw botanicus.net> writes:
On 2/2/08, Janice Caron <caron800 googlemail.com> wrote:
 On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic increment and atomic decrement instead. (I implemented a ref-counted template in C++ once, and that's what I did. It seemed to work). Locking the indirect object itself - that's another matter! However, one would imagine that multithreaded code which shares objects will already have a mutex system in place, because that would have been needed even without ref counting.

There are still effects on the CPU cache when using atomic instructions - namely, every CPU in a multi-CPU system must be instructed to flush any cache line that would be affected by the operation. As far as I know this is asserted in x86 via the "LOCK" instruction prefix. David.
Feb 02 2008
prev sibling next sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Mon, 28 Jan 2008 20:50:59 -0500, Edward Diener wrote:

 
 I do not understand what you mean by "return the scoped value". If in D
 I write:
 
 scope class Foo { ... }
 
 then why should I have to write, when declaring an instance of the
 class:
 
 scope Foo g = new Foo();
 
 as opposed to just:
 
 Foo g = new Foo();
 
 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

He is referring to when you have:

    scope class Foo {}

    Foo doThings() {
        Foo cats = new Foo();
        return cats;
    }

cats no longer exists after return.
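
The same effect can be observed in C++, where a stack object plays roughly the role of a D scope variable. In the sketch below the `alive` flag is a hypothetical device added purely so the destruction can be seen from outside; the function does not return the object, because any pointer or reference it handed back would refer to an object that has already been destroyed.

```cpp
#include <cassert>

// Hypothetical class whose destructor records that the object has died.
struct Foo {
    bool* alive;
    explicit Foo(bool* flag) : alive(flag) { *alive = true; }
    ~Foo() { *alive = false; }
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
};

// A C++ stack object standing in for D's `scope Foo cats = new Foo();`.
inline void doThings(bool* flag) {
    Foo cats(flag);
    assert(*flag);  // alive while control is inside the function
}                   // destructor runs here, as control leaves the scope
```

After `doThings` returns, the flag reads false: there is nothing left for a caller to use.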
Jan 28 2008
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On Jan 29, 2008 1:50 AM, Edward Diener
<eddielee_no_spam_here tropicsoft.com> wrote:
 If in D
 I write:

 scope class Foo { ... }

 then why should I have to write, when declaring an instance of the class:

 scope Foo g = new Foo();

 as opposed to just:

 Foo g = new Foo();

 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

What you're suggesting is "semantic sugar" - allowing the compiler to save us a bit of typing. Sometimes, that can be a good thing. Here, however, I don't think it would be. You see, while the /compiler/ knows that Foo is RAII (you're right about that), future maintainers of the function might not. Forcing the use of the keyword makes the code a bit more readable. Here's another way of looking at it: The right hand side of the statement is evaluated /first/. Then it is assigned to the lvalue. So, when the RHS is evaluated (new Foo()) it returns a value whose type is "scope Foo". Now, you can't assign a "scope Foo" to a "Foo", so the assignment fails. Allowing the semantic sugar would be like "storage-class-deduction", which would open up a really huge can of worms, and we almost certainly don't want to go there (at least not before const has settled down).
Jan 28 2008
parent "Rioshin an'Harthen" <rharth75 hotmail.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message
news:mailman.33.1201592955.5260.digitalmars-d puremagic.com...
 On Jan 29, 2008 1:50 AM, Edward Diener
 <eddielee_no_spam_here tropicsoft.com> wrote:
 The compiler knows that Foo is a scoped class, so there is no need for
 the programmer to repeat it in the object declaration.

What you're suggesting is "semantic sugar" - allowing the compiler to save us a bit of typing. Sometimes, that can be a good thing. Here, however, I don't think it would be. You see, while the /compiler/ knows that Foo is RAII (you're right about that), future maintainers of the function might not. Forcing the use of the keyword makes the code a bit more readable.

I'll go somewhat deeper into this than Janice did. This looks like it is a related issue to C# requiring the method-calling place to note what parameters are ref and what parameters are out. The compiler of course already knows which parameter is using which convention: normal, ref, out... But the designers of C# think it's better to require

    len += o.foo(bar, ref baz, out foobar);

than

    len += o.foo(bar, baz, foobar);

because it documents the call better. Just by reading the call, it is immediately known that baz is passed as a reference, which might change baz, and that foobar will be used to return an additional value from the method. No need to look up the definition of o.foo anywhere.

The same reasoning applies to forcing scoped classes to be marked as such at the point of declaration in D. It requires typing an extra keyword for each instance, but the readability of the code increases by such a factor that typing that extra keyword doesn't bother me.

Now, you're reading through code that somebody else has written, having inherited the maintaining of that code. There's a bad bug in it that has to be fixed and quickly, because it's stopping the project from being finished - and it was supposed to be ready a few weeks ago. You've managed to track the bug to a certain function and look upon the code:

    Foo f = new Foo;
    // do whatever with f
    return f;

Suppose D doesn't require that scope in front of Foo. You have to check the Foo class, and you see the scope in front of the declaration. However, when D requires the scope keyword, the code looks like:

    scope Foo f = new Foo;
    // do whatever with f
    return f;

Bling! We see the error immediately, and have saved the need to actually look for where the Foo class is defined, which could be in the middle of a multi-kloc file that you couldn't have guessed from the ton of imports at the top of the file this function resides in.
Jan 29 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 1/30/08, Edward Diener <eddielee_no_spam_here tropicsoft.com> wrote:
 In a GC environment memory is just another resource. The user of objects
 does not worry about memory being released as appropriate. Why should he
 have to worry about other resources being released as appropriate ?

The GC only collects /memory/, not /resources/. You make a class "scope" if it needs to clean up non-memory resources (e.g. close a file, terminate a network connection, etc).
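
As an illustration of deterministic cleanup for a non-memory resource, here is a small C++ sketch. `ConnectionPool`, `Connection`, `transfer` and `openAfterFailure` are hypothetical names, and the "connection" is only a counter standing in for a real socket or file. The point it tries to show is that the release happens at scope exit on every path, including when an exception is thrown, without waiting for any collector.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a non-memory resource such as a socket or file.
struct ConnectionPool {
    int open = 0;
};

// Guard that acquires on construction and releases on destruction.
class Connection {
    ConnectionPool& pool_;
public:
    explicit Connection(ConnectionPool& pool) : pool_(pool) { ++pool_.open; }
    ~Connection() { --pool_.open; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};

// Uses a connection and may fail part-way through.
inline void transfer(ConnectionPool& pool, bool fail) {
    Connection c(pool);  // acquired here
    if (fail) throw std::runtime_error("transfer failed");
}                        // released here on every exit path

// Returns how many connections remain open after a failed transfer.
inline int openAfterFailure() {
    ConnectionPool pool;
    try {
        transfer(pool, true);
    } catch (const std::runtime_error&) {
        // The connection has already been released by the time we get here.
    }
    return pool.open;
}
```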
Jan 29 2008
prev sibling next sibling parent Jason House <jason.james.house gmail.com> writes:
Edward Diener Wrote:
 The first is the necessity of using an already scoped class by repeating
 the 'scope' declaration when creating an object of that class. Since the
 class itself has already been declared with the 'scope' keyword it seems
 absolutely redundant that the user of an object of the class must repeat
 'scope' in his usage of that object. Surely the compiler is smart enough
 to know that the class is a 'scope' class and will generate the
 necessary code to automatically call the destructor of the class when it
 goes out of scope. In fact the user of this class via an instantiated
 object should not even care if it is a scoped class or not, so having to
 say it is again seems doubly wrong, although allowable.

It's already been clarified by others that scope is different from a reference-counted object that's deleted immediately when no more references to it exist. The post that follows is under the assumption that this applies to the use of scope as defined in the D language.

RAII tends to require very specific usage semantics. Because of this alternate behavior (and requirements on usage), it makes complete sense to mark the variable as scope when used. I don't expect the addition or removal of the scope property of a class to be something that would not require code changes in other places.

One of the appeals of D to me is that it aims to reduce coding errors. This repeat of the use of scope feels like it's an attempt to keep scope usage both clear and correct.
Jan 29 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 2/2/08, Walter Bright <newshound1 digitalmars.com> wrote:
 Doing locked reference counts is slow

Surely you don't need to lock the reference count? You can use atomic increment and atomic decrement instead. (I implemented a ref-counted template in C++ once, and that's what I did. It seemed to work). Locking the indirect object itself - that's another matter! However, one would imagine that multithreaded code which shares objects will already have a mutex system in place, because that would have been needed even without ref counting.
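
To make the distinction concrete, the sketch below puts a mutex-guarded count next to one that uses atomic read-modify-write operations. Both class names are hypothetical, and the memory orderings chosen are one common convention for reference counts rather than anything stated in the thread. The sketch says nothing about relative speed; it only shows that the atomic variant needs no separate lock object.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Variant 1: the count is guarded by a mutex (a "locked" reference count).
class LockedCount {
    std::mutex m_;
    unsigned n_ = 0;
public:
    unsigned increment() { std::lock_guard<std::mutex> g(m_); return ++n_; }
    unsigned decrement() { std::lock_guard<std::mutex> g(m_); return --n_; }
};

// Variant 2: the same interface built on atomic increment and decrement.
class AtomicCount {
    std::atomic<unsigned> n_{0};
public:
    // fetch_add / fetch_sub return the previous value, so adjust by one
    // to report the new count.
    unsigned increment() { return n_.fetch_add(1, std::memory_order_relaxed) + 1; }
    unsigned decrement() { return n_.fetch_sub(1, std::memory_order_acq_rel) - 1; }
};
```

Either variant protects only the count itself; access to the counted object still needs whatever synchronization the program already uses.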
Feb 02 2008