digitalmars.D - Deterministic life-time storage type
- travert phare.normalesup.org (Christophe) (120/120) Apr 21 2012 Hi. I don't have time to follow all discussions here, but it makes a
- Artur Skawina (14/29) Apr 21 2012 "scope", in its current meaning, should have been the default for all
- travert phare.normalesup.org (Christophe) (16/25) Apr 21 2012 Scope in its current meaning is about the same as my scope(in) for
- Artur Skawina (6/35) Apr 21 2012 Yes. there was a reason for my lets-ignore-enforcement-for-now suggestio...
- Michel Fortin (9/19) Apr 21 2012 I fear your solution might not be complicated enough (!) to allow some
- travert phare.normalesup.org (Christophe Travert) (13/14) Apr 21 2012 I've thought about that.
- Michel Fortin (15/30) Apr 22 2012 Your proposal is very similar to some things that were discussed in
- travert phare.normalesup.org (Christophe) (2/5) Apr 22 2012 Thanks, I'll have a look.
- travert phare.normalesup.org (Christophe) (31/36) Apr 23 2012 If we choose the following defaults, the hurdle may not be that high:
Hi.

I don't have time to follow all the discussions here, but for some time I have had an idea that I would like to share, to find out whether it may interest programmers in various fields.

The idea is a major change in the language (perhaps for D.3): give the compiler and the programmer tools to know the lifetime of any type or object in memory, to allow better memory management. This could dramatically reduce the use of the GC, and even allow extra optimisations. It would also solve the cast-to-immutable problem, and perhaps even some r-value/l-value issues. However, this feature requires some discipline to use, and that's why I would like to know whether people would be interested, or whether it is too much to add to a programming language.

Now, this is the idea in a few words: in each function signature, you can add information about whether the function may keep a reference to its parameters or return value. Then, when you declare a variable, you can say how long you want to use that variable. With this information, the compiler can check that you use your variables correctly, and destroy each variable at the right time.

To do this, I'll alter the meaning of the scope, in, out and inout keywords to create new storage types:

- dynamic variable: this refers to a variable for which references can be freely taken. It is allocated on the heap, and garbage collected the usual way. This is the default, but an additional keyword, "dynamic", may be used to explicitly declare a dynamic variable.

Example:
| dynamic(int)[] a = [1, 2, 3]; // same as: int[] a = [1, 2, 3];
| dynamic(int) b = 5;           // same as: ref b = new int; b = 5;

- scope variable: this refers to a variable for which we can be sure that no reference to the variable, or to any subpart of it (scope is transitive), will survive the current scope. No dynamic reference to a scope variable can be made.

Example:
| int[] g;
|
| void main()
| {
|     scope int[] a = [1, 2, 3]; // the allocated array can be destroyed
|                                // at the end of the current scope
|     scope int b = 5; // same as: int b = 5; (except that no closures are
|                      // allowed)
|
|     g = a[]; // error: no reference to a may escape main's scope.
| }

A specific scope, different from the current scope, can be specified by adding parentheses to the scope keyword:

- scope(in): this scope is a bridge between scope and dynamic. Variables of any scope (including dynamic variables) can be cast to scope(in). External references to a scope(in) variable may exist, but no new reference to a scope(in) variable that survives the current scope may be made. Several scope(in) variables usually do not share the same scope (use scope(label) for that).

Example:
| int[] g;
|
| void main()
| {
|     int[] a = [1, 2, 3];
|     scope int[] b = [4, 5, 6];
|
|     scope(in) int[] c = a[]; // ok
|     c = b[]; // ok
|     g = c[]; // error: no reference to c may escape main's scope.
| }

- scope(out): this scope is for variables to be returned. When a scope(out) variable is returned, the calling function can be sure that no reference to the variable or to any of its subparts exists anywhere except in the returned value itself. The caller may cast the scope(out) variable to any scope, and may even cast it to immutable. The caller "decides" what the scope of the scope(out) variable is.

Example:
| scope(out) int[] oneTwoThree()
| {
|     scope(out) int[] r = [1, 2, 3];
|     return r;
| }
|
| void main()
| {
|     scope a = oneTwoThree();
| }

- scope(inout): a combination of scope(in) and scope(out): no reference to the variable that survives the scope may be taken, except the returned value.

Example:
| scope(inout) int[] firstHalf(scope(inout) int[] a)
| {
|     return a[0..$/2];
| }

- scope(label) variable: the variable shares its scope with the variable or label "label".

Example:
| void main()
| {
|     scope a = [1, 2, 3];
|     {
|         scope(a) b = [3, 4, 5];
|         a = b; // ok, b has a's scope
|     }
| }

In addition, to make scope usage less verbose, we may make in, out, and inout parameters and return values implicitly scope(in), scope(out), and scope(inout) respectively, in addition to their current meaning, as long as the code breakage is tolerable (so probably not before D.3, unless this proposal gets more approval than I expect).

This scope system is very similar to the mutable/immutable system. It is optional (one may code without it). There is transitivity, a bridge type (const or scope(in)), and also the same virality (is this an English word??). This means that, to be usable, this system requires restricting the usage of the parameters and returned values of functions with the appropriate keywords (scope(in), scope(out) or scope(inout)); otherwise a scoped variable can't be passed to a function and is not usable in practice. But in my opinion, the gain is very large. When the system is used, variable lifetimes become deterministic, and the compiler can destroy variables at the right time and use the GC only when necessary, or for global variables.

I only gave a few definitions here, from which a whole scope system can be deduced and implemented. I've given it more thought, but this post is long enough for now, so I will let you give me your thoughts, and gladly answer your questions about subtleties that may arise, feasibility, etc.

-- 
Christophe Travert
Apr 21 2012
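The aliasing that scope(inout) is meant to describe can be observed in plain D today. The following is only a minimal sketch without any of the proposed annotations; the variable names in main are made up for illustration. It shows that the slice returned by firstHalf refers to the same memory as its argument, which is why the proposal ties the scope of the return value to the scope of the parameter.

```d
// Plain D, no proposed scope(inout) annotation: the returned slice
// aliases the memory of the argument.
int[] firstHalf(int[] a)
{
    return a[0 .. $ / 2];
}

void main()
{
    int[] data = [1, 2, 3, 4];
    int[] half = firstHalf(data);

    assert(half.length == 2);

    // Writing through the returned slice changes the caller's array,
    // so the result must not outlive the argument's storage.
    half[0] = 42;
    assert(data[0] == 42);
}
```

If data were a scope variable in the proposed system, half would have to share its lifetime for this to be safe.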
On 04/21/12 16:22, Christophe wrote:
> Now, this is the idea in a few words: In each function signature, you can add information about whether the function may keep reference to its parameters or return value. Then, when you declare a variable, you can say how long you want to use that variable. With this information, the compiler can check you use your variables right, and use this information to destroy the variable at the right time. To do this, I'll alter the meaning of the scope, in, out and inout keywords to create new storage types:
> [...]
> I only gave here a few definitions, from which a whole scope system can be deduced, and implemented. I've given it more thought, but this post is long enough for now, so I will let you give me your thoughts, and gladly answer your questions about subtleties that may arise, feasibility, etc.

"scope", in its current meaning, should have been the default for all function arguments. If this was the case, would introducing your scope-scopes bring any additional benefits? (Let's ignore enforcement for now, and assume the compiler won't let the scoped variables escape).

There was a thread some time ago on a similar topic:
http://www.digitalmars.com/d/archives/digitalmars/D/learn/Why_I_could_not_cast_string_to_int_32126.html#N32168

Your "scope(out)" seems to be yet another incarnation of uniq/unique (something that apparently keeps coming up over and over again).

"scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; }"; reusing the "return" keyword to mean "this argument could be returned directly or indirectly as result".

artur
Apr 21 2012
Artur Skawina, in message (digitalmars.D:164784), wrote:
> "scope", in its current meaning, should have been the default for all function arguments. If this was the case, would introducing your scope-scopes bring any additional benefits? (Let's ignore enforcement for now, and assume the compiler won't let the scoped variables escape).

Scope in its current meaning is about the same as my scope(in) for parameters. However, it is not transitive, and that changes a lot of things:

| int[] g;
| void foo(scope int[] a) { g = a; } // passes: a is not protected at all

scope is pointless here. It cannot even protect an array. Just like const/immutable, transitivity is essential.

> Your "scope(out)" seems to be yet another incarnation of uniq/unique (something that apparently keeps coming up over and over again). "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; }"; reusing the "return" keyword to mean "this argument could be returned directly or indirectly as result".

I had never heard about uniq or a return keyword for parameters. I don't have time to follow the forum most of the time. So basically I just put those three ideas together, with a new naming convention, and transitivity. Why are these ideas not going further?

Well, I could have summarized my long post in one line: why is the scope attribute not transitive?

-- 
Christophe
Apr 21 2012
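The foo example from the post above can be turned into a complete program to observe the escape. This is only a minimal sketch, assuming a default build of ordinary (non-@safe) code; main and the variable local are added here for illustration and are not part of the original post.

```d
int[] g; // module-level variable that outlives any call to foo

// `scope` is written on the parameter, but in ordinary (non-@safe)
// code built with default settings nothing stops the slice escaping.
void foo(scope int[] a)
{
    g = a;
}

void main()
{
    int[] local = [1, 2, 3];
    foo(local);

    // g now refers to exactly the same array as local.
    assert(g is local);
}
```

Under the proposed transitive scope(in), the assignment inside foo would be rejected at compile time.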
On 04/22/12 02:01, Christophe wrote:
> Artur Skawina, in message (digitalmars.D:164784), wrote:
>> "scope", in its current meaning, should have been the default for all function arguments. If this was the case, would introducing your scope-scopes bring any additional benefits? (Let's ignore enforcement for now, and assume the compiler won't let the scoped variables escape).
> Scope in its current meaning is about the same as my scope(in) for parameters. However, it is not transitive, and that changes a lot of things:
> | int[] g;
> | void foo(scope int[] a) { g = a; } // passes: a is not protected at all
> scope is pointless here. It cannot even protect an array. Just like const/immutable, transitivity is essential.

Yes. There was a reason for my lets-ignore-enforcement-for-now suggestion... The compiler won't catch the escaping refs right now.

>> Your "scope(out)" seems to be yet another incarnation of uniq/unique (something that apparently keeps coming up over and over again). "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; }"; reusing the "return" keyword to mean "this argument could be returned directly or indirectly as result".
> I never heard about uniq or return keyword for parameters. I don't have time to follow the forum most of the time. So basically I just put those three ideas together, with a new naming convention, and transitivity. Why are these ideas not going further? Well, I could have summarized my long post in one line: Why is the scope attribute not transitive?

It is, or at least this is how I read "references in the parameter cannot be escaped". It just isn't currently enforced.

artur
Apr 21 2012
On 2012-04-21 14:22:41 +0000, travert phare.normalesup.org (Christophe) said:
> This scope system is very similar to the mutable/immutable system. It is optional (one may code without it). There is transitivity, a bridge type (const or scope(in)), and also the same virality (is this an English word??). This means that to be usable, this system requires to restrict the usage of parameters and returned value of the functions by appropriate keywords (scope(in), scope(out) or scope(inout)), otherwise a scoped variable can't be passed to a function and is not usable in practice. But in my opinion, the gain is very large. When used, variable lifetimes become deterministic, the compiler can destroy them at the right time, and use the GC only when necessary, or with global variables.

I fear your solution might not be complicated enough (!) to allow some common patterns. One simple case that often challenges such proposals is the swap function.

So with your system, how do you write the swap function?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 21 2012
Michel Fortin, in message (digitalmars.D:164824), wrote:
> So with your system, how do you write the swap function?

I've thought about that. The scope(label) is the key.

| void swap(T)(scope ref T a, scope(a) ref T b)
| {
|     scope(a) tmp = a;
|     a = b;
|     b = tmp;
| }

scope(inout) would also do the trick, since it is implicitly shared between parameters and return values.

-- 
Christophe
Apr 21 2012
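For comparison, here is a sketch of the same function in plain D, without the proposed scope(label) annotation. The name swapValues is chosen here only to avoid confusion with the swap in the post; `ref` is what lets the function exchange the caller's variables rather than local copies.

```d
// Plain D generic swap, with no lifetime annotations at all.
// Under the proposal, a and b would additionally be declared
// as sharing one scope.
void swapValues(T)(ref T a, ref T b)
{
    T tmp = a;
    a = b;
    b = tmp;
}

void main()
{
    int[] x = [1, 2, 3];
    int[] y = [4, 5, 6];

    swapValues(x, y);

    assert(x == [4, 5, 6]);
    assert(y == [1, 2, 3]);
}
```

The difficulty raised in the thread is visible here: after the call, each variable holds a reference that used to belong to the other, so a checker needs to know that both arguments may legitimately live in the same scope.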
On 2012-04-22 06:41:46 +0000, travert phare.normalesup.org (Christophe Travert) said:
> Michel Fortin, in message (digitalmars.D:164824), wrote:
>> So with your system, how do you write the swap function?
> I've thought about that. The scope(label) is the key.
> | void swap(T)(scope ref T a, scope(a) ref T b)
> | {
> |     scope(a) tmp = a;
> |     a = b;
> |     b = tmp;
> | }
> scope(inout) would also do the trick, since it is implicitly shared between parameters and return values.

Your proposal is very similar to some things that were discussed in 2008 when escape analysis became the topic of the day on this newsgroup. There were two problems for adoption: it makes writing functions difficult (because you have to add all that scoping thing to your mental model) and implementing new type modifiers is a major undertaking that didn't fit with the schedule. While the second problem might disappear given enough time, the first one is a hurdle.

You might find this a good read:
<http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 22 2012
Michel Fortin, in message (digitalmars.D:164837), wrote:
> You might find this a good read:
> <http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>

Thanks, I'll have a look.
Apr 22 2012
Michel Fortin, in message (digitalmars.D:164837), wrote:
> newsgroup. There were two problems for adoption: it makes writing functions difficult (because you have to add all that scoping thing to your mental model) and implementing new type modifiers is a major undertaking that didn't fit with the schedule. While the second problem might disappear given enough time, the first one is a hurdle.

If we choose the following defaults, the hurdle may not be that high:

- 1: function *parameters* and return values are scope by default (scope(in) and scope(out) in my terminology, although it is enough to say scope in that case),
- 2: *variables* are dynamic (= noscope = escape) when it is necessary, and scope when the compiler can determine that they do not escape the scope.

This way, programmers don't have to worry about a variable's scope, since variables are dynamic by default. But the performance cost may not be too high, because most of the time a variable will be treated as scope, since the functions that use it will be scope by default.

Lazy programmers only have to say when they let an argument escape the scope of a function. This is a good thing, because letting arguments escape should be avoided, and in any case this information should be documented.

When a scope has to be shared between parameters and the return value, stating "inout" would adequately solve 90% of the cases or more. For the less than 10% left, "dynamic" and/or deep copies get rid of the problem, if the programmer is too lazy to use scope(label) or has no choice.

Finally, programmers requiring efficiency can control the scope of variables by declaring them explicitly scope. This way, the compiler will check that they are not dynamic.

This is obviously not backward-compatible. However, most of the errors will occur in function signatures, where the compiler can be made to provide adequate error messages so that the signature can be corrected quickly.

I reckon the backward-compatibility issue and the implementation time make it very difficult to consider this before D 3.0, which is not tomorrow. I would really like this sort of system to be implemented in the medium term, even if I think it's just a dream.

-- 
Christophe Travert, 4 years too late to discuss the issue.
Apr 23 2012