digitalmars.D - Deterministic life-time storage type


travert phare.normalesup.org (Christophe) writes:

Hi. I don't have time to follow all the discussions here, but for some
time I have had an idea that I would like to share here, to find out
whether it might interest programmers in various fields. The idea is a
major change to the language (perhaps for D.3) that gives the compiler
and the programmer tools to know the lifetime of any type or object in
memory, allowing better memory management. This could dramatically
reduce the use of the GC, and even allow extra optimisations. It would
also solve the cast-to-immutable problem, and perhaps even some
r-value/l-value issues. However, this feature requires some discipline
to use, and that's why I would like to know whether people would be
interested, or whether it is too much to add to a programming language.

Now, this is the idea in a few words: 
In each function signature, you can add information about whether the
function may keep references to its parameters or return value. Then,
when you declare a variable, you can say how long you want to use that
variable. With this information, the compiler can check that you use
your variables correctly, and can destroy each variable at the right
time.

To do this, I'll alter the meaning of the scope, in, out and inout
keywords to create new storage types:

 - dynamic variable: this refers to a variable to which references can
be freely taken. It is allocated on the heap, and garbage collected the
usual way. This is the default, but an additional keyword, "dynamic", may
be used to explicitly declare a dynamic variable.

 Example:
 | dynamic(int)[] a = [1, 2, 3]; // same as: int[] a = [1, 2, 3];
 | dynamic(int) b = 5; // same as: ref b = new int; b = 5;

 - scope variable: this refers to a variable for which we can be sure 
that no reference to the variable, or any subpart of it (scope is 
transitive), will survive the current scope. No dynamic reference of a 
scope variable can be made.

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |    scope int[] a = [1, 2, 3]; // the allocated array can be destroyed 
 |                               // at the end of the current scope
 |    scope int b = 5; // same as: int b = 5; (except that no closures are
 |                     // allowed)
 |
 |    g = a[]; // error: no reference of a may escape main's scope.
 |  }

A specific scope, different from the current scope, can be specified by 
adding parentheses to the scope keyword:

 - scope(in): This scope is a bridge between scope and dynamic.
Variables of any scope (including dynamic variables) can be cast to
scope(in). External references to a scope(in) variable may exist, but no
new reference to a scope(in) variable that survives the current scope
may be made. Several scope(in) variables usually do not share the same
scope (use scope(label) for that).

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |   int[] a = [1, 2, 3];
 |   scope int[] b = [4, 5, 6];
 |
 |   scope(in) int[] c = a[]; // ok
 |   c = b[]; // ok
 |   g = c[]; // error: no reference of c may escape main's scope.
 | }

 - scope(out): This scope is for variables to be returned. When a
scope(out) variable is returned, the calling function can be sure that
no reference to the variable or any of its subparts exists anywhere
except in the returned value itself. The caller may cast the scope(out)
variable to any scope, and may even cast it to immutable. The caller
"decides" what the scope of the scope(out) variable is.

 Example:
 | scope(out) int[] oneTwoThree()
 | {
 |   scope(out) int[] r = [1, 2, 3];
 |   return r;
 | }
 | 
 | void main()
 | {
 |   scope a = oneTwoThree(); // the caller decides the scope of the result
 | }

 - scope(inout): A combination of scope(in) and scope(out): No
reference to the variable that survives the scope may be taken, except
through the returned value.

 Example:
 | scope(inout) int[] firstHalf(scope(inout) int[] a)
 | {
 |   return a[0..$/2];
 | }

 - scope(label) variable: variable shares its scope with
the variable or label "label".

 Example:
 | void main()
 | {
 |   scope a = [1, 2, 3];
 |   {
 |     scope(a) b = [3, 4, 5];
 |     a = b; //  ok, b has a's scope
 |   }
 | }

In addition, to make scope usage less verbose, we may make in, out, and
inout parameters and return values implicitly scope(in), scope(out),
and scope(inout) respectively, in addition to their current meaning, as
long as code breakage is tolerable (so probably not before D.3, unless
this proposal gets more approval than I expect).

This scope system is very similar to the mutable/immutable system. It is
optional (one may code without it). There is transitivity, a bridge
type (const or scope(in)), and also the same virality (is this an
English word??). This means that, to be usable, this system requires
restricting the usage of the parameters and returned values of functions
with the appropriate keywords (scope(in), scope(out) or scope(inout));
otherwise a scoped variable can't be passed to a function and is not
usable in practice. But in my opinion, the gain is very large. When it
is used, variable lifetimes become deterministic, the compiler can
destroy variables at the right time, and the GC is used only when
necessary, or for global variables.
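
As a rough illustration of that virality, here is a sketch in the hypothetical syntax of this proposal. It is not valid D and no existing compiler accepts it; the function names sum and store are made up for the example.

```d
// Hypothetical syntax from this proposal; does not compile with any existing D compiler.
int[] g;

// Promises not to keep any reference to data, so it accepts
// scoped and dynamic arrays alike.
int sum(scope(in) int[] data)
{
    int total = 0;
    foreach (x; data)
        total += x;
    return total;
}

// No annotation: the parameter is dynamic, so the function may keep a reference.
void store(int[] data)
{
    g = data;
}

void main()
{
    scope int[] a = [1, 2, 3];
    int s = sum(a);   // ok: sum cannot let a escape
    store(a);         // error: a scoped variable cannot be passed as dynamic
}
```

Without the scope(in) annotation on sum, the first call would be rejected too, which is why library signatures would have to be annotated before scoped variables become practical.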

I have only given a few definitions here, from which a whole scope
system can be deduced and implemented. I've given it more thought, but
this post is long enough for now, so I will let you give me your
thoughts, and gladly answer your questions about any subtleties that may
arise, feasibility, etc.

-- 
Christophe Travert

Apr 21 2012

Artur Skawina <art.08.09 gmail.com> writes:

On 04/21/12 16:22, Christophe wrote:
 Now, this is the idea in a few words:
 In each function signature, you can add information about whether the
 function may keep references to its parameters or return value. Then,
 when you declare a variable, you can say how long you want to use that
 variable. With this information, the compiler can check that you use
 your variables correctly, and can destroy each variable at the right
 time.
 
 To do this, I'll alter the meaning of the scope, in, out and inout
 keywords to create new storage types:

[...]
 I have only given a few definitions here, from which a whole scope
 system can be deduced and implemented. I've given it more thought, but
 this post is long enough for now, so I will let you give me your
 thoughts, and gladly answer your questions about any subtleties that may
 arise, feasibility, etc.

"scope", in its current meaning, should have been the default for all
function arguments. If this was the case, would introducing your scope-scopes
bring any additional benefits? (Let's ignore enforcement for now, and assume
the compiler won't let the scoped variables escape).

There was a thread some time ago on a similar topic:
http://www.digitalmars.com/d/archives/digitalmars/D/learn/Why_I_could_not_cast_string_to_int_32126.html#N32168

Your "scope(out)" seems to be yet another incarnation of uniq/unique
(something that apparently keeps coming up over and over again).

"scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; }";
reusing the "return" keyword to mean "this argument could be returned
directly or indirectly as result". 

artur

Apr 21 2012

travert phare.normalesup.org (Christophe) writes:

Artur Skawina, in message (digitalmars.D:164784), wrote:
 "scope", in its current meaning, should have been the default for all
 function arguments. If this was the case, would introducing your scope-scopes
 bring any additional benefits? (Let's ignore enforcement for now, and assume
 the compiler won't let the scoped variables escape).

Scope in its current meaning is about the same as my scope(in) for 
parameters. However, it is not transitive, and that changes a 
lot of things:

int[] g;
void foo(scope int[] a) { g = a; } // passes: a is not protected at all

scope is pointless here. It cannot even protect an array. Just like 
const/immutable, transitivity is essential.

 Your "scope(out)" seems to be yet another incarnation of uniq/unique 
 (something that apparently keeps coming up over and over again).

 "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; 
 }"; reusing the "return" keyword to mean "this argument could be 
 returned directly or indirectly as result".

I had never heard of uniq or of a return keyword for parameters. I don't
have time to follow the forum most of the time. So basically I just put
those three ideas together, with a new naming convention, and transitivity.
Why have these ideas not gone further?

Well, I could have summarized my long post in one line:
Why is the scope attribute not transitive?

-- 
Christophe

Apr 21 2012

Artur Skawina <art.08.09 gmail.com> writes:

On 04/22/12 02:01, Christophe wrote:
 Artur Skawina, in message (digitalmars.D:164784), wrote:
 "scope", in its current meaning, should have been the default for all
 function arguments. If this was the case, would introducing your scope-scopes
 bring any additional benefits? (Let's ignore enforcement for now, and assume
 the compiler won't let the scoped variables escape).

 
 Scope in its current meaning is about the same as my scope(in) for 
 parameters. However, it is not transitive, and that changes a 
 lot of things:
 
 int[] g;
 void foo(scope int[] a) { g = a; } // passes: a is not protected at all
 
 scope is pointless here. It cannot even protect an array. Just like 
 const/immutable, transitivity is essential.

Yes, there was a reason for my lets-ignore-enforcement-for-now suggestion...
The compiler won't catch the escaping refs right now.

 Your "scope(out)" seems to be yet another incarnation of uniq/unique 
 (something that apparently keeps coming up over and over again).

 "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; 
 }"; reusing the "return" keyword to mean "this argument could be 
 returned directly or indirectly as result".

 
 I had never heard of uniq or of a return keyword for parameters. I don't
 have time to follow the forum most of the time. So basically I just put
 those three ideas together, with a new naming convention, and transitivity.
 Why have these ideas not gone further?
 
 Well, I could have summarized my long post in one line:
 Why is the scope attribute not transitive?

It is, or at least this is how I read "references in the parameter cannot be
escaped".
It just isn't currently enforced.

artur

Apr 21 2012

Michel Fortin <michel.fortin michelf.com> writes:

On 2012-04-21 14:22:41 +0000, travert phare.normalesup.org (Christophe) said:

 This scope system is very similar to the mutable/immutable system. It is
 optional (one may code without it). There is transitivity, a bridge
 type (const or scope(in)), and also the same virality (is this an
 English word??). This means that, to be usable, this system requires
 restricting the usage of the parameters and returned values of functions
 with the appropriate keywords (scope(in), scope(out) or scope(inout));
 otherwise a scoped variable can't be passed to a function and is not
 usable in practice. But in my opinion, the gain is very large. When it
 is used, variable lifetimes become deterministic, the compiler can
 destroy variables at the right time, and the GC is used only when
 necessary, or for global variables.

I fear your solution might not be complicated enough (!) to allow some 
common patterns. One simple case that often challenges such proposals 
is the swap function.

So with your system, how do you write the swap function?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 21 2012

travert phare.normalesup.org (Christophe Travert) writes:

Michel Fortin, in message (digitalmars.D:164824), wrote:
 So with your system, how do you write the swap function?

I've thought about that.
The scope(label) is the key.

void swap(T)(ref scope T a, ref scope(a) T b)
{
  scope(a) tmp = a;
  a = b;
  b = tmp;
}

scope(inout) would also do the trick, since it is implicitly shared
between parameters and return values.
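
For readers wondering why swap is the awkward case, here is a minimal sketch in plain D as it exists today, with no proposed syntax. The names swapSlices, outer, inner and buffer are invented for the illustration. It shows a swap moving a reference to short-lived storage into a longer-lived variable, which is the situation that tying b to a's scope is meant to rule out.

```d
// Plain generic swap in today's D; `ref` makes the exchange visible to the caller.
void swapSlices(T)(ref T a, ref T b)
{
    T tmp = a;
    a = b;
    b = tmp;
}

void main()
{
    int[] outer = [1, 2, 3];          // GC-allocated array
    {
        int[3] buffer = [4, 5, 6];    // stack storage, gone at the closing brace
        int[] inner = buffer[];       // slice of the stack buffer

        swapSlices(outer, inner);

        // outer now aliases the short-lived buffer.
        assert(outer.ptr == buffer.ptr);
        assert(inner == [1, 2, 3]);
    }
    // Using `outer` here would read storage whose lifetime has ended.
}
```

Under the proposal, the call would presumably be rejected, because outer and inner do not share a scope and so cannot both be bound to parameters declared scope and scope(a).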

-- 
Christophe

Apr 21 2012

Michel Fortin <michel.fortin michelf.com> writes:

On 2012-04-22 06:41:46 +0000, travert phare.normalesup.org (Christophe 
Travert) said:

 Michel Fortin, in message (digitalmars.D:164824), wrote:
 So with your system, how do you write the swap function?

 
 I've thought about that.
 The scope(label) is the key.
 
 void swap(T)(ref scope T a, ref scope(a) T b)
 {
   scope(a) tmp = a;
   a = b;
   b = tmp;
 }
 
 scope(inout) would also do the trick, since it is implicitly shared
 between parameters and return values.

Your proposal is very similar to some things that were discussed in 
2008 when escape analysis became the topic of the day on this 
newsgroup. There were two problems for adoption: it makes writing 
functions difficult (because you have to add all that scoping thing to 
your mental model) and implementing new type modifiers is a major 
undertaking that didn't fit with the schedule. While the second problem 
might disappear given enough time, the first one is a hurdle.

You might find this a good read:
<http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>

-- 


Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 22 2012

travert phare.normalesup.org (Christophe) writes:

Michel Fortin, in message (digitalmars.D:164837), wrote:
 You might find this a good read:
 <http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>
 

Thanks, I'll have a look.

Apr 22 2012

travert phare.normalesup.org (Christophe) writes:

Michel Fortin, in message (digitalmars.D:164837), wrote:
 newsgroup. There were two problems for adoption: it makes writing 
 functions difficult (because you have to add all that scoping thing to 
 your mental model) and implementing new type modifiers is a major 
 undertaking that didn't fit with the schedule. While the second problem 
 might disappear given enough time, the first one is a hurdle.

If we choose the following defaults, the hurdle may not be that high:

 -1: function *parameters* and return values are scope by default
(scope(in) and scope(out) in my terminology, although it is enough to
say scope in that case),
 -2: *variables* are dynamic (= noscope = escape) when necessary,
and scope when the compiler can determine that they do not escape the scope.

This way, programmers don't have to worry about a variable's scope, since
variables are dynamic by default. But the performance cost may not be too
high, because most of the time a variable will be treated as scope,
since the functions that use it will take scope parameters by default.

Lazy programmers only have to say when they let an argument escape the 
scope of a function. This is a good thing, because this scheme should be 
avoided, and in any case, this information should be documented.

When a scope has to be shared between parameters and the return value,
stating "inout" would adequately solve 90% of the cases or more. For the
remaining 10% or less, "dynamic" and/or deep copies make it possible to
get rid of the problem, if the programmer is too lazy to use scope(label)
or has no choice.

Finally, programmers requiring efficiency can control the scope of
variables by declaring them explicitly scope. This way, the compiler
will check that they are not dynamic.
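
A sketch of how signatures might look under these defaults, again in the hypothetical syntax of this thread. It is not valid D; the names count, remember and longest are illustrative, and writing "dynamic" as a parameter storage class is an assumption made for the example.

```d
// Hypothetical: assumes the defaults proposed above; not compilable D.
int[] cache;

// Parameter is scope by default: nothing to write in the common case.
size_t count(int[] a)
{
    return a.length;
}

// The programmer only has to say when an argument escapes.
void remember(dynamic int[] a)
{
    cache = a;
}

// inout ties the scope of the result to that of the parameters.
inout int[] longest(inout int[] a, inout int[] b)
{
    return a.length >= b.length ? a : b;
}

void main()
{
    int[] v = [1, 2, 3];      // dynamic unless the compiler proves it does not escape
    scope int[] w = [4, 5];   // explicitly scope: checked never to become dynamic
    auto n = count(w);        // ok: count keeps no reference
    auto l = longest(v, w);   // ok: l is tied to the scope of its arguments
    remember(w);              // error: w would escape through cache
}
```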

This is obviously not backward compatible. However, most of the errors
will occur in function signatures, where the compiler can be made to
provide adequate error messages so that the signature can be corrected
quickly.

I reckon that backward-compatibility issues and implementation time make
it very difficult to consider this before D 3.0, which is not tomorrow. I
would really like this sort of system to be implemented in the
medium term, even if I think it's just a dream.

-- 
Christophe Travert, 4 years too late to discuss the issue.

Apr 23 2012
