digitalmars.D - new DIP38: Safe references and rvalue references without runtime

reply Timothee Cour <thelastmammoth gmail.com> writes:
Abstract

In short, the compiler internally annotates ref-return functions with
ref(i1,...,iN) indicating that the function may return argument j
(j=i1...iN) by reference (possibly via field accesses), where j is
also a ref input argument. This list can be empty, and if the function
is a method or internal function, argument 0 refers to implicit 'this'
parameter. These annotations are used to validate/invalidate ref
return functions that call such a ref return function. These
annotations are also written in the automatically generated di
interface files.

See the DIP38 for more details and examples.
May 06 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 06 May 2013 14:52:23 -0400, Timothee Cour  
<thelastmammoth gmail.com> wrote:

 Abstract

 In short, the compiler internally annotates ref-return functions with
 ref(i1,...,iN) indicating that the function may return argument j
 (j=i1...iN) by reference (possibly via field accesses), where j is
 also a ref input argument. This list can be empty, and if the function
 is a method or internal function, argument 0 refers to implicit 'this'
 parameter. These annotations are used to validate/invalidate ref
 return functions that call such a ref return function. These
 annotations are also written in the automatically generated di
 interface files.

 See the DIP38 for more details and examples.
Link: http://wiki.dlang.org/DIP38

In order for this to work, the compiler must mangle according to the ref specification. Otherwise, an incorrectly synchronized .di file might link code that should otherwise be rejected.

I think this DIP might be too complex for acceptance. My opinion is we should try the runtime check thing. I think there will be very few cases where it is triggered.

-Steve
May 06 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/6/13 2:52 PM, Timothee Cour wrote:
 Abstract

 In short, the compiler internally annotates ref-return functions with
 ref(i1,...,iN) indicating that the function may return argument j
 (j=i1...iN) by reference (possibly via field accesses), where j is
 also a ref input argument. This list can be empty, and if the function
 is a method or internal function, argument 0 refers to implicit 'this'
 parameter. These annotations are used to validate/invalidate ref
 return functions that call such a ref return function. These
 annotations are also written in the automatically generated di
 interface files.

 See the DIP38 for more details and examples.
Knee-jerk reaction: this or any scheme based on internal (non-explicit) annotation for functions without that triggering inter-procedural analysis. The signature of a function should be everything the compiler can use as a basis in analysis.

Andrei
May 06 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/6/13 3:03 PM, Andrei Alexandrescu wrote:
 On 5/6/13 2:52 PM, Timothee Cour wrote:
 Abstract

 In short, the compiler internally annotates ref-return functions with
 ref(i1,...,iN) indicating that the function may return argument j
 (j=i1...iN) by reference (possibly via field accesses), where j is
 also a ref input argument. This list can be empty, and if the function
 is a method or internal function, argument 0 refers to implicit 'this'
 parameter. These annotations are used to validate/invalidate ref
 return functions that call such a ref return function. These
 annotations are also written in the automatically generated di
 interface files.

 See the DIP38 for more details and examples.
Knee-jerk reaction: this or any scheme based on internal (non-explicit) annotation for functions without that triggering inter-procedural analysis.
Awfully put. I meant: this or any scheme based on internal (non-explicit) annotation for functions cannot work without triggering inter-procedural analysis. (I haven't read the DIP yet.)

Andrei
May 06 2013
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
 Awfully put. I meant: this or any scheme based on internal (non-explicit)
 annotation for functions cannot work without triggering inter-procedural
 analysis. (I haven't read the DIP yet.)
The proposed scheme doesn't require any inter-procedural analysis. Each function is summarized by a label indicating its ref dependencies, and a label propagation algorithm is used to label each function. The only tricky case is the (arguably rare) case of mutually recursive ref return functions (corresponding to loops in the graph), for which we can fall back to the runtime check, but for which I also believe we can have a simple compile time solution with a bit more care. I have updated the algorithmic details in the DIP.
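To make the label propagation idea concrete, here is a minimal sketch in D. It is not the DIP's algorithm: the `ReturnSource` model, the `propagate` function and the encoding of call arguments are all hypothetical, chosen only for illustration. Each function is summarized by the parameter indices it may return by reference, and the summaries are recomputed until a full pass changes nothing.

```d
import std.algorithm.searching : canFind;

/// One way a ref-return function can produce its result (hypothetical model).
struct ReturnSource
{
    string callee; // empty: the function returns its own parameter `param`
    size_t param;  // index of the returned parameter; used when `callee` is empty
    long[] args;   // for a call: args[k] is the caller's parameter index passed as
                   // the callee's k-th argument, or -1 for anything else
}

/// Computes, per function, the parameter indices it may return by reference.
/// Labels only grow, so the loop stops once a full pass changes nothing.
size_t[][string] propagate(ReturnSource[][string] bodies)
{
    size_t[][string] labels;
    size_t[] none;
    foreach (name; bodies.byKey)
        labels[name] = none;

    bool changed = true;
    while (changed)
    {
        changed = false;
        foreach (name, sources; bodies)
        {
            foreach (src; sources)
            {
                size_t[] found;
                if (src.callee.length == 0)
                {
                    found ~= src.param;
                }
                else if (auto calleeLabel = src.callee in labels)
                {
                    // Map each parameter the callee may return back to the caller.
                    foreach (k; *calleeLabel)
                        if (k < src.args.length && src.args[k] >= 0)
                            found ~= cast(size_t) src.args[k];
                }

                foreach (p; found)
                {
                    if (!labels[name].canFind(p))
                    {
                        labels[name] ~= p;
                        changed = true;
                    }
                }
            }
        }
    }
    return labels;
}

unittest
{
    ReturnSource[][string] bodies;
    // ref T fooa(ref T t) { return t; }
    bodies["fooa"] = [ReturnSource("", 0, null)];
    // ref T wrap(ref T a, ref T b) { return fooa(b); }
    bodies["wrap"] = [ReturnSource("fooa", 0, [1L])];

    auto labels = propagate(bodies);
    size_t[] first = [0];
    size_t[] second = [1];
    assert(labels["fooa"] == first);  // fooa: ref(0)
    assert(labels["wrap"] == second); // wrap: ref(1), learned from fooa's label
}
```

The sketch can be checked with `dmd -unittest -main -run` on the file. Note that `wrap` can only be labelled after `fooa`'s body has been summarized, which is the dependency on other functions' bodies (or on annotated .di files) that the rest of the thread argues about.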
May 06 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/6/13 6:41 PM, Timothee Cour wrote:
 Awfully put. I meant: this or any scheme based on internal (non-explicit)
 annotation for functions cannot work without triggering inter-procedural
 analysis. (I haven't read the DIP yet.)
The proposed scheme doesn't require any inter-procedural analysis. Each function is summarized by a label indicating its ref dependencies, and a label propagation algorithm is used to label each function.
BAM! Interprocedural analysis. Doesn't matter what name you invent for it. It's weird - you think you're in good shape, walking down the street, and suddenly you're in interprocedural analysis zone.

Andrei
May 06 2013
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
 BAM! Interprocedural analysis. Doesn't matter what name you invent for it.
 It's weird - you think you're in good shape, walking down the street, and
 suddenly you're in interprocedural analysis zone.
OK, call it interprocedural analysis, but what would be your arguments against it, assuming:

A) I can convince (with code) that the propagation algorithm is simple to implement and has negligible compile time overhead
B) it is safer than the proposed approach in release builds (because unittests in debug builds might not catch all such bugs, for example things like min(ref int x, ref int y) which might be correct in some code paths but not all)
C) it is faster in debug builds because there are no runtime checks
May 06 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/6/13 10:56 PM, Timothee Cour wrote:
 BAM! Interprocedural analysis. Doesn't matter what name you invent for it.
 It's weird - you think you're in good shape, walking down the street, and
 suddenly you're in interprocedural analysis zone.
ok, call it interprocedural analysis, but what would be your arguments against it, assuming:
I have no arguments against it other than the usual cautions about interprocedural analysis. The point here is to acknowledge the risks and liabilities.

Andrei
May 06 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/6/2013 11:52 AM, Timothee Cour wrote:
 See the DIP38 for more details and examples.
Handy link: http://wiki.dlang.org/DIP38
May 06 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/6/2013 11:52 AM, Timothee Cour wrote:
 Abstract

 In short, the compiler internally annotates ref-return functions with
 ref(i1,...,iN) indicating that the function may return argument j
 (j=i1...iN) by reference (possibly via field accesses), where j is
 also a ref input argument. This list can be empty, and if the function
 is a method or internal function, argument 0 refers to implicit 'this'
 parameter. These annotations are used to validate/invalidate ref
 return functions that call such a ref return function. These
 annotations are also written in the automatically generated di
 interface files.

 See the DIP38 for more details and examples.
It requires interprocedural analysis. This is possible for the same functions (such as template functions) that can infer pure/nothrow/@safe, but it cannot be done for ordinary functions.
May 06 2013
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
 It requires interprocedural analysis. This is possible for the same
 functions (such as template functions) that can infer pure/nothrow/@safe,
 but it cannot be done for ordinary functions.
Can you please provide me a simple example for which the algorithm proposed in the DIP38 will fail, and that does not involve cycles (as described in the DIP) ?
May 06 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/6/13 11:44 PM, Timothee Cour wrote:
 It requires interprocedural analysis. This is possible for the same
 functions (such as template functions) that can infer pure/nothrow/@safe,
 but it cannot be done for ordinary functions.
Can you please provide me a simple example for which the algorithm proposed in the DIP38 will fail, and that does not involve cycles (as described in the DIP) ?
No. That's not the problem. It may as well work.

When typechecking a function, ALL you have is:

1. the body of that function
2. the signatures of all other functions. NOT adorned with extra info, NO bodies, NO nothing.

You need to make do with that. Everything else explodes into interprocedural analysis. It's as cut and dried as it gets.

Andrei
May 06 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 07 May 2013 00:45:56 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 11:44 PM, Timothee Cour wrote:
 It requires interprocedural analysis. This is possible for the same
 functions (such as template functions) that can infer  
 pure/nothrow/@safe,
 but it cannot be done for ordinary functions.
Can you please provide me a simple example for which the algorithm proposed in the DIP38 will fail, and that does not involve cycles (as described in the DIP) ?
No. That's not the problem. It may as well work. When typechecking a function, ALL you have is: 1. the body of that function 2. the signatures of all other functions. NOT adorned with extra info, NO bodies, NO nothing.
I think the DIP fairly clearly says that either it has the function bodies, or the compiler-generated .di files WITH the extra info added.

-Steve
May 07 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 6 May 2013 at 18:52:36 UTC, Timothee Cour wrote:
 Abstract

 In short, the compiler internally annotates ref-return 
 functions with
 ref(i1,...,iN) indicating that the function may return argument 
 j
 (j=i1...iN) by reference (possibly via field accesses), where j 
 is
 also a ref input argument. This list can be empty, and if the 
 function
 is a method or internal function, argument 0 refers to implicit 
 'this'
 parameter. These annotations are used to validate/invalidate ref
 return functions that call such a ref return function. These
 annotations are also written in the automatically generated di
 interface files.

 See the DIP38 for more details and examples.
I agree with Andrei on that one. It breaks the separate compilation model.
May 06 2013
next sibling parent reply Timothee Cour <thelastmammoth gmail.com> writes:
Ok, I have updated and simplified the DIP38, please take a look. In
the proposed 'manual' scheme A, the user annotates each ref argument of a
ref-return function with either inref or outref (let's postpone the
discussion of what exactly those keywords should be and focus on the
logic instead; see bottom of email for a possibility). In proposed
scheme B, the inref/outref are instead
automatically inferred. Let's focus on scheme A to avoid doing
interprocedural analysis.

Then all we need is to check whether the program typechecks under the
following type conversion rules:

global => outref //global: gc-allocated, static, etc.
output of ref-return function call => outref
outref 'dot' field => outref // field access
local => inref
global => inref
outref => inref
temporary => inref

where A=>B means that a variable of type A can be used in a context
where type B is expected.

see DIP38 for details and examples
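As an illustration only, the conversion rules listed above can be written as a small lookup. This is a sketch, not part of the DIP: the `Kind` and `Param` enums and the `convertible` function are hypothetical names, and field access is not modelled.

```d
/// What an argument expression is (names are hypothetical).
enum Kind { global, refCallResult, outref, inref, local, temporary }

/// What a ref parameter is declared as under scheme A.
enum Param { inref, outref }

/// True if an expression of kind `from` may be bound to a parameter `to`.
bool convertible(Kind from, Param to)
{
    if (to == Param.inref)
        return true; // every category may be passed as inref
    // Only references that outlive the call may be passed as outref.
    return from == Kind.global
        || from == Kind.refCallResult
        || from == Kind.outref;
}

unittest
{
    // Example1: fooa(ref T t), with `ref` meaning outref, called with a local.
    assert(!convertible(Kind.local, Param.outref));
    // Example2: fooa(inref T t) called with the same local.
    assert(convertible(Kind.local, Param.inref));
    // A global may be passed either way.
    assert(convertible(Kind.global, Param.outref));
    assert(convertible(Kind.global, Param.inref));
    // A temporary may only be passed as inref.
    assert(!convertible(Kind.temporary, Param.outref));
}
```

Under these rules the check is purely local: it only needs the kind of each argument expression and the declared inref/outref of each parameter in the callee's signature.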

I also would argue that it makes sense for the user to write
inref/outref instead of ref, indicating explicit intent on whether or
not to escape a ref argument. This is much simpler than rust's named
lifetime annotations as far as I understand, since it's just a binary
choice.

To avoid breaking code, we can assume that 'ref' means 'outref' and
'scope ref' means 'inref'.
Note that this isn't the same as the rejected proposal DIP36, which
did not address escaping issues (it only dealt with non ref return
functions).

Example1:
ref T fooa(ref T t) { return t; }
ref T bar() { T t; return fooa(t); }
currently might compile, but has undefined behavior. Under the new
rules (with ref meaning outref), it would not compile because of the
illegal conversion local => outref when attempting to call fooa(t).

Example2:
ref T fooa(inref T t) { static T t2; return t2; }
ref T bar() { T t; return fooa(t); }
Here this compiles because we've annotated the input argument to fooa
as inref, since fooa returns a global.
May 06 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 7 May 2013 at 06:28:54 UTC, Timothee Cour wrote:
 Then all we need is to check whether the program typechecks 
 under the
 following type conversion rules:

 global => outref //global: gc-allocated, static, etc.
 output of ref-return function call => outref
 outref 'dot' field => outref // field access
 local => inref
 global => inref
 outref => inref
 temporary => inref
Redundant rules should usually be avoided. For instance, global => inref is pointless as global => outref => inref does it.
 Example1:
 ref T fooa(ref T t) { return t; }
 ref T bar() { T t; return fooa(t); }
 currently might compile, but has undefined behavior. Under the 
 new
 rules (with ref meaning outref), it would not compile because 
 of the
 illegal conversion local => outref when attempting to call 
 fooa(t).
This prevents valid code like T t2 = fooa(fooa(t)); in bar.
May 06 2013
parent Timothee Cour <thelastmammoth gmail.com> writes:
 Prevent valid code like T t2 = fooa(fooa(t)); in bar.
Thanks for the counter-example; I think I can still save the proposal with the following conversion rules:

global => outref //global: gc-allocated, static, etc.
outref 'dot' field => outref // field access
ref function(args) where each outref arg of args is an outref expression => outref
ref function(args) where at least one outref arg of args is not an outref expression => local
inref => local
return outref => outref
return local => local // compile error if this is a ref return function

(I've also updated the DIP38). Can you break these rules?
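One way to read the revised rules is as a recursive classification of expressions into outref or local. The sketch below is an interpretation under assumptions, not the DIP's implementation: `Cls`, `Tag`, `Expr`, `classify` and `validRefReturn` are hypothetical names, a call node lists only the arguments bound to outref parameters, and treating a field of a local as local is an assumption the rules do not state.

```d
/// Result categories in the revised rules.
enum Cls { outref, local }

/// Expression forms covered by this sketch (hypothetical names).
enum Tag { global, localVar, inrefParam, outrefParam, field, call }

struct Expr
{
    Tag tag;
    Expr[] operands; // field: [base]; call: arguments bound to outref parameters
}

/// Classifies an expression according to the rules listed above.
Cls classify(const Expr e)
{
    final switch (e.tag)
    {
        case Tag.global:
        case Tag.outrefParam:
            return Cls.outref;
        case Tag.localVar:
        case Tag.inrefParam:
            return Cls.local;
        case Tag.field:
            // Assumption: a field has the same category as its base.
            return classify(e.operands[0]);
        case Tag.call:
            // The call is outref only if every outref argument is outref.
            foreach (arg; e.operands)
                if (classify(arg) != Cls.outref)
                    return Cls.local;
            return Cls.outref;
    }
}

/// A ref-return function may only return an outref expression.
bool validRefReturn(const Expr e)
{
    return classify(e) == Cls.outref;
}

unittest
{
    auto t = Expr(Tag.localVar);
    auto inner = Expr(Tag.call, [t]);      // fooa(t)
    auto nested = Expr(Tag.call, [inner]); // fooa(fooa(t))

    // The nested call is accepted as an expression and classified as local,
    // so `T t2 = fooa(fooa(t));` is fine, but it cannot be returned by ref.
    assert(classify(nested) == Cls.local);
    assert(!validRefReturn(nested));

    // fooa(global) stays outref and may be returned by ref.
    assert(validRefReturn(Expr(Tag.call, [Expr(Tag.global)])));
}
```

In this reading, passing a local to an outref parameter is no longer an error by itself; the error is deferred to the point where the resulting local-classified expression is returned from a ref-return function.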
May 07 2013
prev sibling parent "Namespace" <rswhite4 googlemail.com> writes:
DIP 36 also forgot to mention what lifetime a temporary has,
relating to the rvalue references.
And I do not see it in your DIP either.
And AFAIK 'scope ref' was *generally* rejected. 'auto ref' should 
/ will eventually be the answer someday.
May 06 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, May 06, 2013 23:28:40 Timothee Cour wrote:
 Ok, I have updated and simplified the DIP38, please take a look. In
 the proposed 'manual' scheme A, the user annotates each ref argument of a
 ref-return function with either inref or outref (let's postpone the
 discussion of what exactly those keywords should be and focus on the
 logic instead; see bottom of email for a possibility). In proposed
 scheme B, the inref/outref are instead
 automatically inferred. Let's focus on scheme A to avoid doing
 interprocedural analysis.
I confess that my gut reaction to all of this is that it's just plain simpler to do the runtime check. It won't be needed often and is trivial to disable if you don't want it. And it requires no annotations whatsoever. We're already seriously pushing it with the sheer number of annotations that we have, so I'm very much inclined to argue against adding new ones if we don't really need them.

- Jonathan M Davis
May 07 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 07 May 2013 15:20:43 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Monday, May 06, 2013 23:28:40 Timothee Cour wrote:
 Ok, I have updated and simplified the DIP38, please take a look. In
 the proposed 'manual' scheme A, the user annotates each ref argument of  
 a
 ref-return function with either inref or outref (let's postpone the
 discussion of what exactly those keywords should be and focus on the
 logic instead; see bottom of email for a possibility). In proposed
 scheme B, the inref/outref are instead
 automatically inferred. Let's focus on scheme A to avoid doing
 interprocedural analysis.
I confess that my gut reaction to all of this is that it's just plain simpler to do the runtime check. It won't be needed often and is trivial to disable if you don't want it. And it requires no annotations whatsoever. We're already seriously pushing it with the sheer number of annotations that we have, so I'm very much inclined to argue against adding new ones if we don't really need them.
While I agree here, and there is the cognitive load of understanding the attributes to take into account, I like the proposed mechanism to propagate attributes to the .di file automatically for no-body functions. That in itself would enable attribute inference, as long as developers of closed-source libraries agreed to use that mechanism.

-Steve
May 07 2013