digitalmars.D - An important potential change to the language: transitory ref

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


Walter and I are mulling over an idea that may turn out to be extremely 
interesting for D.

Consider the following desideratum: D is a garbage-collected language, 
but we want to be able to define "completely encapsulated" types. That 
means types that never escape pointers to their internal data. If we 
manage to do so, then the type can manage memory any way it desires.

Consider the case of a simple container, e.g. an array. If we manage to 
convince the array to never "leak" a reference to an element, the array 
is free to deallocate memory as it wishes.

One way to achieve that is by simply having the array type return 
rvalues. That is (stylized code below for a type Array!T) instead of:

ref T opIndex(size_t index);

we'd have

T opIndex(size_t index);

and so on. That way nobody can escape a pointer to stuff pointing into 
the array, so the array is safe and can reallocate memory whenever it 
wishes without fearing that someone has squirreled a pointer to its buffer.

I'm encouraged that Scott Meyers' Effective C++ topic 'Avoid returning 
"handles" to internal data' talks essentially about that.

Of course, the problem is that you now can't change stuff inside the 
array. Also there is an efficiency issue.

So I was thinking of the following: how about still returning a 
reference, but define a rule in the language that references can't 
escape - you can't take their address and squirrel it away; the only 
thing you can do is use them right on the spot or pass them down to 
functions.

Essentially that means: "I'm giving you the address of the object, but 
in a scoped manner - there's nothing you can do to save it beyond the 
current expression."

To clarify, for an Array!T object, you can do:

arr[5] = obj;

but you can't do:

T* p = & arr[5];

The former uses the reference anonymously right there, and the latter is 
verboten because it could potentially escape "p" outside the current 
expression, and an array resize would leave p dangling.
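
A minimal sketch of the hazard described above, under today's rules where taking the address is still allowed. The `Array` type, its `grow` method and the `staleWriteDemo` function are hypothetical names used only for illustration; they are not an actual library API.

```d
import std.stdio : writeln;

// Hypothetical container, loosely modelled on the stylized Array!T above.
struct Array(T)
{
    private T[] buf;

    this(size_t n) { buf = new T[n]; }

    // Hands out a reference into the internal buffer.
    ref T opIndex(size_t i) { return buf[i]; }

    // Moves the contents to a freshly allocated, larger buffer.
    void grow(size_t n)
    {
        assert(n >= buf.length);
        auto fresh = new T[n];
        fresh[0 .. buf.length] = buf[];
        buf = fresh;
    }
}

int staleWriteDemo()
{
    auto arr = Array!int(8);
    arr[5] = 42;        // reference used on the spot: fine under the proposal
    int* p = &arr[5];   // address squirreled away: what the proposal would reject
    arr.grow(16);       // the container switches to a new buffer
    *p = 7;             // lands in the old buffer, not in the array
    return arr[5];      // the array never sees the write
}

void main()
{
    writeln(staleWriteDemo());
}
```

Because the old buffer here is garbage-collected, the stale write is merely lost. If the container released its old block itself, which is the freedom the proposal is after, the same write would go through a dangling pointer.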

Are we hamstringing the language in a way that would disable important 
idioms?


Thanks,

Andrei
Mar 19 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 19 Mar 2010 22:56:59 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Hello,


 Walter and I are mulling over an idea that may turn out to be extremely  
 interesting for D.

 Consider the following desideratum: D is a garbage-collected language,  
 but we want to be able to define "completely encapsulated" types. That  
 means types that never escape pointers to their internal data. If we  
 manage to do so, then the type can manage memory any way it desires.

 Consider the case of a simple container, e.g. an array. If we manage to  
 convince the array to never "leak" a reference to an element, the array  
 is free to deallocate memory as it wishes.

 One way to achieve that is by simply having the array type return  
 rvalues. That is (stylized code below for a type Array!T) instead of:

 ref T opIndex(size_t index);

 we'd have

 T opIndex(size_t index);

 and so on. That way nobody can escape a pointer to stuff pointing into  
 the array, so the array is safe and can reallocate memory whenever it  
 wishes without fearing that someone has squirreled a pointer to its  
 buffer.

 I'm encouraged that Scott Meyers' Effective C++ topic 'Avoid returning  
 "handles" to internal data' talks essentially about that.

 Of course, the problem is that you now can't change stuff inside the  
 array. Also there is an efficiency issue.

 So I was thinking of the following: how about still returning a  
 reference, but define a rule in the language that references can't  
 escape - you can't take their address and squirrel it away; the only  
 thing you can do is use them right on the spot or pass them down to  
 functions.

 Essentially that means: "I'm giving you the address of the object, but  
 in a scoped manner - there's nothing you can do to save it beyond the  
 current expression."

 To clarify, for an Array!T object, you can do:

 arr[5] = obj;

 but you can't do:

 T* p = & arr[5];

 The former uses the reference anonymously right there, and the latter is  
 verboten because it could potentially escape "p" outside the current  
 expression, and an array resize would leave p dangling.

 Are we hamstringing the language in a way that would disable important  
 idioms?

What about returning refs that are ref returns or part of other refs? For example:

ref int foo(ref int x)
{
    return x;
}

ref int bar()
{
    int x;
    return foo(x);
}

The reason I bring this up is because it's exactly what a struct is doing. Basically, the problem is not so much that you cannot squirrel it away, but you can return it out of the stack scope it was allocated on. I don't know if there's a way to fix this without restricting struct members from returning ref items.

For instance, try to find a rule that prevents the above from compiling, but allows the following to compile.

struct S
{
    private int x;
    ref int getX() { return x; }
}

struct T
{
    S s;
    ref int getSX() { return s.x; }
}

-Steve
Mar 19 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/19/2010 10:20 PM, Steven Schveighoffer wrote:
 What about returning refs that are ref returns or part of other refs?
 For example:

 ref int foo(ref int x)
 {
 return x;
 }

 ref int bar()
 {
 int x;
 return foo(x);
 }

 The reason I bring this up is because it's exactly what a struct is
 doing. Basically, the problem is not so much that you cannot squirrel it
 away, but you can return it out of the stack scope it was allocated on.
 I don't know if there's a way to fix this without restricting struct
 members from returning ref items.

I remember you brought up a similar point in a related discussion a couple of years ago. It's a good point, and my current understanding of the matter is that functions that take and return ref could and should be handled conservatively.
 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't define a method to return a reference to a direct member. This is exactly the advice given in Scott's book for C++. (A class can because classes sit on the heap.)

Andrei
Mar 19 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

I don't like the sound of that... What I fear is that the compiler will force people to start using pointers because refs don't cut it. I'm guessing you mean you cannot return ref returns from other functions? That breaks abstraction principles, I should be able to delegate a task to a sub-function.

Perhaps it means you can't return ref returns from other functions if you pass them references to local state. (I've read a paper at some point about a program analysis that stored for each function the "return pattern" - a mini-graph describing the relationship between parameters and result. If it rings a bell to anyone... please chime in.)
 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't define a method to return a reference to a direct member. This is exactly the advice given in Scott's book for C++. (A class can because classes sit on the heap.)

A struct may sit on the heap too.

Yes. For those cases you can always use pointers, which are not subject to the restrictions I envision for ref. It's a very small inconvenience. For example, if you have a linked list struct, you may feel constrained that you can't do:

struct List
{
    List * next;
    List * prepend(List * lst)
    {
        lst.next = &this;
        return lst;
    }
}

In my approach, &this is illegal. And actually for a good reason. This code bombs:

List iForgotThePointer()
{
    List lst;
    lst.prepend(new List);
    return lst;
}

My response to the above issue is two-pronged:

(a) For List a class would be an alternative
(b) To work with pointers to structs use static member functions and pointers instead of methods and references
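
A minimal sketch of what option (b) might look like for the List example, assuming a hypothetical `value` payload field and a hypothetical `countNodes` helper, neither of which appears in the original code:

```d
struct List
{
    int value;      // hypothetical payload, added only so the nodes can be told apart
    List* next;

    // Static member function: there is no implicit 'this', so there is
    // no &this to take. Both nodes arrive as explicit pointers.
    static List* prepend(List* head, List* node)
    {
        node.next = head;
        return node;
    }
}

// Walks the chain and counts the nodes.
size_t countNodes(const(List)* l)
{
    size_t n = 0;
    for (; l !is null; l = l.next)
        ++n;
    return n;
}

void main()
{
    List* lst = new List(1);
    List* lst2 = List.prepend(lst, new List(2));
    assert(lst2.value == 2 && lst2.next is lst);
    assert(countNodes(lst2) == 2);
}
```

Both nodes are created with `new`, so neither pointer refers to a stack variable, which is the property the restriction is meant to make provable.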
 I don't know how else to describe it,
 but it feels like you are applying the solution in the wrong place. I
 understand what you are trying to solve, but your solution may be too
 blunt an instrument.

The goal is worth pursuing, so let's keep on thinking of how to make it work. If D manages to define demonstrably safe encapsulated containers, that would be an absolutely huge win.
 Here's another case:

 struct S
 {
 int*x;
 ref int getX() {return *x;}
 }

 Is x on the heap or not? How do you know? Arrays are just a wrapped
 pointer, so they too could be stack allocated.

struct S
{
    int*x;
    static ref int getX(S * p) { return *p.x; }
}

In an ideal world, if you have your hands on a pointer to a struct, you should be reasonably certain that that lives on the heap. It would be just great if D could guarantee that.
 Consider this:

 void foo(ref int x)
 {
 x++;
 }

 struct S
 {
 int x;
 int y;
 bool xisy;
 ref int getX() {if(xisy) return y; return x;}
 }

 foo(S.x);
 foo(S.getX());

Hm, I assume the two lines refer to an object of type S. The example above would again have to be rewritten in terms of static functions with pointers.
 Another case:

 struct S
 {
 int x;
 ref S opUnary(string op)() if (op == "++") {++x; return this;}
 }

 I feel this should all be possible.

I think opUnary should return void and the compiler should worry about that result being used.
 ------
 counter proposal:

 What about having a new kind of ref that can only be passed up the
 stack, or down only one level if you are the one who initiated it.

 Call it scope ref:

 ref int baz(ref int y)
 {
 return y;
 }
 scope ref int foo(scope ref int x, ref int y)
 {
 //return x; // illegal, we did not make x scope ref
 //return baz(x); // illegal, cannot convert scope ref into ref
 return y; // legal, you can convert a ref parameter into scope ref.
 }

 scope ref int bar()
 {
 int y;
 //return foo(y, y); // illegal, you cannot pass scope refs down the stack more than one level
 }

 At least this leaves ref alone to be used without restrictions that the
 compiler can't prove are necessary. If we find scope ref is the only
 kind of ref we ever use, then maybe we can get rid of scope ref and just
 make ref be the restricted form. Or you could keep scope ref and reserve
 ref for only provable heap-variables.

 Man, it would be nice to have escape analysis...

It sure would, but it quickly gets into the interprocedural tarpit.

Your idea is good, except I don't see why not make ref scoped ref. After all ref is currently not an enabler - it could be missing from the language; pointers are fine. So why not make ref do something actually interesting?

Andrei
Mar 19 2010
next sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

I don't like the sound of that... What I fear is that the compiler will force people to start using pointers because refs don't cut it. I'm guessing you mean you cannot return ref returns from other functions? That breaks abstraction principles, I should be able to delegate a task to a sub-function.

Perhaps it means you can't return ref returns from other functions if you pass them references to local state. (I've read a paper at some point about a program analysis that stored for each function the "return pattern" - a mini-graph describing the relationship between parameters and result. If it rings a bell to anyone... please chime in.)
 For instance, try to find a rule that prevents the above from 
 compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't define a method to return a reference to a direct member. This is exactly the advice given in Scott's book for C++. (A class can because classes sit on the heap.)

A struct may sit on the heap too.

Yes. For those cases you can always use pointers, which are not subject to the restrictions I envision for ref. It's a very small inconvenience. For example, if you have a linked list struct, you may feel constrained that you can't do:

struct List
{
    List * next;
    List * prepend(List * lst)
    {
        lst.next = &this;
        return lst;
    }
}

In my approach, &this is illegal. And actually for a good reason. This code bombs:

List iForgotThePointer()
{
    List lst;
    lst.prepend(new List);
    return lst;
}

My response to the above issue is two-pronged:

(a) For List a class would be an alternative
(b) To work with pointers to structs use static member functions and pointers instead of methods and references

That would prohibit their use in safe D, right? Although I can see the appeal of this idea, it seems very experimental. I fear it might have unexpected consequences. So it would need quite extensive testing. Do we have enough time to do that?
Mar 20 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/20/2010 02:36 AM, Don wrote:
 Andrei Alexandrescu wrote:
 My response to the above issue is two-pronged:

 (a) For List a class would be an alternative
 (b) To work with pointers to structs use static member functions and
 pointers instead of methods and references

That would prohibit their use in safe D, right?

That is correct. Note that things like &this are also forbidden in safe D. Practically safety is not affected one way or another.
 Although I can see the appeal of this idea, it seems very experimental.
 I fear it might have unexpected consequences. So it would need quite
 extensive testing. Do we have enough time to do that?

It is not urgent to introduce this restriction.

Andrei
Mar 20 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if 
 you pass them references to local state.

You realize you're doing that all over the place in std.range when a range wraps another one?

struct WrapperRange(R) {
    R wrappedRange;
    ref ElementType!R front() {
        return wrappedRange.front();
    }
}

How do you plan to implement that in your proposed ref regime? Member functions receive 'this' as a ref argument, so they can't return a ref, can they?

I agree it's a goal worth pursuing, but we sure need a new idiom to replace the one above. I think I see a solution: force the caller to provide a delegate when he wants to access and act on a reference. In the following example, the reference can't escape beyond applyFront's scope, but the caller is still free to do whatever it wants, with the exception that it can't escape the ref outside of the delegate he provided.

struct WrapperRange(R) {
    R wrappedRange;
    void applyFront(void delegate(ref ElementType!R) actor) {
        actor(wrappedRange.front);
    }
}

R range;
// Instead of this:
range.front.callSomeFunc();
// Use this:
range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

Perhaps the compiler could be of some help here, by adding some sugar in a similar way that foreach is transformed to opApply. Could the dot operator automatically do something of the sort?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 20 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-20 07:53:34 -0400, Michel Fortin <michel.fortin michelf.com> said:

 	struct WrapperRange(R) {
 		R wrappedRange;
 		void applyFront(void delegate(ref ElementType!R) actor) {
 			actor(wrappedRange.front);
 		}
 	}
 
 	R range;
 	// Instead of this:
 	range.front.callSomeFunc();
 	// Use this:
 	range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

Hum, this last line doesn't work, I should have written this:

range.applyFront((ref ElementType!R e) { e.callSomeFunc(); });

No template. It could be done with a template, but then it couldn't work for virtual functions.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
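
A runnable sketch of the delegate-based idiom, assuming the wrapped range is a plain `int[]` slice. The `bumpFront` helper is hypothetical, and the `scope` on the delegate parameter is an addition not present in the code above.

```d
import std.range.primitives : ElementType, front;

struct WrapperRange(R)
{
    R wrappedRange;

    // The reference to the front element exists only while 'actor' runs;
    // applyFront itself never returns it to the caller.
    void applyFront(scope void delegate(ref ElementType!R) actor)
    {
        actor(wrappedRange.front);
    }
}

// Hypothetical helper: increments the first element through the wrapper.
int bumpFront(int[] data)
{
    auto range = WrapperRange!(int[])(data);
    range.applyFront(delegate(ref int e) { e += 1; });
    return data[0];
}

void main()
{
    assert(bumpFront([10, 20, 30]) == 11);
}
```

Nothing in current D stops the delegate body from taking `&e` and storing it somewhere, so the idiom only narrows where the reference is visible. The escape rule itself would still have to come from the language.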
Mar 20 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

You realize you're doing that all over the place in std.range when a range wraps another one?

struct WrapperRange(R) {
    R wrappedRange;
    ref ElementType!R front() {
        return wrappedRange.front();
    }
}

How do you plan to implement that in your proposed ref regime? Member functions receive 'this' as a ref argument, so they can't return a ref, can they?

Hmmm, I think returning a ref to a member should be allowed as long as the compiler realizes that ref has the same scope as the struct it originated from.
 I agree it's a goal worth pursuing, but we sure need new idiom to
 replace the one above. I think I see a solution: force the caller to
 provide a delegate when he wants to access and act on a reference. In
 the following example, the reference can't escape beyond applyFront's
 scope, but the caller is still free to do whatever it wants, with the
 exception that it can't escape the ref outside of the delegate he provided.

 struct WrapperRange(R) {
 R wrappedRange;
 void applyFront(void delegate(ref ElementType!R) actor) {
 actor(wrappedRange.front);
 }
 }

 R range;
 // Instead of this:
 range.front.callSomeFunc();
 // Use this:
 range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

 Perhaps the compiler could be of some help here, by adding some sugar in
 a similar way that foreach is transformed to opApply. Could the dot
 operator automatically do something of the sort?

I find that quite alembicated.

Andrei
Mar 20 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-20 12:41:48 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

You realize you're doing that all over the place in std.range when a range wraps another one?

struct WrapperRange(R) {
    R wrappedRange;
    ref ElementType!R front() {
        return wrappedRange.front();
    }
}

How do you plan to implement that in your proposed ref regime? Member functions receive 'this' as a ref argument, so they can't return a ref, can they?

Hmmm, I think returning a ref to a member should be allowed as long as the compiler realizes that ref has the same scope as the struct it originated from.

Ok, so the compiler assumes that the scope of the ref returned by a function is the same as its ref arguments and prevents you from returning a ref to it if one of the ref arguments is a local variable? That could work.

It'd only become a hindrance in the more complex cases where you have multiple ref arguments in a function and no way to tell which one the returned ref comes from. But there's an easy solution for that: add a marker. For instance: a 'ref' argument is acceptable for a return value, and a 'scope ref' argument cannot be used as a return value. ('scope' here exists only as a restriction for 'ref', it is not applicable to anything else.)

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 20 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/20/2010 12:30 PM, Michel Fortin wrote:
 On 2010-03-20 12:41:48 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

You realize you're doing that all over the place in std.range when a range wraps another one?

struct WrapperRange(R) {
    R wrappedRange;
    ref ElementType!R front() {
        return wrappedRange.front();
    }
}

How do you plan to implement that in your proposed ref regime? Member functions receive 'this' as a ref argument, so they can't return a ref, can they?

Hmmm, I think returning a ref to a member should be allowed as long as the compiler realizes that ref has the same scope as the struct it originated from.

Ok, so the compiler assumes that the scope of the ref returned by a function is the same as its ref arguments and prevents you from returning a ref to it if one of the ref arguments is a local variable? That could work. It'd only become a hindrance in the more complex cases where you have multiple ref arguments in a function and no way to tell which one the returned ref comes from.

Yes. Conservatively, the scope of a returned ref is the smallest of all parameters involved, including 'this'. There are two touches that Walter and I discussed:

a) You only need to keep in that set the parameters with .sizeof greater than or equal to the return type;
b) You only need to keep in that set the parameters that contain a direct subobject of the return type.

The rules are a bit odd, but they do the job and most importantly enable important encapsulation mechanisms.

Andrei
Mar 20 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-20 18:24:58 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 That could work. It'd only become a hindrance in the more complex cases
 where you have multiple ref arguments in a function and no way to tell
 which one the returned ref comes from.

Yes. Conservatively, the scope of a returned ref is the smallest of all parameters involved, including 'this'. There are two touches that Walter and I discussed:

a) You only need to keep in that set the parameters with .sizeof greater than or equal to the return type;
b) You only need to keep in that set the parameters that contain a direct subobject of the return type.

The rules are a bit odd, but they do the job and most importantly enable important encapsulation mechanisms.

Not only odd, but hard to predict. The problem with those rules is that if the definition of a struct changes in subtle ways your program might stop compiling. If a struct on Linux is larger than the equivalent struct on Mac OS X and this prevents some code from compiling, what are your options? Refactor your program to use pointers?

If some generic algorithm accepts two arguments of any type but which can only return the first argument, should certain ways of calling it become unusable as soon as you give it two arguments of the same type?

That kind of attempt at cleverness sounds like a lot of annoying edge cases waiting to be found. I think it's bad coupling that if you add a never-used private member to a struct it might break some parts of the program elsewhere where a function attempts to return a reference to something that looks like your new member.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 20 2010
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:

 On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

I don't like the sound of that... What I fear is that the compiler will force people to start using pointers because refs don't cut it. I'm guessing you mean you cannot return ref returns from other functions? That breaks abstraction principles, I should be able to delegate a task to a sub-function.

Perhaps it means you can't return ref returns from other functions if you pass them references to local state. (I've read a paper at some point about a program analysis that stored for each function the "return pattern" - a mini-graph describing the relationship between parameters and result. If it rings a bell to anyone... please chime in.)

This would be full escape analysis :) I agree it is the best solution, but it requires D to have special object files and a special linker.
 
 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't define a method to return a reference to a direct member. This is exactly the advice given in Scott's book for C++. (A class can because classes sit on the heap.)

A struct may sit on the heap too.

Yes. For those cases you can always use pointers, which are not subject to the restrictions I envision for ref. It's a very small inconvenience. For example, if you have a linked list struct, you may feel constrained that you can't do:

struct List
{
    List * next;
    List * prepend(List * lst)
    {
        lst.next = &this;
        return lst;
    }
}

In my approach, &this is illegal. And actually for a good reason. This code bombs:

List iForgotThePointer()
{
    List lst;
    lst.prepend(new List);
    return lst;
}

My response to the above issue is two-pronged:

(a) For List a class would be an alternative

Note, this is impractical if you care about performance (I know, because originally, dcollections used classes for link nodes).
 (b) To work with pointers to structs use static member functions and 
 pointers instead of methods and references

This prevents something like a linked list from being used in safe D. That might be too much of a restriction. Yes, it makes code safer, but it makes safe D unusable.
 The goal is worth pursuing, so let's keep on thinking of how to make it 
 work. If D manages to define demonstrably safe encapsulated containers, 
 that would be an absolutely huge win.

I agree.
 
 Here's another case:

 struct S
 {
 int*x;
 ref int getX() {return *x;}
 }

 Is x on the heap or not? How do you know? Arrays are just a wrapped
 pointer, so they too could be stack allocated.

struct S
{
    int*x;
    static ref int getX(S * p) { return *p.x; }
}

In an ideal world, if you have your hands on a pointer to a struct, you should be reasonably certain that that lives on the heap. It would be just great if D could guarantee that.

again, not in safe D. Anytime pointers enter the mix, safe D is disqualified, no?
 
 Consider this:

 void foo(ref int x)
 {
 x++;
 }

 struct S
 {
 int x;
 int y;
 bool xisy;
 ref int getX() {if(xisy) return y; return x;}
 }

 foo(S.x);
 foo(S.getX());

Hm, I assume the two lines refer to an object of type S. The example above would again have to be rewritten in terms of static functions with pointers.

Yes, I meant to write:

S s;
foo(s.x);
foo(s.getX());
 
 Another case:

 struct S
 {
 int x;
 ref S opUnary(string op)() if (op == "++") {++x; return this;}
 }

 I feel this should all be possible.

I think opUnary should return void and the compiler should worry about that result being used.

Yes, you are probably right. It would be cool if the compiler could automatically do that in all cases where the operator is expected to return a reference to the struct, i.e. +=, -=, etc. That removes one of my biggest concerns.
 
 ------
 counter proposal:

 What about having a new kind of ref that can only be passed up the
 stack, or down only one level if you are the one who initiated it.

 Call it scope ref:

 ref int baz(ref int y)
 {
 return y;
 }
 scope ref int foo(scope ref int x, ref int y)
 {
 //return x; // illegal, we did not make x scope ref
 //return baz(x); // illegal, cannot convert scope ref into ref
 return y; // legal, you can convert a ref parameter into scope ref.
 }

 scope ref int bar()
 {
 int y;
 //return foo(y, y); // illegal, you cannot pass scope refs down the stack more than one level
 }

 At least this leaves ref alone to be used without restrictions that the
 compiler can't prove are necessary. If we find scope ref is the only
 kind of ref we ever use, then maybe we can get rid of scope ref and just
 make ref be the restricted form. Or you could keep scope ref and reserve
 ref for only provable heap-variables.

 Man, it would be nice to have escape analysis...

It sure would, but it quickly gets into the interprocedural tarpit.

It requires some sort of inter-object analysis, like you mentioned at the top.
 
 Your idea is good, except I don't see why not make ref scoped ref. After 
 all ref is currently not an enabler - it could be missing from the 
 language; pointers are fine. So why not make ref do something actually 
 interesting?

The whole point was to run an experiment seeing how many things could be written as scoped ref instead of just ref (leaving ref the way it is today). If they all can with minor adjustments, then we can switch scoped ref to simply ref. If some can't, and those also don't make sense as pointers, then we have a predicament that ref enables some sort of code that isn't enabled by pointers or scoped ref.

My point was, I don't really know off the top of my head all the cases for using ref, and you probably don't either. I don't think a sweeping change like this is good without at least some practical evidence.

-Steve
Mar 20 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/20/2010 07:04 AM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:

 My response to the above issue is two-pronged:

 (a) For List a class would be an alternative

Note, this is impractical if you care about performance (I know, because originally, dcollections used classes for link nodes).

I agree :o).
 (b) To work with pointers to structs use static member functions
 and pointers instead of methods and references

This prevents something like a linked list from being used in safe D. That might be too much of a restriction. Yes, it makes code safer, but it makes safe D unusable.

Actually, no. Walter's vision is to enable certain pointer uses in SafeD. Unchecked pointer arithmetic is disabled, but code like the one below is entirely safe:

List * lst = new List;
List * lst2 = prepend(lst, new List);

In fact, restricting use of ref is beneficial for SafeD because it helps the static proof that any pointer in SafeD points to the garbage-collected heap (as opposed to e.g. dangling stack variables).
 In an ideal world, if you have your hands on a pointer to a struct,
 you should be reasonably certain that that lives on the heap. It
 would be just great if D could guarantee that.

again, not in safe D. Anytime pointers enter the mix, safe D is disqualified, no?

As mentioned above, pointers are still on the table in SafeD. I found that quite interesting myself :o).
 Your idea is good, except I don't see why not make ref scoped ref.
 After all ref is currently not an enabler - it could be missing
 from the language; pointers are fine. So why not make ref do
 something actually interesting?

The whole point was to run an experiment seeing how many things could be written as scoped ref instead of just ref (leaving ref the way it is today). If they all can with minor adjustments, then we can switch scoped ref to simply ref. If some can't, and those also don't make sense as pointers, then we have a predicament that ref enables some sort of code that isn't enabled by pointers or scoped ref. My point was, I don't really know off the top of my head all the cases for using ref, and you probably don't either. I don't think a sweeping change like this is good without at least some practical evidence.

That is very sensible, thanks. Andrei
Mar 20 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-19 22:56:59 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 So I was thinking of the following: how about still returning a 
 reference, but define a rule in the language that references can't 
 escape - you can't take their address and squirrel it away; the only 
 thing you can do is use them right on the spot or pass them down to 
 functions.

Can functions return the reference, or a reference to a part of the original referenced value? It's safe to do this only in some specific contexts, but the compiler already has some difficulty dealing with this. See bug 3925:
<http://d.puremagic.com/issues/show_bug.cgi?id=3925>

Another effect is that it should prevent closures from using that transitory ref value, since a closure can be leaked... unless we add transitory closures. :-)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 19 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/19/2010 10:52 PM, Michel Fortin wrote:
 On 2010-03-19 22:56:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 So I was thinking of the following: how about still returning a
 reference, but define a rule in the language that references can't
 escape - you can't take their address and squirrel it away; the only
 thing you can do is use them right on the spot or pass them down to
 functions.

Can functions return the reference, or a reference to a part of the original referenced value?

I see two possible approaches: (a) disallow that, which I think would be irregular, and (b) conservatively assume that that happens on the call side.

There is one other problem that I haven't mentioned. Consider the code:

void fun(ref int a, int b) { a = b; }
Array!int arr;
arr.resize(5);
fun(arr[4], (arr.resize(10000), 42));

The purpose of restricting ref was to allow Array to entirely encapsulate its memory allocation, i.e. allow Array to use malloc and free without fearing that someone still holds a dangling pointer in there. In the code above, however, evaluation will first take the address of arr[4] and _then_ call arr.resize (which may trigger reallocation). As a consequence, by the time fun sees the ref, it's already dangling. I don't know how to solve this.

Andrei
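The hazard described in this post can be observed without dereferencing a stale reference. The sketch below is only an illustration under stated assumptions: Array here is a hypothetical, non-generic, malloc-backed stand-in for the Array!T of the thread, and its resize always allocates a fresh block so that the move is deterministic (a real container might grow in place). It records the address of element 4 before and after a resize and compares the two.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical fully encapsulated int array (not the thread's Array!T).
struct Array
{
    private int* buf;
    private size_t len;

    @disable this(this);      // no copies, so the buffer has one owner
    ~this() { free(buf); }    // free(null) is a harmless no-op

    // Always moves to a freshly allocated block, then frees the old one.
    void resize(size_t n)
    {
        auto fresh = cast(int*) malloc(n * int.sizeof);
        assert(fresh !is null);
        fresh[0 .. n] = 0;
        immutable keep = len < n ? len : n;
        if (keep) memcpy(fresh, buf, keep * int.sizeof);
        free(buf);
        buf = fresh;
        len = n;
    }

    ref int opIndex(size_t i)
    {
        assert(i < len);
        return buf[i];
    }
}

void main()
{
    Array arr;
    arr.resize(5);
    // Address that a `ref int` argument bound to arr[4] would carry.
    immutable before = cast(size_t) &arr[4];
    arr.resize(10_000);
    immutable after = cast(size_t) &arr[4];
    // The old block was still live when the new one was allocated,
    // so the element has necessarily moved.
    assert(before != after);
}
```

If fun(arr[4], (arr.resize(10000), 42)) binds its first argument before evaluating the second, the reference it receives corresponds to the before address, which by then points into freed memory.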
Mar 19 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 03/19/2010 10:20 PM, Steven Schveighoffer wrote:
 What about returning refs that are ref returns or part of other refs?
 For example:

 ref int foo(ref int x)
 {
 return x;
 }

 ref int bar()
 {
 int x;
 return foo(x);
 }

 The reason I bring this up is because it's exactly what a struct is
 doing. Basically, the problem is not so much that you cannot squirrel it
 away, but you can return it out of the stack scope it was allocated on.
 I don't know if there's a way to fix this without restricting struct
 members from returning ref items.

I remember you brought up a similar point in a related discussion a couple of years ago. It's a good point, and my current understanding of the matter is that functions that take and return ref could and should be handled conservatively.

I don't like the sound of that... What I fear is that the compiler will force people to start using pointers because refs don't cut it. I'm guessing you mean you cannot return ref returns from other functions? That breaks abstraction principles, I should be able to delegate a task to a sub-function.
 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't define a method to return a reference to a direct member. This is exactly the advice given in Scott's book for C++. (A class can because classes sit on the heap.)

A struct may sit on the heap too. I don't know how else to describe it, but it feels like you are applying the solution in the wrong place. I understand what you are trying to solve, but your solution may be too blunt an instrument.

Here's another case:

struct S
{
    int*x;
    ref int getX() {return *x;}
}

Is x on the heap or not? How do you know? Arrays are just a wrapped pointer, so they too could be stack allocated.

Consider this:

void foo(ref int x)
{
    x++;
}

struct S
{
    int x;
    int y;
    bool xisy;
    ref int getX() {if(xisy) return y; return x;}
}

foo(S.x);
foo(S.getX());

Another case:

struct S
{
    int x;
    ref S opUnary(string op)() if (op == "++") {++x; return this;}
}

I feel this should all be possible.

------

counter proposal:

What about having a new kind of ref that can only be passed up the stack, or down only one level if you are the one who initiated it? Call it scope ref:

ref int baz(ref int y)
{
    return y;
}

scope ref int foo(scope ref int x, ref int y)
{
    //return x; // illegal, we did not make x scope ref
    //return baz(x); // illegal, cannot convert scope ref into ref
    return y; // legal, you can convert a ref parameter into scope ref.
}

scope ref int bar()
{
    int y;
    //return foo(y, y); //illegal, you cannot pass scope refs down the stack more than one level
}

At least this leaves ref alone to be used without restrictions that the compiler can't prove are necessary. If we find scope ref is the only kind of ref we ever use, then maybe we can get rid of scope ref and just make ref be the restricted form. Or you could keep scope ref and reserve ref for only provable heap-variables.

Man, it would be nice to have escape analysis...

-Steve
Mar 19 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
I don't comment on this topic because I am not expert enough yet to see its
possible consequences.

Regarding D2 development, one of the original design goals of D is to be a not
revolutionary language, but to take what's already known as reliable and useful
from other languages. But lately D has become an experiment: it contains many
experimental features that are new, nearly untested in real programs. Their
semantics can be sound, but we can't be certain yet, so some of those designs
may need to be improved on a semantic level too. And some of them are not even
fully implemented.

I have used D2 for the last few weeks, and I can say that currently the D2
compiler is so full of bugs, rough edges, or not fully implemented features
that in my opinion it's nearly unusable. I have found a new bug every 10 lines
of code or so (my code is not normal code, I know). When the book is out people
will start looking for a compiler too, so I think it's better to offer them
something that works, or they will lose interest quickly, and then it will be
harder to call them back to give a second look/chance at/to the language.

So my suggestion is to focus on removing bugs, performing small local
improvements, to smooth the semantic rough edges, etc. I have listed here less
than fifteen small things that I've added to bugzilla, that I think can be
improved. They are not real bugs, but they are not large new features, they are
usually little local things that smooth corners.

Bye,
bearophile
Mar 20 2010
parent dolive <dolive89 sina.com> writes:
bearophile Wrote:

 I don't comment on this topic because I am not expert enough yet to see its
possible consequences.
 
 Regarding D2 development, one of the original design goals of D is to be a not
revolutionary language, but to take what's already known as reliable and useful
from other languages. But lately D has become an experiment: it contains many
experimental features that are new, nearly untested in real programs. Their
semantics can be sound, but we can't be certain yet, so some of those designs
may need to be improved on a semantic level too. And some of them are not even
fully implemented.
 
 I have used D2 for the last few weeks, and I can say that currently the D2
compiler is so full of bugs, rough edges, or not fully implemented features
that in my opinion it's nearly unusable. I have found a new bug every 10 lines
of code or so (my code is not normal code, I know). When the book is out people
will start looking for a compiler too, so I think it's better to offer them
something that works, or they will lose interest quickly, and then it will be
harder to call them back to give a second look/chance at/to the language.
 
 So my suggestion is to focus on removing bugs, performing small local
improvements, to smooth the semantic rough edges, etc. I have listed here less
than fifteen small things that I've added to bugzilla, that I think can be
improved. They are not real bugs, but they are not large new features, they are
usually little local things that smooth corners.
 
 Bye,
 bearophile

Entirely correct! I support this! Fixing bugs and perfecting the small features is the priority! Thank you, grandmaster Andrei!

dolive
Mar 20 2010