digitalmars.D - An important potential change to the language: transitory ref

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Hello,


Walter and I are mulling over an idea that may turn out to be extremely 
interesting for D.

Consider the following desideratum: D is a garbage-collected language, 
but we want to be able to define "completely encapsulated" types. That 
means types that never escape pointers to their internal data. If we 
manage to do so, then the type can manage memory any way it desires.

Consider the case of a simple container, e.g. an array. If we manage to 
convince the array to never "leak" a reference to an element, the array 
is free to deallocate memory as it wishes.

One way to achieve that is by simply having the array type return 
rvalues. That is (stylized code below for a type Array!T) instead of:

ref T opIndex(size_t index);

we'd have

T opIndex(size_t index);

and so on. That way nobody can escape a pointer to stuff pointing into 
the array, so the array is safe and can reallocate memory whenever it 
wishes without fearing that someone has squirreled a pointer to its buffer.

I'm encouraged that Scott Meyers' Effective C++ topic 'Avoid returning 
"handles" to internal data' talks essentially about that.

Of course, the problem is that you now can't change stuff inside the 
array. Also there is an efficiency issue.
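
A minimal sketch of that trade-off, assuming a hypothetical container (the names ValueArray, Point and demo are illustrative, not from the post): when opIndex returns an rvalue, an element can only be changed by copying it out and writing it back.

```d
struct Point { int x, y; }

// Hypothetical container whose indexing hands out copies only.
struct ValueArray
{
    private Point[] items;

    this(size_t n) { items = new Point[n]; }

    // Returns an rvalue: callers never see the internal storage.
    Point opIndex(size_t i) { return items[i]; }

    // Updates must go through an explicit write-back.
    void opIndexAssign(Point p, size_t i) { items[i] = p; }
}

int demo()
{
    auto arr = ValueArray(4);

    Point tmp = arr[2];     // copy out
    tmp.x = 7;              // changes only the copy
    assert(arr[2].x == 0);  // the stored element is untouched

    arr[2] = tmp;           // copy back in via opIndexAssign
    return arr[2].x;
}

void main()
{
    assert(demo() == 7);
}
```

Every update costs two copies of the element, which is the efficiency concern for large element types.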

So I was thinking of the following: how about still returning a 
reference, but define a rule in the language that references can't 
escape - you can't take their address and squirrel it away; the only 
thing you can do is use them right on the spot or pass them down to 
functions.

Essentially that means: "I'm giving you the address of the object, but 
in a scoped manner - there's nothing you can do to save it beyond the 
current expression."

To clarify, for an Array!T object, you can do:

arr[5] = obj;

but you can't do:

T* p = & arr[5];

The former uses the reference anonymously right there, and the latter is 
verboten because it could potentially escape "p" outside the current 
expression, and an array resize would leave p dangling.
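
The hazard can be sketched as follows, assuming a hypothetical malloc-backed Array!T (the member names, grow and demo are illustrative assumptions, not the actual design under discussion):

```d
import core.stdc.stdlib : free, malloc, realloc;

// Hypothetical container that manages its own memory, for illustration only.
struct Array(T)
{
    private T* data;
    private size_t len;

    this(size_t n)
    {
        data = cast(T*) malloc(n * T.sizeof);
        assert(data !is null);
        data[0 .. n] = T.init;
        len = n;
    }

    @disable this(this);    // no copies, so the buffer has a single owner

    ~this() { free(data); }

    // Returns a reference into the internal buffer.
    ref T opIndex(size_t i)
    {
        assert(i < len);
        return data[i];
    }

    // Growing may move the buffer to a new address.
    void grow(size_t n)
    {
        assert(n >= len);
        data = cast(T*) realloc(data, n * T.sizeof);
        assert(data !is null);
        data[len .. n] = T.init;
        len = n;
    }
}

int demo()
{
    auto arr = Array!int(8);
    arr[5] = 42;            // reference used on the spot: fine

    int* p = &arr[5];       // accepted today; the proposal would reject this
    arr.grow(1 << 16);      // the buffer may have moved, so p may dangle
    // *p = 0;              // undefined behavior if the buffer moved

    return arr[5];          // re-indexing reaches the element safely
}

void main()
{
    assert(demo() == 42);
}
```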

Are we hamstringing the language in a way that would disable important 
idioms?


Thanks,

Andrei

Mar 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 19 Mar 2010 22:56:59 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Hello,


 Walter and I are mulling over an idea that may turn out to be extremely  
 interesting for D.

 Consider the following desideratum: D is a garbage-collected language,  
 but we want to be able to define "completely encapsulated" types. That  
 means types that never escape pointers to their internal data. If we  
 manage to do so, then the type can manage memory any way it desires.

 Consider the case of a simple container, e.g. an array. If we manage to  
 convince the array to never "leak" a reference to an element, the array  
 is free to deallocate memory as it wishes.

 One way to achieve that is by simply having the array type return  
 rvalues. That is (stylized code below for a type Array!T) instead of:

 ref T opIndex(size_t index);

 we'd have

 T opIndex(size_t index);

 and so on. That way nobody can escape a pointer to stuff pointing into  
 the array, so the array is safe and can reallocate memory whenever it  
 wishes without fearing that someone has squirreled a pointer to its  
 buffer.

 I'm encouraged that Scott Meyers' Effective C++ topic 'Avoid returning  
 "handles" to internal data' talks essentially about that.

 Of course, the problem is that you now can't change stuff inside the  
 array. Also there is an efficiency issue.

 So I was thinking of the following: how about still returning a  
 reference, but define a rule in the language that references can't  
 escape - you can't take their address and squirrel it away; the only  
 thing you can do is use them right on the spot or pass them down to  
 functions.

 Essentially that means: "I'm giving you the address of the object, but  
 in a scoped manner - there's nothing you can do to save it beyond the  
 current expression."

 To clarify, for an Array!T object, you can do:

 arr[5] = obj;

 but you can't do:

 T* p = & arr[5];

 The former uses the reference anonymously right there, and the latter is  
 verboten because it could potentially escape "p" outside the current  
 expression, and an array resize would leave p dangling.

 Are we hamstringing the language in a way that would disable important  
 idioms?

What about returning refs that are ref returns or part of other refs?  For  
example:

ref int foo(ref int x)
{
    return x;
}

ref int bar()
{
   int x;
   return foo(x);
}

The reason I bring this up is because it's exactly what a struct is  
doing.  Basically, the problem is not so much that you cannot squirrel it  
away, but you can return it out of the stack scope it was allocated on.  I  
don't know if there's a way to fix this without restricting struct members  
 from returning ref items.

For instance, try to find a rule that prevents the above from compiling,  
but allows the following to compile.

struct S
{
    private int x;
    ref int getX() { return x;}
}

struct T
{
   S s;
   ref int getSX() { return s.x; }
}

-Steve

Mar 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/19/2010 10:20 PM, Steven Schveighoffer wrote:
 What about returning refs that are ref returns or part of other refs?
 For example:

 ref int foo(ref int x)
 {
 return x;
 }

 ref int bar()
 {
 int x;
 return foo(x);
 }

 The reason I bring this up is because it's exactly what a struct is
 doing. Basically, the problem is not so much that you cannot squirrel it
 away, but you can return it out of the stack scope it was allocated on.
 I don't know if there's a way to fix this without restricting struct
 members from returning ref items.

I remember you brought up a similar point in a related discussion a 
couple of years ago. It's a good point, and my current understanding of 
the matter is that functions that take and return ref could and should 
be handled conservatively.

 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

In the approach discussed with Walter, S is illegal. A struct can't 
define a method to return a reference to a direct member. This is 
exactly the advice given in Scott's book for C++. (A class can because 
classes sit on the heap.)

Andrei

Mar 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 03/19/2010 10:20 PM, Steven Schveighoffer wrote:
 What about returning refs that are ref returns or part of other refs?
 For example:

 ref int foo(ref int x)
 {
 return x;
 }

 ref int bar()
 {
 int x;
 return foo(x);
 }

 The reason I bring this up is because it's exactly what a struct is
 doing. Basically, the problem is not so much that you cannot squirrel it
 away, but you can return it out of the stack scope it was allocated on.
 I don't know if there's a way to fix this without restricting struct
 members from returning ref items.

 I remember you brought up a similar point in a related discussion a  
 couple of years ago. It's a good point, and my current understanding of  
 the matter is that functions that take and return ref could and should  
 be handled conservatively.

I don't like the sound of that...  What I fear is that the compiler will  
force people to start using pointers because refs don't cut it.  I'm  
guessing you mean you cannot return ref returns from other functions?   
That breaks abstraction principles, I should be able to delegate a task to  
a sub-function.

 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

 In the approach discussed with Walter, S is illegal. A struct can't  
 define a method to return a reference to a direct member. This is  
 exactly the advice given in Scott's book for C++. (A class can because  
 classes sit on the heap.)

A struct may sit on the heap too.  I don't know how else to describe it,  
but it feels like you are applying the solution in the wrong place.  I  
understand what you are trying to solve, but your solution may be too  
blunt an instrument.

Here's another case:

struct S
{
   int*x;
   ref int getX() {return *x;}
}

Is x on the heap or not?  How do you know?  Arrays are just a wrapped  
pointer, so they too could be stack allocated.

Consider this:

void foo(ref int x)
{
   x++;
}

struct S
{
    int x;
    int y;
    bool xisy;
    ref int getX() {if(xisy) return y; return x;}
}

foo(S.x);
foo(S.getX());

Another case:

struct S
{
    int x;
    ref S opUnary(string op)() if (op == "++") {++x; return this;}
}

I feel this should all be possible.


------
counter proposal:

What about having a new kind of ref that can only be passed up the stack,
or down only one level if you are the one who initiated it?

Call it scope ref:

ref int baz(ref int y)
{
    return y;
}

scope ref int foo(scope ref int x, ref int y)
{
    //return x; // illegal, we did not make x scope ref
    //return baz(x); // illegal, cannot convert scope ref into ref
    return y; // legal, you can convert a ref parameter into scope ref.
}

scope ref int bar()
{
    int y;
    //return foo(y, y); // illegal, you cannot pass scope refs down the
    //                  // stack more than one level
}

At least this leaves ref alone to be used without restrictions that the  
compiler can't prove are necessary.  If we find scope ref is the only kind  
of ref we ever use, then maybe we can get rid of scope ref and just make  
ref be the restricted form.  Or you could keep scope ref and reserve ref  
for only provable heap-variables.

Man, it would be nice to have escape analysis...

-Steve

Mar 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

 I don't like the sound of that... What I fear is that the compiler will
 force people to start using pointers because refs don't cut it. I'm
 guessing you mean you cannot return ref returns from other functions?
 That breaks abstraction principles, I should be able to delegate a task
 to a sub-function.

Perhaps it means you can't return ref returns from other functions if 
you pass them references to local state.

(I've read a paper at some point about a program analysis that stored 
for each function the "return pattern" - a mini-graph describing the 
relationship between parameters and result. If it rings a bell to 
anyone... please chime in.)

 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

 In the approach discussed with Walter, S is illegal. A struct can't
 define a method to return a reference to a direct member. This is
 exactly the advice given in Scott's book for C++. (A class can because
 classes sit on the heap.)

 A struct may sit on the heap too.

Yes. For those cases you can always use pointers, which are not subject 
to the restrictions I envision for ref.

It's a very small inconvenience. For example, if you have a linked list 
struct, you may feel constrained that you can't do:

struct List {
     List * next;
     List * prepend(List * lst) {
         lst.next = &this;
         return lst;
     }
}

In my approach, &this is illegal. And actually for a good reason. This 
code bombs:

List iForgotThePointer() {
     List lst;
     lst.prepend(new List);
     return lst;
}

My response to the above issue is two-pronged:

(a) For List a class would be an alternative
(b) To work with pointers to structs use static member functions and 
pointers instead of methods and references
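
A minimal sketch of option (b), assuming an illustrative value field and a hypothetical two-argument static prepend (neither is in the original List):

```d
struct List
{
    int value;
    List* next;

    // Static: operates on two pointers and never takes the address of `this`.
    static List* prepend(List* head, List* node)
    {
        node.next = head;
        return node;
    }
}

int demo()
{
    List* head = new List;
    head.value = 1;

    List* node = new List;
    node.value = 2;

    head = List.prepend(head, node);
    assert(head.next.next is null);
    return head.value * 10 + head.next.value;   // node 2 first, then node 1
}

void main()
{
    assert(demo() == 21);
}
```

Because both nodes are reached only through pointers obtained from new, no link can end up pointing at a stack-allocated List.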

 I don't know how else to describe it,
 but it feels like you are applying the solution in the wrong place. I
 understand what you are trying to solve, but your solution may be too
 blunt an instrument.

The goal is worth pursuing, so let's keep on thinking of how to make it 
work. If D manages to define demonstrably safe encapsulated containers, 
that would be an absolutely huge win.

 Here's another case:

 struct S
 {
 int*x;
 ref int getX() {return *x;}
 }

 Is x on the heap or not? How do you know? Arrays are just a wrapped
 pointer, so they too could be stack allocated.

struct S
{
     int*x;
     static ref int getX(S * p) {return *p.x;}
}

In an ideal world, if you have your hands on a pointer to a struct, you 
should be reasonably certain that that lives on the heap. It would be 
just great if D could guarantee that.

 Consider this:

 void foo(ref int x)
 {
 x++;
 }

 struct S
 {
 int x;
 int y;
 bool xisy;
 ref int getX() {if(xisy) return y; return x;}
 }

 foo(S.x);
 foo(S.getX());

Hm, I assume the two lines refer to an object of type S. The example 
above would again have to be rewritten in terms of static functions with 
pointers.

 Another case:

 struct S
 {
 int x;
 ref S opUnary(string op)() if (op == "++") {++x; return this;}
 }

 I feel this should all be possible.

I think opUnary should return void and the compiler should worry about 
that result being used.

 ------
 counter proposal:

 What about having a new kind of ref that can only be passed up the
 stack, or down only one level if you are the one who initiated it.

 Call it scope ref:

 ref int baz(ref int y)
 {
 return y;
 }
 scope ref int foo(scope ref int x, ref int y)
 {
 //return x; // illegal, we did not make x scope ref
 //return baz(x); // illegal, cannot convert scope ref into ref
 return y; // legal, you can convert a ref parameter into scope ref.
 }

 scope ref int bar()
 {
 int y;
 //return foo(y, y); // illegal, you cannot pass scope refs down the stack
 // more than one level
 }

 At least this leaves ref alone to be used without restrictions that the
 compiler can't prove are necessary. If we find scope ref is the only
 kind of ref we ever use, then maybe we can get rid of scope ref and just
 make ref be the restricted form. Or you could keep scope ref and reserve
 ref for only provable heap-variables.

 Man, it would be nice to have escape analysis...

It sure would, but it quickly gets into the interprocedural tarpit.

Your idea is good, except I don't see why not make ref scoped ref. After 
all ref is currently not an enabler - it could be missing from the 
language; pointers are fine. So why not make ref do something actually 
interesting?


Andrei

Mar 19 2010

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

 I don't like the sound of that... What I fear is that the compiler will
 force people to start using pointers because refs don't cut it. I'm
 guessing you mean you cannot return ref returns from other functions?
 That breaks abstraction principles, I should be able to delegate a task
 to a sub-function.

 
 Perhaps it means you can't return ref returns from other functions if 
 you pass them references to local state.
 
 (I've read a paper at some point about a program analysis that stored 
 for each function the "return pattern" - a mini-graph describing the 
 relationship between parameters and result. If it rings a bell to 
 anyone... please chime in.)
 
 For instance, try to find a rule that prevents the above from 
 compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

 In the approach discussed with Walter, S is illegal. A struct can't
 define a method to return a reference to a direct member. This is
 exactly the advice given in Scott's book for C++. (A class can because
 classes sit on the heap.)

 A struct may sit on the heap too.

 
 Yes. For those cases you can always use pointers, which are not subject 
 to the restrictions I envision for ref.
 
 It's a very small inconvenience. For example, if you have a linked list 
 struct, you may feel constrained that you can't do:
 
 struct List {
     List * next;
     List * prepend(List * lst) {
         lst.next = &this;
         return lst;
     }
 }
 
 In my approach, &this is illegal. And actually for a good reason. This 
 code bombs:
 
 List iForgotThePointer() {
     List lst;
     lst.prepend(new List);
     return lst;
 }
 
 My response to the above issue is two-pronged:
 
 (a) For List a class would be an alternative
 (b) To work with pointers to structs use static member functions and 
 pointers instead of methods and references

That would prohibit their use in safe D, right?

Although I can see the appeal of this idea, it seems very experimental. 
I fear it might have unexpected consequences. So it would need quite 
extensive testing. Do we have enough time to do that?

Mar 20 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/20/2010 02:36 AM, Don wrote:
 Andrei Alexandrescu wrote:
 My response to the above issue is two-pronged:

 (a) For List a class would be an alternative
 (b) To work with pointers to structs use static member functions and
 pointers instead of methods and references

 That would prohibit their use in safe D, right?

That is correct. Note that things like &this are also forbidden in safe 
D. Practically safety is not affected one way or another.

 Although I can see the appeal of this idea, it seems very experimental.
 I fear it might have unexpected consequences. So it would need quite
 extensive testing. Do we have enough time to do that?

It is not urgent to introduce this restriction.


Andrei

Mar 20 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if 
 you pass them references to local state.

You realize you're doing that all over the place in std.range when a 
range wraps another one?

	struct WrapperRange(R) {
		R wrappedRange;
		ref ElementType!R front() {
			return wrappedRange.front();
		}
	}

How do you plan to implement that in your proposed ref regime? Member
functions receive 'this' as a ref argument, so they can't return a ref,
can they?

I agree it's a goal worth pursuing, but we sure need a new idiom to
replace the one above. I think I see a solution: force the caller to
provide a delegate when he wants to access and act on a reference. In 
the following example, the reference can't escape beyond applyFront's 
scope, but the caller is still free to do whatever it wants, with the 
exception that it can't escape the ref outside of the delegate he 
provided.

	struct WrapperRange(R) {
		R wrappedRange;
		void applyFront(void delegate(ref ElementType!R) actor) {
			actor(wrappedRange.front);
		}
	}

	R range;
	// Instead of this:
	range.front.callSomeFunc();
	// Use this:
	range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

Perhaps the compiler could be of some help here, by adding some sugar 
in a similar way that foreach is transformed to opApply. Could the dot 
operator automatically do something of the sort?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 20 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-03-20 07:53:34 -0400, Michel Fortin <michel.fortin michelf.com> said:

 	struct WrapperRange(R) {
 		R wrappedRange;
 		void applyFront(void delegate(ref ElementType!R) actor) {
 			actor(wrappedRange.front);
 		}
 	}
 
 	R range;
 	// Instead of this:
 	range.front.callSomeFunc();
 	// Use this:
 	range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

Hum, this last line doesn't work, I should have written this:

	range.applyFront((ref ElementType!R e) { e.callSomeFunc(); });

No template. It could be done with a template, but then it couldn't 
work for virtual functions.
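
A self-contained sketch of the non-template form, assuming a hypothetical inner range SliceRange over an int slice (SliceRange and demo are illustrative names):

```d
import std.range.primitives : ElementType;

// Hypothetical inner range over a slice; front returns by reference.
struct SliceRange
{
    int[] data;
    @property ref int front() { return data[0]; }
}

struct WrapperRange(R)
{
    R wrappedRange;

    // The element reference is handed to actor and never returned.
    void applyFront(void delegate(ref ElementType!R) actor)
    {
        actor(wrappedRange.front);
    }
}

int demo()
{
    auto range = WrapperRange!SliceRange(SliceRange([1, 2, 3]));
    int seen;

    // The delegate can read and modify the element in place.
    range.applyFront((ref int e) { seen = e; e += 10; });

    return seen * 100 + range.wrappedRange.data[0];   // 1 * 100 + 11
}

void main()
{
    assert(demo() == 111);
}
```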

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 20 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

 You realize you're doing that all over the place in std.range when a
 range wraps another one?

 struct WrapperRange(R) {
 R wrappedRange;
 ref ElementType!R front() {
 return wrappedRange.front();
 }
 }

 How do you plan to implement that in your proposed ref regime? Member
 functions receive 'this' as a ref argument, so they can't return a ref,
 can they?

Hmmm, I think returning a ref to a member should be allowed as long as 
the compiler realizes that ref has the same scope as the struct it 
originated from.

 I agree it's a goal worth pursuing, but we sure need a new idiom to
 replace the one above. I think I see a solution: force the caller to
 provide a delegate when he wants to access and act on a reference. In
 the following example, the reference can't escape beyond applyFront's
 scope, but the caller is still free to do whatever it wants, with the
 exception that it can't escape the ref outside of the delegate he provided.

 struct WrapperRange(R) {
 R wrappedRange;
 void applyFront(void delegate(ref ElementType!R) actor) {
 actor(wrappedRange.front);
 }
 }

 R range;
 // Instead of this:
 range.front.callSomeFunc();
 // Use this:
 range.applyFront!((ref ElementType!R e) { e.callSomeFunc(); })();

 Perhaps the compiler could be of some help here, by adding some sugar in
 a similar way that foreach is transformed to opApply. Could the dot
 operator automatically do something of the sort?

I find that quite alembicated.


Andrei

Mar 20 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-03-20 12:41:48 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

 
 You realize you're doing that all over the place in std.range when a
 range wraps another one?
 
 struct WrapperRange(R) {
 R wrappedRange;
 ref ElementType!R front() {
 return wrappedRange.front();
 }
 }
 
 How do you plan to implement that in your proposed ref regime? Member
 functions receive 'this' as a ref argument, so they can't return a ref,
 can they?

 
 Hmmm, I think returning a ref to a member should be allowed as long as 
 the compiler realizes that ref has the same scope as the struct it 
 originated from.

Ok, so the compiler assumes that the scope of the ref returned by a
function is the same as that of its ref arguments, and prevents you from
returning that ref if one of the ref arguments is a local variable?

That could work. It'd only become a hindrance in the more complex cases 
where you have multiple ref arguments in a function and no way to tell 
which one the returned ref comes from. But there's an easy solution for 
that: add a marker. For instance: a 'ref' argument is acceptable for a 
return value, and 'scope ref' argument cannot be used as a return 
value. ('scope' here exists only as a restriction for 'ref', it is not 
applicable to anything else.)
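
A runnable sketch of the marker idea, under one assumption: D compilers released after this thread accept a 'return ref' parameter annotation, used here to mark the one parameter the result may refer to. That is the inverse spelling of the 'ref' / 'scope ref' pair suggested above. The names first and demo are hypothetical.

```d
// `return ref`: the result may refer to this parameter.
// Plain `ref`: the parameter may be used but is not the source of the result.
ref int first(return ref int a, ref int b)
{
    b += 1;
    return a;
}

int demo()
{
    int x = 10, y = 20;

    // The returned reference is used on the spot, while x is still in scope.
    first(x, y) += 5;

    return x * 100 + y;   // 15 * 100 + 21
}

void main()
{
    assert(demo() == 1521);
}
```

With the source parameter marked in the signature, a caller that passes a local as b no longer has to be treated as if the result might refer to it.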


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 20 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/20/2010 12:30 PM, Michel Fortin wrote:
 On 2010-03-20 12:41:48 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 03/20/2010 06:53 AM, Michel Fortin wrote:
 On 2010-03-20 02:53:50 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Perhaps it means you can't return ref returns from other functions if
 you pass them references to local state.

 You realize you're doing that all over the place in std.range when a
 range wraps another one?

 struct WrapperRange(R) {
 R wrappedRange;
 ref ElementType!R front() {
 return wrappedRange.front();
 }
 }

 How do you plan to implement that in your proposed ref regime? Member
 functions receive 'this' as a ref argument, so they can't return a ref,
 can they?

 Hmmm, I think returning a ref to a member should be allowed as long as
 the compiler realizes that ref has the same scope as the struct it
 originated from.

 Ok, so the compiler assumes that the scope of the ref returned by a
 function is the same as that of its ref arguments, and prevents you from
 returning that ref if one of the ref arguments is a local variable?

 That could work. It'd only become a hindrance in the more complex cases
 where you have multiple ref arguments in a function and no way to tell
 which one the returned ref comes from.

Yes. Conservatively, the scope of a returned ref is the smallest of all 
parameters involved, including 'this'.

There are two touches that Walter and I discussed:

a) You only need to keep in that set the parameters with .sizeof greater 
than or equal to the return type;

b) You only need to keep in that set the parameters that contain a 
direct subobject of the return type.

The rules are a bit odd, but they do the job and most importantly enable 
important encapsulation mechanisms.


Andrei

Mar 20 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-03-20 18:24:58 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 That could work. It'd only become a hindrance in the more complex cases
 where you have multiple ref arguments in a function and no way to tell
 which one the returned ref comes from.

 
 Yes. Conservatively, the scope of a returned ref is the smallest of all 
 parameters involved, including 'this'.
 
 There are two touches that Walter and I discussed:
 
 a) You only need to keep in that set the parameters with .sizeof 
 greater than or equal to the return type;
 
 b) You only need to keep in that set the parameters that contain a 
 direct subobject of the return type.
 
 The rules are a bit odd, but they do the job and most importantly 
 enable important encapsulation mechanisms.

Not only odd, but hard to predict.

The problem with those rules is that if the definition of a struct 
changes in subtle ways your program might stop compiling.

If a struct on Linux is larger than the equivalent struct on Mac OS X 
and this prevents some code from compiling, what are your options? 
Refactor your program to use pointers? If some generic algorithm 
accepts two arguments of any type but which can only return the first 
argument, should certain ways of calling it become unusable as soon as 
you give it two arguments of the same type?

That kind of attempt at cleverness sounds like a lot of annoying edge 
cases waiting to be found. I think it's bad coupling that if you add a 
never-used private member to a struct it might break some parts of the 
program elsewhere where a function attempts to return a reference to 
something that looks like your new member.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 20 2010

Steven Schveighoffer <schveiguy yahoo.com> writes:

Andrei Alexandrescu Wrote:

 On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
 On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I remember you brought up a similar point in a related discussion a
 couple of years ago. It's a good point, and my current understanding
 of the matter is that functions that take and return ref could and
 should be handled conservatively.

 I don't like the sound of that... What I fear is that the compiler will
 force people to start using pointers because refs don't cut it. I'm
 guessing you mean you cannot return ref returns from other functions?
 That breaks abstraction principles, I should be able to delegate a task
 to a sub-function.

 
 Perhaps it means you can't return ref returns from other functions if 
 you pass them references to local state.
 
 (I've read a paper at some point about a program analysis that stored 
 for each function the "return pattern" - a mini-graph describing the 
 relationship between parameters and result. If it rings a bell to 
 anyone... please chime in.)

This would be full escape analysis :)  I agree it is the best solution, but it
requires D to have special object files and a special linker.

 
 For instance, try to find a rule that prevents the above from compiling,
 but allows the following to compile.

 struct S
 {
 private int x;
 ref int getX() { return x;}
 }

 struct T
 {
 S s;
 ref int getSX() { return s.x; }
 }

 In the approach discussed with Walter, S is illegal. A struct can't
 define a method to return a reference to a direct member. This is
 exactly the advice given in Scott's book for C++. (A class can because
 classes sit on the heap.)

 A struct may sit on the heap too.

 
 Yes. For those cases you can always use pointers, which are not subject 
 to the restrictions I envision for ref.
 
 It's a very small inconvenience. For example, if you have a linked list 
 struct, you may feel constrained that you can't do:
 
 struct List {
      List * next;
      List * prepend(List * lst) {
          lst.next = &this;
          return lst;
      }
 }
 
 In my approach, &this is illegal. And actually for a good reason. This 
 code bombs:
 
 List iForgotThePointer() {
      List lst;
      lst.prepend(new List);
      return lst;
 }
 
 My response to the above issue is two-pronged:
 
 (a) For List a class would be an alternative

Note, this is impractical if you care about performance (I know, because
originally, dcollections used classes for link nodes).

 (b) To work with pointers to structs use static member functions and 
 pointers instead of methods and references

This prevents something like a linked list from being used in safe D.  That
might be too much of a restriction.  Yes, it makes code safer, but it makes
safe D unusable.

 The goal is worth pursuing, so let's keep on thinking of how to make it 
 work. If D manages to define demonstrably safe encapsulated containers, 
 that would be an absolutely huge win.

I agree.

 
 Here's another case:

 struct S
 {
 int*x;
 ref int getX() {return *x;}
 }

 Is x on the heap or not? How do you know? Arrays are just a wrapped
 pointer, so they too could be stack allocated.

 
 struct S
 {
      int*x;
      static ref int getX(S * p) {return *p.x;}
 }
 
 In an ideal world, if you have your hands on a pointer to a struct, you 
 should be reasonably certain that that lives on the heap. It would be 
 just great if D could guarantee that.

again, not in safe D.  Anytime pointers enter the mix, safe D is disqualified,
no?

 
 Consider this:

 void foo(ref int x)
 {
 x++;
 }

 struct S
 {
 int x;
 int y;
 bool xisy;
 ref int getX() {if(xisy) return y; return x;}
 }

 foo(S.x);
 foo(S.getX());

 
 Hm, I assume the two lines refer to an object of type S. The example 
 above would again have to be rewritten in terms of static functions with 
 pointers.

Yes, I meant to write:
S s;
foo(s.x);
foo(s.getX());

 
 Another case:

 struct S
 {
 int x;
 ref S opUnary(string op)() if (op == "++") {++x; return this;}
 }

 I feel this should all be possible.

 
 I think opUnary should return void and the compiler should worry about 
 that result being used.

Yes, you are probably right.  It would be cool if the compiler could
automatically do that in all cases where the operator is expected to return a
reference to the struct, i.e. +=, -=, etc.  That removes one of my biggest
concerns.

 
 ------
 counter proposal:

 What about having a new kind of ref that can only be passed up the
 stack, or down only one level if you are the one who initiated it.

 Call it scope ref:

 ref int baz(ref int y)
 {
 return y;
 }
 scope ref int foo(scope ref int x, ref int y)
 {
 //return x; // illegal, we did not make x scope ref
 //return baz(x); // illegal, cannot convert scope ref into ref
 return y; // legal, you can convert a ref parameter into scope ref.
 }

 scope ref int bar()
 {
 int y;
 //return foo(y, y); // illegal, you cannot pass scope refs down the stack more than one level
 }

 At least this leaves ref alone to be used without restrictions that the
 compiler can't prove are necessary. If we find scope ref is the only
 kind of ref we ever use, then maybe we can get rid of scope ref and just
 make ref be the restricted form. Or you could keep scope ref and reserve
 ref for only provable heap-variables.

 Man, it would be nice to have escape analysis...

 
 It sure would, but it quickly gets into the interprocedural tarpit.

It requires some sort of inter-object analysis, like you mentioned at the top.

 
 Your idea is good, except I don't see why not make ref scoped ref. After 
 all ref is currently not an enabler - it could be missing from the 
 language; pointers are fine. So why not make ref do something actually 
 interesting?

The whole point was to run an experiment seeing how many things could be
written as scoped ref instead of just ref (leaving ref the way it is today). 
If they all can with minor adjustments, then we can switch scoped ref to simply
ref.  If some can't, and those also don't make sense as pointers, then we have
a predicament that ref enables some sort of code that isn't enabled by pointers
or scoped ref.  My point was, I don't really know off the top of my head all
the cases for using ref, and you probably don't either.  I don't think a
sweeping change like this is good without at least some practical evidence.

-Steve

Mar 20 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/20/2010 07:04 AM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:

[snip]
 My response to the above issue is two-pronged:

 (a) For List a class would be an alternative

 Note, this is impractical if you care about performance (I know,
 because originally, dcollections used classes for link nodes).

I agree :o).

 (b) To work with pointers to structs use static member functions
 and pointers instead of methods and references

 This prevents something like a linked list from being used in safe D.
 That might be too much of a restriction.  Yes, it makes code safer,
 but it makes safe D unusable.

Actually, no. Walter's vision is to enable certain pointer uses in 
SafeD. Unchecked pointer arithmetic is disabled, but code like the one 
below is entirely safe:

List * lst = new List;
List * lst2 = prepend(lst, new List);

In fact, restricting use of ref is beneficial for SafeD because it helps 
the static proof that any pointer in SafeD points to the 
garbage-collected heap (as opposed to e.g. dangling stack variables).
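
A minimal sketch of option (b), assuming a hypothetical node type: the `value` field, the parameter names, and the `countNodes` helper are illustrative and do not come from the thread. Because `prepend` is a static member that works only with pointers, `&this` is never taken and every node comes from `new`, i.e. the garbage-collected heap.

```d
// Hypothetical node type; field and function names are illustrative assumptions.
struct List
{
    int value;
    List* next;

    // Static member: works on pointers only, so `&this` is never taken.
    static List* prepend(List* head, List* node)
    {
        node.next = head;   // the new node becomes the front of the list
        return node;
    }
}

// Walks the list and counts its nodes.
size_t countNodes(const(List)* head)
{
    size_t n = 0;
    for (auto p = head; p !is null; p = p.next)
        ++n;
    return n;
}

void main()
{
    List* lst = new List(1);                      // heap-allocated node
    List* lst2 = List.prepend(lst, new List(2));  // both nodes live on the GC heap
    assert(lst2.next is lst);
    assert(countNodes(lst2) == 2);
}
```

With this shape the `iForgotThePointer` mistake cannot be written, since there is no stack-allocated `List` whose address could end up in a `next` field.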

 In an ideal world, if you have your hands on a pointer to a struct,
 you should be reasonably certain that that lives on the heap. It
 would be just great if D could guarantee that.

 again, not in safe D.  Anytime pointers enter the mix, safe D is
 disqualified, no?

As mentioned above, pointers are still on the table in SafeD. I found 
that quite interesting myself :o).

 Your idea is good, except I don't see why not make ref scoped ref.
 After all ref is currently not an enabler - it could be missing
 from the language; pointers are fine. So why not make ref do
 something actually interesting?

 The whole point was to run an experiment seeing how many things could
 be written as scoped ref instead of just ref (leaving ref the way it
 is today).  If they all can with minor adjustments, then we can
 switch scoped ref to simply ref.  If some can't, and those also don't
 make sense as pointers, then we have a predicament that ref enables
 some sort of code that isn't enabled by pointers or scoped ref.  My
 point was, I don't really know off the top of my head all the cases
 for using ref, and you probably don't either.  I don't think a
 sweeping change like this is good without at least some practical
 evidence.

That is very sensible, thanks.


Andrei

Mar 20 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-03-19 22:56:59 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 So I was thinking of the following: how about still returning a 
 reference, but define a rule in the language that references can't 
 escape - you can't take their address and squirrel it away; the only 
 thing you can do is use them right on the spot or pass them down to 
 functions.

Can functions return the reference, or a reference to a part of the 
original referenced value? It's safe to do this only in some specific 
contexts, but the compiler already has some difficulty dealing with 
this. See bug 3925:
<http://d.puremagic.com/issues/show_bug.cgi?id=3925>

Another effect is that it should prevent closures from using that 
transitory ref value, since a closure can be leaked... unless we add 
transitory closures. :-)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/19/2010 10:52 PM, Michel Fortin wrote:
 On 2010-03-19 22:56:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 So I was thinking of the following: how about still returning a
 reference, but define a rule in the language that references can't
 escape - you can't take their address and squirrel it away; the only
 thing you can do is use them right on the spot or pass them down to
 functions.

 Can functions return the reference, or a reference to a part of the
 original referenced value?

I see two possible approaches (a) disallow that, which I think would be 
irregular, and (b) conservatively assume that that happens on the call side.

There is one other problem that I haven't mentioned. Consider the code:

void fun(ref int a, int b) {
     a = b;
}

Array!int arr;
arr.resize(5);
fun(arr[4], (arr.resize(10000), 42));

The purpose of restricting ref was to allow Array to entirely 
encapsulate its memory allocation, i.e. allow Array to use malloc and 
free without fearing that someone still holds a dangling pointer in there.

In the code above, however, evaluation will first take the address of 
arr[4] and _then_ call arr.resize (which may trigger reallocation). As a 
consequence, by the time fun sees the ref, it's already dangling. I 
don't know how to solve this.
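
The hazard can be sketched as follows, under stated assumptions: `MiniArray` is a hypothetical stand-in for `Array!int` (it is not Phobos' container), and its `resize` always moves the storage so that the effect is deterministic. The sketch only compares addresses and never dereferences the stale one.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical stand-in for the Array!int above; owns malloc'ed storage.
struct MiniArray
{
    private int* data;
    private size_t len;

    // Always allocates a new block before freeing the old one,
    // so the storage is guaranteed to move on every call.
    void resize(size_t n)
    {
        auto fresh = cast(int*) malloc(n * int.sizeof);
        assert(fresh !is null);
        fresh[0 .. n] = 0;
        immutable keep = len < n ? len : n;
        if (keep)
            memcpy(fresh, data, keep * int.sizeof);
        free(data);   // free(null) is a no-op on the first call
        data = fresh;
        len = n;
    }

    ref int opIndex(size_t i)
    {
        assert(i < len);
        return data[i];
    }

    void release()
    {
        free(data);
        data = null;
        len = 0;
    }
}

// Returns true if an address taken before resize no longer refers to arr[4].
bool refWouldDangle()
{
    MiniArray arr;
    arr.resize(5);
    int* before = &arr[4];   // what `ref int a` would be bound to
    arr.resize(10_000);      // what evaluating the second argument does
    immutable stale = before !is &arr[4];
    arr.release();
    return stale;
}

void main()
{
    assert(refWouldDangle());
}
```

Nothing here escapes a reference in the sense of the proposed rule; the stale address arises purely from argument evaluation order.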


Andrei

Mar 19 2010

bearophile <bearophileHUGS lycos.com> writes:

I don't comment on this topic because I am not expert enough yet to see its
possible consequences.

Regarding D2 development, one of the original design goals of D is not to be a
revolutionary language, but to take what's already known as reliable and useful
from other languages. But lately D has become an experiment: it contains many
experimental features that are new, nearly untested in real programs. Their
semantics can be sound, but we can't be certain yet, so some of those designs
may need to be improved on a semantic level too. And some of them are not even
fully implemented.

I have used D2 for the last few weeks, and I can say that currently the D2
compiler is so full of bugs, rough edges, or not fully implemented features
that in my opinion it's nearly unusable. I have found a new bug every 10 lines
of code or so (my code is not normal code, I know). When the book is out people
will start looking for a compiler too, so I think it's better to offer them
something that works, or they will lose interest quickly, and then it will be
harder to call them back to give a second look/chance at/to the language.

So my suggestion is to focus on removing bugs, performing small local
improvements, to smooth the semantic rough edges, etc. I have listed here less
than fifteen small things that I've added to bugzilla, that I think can be
improved. They are not real bugs, but they are not large new features, they are
usually little local things that smooth corners.

Bye,
bearophile

Mar 20 2010

dolive <dolive89 sina.com> writes:

bearophile wrote:

 I don't comment on this topic because I am not expert enough yet to see its
possible consequences.
 
 Regarding D2 development, one of the original design goals of D is not to be a
revolutionary language, but to take what's already known as reliable and useful
from other languages. But lately D has become an experiment: it contains many
experimental features that are new, nearly untested in real programs. Their
semantics can be sound, but we can't be certain yet, so some of those designs
may need to be improved on a semantic level too. And some of them are not even
fully implemented.
 
 I have used D2 for the last few weeks, and I can say that currently the D2
compiler is so full of bugs, rough edges, or not fully implemented features
that in my opinion it's nearly unusable. I have found a new bug every 10 lines
of code or so (my code is not normal code, I know). When the book is out people
will start looking for a compiler too, so I think it's better to offer them
something that works, or they will lose interest quickly, and then it will be
harder to call them back to give a second look/chance at/to the language.
 
 So my suggestion is to focus on removing bugs, performing small local
improvements, to smooth the semantic rough edges, etc. I have listed here less
than fifteen small things that I've added to bugzilla, that I think can be
improved. They are not real bugs, but they are not large new features, they are
usually little local things that smooth corners.
 
 Bye,
 bearophile

Entirely correct! I support this!
Fixing bugs and perfecting some small features is the priority!

Thank you, grandmaster Andrei!

dolive

Mar 20 2010
