
digitalmars.D - escaping pointer to scope local array: bug or not?

HOSOKAWA Kenchi <hskwk inter7.jp> writes:
It seems dmd 2.031 forgets scope attribute for array.ptr in some cases, so that
it allows escaping a pointer to scope local array.
I'm not sure this is a bug or a kind of "dangerous-but-valid".

int[] a()
{
	scope auto a = new int[1];
	return a; // error; escaping reference to scope local array
}

int* ap()
{
	scope auto a = new int[1];
	return a.ptr; // no error; this is the problem
}

int* i()
{
	int i;
	return &i; // error; escaping reference to local variable
}

int* ip()
{
	scope int* p = new int;
	return p; // no error; only is "int* p" local, "new int" not scope local?
}
Aug 16 2009
"Robert Jacques" <sandford jhu.edu> writes:
On Sun, 16 Aug 2009 10:13:42 -0700, HOSOKAWA Kenchi <hskwk inter7.jp>  
wrote:

 It seems dmd 2.031 forgets scope attribute for array.ptr in some cases,  
 so that it allows escaping a pointer to scope local array.
 I'm not sure this is a bug or a kind of "dangerous-but-valid".

 int[] a()
 {
 	scope auto a = new int[1];
 	return a; // error; escaping reference to scope local array
 }

 int* ap()
 {
 	scope auto a = new int[1];
 	return a.ptr; // no error; this is the problem
 }

 int* i()
 {
 	int i;
 	return &i; // error; escaping reference to local variable
 }

 int* ip()
 {
 	scope int* p = new int;
 	return p; // no error; only is "int* p" local, "new int" not scope local?
 }

I'd recommend checking whether *p was allocated on the stack or on the heap, as the two cases represent very different bugs.
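
One rough way to make that check is to ask the garbage collector whether it owns the address. This sketch relies on `GC.addrOf` from `core.memory`, which returns null for memory the GC does not manage. The helper name `isGcHeap` is made up for illustration, and what the last line prints may depend on the compiler version and flags.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: true if p points into a block managed by the GC.
bool isGcHeap(const(void)* p)
{
    return GC.addrOf(cast(void*) p) !is null;
}

void main()
{
    int local;            // lives on the stack
    int* heap = new int;  // allocated from the GC heap

    writeln("stack int:   ", isGcHeap(&local));
    writeln("new int:     ", isGcHeap(heap));

    // The case in question: where does a scope array's storage live?
    scope arr = new int[1];
    writeln("scope array: ", isGcHeap(arr.ptr));
}
```

A result of true for the scope array would mean an escaped pointer still refers to GC memory; false would mean it dangles once the function returns.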
Aug 16 2009
Robert Fraser <fraserofthenight gmail.com> writes:
HOSOKAWA Kenchi wrote:
 It seems dmd 2.031 forgets scope attribute for array.ptr in some cases, so
that it allows escaping a pointer to scope local array.
 I'm not sure this is a bug or a kind of "dangerous-but-valid".

 int* ap()
 {
 	scope auto a = new int[1];
 	return a.ptr; // no error; this is the problem
 }

Probably should be an error, but, FWIW, scope arrays are still heap-allocated (yeah, I know it's inconsistent). So there's no chance of memory corruption, etc.; it's as if you didn't have the "scope" there.
Aug 17 2009
Hosokawa Kenchi <hskwk inter7.jp> writes:
Robert Fraser Wrote:

 HOSOKAWA Kenchi wrote:
 It seems dmd 2.031 forgets scope attribute for array.ptr in some cases, so
that it allows escaping a pointer to scope local array.
 I'm not sure this is a bug or a kind of "dangerous-but-valid".

 int* ap()
 {
 	scope auto a = new int[1];
 	return a.ptr; // no error; this is the problem
 }

Probably should be an error, but, FWIW, scope arrays are still heap-allocated (yeah, I know it's inconsistent). So there's no chance of memory corruption, etc.; it's as if you didn't have the "scope" there.

Following Jacques's advice, I checked where 'scope' variables are stored, and I reached the same conclusion as you. They are heap-allocated and, of course, would not be collected by the GC (I'm not sure whether that is guaranteed). I think the semantics of 'scope' are still ambiguous.

int[] f()
{
	scope x = new int[6];
	auto y = x[1..3];
	return y; // no error; a slice of the 'originally' scoped array escapes
}

I can't quickly decide whether this SHOULD be an error or not. I hope someone can offer an opinion that settles this issue clearly.
Aug 17 2009
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer: 
 Another way is to perform escape analysis, but Walter has expressed that  
 he doesn't want to do that.  It would require an intermediate interface  
 language for imports where annotations could be added by the compiler.

Why is that bad? Bye, bearophile
Aug 18 2009
Robert Fraser <fraserofthenight gmail.com> writes:
Robert Jacques wrote:
 ship/sell D libraries in binary format.

Does anyone sell static libraries anymore? There are too many problems with static linking for that to be very viable. Most libraries I've seen are sold as DLLs.
Aug 18 2009
Robert Fraser <fraserofthenight gmail.com> writes:
Steven Schveighoffer wrote:
 It also says that Java 6, a language compiled as I proposed D could be, 
 has escape analysis.

Java's escape analysis is done at runtime (during JIT compilation) AFAIK. LDC can compile to bitcode and link-time codegen could be used to deal with escape analysis... this doesn't help (much) in generating errors, but it allows better codegen.
Aug 19 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 17 Aug 2009 16:51:20 -0400, Hosokawa Kenchi <hskwk inter7.jp>  
wrote:

 Robert Fraser Wrote:

 HOSOKAWA Kenchi wrote:
 It seems dmd 2.031 forgets scope attribute for array.ptr in some cases, so that it allows escaping a pointer to scope local array.

 I'm not sure this is a bug or a kind of "dangerous-but-valid".

 int* ap()
 {
 	scope auto a = new int[1];
 	return a.ptr; // no error; this is the problem
 }

Probably should be an error, but, FWIW, scope arrays are still heap-allocated (yeah, I know it's inconsistent). So there's no chance of memory corruption, etc.; it's as if you didn't have the "scope" there.

Following Jacques's advice, I checked where 'scope' variables are stored, and I reached the same conclusion as you. They are heap-allocated and, of course, would not be collected by the GC (I'm not sure whether that is guaranteed). I think the semantics of 'scope' are still ambiguous.

int[] f()
{
	scope x = new int[6];
	auto y = x[1..3];
	return y; // no error; a slice of the 'originally' scoped array escapes
}

I can't quickly decide whether this SHOULD be an error or not. I hope someone can offer an opinion that settles this issue clearly.

Remember, scope is a storage class, not a type modifier. "Scopeness" is only a hint to the compiler of where to store it originally; the hint is not passed on to other variables which point to the same data. I'm surprised it's actually an error to try and return a scope variable.

One way to get scope to work as you desire is to make it a type modifier and define rules about assignment, but I'm not sure that's a good answer. Another way is to perform escape analysis, but Walter has expressed that he doesn't want to do that. It would require an intermediate interface language for imports where annotations could be added by the compiler.

-Steve
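
The storage-class versus type-modifier distinction can be checked directly. This is only a sketch, and the function name `scopeIsNotPartOfType` is invented for illustration: it compares the declared types of a `scope` variable, a `const` variable, and a slice taken from the `scope` array.

```d
import std.stdio : writeln;

// Hypothetical demo function: returns true if the types are as described.
bool scopeIsNotPartOfType()
{
    scope int[] a = new int[3];   // scope: storage class
    const int[] b = [1, 2, 3];    // const: type modifier
    auto y = a[1 .. 3];           // slice of the scope array

    return is(typeof(a) == int[])          // "scope" does not appear in the type
        && is(typeof(b) == const(int[]))   // "const" does
        && is(typeof(y) == int[]);         // the slice's type carries no trace of scope
}

void main()
{
    writeln(scopeIsNotPartOfType());
}
```

Because nothing in the type of `y` records where it came from, a check based on types alone has no information with which to reject `return y;`.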
Aug 18 2009
HOSOKAWA Kenchi <hskwk inter7.jp> writes:
Steven Schveighoffer Wrote:

 remember, scope is a storage class, not a type modifier. "scopeness" is  
 only a hint to the compiler of where to store it originally, the hint is  
 not passed on to other variables which point to the same data.  I'm  
 surprised it's actually an error to try and return a scope variable.
 
 One way to get scope to work as you desire is to make it a type modifier  
 and define rules about assignment, but I'm not sure that's a good answer.   
 Another way is to perform escape analysis, but Walter has expressed that  
 he doesn't want to do that.  It would require an intermediate interface  
 language for imports where annotations could be added by the compiler.

It seems that scopeness as-is is not a "hint to the compiler of where to store it originally", because a scope reference accepts an object that was not allocated there. Here is an example:

import std.stdio;

class C
{
	int i = 1;
	~this() { writeln("~C"); }
	void foo() {}
}

int* f(C arg)
{
	scope c = arg;
	return &(c.i);
}

void main()
{
	auto c = new C;
	auto p = f(c); // destructor is called after f()
	writeln(*p); // succeeds, at first glance
	c.foo; // runtime error: access violation
}

The instance has been collected, so the escaped pointer probably points to garbage. The 'scope' is misleading rather than a hint in this case, since the resource was acquired far away from the 'scope'.

The current behaviors of 'scope' are:
1. Returning a scope reference is prohibited (as you said, it is a storage class and not transitive). (compile time)
2. The destructor of the referenced instance is called when the reference goes out of scope. (runtime)

I think RAII is similar to const/immutable, which is fully handled in D2:
* It is safe to reference values/objects from multiple threads, etc., if the referenced object will not change.
* It is safe to destroy an object when a reference goes out of scope if the object is referenced only from that scope.

Consequently, the best answer is presumably to make it a transitive type modifier. At the very least, rules about assignment should be added: the right-hand side of a 'scope' reference must be freshly allocated by the new operator. It is too dangerous to allow referencing any object, especially a function parameter.

I agree that escape analysis is a bad solution because it makes the compiler implementation too hard.
Aug 18 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 18 Aug 2009 13:34:36 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Another way is to perform escape analysis, but Walter has expressed that
 he doesn't want to do that.  It would require an intermediate interface
 language for imports where annotations could be added by the compiler.

Why is that bad?

I don't think it's bad, but definitely a lot of work. I would be all for it. -Steve
Aug 18 2009
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 18 Aug 2009 10:38:50 -0700, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Tue, 18 Aug 2009 13:34:36 -0400, bearophile  
 <bearophileHUGS lycos.com> wrote:

 Steven Schveighoffer:
 Another way is to perform escape analysis, but Walter has expressed  
 that
 he doesn't want to do that.  It would require an intermediate interface
 language for imports where annotations could be added by the compiler.

Why is that bad?

I don't think it's bad, but definitely a lot of work. I would be all for it. -Steve

Actually, it's really bad. Escape analysis requires whole program analysis. It would be impossible to do incremental compilation or to ship/sell D libraries in binary format. I'd recommend checking out http://en.wikipedia.org/wiki/Escape_analysis for an overview of the issues involved. You can avoid doing whole program analysis by introducing ownership types and being a bit conservative in what you allow. There's a (bit confusing) wiki page proposal on how to implement it at http://www.prowiki.org/wiki4d/wiki.cgi?OwnershipTypesInD.
Aug 18 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 18 Aug 2009 13:48:23 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Tue, 18 Aug 2009 10:38:50 -0700, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Tue, 18 Aug 2009 13:34:36 -0400, bearophile  
 <bearophileHUGS lycos.com> wrote:

 Steven Schveighoffer:
 Another way is to perform escape analysis, but Walter has expressed  
 that
 he doesn't want to do that.  It would require an intermediate  
 interface
 language for imports where annotations could be added by the compiler.

Why is that bad?

I don't think it's bad, but definitely a lot of work. I would be all for it. -Steve

Actually, it's really bad. Escape analysis requires whole program analysis. It would be impossible to do incremental compilation or to ship/sell D libraries in binary format. I'd recommend checking out http://en.wikipedia.org/wiki/Escape_analysis for an overview of the issues involved. You can avoid doing whole program analysis by introducing ownership types and being a bit conservative in what you allow. There's a (bit confusing) wiki page proposal on how to implement it at http://www.prowiki.org/wiki4d/wiki.cgi?OwnershipTypesInD.

Admitting I didn't read any of that, I think incremental analysis is possible as long as import files are generated by the compiler post-analysis. i.e. the compiler is able to alter the function signature indicating escape analysis information. With something like that, you could still ship in binary format, along with generated import files that describe the function signatures (if one requires building against your product). In fact, the import files could be a part of the binary files, similar to how java works. -Steve
Aug 18 2009
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 18 Aug 2009 11:54:27 -0700, Robert Fraser  
<fraserofthenight gmail.com> wrote:

 Robert Jacques wrote:
 ship/sell D libraries in binary format.

Does anyone sell static libraries anymore? There are too many problems with static linking for that to be very viable. Most libraries I've seen are sold as DLLs.

I don't know about selling, but (for instance) NVIDIA's CUDA shipped as a pair of static and dynamic libraries. For a long time, runtime linking wasn't possible with however they did the DLLs, so you had to use the static libs. But whole program/escape analysis also needs access to the DLL's source code to work, since it has to know what each function does with its input arguments.
Aug 18 2009
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 18 Aug 2009 10:57:50 -0700, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Tue, 18 Aug 2009 13:48:23 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Tue, 18 Aug 2009 10:38:50 -0700, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Tue, 18 Aug 2009 13:34:36 -0400, bearophile  
 <bearophileHUGS lycos.com> wrote:

 Steven Schveighoffer:
 Another way is to perform escape analysis, but Walter has expressed  
 that
 he doesn't want to do that.  It would require an intermediate  
 interface
 language for imports where annotations could be added by the  
 compiler.

Why is that bad?

I don't think it's bad, but definitely a lot of work. I would be all for it. -Steve

Actually, it's really bad. Escape analysis requires whole program analysis. It would be impossible to do incremental compilation or to ship/sell D libraries in binary format. I'd recommend checking out http://en.wikipedia.org/wiki/Escape_analysis for an overview of the issues involved. You can avoid doing whole program analysis by introducing ownership types and being a bit conservative in what you allow. There's a (bit confusing) wiki page proposal on how to implement it at http://www.prowiki.org/wiki4d/wiki.cgi?OwnershipTypesInD.

Admitting I didn't read any of that, I think incremental analysis is possible as long as import files are generated by the compiler post-analysis. i.e. the compiler is able to alter the function signature indicating escape analysis information. With something like that, you could still ship in binary format, along with generated import files that describe the function signatures (if one requires building against your product). In fact, the import files could be a part of the binary files, similar to how java works. -Steve

*sigh* That doesn't work. From the wikipedia article:
 In traditional static compilation, method overriding can make escape  
 analysis impossible, as any called method might be overridden by a  
 version that allows a pointer to escape.

Consider a class A and a function f(A a), neither of which has any escapes. Now create a subclass of A, B, which does contain escapes. Does f(B) escape or not? Now, introducing some ownership types (scope, stack, heap, shared, mobile) gives the compiler / methods the guarantees they need to get over this problem.
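
A minimal D sketch of that situation follows. The method `use` and the global `leaked` are invented for illustration; only the names A, B and f come from the description above. Looking at A and f alone, nothing escapes, yet passing a B makes the same call to f retain the pointer.

```d
int* leaked; // illustrative (thread-local) global that records an escaped pointer

class A
{
    void use(int* p) { }            // does not retain p
}

class B : A
{
    override void use(int* p)
    {
        leaked = p;                 // retains p: the pointer escapes
    }
}

void f(A a, int* p)
{
    a.use(p);                       // virtual call: the target depends on the runtime type
}

void main()
{
    int* x = new int;               // heap-allocated, so nothing dangles in this demo

    f(new A, x);
    assert(leaked is null);         // no escape through A

    f(new B, x);
    assert(leaked is x);            // same f, but the pointer escaped through B
}
```

If B lives in a module compiled later than f, an analysis of f made when it was compiled cannot have taken B's override into account.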
Aug 18 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 18 Aug 2009 18:08:56 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 *sigh* That doesn't work. From the wikipedia article:
 In traditional static compilation, method overriding can make escape  
 analysis impossible, as any called method might be overridden by a  
 version that allows a pointer to escape.

and function f(A a), neither of which have any escapes. Now create a subclass of A, B, which does contain escapes. Does f(B) escape or not?

It also says that Java 6, a language compiled as I proposed D could be, has escape analysis. I don't think it's easy, but it's definitely possible. Besides, this entire argument is moot if Walter doesn't want to do it... -Steve
Aug 19 2009
Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sun, Aug 16, 2009 at 1:13 PM, HOSOKAWA Kenchi<hskwk inter7.jp> wrote:
 	scope auto a = new int[1];

Just for future reference, "scope auto" is redundant. "auto" does not mean "infer the type"; the absence of a type is enough to do that. "auto" is just the default storage class. "scope a = new int[1];" will work fine (as will "const a = 4;" "static a = 5;" etc.).
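
As a quick check of that point, the sketch below declares variables with only a storage class or qualifier and no explicit type, then verifies the inferred types at compile time. The function name `inferenceDemo` is invented for illustration.

```d
// Hypothetical demo: any storage class or qualifier alone triggers type inference.
bool inferenceDemo()
{
    scope a = new int[1];   // inferred as int[]
    const b = 4;            // inferred as const(int)
    static c = 5;           // inferred as int

    static assert(is(typeof(a) == int[]));
    static assert(is(typeof(b) == const(int)));
    static assert(is(typeof(c) == int));
    return true;
}

void main()
{
    assert(inferenceDemo());
}
```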
Aug 17 2009
Hosokawa Kenchi <hskwk inter7.jp> writes:
Jarrett Billingsley Wrote:

 On Sun, Aug 16, 2009 at 1:13 PM, HOSOKAWA Kenchi<hskwk inter7.jp> wrote:
 	scope auto a = new int[1];

Just for future reference, "scope auto" is redundant. "auto" does not mean "infer the type"; the absence of a type is enough to do that. "auto" is just the default storage class. "scope a = new int[1];" will work fine (as will "const a = 4;" "static a = 5;" etc.).

Thanks for the nice advice. Now I'm becoming more familiar with D!
Aug 17 2009