
digitalmars.D - Re: safe leak fix?

reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

 Consider the code:
 
     @safe:
      T[] foo(T[] a) { return a; }
 
      T[] bar()
      {
          T[10] x;
          return foo(x);
      }
 
 Now we've got an escaping reference to bar's stack. This is not memory 
 safe. But giving up slices is a heavy burden.
 
 So it occurred to me that the same solution for closures can be used 
 here. If the address is taken of a stack variable in a safe function, 
 that variable is instead allocated on the heap. If a more advanced 
 compiler could prove that the address does not escape, it could be put 
 back on the stack.
 
 The code will be a little slower, but it will be memory safe. This 
 change wouldn't be done in trusted or unsafe functions.
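
A minimal sketch of what the proposed rewrite amounts to when done by hand. It assumes `int` for `T` (the post leaves `T` unspecified), and the explicit `new` stands in for the allocation the compiler would insert silently:

```d
@safe:

int[] foo(int[] a) { return a; }

// Hand-written version of the proposed rewrite: the buffer is allocated on
// the GC heap, so the slice that comes back through foo outlives bar.
int[] bar()
{
    int[] x = new int[10];   // instead of: int[10] x;
    x[0] = 42;
    return foo(x);
}
```

The cost mentioned in the post is visible here: every call to bar now pays for a GC allocation that the stack version did not need.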

At a fundamental level, safety isn't about pointers or references to stack variables, but rather preventing their escape beyond function scope. Scope parameters could be very useful. Scope delegates were introduced for a similar reason.
Nov 11 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.
Nov 11 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning @noescape. That should lead to easy proofs: if my @noescape input (or a slice of an array on the stack) is passed to a function without @noescape, it's a compile error. That reduces escape analysis to local verification.
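
A minimal sketch of the local rule described here, using D's `scope` parameter storage class to stand in for @noescape. `countSpaces` and `demo` are made-up names, and whether the compiler actually enforces the promise depends on the compiler version and switches, so the comments describe the intended contract rather than a guarantee:

```d
// `scope` is the callee's promise that it neither stores nor returns `data`.
size_t countSpaces(scope const(char)[] data)
{
    size_t n;
    foreach (c; data)
        if (c == ' ')
            ++n;
    return n;
}

size_t demo()
{
    char[16] buf = ' ';          // stack buffer, every element set to a space
    // Under the proposed rule this call is accepted because the slice of the
    // stack array is only handed to a scope (noescape) parameter.
    return countSpaces(buf[]);
}
```

Under that rule, checking demo only requires looking at the signature of countSpaces, not at its body.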
Nov 12 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:
 
 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

You're implying that the return of buf from strstr is unsafe. Indeed, my intentionally short post didn't discuss returning from functions. Ignoring that for a moment, surely you'd agree the following is safe:

    char[] foo(){
        char[100] buf;
        copystringintobuf(buf, "hi");
        return buf[0..2].dup;
    }

As far as return types, there are two subtle issues:
1. Returned input argument must preserve the scope requirements of the caller. A similar problem as return variables and const annotation.
2. Unlike const annotations, there are more than three states for scope; it's simply a measure of how deep/shallow variables can be in the stack.
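
A runnable sketch of the foo example from this post. The post only names `copystringintobuf`, so the body given here is an assumption made for illustration:

```d
// Hypothetical helper: copies s into the front of buf.
// Assumes s.length <= buf.length; otherwise the slice below is out of range.
void copystringintobuf(char[] buf, const(char)[] s)
{
    buf[0 .. s.length] = s[];
}

char[] foo()
{
    char[100] buf;
    copystringintobuf(buf[], "hi");
    // .dup copies the two characters to the heap, so no reference to the
    // stack buffer leaves foo.
    return buf[0 .. 2].dup;
}
```

The helper writes into buf but neither stores nor returns it, which is why the only thing left to check in foo is the final return expression.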
Nov 12 2009
next sibling parent Nick B <"nick_NOSPAM_.barbalich" gmail.com> writes:
Overview:

The AMD Advanced Synchronization Facility (ASF) is an experimental 
instruction set extension for the AMD64 architecture that would provide 
new capabilities for efficient synchronization of access to shared data 
in highly multithreaded applications as well as operating system 
kernels. ASF provides a means for software to inspect and update 
multiple shared memory locations atomically without having to rely on 
locks for mutual exclusion. It is intended to facilitate lock-free 
programming for highly concurrent shared data structures, allowing more 
complex and higher performance manipulation of such structures than is 
practical with traditional techniques based on compare-swap instructions 
such as CMPXCHG16B. ASF code can also interoperate with lock-based code, 
or with _Software Transactional Memory_.

Some basic usage examples of ASF are provided in the specification. 
However, we expect the programming community could readily use the power 
and flexibility of ASF to implement very sophisticated, robust and 
innovative concurrent data structure algorithms, and we encourage such 
experimentation. AMD will be releasing a simulation framework in the 
near future to facilitate this.

AMD is releasing this proposal to encourage the parallel programming 
community to review and comment on it. Such input will help shape the 
ultimate direction of this feature, so that it may best serve the needs 
of advanced parallel application developers.

Discussion:
http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=114715&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29
  and here
http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=118419&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29


The spec can be found here:

http://developer.amd.com/cpu/ASF/Pages/default.aspx


regards
Nick B
Nov 12 2009
prev sibling next sibling parent grauzone <none example.net> writes:
Denis Koroskin wrote:
 I don't like his proposal at all. It introduces one more hidden 
 allocation. Why not just write
 
 char[] buf = new char[100];
 
 and disallow taking a slice of static array? (Andrei already hinted this 
 will be disallowed in @safe, if I understood him right).

I think that would be the best. What uses of static arrays are there?
- allocating memory "inline" (eh, you better not use SafeD if you need this! new always works)
- as value types, e.g. small vectors (don't really need slices in this case)
- ...?
 Speaking about safety, I don't know how we can allow pointers in safe D:
 
 void foo()
 {
    int* p = new int;
    p[1000] = 0; // Will it crash or not? Is this a defined behavior, or not?
    // If not, this must be disallowed in safe D
 }
 
 And, most importantly, *why* users would want to work with pointers in 
 safe D at all?

As far as I understood, pointers are supposed to be allowed in SafeD. You just aren't allowed to do the following things:
- pointer arithmetic
- turning arrays into slices
- taking address (messy one!)
- (unsafe) casts between pointers
- array.ptr
- probably more
Nov 13 2009
prev sibling parent Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:
 
 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

I disagree. Borrowing the syntax from the return const proposal, let's define strstr as follows:

    inout(char[]) strstr(inout(char[]) buf, const(char[]) orig);

What I want that to tell the compiler is that buf, or some piece of buf, is returned from strstr (please don't assign any more meaning than that, i.e. constness of buf). The compiler would then treat the return value with the same protection as buf, and a return without .dup is a compile error.

I've been in drawn out discussions with you before. If this post and my prior post don't make you budge from your position then I'll simply give up trying to convince you. It's not worth the aggravation.
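
A sketch of a strstr with roughly that signature. The function body is an assumption (the post gives only the declaration). Note that D's actual `inout` carries only the const/immutable qualifier from argument to result; the "result has the same escape protection as buf" meaning is what the post proposes, not something this code enforces:

```d
// Returns the tail of buf starting at the first occurrence of needle,
// or null if needle does not occur. The result is always a piece of buf.
inout(char)[] strstr(inout(char)[] buf, const(char)[] needle)
{
    if (needle.length > buf.length)
        return null;
    foreach (i; 0 .. buf.length - needle.length + 1)
        if (buf[i .. i + needle.length] == needle)
            return buf[i .. $];   // same qualifier as buf, same memory as buf
    return null;
}

char[] foo()
{
    char[100] buf = ' ';
    buf[4 .. 6] = "hi";
    // strstr hands back a slice of the stack buffer; .dup copies it to the
    // heap before it leaves foo. Under the proposal, omitting .dup here
    // would be a compile error.
    return strstr(buf[], "hi").dup;
}
```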
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
<jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve
Nov 12 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit)

    auto strstr(T,U)(T src, U substring)
        if(isRandomAccessRange!T &&
           isRandomAccessRange!U &&
           is(ElementType!U == ElementType!T))
    {
        /* Do strstr */
    }

    char[] foo() {                       // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi").dup;  // returns a lent char[], which is dup-ed into a char[], which is okay to return
    }

    char[] foo2() {                      // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi");      // Error, strstr returns a lent char[], not char[].
    }

    lent char[] foo3() {                 // Returns type lent char[]
        char buf[100];                   // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi");      // Error, scope char[] cannot be implicitly converted to lent char[] inside a lent char[] function: possible escape.
    }

    char[] foo4() {                      // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        return buf;                      // Error, return type is char[] not char[100].
    }

    char[] foo5() {                      // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        return buf[];                    // Error, return type is char[] not scope char[].
    }

Here's an (outdated and confusing) proposal I put together a while ago (It's pre-DIP): http://prowiki.org/wiki4d/wiki.cgi?OwnershipTypesInD

In it, I used stack and scope instead of scope and lent.
Nov 12 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 05:23:08 +0300, Nick B  
<nick_NOSPAM_.barbalich gmail.com> wrote:

 Overview:

 The AMD Advanced Synchronization Facility (ASF) is an experimental  
 instruction set extension for the AMD64 architecture that would provide  
 new capabilities for efficient synchronization of access to shared data  
 in highly multithreaded applications as well as operating system  
 kernels. ASF provides a means for software to inspect and update  
 multiple shared memory locations atomically without having to rely on  
 locks for mutual exclusion. It is intended to facilitate lock-free  
 programming for highly concurrent shared data structures, allowing more  
 complex and higher performance manipulation of such structures than is  
 practical with traditional techniques based on compare-swap instructions  
 such as CMPXCHG16B. ASF code can also interoperate with lock-based code,  
 or with _Software Transactional Memory_.

 Some basic usage examples of ASF are provided in the specification.  
 However, we expect the programming community could readily use the power  
 and flexibility of ASF to implement very sophisticated, robust and  
 innovative concurrent data structure algorithms, and we encourage such  
 experimentation. AMD will be releasing a simulation framework in the  
 near future to facilitate this.

 AMD is releasing this proposal to encourage the parallel programming  
 community to review and comment on it. Such input will help shape the  
 ultimate direction of this feature, so that it may best serve the needs  
 of advanced parallel application developers.

 Discussion:
 http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=114715&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29
   and here
 http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=118419&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29


 The spec can be found here:

 http://developer.amd.com/cpu/ASF/Pages/default.aspx


 regards
 Nick B

<offtopic> Please start a new thread by clicking "Create New" button (or similar), not by replying to an existing thread. Thanks! </offtopic>
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 19:29:28 -0500, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit)

    auto strstr(T,U)(T src, U substring)
        if(isRandomAccessRange!T &&
           isRandomAccessRange!U &&
           is(ElementType!U == ElementType!T))
    {
        /* Do strstr */
    }

    char[] foo() {                       // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi").dup;  // returns a lent char[], which is dup-ed into a char[], which is okay to return
    }

    char[] foo2() {                      // Returns type char[]
        char buf[100];                   // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi");      // Error, strstr returns a lent char[], not char[].
    }

Your proposal depends on scope being a type modifier, which it currently is not. I think that's a separate issue to tackle. -Steve
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
<jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).
 You're implying that the return of buf from strstr is unsafe. Indeed, my  
 intentionally short post didn't discuss returning from functions.  
 Ignoring that for a moment, surely you'd agree the following is safe:

 char[] foo(){
     char[100] buf;
     copystringintobuf(buf, "hi");
     return buf[0..2].dup;
 }

 As far as return types, there are two subtle issues:
 1. Returned input argument must preserve the scope requirements of the  
 caller. A similar problem as return variables and const annotation.
 2. Unlike const annotations, there is more than three states for scope,  
 it's simply a measure of how deep/shallowvariables can be in the stack.

Yes, but I think such an annotation system is unworkable. I'd rather see the compiler annotate into an intermediate file. Even with those, you would be hard pressed to be able to prove all cases when the scope depth depends on runtime values. That would require runtime checks.

I think escape analysis is a worthy goal, but very hard to implement. Just allocating when you can't prove anything is a decent solution.

-Steve
Nov 13 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always
 thought of scope input as meaning @noescape. That should lead to easy
 proofs. If my @noescape input (or slice of an array on the stack) is
 passed to a function without @noescape, it's a compile error. That
 reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

Any example of how unsafe strstr may be?

BTW, strstr is no different from std.algorithm.find:

    import std.algorithm;

    char[] foo()
    {
        char[5] buf = ['h', 'e', 'l', 'l', 'o'];
        char[] result = find(buf[], 'e');

        return result.dup;
    }

I don't see why a general-purpose searching algorithm is unsafe.
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 07:01:25 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always
 thought of scope input as meaning @noescape. That should lead to easy
 proofs. If my @noescape input (or slice of an array on the stack) is
 passed to a function without @noescape, it's a compile error. That
 reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

Any example of how unsafe strstr may be?

Sure (with the current compiler):

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi"); // no .dup, buf escapes
    }

The whole meaning of safe is fuzzy, because we don't know the safe rules with regards to passing references to local data. But I think the goal is to make it so strstr can be marked as safe. In order to do that, foo must be required to be unmarked or @trusted, or foo allocates buf on the heap.

The point I was trying to make to Jason is that escape analysis is more complicated than just marking parameters as @noescape -- you leave out some provably safe functions.
 BTW, strstr is no different from std.algorithm.find:

 import std.algorithm;

 char[] foo()
 {
      char[5] buf = ['h', 'e', 'l', 'l', 'o'];
      char[] result = find(buf[], 'e');

      return result.dup;
 }

 I don't see why a general-purpose searching algorithm is unsafe.

It isn't inherently unsafe. It's just difficult for the compiler to see just from a function signature where the data flows, and escape analysis requires full data-flow disclosure.

Walter's proposal of allocating when a safe function passes the address of a local to another safe function is perfectly acceptable to me. I'd also like to see cases where you can mark the input parameter as scope, potentially optimizing out the allocation (but then you cannot return the scope parameter or a reference to any part of it).

-Steve
Nov 13 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 15:29:20 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:01:25 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:

 On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always
 thought of scope input as meaning @noescape. That should lead to easy
 proofs. If my @noescape input (or slice of an array on the stack) is
 passed to a function without @noescape, it's a compile error. That
 reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

what's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

Any example of how unsafe strstr may be?

Sure (with the current compiler):

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi"); // no .dup, buf escapes
    }

No, no, no! It's foo which is unsafe in your example, not strstr!
 The whole meaning of safe is fuzzy, because we don't know the safe rules  
 with regards to passing references to local data.  But I think the goal  
 is to make it so strstr can be marked as safe.  In order to do that, foo  
 must be required to be unmarked or  trusted, or foo allocates buf on the  
 heap.

 The point I was trying to make to Jason is that escape analysis is more  
 complicated than just marking parameters as @noescape -- you leave out
 some provably safe functions.

 BTW, strstr is no different from std.algorithm.find:

 import std.algorithm;

 char[] foo()
 {
      char[5] buf = ['h', 'e', 'l', 'l', 'o'];
      char[] result = find(buf[], 'e');

      return result.dup;
 }

 I don't see why a general-purpose searching algorithm is unsafe.

It isn't inherently unsafe. It's just difficult for the compiler to see just from a function signature where the data flows, and escape analysis requires full data-flow disclosure.

I think Walter's proposal of allocating when a safe function passes the address of a local to another safe function is perfectly acceptable. I'd also like to see cases where you can mark the input parameter as scope, potentially optimizing out the allocation (but then you cannot return the scope parameter or a reference to any part of it).

-Steve
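
For comparison, the closure behaviour that Walter's proposal borrows from can be sketched as follows. This is an illustration only, not code from the thread, and `makeCounter` is a made-up name. In D2, a local that is captured by a delegate which outlives the function is placed in a heap-allocated frame, so the delegate stays valid after the function returns:

```d
// The captured local `count` lives in a heap-allocated closure frame,
// because the delegate that refers to it escapes makeCounter.
int delegate() makeCounter()
{
    int count = 0;
    return () { return ++count; };
}

void main()
{
    auto next = makeCounter();
    // makeCounter has returned, yet `count` is still alive.
    assert(next() == 1);
    assert(next() == 2);
}
```

The proposal would, roughly, give the same treatment to a stack buffer whose slice is handed to a function the compiler cannot see through.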

I don't like his proposal at all. It introduces one more hidden allocation. Why not just write

char[] buf = new char[100];

and disallow taking a slice of static array? (Andrei already hinted this will be disallowed in @safe, if I understood him right).

Speaking about safety, I don't know how we can allow pointers in safe D:

void foo()
{
    int* p = new int;
    p[1000] = 0; // Will it crash or not? Is this a defined behavior, or not?
    // If not, this must be disallowed in safe D
}

And, most importantly, *why* users would want to work with pointers in safe D at all?
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }

No, no, no! It's foo which is unsafe in your example, not strstr!

OK, tell me if foo is now safe or unsafe:

@safe char[] bar(char[] x);

char[] foo()
{
    char buf[100];
    return bar(buf);
}

This is how the compiler looks at the code. It doesn't know what strstr does. For all it knows, bar (or strstr) could allocate heap data based on x and is perfectly safe.
 I don't like his proposal at all. It introduces one more hidden  
 allocation. Why not just write

 char[] buf = new char[100];

 and disallow taking a slice of static array? (Andrei already hinted this  
 will be disallowed in @safe, if I understood him right).

A major performance gain in D is to use stack-allocated buffers for things as opposed to heap-allocated buffers. The proposal allows lots of existing code to be marked as safe without having to add the explicit allocations.

I have mixed feelings on the whole thing. I think disallowing a high performance technique such as stack buffer allocation is going to make safe code much less attractive, especially when it's very easy to write provably safe code that uses stack buffers. It's going to confuse and frustrate developers that want to use such buffers.

The one good thing I see about the proposal is the heap allocations could be optimized out later if the compiler can get smarter, without having to go remove all those manual heap allocations you added.

The other side of the coin is that you just have to mark your functions as trusted instead of safe. Then when the compiler gets smarter, you have to go back and change those functions to safe. That's also a possible solution.
 Speaking about safety, I don't know how we can allow pointers in safe D:

 void foo()
 {
     int* p = new int;
     p[1000] = 0; // Will it crash or not? Is this a defined behavior, or  
 not?
     // If not, this must be disallowed in safe D
 }

 And, most importantly, *why* users would want to work with pointers in  
 safe D at all?

I agree with you on this. But slicing a stack array is not exactly the same as taking a pointer and using unbounded pointer arithmetic. It has the potential to escape scope, but not the potential (at least in safe mode) of accessing data outside the array. -Steve
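
The distinction can be illustrated with a small sketch. The names `readAt` and `outOfRange` are hypothetical and not from the thread, and the example assumes array bounds checking has not been switched off. Indexing through a slice in a @safe function is checked against the slice length, so an out-of-range index raises a RangeError instead of touching unrelated memory:

```d
import core.exception : RangeError;

// Indexing a slice is bounds-checked against s.length.
@safe int readAt(int[] s, size_t i)
{
    return s[i];
}

// Returns true if reading s[i] trips the bounds check.
// Catching an Error like this is for demonstration only.
bool outOfRange(int[] s, size_t i)
{
    try
    {
        cast(void) readAt(s, i);
        return false;
    }
    catch (RangeError e)
    {
        return true;
    }
}

void main()
{
    int[4] buf = [1, 2, 3, 4];
    int[] s = buf[];              // slice of a stack array
    assert(readAt(s, 2) == 3);    // in range
    assert(outOfRange(s, 1000));  // out of range is detected
}
```

A raw pointer carries no length, so `p[1000]` has nothing to be checked against. That is the difference being drawn here; the escape problem for the slice remains.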
Nov 13 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 16:16:29 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }

No, no, no! It's foo which is unsafe in your example, not strstr!

OK, tell me if foo is now safe or unsafe:

@safe char[] bar(char[] x);

char[] foo()
{
    char buf[100];
    return bar(buf);
}

It is unsafe even if bar doesn't return anything (it could store reference to a buf in some global variable, for example). Or accessing globals is considered unsafe now?

It is foo's fault that pointer to a stack allocated buffer is passed and returned outside of the scope. The dangerous line is buf[], which gets a slice out of a static array, not return bar(...). You could as well write:

char[] foo()
{
    char buf[100];
    return buf[]; // no more bar, but code is still dangerous
}
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 08:45:28 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 On Fri, 13 Nov 2009 16:16:29 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }

No, no, no! It's foo which is unsafe in your example, not strstr!

OK, tell me if foo is now safe or unsafe:

@safe char[] bar(char[] x);

char[] foo()
{
    char buf[100];
    return bar(buf);
}

It is unsafe even if bar doesn't return anything (it could store reference to a buf in some global variable, for example). Or accessing globals is considered unsafe now?

No, it's *potentially* unsafe. If bar is written like this:

@safe char[] bar(char[] x){ return x.dup;}

Then bar is completely safe in all contexts, and therefore foo is completely safe. Merely taking the address of a stack variable does not make a function unsafe. Is this unsafe?

char[] foo()
{
    char buf[100];
    return buf[0..50].dup;
}

What about this?

void foo(int a, int b)
{
    swap(a, b); // uses references to local variables, what if swap stores a reference to one of its args in a global?
}

You might understand that if these kinds of things are not allowed to be marked as safe, you might have non-stop complaints from new users and critics of D about how D's "safety" features are a joke, just like Vista's security popups are a joke. And then everything gets marked as trusted or unmarked, and SafeD becomes a complete waste of time.

We need to choose rules that are good for safety, but which allow intuitive code to be written.
 It is foo's fault that pointer to a stack allocated buffer is passed and  
 returned outside of the scope. The dangerous line is buf[], which gets a  
 slice out of a static array, not return bar(...). You could as well  
 write:

 char[] foo()
 {
      char buf[100];
      return buf[]; // no more bar, but code is still dangerous
 }

Whose fault it is is usually a fuzzy line. This is why definitions of what is allowed and what is not are important. Your example looks obvious, but there is code that does not look so obvious. Unless you know exactly how the data flows in the functions you call, you can't prove whether it's safe or not. I hope that someday the compiler can prove safety even through function calls, but we are a long way from that.

-Steve
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 08:31:07 -0500, Jason House  
<jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 So the signature of strstr has to be unmarked (no @safe or @trusted).

I disagree. Borrowing the syntax from the return const proposal, let's define strstr as follows:

inout(char[]) strstr(inout(char[]) buf, const(char[]) orig);

What I want that to tell the compiler is that buf, or some piece of buf, is returned from strstr. (please don't assign any more meaning than that, i.e. constness of buf). The compiler would then treat the return value with the same protection as buf, and a return without .dup is a compile error.

I've been in drawn out discussions with you before. If this post and my prior post don't make you budge from your position then I'll simply give up trying to convince you. It's not worth the aggravation.
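
Later versions of D gained an annotation in this spirit: `return scope` parameters, whose escape checking needs the DIP1000 preview switch. The sketch below is only an illustration and is not the inout syntax proposed above; `findFrom` is a made-up stand-in for strstr, and foo is deliberately left unmarked so the example does not depend on any compiler switch.

```d
// `return scope` says: buf must not escape, except through the return value.
@safe char[] findFrom(return scope char[] buf, char c)
{
    foreach (i, ch; buf)
    {
        if (ch == c)
            return buf[i .. $];   // a piece of buf is returned
    }
    return buf[$ .. $];           // empty slice when c is not found
}

char[] foo()
{
    char[5] buf = ['h', 'e', 'l', 'l', 'o'];
    // The result still points into buf, so copy it before returning.
    return findFrom(buf[], 'e').dup;
}

void main()
{
    assert(foo() == "ello");
}
```

With the annotation, the caller can treat the return value as having the same lifetime as buf, which is the "same protection" idea: returning it without .dup is what a checker would reject.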

Sure, we can stop discussing. I'll just say I think the escape analysis problem is more complicated than the scoped const problem. Simply because, scoped parameters are not necessarily non-mutable, whereas scoped const parameters are always treated as const. scoped const has one output (the return value) and N inputs. escape analysis has N inputs and M outputs. Annotation is going to be very hard for functions like swap. Simplifications are possible, but like I said, conservative line in the sand. -Steve
Nov 13 2009
prev sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 13 Nov 2009 06:42:24 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 19:29:28 -0500, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.

The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

char[] foo()
{
    char buf[100];
    // fill buf
    return strstr(buf, "hi").dup;
}

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve

Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit)

auto strstr(T,U)(T src, U substring)
    if(isRandomAccessRange!T &&
       isRandomAccessRange!U &&
       is(ElementType!U == ElementType!T))
{
    /* Do strstr */
}

char[] foo() {      // Returns type char[]
    char buf[100];  // Of type scope char[100]
    // fill buf
    // "hi" is type immutable(char)[]
    return strstr(buf[], "hi").dup; // returns a lent char[], which is dup-ed into a char[], which is okay to return
}

char[] foo2() {     // Returns type char[]
    char buf[100];  // Of type scope char[100]
    // fill buf
    // "hi" is type immutable(char)[]
    return strstr(buf[], "hi"); // Error, strstr returns a lent char[], not char[].
}

Your proposal depends on scope being a type modifier, which it currently is not. I think that's a separate issue to tackle. -Steve

Actually, scope is currently a somewhat-limited type modifier (i.e. scope classes, scope class allocation). My use of it here was mainly to illustrate the compiler's internal representation. Also, the use of scope keyword in the proposal was based on a blog by Walter, where 'scope' became a more universal type modifier. The point was you can handle a large number of escape analysis cases correctly using only the type system (more, of course with type system+local analysis).
Nov 13 2009