digitalmars.D - safe leak fix?

reply Walter Bright <newshound1 digitalmars.com> writes:
Consider the code:

    @safe:
     T[] foo(T[] a) { return a; }

     T[] bar()
     {
         T[10] x;
         return foo(x);
     }

Now we've got an escaping reference to bar's stack. This is not memory 
safe. But giving up slices is a heavy burden.

So it occurred to me that the same solution for closures can be used 
here. If the address is taken of a stack variable in a safe function, 
that variable is instead allocated on the heap. If a more advanced 
compiler could prove that the address does not escape, it could be put 
back on the stack.

The code will be a little slower, but it will be memory safe. This 
change wouldn't be done in trusted or unsafe functions.
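
A minimal sketch of what the proposal amounts to when written out by hand, assuming `int` in place of `T`. The names `barHeap` and `barCopy` are made up for illustration; the compiler transformation described above would do roughly what `barHeap` does, without the programmer spelling it out.

```d
import std.stdio : writeln;

// Pass-through function from the example above, with T fixed to int.
int[] foo(int[] a) { return a; }

// Hand-written equivalent of the proposed rewrite: the buffer lives on
// the GC heap, so the returned slice stays valid after the call returns.
int[] barHeap()
{
    int[] x = new int[10];
    x[] = 7;
    return foo(x);
}

// Alternative: keep the stack buffer, but copy before the slice escapes.
int[] barCopy()
{
    int[10] x = 7;
    return foo(x[]).dup;
}

void main()
{
    writeln(barHeap().length); // 10
    writeln(barCopy()[0]);     // 7
}
```

Both variants pay for one heap allocation per call, which is the "little slower" cost mentioned above.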
Nov 11 2009
next sibling parent BCS <none anon.com> writes:
Hello Walter,

Sounds good. If it happens, I'd vote for a push on the static analysis to do those proofs.
Nov 11 2009
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-11 16:47:10 -0500, Walter Bright <newshound1 digitalmars.com> said:

> So it occurred to me that the same solution for closures can be used
> here. If the address is taken of a stack variable in a safe function,
> that variable is instead allocated on the heap. If a more advanced
> compiler could prove that the address does not escape, it could be put
> back on the stack.
>
> The code will be a little slower, but it will be memory safe. This
> change wouldn't be done in trusted or unsafe functions.

Interesting. This is exactly what I proposed a few months ago while we were endlessly discussing scope as a function argument modifier: automatic heap allocation of all escaping variables. Of course I'm all for it. :-)

But now you should consider whether or not it should do the same in unsafe D. If it doesn't, unsafe D will crash on things safe D won't crash on. If you do this in unsafe D, you need a way to force a variable not to be heap allocated whatever happens. (Perhaps using 'scope' as a storage modifier for variables.)

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 11 2009
prev sibling next sibling parent reply grauzone <none example.net> writes:
Walter Bright wrote:
> The code will be a little slower, but it will be memory safe. This
> change wouldn't be done in trusted or unsafe functions.

That's just idiotic. One of the main uses of static arrays is to _avoid_ heap memory allocation in the first place. Do what you want within safe, but leave "unsafe" (oh god what a pejorative) alone.
Nov 11 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
> Walter Bright wrote:
>> The code will be a little slower, but it will be memory safe. This
>> change wouldn't be done in trusted or unsafe functions.
> That's just idiotic. One of the main uses of static arrays is to _avoid_ heap memory allocation in the first place. Do what you want within safe, but leave "unsafe" (oh god what a pejorative) alone.

Well, I did propose only doing this in safe functions!

Also, I agree with "unsafe" being a pejorative. Got any better ideas? "unchecked"?
Nov 11 2009
next sibling parent reply grauzone <none example.net> writes:
Walter Bright wrote:
> grauzone wrote:
>> Walter Bright wrote:
>>> The code will be a little slower, but it will be memory safe. This
>>> change wouldn't be done in trusted or unsafe functions.
>> That's just idiotic. One of the main uses of static arrays is to _avoid_ heap memory allocation in the first place. Do what you want within safe, but leave "unsafe" (oh god what a pejorative) alone.
> Well, I did propose only doing this in safe functions!

In this case, the semantic difference between safe and unsafe functions will cause trouble, and you'd eventually end up imposing the "safe" semantics upon unsafe functions.

I'd vote for disallowing slicing in safe functions. Safe code can just use dynamic arrays instead. One other important use of arrays will be small SSE optimized vectors (as far as I understood that), but those should be fine in safe mode; usually you won't want to slice them.

> Also, I agree with "unsafe" being a pejorative. Got any better ideas?
> "unchecked"?

Some brainstorming: highperf, fast mode, system mode, lowlevel, bare-metal, turbo mode (silly but fun), ...?
Nov 11 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
> In this case, the semantic difference between safe and unsafe functions
> will cause trouble, and you'd eventually end up imposing the "safe"
> semantics upon unsafe functions.

May be.

> I'd vote for disallowing slicing in safe functions. Safe code can just
> use dynamic arrays instead. One other important use of arrays will be
> small SSE optimized vectors (as far as I understood that), but those
> should be fine in safe mode; usually you won't want to slice them.

I thought of that, but I think it's too restrictive.

>> Also, I agree with "unsafe" being a pejorative. Got any better ideas?
>> "unchecked"?
> Some brainstorming: highperf, fast mode, system mode, lowlevel, bare-metal, turbo mode (silly but fun), ...?

"system" sounds good.
Nov 11 2009
next sibling parent reply grauzone <none example.net> writes:
Walter Bright wrote:
> grauzone wrote:
>> In this case, the semantic difference between safe and unsafe
>> functions will cause trouble, and you'd eventually end up imposing the
>> "safe" semantics upon unsafe functions.
> May be.

Returning a slice to a local static array would be fine in safe mode, but lead to silent corruption in "unsafe" mode. I think everyone would assume that, if something works in safe mode, it should also work in unsafe mode.

So, it's really a "must", or is there some other way around it?
Nov 11 2009
parent Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
> Walter Bright wrote:
>> grauzone wrote:
>>> In this case, the semantic difference between safe and unsafe
>>> functions will cause trouble, and you'd eventually end up imposing
>>> the "safe" semantics upon unsafe functions.
>> May be.
> Returning a slice to a local static array would be fine in safe mode, but lead to silent corruption in "unsafe" mode. I think everyone would assume that, if something works in safe mode, it should also work in unsafe mode.

It's a good point.

> So, it's really a "must", or is there some other way around it?

I don't know.
Nov 11 2009
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

>I thought of that, but I think it's too restrictive.<

I agree. A possible solution to this problem (Ellery Newcomer may have said the same thing): in safe functions, require locally some kind of annotation that turns that into a safe heap allocation (and at the same time marks such a heap allocation in a visible way). In unsafe functions such an annotation is optional, while in safe code you must put it if you want to use the feature.

>"system" sounds good.<

"unsafe" is still good for that purpose because it's like the __ for gshared: it's designed on purpose to look less nice.

Bye,
bearophile
Nov 11 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
> >"system" sounds good.<
> "unsafe" is still good for that purpose because it's like the __ for gshared: it's designed on purpose to look less nice.

"Unsafe" is also a misnomer. It implies the code is broken. I don't like it.
Nov 11 2009
parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
> bearophile wrote:
>> >"system" sounds good.<
>> "unsafe" is still good for that purpose because it's like the __ for gshared: it's designed on purpose to look less nice.
> "Unsafe" is also a misnomer. It implies the code is broken. I don't like it.

There are definitely functions which are dangerous if you pass them invalid parameters, i.e., "use at own risk" -- any function which uses them needs to add its own tests. I think something which implies "you should think before you use this function" is reasonable.

I don't care at all what the name is, however. system would be OK.
Nov 12 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Don (nospam nospam.com)'s article
> There are definitely functions which are dangerous if you pass them
> invalid parameters, i.e., "use at own risk" -- any function which uses
> them needs to add its own tests. I think something which implies "you
> should think before you use this function" is reasonable.
> I don't care at all what the name is, however.  system would be OK.

Yeah, and sometimes the functions that are unsafe when passed invalid parameters aren't obvious. Granted this is an extreme corner case, but I recently debugged an access violation that was occurring in a well-tested sorting function that I would have definitely annotated trusted.

I was sorting on floating point keys, and it turned out there were NaNs in there and the sort function assumed that there would be a proper total ordering. If the pivot element was a NaN, it would therefore enter an endless loop because there was nothing in the array that was <= the pivot, until it read past the end of the array. This was a latent bug for a long time and only showed up when I ran the program with parameters that generated NaNs.

Of course the real solution here is to get rid of the #()&# lack of total ordering for floats.
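
The failure mode described above follows from how IEEE 754 comparisons treat NaN: every ordered comparison involving NaN is false, so a partition loop that waits for an element `<= pivot` may never find one. A small sketch with made-up data, showing the comparisons and one possible guard (dropping NaNs before sorting). The helper `sortedWithoutNaN` is hypothetical and is not the sort function from the post.

```d
import std.algorithm : filter, sort;
import std.array : array;
import std.math : isNaN;
import std.stdio : writeln;

// Returns a sorted copy of `keys` with NaN values removed
// (hypothetical helper, not part of any library).
double[] sortedWithoutNaN(double[] keys)
{
    auto clean = keys.filter!(k => !isNaN(k)).array;
    sort(clean);
    return clean;
}

void main()
{
    double pivot = double.nan;
    // All ordered comparisons with NaN are false.
    writeln(1.0 <= pivot); // false
    writeln(1.0 >= pivot); // false

    writeln(sortedWithoutNaN([3.0, double.nan, 1.0, 2.0]).length); // 3
}
```

Whether NaNs should be dropped, rejected, or ordered by some total-order comparison is a design decision for the caller; the point is only that an unchecked `<` does not give a total order on floats.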
Nov 12 2009
parent Walter Bright <newshound1 digitalmars.com> writes:
dsimcha wrote:
> == Quote from Don (nospam nospam.com)'s article
>> There are definitely functions which are dangerous if you pass them
>> invalid parameters, i.e., "use at own risk" -- any function which uses
>> them needs to add its own tests. I think something which implies "you
>> should think before you use this function" is reasonable.
>> I don't care at all what the name is, however.  system would be OK.
> Yeah, and sometimes the functions that are unsafe when passed invalid parameters aren't obvious.

Also, even safe functions cannot be guaranteed safe if they are passed garbage as arguments.
Nov 12 2009
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

> Also, I agree with "unsafe" being a pejorative. Got any better ideas?
> "unchecked"?

Naming it "unsafe" is OK because it's already used in C#, and because unsafe code is indeed worse than safe code (a lot of people today want safety), so it's a fitting name.

Languages like C#, Java, etc., were designed for safety from day 0, and then had optimizations added on top to make them fast too (and today Java is sometimes about as fast as C++, despite lacking things such as arrays of structs). D is now doing the opposite, but I think this attempt to create a SafeD may require a lot of work, and in the end holes in the safety net may still be possible... I hope the design of SafeD will go well.

Bye,
bearophile
Nov 11 2009
prev sibling next sibling parent Brad Roberts <braddr bellevue.puremagic.com> writes:
On Wed, 11 Nov 2009, Walter Bright wrote:

> So it occurred to me that the same solution for closures can be used here. If
> the address is taken of a stack variable in a safe function, that variable is
> instead allocated on the heap. If a more advanced compiler could prove that
> the address does not escape, it could be put back on the stack.
>
> The code will be a little slower, but it will be memory safe. This change
> wouldn't be done in trusted or unsafe functions.

I think safe vs unsafe causing a behavior change is a really bad idea. They're contracts / constraints, not modifiers.

- Brad
Nov 11 2009
prev sibling next sibling parent Frank Benoit <keinfarbton googlemail.com> writes:
Walter Bright wrote:
> Consider the code:
>
>    safe:
>     T[] foo(T[] a) { return a; }
>
>     T[] bar()
>     {
>         T[10] x;
>         return foo(x);
>     }

If D had something like a slice-info which could be returned instead of the slice itself, then foo would be safe. A slice-info would be something like a struct/Tuple storing the start and end index. Applied to the original array, it gives the slice.

    SliceInfo foo( T[] a){
        // do something, resulting in e.g. a[2..6]
        return SliceInfo(2, 6);
    }

    T[] bar(){
        T[] x = new T[10];
        return x[foo(x)]; // safe compile OK
    }

    T[] bar(){
        T[10] x;
        return x[foo(x)]; // safe error, because x slice escapes
    }

This shifts responsibility for memory safety to the caller with little extra effort.
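
A runnable sketch of the slice-info idea, assuming `int` in place of `T`. The `x[foo(x)]` indexing syntax in the post is not something D accepts as written, so the hypothetical member function `of` stands in for "apply the indices to the original array"; the field names are also assumptions.

```d
import std.stdio : writeln;

// Index pair describing a slice without holding a reference to the data.
struct SliceInfo
{
    size_t start;
    size_t end;

    // Apply the stored bounds to an array owned by the caller.
    T[] of(T)(T[] source) const
    {
        return source[start .. end];
    }
}

// Inspects the data but returns only indices, so no reference to the
// argument can escape through the return value.
SliceInfo foo(const int[] a)
{
    // Pretend the interesting part is a[2 .. 6].
    return SliceInfo(2, 6);
}

int[] bar()
{
    int[] x = new int[10]; // heap-backed, so the result may escape
    foreach (i, ref e; x)
        e = cast(int) i;
    return foo(x).of(x);
}

void main()
{
    writeln(bar()); // [2, 3, 4, 5]
}
```

The slicing happens in `bar`, where the storage of `x` is known, which is the shift of responsibility to the caller that the post describes.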
Nov 11 2009
prev sibling next sibling parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Walter Bright wrote:
> So it occurred to me that the same solution for closures can be used
> here. If the address is taken of a stack variable in a safe function,
> that variable is instead allocated on the heap. If a more advanced
> compiler could prove that the address does not escape, it could be put
> back on the stack.
>
> The code will be a little slower, but it will be memory safe. This
> change wouldn't be done in trusted or unsafe functions.

Enter random annotation which asserts the function is allocated on the stack?
Nov 11 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

> So it occurred to me that the same solution for closures can be used
> here. If the address is taken of a stack variable in a safe function,
> that variable is instead allocated on the heap. If a more advanced
> compiler could prove that the address does not escape, it could be put
> back on the stack.
>
> The code will be a little slower, but it will be memory safe. This
> change wouldn't be done in trusted or unsafe functions.

At a fundamental level, safety isn't about pointers or references to stack variables, but rather preventing their escape beyond function scope. Scope parameters could be very useful. Scope delegates were introduced for a similar reason.
Nov 11 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
> At a fundamental level, safety isn't about pointers or references to
> stack variables, but rather preventing their escape beyond function
> scope. Scope parameters could be very useful. Scope delegates were
> introduced for a similar reason.

The problem is, they aren't so easy to prove correct.
Nov 11 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

> Jason House wrote:
>> At a fundamental level, safety isn't about pointers or references to
>> stack variables, but rather preventing their escape beyond function
>> scope. Scope parameters could be very useful. Scope delegates were
>> introduced for a similar reason.
> The problem is, they aren't so easy to prove correct.

I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.
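
The rule sketched above can be illustrated with D's `scope` parameter storage class standing in for "noescape". The function names are made up, and how strictly a given compiler checks `scope` depends on its version and flags, so the comments describe the intended contract and not guaranteed diagnostics.

```d
import std.stdio : writeln;

// `scope` states the contract: the slice is only read during the call
// and is never stored or returned.
int total(scope const(int)[] data)
{
    int sum = 0;
    foreach (value; data)
        sum += value;
    return sum;
}

int caller()
{
    int[4] local = [1, 2, 3, 4];
    // Passing a stack slice is acceptable under the rule, because the
    // callee's parameter is marked as non-escaping.
    return total(local[]);
}

void main()
{
    writeln(caller()); // 10
}
```

Under this scheme, passing `local[]` to a function whose parameter lacks the marker would be the compile error, which is what makes the check local to each call site.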
Nov 12 2009
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
<jason.james.house gmail.com> wrote:

> I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.

The problem is cases like this:

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi").dup;
    }

This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.

Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.

-Steve
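
A self-contained version of the example above, with a hand-written `findSub` standing in for `strstr` (the name and signature are assumptions made for this sketch). It shows why the signature alone does not reveal the aliasing: the returned slice points into whatever buffer the caller passed.

```d
import std.stdio : writeln;

// Stand-in for strstr: returns the tail of `haystack` starting at the
// first occurrence of `needle`, or null. The result aliases the caller's
// memory.
char[] findSub(char[] haystack, const(char)[] needle)
{
    if (needle.length <= haystack.length)
        foreach (i; 0 .. haystack.length - needle.length + 1)
            if (haystack[i .. i + needle.length] == needle)
                return haystack[i .. $];
    return null;
}

char[] foo()
{
    char[100] buf = ' ';
    buf[10 .. 12] = "hi";
    // The slice from findSub points into buf; .dup copies it to the heap
    // before foo's stack frame goes away.
    return findSub(buf[], "hi").dup;
}

void main()
{
    writeln(foo()[0 .. 2]); // hi
}
```

Dropping the `.dup` would compile in unchecked code and return a slice of a dead stack frame, which is the case a conservative rule has to reject.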
Nov 12 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

> The problem is cases like this:
>
>     char[] foo()
>     {
>         char buf[100];
>         // fill buf
>         return strstr(buf, "hi").dup;
>     }
>
> This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them.
>
> Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it.
>
> -Steve

What's the signature of strstr? Your example really boils down to proving strstr is safe.

You're implying that the return of buf from strstr is unsafe. Indeed, my intentionally short post didn't discuss returning from functions. Ignoring that for a moment, surely you'd agree the following is safe:

    char[] foo(){
        char[100] buf;
        copystringintobuf(buf, "hi");
        return buf[0..2].dup;
    }

As far as return types, there are two subtle issues:
1. A returned input argument must preserve the scope requirements of the caller. This is a similar problem to return variables and const annotation.
2. Unlike const annotations, there are more than three states for scope; it's simply a measure of how deep/shallow variables can be in the stack.
Nov 12 2009
next sibling parent reply Nick B <"nick_NOSPAM_.barbalich" gmail.com> writes:
Overview:

The AMD Advanced Synchronization Facility (ASF) is an experimental 
instruction set extension for the AMD64 architecture that would provide 
new capabilities for efficient synchronization of access to shared data 
in highly multithreaded applications as well as operating system 
kernels. ASF provides a means for software to inspect and update 
multiple shared memory locations atomically without having to rely on 
locks for mutual exclusion. It is intended to facilitate lock-free 
programming for highly concurrent shared data structures, allowing more 
complex and higher performance manipulation of such structures than is 
practical with traditional techniques based on compare-swap instructions 
such as CMPXCHG16B. ASF code can also interoperate with lock-based code, 
or with _Software Transactional Memory_.

Some basic usage examples of ASF are provided in the specification. 
However, we expect the programming community could readily use the power 
and flexibility of ASF to implement very sophisticated, robust and 
innovative concurrent data structure algorithms, and we encourage such 
experimentation. AMD will be releasing a simulation framework in the 
near future to facilitate this.

AMD is releasing this proposal to encourage the parallel programming 
community to review and comment on it. Such input will help shape the 
ultimate direction of this feature, so that it may best serve the needs 
of advanced parallel application developers.

Discussion:
http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=114715&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29
  and here
http://forums.amd.com/devblog/blogpost.cfm?catid=317&threadid=118419&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AmdDeveloperBlogs+%28AMD+Developer+Blogs%29


The spec can be found here:

http://developer.amd.com/cpu/ASF/Pages/default.aspx


regards
Nick B
Nov 12 2009
parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 05:23:08 +0300, Nick B  
<nick_NOSPAM_.barbalich gmail.com> wrote:

<offtopic> Please start a new thread by clicking "Create New" button (or similar), not by replying to an existing thread. Thanks! </offtopic>
Nov 13 2009
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
<jason.james.house gmail.com> wrote:

> What's the signature of strstr? Your example really boils down to proving strstr is safe.

The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

> You're implying that the return of buf from strstr is unsafe. Indeed, my
> intentionally short post didn't discuss returning from functions.
> Ignoring that for a moment, surely you'd agree the following is safe:
>
> char[] foo(){
>     char[100] buf;
>     copystringintobuf(buf, "hi");
>     return buf[0..2].dup;
> }
>
> As far as return types, there are two subtle issues:
> 1. Returned input argument must preserve the scope requirements of the
> caller. A similar problem as return variables and const annotation.
> 2. Unlike const annotations, there is more than three states for scope,
> it's simply a measure of how deep/shallow variables can be in the stack.

Yes, but I think such an annotation system is unworkable. I'd rather see the compiler annotate into an intermediate file. Even with those, you would be hard pressed to be able to prove all cases when the scope depth depends on runtime values. That would require runtime checks.

I think escape analysis is a worthy goal, but very hard to implement. Just allocating when you can't prove anything is a decent solution.

-Steve
Nov 13 2009
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

> The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).

Any example of how unsafe strstr may be?

BTW, strstr is no different from std.algorithm.find:

    import std.algorithm;

    char[] foo()
    {
        char[5] buf = ['h', 'e', 'l', 'l', 'o'];
        char[] result = find(buf[], 'e');

        return result.dup;
    }

I don't see why a general-purpose searching algorithm is unsafe.
Nov 13 2009
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 07:01:25 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

Any example of how unsafe strstr may be?
Sure (with the current compiler):

    char[] foo()
    {
        char buf[100];
        // fill buf
        return strstr(buf, "hi"); // no .dup, buf escapes
    }

The whole meaning of safe is fuzzy, because we don't know the safe rules with regards to passing references to local data. But I think the goal is to make it so strstr can be marked as safe. In order to do that, foo must be required to be unmarked or trusted, or foo allocates buf on the heap.

The point I was trying to make to Jason is that escape analysis is more complicated than just marking parameters as noescape -- you leave out some provably safe functions.
 BTW, strstr is no different from std.algorithm.find:

 import std.algorithm;

 char[] foo()
 {
      char[5] buf = ['h', 'e', 'l', 'l', 'o'];
      char[] result = find(buf[], 'e');

      return result.dup;
 }

 I don't see why a general-purpose searching algorithm is unsafe.
It isn't inherently unsafe. It's just difficult for the compiler to see, from a function signature alone, where the data flows, and escape analysis requires full data-flow disclosure.

Walter's proposal of heap-allocating when a safe function passes the address of a local to another safe function is perfectly acceptable to me. I'd also like to see cases where you can mark the input parameter as scope, potentially optimizing out the allocation (but then you cannot return the scope parameter or a reference to any part of it).

-Steve
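To make the signature problem concrete, here is a minimal sketch in D. The helper names tailOf and copyOf are hypothetical and do not come from the thread. Both have the same signature, yet only one of them returns memory that aliases its argument, so a caller looking at the signature alone cannot tell whether the result may outlive a stack buffer.

```d
import std.stdio;

// Hypothetical helper: returns a slice INTO its argument (aliases the input).
char[] tailOf(char[] x)
{
    return x[1 .. $];
}

// Hypothetical helper: same signature, but returns a fresh heap copy.
char[] copyOf(char[] x)
{
    return x[1 .. $].dup;
}

void main()
{
    char[5] buf = "hello";
    char[] a = tailOf(buf[]); // still points into buf
    char[] b = copyOf(buf[]); // independent heap memory

    buf[1] = 'a'; // mutate the stack buffer

    writeln(a); // aliases buf, so it sees the change
    writeln(b); // the copy is unaffected
}
```

Returning a from the enclosing function would leak a reference to its stack frame, while returning b would not, even though the two calls look identical to the compiler.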
Nov 13 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 15:29:20 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:01:25 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:

 On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or  
references to
 stack variables, but rather preventing their escape beyond  
function
 scope. Scope parameters could be very useful. Scope delegates  
were
 introduced for a similar reason.
The problem is, they aren't so easy to prove correct.
I understand the general problem with escape analysis, but I've
always
 thought of scope input as meaning  noescape. That should lead to  
easy
 proofs. If my  noescape input (or slice of an array on the stack)  
is
 passed to a function without  noescape, it's a compile error. That
 reduces escape analysis to local verification.
The problem is cases like this: char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi").dup; } This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them. Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it. -Steve
what's the signature of strstr? Your example really boils down to proving strstr is safe.
The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).
Any example of how unsafe strstr may be?
Sure (with the current compiler): char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi"); // no .dup, buf escapes }
No, no, no! It's foo which is unsafe in your example, not strstr!
 The whole meaning of safe is fuzzy, because we don't know the safe rules  
 with regards to passing references to local data.  But I think the goal  
 is to make it so strstr can be marked as safe.  In order to do that, foo  
 must be required to be unmarked or  trusted, or foo allocates buf on the  
 heap.

 The point I was trying to make to Jason is that escape analysis is more  
 complicated than just marking parameters as  noescape -- you leave out  
 some provably safe functions.

 BTW, strstr is no different from std.algorithm.find:

 import std.algorithm;

 char[] foo()
 {
      char[5] buf = ['h', 'e', 'l', 'l', 'o'];
      char[] result = find(buf[], 'e');

      return result.dup;
 }

 I don't see why a general-purpose searching algorithm is unsafe.
It isn't inherently unsafe. It's just difficult for the compiler to see just from a function signature where the data flows, and escape analysis requires full data-flow disclosure. I think with Walter's proposal of allocating when a safe function passes an address to a local to another safe function is perfectly acceptable to me. I'd also like to see cases where you can mark the input parameter as scope, potentially optimizing out the allocation (but then you cannot return the scope parameter or a reference to any part of it). -Steve
I don't like his proposal at all. It introduces one more hidden allocation. Why not just write

    char[] buf = new char[100];

and disallow taking a slice of a static array? (Andrei already hinted this will be disallowed in safe, if I understood him right.)

Speaking about safety, I don't know how we can allow pointers in safe D:

    void foo()
    {
        int* p = new int;
        p[1000] = 0; // Will it crash or not? Is this defined behavior, or not?
        // If not, this must be disallowed in safe D
    }

And, most importantly, *why* would users want to work with pointers in safe D at all?
Nov 13 2009
next sibling parent grauzone <none example.net> writes:
Denis Koroskin wrote:
 I don't like his proposal at all. It introduces one more hidden 
 allocation. Why not just write
 
 char[] buf = new char[100];
 
 and disallow taking a slice of static array? (Andrei already hinted this 
 will be disallowed in  safe, if I understood him right).
I think that would be the best. What uses of static arrays are there?
- allocating memory "inline" (eh, you better not use SafeD if you need this! new always works)
- as value types, e.g. small vectors (don't really need slices in this case)
- ...?
 Speaking about safety, I don't know how we can allow pointers in safe D:
 
 void foo()
 {
    int* p = new int;
    p[1000] = 0; // Will it crash or not? Is this a defined behavior, or 
 not?
    // If not, this must be disallowed in safe D
 }
 
 And, most importantly, *why* users would want to work with pointers in 
 safe D at all?
As far as I understood, pointers are supposed to be allowed in SafeD. You just aren't allowed to do the following things:
- pointer arithmetic
- turning arrays into slices
- taking address (messy one!)
- (unsafe) casts between pointers
- array.ptr
- probably more
Nov 13 2009
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }
No, no, no! It's foo which is unsafe in your example, not strstr!
OK, tell me if foo is now safe or unsafe:

    safe char[] bar(char[] x);

    char[] foo()
    {
        char buf[100];
        return bar(buf);
    }

This is how the compiler looks at the code. It doesn't know what strstr does. For all it knows, bar (or strstr) could allocate heap data based on x and is perfectly safe.
 I don't like his proposal at all. It introduces one more hidden  
 allocation. Why not just write

 char[] buf = new char[100];

 and disallow taking a slice of static array? (Andrei already hinted this  
 will be disallowed in  safe, if I understood him right).
A major performance gain in D is to use stack-allocated buffers for things as opposed to heap-allocated buffers. The proposal allows lots of existing code to be marked as safe without having to add the explicit allocations.

I have mixed feelings on the whole thing. I think disallowing a high performance technique such as stack buffer allocation is going to make safe code much less attractive, especially when it's very easy to write provably safe code that uses stack buffers. It's going to confuse and frustrate developers that want to use such buffers.

The one good thing I see about the proposal is the heap allocations could be optimized out later if the compiler can get smarter, without having to go remove all those manual heap allocations you added.

The other side of the coin is that you just have to mark your functions as trusted instead of safe. Then when the compiler gets smarter, you have to go back and change those functions to safe. That's also a possible solution.
 Speaking about safety, I don't know how we can allow pointers in safe D:

 void foo()
 {
     int* p = new int;
     p[1000] = 0; // Will it crash or not? Is this a defined behavior, or  
 not?
     // If not, this must be disallowed in safe D
 }

 And, most importantly, *why* users would want to work with pointers in  
 safe D at all?
I agree with you on this. But slicing a stack array is not exactly the same as taking a pointer and using unbounded pointer arithmetic. It has the potential to escape scope, but not the potential (at least in safe mode) of accessing data outside the array. -Steve
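The distinction between a slice and a raw pointer can be sketched as follows. This is an illustration only: the helper outOfBounds is hypothetical, and the sketch assumes a default build in which array bounds checks are enabled (no -release or -boundscheck=off). Indexing a slice is checked against its length, whereas indexing a raw pointer such as p[1000] carries no length to check against.

```d
import core.exception : RangeError;

// Hypothetical helper: returns true if writing a[i] trips the runtime
// bounds check. Catching an Error like this is for demonstration only.
bool outOfBounds(int[] a, size_t i)
{
    try
    {
        a[i] = 0;
        return false;
    }
    catch (RangeError e)
    {
        return true;
    }
}

void main()
{
    int[10] buf;
    assert(!outOfBounds(buf[], 9)); // last valid index
    assert(outOfBounds(buf[], 10)); // one past the end is rejected
}
```

The slice can still escape the scope of buf, which is the problem this thread is about, but it cannot be used to reach memory outside the ten elements.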
Nov 13 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 16:16:29 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }
No, no, no! It's foo which is unsafe in your example, not strstr!
OK, tell me if foo is now safe or unsafe: safe char[] bar(char[] x); char[] foo() { char buf[100]; return bar(buf); }
It is unsafe even if bar doesn't return anything (it could store a reference to buf in some global variable, for example). Or is accessing globals considered unsafe now?

It is foo's fault that a pointer to a stack-allocated buffer is passed and returned outside of the scope. The dangerous line is buf[], which gets a slice out of a static array, not return bar(...). You could as well write:

    char[] foo()
    {
        char buf[100];
        return buf[]; // no more bar, but code is still dangerous
    }
Nov 13 2009
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 08:45:28 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 On Fri, 13 Nov 2009 16:16:29 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Fri, 13 Nov 2009 07:46:02 -0500, Denis Koroskin <2korden gmail.com>  
 wrote:


 Sure (with the current compiler):

 char[] foo()
 {
    char buf[100];
    // fill buf
    return strstr(buf, "hi"); // no .dup, buf escapes
 }
No, no, no! It's foo which is unsafe in your example, not strstr!
OK, tell me if foo is now safe or unsafe: safe char[] bar(char[] x); char[] foo() { char buf[100]; return bar(buf); }
It is unsafe even if bar doesn't return anything (it could store reference to a buf in some global variable, for example). Or accessing globals is considered unsafe now?
No, it's *potentially* unsafe. If bar is written like this:

    safe char[] bar(char[] x){ return x.dup;}

Then bar is completely safe in all contexts, and therefore foo is completely safe. Merely taking the address of a stack variable does not make a function unsafe. Is this unsafe?

    char[] foo()
    {
        char buf[100];
        return buf[0..50].dup;
    }

What about this?

    void foo(int a, int b)
    {
        swap(a, b); // uses references to local variables, what if swap stores a reference to one of its args in a global?
    }

You might understand that if these kinds of things are not allowed to be marked as safe, you might have non-stop complaints from new users and critics of D about how D's "safety" features are a joke, just like Vista's security popups are a joke. And then everything gets marked as trusted or unmarked, and SafeD becomes a complete waste of time.

We need to choose rules that are good for safety, but which allow intuitive code to be written.
 It is foo's fault that pointer to a stack allocated buffer is passed and  
 returned outside of the scope. The dangerous line is buf[], which gets a  
 slice out of a static array, not return bar(...). You could as well  
 write:

 char[] foo()
 {
      char buf[100];
      return buf[]; // no more bar, but code is still dangerous
 }
Most of the time, whose fault it is is a fuzzy line. This is why definitions of what is allowed and what is not are important. Your example looks obvious, but there is code that does not look so obvious. Unless you know exactly the flow of the data in the functions you call, you can't prove whether it's safe or not.

I hope that someday the compiler can prove safety even through function calls, but we are a long way away from that.

-Steve
Nov 13 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
 <jason.james.house gmail.com> wrote:
 
 Steven Schveighoffer Wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references  
to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.
The problem is, they aren't so easy to prove correct.
I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.
The problem is cases like this: char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi").dup; } This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them. Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it. -Steve
what's the signature of strstr? Your example really boils down to proving strstr is safe.
The problem is, strstr isn't safe by itself, it's only safe in certain contexts. You can't mark it as trusted either because it has the potential to be unsafe. I think if safe D heap-allocates when it passes a local address into an unprovable function such as strstr, that's fine with me. So the signature of strstr has to be unmarked (no safe or trusted).
I disagree. Borrowing the syntax from the return const proposal, let's define strstr as follows:

    inout(char[]) strstr(inout(char[]) buf, const(char[]) orig);

What I want that to tell the compiler is that buf, or some piece of buf, is returned from strstr (please don't assign any more meaning than that, i.e. constness of buf). The compiler would then treat the return value with the same protection as buf, and a return without .dup is a compile error.

I've been in drawn-out discussions with you before. If this post and my prior post don't make you budge from your position, then I'll simply give up trying to convince you. It's not worth the aggravation.
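One way an annotation of this kind could look is sketched below. Nothing in it is taken from the thread: the function name findSub is hypothetical, and the return scope spelling is borrowed, as an assumption, from the DIP1000 preview in later D compilers, where checking is only enforced when that preview is enabled. The idea matches the one described above: the signature states that the result may alias buf, so the caller has to treat the result with the same care as buf itself.

```d
// Hypothetical substring search. "return scope" on buf is meant to say:
// the result may point into buf, and buf does not escape any other way.
char[] findSub(return scope char[] buf, scope const(char)[] needle) @safe
{
    foreach (i; 0 .. buf.length)
    {
        // Compare only when enough characters remain for a full match.
        if (buf.length - i >= needle.length
            && buf[i .. i + needle.length] == needle)
        {
            return buf[i .. $]; // slice of the input, not a copy
        }
    }
    return null; // not found
}

char[] caller()
{
    char[5] buf = "hello";
    // Copying before returning keeps the stack buffer from escaping.
    return findSub(buf[], "ll").dup;
}
```

Under such a scheme, dropping the .dup in caller is the case the compiler would be expected to reject.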
Nov 13 2009
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 13 Nov 2009 08:31:07 -0500, Jason House  
<jason.james.house gmail.com> wrote:

 Steven Schveighoffer Wrote:

 So the signature of strstr has to be unmarked (no  safe or  trusted).
I disagree. Borrowing the syntax from the return const proposal, let's define strstr as follows:

    inout(char[]) strstr(inout(char[]) buf, const(char[]) orig);

What I want that to tell the compiler is that buf, or some piece of buf, is returned from strstr (please don't assign any more meaning than that, i.e. constness of buf). The compiler would then treat the return value with the same protection as buf, and a return without .dup is a compile error.

I've been in drawn-out discussions with you before. If this post and my prior post don't make you budge from your position, then I'll simply give up trying to convince you. It's not worth the aggravation.
Sure, we can stop discussing. I'll just say I think the escape analysis problem is more complicated than the scoped const problem. Simply because, scoped parameters are not necessarily non-mutable, whereas scoped const parameters are always treated as const. scoped const has one output (the return value) and N inputs. escape analysis has N inputs and M outputs. Annotation is going to be very hard for functions like swap. Simplifications are possible, but like I said, conservative line in the sand. -Steve
Nov 13 2009
prev sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.
The problem is, they aren't so easy to prove correct.
I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.
The problem is cases like this: char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi").dup; } This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them. Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it. -Steve
Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit)

    auto strstr(T,U)(T src, U substring)
        if(isRandomAccessRange!T && isRandomAccessRange!U &&
           is(ElementType!U == ElementType!T))
    { /* Do strstr */ }

    char[] foo() {          // Returns type char[]
        char buf[100];      // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi").dup; // returns a lent char[], which is dup-ed into a char[], which is okay to return
    }

    char[] foo2() {         // Returns type char[]
        char buf[100];      // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi"); // Error, strstr returns a lent char[], not char[].
    }

    lent char[] foo3() {    // Returns type lent char[]
        char buf[100];      // Of type scope char[100]
        // fill buf
        // "hi" is type immutable(char)[]
        return strstr(buf[], "hi"); // Error, scope char[] cannot be implicitly converted to lent char[] inside a lent char[] function: possible escape.
    }

    char[] foo4() {         // Returns type char[]
        char buf[100];      // Of type scope char[100]
        return buf;         // Error, return type is char[] not char[100].
    }

    char[] foo5() {         // Returns type char[]
        char buf[100];      // Of type scope char[100]
        return buf[];       // Error, return type is char[] not scope char[].
    }

Here's an (outdated and confusing) proposal I put together a while ago (it's pre-DIP):
http://prowiki.org/wiki4d/wiki.cgi?OwnershipTypesInD
In it, I used stack and scope instead of scope and lent.
Nov 12 2009
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 12 Nov 2009 19:29:28 -0500, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.
The problem is, they aren't so easy to prove correct.
I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.
The problem is cases like this: char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi").dup; } This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them. Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it. -Steve
Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit) auto strstr(T,U)(T src, U substring) if(isRandomAccessRange!T && isRandomAccessRange!U && is(ElementType!U == ElementType!T) { /* Do strstr */ } char[] foo() { // Returns type char[] char buf[100]; // Of type scope char[100] // fill buf // "hi" is type immutable(char)[] return strstr(buf[], "hi").dup; // returns a lent char[], which is dup-ed into a char[], which is okay to return } char[] foo2() { // Returns type char[] char buf[100]; // Of type scope char[100] // fill buf // "hi" is type immutable(char)[] return strstr(buf[], "hi"); // Error, strstr returns a lent char[], not char[]. }
Your proposal depends on scope being a type modifier, which it currently is not. I think that's a separate issue to tackle. -Steve
Nov 13 2009
parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 13 Nov 2009 06:42:24 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 19:29:28 -0500, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 12 Nov 2009 08:56:25 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 12 Nov 2009 08:45:36 -0500, Jason House  
 <jason.james.house gmail.com> wrote:

 Walter Bright Wrote:

 Jason House wrote:
 At a fundamental level, safety isn't about pointers or references  
to
 stack variables, but rather preventing their escape beyond function
 scope. Scope parameters could be very useful. Scope delegates were
 introduced for a similar reason.
The problem is, they aren't so easy to prove correct.
I understand the general problem with escape analysis, but I've always thought of scope input as meaning noescape. That should lead to easy proofs. If my noescape input (or slice of an array on the stack) is passed to a function without noescape, it's a compile error. That reduces escape analysis to local verification.
The problem is cases like this: char[] foo() { char buf[100]; // fill buf return strstr(buf, "hi").dup; } This function is completely safe, but without full escape analysis the compiler can't tell. The problem is, you don't know how the outputs of a function are connected to its inputs. strstr cannot have its parameters marked as scope because it returns them. Scope parameters draw a rather conservative line in the sand, and while I think it's a good optimization we can get right now, it's not going to help in every case. I'm perfectly fine with safe being conservative and trusted not, at least the power is still there if you need it. -Steve
Well something like this should work (note that I'm making the conversion from T[N] to T[] explicit) auto strstr(T,U)(T src, U substring) if(isRandomAccessRange!T && isRandomAccessRange!U && is(ElementType!U == ElementType!T) { /* Do strstr */ } char[] foo() { // Returns type char[] char buf[100]; // Of type scope char[100] // fill buf // "hi" is type immutable(char)[] return strstr(buf[], "hi").dup; // returns a lent char[], which is dup-ed into a char[], which is okay to return } char[] foo2() { // Returns type char[] char buf[100]; // Of type scope char[100] // fill buf // "hi" is type immutable(char)[] return strstr(buf[], "hi"); // Error, strstr returns a lent char[], not char[]. }
Your proposal depends on scope being a type modifier, which it currently is not. I think that's a separate issue to tackle. -Steve
Actually, scope is currently a somewhat-limited type modifier (i.e. scope classes, scope class allocation). My use of it here was mainly to illustrate the compiler's internal representation. Also, the use of scope keyword in the proposal was based on a blog by Walter, where 'scope' became a more universal type modifier. The point was you can handle a large number of escape analysis cases correctly using only the type system (more, of course with type system+local analysis).
Nov 13 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 11 Nov 2009 16:47:10 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Consider the code:

     safe:
      T[] foo(T[] a) { return a; }

      T[] bar()
      {
          T[10] x;
          return foo(x);
      }

 Now we've got an escaping reference to bar's stack. This is not memory  
 safe. But giving up slices is a heavy burden.

 So it occurred to me that the same solution for closures can be used  
 here. If the address is taken of a stack variable in a safe function,  
 that variable is instead allocated on the heap. If a more advanced  
 compiler could prove that the address does not escape, it could be put  
 back on the stack.

 The code will be a little slower, but it will be memory safe. This  
 change wouldn't be done in trusted or unsafe functions.
This sounds acceptable to me. In response to others making claims about modifying behavior, you can get a safe function that can use unsafe behavior by using trusted if you wish.

I'm assuming this behavior translates to local non-array variables?

Can we allow the scope variable hack that is afforded for delegates:

    safe int sum(scope int[] a)
    {
        int retval = 0;
        foreach(i; a) retval += i;
        return retval;
    }

This would not result in a heap allocation when called with a static array.

-Steve
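A minimal sketch of how such a function might be called with a stack buffer follows. It is written against present-day D syntax (the @safe spelling and the const qualifier are assumptions, not part of the proposal above), and whether a given compiler actually avoids a heap allocation here depends on the rules eventually adopted.

```d
// The parameter is scope: sum promises not to keep a reference to a
// after it returns, so a caller's stack buffer cannot escape through it.
int sum(scope const(int)[] a) @safe
{
    int total = 0;
    foreach (x; a)
        total += x;
    return total;
}

void main()
{
    int[4] values = [1, 2, 3, 4]; // lives on the stack
    assert(sum(values[]) == 10);  // slice passed to a scope parameter
}
```

Because sum only reads from the slice and returns an int, nothing derived from values survives the call, which is what would let the compiler keep the array on the stack.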
Nov 12 2009
prev sibling parent "Jérôme M. Berger" <jeberger free.fr> writes:
Walter Bright wrote:
 Consider the code:
 
    safe:
     T[] foo(T[] a) { return a; }
 
     T[] bar()
     {
         T[10] x;
         return foo(x);
     }
 
 Now we've got an escaping reference to bar's stack. This is not memory
 safe. But giving up slices is a heavy burden.
 
 So it occurred to me that the same solution for closures can be used
 here. If the address is taken of a stack variable in a safe function,
 that variable is instead allocated on the heap. If a more advanced
 compiler could prove that the address does not escape, it could be put
 back on the stack.
 
 The code will be a little slower, but it will be memory safe. This
 change wouldn't be done in trusted or unsafe functions.
Cyclone has this neat notion that a pointer is associated to a memory "region" (by default, there are 3 regions: the data segment, the heap and the stack of the current function, but you can have user-defined regions). In this case, the function "foo" would have the type:

    region (`R) T[] foo ( region (`R) T[] a)

where `R is an abstract region name meaning that the return value is in the same region as the argument. When compiling "bar", the compiler would then be able to see that it is returning a pointer to bar's stack region and refuse.

Jerome
--
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Nov 12 2009