digitalmars.D - safe inference fundamentally broken

Steven Schveighoffer (48/48) Jun 05 2014 I was just viewing an interesting pull request. The pull wanted to chang...

deadalnix (1/1) Jun 05 2014 @safe is fundamentally broken.

Timon Gehr (5/6) Jun 05 2014 What's the "fundamental" problem? The construct seems perfectly fit to

deadalnix (7/13) Jun 05 2014 Many constructs are assumed to be @safe on basis @safe don't

Timon Gehr (3/19) Jun 05 2014 I see this not as a fundamental problem with @safe, but with how its
Steven Schveighoffer (3/18) Jun 05 2014 Believe it or not, this is actually safe.

monarch_dodra (8/14) Jun 05 2014 What do you mean by "is actually safe" ? In the "can you actually

Steven Schveighoffer (7/20) Jun 05 2014 It's safe because of the implementation of arrays. There is always one

Steven Schveighoffer (6/22) Jun 05 2014 I take it back, it could be unsafe. You could have e.g. a 12 byte struct...

Steven Schveighoffer (5/28) Jun 05 2014 A possible fix could be to reject the call to ptr at runtime if the slic...

monarch_dodra (9/11) Jun 05 2014 I don't know why you'd ever do "arr.ptr" in the first place,

Steven Schveighoffer (9/19) Jun 05 2014 That's true. You can always get a pointer to any valid element with

deadalnix (3/18) Jun 05 2014 &arr[0] does bound checking. That is different ;)

Timon Gehr (10/32) Jun 05 2014 This issue is not a fundamental problem with @safe, nor is it a matter

Steven Schveighoffer (6/12) Jun 05 2014 I think I was one of those who argued to do it gradually. I was wrong.

Timon Gehr (21/33) Jun 05 2014 Actually there are two conflicting definitions of @safe in the

Steven Schveighoffer (3/16) Jun 05 2014 One starts with a decidedly non-gradual breaking change.

Andrei Alexandrescu (2/8) Jun 05 2014 Agreed. -- Andrei

Jonathan M Davis via Digitalmars-d (16/19) Jun 05 2014 On Thu, 05 Jun 2014 14:09:44 -0400
Jonathan M Davis via Digitalmars-d (5/6) Jun 05 2014 @safe as a concept is fine. It's what the compiler currently thinks is o...
Walter Bright (2/3) Jun 05 2014 Please file bugzilla issue(s) for this.

Timon Gehr (3/6) Jun 05 2014 There already is one:

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

I was just viewing an interesting pull request. The pull wanted to change  
a line like this (buf is a fixed-size stack allocated array):

auto x = buf.ptr[0..buf.len];

To this:

auto trustedBuffer(ref typeof(buf) b)  trusted { return b.ptr[0..buf.len];  
}

for the sake of allowing the enclosing template function to be inferred  
 safe. But the interesting thing is that the original line is DELIBERATELY  
that way to PREVENT the compiler from inferring  safety (the buffer could  
potentially be squirreled away somewhere).

I questioned the idea of using the .ptr trick to fool the compiler, and  
asked why doesn't just do the correct thing. Turns out, the compiler is  
very bad at determining if stack-allocated data is escaped or returned.

A quick example:

T[] getBuf(T)()  safe
{
     T[100] ret;
     auto r = ret[];
     return r;
}

void main()  safe
{
     auto buf = getBuf!int();
}

Note, the above compiles. An interesting thing here is that we have  
explicitly marked getBuf as  safe. So what if we want to remove that? It  
STILL compiles, because the compiler infers  safety!

In order to fix it, we can either mark the function getBuf as  system, or  
do what was done in the original code above. The benefit of using ptr  
instead of  system is that it's much less susceptible to someone coming  
along and saying "hey look, if I just remove  system, it works!" It also  
allows conditionally compiled un- safe code to be included in the same  
function, and let the compiler infer correctly.

But this is extremely alarming. Here we have to be vigilant for anything  
like this, and make SURE we explicitly mark this as  system. If there is  
some possibility that a function could be inferred  safe or  system  
depending on the template parameters, then we would need to write separate  
templates for both, or use the .ptr trick above.

This situation is very bad. I personally think that we need to make  
slicing a stack-allocated array INVALID in  safe code, and not let that  
code be inferred safe. We have already demonstrated an easy way to make an  
internal delegate that can be  trusted for one line. That should be used  
to work around this limitation.

I propose that we start migrating towards making slicing of stack data  
un- safe, first by making it a warning, enabled with -w. Then making it an  
error.

Thoughts?

-Steve

Jun 05 2014

"deadalnix" <deadalnix gmail.com> writes:

 safe is fundamentally broken.

Jun 05 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 06/05/2014 08:17 PM, deadalnix wrote:
  safe is fundamentally broken.

What's the "fundamental" problem? The construct seems perfectly fit to 
specify memory safety in at least the following context:

void main() safe{}

:o)

Jun 05 2014

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 5 June 2014 at 18:33:22 UTC, Timon Gehr wrote:
 On 06/05/2014 08:17 PM, deadalnix wrote:
  safe is fundamentally broken.

 What's the "fundamental" problem? The construct seems perfectly 
 fit to specify memory safety in at least the following context:

 void main() safe{}

 :o)

Many constructs are assumed to be  safe on basis  safe don't
guarantee.

T[] arr = [ ... ];
arr = arr[$ .. $];
auto garbage = *(arr.ptr);

 safe is as safe a a condom with holes poked in it.

Jun 05 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 06/05/2014 08:47 PM, deadalnix wrote:
 On Thursday, 5 June 2014 at 18:33:22 UTC, Timon Gehr wrote:
 On 06/05/2014 08:17 PM, deadalnix wrote:
  safe is fundamentally broken.

 What's the "fundamental" problem? The construct seems perfectly fit to
 specify memory safety in at least the following context:

 void main() safe{}

 :o)

 Many constructs are assumed to be  safe on basis  safe don't
 guarantee.

 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

  safe is as safe a a condom with holes poked in it.

I see this not as a fundamental problem with  safe, but with how its 
implementation has been approached.

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 14:47:54 -0400, deadalnix <deadalnix gmail.com> wrote:

 On Thursday, 5 June 2014 at 18:33:22 UTC, Timon Gehr wrote:
 On 06/05/2014 08:17 PM, deadalnix wrote:
  safe is fundamentally broken.

 What's the "fundamental" problem? The construct seems perfectly fit to  
 specify memory safety in at least the following context:

 void main() safe{}

 :o)

 Many constructs are assumed to be  safe on basis  safe don't
 guarantee.

 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

Believe it or not, this is actually safe.

-Steve

Jun 05 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 5 June 2014 at 19:27:56 UTC, Steven Schveighoffer 
wrote:
 On Thu, 05 Jun 2014 14:47:54 -0400, deadalnix
 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

 Believe it or not, this is actually safe.

 -Steve

What do you mean by "is actually safe" ? In the "can you actually 
believe this obviously wrong code is marked as safe" or "this 
code that looks wrong is actually perfectly safe"?

AFAIK, it's only safe if arr is GC allocated to a bloc smaller 
than a page, so the conditions are implementation defined 
behavior, and so is the result.

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 15:34:13 -0400, monarch_dodra <monarchdodra gmail.com>  
wrote:

 On Thursday, 5 June 2014 at 19:27:56 UTC, Steven Schveighoffer wrote:
 On Thu, 05 Jun 2014 14:47:54 -0400, deadalnix
 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

 Believe it or not, this is actually safe.

 What do you mean by "is actually safe" ? In the "can you actually  
 believe this obviously wrong code is marked as safe" or "this code that  
 looks wrong is actually perfectly safe"?

It's safe because of the implementation of arrays. There is always one  
sentinel byte that cannot be used for the block of data. This is why when  
you allocate e.g. 8 ints, it goes into a 32-byte block.

 AFAIK, it's only safe if arr is GC allocated to a bloc smaller than a  
 page, so the conditions are implementation defined behavior, and so is  
 the result.

No, pages also have a sentinel byte.

-Steve

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 15:48:09 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 05 Jun 2014 15:34:13 -0400, monarch_dodra  
 <monarchdodra gmail.com> wrote:

 On Thursday, 5 June 2014 at 19:27:56 UTC, Steven Schveighoffer wrote:
 On Thu, 05 Jun 2014 14:47:54 -0400, deadalnix
 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

 Believe it or not, this is actually safe.

 What do you mean by "is actually safe" ? In the "can you actually  
 believe this obviously wrong code is marked as safe" or "this code that  
 looks wrong is actually perfectly safe"?

 It's safe because of the implementation of arrays. There is always one  
 sentinel byte that cannot be used for the block of data. This is why  
 when you allocate e.g. 8 ints, it goes into a 32-byte block.

I take it back, it could be unsafe. You could have e.g. a 12 byte struct  
be T, and then the last "element" could extend through the end of the  
block.

-Steve

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 15:54:33 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 05 Jun 2014 15:48:09 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 05 Jun 2014 15:34:13 -0400, monarch_dodra  
 <monarchdodra gmail.com> wrote:

 On Thursday, 5 June 2014 at 19:27:56 UTC, Steven Schveighoffer wrote:
 On Thu, 05 Jun 2014 14:47:54 -0400, deadalnix
 T[] arr = [ ... ];
 arr = arr[$ .. $];
 auto garbage = *(arr.ptr);

 Believe it or not, this is actually safe.

 What do you mean by "is actually safe" ? In the "can you actually  
 believe this obviously wrong code is marked as safe" or "this code  
 that looks wrong is actually perfectly safe"?

 It's safe because of the implementation of arrays. There is always one  
 sentinel byte that cannot be used for the block of data. This is why  
 when you allocate e.g. 8 ints, it goes into a 32-byte block.

 I take it back, it could be unsafe. You could have e.g. a 12 byte struct  
 be T, and then the last "element" could extend through the end of the  
 block.

A possible fix could be to reject the call to ptr at runtime if the slice  
is empty.

-Steve

Jun 05 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 5 June 2014 at 19:57:08 UTC, Steven Schveighoffer 
wrote:
 A possible fix could be to reject the call to ptr at runtime if 
 the slice is empty.

I don't know why you'd ever do "arr.ptr" in the first place, 
other than to avoid the bounds check. So I think the call should 
just be unsafe, and we call it a day. Or maybe to interface with 
a function that want a pointer?

"Maybe", we could get away with allowing "&arr[someIndex]" 
though. The compiler would have to be able to "understand" this 
is not escaping a reference (for *DYNAMIC* arrays anyways).

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 16:32:24 -0400, monarch_dodra <monarchdodra gmail.com>  
wrote:

 On Thursday, 5 June 2014 at 19:57:08 UTC, Steven Schveighoffer wrote:
 A possible fix could be to reject the call to ptr at runtime if the  
 slice is empty.

 I don't know why you'd ever do "arr.ptr" in the first place, other than  
 to avoid the bounds check. So I think the call should just be unsafe,  
 and we call it a day. Or maybe to interface with a function that want a  
 pointer?

That's true. You can always get a pointer to any valid element with  
&arr[x]. Then at least you have bounds checking to save you.

In fact, in safe code, arr.ptr could be replaced with &arr[0].

 "Maybe", we could get away with allowing "&arr[someIndex]" though. The  
 compiler would have to be able to "understand" this is not escaping a  
 reference (for *DYNAMIC* arrays anyways).

If we are going to allow pointers, we need to allow pointers to data we  
know is valid and heap-allocated. This should include dynamic array  
elements.

-Steve

Jun 05 2014

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 5 June 2014 at 20:37:51 UTC, Steven Schveighoffer
wrote:
 On Thu, 05 Jun 2014 16:32:24 -0400, monarch_dodra 
 <monarchdodra gmail.com> wrote:

 On Thursday, 5 June 2014 at 19:57:08 UTC, Steven Schveighoffer 
 wrote:
 A possible fix could be to reject the call to ptr at runtime 
 if the slice is empty.

 I don't know why you'd ever do "arr.ptr" in the first place, 
 other than to avoid the bounds check. So I think the call 
 should just be unsafe, and we call it a day. Or maybe to 
 interface with a function that want a pointer?

 That's true. You can always get a pointer to any valid element 
 with &arr[x]. Then at least you have bounds checking to save 
 you.

 In fact, in safe code, arr.ptr could be replaced with &arr[0].

&arr[0] does bound checking. That is different ;)

Jun 05 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 06/05/2014 08:09 PM, Steven Schveighoffer wrote:
 A quick example:

 T[] getBuf(T)()  safe
 {
     T[100] ret;
     auto r = ret[];
     return r;
 }

 void main()  safe
 {
     auto buf = getBuf!int();
 }

 Note, the above compiles. An interesting thing here is that we have explicitly
marked getBuf as  safe. So what if we want to remove that? It STILL compiles,
because the compiler infers  safety!

AFAIK it's a well known problem.

 This situation is very bad. I personally think that we need to make
 slicing a stack-allocated array INVALID in  safe code,

This issue is not a fundamental problem with  safe, nor is it a matter
of personal opinion, this is simply a compiler bug.

 and not let that
 code be inferred safe. We have already demonstrated an easy way to make
 an internal delegate that can be  trusted for one line. That should be
 used to work around this limitation.

 I propose that we start migrating towards making slicing of stack data
 un- safe, first by making it a warning, enabled with -w. Then making it
 an error.

 Thoughts?

The fundamental issue seems to lie in methodology and it is that  safe 
is approximated by the DMD implementation from the wrong side. Instead 
of gradually banning usage of more and more constructs in  safe, the 
implementation should have started out with not allowing any constructs 
in  safe code and then should have gradually allowed more and more 
manually verified to be memory safe constructs.

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 14:30:49 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 The fundamental issue seems to lie in methodology and it is that  safe  
 is approximated by the DMD implementation from the wrong side. Instead  
 of gradually banning usage of more and more constructs in  safe, the  
 implementation should have started out with not allowing any constructs  
 in  safe code and then should have gradually allowed more and more  
 manually verified to be memory safe constructs.

I think I was one of those who argued to do it gradually. I was wrong.  
When one is manually marking  safe things, it's not as bad as when the  
compiler is automatically marking them. But in either case,  safe doesn't  
really mean safe, so it is pretty much useless.

-Steve

Jun 05 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 06/05/2014 08:35 PM, Steven Schveighoffer wrote:
 On Thu, 05 Jun 2014 14:30:49 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 The fundamental issue seems to lie in methodology and it is that  safe
 is approximated by the DMD implementation from the wrong side. Instead
 of gradually banning usage of more and more constructs in  safe, the
 implementation should have started out with not allowing any
 constructs in  safe code and then should have gradually allowed more
 and more manually verified to be memory safe constructs.

 I think I was one of those who argued to do it gradually. I was wrong.

I don't understand. Both strategies are gradual.

 When one is manually marking  safe things, it's not as bad as when the
 compiler is automatically marking them. But in either case,  safe
 doesn't really mean safe, so it is pretty much useless.

 -Steve

Actually there are two conflicting definitions of  safe in the 
documentation. One says that  safe means memory safe and the other says 
that  safe means what DMD does:

dlang.org/function.html

"Function Safety

Safe functions are functions that are statically checked to exhibit no 
possibility of undefined behavior. Undefined behavior is often used as a 
vector for malicious attacks.

Safe Functions

Safe functions are marked with the  safe attribute.

The following operations are not allowed in safe functions:

[ too short list ]"

It then goes on to note:

"Note: The verifiable safety of functions may be compromised by bugs in 
the compiler and specification. Please report all such errors so they 
can be corrected."

This is simply the wrong way to go about it.

The documentation should instead say:

"The following operations are _allowed_ in safe functions:"

Jun 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 05 Jun 2014 14:52:39 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 06/05/2014 08:35 PM, Steven Schveighoffer wrote:
 On Thu, 05 Jun 2014 14:30:49 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 The fundamental issue seems to lie in methodology and it is that  safe
 is approximated by the DMD implementation from the wrong side. Instead
 of gradually banning usage of more and more constructs in  safe, the
 implementation should have started out with not allowing any
 constructs in  safe code and then should have gradually allowed more
 and more manually verified to be memory safe constructs.

 I think I was one of those who argued to do it gradually. I was wrong.

 I don't understand. Both strategies are gradual.

One starts with a decidedly non-gradual breaking change.

-Steve

Jun 05 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/5/14, 8:30 PM, Timon Gehr wrote:
 The fundamental issue seems to lie in methodology and it is that  safe
 is approximated by the DMD implementation from the wrong side. Instead
 of gradually banning usage of more and more constructs in  safe, the
 implementation should have started out with not allowing any constructs
 in  safe code and then should have gradually allowed more and more
 manually verified to be memory safe constructs.

Agreed. -- Andrei

Jun 05 2014

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Thu, 05 Jun 2014 14:09:44 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 I propose that we start migrating towards making slicing of stack
 data un- safe, first by making it a warning, enabled with -w. Then
 making it an error.

I reported this quite some time ago, but unfortunately, it has yet to be
fixed.

https://issues.dlang.org/show_bug.cgi?id=8838

The bizarre thing though is that in the original thread where I brought it up,
some folks fought it, arguing that it should be considered  safe. However,
slicing a static array is no different from taking the address of a local
variable, so they should definitely be treated the same.

On a related note, I very much wish that there was no implicit slicing of
static arrays in the language. It's just begging for bugs IMHO - precisely for
the same kinds of reasons that automatically converting a local variable to a
pointer to that variable would create bugs. But unfortunately, I think that
fix that now would be a very tough sell.

- Jonathan M Davis

Jun 05 2014

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Thu, 05 Jun 2014 18:17:10 +0000
deadalnix via Digitalmars-d <digitalmars-d puremagic.com> wrote:

  safe is fundamentally broken.

 safe as a concept is fine. It's what the compiler currently thinks is or
isn't  safe which is broken. If we can fix that, we should be fine.

- Jonathan M Davis

Jun 05 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 6/5/2014 11:09 AM, Steven Schveighoffer wrote:
 Thoughts?

Please file bugzilla issue(s) for this.

Jun 05 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 06/06/2014 01:15 AM, Walter Bright wrote:
 On 6/5/2014 11:09 AM, Steven Schveighoffer wrote:
 Thoughts?

 Please file bugzilla issue(s) for this.

There already is one:
https://issues.dlang.org/show_bug.cgi?id=8838

Jun 05 2014

D Programming

C/C++ Programming

Other

digitalmars.D - safe inference fundamentally broken