
digitalmars.D - misplaced trust?

reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
Reading the thread in bug report 
https://issues.dlang.org/show_bug.cgi?id=14125 gets me thinking: perhaps 
@trusted is mis-designed.

At the moment, a @trusted function is the same as a @system function, 
but is allowed to be called from a @safe function. And that's it.

But a @trusted function may only need to be marked @trusted because of a 
few lines of code. So we now use this mechanism where we mark the 
@trusted portions with a lambda or static nested function, call that 
internally, and mark the entire function @safe.

The benefit of this is that, for the 90% of the code that is @safe, the 
compiler is used as a tool to check the @safe-ty of the function. For 
the 10% that isn't, we have contained it and marked it, and we can focus 
on that portion for scrutiny. This becomes EXTREMELY important when it 
comes to maintenance.

For example, if a @trusted function has some code added to it, the new 
code needs to be examined to see if it really is safe to call from a 
@safe function. This may be no easy feat.

At least with the @trusted inner function/lambda, you limit yourself to 
the code that has been marked as such, and you don't need to worry about 
the actual @safe code that is added to the function.

Or do you?... The problem with this mentality is that interactions with 
data inside the @safe code after a @trusted function is called can still 
be subject to memory safety issues. An example (using static functions 
for clarity):

void foo() @safe
{
    import core.stdc.stdlib : malloc, free;

    static auto tmalloc(size_t x) @trusted
    {
        return (cast(int*) malloc(x * int.sizeof))[0 .. x];
    }
    static void tfree(void[] arr) @trusted { free(arr.ptr); }

    auto mem = tmalloc(100);
    tfree(mem);
    mem[0] = 5; // use after free
}

Note that the final line in the function, setting mem[0] to 5, is the 
only unsafe part. It's safe to malloc, and it's safe to free as long as 
you don't ever refer to that memory again.

But the problem with the above is that @trusted does not need to be 
applied to the mem[0] = 5 line. And imagine that the mem[0] = 5 line may 
simply be added later, by another person who doesn't understand the 
context. Marking the whole function @safe is kind of meaningless here.

Changing gears, one of the issues raised in the aforementioned bug is 
that a function like this really should be marked @trusted in its 
entirety. But what does this actually mean?

When we mark a function @safe, we can assume that the compiler has 
checked it for us. When we mark a function @trusted, we can assume the 
compiler has NOT checked it for us. But how does this help? A @safe 
function can call a @trusted function, so there is little difference 
between a @safe and a @trusted function from an API point of view. I 
would contend that @trusted really should NEVER HAVE BEEN a function 
attribute. You may as well call them all @safe; there is no difference.

What I think we need, to approach this correctly, is to mark *data* and 
*calls* with @trusted instead of marking *functions*. Once data is used 
inside a @trusted island, it becomes tainted with the @trusted mark. The 
compiler can no longer guarantee safety for that data.

So how can we implement this without too much pain? I envision that we 
can mark lines or blocks inside a @safe function as @trusted (a @trusted 
block inside a @system function would be a no-op, but allowed). I also 
envision that any variable used inside the function is internally marked 
as @safe until it's used in a @trusted block, and is then marked as 
@trusted (again internally, with no requirement to mark it in syntax). 
If at any time during compilation a @trusted variable is used in @safe 
code, it's a compile error.

There may be situations in which you DO need this to work, and in those 
cases, I would say a cast(@safe) could get rid of that mark. For 
example, if you want to return the result of a @trusted call from a 
@safe function.
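
To make this concrete, here is how the foo() example above might read 
under such a scheme (hypothetical syntax - neither @trusted blocks nor 
the taint tracking exists today):

void foo() @safe
{
    import core.stdc.stdlib : malloc, free;

    int[] mem;
    @trusted { mem = (cast(int*) malloc(100 * int.sizeof))[0 .. 100]; }
    // mem is now internally marked @trusted (tainted)

    @trusted { free(mem.ptr); } // ok, tainted data inside a @trusted block
    mem[0] = 5;                 // compile error: @trusted data used in @safe code
}

Whoever later adds the mem[0] = 5 line would be forced to either put it 
in a @trusted block or cast(@safe) the data, making the unsafe 
interaction visible at the exact line where it happens.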

Such a change would be somewhat disruptive, but much more in line with 
what @trusted calls really mean.

I've left some of the details fuzzy on purpose, because I'm not a 
compiler writer :)

Destroy

-Steve
Feb 05 2015
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 5 February 2015 at 16:50:18 UTC, Steven 
Schveighoffer wrote:
 I've left some of the details fuzzy on purpose, because I'm not 
 a compiler writer :)
So you want to spend another 8 years implementing linear typing for D:

http://en.wikipedia.org/wiki/Substructural_type_system

Or maybe give up @safe. Or implement either a behavioural or dependent 
type system. Lots of options. Lots of theory...

Without type theory => leaky cauldron.
Feb 05 2015
prev sibling next sibling parent reply "Zach the Mystic" <reachzach@gggmail.com> writes:
On Thursday, 5 February 2015 at 16:50:18 UTC, Steven 
Schveighoffer wrote:
 Reading the thread in bug report
 https://issues.dlang.org/show_bug.cgi?id=14125 gets me
 thinking: perhaps @trusted is mis-designed.

 [...]
Hey I like the creativity you're showing. Just to give people a concrete idea, you might show some sample code and illustrate how things work. It sure helps when I'm trying to think about things.
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 1:12 PM, Zach the Mystic wrote:

 Hey I like the creativity you're showing. Just to give people a concrete
 idea, you might show some sample code and illustrate how things work. It
 sure helps when I'm trying to think about things.
So for example:

@safe int* foo()
{
    int* x;
    int* y;
    int z;

    x = new int;          // ok
    //y = &z;             // not OK
    @trusted y = &z;      // OK, but now y is marked as @trusted

    // return y;          // not OK, cannot return a @trusted pointer
                          // from a @safe function
    return cast(@safe) y; // ok, we are overriding the compiler
    // and of course return x; would be ok
}

-Steve
Feb 05 2015
parent reply "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 18:21:40 UTC, Steven 
Schveighoffer wrote:
 [...]
`cast(@safe)`... interesting. It's the most fine-tuned way of adding 
safety, whereas trusting a whole function is the most blunt way.

I've been hatching a scheme for reference safety in my head which would 
automatically track `@trusted y = &z;` above, marking `y` with 
"scopedepth(1)", which would be unreturnable in @safe code.

I can anticipate the objection that giving people too much power will 
encourage them to abuse it... but then again, if that were true, who let 
them mark the whole function `@trusted` to begin with? Your proposal 
really pinpoints the actual code which needs to be worked on. You're 
basically moving the unit of safety from the *function* to the 
*pointer*, which makes sense to me, since only a pointer can really be 
unsafe.
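
To sketch what that tracking might look like (hypothetical - 
"scopedepth" is not an existing attribute; the comments just show what 
the compiler would infer):

@safe int* foo()
{
    int z;                // local: scope depth 1
    @trusted int* y = &z; // y inferred as scopedepth(1)
    int* x = new int;     // GC heap: scopedepth(0)

    // return y;          // error: scopedepth(1) data cannot escape foo
    return x;             // ok: heap data outlives the stack frame
}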
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 06:56:02PM +0000, Zach the Mystic via Digitalmars-d
wrote:
 [...]
I mostly like this idea, except that foo() should not be marked @safe. 
It should be marked @trusted because it still needs review, but the 
meaning of @trusted should be changed so that it still enforces 
@safe-ty, except that now @trusted variables are permitted. Or rather, I 
would call them @system variables -- they *cannot* be trusted, and must 
be manually verified.

@safe code should not allow any @system variables or any cast(@safe) 
operations, period. Otherwise, anybody can wrap unsafe operations inside 
their @safe function and still clothe it with the sheep's clothing of 
@safe, and @safe becomes a meaningless annotation.

In short, my proposal is:

- @safe should continue being safe -- no (potentially) unsafe 
  operations are allowed, period.

  Rationale: allowing @system variables in @safe code makes the function 
  non-verifiable mechanically. This completely breaks the whole point of 
  @safe.

- Change the meaning of @trusted (as applied to a function) to require 
  @safe inside the function body, but in addition permit @system 
  variables and cast(@safe).

  Rationale: the function cannot be verified mechanically to be safe, 
  therefore it cannot be marked @safe. It must be marked @trusted to 
  draw attention to the fact that manual review is required. However, 
  this does not constitute license to perform arbitrary @system 
  operations. Instead, any @system code/variable inside the @trusted 
  function must be explicitly marked as such, to indicate that these 
  items require special attention during review. Everything else must 
  still conform to @safe requirements.

- Introduce @system variables for holding tainted values that the 
  compiler cannot guarantee the safety of, as well as cast(@safe), as 
  described in Steven's post. These constructs are only permitted inside 
  @trusted functions. They are prohibited in @safe code, and are no-ops 
  in @system code.

  Rationale: to reduce the maintainability problem, @trusted functions 
  should not allow @system code by default. Rather, the scope of @system 
  code/data inside a @trusted function should be restricted by requiring 
  explicit marking. The compiler then helps the verification process by 
  ensuring that anything not explicitly marked is still @safe.
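
A sketch of how a @trusted function might look under these rules 
(hypothetical syntax - @system variables and cast(@safe) do not exist 
in the language today; tmalloc is just an illustrative name):

@trusted int[] tmalloc(size_t n)
{
    import core.stdc.stdlib : malloc;

    // Tainted value must live in a @system variable; this is what
    // reviewers need to scrutinize.
    @system int* p = cast(int*) malloc(n * int.sizeof);
    if (p is null) return null;

    // The rest of the body is still checked under @safe rules;
    // releasing the tainted value requires an explicit override.
    return (cast(@safe) p)[0 .. n];
}


T

-- 
Nobody is perfect.  I am Nobody. -- pepoluan, GKC forum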
Feb 05 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 11:17 AM, H. S. Teoh via Digitalmars-d wrote:
 In short, my proposal is:
Tainted variables are an interesting topic, but quite distinct from the 
notion of separating safe code from unsafe code.

As much as I was shocked about the use of @trusted/@safe/@system in 
std.file, std.array and sadly possibly in other places, I found no 
evidence that the feature is misdesigned. I continue to consider it a 
simple, sound, and very effective method of building and interfacing 
robust code. An excellent engineering solution that offers a lot of 
power at a modest cost.

I do not support this proposal to change the semantics of 
@trusted/@safe/@system. A separate tainted data proposal might be of 
interest for loosely related topics.


Andrei
Feb 05 2015
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 11:49:40AM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
 [...]
Frankly, I don't understand how you could fail to see the maintenance 
nightmare @trusted, as you envisioned it, is causing. The wholesale 
license to freely perform @system operations inside a @trusted function 
greatly undermines the trustworthiness of SafeD as a whole.

The problem is that human verification only happens once, and thereafter 
@trusted continues to be applied to that function, even if later on, 
further downstream in the call graph, a remote helper function has 
changed in a way that compromises that trustworthiness. This happens 
with no warnings whatsoever from the compiler, and so we continue to 
build our house upon sinking sand.

To truly be sure that our @trusted functions are actually trustworthy, 
we have to continually review them not just for changes within the 
function bodies, but also in all downstream functions used within. The 
review requirements for maintaining such a standard are impractically 
onerous. And, judging from the trends in our industry, what will likely 
happen is that this review is never actually carried out until a nasty 
security exploit is discovered in @safe code. By then, it will be far 
too late.

Maybe to you this situation is acceptable, but to me, this is an utter 
maintenance nightmare.
 I do not support this proposal to change the semantics of
 @trusted/@safe/@system. A separate tainted data proposal might be of
 interest for loosely related topics.
[...]

That is your own prerogative, but this discussion has convinced me that 
@safe is a joke, and your continual refusal to admit the existence of 
any problems, despite many active Phobos contributors describing them to 
you in detail, has made me completely lose interest in this subject. 
From now on I will not bother with @safe in my projects anymore, since 
it has become clear that it does not live up to its promise and probably 
never will (this specific issue is only the tip of the iceberg; there 
are many other problems with @safe that you may find in bugzilla), and I 
have no further interest in contributing to @safe-related issues in 
Phobos.


T

-- 
Mediocrity has been pushed to extremes.
Feb 05 2015
prev sibling next sibling parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 11:49 AM, Andrei Alexandrescu wrote:
 [...]
I agree.

So the question is, what does @trusted actually buy you, since the 
compiler can't check it?

It serves as notice that "This function merits special attention during 
code review to check that it has a safe interface and that its 
implementation is correct."
Feb 05 2015
next sibling parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 3:13 PM, Walter Bright wrote:
 [...]
 I agree. So the question is, what does @trusted actually buy you, since
 the compiler can't check it? It serves as notice that "This function
 merits special attention during code review to check that it has a safe
 interface and that its implementation is correct."
That also applies to @safe functions, since they can call @trusted 
functions.

In essence, @trusted buys you headaches. I think we should try to lessen 
them.

-Steve
Feb 05 2015
parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 12:25 PM, Steven Schveighoffer wrote:
 On 2/5/15 3:13 PM, Walter Bright wrote:
 So the question is, what does @trusted actually buy you, since the
 compiler can't check it?

 It serves as notice that "This function merits special attention during
 code review to check that it has a safe interface and that its
 implementation is correct."
That also applies to @safe functions since they can call @trusted functions.
No - the @trusted function is reviewed to ensure it has a safe 
interface. Then there is no need to review for safety anyone that calls 
it. It's the whole point.

For example, https://issues.dlang.org/show_bug.cgi?id=14127

A rather cursory inspection reveals that these @trusted functions have 
unsafe interfaces, and are therefore unacceptable in Phobos. (Of course, 
D will let you write such code because it is a systems programming 
language, but Phobos must be an example of best practices, and these are 
not.)
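
To illustrate with a made-up example (not taken from the bug report): a 
@trusted function has a safe interface when there is nothing a @safe 
caller can pass to it that corrupts memory.

// Unsafe interface: trusts the caller that len matches the allocation.
// A @safe caller can pass any len, so this must not be @trusted.
@trusted int[] fromRaw(int* p, size_t len) { return p[0 .. len]; }

// Safe interface: no argument a @safe caller passes can break it.
@trusted int[] makeInts(size_t n)
{
    import core.stdc.stdlib : calloc;
    auto p = cast(int*) calloc(n, int.sizeof);
    return p is null ? null : p[0 .. n];
}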
 In essence, @trusted buys you headaches. I think we should try to lessen them.
An aspect of a well-designed encapsulation is that the number of 
@trusted interfaces is minimized. If you find an abstraction that has 
@trusted sprinkled liberally through it, that's an indicator of a failed 
abstraction.
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 3:54 PM, Walter Bright wrote:
 On 2/5/2015 12:25 PM, Steven Schveighoffer wrote:
 On 2/5/15 3:13 PM, Walter Bright wrote:
 So the question is, what does  trusted actually buy you, since the
 compiler can't check it?

 It serves as notice that "This function merits special attention during
 code review to check that it has a safe interface and that its
 implementation is correct."
That also applies to safe functions since they can call trusted functions.
 No - the @trusted function is reviewed to ensure it has a safe
 interface. Then there is no need to review for safety anyone that calls
 it. It's the whole point.
So you're saying that @safe is mechanically verified as long as @trusted 
functions are manually reviewed. It's that manually reviewed part that I 
think we have an issue with. And there is definitely a feel of "I can 
use @trusted because I know I will only call it this way" without 
considering future possible calls.

I fully expect we can devise rules to make sure @trusted memory details 
do not leak out of the functions in an unsafe way (and have a much 
better, safer code base as a result). But it would be nice if the 
compiler helped enforce them.
 In essence, @trusted buys you headaches. I think we should try to
 lessen them.
 An aspect of a well-designed encapsulation is that the number of
 @trusted interfaces is minimized. If you find an abstraction that has
 @trusted sprinkled liberally through it, that's an indicator of a
 failed abstraction.
I think you just made this up :) But I agree that @trusted should be 
used sparingly, not liberally. The problem is that when faced with such 
a huge function that calls one non-@safe one, marking the whole thing as 
@trusted disables all the mechanical verification for everything. There 
has to be a better way.
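
A contrived illustration of the granularity problem (oneSystemCall is a 
hypothetical @system helper):

@system void oneSystemCall(int* p) {}

// Marking the whole function @trusted turns off checking everywhere:
@trusted void processAll(int[] data)
{
    foreach (ref d; data) d += 1; // perfectly @safe, but no longer verified
    oneSystemCall(data.ptr);      // the one line that actually needed trust
}

// Confining the escape keeps the compiler checking everything else:
void processAll2(int[] data) @safe
{
    foreach (ref d; data) d += 1; // mechanically verified
    () @trusted { oneSystemCall(data.ptr); }();
}

-Steve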
Feb 05 2015
next sibling parent "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 21:15:52 UTC, Steven 
Schveighoffer wrote:
 [...]
But it doesn't have to be accepted into D! :-)
Feb 05 2015
prev sibling parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 1:15 PM, Steven Schveighoffer wrote:
 So you're saying that @safe is mechanically verified as long as
 @trusted functions are manually reviewed.
Yes.
 It's that manually reviewed part that I think we have an issue with.
 And there is definitely a feel of "I can use @trusted because I know I
 will only call it this way" without considering future possible calls.
It's no different from:

Q: the compiler gave me a mismatched type error!
A: put in a cast and it'll compile!

I.e. you can always find a way to make buggy, badly designed code 
compile. It's up to whoever does the reviews to have some sort of 
standards.
 An aspect of a well-designed encapsulation is that the number of
 @trusted interfaces is minimized. If you find an abstraction that has
 @trusted sprinkled liberally through it, that's an indicator of a
 failed abstraction.
 I think you just made this up :)
No, I've always thought of @trusted that way. It isn't different from 
classes that allow too many functions to access private members.
 But I agree that @trusted should be used sparingly, not liberally. The
 problem is that when faced with such a huge function that calls one
 non-@safe one, marking the whole thing as @trusted disables all the
 mechanical verification for everything.
Then that's a candidate for a redesign of the abstraction.
Feb 05 2015
parent Jacob Carlborg <doob@me.com> writes:
On 2015-02-05 22:35, Walter Bright wrote:

 It's no different from:

 Q: the compiler gave me a mismatched type error!
 A: put in a cast and it'll compile!
The difference is that if you do that, the compiler won't disable type 
checking for the whole function.

-- 
/Jacob Carlborg
Feb 08 2015
prev sibling parent "CraigDillabaugh" <craig.dillabaugh gmail.com> writes:
On Thursday, 5 February 2015 at 20:13:32 UTC, Walter Bright wrote:
 [...]
 It serves as notice that "This function merits special attention
 during code review to check that it has a safe interface and that its
 implementation is correct."
Couldn't you just use a comment?
Feb 05 2015
prev sibling next sibling parent Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 2:49 PM, Andrei Alexandrescu wrote:
 [...]
The proposal (the original one I stated, not H.S.'s) is to do 2 things:

1. Clean up the syntax for @trusted escapes inside @safe code that we 
have settled on.

2. Add a mechanism to make those escapes safer and more reviewable.

I don't think the idea behind @trusted is incorrect, just that making it 
a function attribute is mis-designed. Note that in my proposal, you can 
essentially create a @trusted function just by marking the whole thing 
@trusted.

-Steve
Feb 05 2015
prev sibling next sibling parent reply "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 19:49:41 UTC, Andrei Alexandrescu 
wrote:
 [...]
At minimum, there needs to be official documented guidance on how to use 
@trusted. If Phobos developers got this far without knowing how to use 
it (assuming their complaints about its design are indeed meritless), 
how can anyone else be expected to?
Feb 05 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 12:24 PM, Zach the Mystic wrote:
 At minimum, there needs to be official documented guidance on how to use
 @trusted. If Phobos developers got this far without knowing how to use
 it (assuming their complaints about its design are indeed meritless),
 how can anyone else be expected to?
Yah, some documentation is needed. -- Andrei
Feb 05 2015
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 5 February 2015 at 19:49:41 UTC, Andrei Alexandrescu 
wrote:
 [...]
Probably you and Walter should try maintaining Phobos alone for the next 
year then, and see how it works. Maybe that will make some issues more 
convincing.

It is absolutely ridiculous that every single one of the existing Phobos 
reviewers who has actually worked with that code in practice says that 
there is a problem with @trusted, and you keep rejecting it with a "no, 
it is all good as it is" argument.

For me this thread was a clear alarm: @safe in its current state is a 
100% misfeature and is better advertised against until either its design 
changes or effective idioms are presented.
Feb 05 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
To put it differently - there is no way I would have ever taken the 
risk of merging a 50-line @trusted function, be it in Phobos or in a 
work project. And it is better to not make promises than to break them.
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 3:22 PM, Dicebot wrote:
 To put it differently - there is no way I would have ever taken the
 risk of merging a 50-line @trusted function, be it in Phobos or in a
 work project.
Surely you're exaggerating. We're looking at a function that performs 
system calls and reads into a memory buffer allocated appropriately (and 
economically). Claiming that that function is @safe, then enumerating 
the numerous unsafe and unprovable escape hatches it uses, is like 
someone claiming "I'm a virgin - of course, save for those six 
one-night stands I've had."

It's unclear what you're advocating here. I don't think your previous 
arguments stand scrutiny. One possible new argument might be an analysis 
of how this:

https://github.com/D-Programming-Language/phobos/blob/accb351b96bb04a6890bb7df018749337e55eccc/std/file.d#L194

is easier to reason about than this:

https://github.com/D-Programming-Language/phobos/blob/master/std/file.d#L194


Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 5 February 2015 at 23:47:00 UTC, Andrei Alexandrescu 
wrote:
 [...]
 Surely you're exaggerating.
Not even slightly. I have revoked my Phobos access for the specific 
reason that I can't do the reviewer job properly with such requirements, 
and would have been forced to ignore all pull requests that tackle 
@trusted anyway.
 We're looking at a function that performs system calls and
 reads into a memory buffer allocated appropriately (and
 economically). Claiming that that function is @safe, then
 enumerating the numerous unsafe and unprovable escape hatches
 it uses, is like someone claiming "I'm a virgin - of course,
 save for those six one-night stands I've had."
So what? I don't care how justified it is; I simply don't trust my 
attention span enough to verify that foo() is a virgin. I am not a 
rock-star programmer and I know my limits. Verifying 50 lines of 
@trusted with no help from the compiler at all is beyond those limits.

When all exceptions to safety are explicitly listed, I can review the 
implementation knowing "ok, this will be safe _unless_ it gets screwed 
by data coming from those @trusted wrappers". And that is a big 
mentality switch that helps to maintain focus.
 It's unclear what you're advocating here. I don't think your 
 previous arguments stand scrutiny. One possible new argument 
 might be an analysis on how this:

 https://github.com/D-Programming-Language/phobos/blob/accb351b96bb04a6890bb7df018749337e55eccc/std/file.d#L194

 is easier to reason about than this:

 https://github.com/D-Programming-Language/phobos/blob/master/std/file.d#L194
It will be a very short analysis, considering I am not able to reason 
about the latter one at all - it simply requires too much of a time 
investment for me to even consider it.
Feb 05 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at all is
 beyond those limits.
Do you use @safe at work? -- Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:21:45 UTC, Andrei Alexandrescu 
wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at
 all is beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:22 PM, Dicebot wrote:
 On Friday, 6 February 2015 at 00:21:45 UTC, Andrei Alexandrescu wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at all is
 beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
It's candid. You're saying you cannot verify the safety of a 50-line 
function, but I know you are using D1 at work. So I don't see how your 
claim can be true.


Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:31:06 UTC, Andrei Alexandrescu 
wrote:
 On 2/5/15 4:22 PM, Dicebot wrote:
 On Friday, 6 February 2015 at 00:21:45 UTC, Andrei 
 Alexandrescu wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at
 all is beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
 It's candid. You're saying you cannot verify the safety of a 50-line
 function, but I know you are using D1 at work. So I don't see how your
 claim can be true.
You do realize that I was one of the reviewers for those Phobos pull 
requests you complain about?
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:37 PM, Dicebot wrote:
 [...]
 You do realize that I was one of the reviewers for those Phobos pull
 requests you complain about?
The reference was to the fact that you are obviously a competent engineer using an unsafe language, yet claim to be completely hopeless in reviewing a 50-liner that reads data from a file. -- Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:56:09 UTC, Andrei Alexandrescu
wrote:
 [...]
 The reference was to the fact that you are obviously a competent
 engineer using an unsafe language, yet claim to be completely hopeless
 in reviewing a 50-liner that reads data from a file. -- Andrei
I referred to this fact with the comment "it is better to make no 
promises than to make one and break it". Simply dealing with an unsafe 
language is something I got used to - all crashes and weird things 
become expected. It is totally different from seeing a memory corruption 
with @safe - "hey, you lied to me, it is not safe!". Because of that, 
the amount of responsibility in reviewing @trusted is much higher than 
in reviewing @system. I can do the latter because I don't pretend the 
review to be perfect. With @trusted the pressure is much harder.

What is worse, as has already been mentioned, it is not just a one-time 
effort - the need for careful review taints all code that gets called 
from @trusted code. With that much continuous effort required, there 
feels to be no point in trying to go for @safe as opposed to just having 
@system everywhere and relying on old-school memory safety techniques.
Feb 05 2015
next sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 8:24 PM, Dicebot wrote:
 What is worse, as has already been mentioned, it is not just a one-time
 effort - the need for careful review taints all code that gets called
 from @trusted code.
That is only true if the @trusted code has an unsafe interface.

Determining if a function has a safe interface is a far, far smaller and 
more tractable problem than examining all the code that calls it.
Feb 05 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 8:24 PM, Dicebot wrote:
 I referred to this fact with the comment "it is better to make no
 promises than to make one and break it". Simply dealing with an unsafe
 language is something I got used to - all crashes and weird things
 become expected. It is totally different from seeing a memory
 corruption with @safe - "hey, you lied to me, it is not safe!".
 Because of that, the amount of responsibility in reviewing @trusted is
 much higher than in reviewing @system. I can do the latter because I
 don't pretend the review to be perfect. With @trusted the pressure is
 much harder.
Oh I understand. The notion of calibration comes to mind.
 What is worse, as has already been mentioned, it is not just a one-time
 effort - the need for careful review taints all code that gets called
 from @trusted code. With that much continuous effort required, there
 feels to be no point in trying to go for @safe as opposed to just
 having @system everywhere and relying on old-school memory safety
 techniques.
I don't see it as bad, but I see what you're saying. Anyhow, it's likely 
we all grew tired of each other's arguments. Probably best to stop here. 
Fresh perspectives would be great. Until then there is no change to 
@safe/@trusted/@system.

Thanks,

Andrei
Feb 05 2015
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 6 February 2015 at 06:25:06 UTC, Andrei Alexandrescu 
wrote:
 [...]
Oh I understand. The notion of calibration comes to mind.
I'd like D to provide the following guarantee: if I corrupt my memory 
using @safe code, the error must be in code marked @trusted / @system, 
either because it does not provide a safe interface or because it is 
buggy.

We'll never provide a stronger guarantee as long as we allow escaping 
@safe-ty, just like we'll never be able to guarantee that a T is a valid 
T as long as we allow casts. I think the guarantee is worth the effort, 
though.
Feb 06 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/6/15 3:26 AM, Tobias Pankrath wrote:
 I'd like D to provide the following guarantee: if I corrupt my memory
 using @safe code, the error must be in code marked @trusted / @system,
 either because it does not provide a safe interface or because it is
 buggy.
That's what we're going for. -- Andrei
Feb 06 2015
parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/6/2015 8:10 AM, Andrei Alexandrescu wrote:
 On 2/6/15 3:26 AM, Tobias Pankrath wrote:
 I'd like D to provide the following guarantee: if I corrupt my memory
 using @safe code, the error must be in code marked @trusted / @system,
 either because it does not provide a safe interface or because it is
 buggy.
That's what we're going for. -- Andrei
Exactly. And it's been that way all along.
Feb 06 2015
prev sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 4:02 PM, Dicebot wrote:
 I have revoked my Phobos access for the specific reason that I
 can't do the reviewer job properly with such requirements and would
 have been forced to ignore all pull requests that tackle @trusted
 anyway.
It's appropriate to recuse yourself from reviewing aspects of code that you disagree with. I do the same. But it seems a little drastic to withdraw from all the rest. There's plenty, plenty more in Phobos.
Feb 05 2015
prev sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 3:20 PM, Dicebot wrote:
 For me this thread was a clear alarm: @safe in its current state is a
 100% misfeature and is better advertised against until either its
 design changes or effective idioms are presented.
I try to address your points in the new thread "@trust is an 
encapsulation method, not an escape". Please continue there.
Feb 05 2015
prev sibling next sibling parent Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 2:17 PM, H. S. Teoh via Digitalmars-d wrote:

 I mostly like this idea, except that foo() should not be marked @safe.
 It should be marked @trusted because it still needs review, but the
 meaning of @trusted should be changed so that it still enforces
 @safe-ty, except that now @trusted variables are permitted. Or rather,
 I would call them @system variables -- they *cannot* be trusted, and
 must be manually verified.
I still am not sure I get this idea. If @safe code can call @trusted 
code, how is it any more mechanically verified than @trusted code? 
API-wise, there is no difference. The whole idea of @trusted is that you 
don't need to go read the implementation; you have to trust the person 
who wrote it.

As a user of a @trusted function, I really don't care whether it's 
marked @safe or @trusted. And I shouldn't be able to break my @safe 
function by calling it.
 @safe code should not allow any @system variables or any cast(@safe)
 operations, period. Otherwise, anybody can wrap unsafe operations inside
 their @safe function and still clothe it with the sheep's clothing of
 @safe, and @safe becomes a meaningless annotation.
The only way to fix this is to ban @trusted altogether. Which makes 
@safe quite useless indeed.

-Steve
Feb 05 2015
prev sibling parent "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 19:19:51 UTC, H. S. Teoh wrote:
 [...]
This is another interesting addition to the original proposal. My 
initial response is that if I were building the system from the ground 
up, I might not understand why a function needs to be redundantly marked 
@trusted, since the unsafe data is already cast(@safe) inside the 
function. Yet two factors start to sway me in favor of what you're 
saying, despite my initial doubt:

1. Memory safety is serious business. Forcing people to mark potentially 
unsafe functions @trusted instead of @safe, while (as Steven 
Schveighoffer pointed out) actually meaning nothing to the caller, will 
allow maintainers to more easily detect possible problems. In other 
words, it's a redundancy, but it's a GOOD redundancy because of how 
serious safety issues are. I don't know how strong this argument is, but 
it seems sound and I'm willing to go with it, barring good reasons not 
to.

2. It's already there in the language. This heavily reduces the 
"switching costs". While not a great argument in itself, to my mind 
anyway (#pleasebreakourcode ;-), it does seem more likely to be accepted 
by people who fear the change.
Feb 05 2015
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 11:50:18AM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 Reading the thread in bug report
 https://issues.dlang.org/show_bug.cgi?id=14125 gets me thinking:
 perhaps @trusted is mis-designed.
That's the feeling I'm getting too! Well, to be fair, I can see why it arose in its current form historically, but history is not justification for mis-design.
 At the moment, a @trusted function is the same as a @system function,
 but is allowed to be called from a @safe function. And that's it.
 
 But a @trusted function may only need to be marked @trusted because of
 a few lines of code. So we now use this mechanism where we mark the
 @trusted portions with a lambda or static nested function, and call
 that internally, and mark the entire function @safe.
The more I think about it, the more I'm becoming convinced that @trusted 
is a misfeature. Basically, it's a blanket permission for a function to 
perform arbitrary @system operations and yet hide under the sheep's 
clothing of being callable from @safe code. While that may work for 
trivial 4-line functions, it quickly becomes a maintenance nightmare 
when a large, complex function is marked @trusted.

Consider, for example, if trustedFunction() calls helper functions 
helper1(), helper2(), and helper3(), that are currently marked @safe. 
The last review of trustedFunction() verified its safety, *keyed on the 
fact that helper1, helper2, and helper3 are @safe*. Now, helper1, 
helper2 and helper3 are not directly related to trustedFunction; they 
are merely general utilities that trustedFunction depends on. So people 
may make changes to them later on, without realizing the implications 
they may have on the trustworthiness of trustedFunction(). One of these 
changes may, inadvertently or otherwise, make helper1() @system instead 
of @safe. Note that with template attribute inference, this can even be 
an *unconscious* change. However, since helper1() is a general utility, 
nobody realizes that trustedFunction's manual proof of safety has been 
compromised. As a result, the PR gets merged, and now trustedFunction() 
is no longer trustworthy but is still marked @trusted.

Worse yet, it may not be helper1() directly that breaks the 
trustworthiness of trustedFunction(): helper1 calls helper4, which calls 
helper5, which in turn calls helper6. Due to some obscure change in 
helper6, which used to be @safe, it now becomes @system, and because of 
that helper5, helper4, and helper1 all become @system. But since 
trustedFunction() is allowed to call @system functions without 
restriction, the breakage goes completely unnoticed. Even a thorough 
review of trustedFunction may not detect this problem, unless the 
reviewer recursively reviews ALL dependencies of trustedFunction.

Now, if @trusted functions are *still* under the restrictions of @safe 
code, except for small parts explicitly marked to require human 
verification, then if helper1 is outside the marked section, as soon as 
helper6 changes in safety, trustedFunction no longer compiles. This at 
least provides us with some safety net in case things go wrong. However, 
the problem still remains: no matter how confined those explicitly 
marked sections of code may be, they are still subject to the above 
indirect breach of trust problem. I see no real solution to this.

[...]
 At least with the @trusted inner function/lambda, you limit yourself
 to the code that has been marked as such, and you don't need to worry
 about the actual @safe code that is added to the function.
Actually, the @trusted inner function is an abuse of @trusted. Such 
functions are not trustworthy AT ALL, unless, as Walter said in the bug 
comments, they present an API that CANNOT be made to break safety, no 
matter what arguments you give it. The current implementation of 
wrapping &ptr in a @trusted inner function, which simply turns the & 
operator into a @safe operation, is a completely wrong solution. 
Consider:

auto trustedFunc(ref int x) @safe
{
    ref int trustedDeref(int* x) @trusted { return *x; }
    auto p = &x;
    trustedDeref(p) = 999;
}

On the surface, this looks good, since the dangerous operation inside 
trustedFunc has been overtly marked as such, so reviewers will carefully 
review it to make sure it's correct... except, it *cannot* be correct, 
because it's assuming that its CALLER doesn't pass invalid arguments to 
it. Somebody could easily change the code to:

auto trustedFunc(ref int x) @safe
{
    ref int trustedDeref(int* x) @trusted { return *x; }
    int* p; // <--- N.B.: new change, now p is null
    trustedDeref(p) = 999; // arghhh....
}

The compiler continues to accept it blindly, even though it's now 
blatantly wrong.

While the *idea* of marking out specific sections of code inside a 
@trusted function for scrutiny is valid, the above approach is NOT the 
right way to go about implementing it. Rather, what *should* have been 
done is that trustedFunc should be marked @trusted, but the compiler 
STILL imposes @safe restrictions on the function body, except for 
explicitly-marked blocks inside. To use hypothetical syntax, it should 
look something like this:

auto trustedFunc(ref int x) @trusted // <-- @trusted to indicate need of review
{
    int* p = &x;
    @system {
        // code inside this block needs manual verification
        *p = 999;
    }
    //*p = 888; // Illegal: @system operations not allowed outside @system block
}

IOW, the entire function is marked @trusted to indicate that it needs 
review, but the function body is STILL under @safe restrictions. So 
@trusted becomes the same as @safe, except that it permits @system 
blocks inside, and @system operations must be confined to these blocks.
 Or do you?... the problem with this mentality is interactions with
 data inside the @safe code after a @trusted function is called can
 still be subject to memory safety issues. [...]

 But the problem with the above is that @trusted is not needed to apply
 to the mem[0] = 5 call. [...] Marking the whole function as safe is
 kind of meaningless here.
Exactly, this is a totally wrong approach to implementing maintainable 
@trusted functions. The function should not be marked @safe, because it 
is actually trusted, not safe. The inner functions tmalloc and tfree are 
mis-attributed as @trusted, when they are actually @system.
 Changing gears, one of the issues raised in the aforementioned bug is
 that a function like this really should be marked @trusted in its
 entirety. But what does this actually mean? [...]
Yes, @trusted in its current incarnation is fundamentally flawed. 
However, I don't agree that the entire function can be marked @safe. 
Otherwise, @safe code will now contain arbitrary @trusted blocks inside, 
and so anybody can freely escape @safe restrictions just by putting 
objectionable operations inside @trusted blocks. The function still 
needs to be marked @trusted -- to draw attention to the need for 
scrutiny -- *but* the function body is still confined under @safe 
requirements, except that now the "escape hatch" of @trusted code blocks 
is permitted as well.
 What I think we need to approach this correctly is that instead of
 marking *functions* with  trusted, we need to mark *data* and *calls*
 with  trusted.  Once data is used inside a  trusted island, it becomes
 tainted with the  trusted mark. The compiler can no longer guarantee
 safety for that data.
 
 So how can we implement this without too much pain? I envision that we
 can mark lines or blocks inside a  safe function as  trusted (a
  trusted block inside a  system function would be a no-op, but
 allowed).
I think it's better to keep safe as-is: no trusted blocks allowed
inside safe code. Instead, change trusted to mean "safe by default,
but trusted blocks are now allowed for performing operations that must
be manually verified". Any function that contains these "escape
blocks" can no longer be marked safe, because it now requires manual
verification.
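For contrast, here is roughly what such a function would look like
under today's rules (my sketch; the proposed block syntax does not
exist, so the unverified core is only marked by comments):

int[] tmalloc(size_t n) @trusted
{
    import core.stdc.stdlib : malloc;

    int[] result;
    // --- would-be @system block: needs manual verification ---
    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null) assert(0, "out of memory");
    result = p[0 .. n]; // pointer slicing: an unsafe primitive
    // --- end would-be @system block ---

    return result; // under the proposal, still compiler-checked as safe
}

Under the proposal, everything outside the marked block would remain
under safe checking; in today's D, the @trusted attribute switches off
checking for the whole body.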
 I also envision that any variable used inside the function is
 internally marked as  safe until it's used in a  trusted block, and
 then is marked as  trusted (again internally, no requirement to mark
 in syntax). If at any time during compilation a  trusted variable is
 used in  safe code, it's a compile error.
 
 There may be situations in which you DO need this to work, and in
 those cases, I would say a cast( safe) could get rid of that mark. For
 example, if you want to return the result of a  trusted call in a
  safe function.
 
 Such a change would be somewhat disruptive, but much more in line with
 what  trusted calls really mean.
 
 I've left some of the details fuzzy on purpose, because I'm not a
 compiler writer :)
[...]

I like the idea of tainting data as system (for lack of a better
term). This increases the compiler's ability to catch mistakes, and
does not require completely turning off safe checks inside trusted
functions.

I think it was a grave mistake for trusted to completely turn off all
safe checks. trusted functions should still be under safe
restrictions, and any unsafe operations must be explicitly marked as
such before being allowed.

T

-- 
I am not young enough to know everything. -- Oscar Wilde
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 1:54 PM, H. S. Teoh via Digitalmars-d wrote:

 However, I don't agree that the entire function can be marked safe.
 Otherwise, safe code will now contain arbitrary trusted blocks inside,
 and so anybody can freely escape safe restrictions just by putting
 objectionable operations inside trusted blocks. The function still
 needs to be marked trusted -- to draw attention to the need for
 scrutiny -- *but* the function body is still confined under safe
 requirements, except that now the "escape hatch" of trusted code
 blocks is permitted as well.
Let's assume trusted means safe code can call it, but it may have
system-like functionality in it (however it happens). Whether it's in
an internal lambda/nested static function or not, the point is, safe
code can call trusted code. To say that safe makes some promises
above/beyond trusted is just incorrect.

Now, if you're saying trusted cannot be called from safe, then I don't
know what your plan is for trusted :) If that's true, please explain.

-Steve
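To make the call-site equivalence concrete, a tiny sketch (my
illustration, not from the thread):

int f() @safe    { return 1; }
int g() @trusted { return 2; }

void caller() @safe
{
    auto a = f(); // OK: @safe calling @safe
    auto b = g(); // equally OK: from the call site, @trusted is
                  // indistinguishable from @safe
}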
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 02:11:32PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 1:54 PM, H. S. Teoh via Digitalmars-d wrote:
 
However, I don't agree that the entire function can be marked safe.
Otherwise, safe code will now contain arbitrary trusted blocks inside,
and so anybody can freely escape safe restrictions just by putting
objectionable operations inside trusted blocks. The function still
needs to be marked trusted -- to draw attention to the need for
scrutiny -- *but* the function body is still confined under safe
requirements, except that now the "escape hatch" of trusted code
blocks is permitted as well.
Let's assume trusted means safe code can call it, but it may have
system-like functionality in it (however it happens). Whether it's in
an internal lambda/nested static function or not, the point is, safe
code can call trusted code. To say that safe makes some promises
above/beyond trusted is just incorrect.
No, safe means the compiler can mechanically verify that it is safe,
under the assumption that any trusted function called from safe code
has been manually verified. trusted means the compiler was not able to
mechanically verify the whole function, but it has been manually
verified to be safe.

If you allow system variables in safe code, then you essentially make
safe the same thing as trusted, which means the compiler cannot verify
*anything*, so it makes the problem worse: now you have to manually
verify *all* safe code instead of just the trusted portions.
 Now, if you're saying trusted cannot be called from safe, then I
 don't know what your plan is for trusted :) If that's true, please
 explain.
[...]

The idea is that while we would like the compiler to mechanically
verify *everything*, in practice there are some things that the
compiler simply cannot verify. Since those remaining things require
human effort to verify and humans are prone to errors, we would like
to limit the scope of those things by confining them inside trusted
functions, which, ideally, would be few in number and limited in
scope. Everything else should be relegated to safe functions, where we
*require* completely automated verification by the compiler.

As it turns out, even within these trusted functions, we humans could
use some help, so we'd like the compiler to verify as much of these
functions as it can for us, and then we can manually check the
remaining bits that cannot be mechanically verified. To this end, your
idea of tainting data is a valuable tool: by limiting system-ness to
explicitly marked variables, we increase the scope of automatic
verification even inside trusted functions, so that the compiler can
help us catch some things that we may have missed when we manually
check the code.

If we allow these system variables inside safe code, then we have
defeated the purpose of having trusted functions in the first place,
because now the scope of functions that require manual inspection
expands to *all* safe functions, which increases the maintainability
problem rather than reducing it. Likewise, in the current
implementation trusted is a wholesale license to perform arbitrary
system operations, which increases the difficulty of manually
verifying trusted functions. (In fact, it's disastrous, since you
cannot guarantee that a code change in some remote function won't
change the trustworthiness of a verified trusted function.) Requiring
potentially-unsafe data to be explicitly marked system allows us to
continue to impose safe restrictions on everything else, thereby
reducing the scope of the remote-code-change problem to overtly marked
places where we can focus our scrutiny.

T

-- 
Doubtless it is a good thing to have an open mind, but a truly open
mind should be open at both ends, like the food-pipe, with the capacity
for excretion as well as absorption. -- Northrop Frye
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:

 The idea is that while we would like the compiler to mechanically verify
 *everything*, in practice there are some things that the compiler simply
 cannot verify. Since those remaining things require human effort to
 verify and humans are prone to errors, we would like to limit the scope
 of those things by confining them inside  trusted functions, which,
 ideally, would be few in number and limited in scope. Everything else
 should be relegated to  safe functions, where we *require* completely
 automated verification by the compiler.
What's the difference between an internal scope and a separate
function scope? That is, a static internal function can simply be a
private module function and have the same effect. I don't see how your
proposal is more safe than mine, or that somehow I can expect a safe
function never to have manually verified code that it uses.

-Steve
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 03:14:18PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:
 
The idea is that while we would like the compiler to mechanically
verify *everything*, in practice there are some things that the
compiler simply cannot verify. Since those remaining things require
human effort to verify and humans are prone to errors, we would like
to limit the scope of those things by confining them inside  trusted
functions, which, ideally, would be few in number and limited in
scope. Everything else should be relegated to  safe functions, where
we *require* completely automated verification by the compiler.
What's the difference between an internal scope and a separate function scope? That is, a static internal function can simply be a private module function and have the same effect. I don't see how your proposal is more safe than mine, or that somehow I can expect a safe function never to have manually verified code that it uses.
[...]

It's as Walter just said: safe means the compiler has mechanically
verified it; trusted means the compiler has *not* verified it, but a
human did (or so we hope). If you like, think of it as
safe-compiler-verified vs. safe-human-verified. By segregating the
two, you limit the scope of code that needs to be reviewed. Of course,
this is really only of interest to the maintainer of the code; to the
user, both sport a safe API and there is no distinction.

In any case, it doesn't look like anything is going to change after
all, so this discussion is just another of those what-could-have-beens
rather than what could be.

T

-- 
Klein bottle for rent ... inquire within. -- Stephen Mulraney
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 3:23 PM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, Feb 05, 2015 at 03:14:18PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:

 The idea is that while we would like the compiler to mechanically
 verify *everything*, in practice there are some things that the
 compiler simply cannot verify. Since those remaining things require
 human effort to verify and humans are prone to errors, we would like
 to limit the scope of those things by confining them inside  trusted
 functions, which, ideally, would be few in number and limited in
 scope. Everything else should be relegated to  safe functions, where
 we *require* completely automated verification by the compiler.
What's the difference between an internal scope and a separate function scope? That is, a static internal function can simply be a private module function and have the same effect. I don't see how your proposal is more safe than mine, or that somehow I can expect a safe function never to have manually verified code that it uses.
[...] It's as Walter just said: safe means the compiler has
mechanically verified it; trusted means the compiler has *not*
verified it, but a human did (or so we hope). If you like, think of it
as safe-compiler-verified vs. safe-human-verified. By segregating the
two, you limit the scope of code that needs to be reviewed. Of course,
this is really only of interest to the maintainer of the code; to the
user, both sport a safe API and there is no distinction.
I'll put out a strawman similar to my example response to Zach:

@trusted int[] tmalloc(size_t x) { ... }
@trusted void tfree(int[] x) { ... }

Now, let's say these are in some module you use, and your code is:

void foo() @safe
{
    auto x = tmalloc(100);
    tfree(x);
    ...
    x[0] = 1;
}

foo is "mechanically verified", but it's not really, because tmalloc
and tfree are not. Now, you may just trust that tfree is fine, or you
may go and verify what tfree does. But in either case, you still have
the problem that tfree(x) and the usage of x may be far away from each
other, and may even be written by different people at different times.
The compiler will still fail you in this regard, because it will not
complain.

Understand that I don't disagree with your proposal, I just think it
can be reduced to mine, and is unnecessarily complicated.

I think the *fundamental* problem with trusted (currently) is that it
assumes all the code it covers was written simultaneously and is not
allowed to morph. This isn't the way code is written; it's massaged
and tweaked over long periods of time by different people.
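The strawman also shows what a trustworthy design has to do: the
unsafe resource must never escape the reviewed code. A sketch of such
an encapsulation (my illustration, not from the thread):

// The whole lifetime of the buffer is confined to one reviewed
// function, so no @safe caller can touch the memory after it is freed.
int sumOfSquares(size_t n) @trusted
{
    import core.stdc.stdlib : malloc, free;

    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null) assert(0, "out of memory");
    scope(exit) free(p); // freed exactly once, on every exit path

    auto buf = p[0 .. n];
    int sum = 0;
    foreach (i, ref e; buf)
    {
        e = cast(int)(i * i);
        sum += e;
    }
    return sum; // only a value escapes, never the memory
}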
 In any case, it doesn't look like anything is going to change after
 all, so this discussion is just another of those what-could-have-beens
 rather than what could be.
Don't give up so easily ;)

-Steve
Feb 05 2015
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 03:39:16PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 I think the *fundamental* problem with trusted (currently) is that it
 assumes all the code it covers was written simultaneously and is not
 allowed to morph. This isn't the way code is written; it's massaged
 and tweaked over long periods of time by different people.
Thank you, that's exactly what I've been trying to say, but rather
poorly. This is what makes the current incarnation of trusted
unworkable in real life. Putting it on a function is a stamp of
approval that the code has been verified by a human. Unfortunately,
the element of time has been neglected. It may have been verified back
when it was first committed, but now that 10 other people have stuck
their grubby hands into the code, who knows if the original
verification still applies? Yet the trusted label continues to be a
stamp of approval claiming that the function is still safe. It's like
a car insurance sticker without an expiry date. The insurance company
may have gone bust for all I know, but it sure looks good that my car
is still "insured"!

There needs to be some kind of "change insurance" for trusted. If
somebody makes a careless code change that may break the promise of
trusted, there needs to be a way for the compiler to detect this and
complain loudly. Of course, we can't prevent *malicious* changes,
since there's always another way to work around the compiler, but in
the reasonable cases, at the very least, careless mistakes ought to be
caught and pointed out -- such as a safe helper function used by a
trusted function becoming system because somebody modified the
original implementation.

Requiring some kind of annotation on exactly what parts of a trusted
function rely on unsafe (or rather, safe but unverifiable by the
compiler) operations helps by introducing a barrier for mistakes: the
compiler will reject your code unless you consciously mark it up as
trusted (thereby indicating that you have manually verified the code
-- or maliciously introduced unsafe code, as the case may be).

T

-- 
Computers shouldn't beep through the keyhole.
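A small sketch of the failure mode described here (my illustration;
the names are hypothetical):

// Reviewed once: at review time, readFirst was @safe and bounds-checked.
int readFirst(int[] arr) @safe { return arr[0]; }

int useBuffer() @trusted
{
    int[4] buf = [1, 2, 3, 4];
    return readFirst(buf[]);
}

// Later, someone "optimizes" readFirst:
//
//     int readFirst(int[] arr) @system { return *arr.ptr; } // unchecked
//
// useBuffer still compiles unchanged, because @trusted code may freely
// call @system functions -- the old stamp of approval silently outlives
// the verification it was based on.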
Feb 05 2015