
digitalmars.D - misplaced trust?

reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
Reading the thread in bug report 
https://issues.dlang.org/show_bug.cgi?id=14125 gets me thinking: perhaps 
@trusted is mis-designed.

At the moment, a @trusted function is the same as a @system function, 
but is allowed to be called from a @safe function. And that's it.

But a @trusted function may only need to be marked @trusted because of a 
few lines of code. So we now use this mechanism where we mark the 
@trusted portions with a lambda or static nested function, call that 
internally, and mark the entire function @safe.

The benefit of this is that, for the 90% of the code that is @safe, the 
compiler is used as a tool to check the @safe-ty of the function. For 
the 10% that isn't, we have contained it and marked it, and we can focus 
on that portion for scrutiny. This becomes EXTREMELY important when it 
comes to maintenance.

For example, if a @trusted function has some code added to it, the new 
code needs to be examined to see if it really is safe to call from a 
@safe function. This may be no easy feat.

At least with the @trusted inner function/lambda, you limit yourself to 
the code that has been marked as such, and you don't need to worry about 
the actual @safe code that is added to the function.

Or do you?... The problem with this mentality is that interactions with 
data inside the @safe code after a @trusted function is called can still 
be subject to memory safety issues. An example (using static functions 
for clarity):

void foo() @safe
{
    import core.stdc.stdlib : malloc, free;

    static auto tmalloc(size_t x) @trusted
    {
        return (cast(int*) malloc(x * int.sizeof))[0 .. x];
    }
    static void tfree(void[] arr) @trusted { free(arr.ptr); }

    auto mem = tmalloc(100);
    tfree(mem);
    mem[0] = 5; // use after free
}

Note that the final line in the function, setting mem[0] to 5, is the 
only unsafe part. It's safe to malloc, and it's safe to free as long as 
you don't ever refer to that memory again.

But the problem with the above is that @trusted does not need to be 
applied to the mem[0] = 5 line. And imagine that the mem[0] = 5 line may 
simply be added later, by another person who doesn't understand the 
context. Marking the whole function @safe is kind of meaningless here.

Changing gears, one of the issues raised in the aforementioned bug is 
that a function like this really should be marked @trusted in its 
entirety. But what does this actually mean?

When we mark a function @safe, we can assume that the compiler has 
checked it for us. When we mark a function @trusted, we can assume the 
compiler has NOT checked it for us. But how does this help? A @safe 
function can call a @trusted function, so there is little difference 
between a @safe and a @trusted function from an API point of view. I 
would contend that @trusted really should NEVER HAVE BEEN a function 
attribute. You may as well call them all @safe; there is no difference.

What I think we need, to approach this correctly, is to mark *data* and 
*calls* with @trusted instead of marking *functions*. Once data is used 
inside a @trusted island, it becomes tainted with the @trusted mark. The 
compiler can no longer guarantee safety for that data.

So how can we implement this without too much pain? I envision that we 
can mark lines or blocks inside a @safe function as @trusted (a @trusted 
block inside a @system function would be a no-op, but allowed). I also 
envision that any variable used inside the function is internally marked 
as @safe until it's used in a @trusted block, and is then marked as 
@trusted (again internally, with no requirement to mark it in syntax). 
If at any time during compilation a @trusted variable is used in @safe 
code, it's a compile error.

There may be situations in which you DO need this to work, and in those 
cases, I would say a cast(@safe) could get rid of that mark. For 
example, if you want to return the result of a @trusted call from a 
@safe function.
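
To make this concrete, here is how the foo() example above might read 
under such a scheme (hypothetical syntax - neither @trusted blocks nor 
the taint tracking exists today):

void foo() @safe
{
    import core.stdc.stdlib : malloc, free;

    int[] mem;
    @trusted { mem = (cast(int*) malloc(100 * int.sizeof))[0 .. 100]; }
    // mem is now internally marked @trusted (tainted)

    @trusted { free(mem.ptr); } // ok, tainted data inside a @trusted block
    mem[0] = 5;                 // compile error: @trusted data used in @safe code
}

Whoever later adds the mem[0] = 5 line would be forced to either put it 
in a @trusted block or cast(@safe) the data, making the unsafe 
interaction visible at the exact line where it happens.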

Such a change would be somewhat disruptive, but much more in line with 
what @trusted calls really mean.

I've left some of the details fuzzy on purpose, because I'm not a 
compiler writer :)

Destroy

-Steve
Feb 05 2015
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 5 February 2015 at 16:50:18 UTC, Steven 
Schveighoffer wrote:
 I've left some of the details fuzzy on purpose, because I'm not 
 a compiler writer :)
So you want to spend another 8 years implementing linear typing for D:

http://en.wikipedia.org/wiki/Substructural_type_system

Or maybe give up @safe. Or implement either a behavioural or dependent 
type system. Lots of options. Lots of theory...

Without type theory => leaky cauldron.
Feb 05 2015
prev sibling next sibling parent reply "Zach the Mystic" <reachzach@gggmail.com> writes:
On Thursday, 5 February 2015 at 16:50:18 UTC, Steven 
Schveighoffer wrote:
 Reading the thread in bug report
 https://issues.dlang.org/show_bug.cgi?id=14125 gets me
 thinking: perhaps @trusted is mis-designed.

 [...]
Hey I like the creativity you're showing. Just to give people a concrete idea, you might show some sample code and illustrate how things work. It sure helps when I'm trying to think about things.
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 1:12 PM, Zach the Mystic wrote:

 Hey I like the creativity you're showing. Just to give people a concrete
 idea, you might show some sample code and illustrate how things work. It
 sure helps when I'm trying to think about things.
So for example:

@safe int* foo()
{
    int* x;
    int* y;
    int z;

    x = new int;          // ok
    //y = &z;             // not OK
    @trusted y = &z;      // OK, but now y is marked as @trusted

    // return y;          // not OK, cannot return a @trusted pointer
                          // from a @safe function
    return cast(@safe) y; // ok, we are overriding the compiler
    // and of course return x; would be ok
}

-Steve
Feb 05 2015
parent reply "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 18:21:40 UTC, Steven 
Schveighoffer wrote:
 [...]
`cast(@safe)`... interesting. It's the most fine-tuned way of adding 
safety, whereas trusting a whole function is the most blunt way.

I've been hatching a scheme for reference safety in my head which would 
automatically track `@trusted y = &z;` above, marking `y` with 
"scopedepth(1)", which would be unreturnable in @safe code.

I can anticipate the objection that giving people too much power will 
encourage them to abuse it... but then again, if that were true, who let 
them mark the whole function `@trusted` to begin with? Your proposal 
really pinpoints the actual code which needs to be worked on. You're 
basically moving the unit of safety from the *function* to the 
*pointer*, which makes sense to me, since only a pointer can really be 
unsafe.
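
To sketch what that tracking might look like (hypothetical - 
"scopedepth" is not an existing attribute; the comments just show what 
the compiler would infer):

@safe int* foo()
{
    int z;                // local: scope depth 1
    @trusted int* y = &z; // y inferred as scopedepth(1)
    int* x = new int;     // GC heap: scopedepth(0)

    // return y;          // error: scopedepth(1) data cannot escape foo
    return x;             // ok: heap data outlives the stack frame
}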
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 06:56:02PM +0000, Zach the Mystic via Digitalmars-d
wrote:
 [...]
I mostly like this idea, except that foo() should not be marked @safe. 
It should be marked @trusted because it still needs review, but the 
meaning of @trusted should be changed so that it still enforces 
@safe-ty, except that now @trusted variables are permitted. Or rather, I 
would call them @system variables -- they *cannot* be trusted, and must 
be manually verified.

@safe code should not allow any @system variables or any cast(@safe) 
operations, period. Otherwise, anybody can wrap unsafe operations inside 
their @safe function and still clothe it with the sheep's clothing of 
@safe, and @safe becomes a meaningless annotation.

In short, my proposal is:

- @safe should continue being safe -- no (potentially) unsafe 
  operations are allowed, period.

  Rationale: allowing @system variables in @safe code makes the function 
  non-verifiable mechanically. This completely breaks the whole point of 
  @safe.

- Change the meaning of @trusted (as applied to a function) to require 
  @safe inside the function body, but in addition permit @system 
  variables and cast(@safe).

  Rationale: the function cannot be verified mechanically to be safe, 
  therefore it cannot be marked @safe. It must be marked @trusted to 
  draw attention to the fact that manual review is required. However, 
  this does not constitute license to perform arbitrary @system 
  operations. Instead, any @system code/variable inside the @trusted 
  function must be explicitly marked as such, to indicate that these 
  items require special attention during review. Everything else must 
  still conform to @safe requirements.

- Introduce @system variables for holding tainted values that the 
  compiler cannot guarantee the safety of, as well as cast(@safe), as 
  described in Steven's post. These constructs are only permitted inside 
  @trusted functions. They are prohibited in @safe code, and are no-ops 
  in @system code.

  Rationale: to reduce the maintainability problem, @trusted functions 
  should not allow @system code by default. Rather, the scope of @system 
  code/data inside a @trusted function should be restricted by requiring 
  explicit marking. The compiler then helps the verification process by 
  ensuring that anything not explicitly marked is still @safe.
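
A sketch of how a @trusted function might look under these rules 
(hypothetical syntax - @system variables and cast(@safe) do not exist 
in the language today; tmalloc is just an illustrative name):

@trusted int[] tmalloc(size_t n)
{
    import core.stdc.stdlib : malloc;

    // Tainted value must live in a @system variable; this is what
    // reviewers need to scrutinize.
    @system int* p = cast(int*) malloc(n * int.sizeof);
    if (p is null) return null;

    // The rest of the body is still checked under @safe rules;
    // releasing the tainted value requires an explicit override.
    return (cast(@safe) p)[0 .. n];
}


T

-- 
Nobody is perfect.  I am Nobody. -- pepoluan, GKC forum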
Feb 05 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 11:17 AM, H. S. Teoh via Digitalmars-d wrote:
 In short, my proposal is:
Tainted variables are an interesting topic, but quite distinct from the 
notion of separating safe code from unsafe code.

As much as I was shocked about the use of @trusted/@safe/@system in 
std.file, std.array and sadly possibly in other places, I found no 
evidence that the feature is misdesigned. I continue to consider it a 
simple, sound, and very effective method of building and interfacing 
robust code. An excellent engineering solution that offers a lot of 
power at a modest cost.

I do not support this proposal to change the semantics of 
@trusted/@safe/@system. A separate tainted data proposal might be of 
interest for loosely related topics.


Andrei
Feb 05 2015
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 11:49:40AM -0800, Andrei Alexandrescu via Digitalmars-d
wrote:
 [...]
Frankly, I don't understand how you could fail to see the maintenance 
nightmare @trusted, as you envisioned it, is causing. The wholesale 
license to freely perform @system operations inside a @trusted function 
greatly undermines the trustworthiness of SafeD as a whole.

The problem is that human verification only happens once, and thereafter 
@trusted continues to be applied to that function, even if later on, 
further downstream in the call graph, a remote helper function has 
changed in a way that compromises that trustworthiness. This happens 
with no warnings whatsoever from the compiler, and so we continue to 
build our house upon sinking sand.

To truly be sure that our @trusted functions are actually trustworthy, 
we have to continually review them not just for changes within the 
function bodies, but also in all downstream functions used within. The 
review requirements for maintaining such a standard are impractically 
onerous. And, judging from the trends in our industry, what will likely 
happen is that this review is never actually carried out until a nasty 
security exploit is discovered in @safe code. By then, it will be far 
too late.

Maybe to you this situation is acceptable, but to me, this is an utter 
maintenance nightmare.
 I do not support this proposal to change the semantics of
 @trusted/@safe/@system. A separate tainted data proposal might be of
 interest for loosely related topics.
[...]

That is your own prerogative, but this discussion has convinced me that 
@safe is a joke, and your continual refusal to admit the existence of 
any problems, despite many active Phobos contributors describing them to 
you in detail, has made me completely lose interest in this subject. 
From now on I will not bother with @safe in my projects anymore, since 
it has become clear that it does not live up to its promise and probably 
never will (this specific issue is only the tip of the iceberg; there 
are many other problems with @safe that you may find in bugzilla), and I 
have no further interest in contributing to @safe-related issues in 
Phobos.


T

-- 
Mediocrity has been pushed to extremes.
Feb 05 2015
prev sibling next sibling parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 11:49 AM, Andrei Alexandrescu wrote:
 [...]
I agree.

So the question is, what does @trusted actually buy you, since the 
compiler can't check it?

It serves as notice that "This function merits special attention during 
code review to check that it has a safe interface and that its 
implementation is correct."
Feb 05 2015
next sibling parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 3:13 PM, Walter Bright wrote:
 [...]
 I agree. So the question is, what does @trusted actually buy you, since
 the compiler can't check it? It serves as notice that "This function
 merits special attention during code review to check that it has a safe
 interface and that its implementation is correct."
That also applies to @safe functions, since they can call @trusted 
functions.

In essence, @trusted buys you headaches. I think we should try to lessen 
them.

-Steve
Feb 05 2015
parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 12:25 PM, Steven Schveighoffer wrote:
 On 2/5/15 3:13 PM, Walter Bright wrote:
 So the question is, what does @trusted actually buy you, since the
 compiler can't check it?

 It serves as notice that "This function merits special attention during
 code review to check that it has a safe interface and that its
 implementation is correct."
That also applies to @safe functions since they can call @trusted functions.
No - the @trusted function is reviewed to ensure it has a safe 
interface. Then there is no need to review for safety anyone that calls 
it. It's the whole point.

For example, https://issues.dlang.org/show_bug.cgi?id=14127

A rather cursory inspection reveals that these @trusted functions have 
unsafe interfaces, and are therefore unacceptable in Phobos. (Of course, 
D will let you write such code because it is a systems programming 
language, but Phobos must be an example of best practices, and these are 
not.)
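
To illustrate with a made-up example (not taken from the bug report): a 
@trusted function has a safe interface when there is nothing a @safe 
caller can pass to it that corrupts memory.

// Unsafe interface: trusts the caller that len matches the allocation.
// A @safe caller can pass any len, so this must not be @trusted.
@trusted int[] fromRaw(int* p, size_t len) { return p[0 .. len]; }

// Safe interface: no argument a @safe caller passes can break it.
@trusted int[] makeInts(size_t n)
{
    import core.stdc.stdlib : calloc;
    auto p = cast(int*) calloc(n, int.sizeof);
    return p is null ? null : p[0 .. n];
}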
 In essence, @trusted buys you headaches. I think we should try to lessen them.
An aspect of a well-designed encapsulation is that the number of 
@trusted interfaces is minimized. If you find an abstraction that has 
@trusted sprinkled liberally through it, that's an indicator of a failed 
abstraction.
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 3:54 PM, Walter Bright wrote:
 On 2/5/2015 12:25 PM, Steven Schveighoffer wrote:
 On 2/5/15 3:13 PM, Walter Bright wrote:
 So the question is, what does  trusted actually buy you, since the
 compiler can't check it?

 It serves as notice that "This function merits special attention during
 code review to check that it has a safe interface and that its
 implementation is correct."
That also applies to safe functions since they can call trusted functions.
 No - the @trusted function is reviewed to ensure it has a safe
 interface. Then there is no need to review for safety anyone that calls
 it. It's the whole point.
So you're saying that @safe is mechanically verified as long as @trusted 
functions are manually reviewed. It's that manually reviewed part that I 
think we have an issue with. And there is definitely a feel of "I can 
use @trusted because I know I will only call it this way" without 
considering future possible calls.

I fully expect we can devise rules to make sure @trusted memory details 
do not leak out of the functions in an unsafe way (and have a much 
better, safer code base as a result). But it would be nice if the 
compiler helped enforce them.
 In essence, @trusted buys you headaches. I think we should try to
 lessen them.
 An aspect of a well-designed encapsulation is that the number of
 @trusted interfaces is minimized. If you find an abstraction that has
 @trusted sprinkled liberally through it, that's an indicator of a
 failed abstraction.
I think you just made this up :) But I agree that @trusted should be 
used sparingly, not liberally. The problem is that when faced with such 
a huge function that calls one non-@safe one, marking the whole thing as 
@trusted disables all the mechanical verification for everything. There 
has to be a better way.
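
A contrived illustration of the granularity problem (oneSystemCall is a 
hypothetical @system helper):

@system void oneSystemCall(int* p) {}

// Marking the whole function @trusted turns off checking everywhere:
@trusted void processAll(int[] data)
{
    foreach (ref d; data) d += 1; // perfectly @safe, but no longer verified
    oneSystemCall(data.ptr);      // the one line that actually needed trust
}

// Confining the escape keeps the compiler checking everything else:
void processAll2(int[] data) @safe
{
    foreach (ref d; data) d += 1; // mechanically verified
    () @trusted { oneSystemCall(data.ptr); }();
}

-Steve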
Feb 05 2015
next sibling parent "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 21:15:52 UTC, Steven 
Schveighoffer wrote:
 [...]
But it doesn't have to be accepted into D! :-)
Feb 05 2015
prev sibling parent reply Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 1:15 PM, Steven Schveighoffer wrote:
 So you're saying that @safe is mechanically verified as long as
 @trusted functions are manually reviewed.
Yes.
 It's that manually reviewed part that I think we have an issue with.
 And there is definitely a feel of "I can use @trusted because I know I
 will only call it this way" without considering future possible calls.
It's no different from:

Q: the compiler gave me a mismatched type error!
A: put in a cast and it'll compile!

I.e. you can always find a way to make buggy, badly designed code 
compile. It's up to whoever does the reviews to have some sort of 
standards.
 An aspect of a well-designed encapsulation is that the number of
 @trusted interfaces is minimized. If you find an abstraction that has
 @trusted sprinkled liberally through it, that's an indicator of a
 failed abstraction.
 I think you just made this up :)
No, I've always thought of @trusted that way. It isn't different from 
classes that allow too many functions to access private members.
 But I agree that @trusted should be used sparingly, not liberally. The
 problem is that when faced with such a huge function that calls one
 non-@safe one, marking the whole thing as @trusted disables all the
 mechanical verification for everything.
Then that's a candidate for a redesign of the abstraction.
Feb 05 2015
parent Jacob Carlborg <doob@me.com> writes:
On 2015-02-05 22:35, Walter Bright wrote:

 It's no different from:

 Q: the compiler gave me a mismatched type error!
 A: put in a cast and it'll compile!
The difference is that if you do that, the compiler won't disable type 
checking for the whole function.

-- 
/Jacob Carlborg
Feb 08 2015
prev sibling parent "CraigDillabaugh" <craig.dillabaugh gmail.com> writes:
On Thursday, 5 February 2015 at 20:13:32 UTC, Walter Bright wrote:
 [...]
 It serves as notice that "This function merits special attention
 during code review to check that it has a safe interface and that its
 implementation is correct."
Couldn't you just use a comment?
Feb 05 2015
prev sibling next sibling parent Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 2:49 PM, Andrei Alexandrescu wrote:
 [...]
The proposal (the original one I stated, not H.S.'s) is to do 2 things:

1. Clean up the syntax for @trusted escapes inside @safe code that we 
have settled on.

2. Add a mechanism to make those escapes safer and more reviewable.

I don't think the idea behind @trusted is incorrect, just that making it 
a function attribute is mis-designed. Note that in my proposal, you can 
essentially create a @trusted function just by marking the whole thing 
@trusted.

-Steve
Feb 05 2015
prev sibling next sibling parent reply "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 19:49:41 UTC, Andrei Alexandrescu 
wrote:
 [...]
At minimum, there needs to be official documented guidance on how to use 
@trusted. If Phobos developers got this far without knowing how to use 
it (assuming their complaints about its design are indeed meritless), 
how can anyone else be expected to?
Feb 05 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 12:24 PM, Zach the Mystic wrote:
 At minimum, there needs to be official documented guidance on how to use
 @trusted. If Phobos developers got this far without knowing how to use
 it (assuming their complaints about its design are indeed meritless),
 how can anyone else be expected to?
Yah, some documentation is needed. -- Andrei
Feb 05 2015
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 5 February 2015 at 19:49:41 UTC, Andrei Alexandrescu 
wrote:
 [...]
Probably you and Walter should try maintaining Phobos alone for the next 
year then, and see how it works. Maybe that will make some issues more 
convincing.

It is absolutely ridiculous that every single one of the existing Phobos 
reviewers who has actually worked with that code in practice says that 
there is a problem with @trusted, and you keep rejecting it with a "no, 
it is all good as it is" argument.

For me this thread was a clear alarm: @safe in its current state is a 
100% misfeature and is better advertised against until either its design 
changes or effective idioms are presented.
Feb 05 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
To put it differently - there is no way I would have ever taken the 
risk of merging a 50-line @trusted function, be it in Phobos or in a 
work project. And it is better to not make promises than to break them.
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 3:22 PM, Dicebot wrote:
 To put it differently - there is no way I would have ever taken the
 risk of merging a 50-line @trusted function, be it in Phobos or in a
 work project.
Surely you're exaggerating. We're looking at a function that performs 
system calls and reads into a memory buffer allocated appropriately (and 
economically). Claiming that that function is @safe, then enumerating 
the numerous unsafe and unprovable escape hatches it uses, is like 
someone claiming "I'm a virgin - of course, save for those six 
one-night stands I've had."

It's unclear what you're advocating here. I don't think your previous 
arguments stand scrutiny. One possible new argument might be an analysis 
of how this:

https://github.com/D-Programming-Language/phobos/blob/accb351b96bb04a6890bb7df018749337e55eccc/std/file.d#L194

is easier to reason about than this:

https://github.com/D-Programming-Language/phobos/blob/master/std/file.d#L194


Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 5 February 2015 at 23:47:00 UTC, Andrei Alexandrescu 
wrote:
 [...]
 Surely you're exaggerating.
Not even slightly. I have revoked my Phobos access for the specific 
reason that I can't do the reviewer job properly with such requirements, 
and would have been forced to ignore all pull requests that tackle 
@trusted anyway.
 We're looking at a function that performs system calls and
 reads into a memory buffer allocated appropriately (and
 economically). Claiming that that function is @safe, then
 enumerating the numerous unsafe and unprovable escape hatches
 it uses, is like someone claiming "I'm a virgin - of course,
 save for those six one-night stands I've had."
So what? I don't care how justified it is; I simply don't trust my 
attention span enough to verify that foo() is a virgin. I am not a 
rock-star programmer and I know my limits. Verifying 50 lines of 
@trusted with no help from the compiler at all is beyond those limits.

When all exceptions to safety are explicitly listed, I can review the 
implementation knowing "ok, this will be safe _unless_ it gets screwed 
by data coming from those @trusted wrappers". And that is a big 
mentality switch that helps to maintain focus.
 It's unclear what you're advocating here. I don't think your 
 previous arguments stand scrutiny. One possible new argument 
 might be an analysis on how this:

 https://github.com/D-Programming-Language/phobos/blob/accb351b96bb04a6890bb7df018749337e55eccc/std/file.d#L194

 is easier to reason about than this:

 https://github.com/D-Programming-Language/phobos/blob/master/std/file.d#L194
It will be a very short analysis, considering I am not able to reason 
about the latter one at all - it simply requires too much of a time 
investment for me to even consider it.
Feb 05 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at all is
 beyond those limits.
Do you use @safe at work? -- Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:21:45 UTC, Andrei Alexandrescu 
wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at
 all is beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:22 PM, Dicebot wrote:
 On Friday, 6 February 2015 at 00:21:45 UTC, Andrei Alexandrescu wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at all is
 beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
It's candid. You're saying you cannot verify the safety of a 50-line 
function, but I know you are using D1 at work. So I don't see how your 
claim can be true.


Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:31:06 UTC, Andrei Alexandrescu 
wrote:
 On 2/5/15 4:22 PM, Dicebot wrote:
 On Friday, 6 February 2015 at 00:21:45 UTC, Andrei 
 Alexandrescu wrote:
 On 2/5/15 4:02 PM, Dicebot wrote:
 Verifying 50 lines of @trusted with no help from the compiler at
 all is beyond those limits.
 Do you use @safe at work? -- Andrei
If it is sarcasm, it could have been better.
 It's candid. You're saying you cannot verify the safety of a 50-line
 function, but I know you are using D1 at work. So I don't see how your
 claim can be true.
You do realize that I was one of the reviewers for those Phobos pull 
requests you complain about?
Feb 05 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 4:37 PM, Dicebot wrote:
 [...]
 You do realize that I was one of the reviewers for those Phobos pull
 requests you complain about?
The reference was to the fact that you are obviously a competent engineer using an unsafe language, yet claim to be completely hopeless in reviewing a 50-liner that reads data from a file. -- Andrei
Feb 05 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 6 February 2015 at 00:56:09 UTC, Andrei Alexandrescu
wrote:
 [...]
 The reference was to the fact that you are obviously a competent
 engineer using an unsafe language, yet claim to be completely hopeless
 in reviewing a 50-liner that reads data from a file. -- Andrei
I referred to this fact with the comment "it is better to make no 
promises than to make one and break it". Simply dealing with an unsafe 
language is something I got used to - all crashes and weird things 
become expected. It is totally different from seeing a memory corruption 
with @safe - "hey, you lied to me, it is not safe!". Because of that, 
the amount of responsibility in reviewing @trusted is much higher than 
in reviewing @system. I can do the latter because I don't pretend the 
review to be perfect. With @trusted the pressure is much harder.

What is worse, as has already been mentioned, it is not just a one-time 
effort - the need for careful review taints all code that gets called 
from @trusted code. With that much continuous effort required, there 
feels to be no point in trying to go for @safe as opposed to just having 
@system everywhere and relying on old-school memory safety techniques.
Feb 05 2015
next sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 8:24 PM, Dicebot wrote:
 What is worse, as has already been mentioned, it is not just a one-time
 effort - the need for careful review taints all code that gets called
 from @trusted code.
That is only true if the @trusted code has an unsafe interface.

Determining if a function has a safe interface is a far, far smaller and 
more tractable problem than examining all the code that calls it.
Feb 05 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/5/15 8:24 PM, Dicebot wrote:
 I referred to this fact with the comment "it is better to make no
 promises than to make one and break it". Simply dealing with an unsafe
 language is something I got used to - all crashes and weird things
 become expected. It is totally different from seeing a memory
 corruption with @safe - "hey, you lied to me, it is not safe!".
 Because of that, the amount of responsibility in reviewing @trusted is
 much higher than in reviewing @system. I can do the latter because I
 don't pretend the review to be perfect. With @trusted the pressure is
 much harder.
Oh I understand. The notion of calibration comes to mind.
 What is worse, as has already been mentioned, it is not just a one-time
 effort - the need for careful review taints all code that gets called
 from @trusted code. With that much continuous effort required, there
 feels to be no point in trying to go for @safe as opposed to just
 having @system everywhere and relying on old-school memory safety
 techniques.
I don't see it as bad, but I see what you're saying. Anyhow, it's likely 
we all grew tired of each other's arguments. Probably best to stop here. 
Fresh perspectives would be great. Until then there is no change to 
@safe/@trusted/@system.

Thanks,

Andrei
Feb 05 2015
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 6 February 2015 at 06:25:06 UTC, Andrei Alexandrescu 
wrote:
 [...]
Oh I understand. The notion of calibration comes to mind.
I'd like D to provide the following guarantee: if I corrupt my memory 
using @safe code, the error must be in code marked @trusted / @system, 
either because it does not provide a safe interface or because it is 
buggy.

We'll never provide a stronger guarantee as long as we allow escaping 
@safe-ty, just like we'll never be able to guarantee that a T is a valid 
T as long as we allow casts. I think the guarantee is worth the effort, 
though.
Feb 06 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
On 2/6/15 3:26 AM, Tobias Pankrath wrote:
 I'd like D to provide the following guarantee: if I corrupt my memory
 using @safe code, the error must be in code marked @trusted / @system,
 either because it does not provide a safe interface or because it is
 buggy.
That's what we're going for. -- Andrei
Feb 06 2015
parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/6/2015 8:10 AM, Andrei Alexandrescu wrote:
 On 2/6/15 3:26 AM, Tobias Pankrath wrote:
 I'd like D to provide the following guarantee: if I corrupt my memory
 using @safe code, the error must be in code marked @trusted / @system,
 either because it does not provide a safe interface or because it is
 buggy.
That's what we're going for. -- Andrei
Exactly. And it's been that way all along.
Feb 06 2015
prev sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 4:02 PM, Dicebot wrote:
 I have revoked my Phobos access for the specific reason that I
 can't do the reviewer job properly with such requirements and would
 have been forced to ignore all pull requests that tackle @trusted
 anyway.
It's appropriate to recuse yourself from reviewing aspects of code that you disagree with. I do the same. But it seems a little drastic to withdraw from all the rest. There's plenty, plenty more in Phobos.
Feb 05 2015
prev sibling parent Walter Bright <newshound2@digitalmars.com> writes:
On 2/5/2015 3:20 PM, Dicebot wrote:
 For me this thread was a clear alarm: @safe in its current state is a
 100% misfeature and is better advertised against until either its
 design changes or effective idioms are presented.
I try to address your points in the new thread "@trust is an 
encapsulation method, not an escape". Please continue there.
Feb 05 2015
prev sibling next sibling parent Steven Schveighoffer <schveiguy@yahoo.com> writes:
On 2/5/15 2:17 PM, H. S. Teoh via Digitalmars-d wrote:

 I mostly like this idea, except that foo() should not be marked @safe.
 It should be marked @trusted because it still needs review, but the
 meaning of @trusted should be changed so that it still enforces
 @safe-ty, except that now @trusted variables are permitted. Or rather,
 I would call them @system variables -- they *cannot* be trusted, and
 must be manually verified.
I still am not sure I get this idea. If @safe code can call @trusted 
code, how is it any more mechanically verified than @trusted code? 
API-wise, there is no difference. The whole idea of @trusted is that you 
don't need to go read the implementation; you have to trust the person 
who wrote it.

As a user of a @trusted function, I really don't care whether it's 
marked @safe or @trusted. And I shouldn't be able to break my @safe 
function by calling it.
 @safe code should not allow any @system variables or any cast(@safe)
 operations, period. Otherwise, anybody can wrap unsafe operations inside
 their @safe function and still clothe it with the sheep's clothing of
 @safe, and @safe becomes a meaningless annotation.
The only way to fix this is to ban @trusted altogether. Which makes 
@safe quite useless indeed.

-Steve
Feb 05 2015
prev sibling parent "Zach the Mystic" <reachzach gggmail.com> writes:
On Thursday, 5 February 2015 at 19:19:51 UTC, H. S. Teoh wrote:
 [...]
This is another interesting addition to the original proposal. My 
initial response is that if I were building the system from the ground 
up, I might not understand why a function needs to be redundantly marked 
@trusted, since the unsafe data is already cast(@safe) inside the 
function. Yet two factors start to sway me in favor of what you're 
saying, despite my initial doubt:

1. Memory safety is serious business. Forcing people to mark potentially 
unsafe functions @trusted instead of @safe, while (as Steven 
Schveighoffer pointed out) actually meaning nothing to the caller, will 
allow maintainers to more easily detect possible problems. In other 
words, it's a redundancy, but it's a GOOD redundancy because of how 
serious safety issues are. I don't know how strong this argument is, but 
it seems sound and I'm willing to go with it, barring good reasons not 
to.

2. It's already there in the language. This heavily reduces the 
"switching costs". While not a great argument in itself, to my mind 
anyway (#pleasebreakourcode ;-), it does seem more likely to be accepted 
by people who fear the change.
Feb 05 2015
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 11:50:18AM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 Reading the thread in bug report
 https://issues.dlang.org/show_bug.cgi?id=14125 gets me thinking:
 perhaps @trusted is mis-designed.
That's the feeling I'm getting too! Well, to be fair, I can see why it arose in its current form historically, but history is not justification for mis-design.
 At the moment, a @trusted function is the same as a @system function,
 but is allowed to be called from a @safe function. And that's it.
 
 But a @trusted function may only need to be marked @trusted because of
 a few lines of code. So we now use this mechanism where we mark the
 @trusted portions with a lambda or static nested function, and call
 that internally, and mark the entire function @safe.
The more I think about it, the more I'm becoming convinced that @trusted 
is a misfeature. Basically, it's a blanket permission for a function to 
perform arbitrary @system operations and yet hide under the sheep's 
clothing of being callable from @safe code. While that may work for 
trivial 4-line functions, it quickly becomes a maintenance nightmare 
when a large, complex function is marked @trusted.

Consider, for example, if trustedFunction() calls helper functions 
helper1(), helper2(), and helper3(), that are currently marked @safe. 
The last review of trustedFunction() verified its safety, *keyed on the 
fact that helper1, helper2, and helper3 are @safe*. Now, helper1, 
helper2 and helper3 are not directly related to trustedFunction; they 
are merely general utilities that trustedFunction depends on. So people 
may make changes to them later on, without realizing the implications 
they may have on the trustworthiness of trustedFunction(). One of these 
changes may, inadvertently or otherwise, make helper1() @system instead 
of @safe. Note that with template attribute inference, this can even be 
an *unconscious* change. However, since helper1() is a general utility, 
nobody realizes that trustedFunction's manual proof of safety has been 
compromised. As a result, the PR gets merged, and now trustedFunction() 
is no longer trustworthy but is still marked @trusted.

Worse yet, it may not be helper1() directly that breaks the 
trustworthiness of trustedFunction(): helper1 calls helper4, which calls 
helper5, which in turn calls helper6. Due to some obscure change in 
helper6, which used to be @safe, it now becomes @system, and because of 
that helper5, helper4, and helper1 all become @system. But since 
trustedFunction() is allowed to call @system functions without 
restriction, the breakage goes completely unnoticed. Even a thorough 
review of trustedFunction may not detect this problem, unless the 
reviewer recursively reviews ALL dependencies of trustedFunction.

Now, if @trusted functions are *still* under the restrictions of @safe 
code, except for small parts explicitly marked to require human 
verification, then if helper1 is outside the marked section, as soon as 
helper6 changes in safety, trustedFunction no longer compiles. This at 
least provides us with some safety net in case things go wrong. However, 
the problem still remains: no matter how confined those explicitly 
marked sections of code may be, they are still subject to the above 
indirect breach of trust problem. I see no real solution to this.

[...]
 At least with the @trusted inner function/lambda, you limit yourself
 to the code that has been marked as such, and you don't need to worry
 about the actual @safe code that is added to the function.
Actually, the @trusted inner function is an abuse of @trusted. Such 
functions are not trustworthy AT ALL, unless, as Walter said in the bug 
comments, they present an API that CANNOT be made to break safety, no 
matter what arguments you give it. The current implementation of 
wrapping &ptr in a @trusted inner function, which simply turns the & 
operator into a @safe operation, is a completely wrong solution. 
Consider:

auto trustedFunc(ref int x) @safe
{
    ref int trustedDeref(int* x) @trusted { return *x; }
    auto p = &x;
    trustedDeref(p) = 999;
}

On the surface, this looks good, since the dangerous operation inside 
trustedFunc has been overtly marked as such, so reviewers will carefully 
review it to make sure it's correct... except, it *cannot* be correct, 
because it's assuming that its CALLER doesn't pass invalid arguments to 
it. Somebody could easily change the code to:

auto trustedFunc(ref int x) @safe
{
    ref int trustedDeref(int* x) @trusted { return *x; }
    int* p; // <--- N.B.: new change, now p is null
    trustedDeref(p) = 999; // arghhh....
}

The compiler continues to accept it blindly, even though it's now 
blatantly wrong.

While the *idea* of marking out specific sections of code inside a 
@trusted function for scrutiny is valid, the above approach is NOT the 
right way to go about implementing it. Rather, what *should* have been 
done is that trustedFunc should be marked @trusted, but the compiler 
STILL imposes @safe restrictions on the function body, except for 
explicitly-marked blocks inside. To use hypothetical syntax, it should 
look something like this:

auto trustedFunc(ref int x) @trusted // <-- @trusted to indicate need of review
{
    int* p = &x;
    @system {
        // code inside this block needs manual verification
        *p = 999;
    }
    //*p = 888; // Illegal: @system operations not allowed outside @system block
}

IOW, the entire function is marked @trusted to indicate that it needs 
review, but the function body is STILL under @safe restrictions. So 
@trusted becomes the same as @safe, except that it permits @system 
blocks inside, and @system operations must be confined to these blocks.
 Or do you?... the problem with this mentality is interactions with
 data inside the @safe code after a @trusted function is called can
 still be subject to memory safety issues. [...]

 But the problem with the above is that @trusted is not needed to apply
 to the mem[0] = 5 call. [...] Marking the whole function as safe is
 kind of meaningless here.
Exactly, this is a totally wrong approach to implementing maintainable 
@trusted functions. The function should not be marked @safe, because it 
is actually trusted, not safe. The inner functions tmalloc and tfree are 
mis-attributed as @trusted, when they are actually @system.
 Changing gears, one of the issues raised in the aforementioned bug is
 that a function like this really should be marked @trusted in its
 entirety. But what does this actually mean? [...]
Yes, @trusted in its current incarnation is fundamentally flawed. 
However, I don't agree that the entire function can be marked @safe. 
Otherwise, @safe code will now contain arbitrary @trusted blocks inside, 
and so anybody can freely escape @safe restrictions just by putting 
objectionable operations inside @trusted blocks. The function still 
needs to be marked @trusted -- to draw attention to the need for 
scrutiny -- *but* the function body is still confined under @safe 
requirements, except that now the "escape hatch" of @trusted code blocks 
is permitted as well.
 What I think we need to approach this correctly is that instead of
 marking *functions* with  trusted, we need to mark *data* and *calls*
 with  trusted.  Once data is used inside a  trusted island, it becomes
 tainted with the  trusted mark. The compiler can no longer guarantee
 safety for that data.
 
 So how can we implement this without too much pain? I envision that we
 can mark lines or blocks inside a  safe function as  trusted (a
  trusted block inside a  system function would be a no-op, but
 allowed).
I think it's better to keep safe as-is: no trusted blocks allowed
inside safe code. Instead, change trusted to mean "safe by default,
but trusted blocks are now allowed for performing operations that must
be manually verified". Any function that contains these "escape
blocks" can no longer be marked safe, because it now requires manual
verification.
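For contrast, here is roughly what such a function would look like
under today's rules (my sketch; the proposed block syntax does not
exist, so the unverified core is only marked by comments):

int[] tmalloc(size_t n) @trusted
{
    import core.stdc.stdlib : malloc;

    int[] result;
    // --- would-be @system block: needs manual verification ---
    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null) assert(0, "out of memory");
    result = p[0 .. n]; // pointer slicing: an unsafe primitive
    // --- end would-be @system block ---

    return result; // under the proposal, still compiler-checked as safe
}

Under the proposal, everything outside the marked block would remain
under safe checking; in today's D, the @trusted attribute switches off
checking for the whole body.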
 I also envision that any variable used inside the function is
 internally marked as  safe until it's used in a  trusted block, and
 then is marked as  trusted (again internally, no requirement to mark
 in syntax). If at any time during compilation a  trusted variable is
 used in  safe code, it's a compile error.
 
 There may be situations in which you DO need this to work, and in
 those cases, I would say a cast( safe) could get rid of that mark. For
 example, if you want to return the result of a  trusted call in a
  safe function.
 
 Such a change would be somewhat disruptive, but much more in line with
 what  trusted calls really mean.
 
 I've left some of the details fuzzy on purpose, because I'm not a
 compiler writer :)
[...]

I like the idea of tainting data as system (for lack of a better
term). This increases the compiler's ability to catch mistakes, and
does not require completely turning off safe checks inside trusted
functions.

I think it was a grave mistake for trusted to completely turn off all
safe checks. trusted functions should still be under safe
restrictions, and any unsafe operations must be explicitly marked as
such before being allowed.

T

-- 
I am not young enough to know everything. -- Oscar Wilde
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 1:54 PM, H. S. Teoh via Digitalmars-d wrote:

 However, I don't agree that the entire function can be marked safe.
 Otherwise, safe code will now contain arbitrary trusted blocks inside,
 and so anybody can freely escape safe restrictions just by putting
 objectionable operations inside trusted blocks. The function still
 needs to be marked trusted -- to draw attention to the need for
 scrutiny -- *but* the function body is still confined under safe
 requirements, except that now the "escape hatch" of trusted code
 blocks is permitted as well.
Let's assume trusted means safe code can call it, but it may have
system-like functionality in it (however it happens). Whether it's in
an internal lambda/nested static function or not, the point is, safe
code can call trusted code. To say that safe makes some promises
above/beyond trusted is just incorrect.

Now, if you're saying trusted cannot be called from safe, then I don't
know what your plan is for trusted :) If that's true, please explain.

-Steve
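To make the call-site equivalence concrete, a tiny sketch (my
illustration, not from the thread):

int f() @safe    { return 1; }
int g() @trusted { return 2; }

void caller() @safe
{
    auto a = f(); // OK: @safe calling @safe
    auto b = g(); // equally OK: from the call site, @trusted is
                  // indistinguishable from @safe
}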
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 02:11:32PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 1:54 PM, H. S. Teoh via Digitalmars-d wrote:
 
However, I don't agree that the entire function can be marked safe.
Otherwise, safe code will now contain arbitrary trusted blocks inside,
and so anybody can freely escape safe restrictions just by putting
objectionable operations inside trusted blocks. The function still
needs to be marked trusted -- to draw attention to the need for
scrutiny -- *but* the function body is still confined under safe
requirements, except that now the "escape hatch" of trusted code
blocks is permitted as well.
Let's assume trusted means safe code can call it, but it may have
system-like functionality in it (however it happens). Whether it's in
an internal lambda/nested static function or not, the point is, safe
code can call trusted code. To say that safe makes some promises
above/beyond trusted is just incorrect.
No, safe means the compiler can mechanically verify that it is safe,
under the assumption that any trusted function called from safe code
has been manually verified. trusted means the compiler was not able to
mechanically verify the whole function, but it has been manually
verified to be safe.

If you allow system variables in safe code, then you essentially make
safe the same thing as trusted, which means the compiler cannot verify
*anything*, so it makes the problem worse: now you have to manually
verify *all* safe code instead of just the trusted portions.
 Now, if you're saying trusted cannot be called from safe, then I
 don't know what your plan is for trusted :) If that's true, please
 explain.
[...]

The idea is that while we would like the compiler to mechanically
verify *everything*, in practice there are some things that the
compiler simply cannot verify. Since those remaining things require
human effort to verify and humans are prone to errors, we would like
to limit the scope of those things by confining them inside trusted
functions, which, ideally, would be few in number and limited in
scope. Everything else should be relegated to safe functions, where we
*require* completely automated verification by the compiler.

As it turns out, even within these trusted functions, we humans could
use some help, so we'd like the compiler to verify as much of these
functions as it can for us, and then we can manually check the
remaining bits that cannot be mechanically verified. To this end, your
idea of tainting data is a valuable tool: by limiting system-ness to
explicitly marked variables, we increase the scope of automatic
verification even inside trusted functions, so that the compiler can
help us catch some things that we may have missed when we manually
check the code.

If we allow these system variables inside safe code, then we have
defeated the purpose of having trusted functions in the first place,
because now the scope of functions that require manual inspection
expands to *all* safe functions, which increases the maintainability
problem rather than reducing it. Likewise, in the current
implementation trusted is a wholesale license to perform arbitrary
system operations, which increases the difficulty of manually
verifying trusted functions. (In fact, it's disastrous, since you
cannot guarantee that a code change in some remote function won't
change the trustworthiness of a verified trusted function.) Requiring
potentially-unsafe data to be explicitly marked system allows us to
continue to impose safe restrictions on everything else, thereby
reducing the scope of the remote-code-change problem to overtly marked
places where we can focus our scrutiny.

T

-- 
Doubtless it is a good thing to have an open mind, but a truly open
mind should be open at both ends, like the food-pipe, with the capacity
for excretion as well as absorption. -- Northrop Frye
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:

 The idea is that while we would like the compiler to mechanically verify
 *everything*, in practice there are some things that the compiler simply
 cannot verify. Since those remaining things require human effort to
 verify and humans are prone to errors, we would like to limit the scope
 of those things by confining them inside  trusted functions, which,
 ideally, would be few in number and limited in scope. Everything else
 should be relegated to  safe functions, where we *require* completely
 automated verification by the compiler.
What's the difference between an internal scope and a separate
function scope? That is, a static internal function can simply be a
private module function and have the same effect. I don't see how your
proposal is more safe than mine, or that somehow I can expect a safe
function never to have manually verified code that it uses.

-Steve
Feb 05 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 03:14:18PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:
 
The idea is that while we would like the compiler to mechanically
verify *everything*, in practice there are some things that the
compiler simply cannot verify. Since those remaining things require
human effort to verify and humans are prone to errors, we would like
to limit the scope of those things by confining them inside  trusted
functions, which, ideally, would be few in number and limited in
scope. Everything else should be relegated to  safe functions, where
we *require* completely automated verification by the compiler.
What's the difference between an internal scope and a separate function scope? That is, a static internal function can simply be a private module function and have the same effect. I don't see how your proposal is more safe than mine, or that somehow I can expect a safe function never to have manually verified code that it uses.
[...]

It's as Walter just said: safe means the compiler has mechanically
verified it; trusted means the compiler has *not* verified it, but a
human did (or so we hope). If you like, think of it as
safe-compiler-verified vs. safe-human-verified. By segregating the
two, you limit the scope of code that needs to be reviewed. Of course,
this is really only of interest to the maintainer of the code; to the
user, both sport a safe API and there is no distinction.

In any case, it doesn't look like anything is going to change after
all, so this discussion is just another of those what-could-have-beens
rather than what could be.

T

-- 
Klein bottle for rent ... inquire within. -- Stephen Mulraney
Feb 05 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/5/15 3:23 PM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, Feb 05, 2015 at 03:14:18PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/5/15 2:43 PM, H. S. Teoh via Digitalmars-d wrote:

 The idea is that while we would like the compiler to mechanically
 verify *everything*, in practice there are some things that the
 compiler simply cannot verify. Since those remaining things require
 human effort to verify and humans are prone to errors, we would like
 to limit the scope of those things by confining them inside  trusted
 functions, which, ideally, would be few in number and limited in
 scope. Everything else should be relegated to  safe functions, where
 we *require* completely automated verification by the compiler.
What's the difference between an internal scope and a separate function scope? That is, a static internal function can simply be a private module function and have the same effect. I don't see how your proposal is more safe than mine, or that somehow I can expect a safe function never to have manually verified code that it uses.
[...] It's as Walter just said: safe means the compiler has
mechanically verified it; trusted means the compiler has *not*
verified it, but a human did (or so we hope). If you like, think of it
as safe-compiler-verified vs. safe-human-verified. By segregating the
two, you limit the scope of code that needs to be reviewed. Of course,
this is really only of interest to the maintainer of the code; to the
user, both sport a safe API and there is no distinction.
I'll put out a strawman similar to my example response to Zach:

@trusted int[] tmalloc(size_t x) { ... }
@trusted void tfree(int[] x) { ... }

Now, let's say these are in some module you use, and your code is:

void foo() @safe
{
    auto x = tmalloc(100);
    tfree(x);
    ...
    x[0] = 1;
}

foo is "mechanically verified", but it's not really, because tmalloc
and tfree are not. Now, you may just trust that tfree is fine, or you
may go and verify what tfree does. But in either case, you still have
the problem that tfree(x) and the usage of x may be far away from each
other, and may even be written by different people at different times.
The compiler will still fail you in this regard, because it will not
complain.

Understand that I don't disagree with your proposal, I just think it
can be reduced to mine, and is unnecessarily complicated.

I think the *fundamental* problem with trusted (currently) is that it
assumes all the code it covers was written simultaneously and is not
allowed to morph. This isn't the way code is written; it's massaged
and tweaked over long periods of time by different people.
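The strawman also shows what a trustworthy design has to do: the
unsafe resource must never escape the reviewed code. A sketch of such
an encapsulation (my illustration, not from the thread):

// The whole lifetime of the buffer is confined to one reviewed
// function, so no @safe caller can touch the memory after it is freed.
int sumOfSquares(size_t n) @trusted
{
    import core.stdc.stdlib : malloc, free;

    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null) assert(0, "out of memory");
    scope(exit) free(p); // freed exactly once, on every exit path

    auto buf = p[0 .. n];
    int sum = 0;
    foreach (i, ref e; buf)
    {
        e = cast(int)(i * i);
        sum += e;
    }
    return sum; // only a value escapes, never the memory
}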
 In any case, it doesn't look like anything is going to change after
 all, so this discussion is just another of those what-could-have-beens
 rather than what could be.
Don't give up so easily ;)

-Steve
Feb 05 2015
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 05, 2015 at 03:39:16PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 I think the *fundamental* problem with trusted (currently) is that it
 assumes all the code it covers was written simultaneously and is not
 allowed to morph. This isn't the way code is written; it's massaged
 and tweaked over long periods of time by different people.
Thank you, that's exactly what I've been trying to say, but rather
poorly. This is what makes the current incarnation of trusted
unworkable in real life. Putting it on a function is a stamp of
approval that the code has been verified by a human. Unfortunately,
the element of time has been neglected. It may have been verified back
when it was first committed, but now that 10 other people have stuck
their grubby hands into the code, who knows if the original
verification still applies? Yet the trusted label continues to be a
stamp of approval claiming that the function is still safe. It's like
a car insurance sticker without an expiry date. The insurance company
may have gone bust for all I know, but it sure looks good that my car
is still "insured"!

There needs to be some kind of "change insurance" for trusted. If
somebody makes a careless code change that may break the promise of
trusted, there needs to be a way for the compiler to detect this and
complain loudly. Of course, we can't prevent *malicious* changes,
since there's always another way to work around the compiler, but in
the reasonable cases, at the very least, careless mistakes ought to be
caught and pointed out -- such as a safe helper function used by a
trusted function becoming system because somebody modified the
original implementation.

Requiring some kind of annotation on exactly what parts of a trusted
function rely on unsafe (or rather, safe but unverifiable by the
compiler) operations helps by introducing a barrier for mistakes: the
compiler will reject your code unless you consciously mark it up as
trusted (thereby indicating that you have manually verified the code
-- or maliciously introduced unsafe code, as the case may be).

T

-- 
Computers shouldn't beep through the keyhole.
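A small sketch of the failure mode described here (my illustration;
the names are hypothetical):

// Reviewed once: at review time, readFirst was @safe and bounds-checked.
int readFirst(int[] arr) @safe { return arr[0]; }

int useBuffer() @trusted
{
    int[4] buf = [1, 2, 3, 4];
    return readFirst(buf[]);
}

// Later, someone "optimizes" readFirst:
//
//     int readFirst(int[] arr) @system { return *arr.ptr; } // unchecked
//
// useBuffer still compiles unchanged, because @trusted code may freely
// call @system functions -- the old stamp of approval silently outlives
// the verification it was based on.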
Feb 05 2015