
digitalmars.D - Concept proposal: Safely catching error

reply Olivier FAURE <olivier.faure epitech.eu> writes:
I recently skimmed the "Bad array indexing is considered deadly" 
thread, which discusses the "array OOB throws Error, which throws 
the whole program away" problem.

The gist of the debate is:

- Array OOB is a programming problem; it means an invariant is 
broken, which means the code surrounding it probably makes 
invalid assumptions and shouldn't be trusted.

- Also, it can be caused by memory corruption.

- But then again, anything can be caused by memory corruption, so 
it's kind of an odd thing to worry about. We should worry about 
not causing it, not making memory corrupted programs safe, since 
it's extremely rare and there's not much we can do about it 
anyway.

- But memory corruption is super bad; if an error *might* 
be caused by memory corruption, then we must absolutely throw the 
potentially corrupted data away without using it.

- Besides, even without memory corruption, the same argument 
applies to broken invariants; if we have data that breaks 
invariants, we need to throw it away, and use it as little as 
possible.

- But sometimes we have very big applications with lots of data 
and lots of code. If my server deals with dozens of clients or 
more, I don't want to brutally disconnect them all because I need 
to throw away one user's data.

- This could be achieved with processes. Then again, using 
processes often isn't practical for performance or architecture 
reasons.

My proposal for solving these problems would be to explicitly 
allow catching Errors in @safe code IF the try block from which 
the Error is caught is perfectly pure.

In other words, @safe functions would be allowed to catch Error 
after try blocks if the block only mutates data declared inside 
of it; the code would look like:

     import vibe.d;

     // ...

     string handleRequestOrError(in HTTPServerRequest req) @safe {
         ServerData myData = createData();

         try {
             // both doSomethingWithData and mutateMyData are  pure

             doSomethingWithData(req, myData);
             mutateMyData(myData);

             return myData.toString;
         }
         catch (Error) {
             throw new SomeException("Oh no, a system error occurred");
         }
     }

     void handleRequest(HTTPServerRequest req,
                        HTTPServerResponse res) @safe
     {
         try {
             res.writeBody(handleRequestOrError(req), 
"text/plain");
         }
         catch (SomeException) {
             // Handle exception
         }
     }

The point is, this is safe even when doSomethingWithData breaks 
an invariant or mutateMyData corrupts myData, because the 
compiler guarantees that the only data affected WILL be thrown 
away or otherwise inaccessible by the time catch(Error) is 
reached.

This would make it possible to design applications that can fail 
gracefully when dealing with multiple independent clients or 
tasks, even when one of the tasks has to be thrown away because 
of a programmer error.

What do you think? Does the idea have merit? Should I make it 
into a DIP?
Jun 05 2017
next sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
Olivier FAURE wrote:

 What do you think? Does the idea have merit? Should I make it into a DIP?
tbh, i think that it adds Yet Another Exception Rule to the language, and this does no good in the long run. "oh, you generally cannot do that, except if today is Friday, it is rainy, and you've seen pink unicorn at the morning." the more exceptions to general rules language has, the more it reminds Dragon Poker game from Robert Asprin books.

any exception will usually have a strong rationale behind it, of course, so there will be little reason to not accept it, especially if we had accepted some exceptions before. i think it is better to not follow that path, even if this one idea looks nice.
Jun 05 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 10:09:30 UTC, ketmar wrote:
 tbh, i think that it adds Yet Another Exception Rule to the 
 language, and this does no good in the long run. "oh, you 
 generally cannot do that, except if today is Friday, it is 
 rainy, and you've seen pink unicorn at the morning." the more 
 exceptions to general rules language has, the more it reminds 
 Dragon Poker game from Robert Asprin books.
Fair enough. A few counterpoints:

- This one special case is pretty self-contained. It doesn't impact code 
that doesn't use it, and the users most likely to hear about it are the 
ones who need to recover from Errors in their code.

- It doesn't introduce elaborate under-the-hood tricks (unlike DIP 1008*). 
It uses already-existing concepts (@safe and pure), and is in fact closer 
to the intuitive logic behind Error recovery than the current model; 
instead of "You can't recover from Errors" you have "You can't recover 
from Errors unless you flush all data that might have been affected by it".

*Note that I am not making a statement for or against those DIPs. I'm 
only using them as examples to compare my proposal against.

So while this would add feature creep to the language, I'd argue that 
feature creep would be pretty minor and well-contained, and would probably 
be worth it for the problem it would solve.
Jun 05 2017
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
Olivier FAURE wrote:

 On Monday, 5 June 2017 at 10:09:30 UTC, ketmar wrote:
 tbh, i think that it adds Yet Another Exception Rule to the language, 
 and this does no good in the long run. "oh, you generally cannot do 
 that, except if today is Friday, it is rainy, and you've seen pink 
 unicorn at the morning." the more exceptions to general rules language 
 has, the more it reminds Dragon Poker game from Robert Asprin books.
 Fair enough. A few counterpoints:

 - This one special case is pretty self-contained. It doesn't impact code 
 that doesn't use it, and the users most likely to hear about it are the 
 ones who need to recover from Errors in their code.

 - It doesn't introduce elaborate under-the-hood tricks (unlike DIP 1008*). 
 It uses already-existing concepts (@safe and pure), and is in fact closer 
 to the intuitive logic behind Error recovery than the current model; 
 instead of "You can't recover from Errors" you have "You can't recover 
 from Errors unless you flush all data that might have been affected by it".

 *Note that I am not making a statement for or against those DIPs. I'm 
 only using them as examples to compare my proposal against.

 So while this would add feature creep to the language, I'd argue that 
 feature creep would be pretty minor and well-contained, and would probably 
 be worth it for the problem it would solve.
this still nullifies the sense of Error/Exception differences. not all errors are recoverable, even in @safe code. assuming that it is safe to catch any Error in @safe code immediately turns it unsafe. so... we will need to introduce a RecoverableInSafeCodeError class, and change the runtime to throw it instead of Error (sometimes). and even more issues follow (it's an avalanche of changes, and possible code breakage too).

so, in the original form your idea turns @safe code into unsafe, and with more changes it becomes a real pain to implement, and adds more complexity to the language (another Dragon Poker modifier).

using wrappers and carefully checking preconditions looks better to me. after all, if the programmer failed to check some preconditions, the worst thing to do is trying to hide that by masking errors. bombing out is *way* better, i believe, 'cause it forces the programmer to really fix the bugs instead of creating hackish workarounds.
Jun 05 2017
parent Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 13:13:01 UTC, ketmar wrote:
 this still nullifies the sense of Error/Exception differences. 
 not all errors are recoverable, even in @safe code.

 ...

 using wrappers and carefully checking preconditions looks 
 better to me. after all, if the programmer failed to check some 
 preconditions, the worst thing to do is trying to hide that by 
 masking errors. bombing out is *way* better, i believe, 'cause 
 it forces the programmer to really fix the bugs instead of 
 creating hackish workarounds.
I don't think this is a workaround, or that it goes against the purpose of Errors. The goal would still be to bomb out, cancel whatever you were doing, print a big red error message to the coder / user, and exit.

A program that catches an Error would not try to use the data that broke a contract; in fact, the program would not have access to the invalid data, since it would be thrown away. Its natural progression would be to log the error, and quit whatever it was doing.

The point is, if the program needs to free system resources before shutting down, it could do so; or if the program is a server or a multi-threaded app dealing with multiple clients at the same time, those clients would not be affected by a crash unrelated to their data.
Jun 07 2017
prev sibling next sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Monday, 5 June 2017 at 09:50:15 UTC, Olivier FAURE wrote:
 My proposal for solving these problems would be to explicitly 
 allow catching Errors in @safe code IF the try block from which 
 the Error is caught is perfectly pure.

 This would make it possible to design applications that can fail 
 gracefully when dealing with multiple independent clients or 
 tasks, even when one of the tasks has to be thrown away because 
 of a programmer error.

 What do you think? Does the idea have merit? Should I make it 
 into a DIP?
Pragmatic question: How much work do you think this will require? Because writing a generic wrapper that you can customize the fault behaviour for using DbI requires very little[1].

[1] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/linear/array/dynamic.d
Jun 05 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 10:59:28 UTC, Moritz Maxeiner wrote:
 Pragmatic question: How much work do you think this will 
 require?
Good question. I'm no compiler programmer, so I'm not sure what the answer is. I would say "probably a few days at most". The change is fairly self-contained, and built around existing concepts (mutability and safety); I think it would mostly be a matter of adding a function to the safety checks that tests whether a mutable reference to non-local data is used in any try block with catch(Error).

Another problem is that non-GC memory allocated in the try block would be irreversibly leaked when an Error is thrown (though now that I think about it, that would probably count as impure and be impossible anyway). Either way, it's not a safety risk and the programmer can decide whether leaking memory is worse than brutally shutting down for their purpose.
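To make the intended rule concrete, here is a rough sketch of what such a check might accept and reject. This is purely hypothetical: the rule is not implemented anywhere, and the function and variable names below are made up for illustration.

    // Hypothetical illustration of the proposed rule, not current D
    // semantics: catching Error in @safe code would be allowed only
    // when the try block mutates nothing declared outside of it.
    int globalCounter;

    void proposedRule() @safe
    {
        try {
            int[] local = [1, 2, 3];   // declared inside the try block
            local[5] = 0;              // out of bounds: throws RangeError
        } catch (Error e) {
            // would be ALLOWED: only block-local data could have been
            // corrupted, and it is unreachable once we get here
        }

        int[] outer = [1, 2, 3];
        try {
            outer[5] = 0;              // mutates data that outlives the block
            globalCounter += 1;
        } catch (Error e) {
            // would be REJECTED: `outer` and `globalCounter` survive the
            // block and might be left in a broken state
        }
    }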
 Because writing a generic wrapper that you can customize the 
 fault behaviour for using DbI requires very little.
Using an array wrapper only covers part of the problem. Users may want their server to keep going even if they fail an assertion, or want the performance of nothrow code, or use a library that throws RangeError in very rare and hard to pinpoint cases. Arrays aside, I think there's some use in being able to safely recover from (or safely shut down after) the kind of broken contracts that throw Errors.
Jun 05 2017
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Monday, 5 June 2017 at 12:01:35 UTC, Olivier FAURE wrote:
 On Monday, 5 June 2017 at 10:59:28 UTC, Moritz Maxeiner wrote:
 Pragmatic question: How much work do you think this will 
 require?
Another problem is that non-gc memory allocated in the try block would be irreversibly leaked when an Error is thrown (though now that I think about it, that would probably count as impure and be impossible anyway).
D considers allocating memory as pure[1].
 Either way, it's not a safety risk and the programmer can 
 decide whether leaking memory is worse than brutally shutting 
 down for their purpose.
Sure, but with regards to long running processes that are supposed to handle tens of thousands of requests, leaking memory (and continuing to run) will likely eventually end up brutally shutting down the process on out of memory errors. But yes, that is something that would have to be evaluated on a case by case basis.
 Because writing a generic wrapper that you can customize the 
 fault behaviour for using DbI requires very little.
Using an array wrapper only covers part of the problem.
It *replaces* the hard-coded assert Errors with flexible attests that can throw whatever you want (or even kill the process immediately); you just have to disable the runtime's internal bounds checks via `-boundscheck=off`.
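For illustration, a minimal sketch of the general shape of such a wrapper is below. This is an assumption about what it could look like, not the API of the linked libds library; IndexException and CheckedArray are made-up names.

    // Hypothetical sketch of a bounds-checked wrapper; not the libds API.
    class IndexException : Exception
    {
        this(string msg) @safe pure nothrow { super(msg); }
    }

    struct CheckedArray(T)
    {
        private T[] data;

        this(T[] data) @safe pure nothrow { this.data = data; }

        ref T opIndex(size_t i) @safe pure
        {
            // report the violation as a recoverable Exception
            // instead of a RangeError
            if (i >= data.length)
                throw new IndexException("index out of bounds");
            return data[i];
        }

        size_t length() const @safe pure nothrow { return data.length; }
    }

    unittest
    {
        auto a = CheckedArray!int([1, 2, 3]);
        try
            a[10] = 0;
        catch (IndexException e)
        {
            // recover: the caller decides what an out-of-bounds access
            // means here, and the process keeps going
        }
    }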
 Users may want their server to keep going even if they fail an 
 assertion
Normal assertions (other than assert(false)) are not present in -release mode; they are purely for debug mode.
 or want the performance of  nothrow code
That's easily doable with the attest approach.
 or use a library that throws RangeError in very rare and hard 
 to pinpoint cases.
Fix the library (or get it fixed if you don't have the code).
 Arrays aside, I think there's some use in being able to safely 
 recover from (or safely shut down after) the kind of broken 
 contracts that throw Errors.
I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.

[1] https://dlang.org/spec/function.html#pure-functions
Jun 05 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 12:59:11 UTC, Moritz Maxeiner wrote:
 On Monday, 5 June 2017 at 12:01:35 UTC, Olivier FAURE wrote:
 Another problem is that non-gc memory allocated in the try 
 block would be irreversibly leaked when an Error is thrown 
 (though now that I think about it, that would probably count 
 as impure and be impossible anyway).
D considers allocating memory as pure[1]. ... Sure, but with regards to long running processes that are supposed to handle tens of thousands of requests, leaking memory (and continuing to run) will likely eventually end up brutally shutting down the process on out of memory errors. But yes, that is something that would have to be evaluated on a case by case basis.
Note that in the case you describe, the alternative is either "Brutally shut down right now", or "Throw away some data, potentially some memory as well, and maybe brutally shut down later if that happens too often". (Although in the second case, there is also the trade-off that the leaking program "steals" memory from the other routines running on the same computer.)

Anyway, I don't think this would happen. Most forms of memory allocations are impure, and wouldn't be allowed in a try {} catch(Error) block; C's malloc() is pure, but C's free() isn't, so the thrown Error wouldn't be skipping over any calls to free(). Memory allocated by the GC would be reclaimed once the Error is caught and the data thrown away.
 Arrays aside, I think there's some use in being able to safely 
 recover from (or safely shut down after) the kind of broken 
 contracts that throw Errors.
I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.
I half-agree. There *should not* be a way to say "Okay, the contract is broken, but let's keep going anyway". There *should* be a way to say "okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going".

The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.
Jun 07 2017
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Wednesday, 7 June 2017 at 15:35:56 UTC, Olivier FAURE wrote:
 On Monday, 5 June 2017 at 12:59:11 UTC, Moritz Maxeiner wrote:

 Anyway, I don't think this would happen. Most forms of memory 
 allocations are impure,
That's not how pure is currently defined in D; see the referenced spec: allocating memory is considered pure (even if it is impure under the theoretical definition of purity). This is something that would need to be changed in the spec.
 I consider there to be value in allowing users to say "this is 
 not a contract, it is a valid use case" (-> wrapper), but a 
 broken contract being recoverable violates the entire concept 
 of DbC.
There *should* be a way to say "okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going". The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.
The problem is that in current operating systems, the finest scope/context of computation you can (safely) kill / compartmentalize the damage in, in order to allow the rest of the system to proceed, is a process (-> process isolation). Anything finer than that (threads, fibers, etc.) may or may not work in a particular use case, but you can't guarantee/prove that it works in the majority of use cases (which is what the runtime would have to be able to do if we were to allow that behaviour as the default).

Compartmentalizing like this is your job as the programmer imho, not the job of the runtime.
Jun 07 2017
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 06/05/2017 11:50 AM, Olivier FAURE wrote:
 - But memory corruption is super bad; if an error *might* be 
 caused by memory corruption, then we must absolutely throw the 
 potentially corrupted data away without using it.
 
 - Besides, even without memory corruption, the same argument applies to 
 broken invariants; if we have data that breaks invariants, we need to 
 throw it away, and use it as little as possible.
 
[...]
 
 My proposal for solving these problems would be to explicitly allow 
 catching Errors in @safe code IF the try block from which the Error is 
 caught is perfectly pure.
 
 In other words, @safe functions would be allowed to catch Error after 
 try blocks if the block only mutates data declared inside of it; the 
 code would look like:
 
      import vibe.d;
 
      // ...
 
      string handleRequestOrError(in HTTPServerRequest req) @safe {
          ServerData myData = createData();
 
          try {
              // both doSomethingWithData and mutateMyData are  pure
 
              doSomethingWithData(req, myData);
              mutateMyData(myData);
 
              return myData.toString;
          }
          catch (Error) {
              throw new SomeException("Oh no, a system error occurred");
          }
      }
 
      void handleRequest(HTTPServerRequest req,
                         HTTPServerResponse res) @safe
      {
          try {
              res.writeBody(handleRequestOrError(req), "text/plain");
          }
          catch (SomeException) {
              // Handle exception
          }
      }
 
 The point is, this is safe even when doSomethingWithData breaks an 
 invariant or mutateMyData corrupts myData, because the compiler 
 guarantees that the only data affected WILL be thrown away or otherwise 
 inaccessible by the time catch(Error) is reached.
But `myData` is still alive when `catch (Error)` is reached, isn't it? [...]
 
 What do you think? Does the idea have merit? Should I make it into a DIP?
How does `@trusted` fit into this? The premise is that there's a bug somewhere. You can't assume that the bug is in a `@system` function. It can just as well be in a `@trusted` one. And then `@safe` and `pure` mean nothing.
Jun 05 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 12:51:16 UTC, ag0aep6g wrote:
 On 06/05/2017 11:50 AM, Olivier FAURE wrote:
 In other words, @safe functions would be allowed to catch 
 Error after try blocks if the block only mutates data declared 
 inside of it; the code would look like:
 
      import vibe.d;
 
      // ...
 
      string handleRequestOrError(in HTTPServerRequest req) 
  @safe {
          ServerData myData = createData();
 
          try {
             ...
          }
          catch (Error) {
              throw new SomeException("Oh no, a system error 
 occured");
          }
      }

      ...
But `myData` is still alive when `catch (Error)` is reached, isn't it?
Good catch; yes, this example would refuse to compile; myData needs to be declared in the try block.
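For reference, the corrected example would look like this (same placeholder vibe.d handler and helpers as in the original post):

     string handleRequestOrError(in HTTPServerRequest req) @safe {
         try {
             // myData now lives only inside the try block, so nothing
             // it refers to survives into the catch clause
             ServerData myData = createData();

             doSomethingWithData(req, myData);
             mutateMyData(myData);

             return myData.toString;
         }
         catch (Error) {
             throw new SomeException("Oh no, a system error occurred");
         }
     }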
 How does `@trusted` fit into this? The premise is that there's 
 a bug somewhere. You can't assume that the bug is in a 
 `@system` function. It can just as well be in a `@trusted` one. 
 And then `@safe` and `pure` mean nothing.
The point of this proposal is that catching Errors should be considered safe under certain conditions; code that catches Errors properly would be considered as safe as any other code, which is, "as safe as the @trusted code it calls".

I think the issue of @trusted is tangential to this. If you (or the writer of a library you use) are using @trusted to cast away pureness and then have side effects, you're already risking data corruption and undefined behavior, catching Errors or no catching Errors.
Jun 07 2017
parent reply ag0aep6g <anonymous example.com> writes:
On 06/07/2017 05:19 PM, Olivier FAURE wrote:
 How does `@trusted` fit into this? The premise is that there's a bug 
 somewhere. You can't assume that the bug is in a `@system` function. 
 It can just as well be in a `@trusted` one. And then `@safe` and 
 `pure` mean nothing.
I think I mistyped there. Makes more sense this way: "You can't assume that the bug is in a **`@safe`** function. It can just as well be in a `@trusted` one."
 The point of this proposal is that catching Errors should be considered 
 safe under certain conditions; code that catches Errors properly would be 
 considered as safe as any other code, which is, "as safe as the @trusted 
 code it calls".
When no @trusted code is involved, then catching an out-of-bounds error from a @safe function is safe. No additional rules are needed. Assuming no compiler bugs, a @safe function simply cannot corrupt memory without calling @trusted code.

You gave the argument against catching out-of-bounds errors as: "it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted."

That line of reasoning applies to @trusted code. Only @trusted code can lose its trustworthiness. @safe code is guaranteed trustworthy (except for calls to @trusted code).

So the argument against catching out-of-bounds errors is that there might be misbehaving @trusted code. And for misbehaving @trusted code you can't tell the reach of the potential corruption by looking at the function signature.
 I think the issue of @trusted is tangential to this. If you (or the 
 writer of a library you use) are using @trusted to cast away pureness 
 and then have side effects, you're already risking data corruption and 
 undefined behavior, catching Errors or no catching Errors.
It's not about intentional misuse of the @trusted attribute. @trusted functions must be safe. The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.
Jun 07 2017
next sibling parent ag0aep6g <anonymous example.com> writes:
On 06/07/2017 09:45 PM, ag0aep6g wrote:
 When no @trusted code is involved, then catching an out-of-bounds error 
 from a @safe function is safe. No additional rules are needed. Assuming 
 no compiler bugs, a @safe function simply cannot corrupt memory without 
 calling @trusted code.
Thinking a bit more about this, I'm not sure if it's entirely correct. Can a @safe language feature throw an Error *after* corrupting memory? For example, could `a[i] = n;` write the value first and do the bounds check afterwards? There's probably a better example, if this kind of "shoot first, ask questions later" style ever makes sense.

If bounds checking could be implemented like that, you wouldn't be able to ever catch the resulting error safely. It wouldn't matter if it comes from @safe or @trusted code. Purity wouldn't matter either, because an arbitrary write like that doesn't care about purity.
Jun 07 2017
prev sibling parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Wednesday, 7 June 2017 at 19:45:05 UTC, ag0aep6g wrote:
 You gave the argument against catching out-of-bounds errors as: 
 "it means an invariant is broken, which means the code 
 surrounding it probably makes invalid assumptions and shouldn't 
 be trusted."

 That line of reasoning applies to @trusted code. Only @trusted 
 code can lose its trustworthiness. @safe code is guaranteed 
 trustworthy (except for calls to @trusted code).
To clarify, when I said "shouldn't be trusted", I meant in the general sense, not in the memory safety sense. I think Jonathan M Davis put it nicely:

On Wednesday, 31 May 2017 at 23:51:30 UTC, Jonathan M Davis wrote:
 Honestly, once a memory corruption has occurred, all bets are 
 off anyway. The core thing here is that the contract of 
 indexing arrays was violated, which is a bug. If we're going to 
 argue about whether it makes sense to change that contract, 
 then we have to discuss the consequences of doing so, and I 
 really don't see why whether a memory corruption has occurred 
 previously is relevant. [...] In either case, the runtime has 
 no way of determining the reason for the failure, and I don't 
 see why passing a bad value to index an array is any more 
 indicative of a memory corruption than passing an invalid day 
 of the month to std.datetime's Date when constructing it is 
 indicative of a memory corruption.
The sane way to protect against memory corruption is to write safe code, not code that *might* shut down brutally once memory corruption has already occurred. This is done by using @safe and proofreading all @trusted functions in your libs.

Contracts are made to preempt memory corruption, and to protect against *programming* errors; they're not recoverable because breaking a contract means that from now on the program is in a state that wasn't anticipated by the programmer.

Which means the only way to handle them gracefully is to cancel what you were doing and go back to the pre-contract-breaking state, then produce a big, detailed error message and then exit / remove the thread / etc.
 I think the issue of @trusted is tangential to this. If you 
 (or the writer of a library you use) are using @trusted to 
 cast away pureness and then have side effects, you're already 
 risking data corruption and undefined behavior, catching 
 Errors or no catching Errors.
 The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.
I don't think there is much overlap between the problems that can be caused by faulty @trusted code and the problems that can be caught by Errors.

Note that this is not a philosophical problem. I'm making an empirical claim: "Catching Errors would not open programs to memory safety attacks or accidental memory safety blunders that would not otherwise happen". For instance, if some poorly-written @trusted function causes the size of an int[10] slice to be registered as 20, then your program becomes vulnerable to buffer overflows when you iterate over it; the buffer overflow will not throw any Error.

I'm not sure what the official stance is on this. As far as I'm aware, contracts and OOB checks are supposed to prevent memory corruption, not detect it. Any security based on detecting potential memory corruption can ultimately be bypassed by a hacker.
Jun 08 2017
parent reply ag0aep6g <anonymous example.com> writes:
On 06/08/2017 11:27 AM, Olivier FAURE wrote:
 Contracts are made to preempt memory corruption, and to protect against 
 *programming* errors; they're not recoverable because breaking a 
 contract means that from now on the program is in a state that wasn't 
 anticipated by the programmer.
 
 Which means the only way to handle them gracefully is to cancel what you 
 were doing and go back to the pre-contract-breaking state, then produce 
 a big, detailed error message and then exit / remove the thread / etc.
I might get the idea now. The throwing code could be in the middle of some unsafe operation when it throws the out-of-bounds error. It would have cleaned up after itself, but it can't because of the (unexpected) error. Silly example:

----
void f(ref int* p) @trusted
{
    p = cast(int*) 13; /* corrupt stuff or partially initialize or whatever */
    int[] a;
    auto x = a[0]; /* trigger an out-of-bounds error */
    p = new int; /* would have cleaned up */
}
----

Catching the resulting error is @safe when you throw the int* away. So if f is `pure` and you make sure that the arguments don't survive the `try` block, you're good, because f supposedly cannot have reached anything else. This is your proposal, right?

I don't think that's sound. At least, it clashes with another relatively recent development:

https://dlang.org/phobos/core_memory.html#.pureMalloc

That's a wrapper around C's malloc. C's malloc might set the global errno, so it's impure. pureMalloc achieves purity by resetting errno to the value it had before the call.

So a `pure` function may mess with global state, as long as it cleans it up. But when it's interrupted (e.g. by an out-of-bounds error), it may leave globals in an invalid state. So you can't assume that a `pure` function upholds its purity when it throws an error.

In the end, an error indicates that something is wrong, and probably all guarantees may be compromised.
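For illustration, the errno trick described above boils down to something like the following sketch. This is a simplified illustration of the idea, not druntime's actual pureMalloc code (the real pureMalloc is additionally marked `pure`, which this sketch omits), and sketchPureMalloc is a made-up name.

    import core.stdc.errno : errno;
    import core.stdc.stdlib : malloc;

    // Simplified sketch: save the global errno and put it back afterwards,
    // so the global side effect is unobservable to the caller.
    void* sketchPureMalloc(size_t size) @system
    {
        immutable savedErrno = errno;   // remember the global state
        void* p = malloc(size);
        errno = savedErrno;             // reset it to its pre-call value
        return p;
    }

But if an error interrupts the function between the call and the reset, the cleanup never happens, which is exactly the point made above.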
Jun 08 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Thursday, 8 June 2017 at 13:02:38 UTC, ag0aep6g wrote:
 Catching the resulting error is @safe when you throw the int* 
 away. So if f is `pure` and you make sure that the arguments 
 don't survive the `try` block, you're good, because f 
 supposedly cannot have reached anything else. This is your 
 proposal, right?
Right.
 I don't think that's sound. At least, it clashes with another 
 relatively recent development:

 https://dlang.org/phobos/core_memory.html#.pureMalloc

 That's a wrapper around C's malloc. C's malloc might set the 
 global errno, so it's impure. pureMalloc achieves purity by 
 resetting errno to the value it had before the call.

 So a `pure` function may mess with global state, as long as it 
 cleans it up. But when it's interrupted (e.g. by an 
 out-of-bounds error), it may leave globals in an invalid state. 
 So you can't assume that a `pure` function upholds its purity 
 when it throws an error.
That's true. A "pure after cleanup" function is incompatible with catching Errors (unless we introduce a "scope(error)" keyword that also runs on errors, but that comes with other problems). Is pureMalloc supposed to be representative of pure functions, or more of a special case? That's not a rhetorical question, I genuinely don't know. The spec says a pure function "does not read or write any global or static mutable state", which seems incompatible with "save a global, then write it back like it was". In fact, doing so seems contrary to the assumption that you can run any two pure functions on immutable / independent data at the same time and you won't have race conditions. Actually, now I'm wondering whether pureMalloc & co handle potential race conditions at all, or just hope they don't happen.
Jun 08 2017
parent ag0aep6g <anonymous example.com> writes:
On 06/08/2017 04:02 PM, Olivier FAURE wrote:
 That's true. A "pure after cleanup" function is incompatible with 
 catching Errors (unless we introduce a "scope(error)" keyword that also 
 runs on errors, but that comes with other problems).
 
 Is pureMalloc supposed to be representative of pure functions, or more 
 of a special case? That's not a rhetorical question, I genuinely don't 
 know.
I think it's supposed to be just as pure as any other pure function. Here's the pull request that added it:

https://github.com/dlang/druntime/pull/1746

I don't see anything about it being special-cased in the compiler or such.
 The spec says a pure function "does not read or write any global or 
 static mutable state", which seems incompatible with "save a global, 
 then write it back like it was".
True. Something similar is going on with @safe. There's a list of things that are "not allowed in @safe functions" [1], but you can do all those things in @trusted code, of course. The list is about what the compiler rejects, not about what a @safe function can actually do. It might be the same with the things that pure functions can/cannot do.

I suppose the idea is that it cannot be observed that pureMalloc messes with global state, so it's ok. The assumption being that you don't catch errors.

By the way, with regards to purity and errors, `new` is the same as pureMalloc. When `new` throws an OutOfMemoryError and you catch it, you can see that errno has been set. Yet `new` is considered `pure`.
 In fact, doing so seems contrary to the 
 assumption that you can run any two pure functions on immutable / 
 independent data at the same time and you won't have race conditions.
 
 Actually, now I'm wondering whether pureMalloc & co handle potential 
 race conditions at all, or just hope they don't happen.
Apparently errno is thread-local.

[1] https://dlang.org/spec/function.html#safe-functions
Jun 08 2017
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/5/17 5:50 AM, Olivier FAURE wrote:
 I recently skimmed the "Bad array indexing is considered deadly" thread,
 which discusses the "array OOB throws Error, which throws the whole
 program away" problem.
[snip]
 My proposal for solving these problems would be to explicitly allow
 catching Errors in @safe code IF the try block from which the Error is
 caught is perfectly pure.
I don't think this will work. Only throwing Error makes a function nothrow. A nothrow function may not properly clean up the stack while unwinding. Not because the stack unwinding code skips over it, but because the compiler knows nothing can throw, and so doesn't include the cleanup code. So this means, regardless of whether you catch an Error or not, the program may be in a state that is not recoverable.

Not to mention that only doing this for pure code eliminates usages that sparked the original discussion, as my code communicates with a database, and that wouldn't be allowed in pure code.

The only possible language change I can think of here is to have a third kind of Throwable type. Call it SafeError. A SafeError would be catchable only in @system or @trusted code. This means that @safe code would have to terminate, but any wrapping code that is calling the @safe code (such as the vibe.d framework) could catch it and properly handle the error, knowing that everything was properly cleaned up, and knowing that because we are in @safe code, there hasn't been a memory corruption (right?). Throwing a SafeError prevents a function from being marked nothrow. I can't see a way around this, unless we came up with another attribute (shudder). Then we could change the compiler (runtime?) to throw SafeRangeError instead of RangeError inside @safe code.

All of this, I'm not proposing to do, because I don't see it being accepted. Creating a new array type which is used in my code will work, and avoids all the hassle of navigating the DIP system.

-Steve
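For concreteness, a rough sketch of how that third kind of Throwable might look. SafeError and SafeRangeError are the hypothetical names from the post above (nothing like them exists in druntime), and frameworkEntryPoint is made up.

    // Hypothetical sketch of the SafeError idea; none of these types exist.
    class SafeError : Throwable
    {
        this(string msg) @safe pure nothrow { super(msg); }
    }

    class SafeRangeError : SafeError
    {
        this(string msg = "range violation in @safe code") @safe pure nothrow
        {
            super(msg);
        }
    }

    // Framework-level code (what something like vibe.d could do). It is not
    // @safe, so it would be allowed to catch the SafeError; per the idea
    // above, the @safe handler below it has unwound and cleaned up by then.
    void frameworkEntryPoint(void delegate() @safe handler) @system
    {
        try
            handler();
        catch (SafeRangeError e)
        {
            // log the failure and keep serving the other clients
        }
    }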
Jun 05 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer 
wrote:
 I don't think this will work. Only throwing Error makes a 
 function nothrow. A nothrow function may not properly clean up 
 the stack while unwinding. Not because the stack unwinding code 
 skips over it, but because the compiler knows nothing can 
 throw, and so doesn't include the cleanup code.
If the function is pure, then the only things it can set up will be stored on local or GC data, and it won't matter if they're not properly cleaned up, since they won't be accessible anymore.

I'm not 100% sure about that, though. Can a pure function do impure things in its scope(exit) / destructor code?
 Not to mention that only doing this for pure code eliminates 
 usages that sparked the original discussion, as my code 
 communicates with a database, and that wouldn't be allowed in 
 pure code.
It would work for sending to a database; but you would need to use the functional programming idiom of "do 99% of the work in pure functions, then send the data to the remaining 1% for impure tasks". A process's structure would be:

- Read the inputs from the socket (impure, no catching errors)
- Parse them and transform them into database requests (pure)
- Send the requests to the database (impure)
- Parse / analyse / whatever the results (pure)
- Send the results to the socket (impure)

And okay, yeah, that list isn't realistic. Using functional programming idioms in real life programs can be a pain in the ass, and lead to convoluted callback-based scaffolding and weird data structures that you need to pass around a bunch of functions that don't really need them.

The point is, you could isolate the pure data-manipulating parts of the program from the impure IO parts; and encapsulate the former in Error-catching blocks (which is convenient, since those parts are likely to be more convoluted and harder to foolproof than the IO parts, therefore likely to throw more Errors).

Then if an Error occurs, you can close the connection to the client (maybe send them an error packet beforehand), close the database file descriptor, log an error message, etc.
Jun 07 2017
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/7/17 12:20 PM, Olivier FAURE wrote:
 On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer wrote:
 I don't think this will work. Only throwing Error makes a function
 nothrow. A nothrow function may not properly clean up the stack while
 unwinding. Not because the stack unwinding code skips over it, but
 because the compiler knows nothing can throw, and so doesn't include
 the cleanup code.
If the function is pure, then the only things it can set up will be stored on local or GC data, and it won't matter if they're not properly cleaned up, since they won't be accessible anymore.
Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.
 I'm not 100% sure about that, though. Can a pure function do impure
 things in its scope(exit) / destructor code?
Even if it does pure things, that can cause problems.
 Not to mention that only doing this for pure code eliminates usages
 that sparked the original discussion, as my code communicates with a
 database, and that wouldn't be allowed in pure code.
It would work for sending to a database; but you would need to use the functional programming idiom of "do 99% of the work in pure functions, then send the data to the remaining 1% for impure tasks".
Even this still pushes the handling of the error onto the user. I want vibe.d to handle the error, in case I create a bug. But vibe.d can't possibly know what database things I'm going to do. And really this isn't possible. 99% of the work is using the database.
 A process's structure would be:
 - Read the inputs from the socket (impure, no catching errors)
 - Parse them and transform them into database requests (pure)
 - Send the requests to the database (impure)
 - Parse / analyse / whatever the results (pure)
 - Send the results to the socket (impure)

 And okay, yeah, that list isn't realistic. Using functional programming
 idioms in real life programs can be a pain in the ass, and lead to
 convoluted callback-based scaffolding and weird data structures that you
 need to pass around a bunch of functions that don't really need them.

 The point is, you could isolate the pure data-manipulating parts of the
 program from the impure IO parts; and encapsulate the former in
 Error-catching blocks (which is convenient, since those parts are likely
 to be more convoluted and harder to foolproof than the IO parts,
 therefore likely to throw more Errors).
Aside from the point that this still doesn't solve the problem (pure functions do cleanup too), this means a lot of headache for people who just want to write code. I'd much rather just write an array type and be done. -Steve
Jun 08 2017
parent reply Olivier FAURE <olivier.faure epitech.eu> writes:
On Thursday, 8 June 2017 at 12:20:19 UTC, Steven Schveighoffer 
wrote:
 Hm... if you locked an object that was passed in on the stack, 
 for instance, there is no guarantee the object gets unlocked.
This wouldn't be allowed unless the object was duplicated / created inside the try block.
 Aside from the point that this still doesn't solve the problem 
 (pure functions do cleanup too), this means a lot of headache 
 for people who just want to write code. I'd much rather just 
 write an array type and be done.

 -Steve
Fair enough. There are other advantages to writing with "create data with pure functions then process it" idioms (easier to do unit tests, better for parallelism, etc), though.
Jun 08 2017
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/8/17 9:42 AM, Olivier FAURE wrote:
 On Thursday, 8 June 2017 at 12:20:19 UTC, Steven Schveighoffer wrote:
 Hm... if you locked an object that was passed in on the stack, for
 instance, there is no guarantee the object gets unlocked.
This wouldn't be allowed unless the object was duplicated / created inside the try block.
void foo(Mutex m, Data d) pure
{
    synchronized(m)
    {
        // ... manipulate d
    } // no guarantee m gets unlocked
}

-Steve
Jun 08 2017
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Thursday, 8 June 2017 at 14:13:53 UTC, Steven Schveighoffer 
wrote:

 void foo(Mutex m, Data d) pure
 {
    synchronized(m)
    {
    	// ... manipulate d
    } // no guarantee m gets unlocked
 }

 -Steve
Isn't synchronized(m) not nothrow?
Jun 08 2017
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/8/17 11:19 AM, Stanislav Blinov wrote:
 On Thursday, 8 June 2017 at 14:13:53 UTC, Steven Schveighoffer wrote:

 void foo(Mutex m, Data d) pure
 {
    synchronized(m)
    {
        // ... manipulate d
    } // no guarantee m gets unlocked
 }
Isn't synchronized(m) not nothrow?
You're right, it isn't. I actually didn't know that. Also forgot to make my function nothrow. Fixed:

void foo(Mutex m, Data d) pure nothrow
{
    try
    {
        synchronized(m)
        {
            // .. manipulate d
        }
    }
    catch(Exception) {}
}

-Steve
Jun 08 2017
prev sibling parent Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
I want to start by stating that the discussion around being able 
to throw Error from nothrow functions and the compiler 
optimizations that follow is important to the thoughts below.

The other aspect of array bounds checking is that those 
particular checks will not be added in -release. There has been 
much discussion around this already and I do recall that the 
solution was that @safe code will retain the array bounds checks 
(I'm not sure if contracts were included in this). Thus if using 
-release and @safe you'd be able to rely on having an Error to 
catch.

Now it might make sense for @safe code to throw an 
ArrayOutOfBounds Exception, but that would mean the function 
couldn't be marked as nothrow if array indexing is used. This is 
probably a terrible idea, but @safe nothrow functions could throw 
ArrayIndexError while @safe could throw ArrayIndexException. It 
would really suck that adding nothrow would change the semantics 
silently.
Jun 08 2017