digitalmars.D - Bad array indexing is considered deadly

Steven Schveighoffer (25/25) May 31 2017 I have discovered an annoyance in using vibe.d instead of another web

H. S. Teoh via Digitalmars-d (12/31) May 31 2017 [...]

Steven Schveighoffer (10/38) May 31 2017 Yes, I can likely do this. This kills any existing connections being

Nick Sabalausky (Abscissa) (2/12) May 31 2017 Plus, relying on that strikes me as a DoS attack vector.
Laeeth Isharc (13/60) Jun 01 2017 Hi Steve.

aberba (3/18) Jun 02 2017 How does that setup affect response time? Do you cache large

Laeeth Isharc (6/29) Jun 02 2017 Our world is very different from web world. Very few users but

Steven Schveighoffer (6/15) Jun 02 2017 I think at some point, if vibe.d doesn't move in this direction, you

Adam D. Ruppe (5/7) May 31 2017 I don't use vibe, but my cgi.d just catches RangeError, kills the

Steven Schveighoffer (10/16) May 31 2017 There are a couple issues with this. At least from the perspective of

ketmar (6/23) May 31 2017 that is, the question reduces to "should out-of-bounds be Error or Excep...

Steven Schveighoffer (28/55) May 31 2017 That ship, unfortunately, has sailed. There is no reasonable migration

Steven Schveighoffer (15/30) May 31 2017 Just realized, that @trusted escape is just so unnecessarily verbose.

ketmar (3/34) May 31 2017 bonus point: you can include index and length in error message! (somethi...

Nick Sabalausky (Abscissa) (14/16) May 31 2017 +1 million. I *hate* D's notion of Error. Well, no...more correctly, I

Moritz Maxeiner (6/15) May 31 2017 To be fair, anything that can be handled in a sane&safe way

Nick Sabalausky (Abscissa) (4/21) May 31 2017 Then out-of-bounds and assert failures should be Exception not Error.

Moritz Maxeiner (5/10) May 31 2017 No, because as I stated in my other post, the runtime *cannot*

Timon Gehr (2/14) May 31 2017 Hence all programs must abort on startup.

H. S. Teoh via Digitalmars-d (7/13) May 31 2017 If D had *true* garbage collection, it would have done this upon

Moritz Maxeiner (3/14) May 31 2017 I think vigil will be a perfect fit for you[1] ;p

Moritz Maxeiner (4/19) May 31 2017 In the context of the conversation, and error has already

Timon Gehr (3/22) May 31 2017 Bounds checks have /no business at all/ trying to handle preexisting

Moritz Maxeiner (4/11) May 31 2017 Sure, because the program is in an undefined state by that point.

Timon Gehr (13/23) May 31 2017 What does that even mean? Everything is perfectly well-defined here:

Moritz Maxeiner (14/42) May 31 2017 That once memory corruption has occurred the state of the program

Timon Gehr (5/13) Jun 01 2017 Yes, they would stop me from using a smaller scope. 'nothrow' functions

Steven Schveighoffer (4/17) Jun 02 2017 By default yes, but...

Moritz Maxeiner (19/22) May 31 2017 It is not that accessing the array out of bounds *leading* to

Nick Sabalausky (Abscissa) (9/12) May 31 2017 Of course not, that's absurd. Where do people get the idea that

Moritz Maxeiner (16/28) May 31 2017 You assume something I did not write. What I wrote is that the

Nick Sabalausky (Abscissa) (33/62) May 31 2017 Like I said, *anything* could be the result of data corruption. (And

Steven Schveighoffer (15/32) May 31 2017 To be blunt, no this is completely wrong. Memory corruption *already

Ali Çehreli (12/33) May 31 2017 True.

Ola Fosheim Grøstad (3/7) May 31 2017 How is this different from a file system exception?

Ali Çehreli (15/21) May 31 2017 When you say "memory" I think you refer to the thought of bounds

Ola Fosheim Grøstad (21/26) Jun 01 2017 That's true, but that could be the case with file system

H. S. Teoh via Digitalmars-d (32/42) May 31 2017 [...]

Moritz Maxeiner (15/20) May 31 2017 While I agree on a theoretical level about the fact that in

Steven Schveighoffer (5/20) May 31 2017 Again, there has not been memory corruption. There is a confusion

Moritz Maxeiner (18/23) May 31 2017 Again, the runtime *cannot* know that and hence you *cannot*

Timon Gehr (10/35) May 31 2017 No, it is perfectly safe, because the language does not guarantee any

Moritz Maxeiner (11/26) May 31 2017 The language not guaranteeing a specific behaviour on memory

Jonathan M Davis via Digitalmars-d (24/32) May 31 2017 Honestly, once a memory corruption has occurred, all bets are off anyway...

Moritz Maxeiner (17/49) May 31 2017 Right, and that is why termination when in doubt (and the

Steven Schveighoffer (31/51) May 31 2017 Yes, it cannot know at any point whether or not a memory corruption has

Moritz Maxeiner (28/70) May 31 2017 Because assuming the worst is a sane default.

Steven Schveighoffer (24/58) May 31 2017 But the program cannot possibly know which variable is an index. So it
Kagamin (4/8) Jun 01 2017 Other systems work like this: an internal server error is

Moritz Maxeiner (22/61) May 31 2017 I disagree.
Nick Sabalausky (Abscissa) (4/9) May 31 2017 This is why the runtime needs to guarantee that normal unwinding/cleanup...

Jonathan M Davis via Digitalmars-d (23/30) May 31 2017 It is my understanding that with how nothrow is implemented, that's not

Walter Bright (5/10) May 31 2017 Everything about a network is unreliable, so any reliable system must ha...
Nick Sabalausky (Abscissa) (1/38) May 31 2017

Nick Sabalausky (Abscissa) (6/12) May 31 2017 Honestly, I really think that if there is need to wrap something as

Jonathan M Davis via Digitalmars-d (26/38) May 31 2017 Using an Exception to signal a programming bug and then potentially tryi...

Nick Sabalausky (Abscissa) (14/47) May 31 2017 Exeption thrown != "OMG NOTHING ABOUT ANY BRANCH OF THE PROGRAM CAN BE

Jonathan M Davis via Digitalmars-d (36/45) May 31 2017 Indexing an array with an invalid index is the same as violating any

Ola Fosheim Grøstad (19/31) Jun 01 2017 Well, if you take this position then you should not only crash

Kagamin (6/10) Jun 01 2017 Sad reality is that d programmers are still comfortable writing

H. S. Teoh via Digitalmars-d (7/10) Jun 01 2017 Huh? There is no void* in that bug report, and it was closed 3 years

Moritz Maxeiner (18/21) Jun 03 2017 After some consideration you can now find the (dynamic) array

Jonathan M Davis via Digitalmars-d (41/61) May 31 2017 I don't think that you even need to worry about whether memory corruptio...

Moritz Maxeiner (11/35) May 31 2017 That is correct (and that was even mentioned in the OP), but from
Steven Schveighoffer (25/88) Jun 01 2017 Yes, it's definitely a bug, and that is not something I'm arguing

rikki cattermole (5/5) Jun 01 2017 I'm just sitting here waiting for shared libraries to be properly
Jonathan M Davis via Digitalmars-d (17/24) Jun 01 2017 Honestly, unless something about vibe.d prevents fixing bugs like bad ar...
Adam D. Ruppe (9/13) Jun 01 2017 If you control the deployment, it works perfectly well. You
Jacob Carlborg (9/13) Jun 01 2017 You can do a combination of both. One request per fiber and as many
aberba (5/17) Jun 01 2017 I'm glad I know enough to know this is an opinion...

aberba (3/23) Jun 01 2017 Here is Daemonise
Steven Schveighoffer (10/27) Jun 02 2017 Don't get me wrong, I think D will be better than other frameworks for

Adam D. Ruppe (8/11) Jun 02 2017 Correction: "vibe.d frameworks" are fragile. This isn't D

Timon Gehr (2/12) Jun 02 2017 I'm not convinced that public perception is sensitive to such details. ;...

Moritz Maxeiner (20/23) May 31 2017 Sorry for double post, but - after thinking more about this - I

Steven Schveighoffer (4/14) May 31 2017 Nope, an autonomous system did not type out my code that caused the out

Moritz Maxeiner (3/5) May 31 2017 Same as the human who typed out the code of the autonomous system.

Kagamin (4/7) May 31 2017 On windows you can set up service restart settings in case it

Steven Schveighoffer (5/11) May 31 2017 That *would* be a feature on Windows ;)

Moritz Maxeiner (3/5) May 31 2017 OT: *with whatever process supervisor floats your boat.
Daniel Kozak via Digitalmars-d (5/20) Jun 01 2017 [Service]

Steven Schveighoffer (3/6) Jun 01 2017 Thanks!

John Colvin (32/63) May 31 2017 What things are considered unrecoverable errors or not is

Brad Roberts via Digitalmars-d (9/13) May 31 2017 This.. exactly this. I've worked on software from the tiny device level...

Walter Bright (25/32) May 31 2017 Since you don't know where the bad index came from, such a conclusion ca...

Guillaume Piolat (9/15) Jun 01 2017 +1

Ola Fosheim Grøstad (4/7) Jun 01 2017 No. You don't want to crash immediately. In fact, you want to

Guillaume Piolat (3/12) Jun 01 2017 Solved by auto-saving, _before_ the crash

H. S. Teoh via Digitalmars-d (10/21) Jun 01 2017 Yes. Saving *after* a crash was detected is stupid, because you no

Walter Bright (3/9) Jun 01 2017 An even better idea is to use rolling backups, with the crash recovery b...

Ola Fosheim Grøstad (2/3) Jun 02 2017 That only works for simple applications.

Steven Schveighoffer (30/62) Jun 01 2017 You could say that about any error. You could say that about malformed

Jonathan M Davis via Digitalmars-d (50/68) Jun 01 2017 I think that it really comes down to what the contract is and how it mak...

Walter Bright (4/11) Jun 01 2017 It is a programming bug to not validate the input. It's not that bad to ...

Timon Gehr (5/21) Jun 01 2017 They should be treated as bugs, but isn't it plausible that there are

Walter Bright (10/20) Jun 01 2017 The stages of programming expertise:

Timon Gehr (26/55) Jun 01 2017 This does not really say anything about programming expertise, it says

Walter Bright (2/4) Jun 01 2017 C quality code is straightforward in D. Just mark it @system.

Timon Gehr (2/7) Jun 01 2017 I don't know what this is, but it is not an answer to my post.

Paolo Invernizzi (19/61) Jun 01 2017 Everything coming as an input of the _process_ should be

Timon Gehr (10/24) Jun 01 2017 You seem to not understand what happened. There was a single server

Paolo Invernizzi (15/39) Jun 01 2017 I really understand what is happening: I've a vibe.d server

aberba (3/9) Jun 01 2017 Pretty much it. Containerisation of several stateless instances

Steven Schveighoffer (14/24) Jun 02 2017 If only that is what happened, I would not have started this thread!

Arafel (32/50) Jun 02 2017 Hi,

Steven Schveighoffer (12/22) Jun 02 2017 I don't think this is workable, simply because of nothrow. An Error is

Arafel (10/25) Jun 02 2017 Well, as I understood from this thread this is already possible in debug...

Steven Schveighoffer (5/31) Jun 02 2017 Yes, of course. This is a non-starter if you need to compile release

John Colvin (39/89) Jun 01 2017 I think the idea is that no, array overflows can never be caused

Stanislav Blinov (3/7) Jun 01 2017 Oh yes, there is a way:

John Colvin (2/9) Jun 01 2017 Sure, @safe has some holes as it currently stands.

Jonathan M Davis via Digitalmars-d (9/19) Jun 01 2017 It's far better than nothing, but it definitely has holes. DIP 1000 is

Walter Bright (3/5) Jun 01 2017 Please post bug reports to bugzilla. Posting them only on the n.g. prett...

Stanislav Blinov (2/7) Jun 01 2017 Please look at the very first post of that thread :\

H. S. Teoh via Digitalmars-d (13/21) Jun 01 2017 [...]
Walter Bright (34/45) Jun 01 2017 What's missing here is looking carefully at a program and deciding what ...

H. S. Teoh via Digitalmars-d (32/44) Jun 01 2017 +1. I think this is the root of the problem. Data that comes from

cym13 (15/25) Jun 01 2017 I'm not familiar with the idea, do we need more than the

H. S. Teoh via Digitalmars-d (123/150) Jun 01 2017 [...]

cym13 (3/4) Jun 01 2017 Now that I think about it, what we really want going that way is

Walter Bright (4/7) Jun 01 2017 Found it:
Dukc (11/16) Jun 01 2017 I think he understood all that already. Array overflow is a sign
Steven Schveighoffer (33/39) Jun 02 2017 I think it's important to state that no, I wasn't relying on array

John Carter (32/41) May 31 2017 In this case it is fairly obvious where the bad index is coming

H. S. Teoh via Digitalmars-d (31/40) May 31 2017 [...]

Paolo Invernizzi (4/10) Jun 01 2017 That's exactly the point: to use the right tool for the

Timon Gehr (2/16) Jun 01 2017 There is no such tool.

Jacob Carlborg (9/10) Jun 01 2017 In this case, Erlang is a pretty good candidate. It's using green
Paolo Invernizzi (3/19) Jun 01 2017 Process isolation was exactly crafted for that.

Vladimir Panteleev (28/31) Jun 01 2017 Since I wrote/run a bunch of websites/network services written in

Walter Bright (4/10) Jun 01 2017 This is the best advice.

Steven Schveighoffer (12/23) Jun 01 2017 Indeed it is good advice. I'm thinking actually a good setup is to have

Martin Tschierschke (9/12) Jun 01 2017 Is this option useful for you?

Nick Sabalausky (Abscissa) (3/17) Jun 01 2017 All that would do is *cause* corruption due to the way the runtime

Andrei Alexandrescu (21/53) Jun 02 2017 This is a meaningful concern. People use threads instead of processes

Joseph Rushton Wakeling (11/13) Jun 04 2017 Ideally, fiber, as well. Probably the real ideal for this sort

Jacob Carlborg (7/14) Jun 04 2017 Erlang has the philosophy of share nothing between processes (green

Paolo Invernizzi (5/23) Jun 04 2017 If I'm not wrong, it also uses a VM, also if there's the

Jacob Carlborg (4/7) Jun 04 2017 Yes, it's running on a VM, the Beam.
Ola Fosheim Grøstad (12/15) Jun 04 2017 Not sure if I follow that. If you only use safe code then there

Joseph Rushton Wakeling (6/10) Jun 04 2017 Indeed. (I used 'task' here in a deliberately vague sense, in

nohbdy (36/36) Jun 02 2017 I'm using D to write an RSS reader.

Paolo Invernizzi (5/12) Jun 02 2017 The worst thing happened in programming in the last 30 years is

Ola Fosheim Grøstad (17/20) Jun 03 2017 Really?

Paolo Invernizzi (19/40) Jun 03 2017 It doesn't seems to me that the trends to try to handle somehow,

Ola Fosheim Grøstad (35/43) Jun 03 2017 That all depends. It makes perfect sense in a "strongly pure"

Ola Fosheim Grøstad (10/10) Jun 03 2017 Anyway, all of this boils down to the question of whether D
Paolo Invernizzi (20/49) Jun 03 2017 Sorry Ola, I can't support that way of working.

Ola Fosheim Grøstad (34/48) Jun 03 2017 If the compiler is broken then anything could happen, at any

Timon Gehr (10/26) Jun 03 2017 I don't get why you would /restart/ mission-critical software that has

Ola Fosheim Grøstad (10/20) Jun 03 2017 Yes, mission critical software such as flight control are (and
Paolo Invernizzi (16/45) Jun 03 2017 That's what should be done in mission-critical software, and we

Timon Gehr (13/65) Jun 03 2017 That document says that the crash was caused by a component going down

Steven Schveighoffer <schveiguy yahoo.com> writes:

I have discovered an annoyance in using vibe.d instead of another web 
framework. Simple errors in indexing crash the entire application.

For example:

int[3] arr;
arr[3] = 5;

Compare this to, let's say, a malformed unicode string (exception), 
malformed JSON data (exception), file not found (exception), etc.

Technically this is a programming error, and a bug. But memory hasn't 
actually been corrupted. The system properly stopped me from corrupting 
memory. But my reward is that even though this fiber threw an Error, and 
I get an error message in the log showing me the bug, the web server 
itself is now out of commission. No other pages can be served. This is 
like the equivalent of having a guard rail on a road not only stop you 
from going off the cliff but proactively disable your car afterwards to 
prevent you from more harm.

This seems like a large penalty for "almost" corrupting memory. No other 
web framework I've used crashes the entire web server for such a simple 
programming error. And vibe.d has no choice. There is no guarantee the 
stack is properly unwound, so it has to accept the characterization of 
this as a program-ending error by the D runtime.

I am considering writing a set of array wrappers that throw exceptions 
when trying to access out of bounds elements. This comes with its own 
set of problems, but at least the web server should continue to run.

What are your thoughts? Have you run into this? If so, how did you solve it?

-Steve

May 31 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 I have discovered an annoyance in using vibe.d instead of another web
 framework. Simple errors in indexing crash the entire application.
 
 For example:
 
 int[3] arr;
 arr[3] = 5;
 
 Compare this to, let's say, a malformed unicode string (exception),
 malformed JSON data (exception), file not found (exception), etc.
 
 Technically this is a programming error, and a bug. But memory hasn't
 actually been corrupted. The system properly stopped me from
 corrupting memory. But my reward is that even though this fiber threw
 an Error, and I get an error message in the log showing me the bug,
 the web server itself is now out of commission. No other pages can be
 served. This is like the equivalent of having a guard rail on a road
 not only stop you from going off the cliff but proactively disable
 your car afterwards to prevent you from more harm.

[...]

Isn't it customary to have the webserver launched by a script that
restarts it whenever it crashes (after logging a message in an emergency
logfile)?  Not an ideal solution, I know, but at least it minimizes
downtime.
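
For illustration, the restart-on-crash idea could be sketched in D roughly as follows. This is only a sketch under assumptions: `supervise` and `superviseCommand` are hypothetical names, the restart limit is arbitrary, and a real deployment would more likely rely on an existing process supervisor. It uses `spawnProcess` and `wait` from `std.process`.

```d
import std.process : spawnProcess, wait;
import std.stdio : stderr;

/// Runs `runOnce` until it reports exit status 0 or `maxRestarts` restarts
/// have been used up. Returns the number of restarts performed.
/// (Hypothetical helper, not part of vibe.d or Phobos.)
int supervise(scope int delegate() runOnce, int maxRestarts)
{
    int restarts = 0;
    while (true)
    {
        immutable status = runOnce();
        if (status == 0 || restarts == maxRestarts)
            return restarts;
        // Stand-in for the "emergency logfile" entry.
        stderr.writefln("server exited with status %s, restarting", status);
        ++restarts;
    }
}

/// Wires the policy above to a real child process.
/// `cmd` would be something like ["./server"] (placeholder path).
int superviseCommand(string[] cmd, int maxRestarts)
{
    return supervise(() => wait(spawnProcess(cmd)), maxRestarts);
}
```

Keeping the restart policy separate from the process spawning makes it possible to exercise the loop without launching anything. Note that a restart still drops whatever connections the crashed process was serving.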

On another note, why didn't the compiler reject the above code? I
thought it checks static arrays bounds at compile time whenever
possible. Did I remember wrong?


T

-- 
Change is inevitable, except from a vending machine.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 I have discovered an annoyance in using vibe.d instead of another web
 framework. Simple errors in indexing crash the entire application.

 For example:

 int[3] arr;
 arr[3] = 5;

 Compare this to, let's say, a malformed unicode string (exception),
 malformed JSON data (exception), file not found (exception), etc.

 Technically this is a programming error, and a bug. But memory hasn't
 actually been corrupted. The system properly stopped me from
 corrupting memory. But my reward is that even though this fiber threw
 an Error, and I get an error message in the log showing me the bug,
 the web server itself is now out of commission. No other pages can be
 served. This is like the equivalent of having a guard rail on a road
 not only stop you from going off the cliff but proactively disable
 your car afterwards to prevent you from more harm.

 [...]

 Isn't it customary to have the webserver launched by a script that
 restarts it whenever it crashes (after logging a message in an emergency
 logfile)?  Not an ideal solution, I know, but at least it minimizes
 downtime.

Yes, I can likely do this. This kills any existing connections being 
handled though, and is far far from ideal. It's also a hard crash, any 
operations such as writing DB data are killed mid-stream.

But you won't win over any minds that are used to php or python with 
this workaround.

 On another note, why didn't the compiler reject the above code? I
 thought it checks static arrays bounds at compile time whenever
 possible. Did I remember wrong?

I'm not sure, it's a toy example. In the real bug, the index was a 
variable. The annoying thing about this is that there is no actual 
memory corruption. It was properly stopped.

-Steve

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 09:34 AM, Steven Schveighoffer wrote:
 On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
 Isn't it customary to have the webserver launched by a script that
 restarts it whenever it crashes (after logging a message in an emergency
 logfile)?  Not an ideal solution, I know, but at least it minimizes
 downtime.

 
 Yes, I can likely do this. This kills any existing connections being 
 handled though, and is far far from ideal. It's also a hard crash, any 
 operations such as writing DB data are killed mid-stream.

Plus, relying on that strikes me as a DoS attack vector.

May 31 2017

Laeeth Isharc <laeethnospam nospam.laeeth.com> writes:

On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 I have discovered an annoyance in using vibe.d instead of 
 another web
 framework. Simple errors in indexing crash the entire 
 application.

 For example:

 int[3] arr;
 arr[3] = 5;

 Compare this to, let's say, a malformed unicode string 
 (exception),
 malformed JSON data (exception), file not found (exception), 
 etc.

 Technically this is a programming error, and a bug. But 
 memory hasn't
 actually been corrupted. The system properly stopped me from
 corrupting memory. But my reward is that even though this 
 fiber threw
 an Error, and I get an error message in the log showing me 
 the bug,
 the web server itself is now out of commission. No other 
 pages can be
 served. This is like the equivalent of having a guard rail on 
 a road
 not only stop you from going off the cliff but proactively 
 disable
 your car afterwards to prevent you from more harm.

 [...]

 Isn't it customary to have the webserver launched by a script 
 that
 restarts it whenever it crashes (after logging a message in an 
 emergency
 logfile)?  Not an ideal solution, I know, but at least it 
 minimizes
 downtime.

 Yes, I can likely do this. This kills any existing connections 
 being handled though, and is far far from ideal. It's also a 
 hard crash, any operations such as writing DB data are killed 
 mid-stream.

..
 -Steve

Hi Steve.

Had similar problems early on.  We used supervisord to 
automatically keep a pool of vibed applications running and put 
nginx in front as a load balancer. User session info stored in 
redis.  And a separate process for data communicating with web 
server over nanomsg.  Zeromq is more mature but I found sometimes 
socket could get into an inconsistent state if servers crashed 
midway, and nanomsg doesn't have this problem. So data update 
either succeeds or fails but no corruption if Web server crashes.

Maybe better ways but it seems to be okay for us.


Laeeth

Jun 01 2017

aberba <karabutaworld gmail.com> writes:

On Friday, 2 June 2017 at 02:11:34 UTC, Laeeth Isharc wrote:
 On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven Schveighoffer 
 wrote:
 [...]

 Hi Steve.

 Had similar problems early on.  We used supervisord to 
 automatically keep a pool of vibed applications running and put 
 nginx in front as a load balancer. User session info stored in 
 redis.  And a separate process for data communicating with web 
 server over nanomsg.  Zeromq is more mature but I found 
 sometimes socket could get into an inconsistent state if 
 servers crashed midway, and nanomsg doesn't have this problem. 
 So data update either succeeds or fails but no corruption if 
 Web server crashes.

 Maybe better ways but it seems to be okay for us.


 Laeeth

How does that setup affect response time? Do you cache large 
query results in redis?

Jun 02 2017

Laeeth Isharc <laeeth nospamlaeeth.com> writes:

On Friday, 2 June 2017 at 10:37:09 UTC, aberba wrote:
 On Friday, 2 June 2017 at 02:11:34 UTC, Laeeth Isharc wrote:
 On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven 
 Schveighoffer wrote:
 [...]

 Hi Steve.

 Had similar problems early on.  We used supervisord to 
 automatically keep a pool of vibed applications running and 
 put nginx in front as a load balancer. User session info 
 stored in redis.  And a separate process for data 
 communicating with web server over nanomsg.  Zeromq is more 
 mature but I found sometimes socket could get into an 
 inconsistent state if servers crashed midway, and nanomsg 
 doesn't have this problem. So data update either succeeds or 
 fails but no corruption if Web server crashes.

 Maybe better ways but it seems to be okay for us.


 Laeeth

 How does that setup affect response time? Do you cache large 
 query results in redis?

Our world is very different from web world.  Very few users but 
incredibly high value.  If we have twenty users then for most 
things that's a lot.  We don't cache query results as it's fast 
enough and the data retrieval bit is not where the bottleneck is.


Laeeth

Jun 02 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 10:11 PM, Laeeth Isharc wrote:
 Had similar problems early on.  We used supervisord to automatically
 keep a pool of vibed applications running and put nginx in front as a
 load balancer. User session info stored in redis.  And a separate
 process for data communicating with web server over nanomsg.  Zeromq is
 more mature but I found sometimes socket could get into an inconsistent
 state if servers crashed midway, and nanomsg doesn't have this problem.
 So data update either succeeds or fails but no corruption if Web server
 crashes.

 Maybe better ways but it seems to be okay for us.

I think at some point, if vibe.d doesn't move in this direction, you 
will see a popular setup that wraps vibe.d along these lines. I imagined 
a similar solution earlier: 
https://forum.dlang.org/post/ogq7nd$ccj$1@digitalmars.com

-Steve

Jun 02 2017

Adam D. Ruppe <destructionator gmail.com> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 What are your thoughts? Have you run into this? If so, how did 
 you solve it?

I don't use vibe, but my cgi.d just catches RangeError, kills the 
individual connection, and lets the others carry on. Can you do 
the same thing?
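
A minimal sketch of that catch-and-continue approach, for illustration only. It is not cgi.d's actual code: `serveOne` and the handler delegate are hypothetical names, and it assumes bounds checking is enabled (no `-release` or `-boundscheck=off`).

```d
import core.exception : RangeError;
import std.stdio : stderr;

/// Runs one request handler; returns false if it died with a RangeError.
/// `serveOne` is a hypothetical name, not part of cgi.d or vibe.d.
bool serveOne(scope void delegate() handler)
{
    try
    {
        handler();
        return true;
    }
    catch (RangeError e)
    {
        // Log the bug and drop only this connection.
        stderr.writeln("request aborted: ", e.msg, " at ", e.file, ":", e.line);
        return false;
    }
}

unittest
{
    int[] data = [1, 2, 3];
    size_t idx = data.length;              // out-of-range index known only at run time
    assert(!serveOne({ data[idx] = 5; })); // the faulty request fails...
    assert(serveOne({ data[0] = 5; }));    // ...and later requests still run
}
```

Whether this is sound depends on how much cleanup was skipped while the `RangeError` propagated, since the runtime gives no guarantee of proper unwinding for an `Error`.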

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 What are your thoughts? Have you run into this? If so, how did you
 solve it?

 I don't use vibe, but my cgi.d just catches RangeError, kills the
 individual connection, and lets the others carry on. Can you do the same
 thing?

There are a couple issues with this. At least from the perspective of 
vibe.d attempting to be a mainstream base library.

1. You can mark a function nothrow that throws a RangeError. So the 
compiler is free to assume the function won't throw and build faster 
code that won't properly clean up if an Error is thrown.

2. Technically, there is no guarantee by the runtime to unwind the 
stack. So at some point, your workaround may not even work. And even if 
it does, things like RAII may not work.
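
To illustrate point 1, here is a small sketch (the function names `pick` and `pickOr` are made up for this example). Indexing is accepted inside a `nothrow` function because a `RangeError` is an `Error`, not an `Exception`, whereas an `Exception` has to be handled before it can leave the function.

```d
// Compiles: indexing may throw RangeError, but Errors are not part of
// the nothrow contract.
int pick(const int[] values, size_t idx) nothrow @safe
{
    return values[idx];
}

// An Exception may not escape a nothrow function, so it has to be
// caught inside the function body.
int pickOr(const int[] values, size_t idx, int fallback) nothrow @safe
{
    try
    {
        if (idx >= values.length)
            throw new Exception("index out of bounds");
        return values[idx];
    }
    catch (Exception)
    {
        return fallback;
    }
}
```

A caller of `pick` sees only `nothrow`, which is why code built around it may omit the cleanup paths that catching a `RangeError` would depend on.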

-Steve

May 31 2017

ketmar <ketmar ketmar.no-ip.org> writes:

Steven Schveighoffer wrote:

 On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 What are your thoughts? Have you run into this? If so, how did you
 solve it?

 I don't use vibe, but my cgi.d just catches RangeError, kills the
 individual connection, and lets the others carry on. Can you do the same
 thing?

 There are a couple issues with this. At least from the perspective of 
 vibe.d attempting to be a mainstream base library.

 1. You can mark a function nothrow that throws a RangeError. So the 
 compiler is free to assume the function won't throw and build faster code 
 that won't properly clean up if an Error is thrown.

 2. Technically, there is no guarantee by the runtime to unwind the stack. 
 So at some point, your workaround may not even work. And even if it does, 
 things like RAII may not work.

 -Steve

that is, the question reduces to "should out-of-bounds be Error or Exception"?

i myself see no easy way to customize this with language attribute 
(new/delete disaster immediately comes to mind). so i'd say: "create your 
own array wrapper/implementation, and hope that all the functions you need 
are rangified, so they'll be able to work with YourArray".

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 9:54 AM, ketmar wrote:
 Steven Schveighoffer wrote:

 On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 What are your thoughts? Have you run into this? If so, how did you
 solve it?

 I don't use vibe, but my cgi.d just catches RangeError, kills the
 individual connection, and lets the others carry on. Can you do the same
 thing?

 There are a couple issues with this. At least from the perspective of
 vibe.d attempting to be a mainstream base library.

 1. You can mark a function nothrow that throws a RangeError. So the
 compiler is free to assume the function won't throw and build faster
 code that won't properly clean up if an Error is thrown.

 2. Technically, there is no guarantee by the runtime to unwind the
 stack. So at some point, your workaround may not even work. And even
 if it does, things like RAII may not work.

 that is, the question reduces to "should out-of-bounds be Error or
 Exception"?

That ship, unfortunately, has sailed. There is no reasonable migration 
path, as every function that uses indexing can currently be marked 
nothrow, and would stop compiling in one way or another. In other words 
mass breakage of every project would likely happen.

 i myself see no easy way to customize this with language attribute
 (new/delete disaster immediately comes to mind). so i'd say: "create
 your own array wrapper/implementation, and hope that all the functions
 you need are rangified, so they'll be able to work with YourArray".

I have, and it seems to work OK for my purposes (and wasn't really that 
bad actually).

Here is complete implementation (should be @safe too):

struct ExArr(T, size_t dim)
{
     T[dim] _value;
     alias _value this;
     ref inout(T) opIndex(size_t idx, string fname = __FILE__,
                          size_t linenum = __LINE__) inout
     {
         if(idx >= dim)
             throw new Exception("Index out of bounds", fname, linenum);
         static ref x(ref inout(T[dim]) val, size_t i) @trusted
         { return val.ptr[i]; }
         return x(_value, idx);
     }
}

Now, I just need to search and replace for all the cases where I have a 
static array...

A dynamic array replacement shouldn't be too difficult either. Just need 
to override opIndex and opSlice. Then I can override those in my static 
array implementation as well.

-Steve

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 10:07 AM, Steven Schveighoffer wrote:

 Here is complete implementation (should be @safe too):

 struct ExArr(T, size_t dim)
 {
     T[dim] _value;
     alias _value this;
     ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t
 linenum = __LINE__) inout
     {
         if(idx >= dim)
             throw new Exception("Index out of bounds", fname, linenum);
         static ref x(ref inout(T[dim]) val, size_t i) @trusted { return
 val.ptr[i]; }
         return x(_value, idx);
     }
 }

Just realized, that @trusted escape is just so unnecessarily verbose.

struct ExArr(T, size_t dim)
{
     T[dim] _value;
     alias _value this;
     ref inout(T) opIndex(size_t idx, string fname = __FILE__,
                          size_t linenum = __LINE__) inout @trusted
     {
         if(idx >= dim)
             throw new Exception("Index out of bounds", fname, linenum);
         return _value.ptr[idx];
     }
}

-Steve

May 31 2017

ketmar <ketmar ketmar.no-ip.org> writes:

Steven Schveighoffer wrote:

 On 5/31/17 10:07 AM, Steven Schveighoffer wrote:

 Here is complete implementation (should be @safe too):

 struct ExArr(T, size_t dim)
 {
     T[dim] _value;
     alias _value this;
     ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t
 linenum = __LINE__) inout
     {
         if(idx >= dim)
             throw new Exception("Index out of bounds", fname, linenum);
         static ref x(ref inout(T[dim]) val, size_t i)  trusted { return
 val.ptr[i]; }
         return x(_value, idx);
     }
 }

 Just realized, that  trusted escape is just so unnecessarily verbose.

 struct ExArr(T, size_t dim)
 {
      T[dim] _value;
      alias _value this;
      ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t 
 linenum = __LINE__) inout  trusted
      {
          if(idx >= dim)
              throw new Exception("Index out of bounds", fname, linenum);
          return _value.ptr[idx];
      }
 }

 -Steve

bonus point: you can include index and length in error message! (something 
i really miss in dmd range error)
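
A possible sketch of that bonus point, assuming std.conv.text is used to build the message (the exact wording of the message is an assumption, not something posted in this thread):

```d
import std.conv : text;

struct ExArr(T, size_t dim)
{
    T[dim] _value;
    alias _value this;

    ref inout(T) opIndex(size_t idx, string fname = __FILE__,
                         size_t linenum = __LINE__) inout @trusted
    {
        if (idx >= dim)
            // Report both the offending index and the array length.
            throw new Exception(
                text("Index ", idx, " out of bounds for array of length ", dim),
                fname, linenum);
        return _value.ptr[idx];
    }
}
```

The message is only built on the failure path, so the extra formatting should cost nothing for in-range accesses.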

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:
 
 What are your thoughts?

+1 million. I *hate* D's notion of Error. Well, no...more correctly, I 
absolutely hate that it throws cleanup/unwinding straight out the window 
for many situations that can obviously be handled safely without the 
paranoid "ZOMG Sky Is Falling!!!!" overreaction that is baked into the 
design of Error. And that causes problems like the one you describe.

Kill it with fire!!!

A wrapper type seems like a plausible workaround, but I really, really 
dislike that it would ever be necessary to bother wrapping such a basic 
prevalent feature as...arrays, especially just to work around such a 
colossal misfeature.

And, as you describe in your reply to H.S. Teoh, the current behavior of 
Error can actually cause MORE damage than just recovering from an 
obviously recoverable situation.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 17:13:08 UTC, Nick Sabalausky 
(Abscissa) wrote:
 On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:
 
 What are your thoughts?

 +1 million. I *hate* D's notion of Error. Well, no...more 
 correctly, I absolutely hate that it throws cleanup/unwinding 
 straight out the window for many situations that can obviously 
 be handled safely without the paranoid "ZOMG Sky Is 
 Falling!!!!" overreaction that is baked into the design of 
 Error. And that causes problems like the one you describe.

To be fair, anything that can be handled in a sane&safe way 
should inherit from Exception, not from Error, so throwing away 
cleanup for Error makes sense, since an Error means the program 
is in an undefined state and should terminate asap.

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 02:55 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 17:13:08 UTC, Nick Sabalausky (Abscissa) 
 wrote:
 On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:
 What are your thoughts?

 +1 million. I *hate* D's notion of Error. Well, no...more correctly, I 
 absolutely hate that it throws cleanup/unwinding straight out the 
 window for many situations that can obviously be handled safely 
 without the paranoid "ZOMG Sky Is Falling!!!!" overreaction that is 
 baked into the design of Error. And that causes problems like the one 
 you describe.

 
 To be fair, anything that can be handled in a sane&safe way should 
 inherit from Exception, not from Error, so throwing away cleanup for 
 Error makes sense, since an Error means the program is in an undefined 
 state and should terminate asap.

Then out-of-bounds and assert failures should be Exception not Error. 
Frankly, even out-of-memory, arguably. And then there's null 
dereference... In other words, basically everything.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky 
(Abscissa) wrote:
 [...]
 program is in an undefined state and should terminate asap.

 Then out-of-bounds and assert failures should be Exception not 
 Error. Frankly, even out-of-memory, arguably. And then there's 
 null dereference... In other words, basically everything.

No, because as I stated in my other post, the runtime *cannot* 
assume that it is safe *in all cases*. If there is even one 
single case in which it is unsafe, it must abort.

May 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 31.05.2017 22:45, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) 
 wrote:
 [...]
 program is in an undefined state and should terminate asap.

 Then out-of-bounds and assert failures should be Exception not Error. 
 Frankly, even out-of-memory, arguably. And then there's null 
 dereference... In other words, basically everything.

 
 No, because as I stated in my other post, the runtime *cannot* assume 
 that it is safe *in all cases*. If there is even one single case in 
 which it is unsafe, it must abort.

Hence all programs must abort on startup.

May 31 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Wed, May 31, 2017 at 11:29:53PM +0200, Timon Gehr via Digitalmars-d wrote:
 On 31.05.2017 22:45, Moritz Maxeiner wrote:

[...]
 No, because as I stated in my other post, the runtime *cannot*
 assume that it is safe *in all cases*. If there is even one single
 case in which it is unsafe, it must abort.

 
 Hence all programs must abort on startup.

If D had *true* garbage collection, it would have done this upon
starting up any buggy program. :-D


T

-- 
Why is it that all of the instruments seeking intelligent life in the universe
are pointed away from Earth? -- Michael Beibl

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:30:47 UTC, H. S. Teoh wrote:
 On Wed, May 31, 2017 at 11:29:53PM +0200, Timon Gehr via 
 Digitalmars-d wrote:
 On 31.05.2017 22:45, Moritz Maxeiner wrote:

 [...]
 No, because as I stated in my other post, the runtime 
 *cannot* assume that it is safe *in all cases*. If there is 
 even one single case in which it is unsafe, it must abort.

 
 Hence all programs must abort on startup.

 If D had *true* garbage collection, it would have done this 
 upon starting up any buggy program. :-D

I think vigil will be a perfect fit for you[1] ;p

[1] https://github.com/munificent/vigil

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:29:53 UTC, Timon Gehr wrote:
 On 31.05.2017 22:45, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky 
 (Abscissa) wrote:
 [...]
 program is in an undefined state and should terminate asap.

 Then out-of-bounds and assert failures should be Exception 
 not Error. Frankly, even out-of-memory, arguably. And then 
 there's null dereference... In other words, basically 
 everything.

 
 No, because as I stated in my other post, the runtime *cannot* 
 assume that it is safe *in all cases*. If there is even one 
 single case in which it is unsafe, it must abort.

 Hence all programs must abort on startup.

In the context of the conversation, an error has already 
occurred, and "all cases" referred to all the cases that 
lead to the error.

May 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 00:22, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 21:29:53 UTC, Timon Gehr wrote:
 On 31.05.2017 22:45, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) 
 wrote:
 [...]
 program is in an undefined state and should terminate asap.

 Then out-of-bounds and assert failures should be Exception not 
 Error. Frankly, even out-of-memory, arguably. And then there's null 
 dereference... In other words, basically everything.

 No, because as I stated in my other post, the runtime *cannot* assume 
 that it is safe *in all cases*. If there is even one single case in 
 which it is unsafe, it must abort.

 Hence all programs must abort on startup.

 
 In the context of the conversation, an error has already occurred, and 
 "all cases" referred to all the cases that lead to the error.

Bounds checks have /no business at all/ trying to handle preexisting 
memory corruption, and in that sense they are comparable to program startup.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:
 
 In the context of the conversation, an error has already 
 occurred, and "all cases" referred to all the cases that 
 lead to the error.

 Bounds checks have /no business at all/ trying to handle 
 preexisting memory corruption,

Sure, because the program is in an undefined state by that point. 
There is only termination.

 and in that sense they are comparable to program startup.

I disagree.

May 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 01:55, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:
 In the context of the conversation, an error has already occurred, 
 and "all cases" referred to all the cases that lead to the error.

 Bounds checks have /no business at all/ trying to handle preexisting 
 memory corruption,

 
 Sure, because the program is in an undefined state by that point.

What does that even mean? Everything is perfectly well-defined here:

void main(){
     auto a = new int[](2);
     a[2] = 3;
}

 There is only termination.
 ...


Termination of what? How on earth do you determine that the scope of 
this "undefined state" is the program, not the machine, or the world? 
I.e., why terminate the program, but not shut down the machine or nuke 
the planet?

Scoping really ought to be up to the programmer as it greatly depends on 
the actual circumstances. Program termination is the only reasonable 
default behaviour, but it is not the only reasonable behaviour.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Thursday, 1 June 2017 at 00:11:10 UTC, Timon Gehr wrote:
 On 01.06.2017 01:55, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:
 In the context of the conversation, an error has already 
 occurred, and "all cases" referred to all the cases 
 that lead to the error.

 Bounds checks have /no business at all/ trying to handle 
 preexisting memory corruption,

 
 Sure, because the program is in an undefined state by that 
 point.

 What does that even mean?

That once memory corruption has occurred the state of the program 
is not well defined anymore.

 Everything is perfectly well-defined here:

 void main(){
     auto a = new int[](2);
     a[2] = 3;
 }

Sure, because there has been no memory corruption prior to the 
index out of bounds.
That is not something the runtime should just assume for every 
out of index error.

 There is only termination.
 ...


 Termination of what? How on earth do you determine that the 
 scope of this "undefined state" is the program, not the 
 machine, or the world?

As that is the closest scope current operating systems give us to 
work with, this is a sane default for the runtime. Nobody stops 
you from using a different scope if you need it.

 I.e., why terminate the program, but not shut down the machine 
 or nuke the planet?

 Scoping really ought to be up to the programmer as it greatly 
 depends on the actual circumstances.

Of course, and if you need something else you can do so.

 Program termination is the only reasonable default behaviour, 
 but it is not the only reasonable behaviour.

Absolutely; rereading through our subthread I realized that I had 
not made that explicit here (only in other subthreads). I 
apologize for being imprecise.

May 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 02:57, Moritz Maxeiner wrote:
 Termination of what? How on earth do you determine that the scope of 
 this "undefined state" is the program, not the machine, or the world?

 
 As that is the closest scope current operating systems give us to work 
 with, this is a sane default for the runtime. Nobody stops you from 
 using a different scope if you need it.
 

Yes, they would stop me from using a smaller scope. 'nothrow' functions 
are not guaranteed to be unwindable and the compiler infers 'nothrow' 
automatically. Also, null pointer dereferences do not even throw. (On 
Linux.)

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 3:49 PM, Timon Gehr wrote:
 On 01.06.2017 02:57, Moritz Maxeiner wrote:
 Termination of what? How on earth do you determine that the scope of
 this "undefined state" is the program, not the machine, or the world?

 As that is the closest scope current operating systems give us to work
 with, this is a sane default for the runtime. Nobody stops you from
 using a different scope if you need it.

 Yes, they would stop me from using a smaller scope. 'nothrow' functions
 are not guaranteed to be unwindable and the compiler infers 'nothrow'
 automatically. Also, null pointer dereferences do not even throw. (On
 Linux.)

By default yes, but...

https://github.com/dlang/druntime/blob/master/src/etc/linux/memoryerror.d

-Steve

Jun 02 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 [...]

 What are your thoughts? Have you run into this? If so, how did 
 you solve it?

It is not accessing the array out of bounds *leading* to 
data corruption that is the issue here, but that in general you 
have to assume that the index *being* out of bounds is itself the 
*result* of *already occurred* data corruption; and if data 
corruption occurred for the index, you *cannot* assume that 
*only* the index has been affected. The runtime cannot simply 
assume the index being out of bounds is not the result of already 
occurred data corruption, because that is inherently unsafe, so 
it *must* terminate asap as the default.

If you get the index as the input to your process - and thus 
*know* that it being out of bounds is not the result of previous 
data corruption - then you should check this yourself before 
accessing the array and handle it appropriately (e.g. via 
Exception).

So in your specific use case I would say use a wrapper. This is 
one of the reasons why I am working on my own library for data 
structures (libds).
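
As a rough illustration of checking an externally supplied index up front, here is a sketch using std.exception.enforce. The function pick and its parameters are hypothetical names, chosen only for this example:

```d
import std.conv : to;
import std.exception : enforce;

/// Hypothetical lookup: `input` stands for an index that arrived from
/// outside the process (e.g. a request parameter), so a bad value is
/// bad input rather than a sign of corruption.
string pick(string[] items, string input)
{
    // to!size_t throws ConvException (an Exception) on malformed input.
    immutable idx = input.to!size_t;
    // enforce throws Exception when the condition is false.
    enforce(idx < items.length, "index out of range");
    return items[idx];
}
```

Since the range is validated before the array is touched, the built-in bounds check never fires for bad input, and the caller can catch the Exception like any other input error.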

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
 in general you have to 
 assume that the index *being* out of bounds is itself the *result* of 
 *already occurred* data corruption;

Of course not, that's absurd. Where do people get the idea that 
out-of-bounds *implies* pre-existing data corruption? Most of the time, 
out-of-bounds comes from a bug (especially in D, what with all of its 
safeguards).

Sure, data corruption is one possible cause of out-of-bounds, but data 
corruption is one possible cause of *ANYTHING*. So just to be safe, 
let's just abort on all exceptions, and upon everything else for that 
matter.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 20:23:21 UTC, Nick Sabalausky 
(Abscissa) wrote:
 On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
 in general you have to assume that the index *being* out of 
 bounds is itself the *result* of *already occurred* data 
 corruption;

 Of course not, that's absurd. Where do people get the idea that 
 out-of-bounds *implies* pre-existing data corruption?

You assume something I did not write. What I wrote is that the 
runtime cannot *in general* (i.e. without further information 
about the semantics of your specific program) assume that it was 
*not* preexisting data corruption.

 Most of  the time, out-of-bounds comes from a bug (especially 
 in D, what with all of its safeguards).

Unfortunately the runtime has no way to know *if* the out of 
bounds comes from a bug or a data corruption, which was my point; 
only a human can know that. What is the most likely culprit is 
irrelevant for the default behaviour, because as long as it 
*could* be data corruption, the runtime cannot by default assume 
that it is not; that would be unsafe.

 Sure, data corruption is one possible cause of out-of-bounds, 
 but data corruption is one possible cause of *ANYTHING*. So 
 just to be safe, let's just abort on all exceptions, and upon 
 everything else for that matter.

No, abort on Errors where the runtime cannot know if data 
corruption has already occurred, i.e. the program is in an 
undefined state. If you, as the programmer, know that it is safe, 
you have to code that in.

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 05:03 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 20:23:21 UTC, Nick Sabalausky (Abscissa) 
 wrote:
 On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
 in general you have to assume that the index *being* out of bounds is 
 itself the *result* of *already occurred* data corruption;

 Of course not, that's absurd. Where do people get the idea that 
 out-of-bounds *implies* pre-existing data corruption?

 
 You assume something I did not write. What I wrote is that the runtime 
 cannot *in general* (i.e. without further information about the 
 semantics of your specific program) assume that it was *not* preexisting 
 data corruption.
 

Ok, fine. However...

 Most of  the time, out-of-bounds comes from a bug (especially in D, 
 what with all of its safeguards).

 
 Unfortunately the runtime has no way to know *if* the out of bounds 
 comes from a bug or a data corruption, which was my point; only a human 
 can know that. What is the most likely culprit is irrelevant for the 
 default behaviour, because as long as it *could* be data corruption, the 
 runtime cannot by default assume that it is not; that would be unsafe.
 

Like I said, *anything* could be the result of data corruption. (And 
with out-of-bounds in particular, it's very rare for the cause to be 
data corruption, especially in D).

If the determining factor for whether or not condition XYZ should abort 
is "*could* it be data corruption?", then ALL conditions must abort, 
because data corruption and undefined state can, by their very nature, 
cause *any* state - heck, even ones that "look" perfectly valid.

So, since that approach is a complete non-starter even in theory, the 
closest thing we *can* reasonably do is instead, use the criteria "is 
this *likely enough* to be data corruption?" (for however we choose to 
define "likely enough").

BUT, in that case, out-of-bounds *still* fails to meet the criteria by a 
longshot. When an out-of-bounds does occur, it's vastly more likely to 
be a bug, not data corruption. Fuck, in all my decades of programming, 
including using D since pre-v1.0, NOT ONCE have ANY of the hundreds, 
maybe thousands, of out-of-bounds I've encountered ever been the result 
of data corruption. NOT ONCE. Not exaggerating. Even as an anecdote, 
that's a FAR cry from being able to reasonably suspect data corruption 
as a likely cause, regardless of where we set the bar for "likely".

 Sure, data corruption is one possible cause of out-of-bounds, but data 
 corruption is one possible cause of *ANYTHING*. So just to be safe, 
 let's just abort on all exceptions, and upon everything else for that 
 matter.

 
 No, abort on Errors where the runtime cannot know if data corruption has 
 already occurred, i.e. the program is in an undefined state.

The runtime can NEVER know that no data corruption has occurred. Let 
me emphasise that: *NEVER*.

By the very nature of data corruption and undefined states, it is NOT 
even theoretically plausible for a runtime to EVER be able to rule out 
data corruption, *not even when things look A-OK*, and hell, not even 
when the algorithm is mathematically proven correct, because, shoot, 
let's just pretend we live in a fantasy world where hardware failures 
are impossible why don't we?

Therefore, if we follow your reasoning (that we must abort whenever data 
corruption is possible), then we must therefore abort all processes 
unconditionally upon creation.

Your approach sounds nice, but it's completely unrealistic.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 [...]

 What are your thoughts? Have you run into this? If so, how did you
 solve it?

 It is not accessing the array out of bounds *leading* to data
 corruption that is the issue here, but that in general you have to
 assume that the index *being* out of bounds is itself the *result* of
 *already occurred* data corruption;

To be blunt, no this is completely wrong. Memory corruption *already 
having happened* can cause any number of errors. The point of bounds 
checking is to prevent memory corruption in the first place. I could 
memory corrupt the length of the array also (assuming a dynamic array), 
and bounds checking merrily does nothing to stop further memory corruption.

 and if data corruption occurred for
 the index, you *cannot* assume that *only* the index has been affected.
 The runtime cannot simply assume the index being out of bounds is not
 the result of already occurred data corruption, because that is
 inherently unsafe, so it *must* terminate asap as the default.

The runtime should not assume that crashing the whole program is 
necessary when an integer is out of range. Preventing actual corruption, 
yes that is good. But an Exception would have done the job just fine.

But that ship, as I said elsewhere, has sailed. We can't change it to 
Exception now, as that would break just about all nothrow code in existence.

 So in your specific use case I would say use a wrapper. This is one of
 the reasons why I am working on my own library for data structures (libds).

That is my conclusion too. Is your library in a usable state? Perhaps we 
should not repeat efforts, though I wasn't planning on making a robust 
public library for it :)

-Steve

May 31 2017

Ali Çehreli <acehreli yahoo.com> writes:

On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:

 It is not accessing the array out of bounds *leading* to data
 corruption that is the issue here, but that in general you have to
 assume that the index *being* out of bounds is itself the *result* of
 *already occurred* data corruption;

 To be blunt, no this is completely wrong.

Blunter: Moritz is right. :)

 Memory corruption *already having happened* can cause any
 number of errors.

True.

 The point of bounds checking is to prevent memory corruption in
 the first place.

That's just one goal. It also maintains an invariant of arrays: The 
index value must be within bounds.

 I could memory corrupt the length of the array also (assuming a
 dynamic array), and bounds checking merrily does nothing to
 stop further memory corruption.

That's true but the language provides no tool to check for that. The 
fact that program correctness is not achievable in general should not 
have any bearing on bounds checking.

 and if data corruption occurred for
 the index, you *cannot* assume that *only* the index has been affected.
 The runtime cannot simply assume the index being out of bounds is not
 the result of already occurred data corruption, because that is
 inherently unsafe, so it *must* terminate asap as the default.

 The runtime should not assume that crashing the whole program is
 necessary when an integer is out of range. Preventing actual corruption,
 yes that is good. But an Exception would have done the job just fine.

How could an Exception work in this case? Catch it and repeat the same 
bug over and over again? What would the program be achieving? (I assume 
the exception handler will not arbitrarily decrease index values.)

Ali

May 31 2017

Ola Fosheim Grøstad writes:

On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:
 How could an Exception work in this case? Catch it and repeat 
 the same bug over and over again? What would the program be 
 achieving? (I assume the exception handler will not arbitrarily 
 decrease index values.)

How is this different from a file system exception?
The file system is memory too...

May 31 2017

Ali Çehreli <acehreli yahoo.com> writes:

On 05/31/2017 02:41 PM, Ola Fosheim Grøstad wrote:
 On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:
 How could an Exception work in this case? Catch it and repeat the same
 bug over and over again? What would the program be achieving? (I
 assume the exception handler will not arbitrarily decrease index
 values.)

 How is this different from a file system exception?
 The file system is memory too...

When you say "memory" I think you refer to the thought of bounds 
checking being for prevention of memory corruption. True, memory 
corruption can happen when the program writes out of bounds but it's one 
special case. The actual reason for bounds checking is maintaining an 
invariant.

Regarding the file system, because it's part of the environment of the 
program, hence the program cannot control, it's correct to throw an 
Exception, in which case the response can be "Cannot open that file; how 
about another one?".

In the case of array indexes, they are in complete control of the 
program, hence a bug when out of bounds. It's not possible to say "Bad 
index; let me try 42 less."

Ali

May 31 2017

Ola Fosheim Grøstad writes:

On Wednesday, 31 May 2017 at 21:57:04 UTC, Ali Çehreli wrote:
 of bounds but it's one special case. The actual reason for 
 bounds checking is maintaining an invariant.

That's true, but that could be the case with file system 
exception too. Say, a file is supposed to be of length N, but you 
get an exception because you are reading past the file end. Same 
issue.

Should you then wipe the entire file system, because there 
appears to be a problem with a single file?

 In the case of array indexes, they are in complete control of 
 the program, hence a bug when out of bounds. It's not possible 
 to say "Bad index; let me try 42 less."

Well, it is possible that the bad indexing was because the input 
was empty and there was a mistake in the program.

One reasonable thing to do is to rollback for that particular 
input, log it as a problem, then continue processing other input.

Which is often better than shutting down the service, but it 
really is contextual.

The real question is, what is the probability of a mismatched 
index for your application being just an indexing problem. I 
think it is very high for most "safe" code.

So if D supports "safe" code well, then indexing issues will most 
likely almost never be due to corruption.

If you only write "unsafe" code, then indexing issues are still 
most likely to not be because of corruption, but the probability 
is much higher.

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Wed, May 31, 2017 at 02:30:05PM -0700, Ali Çehreli via Digitalmars-d wrote:
 On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:

[...]
 The runtime should not assume that crashing the whole program is
 necessary when an integer is out of range. Preventing actual
 corruption, yes that is good. But an Exception would have done the
 job just fine.

 
 How could an Exception work in this case? Catch it and repeat the same
 bug over and over again? What would the program be achieving? (I
 assume the exception handler will not arbitrarily decrease index
 values.)

[...]

In this particular case, the idea is that the fibre that ran into the
bug would throw an Exception to the main loop, which catches it and
terminates the fibre (presumably also sending an error response to the
client browser), while continuing to process other, possibly-ongoing
requests normally.

Rather than having the one bad request triggering the buggy code and
causing *all* currently in-progress requests to terminate because the
entire program has aborted.
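
The control flow described above can be sketched roughly as follows. This is plain D with no vibe.d involved; handle and serveAll are hypothetical stand-ins for a request handler and the server's main loop:

```d
/// Hypothetical request handler: throws Exception on a bad request.
string handle(string req)
{
    if (req.length == 0)
        throw new Exception("empty request");
    return "ok: " ~ req;
}

/// Hypothetical main loop: one failing request produces an error
/// response while the remaining requests are still served.
string[] serveAll(string[] requests)
{
    string[] responses;
    foreach (req; requests)
    {
        try
        {
            responses ~= handle(req);
        }
        catch (Exception e) // deliberately not Throwable: Errors still propagate
        {
            responses ~= "500: " ~ e.msg;
        }
    }
    return responses;
}
```

Note that this only isolates failures reported as Exception. A RangeError from a built-in array would not be caught by this handler, which is why the wrapper type matters for this approach.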

An extreme example of this is if you had a vibe.d server hosting
multiple virtual domains belonging to different customers. It's bad
enough that one customer's service would crash when it encounters a bug,
but it's far worse to have *all* customers' services crash just because
*one* of them encountered a bug.

This is an interesting use case, because conceptually speaking, each
vibe.d fibre actually represents an independent computation, so any
fatal errors like out-of-bounds bugs should cause the termination of the
*fibre*, rather than *everything* that just happens to be running in the
same process.  If vibe.d had been implemented with, say, forked
processes instead, this wouldn't have been an issue.  But of course, the
fibre implementation was chosen for performance (and possibly other)
reasons. Forking would give you the per-request isolation needed to
handle this kind of problem cleanly, but it also comes with a hefty
performance price tag.  Like all things in practical engineering, it's a
tradeoff.

I'd say creating a custom type that throws Exception instead of Error is
probably the best solution here, given what we have.


T

-- 
The fact that anyone still uses AOL shows that even the presence of options
doesn't stop some people from picking the pessimal one. - Mike Ellis

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:45:51 UTC, H. S. Teoh wrote:
 This is an interesting use case, because conceptually speaking, 
 each vibe.d fibre actually represents an independent 
 computation, so any fatal errors like out-of-bounds bugs should 
 cause the termination of the *fibre*, rather than *everything* 
 that just happens to be running in the same process.

While I agree on a theoretical level about the fact that in 
principle only the fibre (and the same argument goes for threads) 
should terminate, the problem is that fibres, as well as threads, 
share the same virtual memory of a process, i.e. memory 
corruption in one fibre (or thread) cannot in general be safely 
contained and kept from spreading to the other fibres (or 
threads; except in the thread case one might argue if you know 
the memory corruption to have happened only in TLS then you can 
kill the thread, but I don't know how you would prove that).
If you cannot be sure that the memory corruption is contained in 
a scope (i.e. a fibre or thread), you must terminate at the 
closest enclosing scope that you know will keep the error from 
escaping further outward to the rest of your system; AFAIK in 
modern operating systems the closest such scope is a process.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 6:36 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 21:45:51 UTC, H. S. Teoh wrote:
 This is an interesting use case, because conceptually speaking, each
 vibe.d fibre actually represents an independent computation, so any
 fatal errors like out-of-bounds bugs should cause the termination of
 the *fibre*, rather than *everything* that just happens to be running
 in the same process.

 While I agree on a theoretical level about the fact that in principle
 only the fibre (and the same argument goes for threads) should
 terminate, the problem is that fibres, as well as threads, share the
 same virtual memory of a process, i.e. memory corruption in one fibre
 (or thread) cannot in general be safely contained and kept from
 spreading to the other fibres (or threads; except in the thread case one
 might argue if you know the memory corruption to have happened only in
 TLS then you can kill the thread, but I don't know how you would prove
 that).

Again, there has not been memory corruption. There is a confusion 
rampant in this thread that preventing *attempted* memory corruption 
must mean there *is* memory corruption. One does not require the other.

-Steve

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer 
wrote:
 Again, there has not been memory corruption.

Again, the runtime *cannot* know that and hence you *cannot* 
claim that. It sees an index out of bounds and it *cannot* reason 
about whether a memory corruption has already occurred or not, 
which means it *must assume* the worst case (it must *assume* 
there was).

 There is a confusion rampant in this thread that preventing 
 *attempted* memory corruption must mean there *is* memory 
 corruption.

No, please no. Nobody has written that in the entire thread even 
once!
- An index being out of bounds is an error (lowercase!).
- The runtime sees that error when the array is accessed (what 
you describe as *attempted* memory corruption).
- The runtime does not know *why* the index is out of bounds
It does *not* mean that there *was* memory corruption (and again, 
nobody claimed that), but the runtime cannot assume that there 
was not, because that is *unsafe*.

 One does not require the other.

Correct, but the runtime has to be safe in the *general* case, so 
it *must* assume the worst in case of a bug.

May 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 01:13, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:
 Again, there has not been memory corruption.

 
 Again, the runtime *cannot* know that and hence you *cannot* claim that. 
 It sees an index out of bounds and it *cannot* reason about whether a 
 memory corruption has already occurred or not, which means it *must 
 assume* the worst case (it must *assume* there was).
 
 There is a confusion rampant in this thread that preventing 
 *attempted* memory corruption must mean there *is* memory corruption.

 
 No, please no. Nobody has written that in the entire thread even once!
 - An index being out of bounds is an error (lowercase!).
 - The runtime sees that error when the array is accessed (what you 
 describe as *attempted* memory corruption).
 - The runtime does not know *why* the index is out of bounds
 It does *not* mean that there *was* memory corruption (and again, nobody 
 claimed that), but the runtime cannot assume that there was not, because 
 that is *unsafe*.
 ...

No, it is perfectly safe, because the language does not guarantee any 
specific behavior in case memory is corrupted. Therefore the language 
can /always/ assume that there is no memory corruption.

 One does not require the other.

 
 Correct, but the runtime has to be safe in the *general* case, so it 
 *must* assume the worst in case of a bug.

Software has bugs. The runtime has no business claiming that the scope 
of any particular bug is the entire service. The practical outcomes of 
this design are just silly. Data is lost, services go down, etc. When in 
doubt, the software should just do what the programmer has written. It 
is not always correct, but it is the best available proxy of the 
desirable behavior.

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 23:50:07 UTC, Timon Gehr wrote:
 No, it is perfectly safe, because the language does not 
 guarantee any specific behavior in case memory is corrupted.

The language not guaranteeing a specific behaviour on memory 
corruption does not imply that assuming a bug was not caused by 
memory corruption is safe.

 Therefore the language can /always/ assume that there is no 
 memory corruption.

That is also not implied.

 One does not require the other.

 
 Correct, but the runtime has to be safe in the *general* case, 
 so it *must* assume the worst in case of a bug.

 Software has bugs. The runtime has no business claiming that 
 the scope of any particular bug is the entire service.

It absolutely has the business of doing exactly that as long as 
you, the programmer, do not tell it otherwise; which you can do 
and is your job.

 The practical outcomes of this design are just silly. Data is 
 lost, services go down, etc. When in doubt, the software should 
 just do what the programmer has written. It is not always 
 correct, but it is the best available proxy of the desirable 
 behavior.

When in doubt about memory corruption, the closest enclosing 
scope that will get rid of the memory corruption must die. The 
current behaviour achieves that in many cases.

May 31 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, May 31, 2017 23:13:35 Moritz Maxeiner via Digitalmars-d wrote:
 On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer
 wrote:
 Again, there has not been memory corruption.

 Again, the runtime *cannot* know that and hence you *cannot*
 claim that. It sees an index out of bounds and it *cannot* reason
 about whether a memory corruption has already occurred or not,
 which means it *must assume* the worst case (it must *assume*
 there was).

Honestly, once a memory corruption has occurred, all bets are off anyway.
The core thing here is that the contract of indexing arrays was violated,
which is a bug. If we're going to argue about whether it makes sense to
change that contract, then we have to discuss the consequences of doing so,
and I really don't see why whether a memory corruption has occurred
previously is relevant. We could easily treat indexing arrays the same as
any other function which chooses to throw an Exception when it's given bad
input. The core difference is whether it's considered okay to give bad
values or whether it's considered a programming bug to pass bad values. In
either case, the runtime has no way of determining the reason for the
failure, and I don't see why passing a bad value to index an array is any
more indicative of a memory corruption than passing an invalid day of the
month to std.datetime's Date when constructing it is indicative of a memory
corruption. In both cases, the input is bad, and the runtime doesn't know
why. It's just that in the array case, the API of arrays requires that the
input be valid, whereas for Date, it's acceptable for bad input to be
passed. So, while I can appreciate that you're trying to argue for us
keeping RangeError (which I agree with), I think that this whole argument
about possible, previous memory corruptions prior to the invalid index being
passed is derailing things.

The issue ultimately is what the consequences are of using an Error vs an
Exception, and _that_ is what we need to discuss.

- Jonathan M Davis

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 23:51:30 UTC, Jonathan M Davis wrote:
 On Wednesday, May 31, 2017 23:13:35 Moritz Maxeiner via 
 Digitalmars-d wrote:
 On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven 
 Schveighoffer wrote:
 Again, there has not been memory corruption.

 Again, the runtime *cannot* know that and hence you *cannot* 
 claim that. It sees an index out of bounds and it *cannot* 
 reason about whether a memory corruption has already occurred 
 or not, which means it *must assume* the worst case (it must 
 *assume* there was).

 Honestly, once a memory corruption has occurred, all bets are 
 off anyway.

Right, and that is why termination when in doubt (and the 
programmer has not done anything to clear that doubt up) is the 
sane choice.

 The core thing here is that the contract of indexing arrays was 
 violated, which is a bug.

I disagree about it being the core issue, because that was 
already established in the OP.

 If we're going to argue about whether it makes sense to change 
 that contract, then we have to discuss the consequences of 
 doing so, and I really don't see why whether a memory 
 corruption has occurred previously is relevant.

Because if such a memory corruption occurred, termination of the 
closest enclosing scope to get rid of it must follow (or your 
entire system can end up corrupted).

 We could easily treat indexing arrays the same as any other 
 function which chooses to throw an Exception when it's given 
 bad input. The core difference is whether it's considered okay 
 to give bad values or whether it's considered a programming bug 
 to pass bad values. In either case, the runtime has no way of 
 determining the reason for the failure, and I don't see why 
 passing a bad value to index an array is any more indicative of 
 a memory corruption than passing an invalid day of the month to 
 std.datetime's Date when constructing it is indicative of a 
 memory corruption. In both cases, the input is bad, and the 
 runtime doesn't know why.

One of those is a library construct, the other is baked into the 
language; it is perfectly fine for the former to use exceptions, 
because it can be easily avoided by anyone; the latter is a 
required component of pretty much everything you can build with D 
and must thus use the stricter contract.

 The issue ultimately is what the consequences are of using an 
 Error vs an Exception, and _that_ is what we need to discuss.

An Exception leads to unwinding&cleanup, an Error to termination 
(with unwinding&cleanup in debug mode for debugging purposes). 
What would you like to discuss here?

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 7:13 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:
 Again, there has not been memory corruption.

 Again, the runtime *cannot* know that and hence you *cannot* claim that.
 It sees an index out of bounds and it *cannot* reason about whether a
 memory corruption has already occurred or not, which means it *must
 assume* the worst case (it must *assume* there was).

Yes, it cannot know at any point whether or not a memory corruption has 
occurred. However, it has a lever to pull to say "your program cannot 
continue, and you have no choice." It chooses to pull this lever on any 
attempt of out of bounds access of an array, regardless of the reason 
why that is happening. The chance that a memory corruption is the cause 
is so low, and it doesn't matter even if it is. The program may already 
have messed up everything by that point. In fact, the current behavior 
of printing the Error message and doing an orderly shutdown is pretty 
risky anyway if we think this is a memory corruption.

There are almost no other environmentally caused errors that cause this 
lever to be pulled. It doesn't make a whole lot of sense that it is.

 There is a confusion rampant in this thread that preventing
 *attempted* memory corruption must mean there *is* memory corruption.

 No, please no. Nobody has written that in the entire thread even once!

"you have to assume that the index *being* out of bounds is itself the 
*result* of *already occurred* data corruption;"

 - An index being out of bounds is an error (lowercase!).
 - The runtime sees that error when the array is accessed (what you
 describe as *attempted* memory corruption).
 - The runtime does not know *why* the index is out of bounds
 It does *not* mean that there *was* memory corruption (and again, nobody
 claimed that), but the runtime cannot assume that there was not, because
 that is *unsafe*.

It's not the runtime's job to determine that the cause of an 
out-of-bounds access could be memory corruption. Its job is to prevent 
the current attempt. Throwing an Error accomplishes this, yes, but it 
also means you must shut down the program. I have no problem at all with 
it preventing the corruption, nor do I have a problem with it throwing 
an Error, per se. The problem I have is that throwing an Error itself 
corrupts the program, and makes it unusable. Therefore, it's the wrong 
tool for that job.

And I absolutely do not think that throwing an Error in this case was 
the result of a careful choice deciding that memory corruption must be 
or even might be the cause. I think it's this way because of the desire 
to write nothrow code without having to pepper your code with try/catch 
blocks.
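
A minimal sketch of that nothrow point (the function names here are made up 
for illustration): because RangeError derives from Error rather than 
Exception, plain indexing is allowed inside a nothrow function, whereas a 
check that throws an Exception has to be caught locally before the function 
can be marked nothrow.

```d
// Compiles as nothrow: a failed bounds check throws RangeError, which is an
// Error, and nothrow only forbids Exceptions from escaping.
int firstElement(const int[] a) nothrow @safe
{
    return a[0];
}

// Hypothetical Exception-based variant: the throw has to be handled inside
// the function, otherwise the compiler rejects the nothrow attribute.
int firstOrDefault(const int[] a, int fallback) nothrow @safe
{
    try
    {
        if (a.length == 0)
            throw new Exception("empty array");
        return a[0];
    }
    catch (Exception e)
    {
        return fallback;
    }
}
```

If indexing threw an Exception, every nothrow function that indexes an array 
would need the second shape.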

 One does not require the other.

 Correct, but the runtime has to be safe in the *general* case, so it
 *must* assume the worst in case of a bug.

It's easy to prove as well that throwing an Exception instead of an 
Error is perfectly safe. My array wrapper is perfectly safe and does not 
throw an Error on bad indexing.
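
For readers who have not seen one, a wrapper of this kind could look roughly 
like the sketch below. It is an illustration only, not the wrapper from this 
thread: the names CheckedArray and IndexException are assumptions, and the 
check is done by hand before indexing through the raw pointer so that the 
built-in RangeError is never triggered.

```d
import std.format : format;

/// Hypothetical exception type; the name is an assumption for this sketch.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) @safe pure nothrow
    {
        super(msg, file, line);
    }
}

/// Hypothetical slice wrapper whose indexing throws a catchable Exception.
struct CheckedArray(T)
{
    private T[] data;

    this(T[] data) @safe { this.data = data; }

    size_t length() const @safe { return data.length; }

    ref T opIndex(size_t i) @trusted
    {
        // Do the bounds check ourselves and report index and length.
        if (i >= data.length)
            throw new IndexException(
                format("index %s is out of bounds for length %s", i, data.length));
        // Index via the pointer: no second, Error-throwing check is emitted.
        return data.ptr[i];
    }
}

unittest
{
    import std.exception : assertThrown;

    auto arr = CheckedArray!int([1, 2, 3]);
    assert(arr[0] == 1);
    assertThrown!IndexException(arr[5]);
}
```

The @trusted annotation is what makes the pointer indexing callable from 
@safe code; its correctness rests entirely on the manual check above it.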

-Steve

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 23:53:11 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 7:13 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven 
 Schveighoffer wrote:
 Again, there has not been memory corruption.

 Again, the runtime *cannot* know that and hence you *cannot* 
 claim that.
 It sees an index out of bounds and it *cannot* reason about 
 whether a
 memory corruption has already occurred or not, which means it 
 *must
 assume* the worst case (it must *assume* there was).

 Yes, it cannot know at any point whether or not a memory 
 corruption has occurred. However, it has a lever to pull to say 
 "your program cannot continue, and you have no choice." It 
 chooses to pull this lever on any attempt of out of bounds 
 access of an array, regardless of the reason why that is 
 happening.

Because assuming the worst is a sane default.

 The chance that a memory corruption is the cause is so low, 
 and it doesn't matter even if it is. The program may already 
 have messed up everything by that point.

True, it might have already corrupted other things; but that is 
no argument for allowing it to continue to potentially corrupt 
even more.

 In fact, the current behavior of printing the Error message 
 and doing an orderly shutdown is pretty risky anyway if we 
 think this is a memory corruption.

AFAIK the orderly shutdown is not guaranteed to be done in 
release mode and I would welcome for thrown errors in release 
mode to simply kill the process immediately.

 There is a confusion rampant in this thread that preventing
 *attempted* memory corruption must mean there *is* memory 
 corruption.

 No, please no. Nobody has written that in the entire thread 
 even once!

 "you have to assume that the index *being* out of bounds is 
 itself the *result* of *already occurred* data corruption;"

Yes, precisely.
I state: "you have to assume that the index *being* out of bounds 
is itself the *result* of *already occurred* data corruption;"
You state: "that preventing *attempted* memory corruption must 
mean there *is* memory corruption"

You state that I claim the memory corruption must definitely have 
occurred, while in contrast I state that one has to *assume* that 
it has occurred. *Not* the same.

 It's not the runtime's job to determine that the cause of an 
 out-of-bounds access could be memory corruption.

That was the job of whoever wrote the runtime, yes.

 Its job is to prevent the current attempt.

That is one of its jobs. The other is to terminate when it 
detects potential memory corruption that the programmer has not 
ruled out.

 The problem I have is that throwing an Error itself corrupts 
 the program, and makes it unusable.

Because the programmer has not taken the steps to assure the 
runtime that memory has not been corrupted, that is the only sane 
choice I see.

 It's easy to prove as well that throwing an Exception instead 
 of an Error is perfectly safe. My array wrapper is perfectly 
 safe and does not throw an Error on bad indexing.

And anyone using such a wrapper implicitly promises that a wrong index 
cannot be the result of memory corruption, which can definitely 
be a sane choice for a lot of use cases, but not as the default 
for the basic building block in the language.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 5:30 PM, Ali Çehreli wrote:
 On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:

 It is not accessing the array out of bounds *leading* to data
 corruption that is the issue here, but that in general you have to
 assume that the index *being* out of bounds is itself the *result* of
 *already occurred* data corruption;

 To be blunt, no this is completely wrong.

 Blunter: Moritz is right. :)

I'll ignore this section of the debate :)

 Memory corruption *already having happened* can cause any
 number of errors.

 True.

 The point of bounds checking is to prevent memory corruption in
 the first place.

 That's just one goal. It also maintains an invariant of arrays: The
 index value must be within bounds.

But the program cannot possibly know which variable is an index. So it 
cannot maintain the invariant until it's actually used.

At that point, it can use throwing an Error to say that something isn't 
right, or it can use throwing an Exception. D chose Error, and the 
consequences of that choice are that you have to check before D checks 
or else your entire program is killed.

 I could memory corrupt the length of the array also (assuming a
 dynamic array), and bounds checking merrily does nothing to
 stop further memory corruption.

 That's true but the language provides no tool to check for that. The
 fact that program correctness is not achievable in general should not
 have any bearing on bounds checking.

My point simply is that assuming corruption is not a good answer. It's a 
good *excuse* for the current behavior, but doesn't really satisfy any 
meaningful requirement.

To borrow from another subthread here, imagine if when you attempted to 
open a non-existent file, the OS assumed that your program must have 
been memory corrupted and killed it instead of returning ENOENT? It 
could be a "reasonable" assumption -- memory corruption could have 
caused that filename to be corrupt, hence you have sniffed out a memory 
corruption and stopped it in its tracks! Well, actually not really, but 
you saw the tracks. Or else, maybe someone made a typo?

 and if data corruption occurred for
 the index, you *cannot* assume that *only* the index has been affected.
 The runtime cannot simply assume the index being out of bounds is not
 the result of already occurred data corruption, because that is
 inherently unsafe, so it *must* terminate asap as the default.

 The runtime should not assume that crashing the whole program is
 necessary when an integer is out of range. Preventing actual corruption,
 yes that is good. But an Exception would have done the job just fine.

 How could an Exception work in this case? Catch it and repeat the same
 bug over and over again? What would the program be achieving? (I assume
 the exception handler will not arbitrarily decrease index values.)

Just like it works for all other exceptions -- you print a reasonable 
message to the offending party (in this case, it would be a 500 error I 
think), and continue executing other things. No memory corruption has 
occurred because bounds checking stopped it, therefore the program is 
still sane.
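
As a rough sketch of that flow (no vibe.d involved; Response, lookup and 
serve are hypothetical names invented for this example), a handler can turn 
a failed check into a 500 response and leave everything else running:

```d
import std.exception : enforce;

/// Hypothetical response type for this sketch.
struct Response
{
    int status;
    string text;
}

/// Checks the index up front and throws an Exception, so the built-in
/// bounds check (and its RangeError) is never reached.
string lookup(string[] items, size_t idx)
{
    enforce(idx < items.length, "no such item");
    return items[idx];
}

/// One request: any Exception is reported as a 500 and does not propagate.
Response serve(string[] items, size_t idx) nothrow
{
    try
    {
        return Response(200, lookup(items, idx));
    }
    catch (Exception e)
    {
        return Response(500, "Internal Server Error");
    }
}
```

Note that this only works because lookup checks before indexing. With a bare 
items[idx] the failure would be a RangeError, which catch (Exception) does 
not intercept.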

-Steve

May 31 2017

Kagamin <spam here.lot> writes:

On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:
 How could an Exception work in this case? Catch it and repeat 
 the same bug over and over again? What would the program be 
 achieving? (I assume the exception handler will not arbitrarily 
 decrease index values.)

Other systems work like this: an internal server error is 
reported to the client, client reports an unexpected error to the 
user, and the action is repeated at the user's discretion.

Jun 01 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven 
 Schveighoffer wrote:
 [...]

 What are your thoughts? Have you run into this? If so, how 
 did you
 solve it?

 It is not accessing the array out of bounds *leading* to 
 data
 corruption that is the issue here, but that in general you 
 have to
 assume that the index *being* out of bounds is itself the 
 *result* of
 *already occurred* data corruption;

 To be blunt, no this is completely wrong.

I disagree.

 Memory corruption *already having happened* can cause any 
 number of errors.

Correct, of which an out-of-bounds array access is *one*.

 The point of bounds checking is to prevent memory corruption in 
 the first place.

That is *one* of the purposes. The other is to abort in case of 
already occurred memory corruption.

 I could memory corrupt the length of the array also (assuming a 
 dynamic array), and bounds checking merrily does nothing to 
 stop further memory corruption.

Yes, that is one case against which out-of-bounds checks do not help; 
but that changes nothing for the case we were talking about.

 The runtime should not assume that crashing the whole program 
 is necessary when an integer is out of range.

Without *any* other information, I think it should.

 Preventing actual corruption, yes that is good. But an 
 Exception would have done the job just fine.

If it were only about further memory corruption, yes, but as I 
said, my argument about preexisting corruption remains.

 But that ship, as I said elsewhere, has sailed. We can't change 
 it to Exception now, as that would break just about all nothrow 
 code in existence.

Sure.

 So in your specific use case I would say use a wrapper. This 
 is one of
 the reasons why I am working on my own library for data 
 structures (libds).

 That is my conclusion too. Is your library in a usable state?

Well, since I really needed only a single data structure at the 
time, it only contains a binary heap so far, but I believe it to 
be usable. I intend to add a dynamic array implementation next.

 Perhaps we should not repeat efforts, though I wasn't planning 
 on making a robust public library for it :)

Well, you can take a look at the binary heap implementation[1] 
and decide if that is a style you are interested in, but it does 
currently use errors for things such as removing an element when 
the heap is empty; I am not sure yet what I intend to do there, 
but I might make it configurable via the Conf template parameter 
in a design-by-introspection style.

[1] https://github.com/Calrama/libds

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 
 But that ship, as I said elsewhere, has sailed. We can't change it to 
 Exception now, as that would break just about all nothrow code in 
 existence.
 

This is why the runtime needs to guarantee that normal unwinding/cleanup 
*does* occur on Error (barring actual corruption or physical 
impossibilities, obviously).

May 31 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, May 31, 2017 22:24:16 Nick Sabalausky  via Digitalmars-d 
wrote:
 On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 But that ship, as I said elsewhere, has sailed. We can't change it to
 Exception now, as that would break just about all nothrow code in
 existence.

 This is why the runtime needs to guarantee that normal unwinding/cleanup
 *does* occur on Error (barring actual corruption or physical
 impossibilities, obviously).

It is my understanding that with how nothrow is implemented, that's not
actually possible. The compiler takes advantage of nothrow to optimize out
the exception handling code where possible. To force it to stay just to try
and clean up when an Error is thrown would defeat the performance gains that
we get with nothrow.

Besides, it's highly debatable that you're actually better off cleaning up
when an Error is thrown, because it largely depends on what has gone wrong.
In some cases, it _would_ be better if clean-up occurred, whereas in others,
it's just making matters worse.

What we currently have is a weird hybrid. When an Error is thrown, _some_ of
the clean-up is done, but not all. Whether that's worse than doing no
clean-up is debatable, but regardless, due to nothrow, we can't do all of
the clean-up, so relying on all of the clean-up occurring is error-prone.
And pretty much the only reason that _any_ clean-up is done when an Error is
thrown is because someone implemented it when Walter wasn't looking.

The reality of the matter though is that no matter what we do, a completely
robust program must be able to deal with the fact that it could be killed at
any time (e.g. due to a power outage) - not that it needs to function
perfectly when it gets killed, but for stuff like database consistency, you
can't rely on the program dying gracefully to avoid data corruption.

- Jonathan M Davis

May 31 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 5/31/2017 7:39 PM, Jonathan M Davis via Digitalmars-d wrote:
 The reality of the matter though is that no matter what we do, a completely
 robust program must be able to deal with the fact that it could be killed at
 any time (e.g. due to a power outage) - not that it needs to function
 perfectly when it gets killed, but for stuff like database consistency, you
 can't rely on the program dying gracefully to avoid data corruption.

Everything about a network is unreliable, so any reliable system must have 
baked into it the ability to cleanly redo any transaction that failed partway 
through it. Trying to have the software ignore serious bugs in order to 
complete a transaction is a doomed approach.

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 10:39 PM, Jonathan M Davis via Digitalmars-d wrote:
 On Wednesday, May 31, 2017 22:24:16 Nick Sabalausky  via Digitalmars-d
 wrote:
 On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 But that ship, as I said elsewhere, has sailed. We can't change it to
 Exception now, as that would break just about all nothrow code in
 existence.

 This is why the runtime needs to guarantee that normal unwinding/cleanup
 *does* occur on Error (barring actual corruption or physical
 impossibilities, obviously).

 
 It is my understanding that with how nothrow is implemented, that's not
 actually possible. The compiler takes advantage of nothrow to optimize out
 the exception handling code where possible. To force it to stay just to try
 and clean up when an Error is thrown would defeat the performance gains that
 we get with nothrow.
 
 Besides, it's highly debatable that you're actually better off cleaning up
 when an Error is thrown, because it largely depends on what has gone wrong.
 In some cases, it _would_ be better if clean-up occurred, whereas in others,
 it's just making matters worse.
 
 What we currently have is a weird hybrid. When an Error is thrown, _some_ of
 the clean-up is done, but not all. Whether that's worse than doing no
 clean-up is debatable, but regardless, due to nothrow, we can't do all of
 the clean-up, so relying on all of the clean-up occurring is error-prone.
 And pretty much the only reason that _any_ clean-up is done when an Error is
 thrown is because someone implemented it when Walter wasn't looking.
 
 The reality of the matter though is that no matter what we do, a completely
 robust program must be able to deal with the fact that it could be killed at
 any time (e.g. due to a power outage) - not that it needs to function
 perfectly when it gets killed, but for stuff like database consistency, you
 can't rely on the program dying gracefully to avoid data corruption.
 
 - Jonathan M Davis

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
 So in your specific use case I would say use a wrapper. This is one of
 the reasons why I am working on my own library for data structures 
 (libds).

 
 That is my conclusion too.

Honestly, I really think that if there is need to wrap something as 
basic as "all arrays in a codebase" then it's clear something in the 
language has gone horribly wrong.

But short of actually *fixing* D's broken concept of Error, I don't see 
a better solution either.

May 31 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, May 31, 2017 22:33:43 Nick Sabalausky  via Digitalmars-d 
wrote:
 On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
 So in your specific use case I would say use a wrapper. This is one of
 the reasons why I am working on my own library for data structures
 (libds).

 That is my conclusion too.

 Honestly, I really think that if there is need to wrap something as
 basic as "all arrays in a codebase" then it's clear something in the
 language has gone horribly wrong.

 But short of actually *fixing* D's broken concept of Error, I don't see
 a better solution either.

Using an Exception to signal a programming bug and then potentially trying
to recover from it is like trying to recover from a segfault. It really
doesn't make sense.

Yes, it's annoying when you have a bug that kills your program, and even
when you do solid testing, you're unlikely to have found everything, but the
solution to a bug is to fix the bug, not try and have your program limp
along in an unknown state.

Yes, there may be cases where array indices are effectively coming from user
input, and you're going to have to check them all rather than the code
having been written in a way that guarantees that the indices are valid, and
in those cases, wrapping an array to do the checks may make sense, but in
the vast majority of programs, invalid indices should simply never happen -
just like dereferencing a null pointer should simply never happen - and if
it does happen, it's a bug. So, treating it like bad user input as the
default really doesn't make sense. Just fix the bug and move on, and over
time, such problems will go away, because you'll have found the bugs and
fixed them. And if you're consistently not finding them while testing, then
maybe you need to do more and/or better testing.

I can totally understand how it can be frustrating when a bug results in
your program being killed, but it's far better for it to be in your face so
that you find it and fix it rather than letting your program limp along and
potentially have problems later down the line that are disconnected from the
original bug and thus far harder to track down.

- Jonathan M Davis

May 31 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 05/31/2017 10:50 PM, Jonathan M Davis via Digitalmars-d wrote:
 On Wednesday, May 31, 2017 22:33:43 Nick Sabalausky  via Digitalmars-d
 wrote:
 On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
 On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
 So in your specific use case I would say use a wrapper. This is one of
 the reasons why I am working on my own library for data structures
 (libds).

 That is my conclusion too.

 Honestly, I really think that if there is need to wrap something as
 basic as "all arrays in a codebase" then it's clear something in the
 language has gone horribly wrong.

 But short of actually *fixing* D's broken concept of Error, I don't see
 a better solution either.

 
 Using an Exception to signal a programming bug and then potentially trying
 to recover from it is like trying to recover from a segfault. It really
 doesn't make sense.
 
 Yes, it's annoying when you have a bug that kills your program, and even
 when you do solid testing, you're unlikely to have found everything, but the

Exception thrown != "OMG NOTHING ABOUT ANY BRANCH OF THE PROGRAM CAN BE 
REASONED ABOUT OR RELIED UPON ANYMORE!!!!"

Your argument only applies for spaghetti code. Normal code is 
compartmentalized. Different subsystems and all that jazz. Just because 
one thing fails in one box, doesn't mean we gotta nuke the whole friggin 
industrial park and rebuild.

 solution to a bug is to fix the bug,

Obviously. But that's not the question. The question is: What do you do 
in the meantime? Do you quarantine 12 states and a neighboring country 
because somebody coughed until the threat is neutralized, or should the 
response actually match the threat?

 not try and have your program limp
 along in an unknown state.

False dichotomy. Exception causes are usually very localized. There is 
no "unknown state" outside of that tiny little already-quarantined box.


 Yes, there may be cases where array indices are effectively coming from user
 input, and you're going to have to check them all rather than the code
 having been written in a way that guarantees that the indices are valid, and
 in those cases, wrapping an array to do the checks may make sense, but in
 the vast majority of programs, invalid indices should simply never happen -
 just like dereferencing a null pointer should simply never happen - and if
 it does happen, it's a bug.

Yes, it's a bug. A *localized* bug. NOT RAMPANT MEMORY CORRUPTION.

May 31 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, May 31, 2017 23:20:54 Nick Sabalausky  via Digitalmars-d 
wrote:
 On 05/31/2017 10:50 PM, Jonathan M Davis via Digitalmars-d wrote:
 Yes, there may be cases where array indices are effectively coming from
 user input, and you're going to have to check them all rather than the
 code having been written in a way that guarantees that the indices are
 valid, and in those cases, wrapping an array to do the checks may make
 sense, but in the vast majority of programs, invalid indices should
 simply never happen - just like dereferencing a null pointer should
 simply never happen - and if it does happen, it's a bug.

 Yes, it's a bug. A *localized* bug. NOT RAMPANT MEMORY CORRUPTION.

Indexing an array with an invalid index is the same as violating any
contract in D except that you get a RangeError instead of an AssertError,
and the check is always in place in @safe code (even with -release) in order
to avoid memory corruption. As soon as the contract is violated, the program
is in an unknown state. Its logic is clearly wrong, and the assumptions
that it's making may or may not be valid. So, continuing may or may not be
safe.

Whether memory corruption is involved is irrelevant. The program violated
the contract, so the runtime knows that the program is in an invalid state.
The cause of that bug may or may not be localized, but it's a guarantee at
that point that the program is wrong, so you can't rely on it doing the
right thing.

Yes, we _could_ have made it so that the contract of indexing arrays in D
was such that passing an invalid index was considered normal and then have
it throw an Exception to indicate that bad input had been given. But that
means that that code can no longer be nothrow (which does mean that it can't
be optimized as well), and programs would then need to deal with the fact
that indexing an array could throw and handle that case appropriately. For
the vast majority of programs, most array indices do not come from user
input, and thus it usually really doesn't make sense to treat passing an
invalid index to an array as anything other than a bug. It's reasonable to
expect the programmer to get it right and that if they don't, they'll find
it during testing.

If you want to wrap indexing arrays so that you get an Exception, then fine.
At that point, you're saying that it's not a program bug to be passed an
invalid index, and you're writing your programs with the idea that they need
to be able to handle and recover from such bad input. But that is not the
contract that the language itself uses precisely because indexing an array
with an invalid index is usually a bug and not bad program input, and in the
case where the array index _does_ somehow come from user input, then the
programmer can test it. But having the runtime throw an Exception for what
is normally a program bug would harm the programs that actually got their
indices right.

- Jonathan M Davis

May 31 2017

Ola Fosheim Grøstad writes:

On Thursday, 1 June 2017 at 05:03:17 UTC, Jonathan M Davis wrote:
 Whether memory corruption is involved is irrelevant. The 
 program violated the contract, so the runtime knows that the 
 program is in an invalid state. The cause of that bug may or 
 may not be localized, but it's a guarantee at that point that 
 the program is wrong, so you can't rely on it doing the right 
 thing.

Well, if you take this position then you should not only crash 
the program, but also delete the executable to prevent it from 
being run again.

Allowing the process to be restarted when you know that it 
contains logic errors breaks with the principles you are 
outlining.

 handle that case appropriately. For the vast majority of 
 programs, most array indices do not come from user input, and 
 thus it usually really doesn't make sense to treat passing an 
 invalid index to an array as anything other than a bug. It's 
 reasonable to expect the programmer to get it right and that if 
 they don't, they'll find it during testing.

It is surprisingly common to forget to check for a field/file 
being empty in a service. So it makes a lot of sense to roll back 
for such errors and keep the service alive. In my experience this 
is the common scenario. And indexing an array is no different 
than asking for a key that doesn't exist in any other 
data-structure, array shouldn't be a special case. Does that mean 
that other ADTs also should throw Error and not Exception?

For instance, assume you have a chat-server and the supplied 
clients work fine. Then some guy decides to reverse engineer it 
and build his own client. You don't want that service to go down 
all the time. You want to shut out that specific client. You want 
to identify the client and block it.
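
A minimal sketch of that idea, assuming the index arrives as part of a client message. The names `blocked` and `handleMessage` are hypothetical and not taken from any framework; the point is only that input-derived indices are validated with `enforce`, so a bad client produces an Exception that can be handled per client.

```d
import std.exception : enforce;

bool[string] blocked; // hypothetical per-server block list keyed by client id

/// Hypothetical handler: the field index comes from the client, so it is
/// validated as input instead of being trusted as an array index.
string handleMessage(string clientId, string[] fields, size_t wanted)
{
    enforce(clientId !in blocked, "client is blocked");
    enforce(wanted < fields.length, "malformed message");
    return fields[wanted];
}

void main()
{
    try
    {
        handleMessage("rogue", ["hello"], 5);
    }
    catch (Exception e)
    {
        blocked["rogue"] = true; // shut out only the offending client
    }

    assert("rogue" in blocked);
    assert(handleMessage("good", ["hello"], 0) == "hello");
}
```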

Jun 01 2017

Kagamin <spam here.lot> writes:

On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer 
wrote:
 To be blunt, no this is completely wrong. Memory corruption 
 *already having happened* can cause any number of errors. The 
 point of bounds checking is to prevent memory corruption in the 
 first place.

Sad reality is that d programmers are still comfortable writing 
code in 70s style playing with void* pointers and don't enable 
bound checks early enough, see 
https://issues.dlang.org/show_bug.cgi?id=13367

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 10:11:19AM +0000, Kagamin via Digitalmars-d wrote:
[...]
 Sad reality is that d programmers are still comfortable writing code
 in 70s style playing with void* pointers and don't enable bound checks
 early enough, see https://issues.dlang.org/show_bug.cgi?id=13367

Huh? There is no void* in that bug report, and it was closed 3 years
ago. What's your point?


T

-- 
Ph.D. = Permanent head Damage

Jun 01 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer 
wrote:
 That is my conclusion too. Is your library in a usable state? 
 Perhaps we should not repeat efforts, though I wasn't planning 
 on making a robust public library for it :)

After some consideration you can now find the (dynamic) array 
implementation here[1].
With regards to (usage) errors: The data structures in libds 
allow passing an optional function `attest` via the template 
parameter `Hook` (DbI). `attest` is passed the data structure (by 
ref, for logging purposes) and a boolean value and must only 
return successfully if the value is true; if it is false, 
`attest` must throw something (e.g. an Exception), or terminate 
the process.
An example of how to use it is here[2].
If no `attest` is passed, the data structures default to throwing 
an AssertError.
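
To illustrate the shape of such a hook, here is a rough sketch. It is not the libds code: `Buffer`, `DefaultHook` and `ThrowingHook` are made-up names, and only the `attest(ds, ok)` contract described above is carried over.

```d
import core.exception : AssertError;

/// Hypothetical empty hook: selects the default behaviour (AssertError).
struct DefaultHook {}

/// Hypothetical hook that turns usage errors into a recoverable Exception.
struct ThrowingHook
{
    static void attest(DS)(ref DS ds, bool ok)
    {
        if (!ok)
            throw new Exception("usage error on " ~ DS.stringof);
    }
}

/// Hypothetical container; it mirrors the described design, not libds itself.
struct Buffer(T, Hook = DefaultHook)
{
    T[] items;

    private void check(bool ok)
    {
        // Design by introspection: use Hook.attest only if the hook has one.
        static if (__traits(hasMember, Hook, "attest"))
            Hook.attest(this, ok);
        else if (!ok)
            throw new AssertError("usage error", __FILE__, __LINE__);
    }

    ref T opIndex(size_t i)
    {
        check(i < items.length);
        return items[i];
    }
}

void main()
{
    auto strict = Buffer!(int, ThrowingHook)([1, 2, 3]);
    assert(strict[0] == 1);

    bool recovered;
    try
    {
        strict[3] = 0;
    }
    catch (Exception e)
    {
        recovered = true;
    }
    assert(recovered);
}
```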

[1] 
https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/linear/array/dynamic.d
[2] 
https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/tree/heap/binary.d#L381

Jun 03 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer

 wrote:
 [...]

 What are your thoughts? Have you run into this? If so, how did
 you solve it?

 It is not that accessing the array out of bounds *leading* to
 data corruption that is the issue here, but that in general you
 have to assume that the index *being* out of bounds is itself the
 *result* of *already occurred* data corruption; and if data
 corruption occurred for the index, you *cannot* assume that
 *only* the index has been affected. The runtime cannot simply
 assume the index being out of bounds is not the result of already
 occurred data corruption, because that is inherently unsafe, so
 it *must* terminate asap as the default.

 If you get the index as the input to your process - and thus
 *know* that it being out of bounds is not the result of previous
 data corruption - then you should check this yourself before
 accessing the array and handle it appropriately (e.g. via
 Exception).

I don't think that you even need to worry about whether memory corruption
occurred prior to indexing the array with an invalid index. The fact that
the array was indexed with an invalid index is a bug. What caused the bug
depends entirely on the code. Whether it's a memory corruption or something
else is irrelevant. The contract of indexing arrays is that only valid
indices be passed. If an invalid index has been passed, then the contract
has been violated, and by definition, there's a bug in the program, so the
runtime has no choice but to throw an Error or otherwise kill the program.
Given the contract, the only alternative would be to use assertions and only
check when not compiling with -release, but that would be a serious problem
for @safe code, and it really wouldn't help Steven's situation. Either way,
the contract of indexing arrays is such that passing an invalid index is a
bug, and no program should be doing it. The reason that the index is invalid
is pretty much irrelevant to the discussion. It's a bug regardless.

We _could_ make it so that the contract of indexing arrays is such that
you're allowed to pass invalid values, but then the runtime would _always_
have to check the indices (even in @system code), and arrays in general
could never be used in code that was nothrow without a bunch of extra
try-catch blocks. It would be like how auto-decoding and UTFException screws
over our ability to have nothrow code with strings, only it would be for
_all_ arrays. So, the result would be annoying for a lot of code as well as
less efficient.

The vast majority of array code is written in a way that invalid indices are
simply never used, and having it so that indexing an array could throw an
Exception would cause serious problems for a lot of code - especially when
the code is already written in a way that such an exception will never be
thrown (similar to how format can't be nothrow even when you know you've
passed the correct arguments, and it will never throw).

As such, it really doesn't make sense to force all programs to deal with
arrays throwing Exceptions due to bad indices. If a program can't guarantee
that it's going to be passing a valid index to an array, then it needs to
validate the index first. And if that needs to be done frequently, it makes
a lot of sense to either create a wrapper function for indexing arrays which
does the check or to outright wrap arrays such that opIndex on that type
does the check and throws an Exception before the invalid index is passed to
the array. And if the wrapper function is @trusted, it _should_ make it so
that druntime doesn't check the index, avoiding having redundant checks.
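
A minimal sketch of the second option, wrapping the array so that opIndex validates the index itself. `Checked` and `IndexException` are hypothetical names for this illustration and are not part of druntime or Phobos; the access goes through `.ptr` on the assumption that this skips the built-in bounds check once the index has been validated.

```d
/// Hypothetical exception type for this sketch.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) pure nothrow @safe
    {
        super(msg, file, line);
    }
}

/// Hypothetical wrapper: a bad index becomes a recoverable Exception.
struct Checked(T)
{
    T[] data;

    ref T opIndex(size_t i) @trusted
    {
        if (i >= data.length)
            throw new IndexException("index out of bounds");
        // The index was validated above, so index through the pointer,
        // which is not bounds-checked.
        return data.ptr[i];
    }

    size_t length() const { return data.length; }
}

void main()
{
    auto arr = Checked!int(new int[3]);
    arr[1] = 5;
    assert(arr[1] == 5);

    bool caught;
    try
    {
        arr[3] = 5;
    }
    catch (IndexException e)
    {
        caught = true;
    }
    assert(caught);
}
```

Code that indexes a `Checked!T` can no longer be nothrow without a try-catch, which is the trade-off described above, but it applies only where the wrapper is used.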

I can understand Steven's frustration, but I really think that we're better
off the way it is now, even if it's not ideal for his current use case.

- Jonathan M Davis

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 22:42:30 UTC, Jonathan M Davis wrote:
 I don't think that you even need to worry about whether memory 
 corruption occurred prior to indexing the array with an invalid 
 index. The fact that the array was indexed with an invalid 
 index is a bug. What caused the bug depends entirely on the 
 code. Whether it's a memory corruption or something else is 
 irrelevant. The contract of indexing arrays is that only valid 
 indices be passed. [...]

That is correct (and that was even mentioned in the OP), but from 
my PoV the argument was about whether that contract is sensible 
the way it is, so I was arguing for why I think the contract is 
good as it is.
*The contract says so* is not an argument supporting the case of 
*why* the contract is the way it is.


 We _could_ make it so that the contract of indexing arrays is 
 such that you're allowed to pass invalid values, but then [...]

Another reason as to why I support the current contract.

 As such, it really doesn't make sense to force all programs to 
 deal with arrays throwing Exceptions due to bad indices. If a 
 program can't guarantee that it's going to be passing a valid 
 index to an array, then it needs to validate the index first.
 And if that needs to be done frequently, it makes a lot of 
 sense to either create a wrapper function for indexing arrays 
 which does the check or to outright wrap arrays such that 
 opIndex on that type does the check and throws an Exception 
 before the invalid index is passed to the array. And if the 
 wrapper function is @trusted, it _should_ make it so that 
 druntime doesn't check the index, avoiding having redundant 
 checks.

Precisely, and that is why I stated that I think he should use a 
wrapper.

 I can understand Steven's frustration, but I really think that 
 we're better off the way it is now, even if it's not ideal for 
 his current use case.

I agree.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
 On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer

 wrote:
 [...]

 What are your thoughts? Have you run into this? If so, how did
 you solve it?

 It is not that accessing the array out of bounds *leading* to
 data corruption that is the issue here, but that in general you
 have to assume that the index *being* out of bounds is itself the
 *result* of *already occurred* data corruption; and if data
 corruption occurred for the index, you *cannot* assume that
 *only* the index has been affected. The runtime cannot simply
 assume the index being out of bounds is not the result of already
 occurred data corruption, because that is inherently unsafe, so
 it *must* terminate asap as the default.

 If you get the index as the input to your process - and thus
 *know* that it being out of bounds is not the result of previous
 data corruption - then you should check this yourself before
 accessing the array and handle it appropriately (e.g. via
 Exception).

 I don't think that you even need to worry about whether memory corruption
 occurred prior to indexing the array with an invalid index. The fact that
 the array was indexed with an invalid index is a bug. What caused the bug
 depends entirely on the code. Whether it's a memory corruption or something
 else is irrelevant. The contract of indexing arrays is that only valid
 indices be passed. If an invalid index has been passed, then the contract
 has been violated, and by definition, there's a bug in the program, so the
 runtime has no choice but to throw an Error or otherwise kill the program.
 Given the contract, the only alternative would be to use assertions and only
 check when not compiling with -release, but that would be a serious problem
 for @safe code, and it really wouldn't help Steven's situation. Either way,
 the contract of indexing arrays is such that passing an invalid index is a
 bug, and no program should be doing it. The reason that the index is invalid
 is pretty much irrelevant to the discussion. It's a bug regardless.

Yes, it's definitely a bug, and that is not something I'm arguing 
against. The correct handling is to throw something, and prevent the 
corruption in doing so.

The problem is that the act of throwing itself makes the program 
unusable after that. I'm not on Nick's side saying that everything 
should be Exception, especially not out of memory.

But the result of throwing an Error means your entire program has now 
been *made* invalid, even if it wasn't before. Therefore you must close 
it. I feel this is a mistake. A bad index can come from anywhere, and to 
assume it's from memory corruption is a huge leap.

What would have been nice is to have a level between Error and 
Exception, and to throw that when a bug such as this occurs. Something 
that a framework can catch, but @safe code couldn't. I feel that when 
these decisions were made, the concept of a single-process fiber-based 
server wasn't planned for.

 We _could_ make it so that the contract of indexing arrays is such that
 you're allowed to pass invalid values, but then the runtime would _always_
 have to check the indices (even in @system code), and arrays in general
 could never be used in code that was nothrow without a bunch of extra
 try-catch blocks. It would be like how auto-decoding and UTFException screws
 over our ability to have nothrow code with strings, only it would be for
 _all_ arrays. So, the result would be annoying for a lot of code as well as
 less efficient.

Right, we can't pick that path now anyway. Too much code would break.

 The vast majority of array code is written in a way that invalid indices are
 simply never used, and having it so that indexing an array could throw an
 Exception would cause serious problems for a lot of code - especially when
 the code is already written in a way that such an exception will never be
 thrown (similar to how format can't be nothrow even when you know you've
 passed the correct arguments, and it will never throw).

 As such, it really doesn't make sense to force all programs to deal with
 arrays throwing Exceptions due to bad indices. If a program can't guarantee
 that it's going to be passing a valid index to an array, then it needs to
 validate the index first. And if that needs to be done frequently, it makes
 a lot of sense to either create a wrapper function for indexing arrays which
 does the check or to outright wrap arrays such that opIndex on that type
 does the check and throws an Exception before the invalid index is passed to
 the array. And if the wrapper function is @trusted, it _should_ make it so
 that druntime doesn't check the index, avoiding having redundant checks.

 I can understand Steven's frustration, but I really think that we're better
 off the way it is now, even if it's not ideal for his current use case.

It just means that D is an inferior platform for a web framework, unless 
you use the process-per-request model so the entire thing doesn't go 
down for one page request. But that obviously is going to cause 
performance problems.

Which is unfortunate, because vibe.d is a great platform for web 
development, other than this. You could go Adam's route and just put the 
blinders on, but I think that's not a sustainable practice.

-Steve

Jun 01 2017

rikki cattermole <rikki cattermole.co.nz> writes:

I'm just sitting here waiting for shared libraries to be properly 
implemented cross platform.
Then I can start thinking about a proper web server written in D.

Until then, we are not really suited to become a generic web server and 
should only exist in the context of multiple instances (and restart-able).

Jun 01 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Thursday, June 01, 2017 06:13:25 Steven Schveighoffer via Digitalmars-d 
wrote:
 It just means that D is an inferior platform for a web framework, unless
 you use the process-per-request model so the entire thing doesn't go
 down for one page request. But that obviously is going to cause
 performance problems.

 Which is unfortunate, because vibe.d is a great platform for web
 development, other than this. You could go Adam's route and just put the
 blinders on, but I think that's not a sustainable practice.

Honestly, unless something about vibe.d prevents fixing bugs like bad array
indices, I'd just use vibe.d and program normally, and if a problem like you
hit occurs, I'd fix it, and then that wouldn't crash the program anymore.
Depending on how many such logic errors got past the testing process, it
could take a while before the server was stable enough, or it could be very
little time at all, but in the long run, there wouldn't be any invalid array
indices, because those bugs would have been fixed, and there wouldn't be a
problem anymore.

Now, if there's something about vibe.d that outright prevents fixing these
bugs or makes them impossible to avoid, then that calls for a different
approach, but from what I understand of the situation, I don't see anything
here preventing using vibe.d's approach with fibers. It's just that missed
bugs will be very annoying until they're fixed, but that's true of most
programs.

- Jonathan M Davis

Jun 01 2017

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer 
wrote:
 Which is unfortunate, because vibe.d is a great platform for 
 web development, other than this. You could go Adam's route and 
 just put the blinders on, but I think that's not a sustainable 
 practice.

If you control the deployment, it works perfectly well. You 
aren't being blind to it, you are just taking control.

I prefer to use processes anyway (they are easier to use, 
compatible with more libraries, considerably more reliable, and 
perform quite well - we don't have to spin up a new perl 
interpreter, 1999 was a long time ago), but fibers can handle 
RangeError too as long as you never use -release and such.
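
For illustration, a rough sketch of catching RangeError at a per-request boundary. `handler` and `serveOne` are made-up names, not cgi.d or vibe.d API, and this assumes bounds checks are enabled (no -release on @system code, no -boundscheck=off); without them the bad write is simply undefined behaviour.

```d
import core.exception : RangeError;
import std.stdio : writeln;

/// Stand-in for a buggy request handler; the index is a runtime value.
void handler(int[] data, size_t i)
{
    data[i] = 5; // throws RangeError when i >= data.length and checks are on
}

/// Hypothetical per-request boundary: returns false if the handler
/// hit a RangeError.
bool serveOne(int[] data, size_t i)
{
    try
    {
        handler(data, i);
        return true;
    }
    catch (RangeError e)
    {
        // Catching an Error compiles, but the language does not guarantee
        // that cleanup code ran while the stack was unwinding.
        writeln("request failed: ", e.msg);
        return false;
    }
}

void main()
{
    auto arr = new int[3];
    assert(serveOne(arr, 1));
    assert(!serveOne(arr, 3));
}
```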

Jun 01 2017

Jacob Carlborg <doob me.com> writes:

On 2017-06-01 12:13, Steven Schveighoffer wrote:

 It just means that D is an inferior platform for a web framework, unless
 you use the process-per-request model so the entire thing doesn't go
 down for one page request. But that obviously is going to cause
 performance problems.

You can do a combination of both. One request per fiber and as many 
instances of your program as cores. That will utilize the hardware 
better. I've noticed that the multi-threading in vibe.d doesn't seem to 
work. If one process goes down, all those requests are lost, but you can 
still handle new requests. Combine that with automatically restarting 
the processes if they crash.

-- 
/Jacob Carlborg

Jun 01 2017

aberba <karabutaworld gmail.com> writes:

On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
 On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via 
 Digitalmars-d wrote:


 It just means that D is an inferior platform for a web 
 framework, unless you use the process-per-request model so the 
 entire thing doesn't go down for one page request. But that 
 obviously is going to cause performance problems.

 Which is unfortunate, because vibe.d is a great platform for 
 web development, other than this. You could go Adam's route and 
 just put the blinders on, but I think that's not a sustainable 
 practice.

 -Steve

I'm glad I know enough to know this is an opinion...

anyway, it's better to run a vibe.d instance in something like a 
daemonized package. You should also use the vibe.d error handlers.

Jun 01 2017

aberba <karabutaworld gmail.com> writes:

On Friday, 2 June 2017 at 00:15:39 UTC, aberba wrote:
 On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer 
 wrote:
 On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
 [...]


 It just means that D is an inferior platform for a web 
 framework, unless you use the process-per-request model so the 
 entire thing doesn't go down for one page request. But that 
 obviously is going to cause performance problems.

 Which is unfortunate, because vibe.d is a great platform for 
 web development, other than this. You could go Adam's route 
 and just put the blinders on, but I think that's not a 
 sustainable practice.

 -Steve

 I'm glad I know enough to know this is an opinion...

 anyway, its better to run a vibe.d instance in something like 
 daemonized package. You should also use the vibe.d error 
 handlers.

Here is Daemonise 
https://github.com/NCrashed/daemonize/blob/master/examples/03.Vibed/README.md
for running it as a daemon. Offers some control

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 8:15 PM, aberba wrote:
 On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer wrote:
 On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
 On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d
 wrote:


 It just means that D is an inferior platform for a web framework,
 unless you use the process-per-request model so the entire thing
 doesn't go down for one page request. But that obviously is going to
 cause performance problems.

 Which is unfortunate, because vibe.d is a great platform for web
 development, other than this. You could go Adam's route and just put
 the blinders on, but I think that's not a sustainable practice.

 I'm glad I know enough to know this is an opinion...

Don't get me wrong, I think D will be better than other frameworks for 
those who are willing to work with the warts. But the perception is 
going to be that D web frameworks are too fragile -- one miswritten 
handler, and your whole webserver dies. DOS attacks will be easy with D 
web frameworks, even if you have distributed handling.

 anyway, its better to run a vibe.d instance in something like daemonized
 package. You should also use the vibe.d error handlers.

I found the way to restart it using systemd, so that part should be 
handled. Now, I need to push up moving my session handling into a 
persistent storage (just using the memory storage for now).

-Steve

Jun 02 2017

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 2 June 2017 at 12:33:17 UTC, Steven Schveighoffer 
wrote:
 But the perception is going to be that D web frameworks are too 
 fragile -- one miswritten handler, and your whole webserver 
 dies.

Correction: "vibe.d frameworks" are fragile. This isn't D 
specific - my cgi.d is resilient to this (and more) and has been 
since 2008 since it uses a process pool. Simple solution that 
works very well.

Might not handle 10,000 concurrent connections... but you very 
rarely actually have to.

Jun 02 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 02.06.2017 15:24, Adam D. Ruppe wrote:
 On Friday, 2 June 2017 at 12:33:17 UTC, Steven Schveighoffer wrote:
 But the perception is going to be that D web frameworks are too 
 fragile -- one miswritten handler, and your whole webserver dies.

 
 Correction: "vibe.d frameworks" are fragile. This isn't D specific - my 
 cgi.d is resilient to this (and more) and has been since 2008 since it 
 uses a process pool. Simple solution that works very well.
 
 Might not handle 10,000 concurrent connections... but you very rarely 
 actually have to.

I'm not convinced that public perception is sensitive to such details. ;)

Jun 02 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 This is like the  equivalent of having a guard rail on a road 
 not only stop you from going off the cliff but proactively 
 disable your car afterwards to prevent you from more harm.

Sorry for double post, but - after thinking more about this - I 
do not agree that this fits. I think a better analogy would be 
this:
Your car has an autonomous driving system and an anti-collision 
system and the anti-collision system detects that you are about 
to hit an obstacle (let us say another car); as a result it 
engages the brakes and shuts off the autonomous driving system.
It might be that the autonomous driving system was in the right 
and the reason for the almost collision was another human driver 
driving illegally, but it might also be that there is a bug in 
the autonomous driving system. If the latter is the case, in this 
one instance the anti-collision device detected the result of the 
bug, but the next time it might be that the autonomous driving 
system drives you off a cliff, which the anti-collision would not 
help against.
So the only sane thing to do is shut the autonomous driving 
system off, requiring human intervention to decide which of the 
two was the case (and if it was the former, turn it on again).

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 4:06 PM, Moritz Maxeiner wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 This is like the  equivalent of having a guard rail on a road not only
 stop you from going off the cliff but proactively disable your car
 afterwards to prevent you from more harm.

 Sorry for double post, but - after thinking more about this - I do not
 agree that this fits. I think a better analogy would be this:
 Your car has an autonomous driving system and an anti-collision system
 and the anti-collision system detects that you are about to hit an
 obstacle (let us say another car); as a result it engages the brakes and
 shuts off the autonomous driving system.

Nope, an autonomous system did not type out my code that caused the out 
of bounds error, I did :)

-Steve

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:02:06 UTC, Steven Schveighoffer 
wrote:
 Nope, an autonomous system did not type out my code that caused 
 the out of bounds error, I did :)

Same as the human who typed out the code of the autonomous system.

May 31 2017

Kagamin <spam here.lot> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 This seems like a large penalty for "almost" corrupting memory. 
 No other web framework I've used crashes the entire web server 
 for such a simple programming error.

On windows you can set up service restart settings in case it 
crashes. Useful for services that crash regularly.

May 31 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 4:53 PM, Kagamin wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 This seems like a large penalty for "almost" corrupting memory. No
 other web framework I've used crashes the entire web server for such a
 simple programming error.

 On windows you can set up service restart settings in case it crashes.
 Useful for services that crash regularly.

That *would* be a feature on Windows ;)

No, this is Linux, so I'll have to research how to properly do it with 
systemd.

-Steve

May 31 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 31 May 2017 at 21:03:02 UTC, Steven Schveighoffer 
wrote:
 No, this is Linux, so I'll have to research how to properly do 
 it with systemd.

OT: *with whatever process supervisor floats your boat.

May 31 2017

Daniel Kozak via Digitalmars-d <digitalmars-d puremagic.com> writes:

[Service]
...

Restart=on-failure


On Wed, May 31, 2017 at 11:03 PM, Steven Schveighoffer via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 5/31/17 4:53 PM, Kagamin wrote:

 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:

 This seems like a large penalty for "almost" corrupting memory. No
 other web framework I've used crashes the entire web server for such a
 simple programming error.

 On windows you can set up service restart settings in case it crashes.
 Useful for services that crash regularly.

 That *would* be a feature on Windows ;)

 No, this is Linux, so I'll have to research how to properly do it with
 systemd.

 -Steve

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 6:05 AM, Daniel Kozak via Digitalmars-d wrote:
 [Service]
 ....

 Restart=on-failure

Thanks!

-Steve

Jun 01 2017

John Colvin <john.loughran.colvin gmail.com> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 I have discovered an annoyance in using vibe.d instead of 
 another web framework. Simple errors in indexing crash the 
 entire application.

 For example:

 int[3] arr;
 arr[3] = 5;

 Compare this to, let's say, a malformed unicode string 
 (exception), malformed JSON data (exception), file not found 
 (exception), etc.

 Technically this is a programming error, and a bug. But memory 
 hasn't actually been corrupted. The system properly stopped me 
 from corrupting memory. But my reward is that even though this 
 fiber threw an Error, and I get an error message in the log 
 showing me the bug, the web server itself is now out of 
 commission. No other pages can be served. This is like the 
 equivalent of having a guard rail on a road not only stop you 
 from going off the cliff but proactively disable your car 
 afterwards to prevent you from more harm.

 This seems like a large penalty for "almost" corrupting memory. 
 No other web framework I've used crashes the entire web server 
 for such a simple programming error. And vibe.d has no choice. 
 There is no guarantee the stack is properly unwound, so it has 
 to accept the characterization of this is a program-ending 
 error by the D runtime.

 I am considering writing a set of array wrappers that throw 
 exceptions when trying to access out of bounds elements. This 
 comes with its own set of problems, but at least the web server 
 should continue to run.

 What are your thoughts? Have you run into this? If so, how did 
 you solve it?

 -Steve

What things are considered unrecoverable errors or not is 
debatable, but in the end I think the whole thing can be seen 
from the perspective of a fundamental problem of systems where 
multiple operations must be able to progress successfully* 
independently of each other. All operations (a.k.a. processes, 
fibers, or function calls within fibers, or whatever granularity 
you choose) that modify shared state (could be external to the 
fiber, the thread, the process, the machine, could be 
"real-world") must somehow maintain some consistency with other 
operations that come before, are interleaved, simultaneous or 
after.

The way I see it is that you have two choices: reason more 
explicitly about the relationship between different operations 
and carefully catch only the mishaps that you know (or are 
prepared to risk) don't ruin the consistent picture between 
operations OR remove the need for consistency. A lot of the 
latter makes the former easier.

IIRC this is what deadalnix has talked about as one of the big 
wins of php in practice, the separation of state between requests 
means that things can mess up locally without having to worry 
about wider consequences except in the specific cases where 
things are shared; I.e. the set of things that must be maintained 
consistent are opt-in, as opposed to opt-out in care-free use of 
the vibe-d model.

* "progress successfully" is itself a tricky idea.

P.S. Sometimes I do feel D is a bit eager on the self-destruct 
switch, but I think the solution is to rise to the challenge of 
making better software, not to be more blasé about pretending to 
know how to recover from unknown logic errors (exposed by 
unexpected input).

May 31 2017

Brad Roberts via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 5/31/2017 5:37 PM, John Colvin via Digitalmars-d wrote:
 P.S. Sometimes I do feel D is a bit eager on the self-destruct switch, 
 but I think the solution is to rise to the challenge of making better 
 software, not to be more blasé about pretending to know how to recover 
 from unknown logic errors (exposed by unexpected input).

This.. exactly this.  I've worked on software from the tiny device level 
to the largest distributed systems in the world and many in between.  
The ones that are aggressive about defining application correctness 
through asserts and similar mechanisms and use the basic precepts of 
failing fast are the most stable.  Problems are caught early, they're 
loud, obnoxious, and obvious.  And they get fixed, fast.

I'm happy that D takes a similar stance.  It makes my job easier.

- Brad

May 31 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But memory hasn't actually 
 been corrupted.

Since you don't know where the bad index came from, such a conclusion cannot be 
drawn.


 This seems like a large penalty for "almost" corrupting memory. No other web 
 framework I've used crashes the entire web server for such a simple programming 
 error.

Hence the endless vectors for malware insertion in those other frameworks.


 What are your thoughts?

Track down where the bad index is coming from and fix it.

-----------

 Compare this to, let's say, a malformed unicode string (exception), malformed 
 JSON data (exception), file not found (exception), etc.

That's because those are input and environmental errors, not programming bugs.

There can be grey areas in classifying problems as input errors or programming 
bugs, and those will need some careful thought by the programmer as to which bin 
they fall into, and then code accordingly.

Array overflows are not a grey area, however. They are always programming bugs.

-----------

This topic comes up regularly in this forum - the idea that a program that 
entered an unknown, undefined state is actually ok and can continue executing. 
Maybe that's fine on a system (such as a gaming console) where nobody cares if 
it goes off the deep end and it is not connected to the internet so it cannot 
propagate malware infections.

Otherwise, while it's hard to write invulnerable programs, it is another thing 
entirely to endorse vulnerabilities. I cannot endorse such practices, nor can I 
endorse vibe.d if it is coded to continue running after entering an undefined
state.

A corollary is the idea that one creates reliable systems by writing programs 
that can continue executing after corruption. This is another fallacious 
concept. Reliable systems are ones that have independent components that can 
take over if some part of them fails. Shared memory is not independence.

May 31 2017

Guillaume Piolat <first.last gmail.com> writes:

On Thursday, 1 June 2017 at 01:05:42 UTC, Walter Bright wrote:
 This topic comes up regularly in this forum - the idea that a 
 program that entered an unknown, undefined state is actually ok 
 and can continue executing. Maybe that's fine on a system (such 
 as a gaming console) where nobody cares if it goes off the deep 
 end and it is not connected to the internet so it cannot 
 propagate malware infections.

+1

Why are we discussing this topic again at all? Again?

Even with consumer software, you may want to crash immediately so 
that you actually get complaints from testers/buyers instead of 
having a silent, invisible bug that no one will report ever.

Leaving the checks enabled is, imho, perfectly valid for consumer 
software; if you don't, the next consumers will hit the 
issues that never got reported.

Jun 01 2017

Ola Fosheim Grøstad writes:

On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:
 Even with consumer software, you may want to crash immediately 
 so that you actually get complaints from testers/buyers instead 
 of having a silent, invisible bug that no one will report ever.

No. You don't want to crash immediately. In fact, you want to 
save and recover. Preferably without much work lost and without 
the user being bothered by it.

Jun 01 2017

Guillaume Piolat <first.last gmail.com> writes:

On Thursday, 1 June 2017 at 09:46:09 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat 
 wrote:
 Even with consumer software, you may want to crash immediately 
 so that you actually get complaints from testers/buyers 
 instead of having a silent, invisible bug that no one will 
 report ever.

 No. You don't want to crash immediately. In fact, you want to 
 save and recover. Preferably without much work lost and without 
 the user being bothered by it.

Solved by auto-saving, _before_ the crash

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 02:04:40PM +0000, Guillaume Piolat via Digitalmars-d
wrote:
 On Thursday, 1 June 2017 at 09:46:09 UTC, Ola Fosheim Grøstad wrote:
 On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:
 Even with consumer software, you may want to crash immediately so
 that you actually get complaints from testers/buyers instead of
 having a silent, invisible bug that no one will report ever.

 
 No. You don't want to crash immediately. In fact, you want to save
 and recover. Preferably without much work lost and without the user
 being bothered by it.

 
 Solved by auto-saving, _before_ the crash

Yes.  Saving *after* a crash was detected is stupid, because you no
longer can guarantee the user data you're saving hasn't already been
corrupted.  I've experienced over-zealous "crash recovery" code in
applications overwrite the last known good copy of my data with the
latest, most up-to-date, and also most-corrupted data after it detected
a problem. Not nice at all.


T

-- 
Question authority. Don't ask why, just do it.

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 7:48 AM, H. S. Teoh via Digitalmars-d wrote:
 Yes.  Saving *after* a crash was detected is stupid, because you no
 longer can guarantee the user data you're saving hasn't already been
 corrupted.  I've experienced over-zealous "crash recovery" code in
 applications overwrite the last known good copy of my data with the
 latest, most up-to-date, and also most-corrupted data after it detected
 a problem. Not nice at all.

An even better idea is to use rolling backups, with the crash recovery backup 
only being the most recent, not the only, version.
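
A minimal sketch of the rolling-backup idea, assuming a hypothetical naming scheme in which `path.bak.1` is the newest copy and `path.bak.N` the oldest. The function name `rotateBackups` and the `.bak.N` suffix are illustrative, not taken from any existing tool:

```d
import std.file : copy, exists, remove, rename;
import std.format : format;

/// Keeps up to `keep` older copies of `path`.
/// Hypothetical helper: `path.bak.1` is the newest backup, `path.bak.<keep>` the oldest.
void rotateBackups(string path, size_t keep)
{
    if (keep == 0 || !exists(path))
        return;

    string name(size_t n) { return format("%s.bak.%s", path, n); }

    // Drop the oldest backup so the shift below never needs to overwrite it.
    if (exists(name(keep)))
        remove(name(keep));

    // Shift every remaining backup one slot towards "older".
    for (size_t n = keep; n > 1; --n)
        if (exists(name(n - 1)))
            rename(name(n - 1), name(n));

    // The current file becomes the newest backup.
    copy(path, name(1));
}
```

Calling something like this before each crash-recovery save means a corrupted save can only replace the newest copy, and the older ones remain available.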

Jun 01 2017

Ola Fosheim Grøstad writes:

On Thursday, 1 June 2017 at 14:04:40 UTC, Guillaume Piolat wrote:
 Solved by auto-saving, _before_ the crash

That only works for simple applications.

Jun 02 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/31/17 9:05 PM, Walter Bright wrote:
 On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But memory hasn't
 actually been corrupted.

 Since you don't know where the bad index came from, such a conclusion
 cannot be drawn.

You could say that about any error. You could say that about malformed 
unicode strings, malformed JSON data, file not found. In this mindset, 
everything should be an Error, and nothing should be recoverable.

 This seems like a large penalty for "almost" corrupting memory. No
 other web framework I've used crashes the entire web server for such a
 simple programming error.

 Hence the endless vectors for malware insertion in those other frameworks.

No, those are due to the implementation of the interpreter. If the 
interpreter is implemented in @safe D, then you don't have those problems.

 Compare this to, let's say, a malformed unicode string (exception),
 malformed JSON data (exception), file not found (exception), etc.

 That's because those are input and environmental errors, not programming
 bugs.

Not necessarily. A file name could be sourced from the program, but have 
a typo. An index could come from the environment. The library can't 
know, but makes assumptions one way or the other. Just like we assume 
you want to use the GC, these assumptions are harmful for those who need 
it to be the other way.

 There can be grey areas in classifying problems as input errors or
 programming bugs, and those will need some careful thought by the
 programmer as to which bin they fall into, and then code accordingly.

 Array overflows are not a grey area, however. They are always
 programming bugs.

Of course, programming bugs cause all kinds of Errors and Exceptions 
alike. Environmental bugs can cause Array overflows.

I can detail exactly what happened in my code -- I am accepting dates 
from a given week from a web request. One of the dates fell outside the 
week, and so tried to access a 7 element array with index 9. Nothing 
corrupted memory, but the runtime corrupted my entire process, forcing a 
shutdown.

With an exception thrown, I still see the programming error, I still can 
fix it, and other web pages can still continue to be served.

 This topic comes up regularly in this forum - the idea that a program
 that entered an unknown, undefined state is actually ok and can continue
 executing. Maybe that's fine on a system (such as a gaming console)
 where nobody cares if it goes off the deep end and it is not connected
 to the internet so it cannot propagate malware infections.

In fact, it did not enter such a state. The runtime successfully 
*prevented* such a state. And then instantaneously ruined the state by 
unwinding the stack without

 Otherwise, while it's hard to write invulnerable programs, it is another
 thing entirely to endorse vulnerabilities. I cannot endorse such
 practices, nor can I endorse vibe.d if it is coded to continue running
 after entering an undefined state.

It's not. And it can't be. What I have to do is re-engineer the contract 
between myself and arrays. The only way to do that is to not use builtin 
arrays. That's the part that sucks. My code will be perfectly safe, and 
not ever experience corruption. It's just a bit ugly.

 A corollary is the idea that one creates reliable systems by writing
 programs that can continue executing after corruption. This is another
 fallacious concept. Reliable systems are ones that have independent
 components that can take over if some part of them fails. Shared memory
 is not independence.

That is not what is happening here. I'm avoiding corruption so I don't 
have to crash.

-Steve

Jun 01 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Thursday, June 01, 2017 06:26:24 Steven Schveighoffer via Digitalmars-d 
wrote:
 On 5/31/17 9:05 PM, Walter Bright wrote:
 On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But memory hasn't
 actually been corrupted.

 Since you don't know where the bad index came from, such a conclusion
 cannot be drawn.

 You could say that about any error. You could say that about malformed
 unicode strings, malformed JSON data, file not found. In this mindset,
 everything should be an Error, and nothing should be recoverable.

I think that it really comes down to what the contract is and how it makes
sense to treat bad values. At the one extreme, you can treat all bad input
as programmer error, requiring that callers validate all arguments to all
functions (in which case, assertions or some other type of Error would be
used on failure), and at the other extreme, you can be completely defensive
about it and can have every function validate its input and throw an
Exception on failure so that the checks never get compiled out, and the
caller can choose whether they want to recover or not. Both approaches are
of course rather extreme, and what we should do is somewhere in the middle.

So, for any given function, we need to decide whether we want to take the
DbC approach and require that the caller validates the input or take the
defensive programming approach and have the function itself validate the
input. Which makes more sense depends on what the function does and how it's
used and is a bit of an art. But ultimately, whether something is a
programming error depends on what the API and its contracts are, and that
definitely does not mean that one-size-fits-all.

As a default, I think that treating invalid indices as an Error makes a lot
of sense, but it is true that if the index comes from user input or is
otherwise inferred from user input, having the checks result in Errors is
annoying. But you can certainly do additional checks yourself, and if you
wrap the actual call to index the array in an @trusted function, it should
be possible to avoid there being two checks in the case that the index is
valid.

I get the impression that Walter tends to prefer treating stuff as
programmatic error due to the types of programs that he usually writes. You
get a lot fewer things that come from user input when you're simply
processing a file (like you do with a compiler) than you get with stuff like
a server application or a GUI. So, I think that he's more inclined to come
to the conclusion that something should be treated as programmatic error
than some other folks are. That being said, I also think that many folks are
too willing to try and make their program continue like nothing was wrong
after something fairly catastrophic happened.

 Otherwise, while it's hard to write invulnerable programs, it is another
 thing entirely to endorse vulnerabilities. I cannot endorse such
 practices, nor can I endorse vibe.d if it is coded to continue running
 after entering an undefined state.

 It's not. And it can't be. What I have to do is re-engineer the contract
 between myself and arrays. The only way to do that is to not use builtin
 arrays. That's the part that sucks. My code will be perfectly safe, and
 not ever experience corruption. It's just a bit ugly.

Well, you _can_ use the built-in arrays and just use a helper function for
indexing arrays so that the arrays are passed around normally, but you get
an Exception for an invalid index rather than an Error. You would have to be
careful to remember to index the array through the helper function, but it
wouldn't force you to use different data structures. e.g.

auto result = arr[i];

becomes something like

auto result = arr.at(i);
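
A minimal sketch of what such a helper could look like. Neither `at` nor `IndexException` exists in druntime or Phobos; both names are assumptions made for illustration. The index is checked once, and the element is then read through `.ptr` inside the `@trusted` function so the built-in bounds check (and its `RangeError`) is not triggered a second time:

```d
import std.conv : text;

/// Hypothetical exception type for recoverable index failures.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) pure nothrow @safe
    {
        super(msg, file, line);
    }
}

/// Hypothetical helper: bounds-checked access that throws an Exception
/// instead of an Error when `i` is out of range.
ref T at(T)(T[] arr, size_t i, string file = __FILE__, size_t line = __LINE__) @trusted
{
    if (i >= arr.length)
        throw new IndexException(
            text("index ", i, " is out of bounds for length ", arr.length),
            file, line);

    // The check above makes this access valid, so bypass the built-in check.
    return arr.ptr[i];
}
```

With this in scope, `week.at(9)` on a 7-element slice throws an `IndexException` that a request handler can catch, while `week[9]` still throws the usual `RangeError`.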

As an aside, I think that there has been way too much talk of memory
corruption in this thread, and much of it has derailed the discussion from
the actual issue. The array bounds checking prevented the memory corruption
problem. The question here is how to deal with invalid indices and whether
it should be treated as programmer error or bad input, and that's really a
question of whether arrays should use DbC or defensive programming and
whether there should be a way to choose based on your application's needs.

- Jonathan M Davis

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 3:56 AM, Jonathan M Davis via Digitalmars-d wrote:
 I get the impression that Walter tends to prefer treating stuff as
 programmatic error due to the types of programs that he usually writes. You
 get a lot fewer things that come from user input when you're simply
 processing a file (like you do with a compiler) than you get with stuff like
 a server application or a GUI. So, I think that he's more inclined to come
 to the conclusion that something should be treated as programmatic error
 than some other folks are.

It is a programming bug to not validate the input. It's not that bad to abort 
programs if you neglected to validate the input.

It is always bad to treat programming bugs as input errors.

Jun 01 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 20:37, Walter Bright wrote:
 On 6/1/2017 3:56 AM, Jonathan M Davis via Digitalmars-d wrote:
 I get the impression that Walter tends to prefer treating stuff as
 programmatic error due to the types of programs that he usually 
 writes. You
 get a lot fewer things that come from user input when you're simply
 processing a file (like you do with a compiler) than you get with 
 stuff like
 a server application or a GUI. So, I think that he's more inclined to 
 come
 to the conclusion that something should be treated as programmatic error
 than some other folks are.

 
 It is a programming bug to not validate the input. It's not that bad to 
 abort programs if you neglected to validate the input.
 ...

It really depends on the specific circumstances.

 It is always bad to treat programming bugs as input errors.

They should be treated as bugs, but isn't it plausible that there are 
circumstances where one does not want to authorize every @safe library 
function one calls to bring down the entire process?

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 12:16 PM, Timon Gehr wrote:
 On 01.06.2017 20:37, Walter Bright wrote:
 It is a programming bug to not validate the input. It's not that bad to abort 
 programs if you neglected to validate the input.
 ...

 
 It really depends on the specific circumstances.

The stages of programming expertise:

1. newbie - follows the rules because he is told to
2. master - follows the rules because he understands them
3. guru - breaks the rules because he understands the rules don't apply

Let's not skip stages :-)


 It is always bad to treat programming bugs as input errors.

 They should be treated as bugs, but isn't it plausible that there are 
 circumstances where one does not want to authorize every @safe library function 
 one calls to bring down the entire process?

You, as the programmer, need to decide what is validated data and what is not. 
Being unclear about this is technical debt that is going to cause problems.

Validated data that is not valid is a programming bug and the program should be 
aborted.

Jun 01 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 21:48, Walter Bright wrote:
 On 6/1/2017 12:16 PM, Timon Gehr wrote:
 On 01.06.2017 20:37, Walter Bright wrote:
 It is a programming bug to not validate the input. It's not that bad 
 to abort programs if you neglected to validate the input.
 ...

 It really depends on the specific circumstances.

 
 The stages of programming expertise:
 
 1. newbie - follows the rules because he is told to
 2. master - follows the rules because he understands them
 3. guru - breaks the rules because he understands the rules don't apply
 
 Let's not skip stages :-)
 ...

This does not really say anything about programming expertise, it says 
that "the rules" (whatever those are) are incomplete (unless there are 
no gurus, but then the list is nothing but silly).

I guess "terminate the program upon detection of a bug" is one of your 
rules. It's incomplete, but the language specification enforces it (for 
a subset of bugs).

 
 It is always bad to treat programming bugs as input errors.

 They should be treated as bugs, but isn't it plausible that there are 
 circumstances where one does not want to authorize every @safe library 
 function one calls to bring down the entire process?

 
 You, as the programmer, need to decide what is validated data and what 
 is not.

There is not only one programmer and not all programmers are me.

 Being unclear about this is technical debt that is going to 
 cause problems.
 ...

This is both obvious and not answering my question.

 Validated data that is not valid is a programming bug

Again, obvious.

 and the program should be aborted.

The buggy subprogram should be. Let's say I want to use library 
functionality written over the course of years by non-computer scientist 
domain expert Random C. Monkey. The library is an ugly jungle of special 
cases but it is mostly correct and makes it trivial to add feature X to 
my product. It's also pure and @safe without any @trusted functions. I 
can still serve customers if this library occasionally misbehaves, at a 
lower quality. (Let's say it is trivial to check whether the code 
returned a correct result, even though building the result in the first 
place was hard.) I cannot trust Mr. Monkey to have written only correct 
code respecting array bounds and null pointers, but if my product does 
not (seem to) have feature X by tomorrow, I'm most likely going out of 
business. Now, why exactly should any of Mr. Monkey's bugs terminate my 
entire service, necessitating a costly restart and causing unnecessary 
frustration to my customers?

I'm pretty sure D should not outright prevent this use case, even though 
in an ideal world this situation would never arise.

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 1:47 PM, Timon Gehr wrote:
 I'm pretty sure D should not outright prevent this use case, even though in an 
 ideal world this situation would never arise.

C quality code is straightforward in D. Just mark it @system.

Jun 01 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 23:12, Walter Bright wrote:
 On 6/1/2017 1:47 PM, Timon Gehr wrote:
 I'm pretty sure D should not outright prevent this use case, even 
 though in an ideal world this situation would never arise.

 
 C quality code is straightforward in D. Just mark it @system.

I don't know what this is, but it is not an answer to my post.

Jun 01 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 9:05 PM, Walter Bright wrote:
 On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But 
 memory hasn't
 actually been corrupted.

 Since you don't know where the bad index came from, such a 
 conclusion
 cannot be drawn.

 You could say that about any error. You could say that about 
 malformed unicode strings, malformed JSON data, file not found. 
 In this mindset, everything should be an Error, and nothing 
 should be recoverable.

Everything coming in as input to the _process_ should be 
validated. Once it has been validated, if you still find 
malformed JSON data, malformed unicode strings, etc. during 
execution, there's a bug, and the process should terminate.

 This seems like a large penalty for "almost" corrupting 
 memory. No
 other web framework I've used crashes the entire web server 
 for such a
 simple programming error.

 Hence the endless vectors for malware insertion in those other 
 frameworks.

 No, those are due to the implementation of the interpreter. If 
 the interpreter is implemented in @safe D, then you don't have 
 those problems.

It seems to me that reducing the danger only to corrupted memory 
is underestimating the damage that can be done, for example by a 
simple SQL injection, that can be done without corrupting memory 
at all.


 Compare this to, let's say, a malformed unicode string 
 (exception), malformed JSON data (exception), file not found 
 (exception), etc.

 That's because those are input and environmental errors, not 
 programming
 bugs.

 Not necessarily. A file name could be sourced from the program, 
 but have a typo. An index could come from the environment. The 
 library can't know, but makes assumptions one way or the other. 
 Just like we assume you want to use the GC, these assumptions 
 are harmful for those who need it to be the other way.

The library should not assume anything about data coming from 
the environment, the filesystem, etc.: there must be 
validation at the boundaries.

 I can detail exactly what happened in my code -- I am accepting 
 dates from a given week from a web request. One of the dates 
 fell outside the week, and so tried to access a 7 element array 
 with index 9. Nothing corrupted memory, but the runtime 
 corrupted my entire process, forcing a shutdown.

And that's a good thing! The input should be validated, 
especially because we are talking about a web request.

See it like being kind with the other side of the connection, 
informing it with a clear "rejected as the date is invalid".
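
A minimal sketch of that kind of boundary validation for the week example, assuming the request has already been parsed into `Date` values. The function name `dayIndex` is hypothetical; `enforce` throws an ordinary `Exception`, so the rejection can be reported to the client before any array is indexed:

```d
import std.datetime : Date;
import std.exception : enforce;

/// Hypothetical helper: maps `day` to a slot 0..6 within the week that
/// starts on `weekStart`, rejecting dates outside that week.
size_t dayIndex(Date weekStart, Date day)
{
    immutable offset = (day - weekStart).total!"days";
    enforce(offset >= 0 && offset < 7, "rejected as the date is invalid");
    return cast(size_t) offset;
}
```

A date nine days after `weekStart` is rejected here with an exception, so it never reaches the 7-element array.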

:-)

/Paolo

Jun 01 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 14:25, Paolo Invernizzi wrote:
 
 I can detail exactly what happened in my code -- I am accepting dates 
 from a given week from a web request. One of the dates fell outside 
 the week, and so tried to access a 7 element array with index 9. 
 Nothing corrupted memory, but the runtime corrupted my entire process, 
 forcing a shutdown.

 
 And that's a good thing! The input should be validated, especially 
 because we are talking about a web request.
 
 See it like being kind with the other side of the connection, informing 
 it with a clear "rejected as the date is invalid".
 
 :-)

You seem to not understand what happened. There was a single server 
serving multiple different web pages. There was an out-of-bounds error 
due to a single user inserting invalid data into a single form with 
missing data validation. The web server went down, killing all pages for 
all users.

There is no question that input data should be validated, but if it 
isn't, the response should be proportional. It's enough to kill the 
request, log the exception, notify the developer, and maybe even 
disable the specific web page.
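
A minimal sketch of such a proportional response, written in plain D rather than against vibe.d's actual API. `Response`, `Handler`, and `serve` are hypothetical names. Note that only `Exception` is caught: an `Error` such as `RangeError` still propagates, because the language does not guarantee that the stack is properly unwound for it:

```d
import std.stdio : stderr;

/// Hypothetical response type for the sketch.
struct Response
{
    int status;
    string text;
}

alias Handler = Response delegate(string path);

/// Runs one request; a thrown Exception fails only this request.
Response serve(Handler handler, string path)
{
    try
    {
        return handler(path);
    }
    catch (Exception e)
    {
        // Log for the developer, then answer the client with a server error.
        stderr.writeln("request ", path, " failed: ", e.msg);
        return Response(500, "Internal Server Error");
    }
}
```

Other requests keep being served because the failure never leaves `serve`.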

Jun 01 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Thursday, 1 June 2017 at 18:54:51 UTC, Timon Gehr wrote:
 On 01.06.2017 14:25, Paolo Invernizzi wrote:
 
 I can detail exactly what happened in my code -- I am 
 accepting dates from a given week from a web request. One of 
 the dates fell outside the week, and so tried to access a 7 
 element array with index 9. Nothing corrupted memory, but the 
 runtime corrupted my entire process, forcing a shutdown.

 
 And that's a good thing! The input should be validated, 
 especially because we are talking about a web request.
 
 See it like being kind with the other side of the connection, 
 informing it with a clear "rejected as the date is invalid".
 
 :-)

 You seem to not understand what happened. There was a single 
 server serving multiple different web pages. There was an 
 out-of-bounds error due to a single user inserting invalid data 
 into a single form with missing data validation. The web server 
 went down, killing all pages for all users.

 There is no question that input data should be validated, but 
 if it isn't, the response should be proportional. It's enough 
 to kill the request, log the exception, notify the developer, 
 and maybe even disable the specific web page.

I really do understand what is happening: I have a vibe.d server 
that's serving a US top 5 FMCG world company, and sometimes it 
goes down because of a crash.

It's dockerized, in a docker swarm, and every time it crashes 
(or it's "unhealthy") it's restarted, and we have a log that's 
helping us squash bugs.

Guess what: it's not a problem for the customer (at least right 
now!) as long as we have taken a clear approach: we are squashing 
bugs, and if the process state is signalling to us that a bug has 
occurred, we simply pull the plug.

A proportional response can be achieved by having multiple processes 
handling the requests. It's the only sane way I can think of to avoid 
killing "all" the sessions, and lose only a portion of them.
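
A minimal sketch of the restart half of that setup, assuming each worker is a separate executable and using `std.process` in place of docker. The function name `supervise` and the restart limit are illustrative only:

```d
import std.process : spawnProcess, wait;
import std.stdio : stderr;

/// Hypothetical supervisor: launches `command`, relaunching it after every
/// abnormal exit, at most `maxRestarts` times. Returns the number of launches.
size_t supervise(string[] command, size_t maxRestarts)
{
    size_t launches = 0;
    while (launches <= maxRestarts)
    {
        auto pid = spawnProcess(command);
        ++launches;

        // wait() yields the exit code; a non-zero value is treated as a crash.
        immutable status = wait(pid);
        if (status == 0)
            break;

        stderr.writeln("worker exited with status ", status);
    }
    return launches;
}
```

Running several such supervised workers behind a load balancer means a crash in one of them drops only the sessions that worker was handling.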

/Paolo

Jun 01 2017

aberba <karabutaworld gmail.com> writes:

On Thursday, 1 June 2017 at 21:55:55 UTC, Paolo Invernizzi wrote:
 On Thursday, 1 June 2017 at 18:54:51 UTC, Timon Gehr wrote:
 [...]

 I really do understand what is happening: I have a vibe.d server 
 that's serving a US top 5 FMCG world company, and sometimes it 
 goes down because of a crash.

 [...]

Pretty much it. Containerisation of several stateless instances 
is pretty much the scalable approach going forward.

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 8:25 AM, Paolo Invernizzi wrote:
 On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer wrote:
 I can detail exactly what happened in my code -- I am accepting dates
 from a given week from a web request. One of the dates fell outside
 the week, and so tried to access a 7 element array with index 9.
 Nothing corrupted memory, but the runtime corrupted my entire process,
 forcing a shutdown.

 And that's a good thing! The input should be validated, especially
 because we are talking about a web request.

 See it like being kind with the other side of the connection, informing
 it with a clear "rejected as the date is invalid".

If only that is what happened, I would not have started this thread!

In any case, the way forward is clear -- create containers that don't 
throw Error, and make them easy to use.

I think I will actually publish them, because it's a very useful thing 
to have. You can validate your input all you want, but if you have a 
program bug, or there is something you didn't consider, then the entire 
server isn't crashed because of it. I *like* the bounds checking, I 
don't have to translate back to the input what it will mean for every 
array access in the function -- the simple check is enough.
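As a rough illustration of what such a container could look like, here is a minimal sketch. It is not the library mentioned above; `CheckedArray` and `IndexException` are hypothetical names invented for this example. The only idea it demonstrates is that the wrapper performs its own bounds check and throws a recoverable `Exception` before the built-in check can raise a `RangeError`.

```d
import std.stdio;

/// Hypothetical exception type for this sketch; not part of Phobos.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

/// Thin wrapper over a slice whose indexing throws an Exception, not an Error.
struct CheckedArray(T)
{
    private T[] data;

    ref T opIndex(size_t i)
    {
        if (i >= data.length)
        {
            import std.format : format;
            throw new IndexException(
                format("index %s out of bounds for length %s", i, data.length));
        }
        return data[i];
    }

    size_t length() const { return data.length; }
}

void main()
{
    auto week = CheckedArray!int(new int[7]);
    week[3] = 42;      // in range: behaves like a normal array
    try
    {
        week[9] = 1;   // out of range: throws IndexException
    }
    catch (IndexException e)
    {
        writeln("request rejected: ", e.msg);
    }
}
```

Because the failure is an ordinary `Exception`, a framework that already catches exceptions per request can report it without taking the process down.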

Still good to have it auto-restart, which I will also do. But having 
some sort of feedback to the client, and an attempt to continue on with 
other unrelated requests is preferable.

-Steve

Jun 02 2017

Arafel <er.krali gmail.com> writes:

On 06/02/2017 01:26 PM, Steven Schveighoffer wrote:
 
 If only that is what happened, I would not have started this thread!
 
 In any case, the way forward is clear -- create containers that don't 
 throw Error, and make them easy to use.
 
 I think I will actually publish them, because it's a very useful thing 
 to have. You can validate your input all you want, but if you have a 
 program bug, or there is something you didn't consider, then the entire 
 server isn't crashed because of it. I *like* the bounds checking, I 
 don't have to translate back to the input what it will mean for every 
 array access in the function -- the simple check is enough.
 
 Still good to have it auto-restart, which I will also do. But having 
 some sort of feedback to the client, and an attempt to continue on with 
 other unrelated requests is preferable.
 
 -Steve


Hi,

I think that most people agree that an out-of-bounds access is a bug 
that needs to be fixed; this shouldn't be an acceptable way of running 
the program.

The question here seems to be what to do *in the meanwhile*, and here is 
the problem. I can understand the position that from a theoretical point 
of view the process is already unsafe at this point, and that the best 
option is to stop (and restart if needed).

But, in the real world if I've got a (web)server that has proper 
isolation, I'd much rather have a server that sends back a 500 [error 
message] for the buggy page and keeps working otherwise, than one that 
is killed and has to be restarted every time a buggy page is asked.
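To make the "500 for the buggy page" behaviour concrete, here is a small sketch of a dispatcher. `Response` and `dispatch` are hypothetical names for this example and are not vibe.d APIs. It only covers failures that surface as `Exception`; an `Error` such as `RangeError` is deliberately not caught and would still terminate the program.

```d
import std.stdio;

/// Hypothetical stand-in for a framework response; not a vibe.d type.
struct Response
{
    int status;
    string text;
}

/// Runs one request handler and converts any Exception into a 500 response.
/// Errors (e.g. RangeError) are deliberately not caught here.
Response dispatch(string delegate() handler)
{
    try
    {
        return Response(200, handler());
    }
    catch (Exception e)
    {
        // Log and report; other requests keep being served.
        stderr.writeln("handler failed: ", e.msg);
        return Response(500, "Internal Server Error");
    }
}

void main()
{
    auto ok = dispatch(() => "hello");
    auto bad = dispatch(delegate string() { throw new Exception("buggy page"); });
    writeln(ok.status, " ", bad.status);
}
```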

Think that it can be a multithreaded server, and that other ongoing (and 
safe!) tasks might be affected, and that safe restart, even when 
available, often has a performance hit.

I agree that one (perhaps even the proper) way to get this is through 
process isolation, but this doesn't mean that the language shouldn't 
allow it if needed and explicitly required. There are ways for the 
programmer to explicitly disable most other security features 
(__gshared, casting away shared and immutable, @trusted code, etc.) so 
why not this one?

Perhaps an intermediate solution would be to offer a compiler switch 
that allows Errors to be safely caught (that is, they behave as 
exceptions). As far as I understand from reading this thread, that's 
already the case in debug builds, so it cannot be that bad practice, but 
it would be nice to have a mode that is otherwise "release", only with 
this feature turned on.

Even better would be to turn on this behaviour on a per-function basis 
(say @throwErrors). Although perhaps that'd be promoting this behaviour 
a bit too much...

Anyway, just 2¢ from a half-newbie (okay, still full-newbie :) )

Jun 02 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/2/17 7:55 AM, Arafel wrote:
 But, in the real world if I've got a (web)server that has proper
 isolation, I'd much rather have a server that sends back a 500 [error
 message] for the buggy page and keeps working otherwise, than one that
 is killed and has to be restarted every time a buggy page is asked.

Yes, exactly what I want.

 Perhaps an intermediate solution would be to offer a compiler switch
 that allows Errors to be safely caught (that is, they behave as
 exceptions). As far as I understand from reading this thread, that's
 already the case in debug builds, so it cannot be that bad practice, but
 it would be nice to have a mode that is otherwise "release", only with
 this feature turned on.

I don't think this is workable, simply because of nothrow. An Error is 
allowed to be thrown in nothrow code, and the compiler can 
simultaneously assume that nothrow functions won't throw. Therefore it 
can legally omit the scaffolding for deallocating scope variables when 
an Exception is thrown (for performance reasons), and leave your program 
in an invalid state.
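A small sketch of the asymmetry described here, assuming bounds checking is enabled (it stays on in `@safe` code by default even with `-release`). `pick` is a made-up function for this example. It compiles as `nothrow` although indexing can raise a `RangeError`, while throwing an `Exception` from the same function would be rejected at compile time.

```d
import core.exception : RangeError;
import std.stdio;

// Compiles even though the index may be out of bounds: a RangeError is an
// Error, and Errors are not covered by the nothrow contract.
int pick(const int[] a, size_t i) @safe nothrow
{
    return a[i];
}

// Rejected by the compiler if uncommented, because nothrow forbids Exceptions:
// int bad() nothrow { throw new Exception("no"); }

void main()
{
    auto week = new int[7];
    try
    {
        writeln(pick(week, 9));
    }
    catch (RangeError e)
    {
        // Catching an Error is permitted by the language, but whether
        // destructors and scope guards ran while unwinding through nothrow
        // frames is not guaranteed.
        writeln("caught: ", typeid(e).name);
    }
}
```

The catch block runs here, but that says nothing about whether intermediate cleanup happened, which is the reason treating Errors as recoverable is unreliable.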

The only conclusion I can come to is that I need to write my own array 
types. This isn't going to be so bad as I thought, and likely will just 
become second nature to use them.

-Steve

Jun 02 2017

Arafel <er.krali gmail.com> writes:

On 06/02/2017 02:12 PM, Steven Schveighoffer wrote:
 Perhaps an intermediate solution would be to offer a compiler switch
 that allows Errors to be safely caught (that is, they behave as
 exceptions). As far as I understand from reading this thread, that's
 already the case in debug builds, so it cannot be that bad practice, but
 it would be nice to have a mode that is otherwise "release", only with
 this feature turned on.

 
 I don't think this is workable, simply because of nothrow. An Error is 
 allowed to be thrown in nothrow code, and the compiler can 
 simultaneously assume that nothrow functions won't throw. Therefore it 
 can legally omit the scaffolding for deallocating scope variables when 
 an Exception is thrown (for performance reasons), and leave your program 
 in an invalid state.
 

Well, as I understood from this thread this is already possible in debug 
mode:

 An Exception leads to unwinding&cleanup, an Error to termination (with
unwinding&cleanup in debug mode for debugging purposes).

If it is indeed so, then adding a switch that only removes this 
optimization (from nothrow code) but is otherwise a release version 
shouldn't be too hard to implement? Even if not, making nothrow a no-op 
w.r.t. unwinding should still be possible and not too hard (sorry if I'm 
being naïve here, I don't know how hard it would be to implement, but 
conceptually it seems straightforward).

Of course, one must be willing to take the performance hit.

Jun 02 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/2/17 9:00 AM, Arafel wrote:
 On 06/02/2017 02:12 PM, Steven Schveighoffer wrote:
 Perhaps an intermediate solution would be to offer a compiler switch
 that allows Errors to be safely caught (that is, they behave as
 exceptions). As far as I understand from reading this thread, that's
 already the case in debug builds, so it cannot be that bad practice, but
 it would be nice to have a mode that is otherwise "release", only with
 this feature turned on.

 I don't think this is workable, simply because of nothrow. An Error is
 allowed to be thrown in nothrow code, and the compiler can
 simultaneously assume that nothrow functions won't throw. Therefore it
 can legally omit the scaffolding for deallocating scope variables when
 an Exception is thrown (for performance reasons), and leave your
 program in an invalid state.

 Well, as I understood from this thread this is already possible in debug
 mode:

 An Exception leads to unwinding&cleanup, an Error to termination (with
 unwinding&cleanup in debug mode for debugging purposes).

 If it is indeed so, then adding a switch that only removes this
 optimization (from nothrow code) but is otherwise a release version
 shouldn't be too hard to implement? Even if not, making nothrow a no-op
 w.r.t. unwinding should still be possible and not too hard (sorry if I'm
 being naïve here, I don't know how hard it would be to implement, but
 conceptually it seems straightforward).

 Of course, one must be willing to take the performance hit.

Yes, of course. This is a non-starter if you need to compile release 
mode (and you do, my relatively small app is 47MB in debug mode, 20MB in 
release mode, and I can't imagine performance doing very well).

-Steve

Jun 02 2017

John Colvin <john.loughran.colvin gmail.com> writes:

On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer 
wrote:
 On 5/31/17 9:05 PM, Walter Bright wrote:
 On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But 
 memory hasn't
 actually been corrupted.

 Since you don't know where the bad index came from, such a 
 conclusion
 cannot be drawn.

 You could say that about any error. You could say that about 
 malformed unicode strings, malformed JSON data, file not found. 
 In this mindset, everything should be an Error, and nothing 
 should be recoverable.

 This seems like a large penalty for "almost" corrupting 
 memory. No
 other web framework I've used crashes the entire web server 
 for such a
 simple programming error.

 Hence the endless vectors for malware insertion in those other 
 frameworks.

 No, those are due to the implementation of the interpreter. If 
 the interpreter is implemented in @safe D, then you don't have 
 those problems.

 Compare this to, let's say, a malformed unicode string 
 (exception),

 malformed JSON data (exception), file not found (exception), 
 etc.

 That's because those are input and environmental errors, not 
 programming
 bugs.

 Not necessarily. A file name could be sourced from the program, 
 but have a typo. An index could come from the environment. The 
 library can't know, but makes assumptions one way or the other. 
 Just like we assume you want to use the GC, these assumptions 
 are harmful for those who need it to be the other way.

 There can be grey areas in classifying problems as input 
 errors or
 programming bugs, and those will need some careful thought by 
 the
 programmer as to which bin they fall into, and then code 
 accordingly.

 Array overflows are not a grey area, however. They are always
 programming bugs.

 Of course, programming bugs cause all kinds of Errors and 
 Exceptions alike. Environmental bugs can cause Array overflows.

I think the idea is that no, array overflows can never be caused 
by the environment in a correct program. If you don't adequately 
screen the environmental input, your program is incorrect.


This is how I think about it:

There are 3 categories of bugs: known safe to survive, known 
unsafe to survive, unknown safety.

Range Errors are an example of errors that can be considered 
"unknown safety", so by default we assume it is unsafe to 
continue.

If you - as the human developer - decide that the specific 
RangeError bug from this place in the code is actually known safe 
to survive, you should add screening for the bad value and throw 
an Exception instead, or if that's difficult to do then catch the 
Error and then throw an Exception*. Note that these aren't fixes 
for the bug, these are explicit recognition of the continued 
existence of the bug while promising (@trusted style) that 
everything will still be OK.
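The two options in that paragraph might look roughly like the sketch below. The function names and the `days` table are invented for the example. The second variant relies on bounds checking being enabled and on the developer having reviewed that this particular call site is safe to recover from.

```d
import core.exception : RangeError;
import std.exception : enforce;
import std.stdio;

immutable string[7] days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];

// Option 1: screen the value before it reaches the array.
string dayNameScreened(size_t i)
{
    enforce(i < days.length, "day index out of range");
    return days[i];
}

// Option 2: convert the Error into an Exception at one reviewed call site.
string dayNameConverted(size_t i)
{
    try
    {
        return days[i];
    }
    catch (RangeError e)
    {
        // Explicit promise that continuing from here is OK.
        throw new Exception("day index out of range", e);
    }
}

void main()
{
    foreach (f; [&dayNameScreened, &dayNameConverted])
    {
        try
        {
            writeln(f(9));
        }
        catch (Exception e)
        {
            writeln("recovered: ", e.msg);
        }
    }
}
```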

If you decide it is truly an "unsafe to continue" bug, then let 
it carry on crashing there.

Ultimately of course you screen the environment at the 
appropriate level or fix the bug, do the "right thing" whatever 
that may be.

*note that you could abstract this away into an array type that 
throws Exceptions, but where would you know it was safe to use? 
Perhaps not so many places.

Tldr; if you know that a bug is safe to continue/recover from, 
put in the necessary code to do so.


I would be interested to see ideas of how to implement some sort 
of logical sandboxing in D. Perhaps if one calls a strongly pure 
@safe function, there is no way it can mess up shared state, so 
you know that as long as you disregard the result it will always 
be safe to continue... Effectively it's a "process within a 
process" or something like that. Of course you'd need to be able 
to guarantee you can catch Errors, plus even though the function 
you've called can't have *caused* the problem, it might be the 
only place where you *find* the problem and that might be bad 
enough to not want to continue from...

Jun 01 2017

Stanislav Blinov <stanislav.blinov gmail.com> writes:

On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:

 I would be interested to see ideas of how to implement some 
 sort of logical sandboxing in D. Perhaps if one calls a 
 strongly pure @safe function, there is no way it can mess up 
 shared state,

Oh yes, there is a way: 
http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

Jun 01 2017

John Colvin <john.loughran.colvin gmail.com> writes:

On Thursday, 1 June 2017 at 14:21:35 UTC, Stanislav Blinov wrote:
 On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:

 I would be interested to see ideas of how to implement some 
 sort of logical sandboxing in D. Perhaps if one calls a 
 strongly pure @safe function, there is no way it can mess up 
 shared state,

 Oh yes, there is a way: 
 http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

Sure, @safe has some holes as it currently stands.

Jun 01 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Thursday, June 01, 2017 14:40:59 John Colvin via Digitalmars-d wrote:
 On Thursday, 1 June 2017 at 14:21:35 UTC, Stanislav Blinov wrote:
 On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:
 I would be interested to see ideas of how to implement some
 sort of logical sandboxing in D. Perhaps if one calls a
 strongly pure @safe function, there is no way it can mess up
 shared state,

 Oh yes, there is a way:
 http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

 Sure, @safe has some holes as it currently stands.

It's far better than nothing, but it definitely has holes. DIP 1000 is
fixing a lot of those holes. Unfortunately, the only way to absolutely
guarantee that it doesn't have any holes is to do it via whitelisting
operations and then vetting every operation to make sure that it's safe for
the compiler to say that it's @safe, whereas it's implemented by
blacklisting operations that are determined to be unsafe. So, we'll probably
always be at risk of having holes in @safe, but the situation is improving.

- Jonathan M Davis

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 7:21 AM, Stanislav Blinov wrote:
 Oh yes, there is a way: 
 http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org


Please post bug reports to bugzilla. Posting them only on the n.g. pretty much 
ensures they will never get addressed.

Jun 01 2017

Stanislav Blinov <stanislav.blinov gmail.com> writes:

On Thursday, 1 June 2017 at 18:40:28 UTC, Walter Bright wrote:
 On 6/1/2017 7:21 AM, Stanislav Blinov wrote:
 Oh yes, there is a way: 
 http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org


 Please post bug reports to bugzilla. Posting them only on the 
 n.g. pretty much ensures they will never get addressed.

Please look at the very first post of that thread :\

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 06:26:24AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 Of course, programming bugs cause all kinds of Errors and Exceptions
 alike.  Environmental bugs can cause Array overflows.
 
 I can detail exactly what happened in my code -- I am accepting dates
 from a given week from a web request. One of the dates fell outside
 the week, and so tried to access a 7 element array with index 9.
 Nothing corrupted memory, but the runtime corrupted my entire process,
 forcing a shutdown.

[...]

Isn't this a case of failing to sanitize user input adequately before
using it for internal processing?  And failing to test the code with
pathological data to ensure resilience before deploying to a live
server?

In this case, nothing worse happened than an out-of-bounds array index.
But we all know what *could* happen with unsanitized user input in other
cases...


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 3:26 AM, Steven Schveighoffer wrote:
 On 5/31/17 9:05 PM, Walter Bright wrote:
 On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
 Technically this is a programming error, and a bug. But memory hasn't
 actually been corrupted.

 Since you don't know where the bad index came from, such a conclusion
 cannot be drawn.

 
 You could say that about any error. You could say that about malformed unicode 
 strings, malformed JSON data, file not found. In this mindset, everything 
 should be an Error, and nothing should be recoverable.

What's missing here is looking carefully at a program and deciding what are 
input (and environmental) errors and what are program bugs. The former are 
recoverable, the latter are not.

For example, malformed unicode strings. Joel Spolsky wrote about this issue 
long ago, in that data in a program should be compartmentalized into untrusted 
and trusted data.

Untrusted data comes from the input, and stays untrusted until it is validated. 
Malformed untrusted data are recoverable. Once it is validated, it becomes 
trusted data. Any malformations in trusted data are programming bugs. It should 
be clear in a well designed program what data is trusted and what data is 
untrusted. Spolsky suggests using different types for them so they are distinct.

For your date case, the date was not validated, and was fed into an array, 
where the invalid date overflowed the array bounds. The program was relying on 
the array bounds checking to validate the data.

I'd argue this is a problematic program design because:

1. It's inefficient. Data should be validated once in a clear location in the 
program. Arrays appear all over the place, and tend to be in hot locations. 
Validating the same data over and over is highly inefficient.

2. Array bounds checking can be turned off by a compiler switch. Program data 
validation should not be silently disabled in such an unexpected manner.

3. Arrays are a ubiquitous data structure. They are used all over the place. 
There is no way to distinguish "this is a data validation use" and "this must 
be valid data".

4. It would be surprising to anyone familiar with D looking at your code to 
realize that an array access is data validation rather than bug checking.

5. Arrays are sometimes optimized by removing the bounds checking. This should 
not turn off data validation.

6. @safe code is intended to find programming bugs, not validate input data.

7. Just because code is marked @safe doesn't mean memory corruption is 
impossible. Even if @safe is perfect, programs have @trusted and @system code 
too, and those may have memory corrupting bugs.

8. It does not distinguish array overflow from programming bugs / corruption 
from invalid program input.

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 11:29:53AM -0700, Walter Bright via Digitalmars-d wrote:
[...]
 Untrusted data comes from the input, and stays untrusted until it is
 validated. Malformed untrusted data are recoverable. Once it is
 validated, it becomes trusted data. Any malformations in trusted data
 are programming bugs. It should be clear in a well designed program
 what data is trusted and what data is untrusted. Spolsky suggests
 using different types for them so they are distinct.
 
 For your date case, the date was not validated, and was fed into an
 array, where the invalid date overflowed the array bounds. The program
 was relying on the array bounds checking to validate the data.

+1.  I think this is the root of the problem.  Data that comes from
outside sources must never, ever be trusted, until they are validated.
Any errors that occur during validation are recoverable, because you
*know* they are caused by wrong data from outside.

Once the data is validated, any further errors involving that data are
program bugs: either your validation code was incorrect / incomplete, or
there is a program logic error that led to an inconsistent state. In
this case, aborting the program is the only sane response, especially in
an online services setting, because your broken validation code may have
let through maliciously-crafted data that can lead to an exploit (better
nip it in the bud before the exploit proceeds any further), or the
internal program logic is inconsistent, so proceeding further is UB.

Feeding unvalidated, tainted data directly into inner program logic like
indexing an array is a bad idea.  The data ought to be validated first.

I like Spolsky's idea of using separate types for tainted / verified
input. Let the compiler statically verify that you at least made an
attempt at validating your program's inputs (though obviously it can
only go so far -- the compiler can't guarantee that your validation code
is actually correct).  The problem, though, is that D currently doesn't
have tainted types, so for example you can't tell at a glance whether a
given string is untrusted user input or validated data, it's all just
`string`.  I wonder if tainted types could be something worth adding
either to the language or to Phobos.


[...]
 8. It does not distinguish array overflow from programming bugs /
 corruption from invalid program input.

Yes, I think this conflation is the root cause of this problem.
Validation should be explicit, and separate from inner program logic.
Mixing the two together only serves to confuse the issue.


T

-- 
If you think you are too small to make a difference, try sleeping in a closed
room with a mosquito. -- Jan van Steenbergen

Jun 01 2017

cym13 <cpicard openmailbox.org> writes:

On Thursday, 1 June 2017 at 19:04:19 UTC, H. S. Teoh wrote:
 I like Spolsky's idea of using separate types for tainted / 
 verified input. Let the compiler statically verify that you at 
 least made an attempt at validating your program's inputs 
 (though obviously it can only go so far -- the compiler can't 
 guarantee that your validation code is actually correct).  The 
 problem, though, is that D currently doesn't have tainted 
 types, so for example you can't tell at a glance whether a 
 given string is untrusted user input or validated data, it's 
 all just `string`.  I wonder if tainted types could be 
 something worth adding either to the language or to Phobos.

I'm not familiar with the idea, do we need more than the 
following?

import std.stdio;

struct Tainted(T) {
     T _basetype;
     alias _basetype this;
}


void main(string[] args) {
     auto ts = Tainted!string("Hello");
     writeln(ts);
}

It's a PoC, ok, but it lets you use ts like any variable of the 
base type, it lets you convert one easily to the other, but this 
conversion has to be explicit. So, real question, what more do we 
need?
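One point that may be worth checking with this PoC is which direction of the conversion is actually explicit. The sketch below reuses the same struct and adds a hypothetical `writeToOutput` function that expects plain, supposedly validated data. It suggests that wrapping needs an explicit constructor call, while `alias this` lets the wrapper unwrap implicitly wherever a `string` is expected.

```d
import std.stdio;

struct Tainted(T)
{
    T _basetype;
    alias _basetype this;
}

// Hypothetical sink that expects plain, supposedly validated data.
void writeToOutput(string s)
{
    writeln(s);
}

void main()
{
    auto ts = Tainted!string("<script>");
    writeToOutput(ts);   // accepted: alias this unwraps implicitly
    string raw = ts;     // also accepted without a cast

    // Unwrapping is implicit; wrapping is not.
    static assert(is(Tainted!string : string));
    static assert(!is(string : Tainted!string));
}
```

So with this design the compiler would not flag tainted data reaching a function that expects clean data; a stricter wrapper would have to drop the `alias this`.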

Jun 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 10:09:36PM +0000, cym13 via Digitalmars-d wrote:
 On Thursday, 1 June 2017 at 19:04:19 UTC, H. S. Teoh wrote:
 I like Spolsky's idea of using separate types for tainted / verified
 input. Let the compiler statically verify that you at least made an
 attempt at validating your program's inputs (though obviously it can
 only go so far -- the compiler can't guarantee that your validation
 code is actually correct).  The problem, though, is that D currently
 doesn't have tainted types, so for example you can't tell at a
 glance whether a given string is untrusted user input or validated
 data, it's all just `string`.  I wonder if tainted types could be
 something worth adding either to the language or to Phobos.

 
 I'm not familiar with the idea, do we need more than the following?
 
 import std.stdio;

 struct Tainted(T) {
     T _basetype;
     alias _basetype this;
 }
 
 
 void main(string[] args) {
     auto ts = Tainted!string("Hello");
     writeln(ts);
 }
 
 It's a PoC, ok, but it lets you use ts like any variable of the base
 type, it lets you convert one easily to the other, but this conversion
 has to be explicit. So, real question, what more do we need?

[...]

Actually, I re-read Spolsky's blog post[1] again, and apparently he didn't
actually recommend using the type system for enforcing this, but a
naming convention that would make code stick out when it's doing
something funny.

[1] https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/

So, for example, you'd name all tainted strings with the prefix `us`,
and all functions that return tainted strings are prefixed with `us`,
including any string identifiers you might use to refer to the tainted
data.  E.g.:

	string usName = usGetParam(httpRequest, "name");
	...
	database.cache("usName", usName);
	...
	string usData = database.read("usName");
	...
	// sEscapeHtmlUs means it converts unsafe data (...Us) to safe
	// data (s...) by escaping dangerous characters.
	string sData = sEscapeHtmlUs(usData);
	...
	// sWrite means it requires safe data
	sWrite(html, "<p>Your name is %s</p>", sData);

The idea is that if you see a line of code where the prefixes don't
match, then you immediately know there's a problem. For example:

	// Uh-oh, we assigned unsafe data to a variable that should only
	// hold safe data.
	string sName = usGetParam(httpRequest, "name");

	// Uh-oh, we wrote unsafe data into a database field that should
	// only contain safe data.
	database.cache("sName", usName);

	// Uh-oh, we're printing unsafe data via a function that assumes
	// its input is safe.
	sWrite(html, "<p>Your name is %s</p>", usData);

This is not bad, since with some practice you could immediately identify
code that's probably wrong (mixing s- and us- prefixes wrongly, or
identifier with no prefix, meaning the code needs to be reviewed and the
identifier renamed accordingly).

The problem is that this is still in the realm of coding by convention.
What I had in mind was more along the lines of what you proposed, that
you'd actually use the type system to enforce a distinction between safe
and unsafe data, so that the compiler will reject code that tries to mix
the two without an explicit conversion.

I haven't thought too deeply about how to actually implement this, but
here's my initial idea: any function that reads data from external
sources (network, filesystem, environment) will return Tainted!string or
Tainted!(T[]) rather than string or T[]. Unlike what you proposed above,
the Tainted wrapper will *not* allow implicit conversion to the
underlying type, because otherwise it defeats the purpose (pass
Tainted!T to a function that expects T, and the compiler will
automatically cast it to T for you: no good).  So you cannot pass this
data directly to a function that expects string or T[].  However, they
will allow some way of accessing the wrapped data, so that the
validation function can inspect the data to ensure that it's OK, then
explicitly cast it to the underlying type.

Sketch of code:

	struct Tainted(T)
	{
		// Note: outside code cannot directly access payload.
		private T payload;

		T validate(alias isClean)()
			if (is(typeof(isClean(T.init)) == bool))
		{
			// Do not allow isClean to escape references to
			// payload (?is this correct usage?). Requires
			// -dip1000.
			scope _p = payload;

			if (isClean(_p))
				return payload;
			throw new Exception("Bad data");
		}

		T cleanse(alias cleaner)()
			if (is(typeof(cleaner(T.init)) == T))
		{
			// Prevent cleaner() from cheating and simply
			// returning the payload (?necessary?). Requires
			// -dip1000. The idea being to force the
			// creation of safe data from the payload, e.g.,
			// a HTML-escaped string from a raw string.
			scope _p = payload;

			return cleaner(_p);
		}
	}

	// Note: returns Tainted!T instead of T.
	Tainted!T readParam(T)(HttpRequest req, string paramName);

	// Note: requires string, not Tainted!string
	void writeToOutput(string s);

	void handleRequest(HttpRequest req)
	{
		string[7] daysOfWeek = [
			"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
		];

		// Returns Tainted!int
		auto day = req.readParam!int("dayOfWeek");

		// Compile error: cannot index array with Tainted!int
		//writeToOutput(daysOfWeek[day]);

		// Check range and return unwrapped int if OK, throw
		// Exception otherwise.
		auto checkedDay = day.validate!(d => d >= 0 && d < daysOfWeek.length);

		writeToOutput(daysOfWeek[checkedDay]); // OK

		// Returns Tainted!string
		auto name = req.readParam!string("name");

		// Compile error: cannot pass Tainted!string to writeToOutput.
		//writeToOutput(name);

		// Unwrap to string if does not contain meta-characters,
		// throw Exception otherwise.
		auto safeName = name.validate!hasNoMetaCharacters;

		writeToOutput(safeName); // OK

		// Cleanse the string by escaping metacharacters.
		auto escapedName = name.cleanse!escapeHtmlMetaChars;
		writeToOutput(escapedName); // OK
	}

This is just a rough sketch, of course.  A more complete implementation
would have to consider what to do about code that obtains unsafe data
directly from OS interfaces like core.stdc.stdio.fread that isn't
wrapped by Tainted.

Also, it would have to address what to do about functions like
File.rawRead(), which writes to a user-provided buffer, since the caller
could just read the tainted data directly from the buffer, bypassing any
Tainted protections.


T

-- 
I'm still trying to find a pun for "punishment"...

Jun 01 2017

cym13 <cpicard openmailbox.org> writes:

On Friday, 2 June 2017 at 00:30:39 UTC, H. S. Teoh wrote:
 [...]

Now that I think about it, what we really want going that way is 
an IO monad.

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 11:29 AM, Walter Bright wrote:
 Joel Spolsky wrote about this issue long 
 ago, in that data in a program should be compartmentalized into untrusted and 
 trusted data.

Found it:

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/

It's one of those programming essays that everyone should read.

Jun 01 2017

Dukc <ajieskola gmail.com> writes:

On Thursday, 1 June 2017 at 18:29:53 UTC, Walter Bright wrote:
 

 What's missing here is looking carefully at a program and 
 deciding what are input (and environmental) errors and what are 
 program bugs. The former are recoverable, the latter are not.

 [...]

I think he understood all that already. Array overflow is a sign 
of a bug, which must not be left to slip past.

But I think the point was that it causes such a large amount of work 
(the whole program) to abort. Potentially thousands of customers 
could lose their connection to the server because of that. He wishes that 
just the connection in question crashed, so other users using 
other, likely bug-free, parts of the program would not be 
disturbed.

Personally I have no opinion of this right now, save that it's 
definitely a tough sounding question.

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 2:29 PM, Walter Bright wrote:
 For your date case, the date was not validated, and was fed into an
 array, where the invalid date overflowed the array bounds. The program
 was relying on the array bounds checking to validate the data.

I think it's important to state that no, I wasn't relying on array 
bounds checks to validate the data. I should be validating the data (and 
am now). What I had was a bug in my program.

What I have been saying is that in this framework, designed the way it 
is, there is no good reason to crash the entire process for such a bug. 
There are clear delineations of when the bug is in the "user code" 
section of vibe.d (i.e. the code in your project) and when it is in 
"system code" (i.e. the vibe.d framework). I want the "system code" 
section to continue to function when an out of bounds error happens in 
"user code", to give feedback to the user that no, this didn't work, 
there was an internal error.

Other frameworks and languages don't have this issue. An out of range 
error in PHP doesn't crash apache. Similarly, a segfault in a program 
doesn't crash the OS. This is the way I view the layer between vibe.d 
framework and the code that handles requests. I get that the memory is 
shared, and there's a greater risk of corruption affecting the 
framework. The right answer is to physically separate the processes, and 
at some point, maybe vibe can move in that direction (I outlined what I 
considered a good setup in another post). But a more logical separation 
is still possible by requiring for instance that all request handlers 
are @safe. Even in that case, crashing the *fiber* and not the entire 
process is still preferable in cases where the input isn't properly 
validated. Specifically, I'm talking about out-of-bounds failures, and 
not general asserts.

This is why I'm still moving forward with making my arrays throw 
Exceptions for out-of-bounds issues (and will publish the library to do 
so in case anyone else feels the same way).

 2. Array bounds checking can be turned off by a compiler switch. Program
 data validation should not be silently disabled in such an unexpected
 manner.

Most of your points are based on the assumption that this was a design 
decision, so they aren't applicable, but on this point I wanted to say:

IMO, anything that's on the Internet should never have array bounds 
checking turned off. The risk is too great.

-Steve

Jun 02 2017

John Carter <john.carter taitradio.com> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 For example:

 int[3] arr;
 arr[3] = 5;


 Technically this is a programming error, and a bug. But memory 
 hasn't actually been corrupted. The system properly stopped me 
 from corrupting memory. But my reward is that even though this 
 fiber threw an Error, and I get an error message in the log 
 showing me the bug, the web server itself is now out of 
 commission. No other pages can be served.

In this case it is fairly obvious where the bad index is coming 
from... but in general it is impossible to say.

So how much of your program is mad?

You need to reset to some safe / correct point to continue.

Which point?

It is impossible for the compiler to determine that.

Personally I would say the design fault is trying to build 
_everything_ into a single OS process.

The mechanism that is guaranteed, enforced by the hardware, to 
recover all resources and reset to a sane point is OS process 
exit.

ie. If you need "bug" tolerance, decompose your system into 
multiple processes. This actually has a large number of other 
benefits. (eg. Automagically concurrent)

Of course, you then need to encode some common sense in the 
harness... if something keeps on starting up and dying within a 
very short period of time.... stop restarting it.

Of course, this is just one (of many) ways that a program bug can 
screw up a system. For example it can start chewing way too many 
resources.

So your harness needs to be able to limit that.

And of course if you are going to decompose in processes, a 
process may spawn many more, so you need to shepherd all the 
subprocesses sanely.....

...and start the herd of processes in appropriate order, and shut 
them down appropriately....

Sounds like quite an intelligent harness...

Fortunately one exists and has really carefully thought through 
all these issues.

It's called systemd and works very well.
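
The restart rule mentioned above (stop restarting something that keeps dying within a very short period) can be sketched in D. This is only an illustration, not how systemd or any real harness decides: the function name `shouldRestart` and both thresholds are made up for the example.

```d
import core.time : Duration, seconds;

/// Hypothetical policy: decide whether a supervisor should restart a child.
/// `lifetimes` holds how long each previous run lasted, oldest first.
/// The default thresholds are illustrative, not recommendations.
bool shouldRestart(const Duration[] lifetimes,
                   Duration minHealthy = 10.seconds,
                   size_t maxQuickDeaths = 3)
{
    size_t quick = 0;
    // Count the consecutive short-lived runs at the end of the history.
    foreach_reverse (d; lifetimes)
    {
        if (d >= minHealthy)
            break;
        ++quick;
    }
    // Give up once the child has died quickly too many times in a row.
    return quick < maxQuickDeaths;
}
```

A supervising loop would start the child (for instance with std.process.spawnProcess), wait for it to exit, record how long it ran, and ask this function before starting it again.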

May 31 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote:
[...]
 Personally I would say the design fault is trying to build
 _everything_ into a single OS process.
 
 The mechanism that is guaranteed, enforced by the hardware, to recover
 all resources and reset to a sane point is OS process exit.
 
 ie. If you need "bug" tolerance, decompose your system into multiple
 processes. This actually has a large number of other benefits. (eg.
 Automagically concurrent)

[...]

Again, from an engineering standpoint, this is a tradeoff.

The self-containment of an OS-level process is good for keeping it
from affecting other processes, but it comes at a cost.  In the case
of vibe.d, while I can't speak for the design rationales because I'm not
involved in its development, it does appear to me that fibres were
chosen because of their very low context-switch cost and memory
requirements.  If you were to turn the fibres into full-blown processes,
that means incurring the cost of saving/restoring the full process
context, because that's what it takes to achieve independence between
processes. You need a bigger memory footprint because each process needs
to have its own copy of data in order to ensure independence.

It may very well be that for your particular design, process
independence is important, so this price may be well worth paying.

The fibre route chosen by vibe.d comes with the advantage of faster
context switches and smaller memory footprint (and probably other perks
as well), but the price you pay for that performance boost is that the
fibres are not self-contained and isolated from each other.  So if one
fibre goes awry, you can no longer guarantee that the other fibres
aren't also compromised. Hence if you wish to guarantee safety in case
of logic errors like out-of-bounds array accesses, you're forced to
reset the entire process before you can be absolutely sure you're
back in a sane state.

Which route to choose depends on the particulars of what you're trying
to achieve, and how much / whether you're willing to pay the price to
achieve what you want.


T

-- 
Today's society is one of specialization: as you grow, you learn more and more
about less and less. Eventually, you know everything about nothing.

May 31 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:
 On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via 
 Digitalmars-d wrote: [...]
 [...]

 [...]

 Again, from an engineering standpoint, this is a tradeoff.

 [...]

That's exactly the point: to use the right tool for the 
requirement of the job to be done.

/P

Jun 01 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 01.06.2017 10:47, Paolo Invernizzi wrote:
 On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:
 On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via 
 Digitalmars-d wrote: [...]
 [...]

 [...]

 Again, from an engineering standpoint, this is a tradeoff.

 [...]

 
 That's exactly the point: to use the right tool for the requirement of 
 the job to be done.
 
 /P

There is no such tool.

Jun 01 2017

Jacob Carlborg <doob me.com> writes:

On 2017-06-01 21:20, Timon Gehr wrote:

 There is no such tool.

In this case, Erlang is a pretty good candidate. It's using green 
processes that are even more lightweight than fibers. You can have 
millions of these processes. All data is process local. If there's a 
corruption in one of the processes it cannot affect the other ones 
(unless there's a bug in the virtual machine). The major downside is 
that it's not D and it's a pretty crappy programming language.

-- 
/Jacob Carlborg

Jun 01 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Thursday, 1 June 2017 at 19:20:01 UTC, Timon Gehr wrote:
 On 01.06.2017 10:47, Paolo Invernizzi wrote:
 On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:
 On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via 
 Digitalmars-d wrote: [...]
 [...]

 [...]

 Again, from an engineering standpoint, this is a tradeoff.

 [...]

 
 That's exactly the point: to use the right tool for the 
 requirement of the job to be done.
 
 /P

 There is no such tool.

Process isolation was exactly crafted for that.

/Paolo

Jun 01 2017

Vladimir Panteleev <thecybershadow.lists gmail.com> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 I have discovered an annoyance in using vibe.d instead of 
 another web framework. Simple errors in indexing crash the 
 entire application.

Since I wrote/run a bunch of websites/network services written in 
D, here's my experience/advice:

First, this is not something specific to array indexing, but an 
entire class of logic errors which are sometimes recoverable. 
Other examples are associative array indexing, division by zero, 
and out-of-memory errors resulting from underflows. All of these 
are due to bugs in the program, but could hypothetically be 
handled without compromising the integrity of the process.

My advice:

1. Let the program crash. Make sure it's restarted afterwards, 
either via a looping script, or a watchdog.

2. Make sure you are notified of the error. I don't mean just 
recorded in a log file somewhere, but set it up so you receive an 
email any time it happens, with the stack trace. I run all my D 
network services from a cronjob, which automatically sends output 
by email. If you have the stack trace, most of these bugs take 
only a few minutes to fix - at the very least, turning the error 
into an exception is a trivial modification if you don't have 
time for a full root cause analysis at that moment.

3. Design your program so that it can be terminated at any point 
without resulting in data corruption. I don't know if Vibe.d can 
satisfy this constraint, but e.g. the ae.net.http.server workflow 
is to build/send the entire response atomically, meaning that the 
Content-Length will always be populated. Wrap your database 
updates in transactions. Use the "write to temporary file then 
rename over the original file" pattern when updating files. Etc.
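
A minimal sketch of the "write to temporary file then rename over the original file" pattern from point 3, using std.file. The helper name `atomicWrite` and the ".tmp" suffix are assumptions made for this example. It does not flush to stable storage, so it protects against a crashing process rather than against power loss.

```d
import std.file : rename, write;

/// Hypothetical helper: replace the file at `path` so that a reader sees
/// either the complete old content or the complete new content.
void atomicWrite(string path, const(void)[] content)
{
    // The ".tmp" suffix is an arbitrary choice for this sketch.
    string tmp = path ~ ".tmp";
    // If the process dies during this call, `path` is still untouched.
    write(tmp, content);
    // On POSIX the rename is atomic when both names are on the same filesystem.
    rename(tmp, path);
}
```

If the program is killed between the two calls, the worst outcome is a stale temporary file next to an intact original.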

Jun 01 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 6/1/2017 2:53 AM, Vladimir Panteleev wrote:
 3. Design your program so that it can be terminated at any point without 
 resulting in data corruption. I don't know if Vibe.d can satisfy this 
 constraint, but e.g. the ae.net.http.server workflow is to build/send the entire 
 response atomically, meaning that the Content-Length will always be populated. 
 Wrap your database updates in transactions. Use the "write to temporary file 
 then rename over the original file" pattern when updating files. Etc.

This is the best advice.

I.e. design with the assumption that failure will occur, rather than fruitlessly 
trying to prevent all failure.

Jun 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/1/17 2:00 PM, Walter Bright wrote:
 On 6/1/2017 2:53 AM, Vladimir Panteleev wrote:
 3. Design your program so that it can be terminated at any point
 without resulting in data corruption. I don't know if Vibe.d can
 satisfy this constraint, but e.g. the ae.net.http.server workflow is
 to build/send the entire response atomically, meaning that the
 Content-Length will always be populated. Wrap your database updates in
 transactions. Use the "write to temporary file then rename over the
 original file" pattern when updating files. Etc.

 This is the best advice.

 I.e. design with the assumption that failure will occur, rather than
 fruitlessly trying to prevent all failure.

Indeed it is good advice. I'm thinking actually a good setup is to have 
2 levels of processes: one which delivers requests to some set of child 
processes that handle the requests with fibers, and one which handles 
the i/o to the client. Then if the subprocess dies, the master process 
can both inform the client of the failure, and retry other fibers that 
were in process but never had a chance to finish.

Not sure if I'll get to that point. At this time, I'm writing an array 
wrapping struct that will turn all range errors into range exceptions. 
Then at least I can inform the client of the error and continue to 
handle requests.

-Steve
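
A rough sketch of what such an array-wrapping struct could look like. This is not the library mentioned in the post: the names `RangeException` and `CheckedArray` are invented here, and only indexing is covered (no slicing, no `$`).

```d
import std.conv : text;

/// Hypothetical exception type. Unlike core.exception.RangeError it derives
/// from Exception, so it can be caught and handled normally.
class RangeException : Exception
{
    this(string msg)
    {
        super(msg);
    }
}

/// Hypothetical wrapper around a slice whose indexing throws an Exception
/// instead of relying on the built-in bounds check.
struct CheckedArray(T)
{
    T[] data;

    size_t length() const
    {
        return data.length;
    }

    ref T opIndex(size_t i)
    {
        if (i >= data.length)
            throw new RangeException(
                text("index ", i, " out of bounds for length ", data.length));
        return data[i];
    }
}
```

With this, `arr[3] = 5` on a three-element `CheckedArray!int` throws a `RangeException` that a request handler can catch, report to the client, and move on from. The check is ordinary code, so it is not removed when the built-in bounds checks are disabled by a compiler switch.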

Jun 01 2017

Martin Tschierschke <mt smartdolphin.de> writes:

On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer 
wrote:
 I have discovered an annoyance in using vibe.d instead of 
 another web framework. Simple errors in indexing crash the 
 entire application.

Is this option useful for you?

VibeDebugCatchAll 	Enables catching of exceptions that derive 
from Error. This can be useful during application development to 
get useful error information while keeping the application 
running, but can generally be dangerous, because the application 
may be left in a bad state after an Error has been thrown.

From: http://vibed.org/docs#compile-time-configuration

Jun 01 2017

"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:

On 06/01/2017 09:54 AM, Martin Tschierschke wrote:
 On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
 I have discovered an annoyance in using vibe.d instead of another web 
 framework. Simple errors in indexing crash the entire application.

 Is this option useful for you?
 
 VibeDebugCatchAll     Enables catching of exceptions that derive from 
 Error. This can be useful during application development to get useful 
 error information while keeping the application running, but can 
 generally be dangerous, because the application may be left in a bad 
 state after an Error has been thrown.
 
 From: http://vibed.org/docs#compile-time-configuration
 
 

All that would do is *cause* corruption due to the way the runtime 
handles (or more precisely, doesn't handle) a thrown Error.

Jun 01 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:
 I have discovered an annoyance in using vibe.d instead of another web 
 framework. Simple errors in indexing crash the entire application.
 
 For example:
 
 int[3] arr;
 arr[3] = 5;
 
 Compare this to, let's say, a malformed unicode string (exception), 
 malformed JSON data (exception), file not found (exception), etc.
 
 Technically this is a programming error, and a bug. But memory hasn't 
 actually been corrupted. The system properly stopped me from corrupting 
 memory. But my reward is that even though this fiber threw an Error, and 
 I get an error message in the log showing me the bug, the web server 
 itself is now out of commission. No other pages can be served. This is 
 like the equivalent of having a guard rail on a road not only stop you 
 from going off the cliff but proactively disable your car afterwards to 
 prevent you from more harm.
 
 This seems like a large penalty for "almost" corrupting memory. No other 
 web framework I've used crashes the entire web server for such a simple 
 programming error. And vibe.d has no choice. There is no guarantee the 
 stack is properly unwound, so it has to accept the characterization of 
 this is a program-ending error by the D runtime.
 
 I am considering writing a set of array wrappers that throw exceptions 
 when trying to access out of bounds elements. This comes with its own 
 set of problems, but at least the web server should continue to run.
 
 What are your thoughts? Have you run into this? If so, how did you solve 
 it?

This is a meaningful concern. People use threads instead of processes 
for serving requests to improve speed and footprint. Threads hardly 
communicate with one another so they are virtually independent. D 
already has good mechanisms for isolating threads effectively (the 
shared qualifier, @safe) so an argument could be made that bringing down 
the entire process because a thread has had a problem is 
a disproportionate response.

This was a concern about using D on the server for Facebook as well.

Of course, it may be the case that that thread's failure reflects a 
memory corruption that affects all others, so one could reduce the 
matter to this and argue the entire process should be brought down. But 
of course things are never as simple as we'd like them to be.

Array bound accesses should be easy to intercept and have them just kill 
the current thread. Vibe may want to do that, or allow their users to. 
The more difficult matter is null pointer dereferences. I recall there 
has been work in druntime to convert memory violations into thrown 
Errors at least on Linux. You may want to look into that.

It seems to me we'd do well to improve matters on this front.


Thanks,

Andrei

Jun 02 2017

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote:
 Array bound accesses should be easy to intercept and have them 
 just kill the current thread.

Ideally, fiber, as well.  Probably the real ideal for this sort 
of problem is to be able to be as close as possible to Erlang, 
where errors bring down the particular task in progress, but not 
the application that spawned the task.

Incidentally, I wouldn't limit the area of concern here to array 
bound access issues.  This is more about the ability of _any_ 
error to propagate in applications of this nature, where you have 
many independent tasks being spawned in separate threads or (more 
often) fibers, and where you absolutely do not want an error in 
one task preventing you from being able to continue with others.
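
As a small illustration of task-level containment with what D offers today, the sketch below runs each task in its own core.thread.Fiber and records failures instead of letting them stop the remaining tasks. The function name `runIsolated` is made up for the example. It deliberately contains only Exceptions and rethrows anything derived from Error, since whether an Error can be safely contained is exactly what this thread is debating.

```d
import core.thread : Fiber;

/// Hypothetical helper: run every task in its own fiber.
/// Returns one entry per task: null on success, the exception message otherwise.
string[] runIsolated(void delegate()[] tasks)
{
    string[] outcomes;
    foreach (task; tasks)
    {
        auto fiber = new Fiber(task);
        // Rethrow.no returns whatever the fiber threw instead of rethrowing it.
        Throwable thrown = fiber.call(Fiber.Rethrow.no);
        // Errors signal program bugs; this sketch does not try to survive them.
        if (auto err = cast(Error) thrown)
            throw err;
        outcomes ~= (thrown is null) ? null : thrown.msg;
    }
    return outcomes;
}
```

A task that throws an Exception ends only its own fiber; the loop carries on with the next task.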

Jun 04 2017

Jacob Carlborg <doob me.com> writes:

On 2017-06-04 20:15, Joseph Rushton Wakeling wrote:
 On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote:
 Array bound accesses should be easy to intercept and have them just
 kill the current thread.

 Ideally, fiber, as well.  Probably the real ideal for this sort of
 problem is to be able to be as close as possible to Erlang, where errors
 bring down the particular task in progress, but not the application that
 spawned the task.

Erlang has the philosophy of share nothing between processes (green 
processes), or task as you call it here. All allocations are process 
local, that makes it easier to know that a failing process doesn't 
affect any other process.

-- 
/Jacob Carlborg

Jun 04 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Sunday, 4 June 2017 at 19:12:42 UTC, Jacob Carlborg wrote:
 On 2017-06-04 20:15, Joseph Rushton Wakeling wrote:
 On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu 
 wrote:
 Array bound accesses should be easy to intercept and have 
 them just
 kill the current thread.

 Ideally, fiber, as well.  Probably the real ideal for this 
 sort of
 problem is to be able to be as close as possible to Erlang, 
 where errors
 bring down the particular task in progress, but not the 
 application that
 spawned the task.

 Erlang has the philosophy of share nothing between processes 
 (green processes), or task as you call it here. All allocations 
 are process local, that makes it easier to know that a failing 
 process doesn't affect any other process.

If I'm not wrong, it also uses a VM, even though a native code 
compiler is available...
If a VM is involved, it's another game...

/Paolo

Jun 04 2017

Jacob Carlborg <doob me.com> writes:

On 2017-06-04 21:24, Paolo Invernizzi wrote:

 If I'm not wrong, it also uses a VM, even though a native code compiler
 is available...
 If a VM is involved, it's another game...

Yes, it's running on a VM, the Beam.

-- 
/Jacob Carlborg

Jun 04 2017

Ola Fosheim Grøstad writes:

On Sunday, 4 June 2017 at 19:24:27 UTC, Paolo Invernizzi wrote:
 If I'm not wrong, it also uses a VM, even though a native code 
 compiler is available...
 If a VM is involved, it's another game...

Not sure if I follow that.  If you only use safe code then there 
should be no difference between using a VM or not.  And what is a 
VM these days anyway? (e.g. hypervisors and micro code caches in 
CPUs etc)

Now, you might argue that some IRs are too complicated, and that 
a simple IR is easier to get right. Or that some concurrency 
models are more volatile than others. That is true, but it 
doesn't have much to do with using a VM.

So the only special thing about using a VM in this case is that 
it could allow an actor to migrate to another server while 
running. Which is another game...

Jun 04 2017

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Sunday, 4 June 2017 at 19:12:42 UTC, Jacob Carlborg wrote:
 Erlang has the philosophy of share nothing between processes 
 (green processes), or task as you call it here. All allocations 
 are process local, that makes it easier to know that a failing 
 process doesn't affect any other process.

Indeed.  (I used 'task' here in a deliberately vague sense, in 
order to not be too Erlang- or D-specific.)

The obvious differences in how D handles things seem to make it 
rather hard to get the same ease of error handling, but it would 
be interesting to consider what might get us closer.

Jun 04 2017

nohbdy <nobby vimesnet.net> writes:

I'm using D to write an RSS reader.

As I understand it, the compiler does not guarantee correct 
cleanup when an Error is thrown through a nothrow function. 
Furthermore, it doesn't guarantee that an Error can be caught 
(though it happens to allow it today).

Do I need to modify the compiler to ignore nothrow and treat all 
throwables the same so it doesn't corrupt application state when 
I recover from an Error? Fork vibe.d and every other library I 
use to remove nothrow? I can't really justify that. My RSS reader 
is a side project.

Do I accept that writing my code in D will result in a program 

show a 503 and log an error to disk? That's a disservice to my 
users.

Do I increase development time to make up for D's problems in 
this area, pipe requests through a proxy that will convert 
crashes to 503 errors, split things out into as many processes as 

a wide variety of ways, but I'd save a lot of work and complexity.

And this practice is to make code marginally more efficient in 
uncommon cases, because people are conflating "this is a problem 
that a competent programmer should have been able to avoid" 
(yeah, okay, I was incautious, we can move on) with "this 
dependency of yours, probably the runtime, is in an invalid 
state", and nothrow optimizations assume the latter only.

And it's exacerbated because bounds checking is seen as an option 
to help with debugging instead of a safety feature to be used in 
production. Because removing bounds checking is seen as a 
sensible thing to do instead of a highly unsafe optimization.

It's exacerbated because Walter is in a mindset of writing 
mission-critical applications where any detectable bug means you 
need to restart the program. Honestly, if I were writing flight 
control systems for Airbus, I could modify druntime to raise 
SIGABRT or call exit(3) when you try to throw an Error. It would 
be easy, and it would be worthwhile. If you really need cleanup, 
atexit(3) is available.

Jun 02 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:

 It's exacerbated because Walter is in a mindset of writing 
 mission-critical applications where any detectable bug means 
 you need to restart the program. Honestly, if I were writing 
 flight control systems for Airbus, I could modify druntime to 
 raise SIGABRT or call exit(3) when you try to throw an Error. 
 It would be easy, and it would be worthwhile. If you really 
 need cleanup, atexit(3) is available.

The worst thing that has happened in programming in the last 30 years is 
that fewer and fewer programmers are adopting Walter's mindset...

I'm really really puzzled by why this topic pops up so often...


/Paolo

Jun 02 2017

Ola Fosheim Grøstad writes:

On Saturday, 3 June 2017 at 06:55:35 UTC, Paolo Invernizzi wrote:
 The worst thing that has happened in programming in the last 30 years is 
 that fewer and fewer programmers are adopting Walter's 
 mindset...

Really?

On the contrary. What is being adopted is robustness and program 
verification. More and more.

Assuming that a program shouldn't be able to flush its buffers 
out of some flawed reasoning about program correctness does not 
support your argument at all.

Even if your program is fully based on event-sourcing and can 
deal with an immediate shutdown YOU STILL WANT TO FLUSH YOUR 
EVENT-BUFFERS TO DISK!

The argument Walter is following is flawed. If a failed assert 
means you should not be able to flush to disk, then it also means 
that you should undo everything the program has ever written to 
disk.

The incorrect program state could have occurred at install.

You have to reason about these things in probabilistic terms and 
not in absolutes.

Jun 03 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Saturday, 3 June 2017 at 07:51:55 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 3 June 2017 at 06:55:35 UTC, Paolo Invernizzi 
 wrote:
 The worst thing that has happened in programming in the last 30 years 
 is that fewer and fewer programmers are adopting Walter's 
 mindset...

 Really?

 On the contrary. What is being adopted is robustness and 
 program verification. More and more.

It doesn't seem to me that the trend of trying to somehow handle 
the fact that something, somewhere, who knows when, has gone wild is 
coherent with the term "robustness".

And the fact that the "nice tries" are done at runtime, in 
production, is the opposite of what I think program 
verification is.

 Assuming that a program shouldn't be able to flush its buffers 
 out of some flawed reasoning about program correctness does not 
 support your argument at all.

 Even if your program is fully based on event-sourcing and can 
 deal with an immediate shutdown YOU STILL WANT TO FLUSH YOUR 
 EVENT-BUFFERS TO DISK!

There's a fundamental difference between trying to flush logs and 
report what happened, with the aim of gaining more 
information about it, and trying to "automagically" 
handle the fact that there's an error in the implementation, or 
in the logic, or in the HW.

 The argument Walter is following is flawed. If a failed assert 
 means you should not be able to flush to disk, then it also 
 means that you should undo everything the program has ever 
 written to disk.

 The incorrect program state could have occurred at install.

The argument Walter is following is not flawed: it's a really 
beautiful, pragmatic balance of risks, and an engineering way of 
developing software, IMHO.

 You have to reason about these things in probabilistic terms 
 and not in absolutes.

I'm trying to do exactly that; I like to think of myself as a very 
pragmatic person...

/Paolo

Jun 03 2017

Ola Fosheim Grøstad writes:

On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi wrote:
 It doesn't seem to me that the trend of trying to somehow handle 
 the fact that something, somewhere, who knows when, has gone 
 wild is coherent with the term "robustness".

That all depends. It makes perfect sense in a "strongly pure" 
function to just return an exception for basically anything that 
went wrong in that function.

I use this strategy in other languages for writing 
validator_functions, it is a very useful and time-saving way of 
writing validators. E.g.:

try {
     …
     validated_field = validate_input(unvalidated_input);
}

I don't really care why validate_input failed. Even if it was a 
logic flaw in the "validate_input" code itself, it is perfectly 
fine to just respond to the exception, log the failure, return a 
failure status code and continue with the next request.

The idea that programs can do provably full veracity checking of 
input isn't realistic in evolving code bases that need constant 
updates.

My "validate_input" only has to be correct for correct input. Whether 
it fails because the input is wrong or because the validation 
spec is wrong does not matter, as long as it fails.
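
To make the pattern concrete, here is a minimal D sketch of a handler that turns any validator failure into a status code. The function names and the day-of-month rule are invented for the example. Note that it catches Exception only: under D's current rules a logic flaw such as an out-of-bounds index inside the validator raises an Error, which this handler does not trap.

```d
import std.conv : to;
import std.exception : enforce;

/// Hypothetical validator: returns the day of month or throws an Exception.
int validateDay(string raw)
{
    // std.conv.to throws ConvException on input that is not an integer.
    int day = raw.to!int;
    enforce(day >= 1 && day <= 31, "day out of range");
    return day;
}

/// Hypothetical handler: the reason for the failure does not matter here,
/// only that the request is rejected and the next one can be served.
int handleRequest(string raw)
{
    try
    {
        validateDay(raw);
        return 200;
    }
    catch (Exception e)
    {
        // A real handler would also log e.msg before responding.
        return 400;
    }
}
```

The validator stays short because it never has to report why it failed; the caller treats every Exception the same way.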

 And the fact that the "nice tries" are done at runtime, in 
 production, is the opposite of what I think program 
 verification is.

Program verification requires a spec.

In the above example the spec could be that it should never allow 
illegal input to pass, but it could also make room for failing 
for some legal input.

"false alarm" is a concept that is allowed for in many real world 
applications.

In this context it means that you throw too many exceptions, but 
that does not mean that you don't throw an exception when you 
should have.

 I'm trying to do exactly that; I like to think of myself as a very 
 pragmatic person...

What do you mean by "pragmatic"? Shutting down a B2B website 
because one insignificant request-handler fails on some requests 
(e.g. requesting a help page)  is not very pragmatic.

Pragmatic in this context would be to specify which handlers are 
critical and which ones are not.

Jun 03 2017

Ola Fosheim Grøstad writes:

Anyway, all of this boils down to the question of whether D 
really provides a safe programming environment.

If you only write safe code in a safe language, then it should be 
perfectly ok to trap and deal with a failed lookup, irrespective 
of what kind of data-structure it is.

So, if this isn't possible in D, then D isn't able to compete 
with other safe programming languages...

But then maybe one shouldn't try to sell it as a safe programming 
language either.

You can't really have it both ways.

Jun 03 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Saturday, 3 June 2017 at 10:47:36 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi 
 wrote:
 It doesn't seem to me that the trend of trying to somehow handle 
 the fact that something, somewhere, who knows when, has gone 
 wild is coherent with the term "robustness".

 That all depends. It makes perfect sense in a "strongly pure" 
 function to just return an exception for basically anything 
 that went wrong in that function.

 I use this strategy in other languages for writing 
 validator_functions, it is a very useful and time-saving way of 
 writing validators. E.g.:

 try {
     …
     validated_field = validate_input(unvalidated_input);
 }

 I don't really care why validate_input failed. Even if it was a 
 logic flaw in the "validate_input" code itself, it is perfectly 
 fine to just respond to the exception, log the failure, return a 
 failure status code and continue with the next request.

 The idea that programs can do provably full veracity checking 
 of input isn't realistic in evolving code bases that need 
 constant updates.

Sorry Ola, I can't support that way of working.

Don't take it the wrong way, Walter is doing a lot on @safe, but 
compilers are built from a codebase, and the codebase has, 
anyway, bugs.

I can't approve a "ok, do whatever you want in the validate_input 
and I try to *safely* throw"

IMHO you can only do that if the validator is totally segregated, 
and to me that means in a separate process, not even in another 
thread.

 I'm trying to do exactly that; I like to think of myself as a 
 very pragmatic person...

 What do you mean by "pragmatic"? Shutting down a B2B website 
 because one insignificant request-handler fails on some 
 requests (e.g. requesting a help page)  is not very pragmatic.

 Pragmatic in this context would be to specify which handlers 
 are critical and which ones are not.

To me, pragmatic means that the B2B website has to be organised 
in a way that the impact is minimal if one of the processes that 
are handling the requests is restarted, for a bug or not. See 
Laeeth [1]. Just hand off "insignificant requests" to a cheaper, 
less robust, less costly, web stack.

But we were talking about another argument...

/Paolo

[1] 
http://forum.dlang.org/post/uvhlxtolghfydydoxwfg forum.dlang.org

Jun 03 2017

Ola Fosheim Grøstad writes:

On Saturday, 3 June 2017 at 11:18:16 UTC, Paolo Invernizzi wrote:
 Sorry Ola, I can't support that way of working.

 Don't take it the wrong way, Walter is doing a lot on @safe, but 
 compilers are built from a codebase, and the codebase has, 
 anyway, bugs.

 I can't approve a "ok, do whatever you want in the 
 validate_input and I try to *safely* throw"

If the compiler is broken then anything could happen, at any 
time. So that merely suggests that you consider the current 
version to be of beta-quality.

Would you make the same argument for Python?


 IMHO you can only do that if the validator is totally 
 segregated, and to me that means in a separate process, not 
 merely in another thread.

Well, that would be very tedious.

The crux is:

The best way to write a good validator is to make the code in 
the validator as simple as possible so that you can reduce the 
probability of making mistakes in the implementation of the spec.

If you have to add code for things like division-by-zero logic 
flaws or out-of-bounds checks, then you make it harder to catch 
mistakes in the validator and increase the probability of a much 
worse situation: letting illegal input pass.

So for a validator I want to focus my energy on writing simple, 
crystal-clear code that only allows legal input to pass. That 
makes the overall system robust, as long as the language is 
capable of trapping all the logic flaws and classifying them as 
validation errors.

So there is a trade off here. What is more important?

1. Increasing the probability of correctly implementing the 
validation spec to keep the database consistent.

2. Reducing the chance of the unlikely event that the 
compiler/unsafe code could cause the validator to pass when it 
shouldn't.

If the programmer knows that the validator was written in this 
way, it also isn't a big deal to catch all Errors from it. 
Probabilistically speaking, the compiler being the cause here 
would be a highly unlikely event (power failure would be much 
more likely).
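
A minimal D sketch of that approach, for illustration only: a deliberately simple validator whose logic flaw (an out-of-bounds index) is trapped and reported as a validation failure. The names validateInput and isValid are hypothetical and not taken from vibe.d or any other library. The sketch assumes a default build with bounds checking enabled, and catching Error is exactly the practice disputed in this thread, since the language makes no promise about program state afterwards.

```d
import std.stdio : stderr;

// Hypothetical validator, kept deliberately simple: no defensive checks.
// It indexes position 3 without checking the length, which is a logic flaw.
bool validateInput(string input)
{
    return input[0] == 'A' && input[3] == 'Z';
}

// Treats any Error escaping the validator (RangeError, AssertError, ...)
// as "input rejected" instead of letting it take the process down.
bool isValid(string input)
{
    try
    {
        return validateInput(input);
    }
    catch (Error e)
    {
        stderr.writeln("validator failed: ", e.msg);
        return false;
    }
}

unittest
{
    assert(isValid("AxxZ"));   // legal input passes
    assert(!isValid("AxxY"));  // illegal input is rejected normally
    assert(!isValid("A"));     // too short: the range violation is caught
    assert(!isValid(""));      // empty: the range violation is caught
}
```

The design choice is that a flaw in the validator can only make it stricter (reject), never more permissive, which is the trade-off argued for above.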

 To me, pragmatic means that the B2B website has to be organised 
 in a way that the impact is minimal if one of the processes 
 handling the requests is restarted, because of a bug or not. 
 See Laeeth [1]. Just hand "insignificant requests" to a 
 cheaper, less robust, less costly web stack.

Then we land on the conclusion that development and running cost 
would increase by choosing D over some of the competing 
alternatives for this particular use case.

That's ok.

Jun 03 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 03.06.2017 08:55, Paolo Invernizzi wrote:
 On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:
 
 It's exacerbated because Walter is in a mindset of writing 
 mission-critical applications where any detectable bug means you need 
 to restart the program. Honestly, if I were writing flight control 
 systems for Airbus, I could modify druntime to raise SIGABRT or call 
 exit(3) when you try to throw an Error. It would be easy, and it would 
 be worthwhile. If you really need cleanup, atexit(3) is available.

 
 The worst thing that has happened in programming in the last 30 years is 
 just that fewer and fewer programmers are adopting Walter's mindset...
 
 I'm really really puzzled by why this topic pops up so often...
 
 
 /Paolo

I don't get why you would /restart/ mission-critical software that has 
been shown to be buggy. What you need to do instead: Have a few more 
development teams that create independent implementations of your 
service. (Completely from scratch, as the available libraries were not 
developed to the necessary standard.) All of them should run on 
different hardware produced in different factories by different 
companies. Furthermore, you need to hire a team of testers and software 
verification experts vastly exceeding the team of developers in 
magnitude, etc.

Jun 03 2017

Ola Fosheim Grøstad writes:

On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote:
 I don't get why you would /restart/ mission-critical software 
 that has been shown to be buggy. What you need to do instead: 
 Have a few more development teams that create independent 
 implementations of your service. (Completely from scratch, as 
 the available libraries were not developed to the necessary 
 standard.) All of them should run on different hardware 
 produced in different factories by different companies.
 Furthermore, you need to hire a team of testers and software 
 verification experts vastly exceeding the team of developers in 
 magnitude, etc.

Yes, mission-critical software such as flight control is (and 
should be) proven correct. There is modelling software for this 
very narrow field that will generate correct code.

Or as you say, you can implement 3 different versions, running on 
3 different hardware platforms and shut down the 1 that disagrees 
with the others.
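
The "shut down the one that disagrees" arrangement can be sketched in D as a majority voter. This is only an illustration under simplifying assumptions, not how any certified system does it: Vote and majority are hypothetical names, and results are compared as exact integers, whereas real systems would compare with tolerances.

```d
// Hypothetical result of a 2-out-of-3 vote over three independently
// written implementations of the same computation.
struct Vote
{
    bool agreed;      // true if at least two results matched
    int value;        // the majority value (meaningful only if agreed)
    int disagreeing;  // index of the odd one out, or -1 if there is none
}

// Compares the three results and names the implementation to shut down.
Vote majority(int a, int b, int c)
{
    if (a == b && b == c) return Vote(true, a, -1);
    if (a == b) return Vote(true, a, 2);
    if (a == c) return Vote(true, a, 1);
    if (b == c) return Vote(true, b, 0);
    return Vote(false, 0, -1); // no two agree: nothing can be trusted
}

unittest
{
    assert(majority(7, 7, 7) == Vote(true, 7, -1));
    assert(majority(7, 7, 9) == Vote(true, 7, 2));
    assert(majority(9, 7, 7) == Vote(true, 7, 0));
    assert(!majority(1, 2, 3).agreed);
}
```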

But you still have to think in probabilistic terms, because there 
could be problems with sensors, actuators, human error, etc.

Jun 03 2017

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote:
 On 03.06.2017 08:55, Paolo Invernizzi wrote:
 On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:

 It's exacerbated because Walter is in a mindset of writing
 mission-critical applications where any detectable bug means
 you need to restart the program. Honestly, if I were writing
 flight control systems for Airbus, I could modify druntime to
 raise SIGABRT or call exit(3) when you try to throw an Error.
 It would be easy, and it would be worthwhile. If you really
 need cleanup, atexit(3) is available.

 The worst thing that has happened in programming in the last
 30 years is just that fewer and fewer programmers are adopting
 Walter's mindset...

 I'm really really puzzled by why this topic pops up so often...

 /Paolo

 I don't get why you would /restart/ mission-critical software
 that has been shown to be buggy. What you need to do instead:
 Have a few more development teams that create independent
 implementations of your service. (Completely from scratch, as
 the available libraries were not developed to the necessary
 standard.) All of them should run on different hardware
 produced in different factories by different companies.
 Furthermore, you need to hire a team of testers and software
 verification experts vastly exceeding the team of developers in
 magnitude, etc.

That's what should be done in mission-critical software, and we
are relaxing the constraint of mission critical, it seems [1]

The point is that software, somehow, has to be run, with bugs or
sometimes logic flaws: alas, buggy software is running here [2]...

So, if you have to, you should restart
'not-so-critical-software', and you should code it so that it
can be restarted from time to time.

When the best moment to restart it comes is a matter of opinion,
a judgement between risks and opportunities.

My personal opinion: it should be stopped as soon as a bug is
detected.

/Paolo

[1]
http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry
[2]
https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet

Jun 03 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 03.06.2017 12:44, Paolo Invernizzi wrote:
 On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote:
 On 03.06.2017 08:55, Paolo Invernizzi wrote:
 On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:

 It's exacerbated because Walter is in a mindset of writing 
 mission-critical applications where any detectable bug means you 
 need to restart the program. Honestly, if I were writing flight 
 control systems for Airbus, I could modify druntime to raise SIGABRT 
 or call exit(3) when you try to throw an Error. It would be easy, 
 and it would be worthwhile. If you really need cleanup, atexit(3) is 
 available.

 The worst thing that has happened in programming in the last 30 years is 
 just that fewer and fewer programmers are adopting Walter's mindset...

 I'm really really puzzled by why this topic pops up so often...


 /Paolo

 I don't get why you would /restart/ mission-critical software that has 
 been shown to be buggy. What you need to do instead: Have a few more 
 development teams that create independent implementations of your 
 service. (Completely from scratch, as the available libraries were not 
 developed to the necessary standard.) All of them should run on 
 different hardware produced in different factories by different 
 companies. Furthermore, you need to hire a team of testers and 
 software verification experts vastly exceeding the team of developers 
 in magnitude, etc.

 
 That's what should be done in mission-critical software, and we are 
 relaxing the constraint of mission critical, it seems [1]
 ...

That document says that the crash was caused by a component going down 
after an unexpected condition instead of just continuing to operate 
normally. (Admittedly this is biased reporting, but it is true.)

 The point is that software, somehow, has to be run, with bugs or sometimes 
 logic flaws: alas, buggy software is running here [2]...
 ...

I.e., a detected bug is not always a sufficient reason to bring down the 
entire system.

 So, if you have to, you should restart 'not-so-critical-software', and 
 you should code it so that it can be restarted from time to time.
 ...

I agree. What I don't agree with is the idea that the programmer should 
have no way to figure out which component failed and only stop or 
restart that component if that is the most sensible thing to do under 
the given circumstances. Ideally, the Mars mission shouldn't need to be 
restarted just because there is a bug in one component of the probe.
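
As a rough D sketch of that idea, under stated assumptions: a tiny in-process supervisor that records which component failed and stops only that one. Component and runAll are hypothetical names, not part of druntime or vibe.d. The sketch catches Throwable, which includes Error, so it relies on the very behaviour this thread is arguing about and assumes a build with bounds checking enabled.

```d
import std.stdio : stderr;

// Hypothetical component record: a name, the work it does, and its state.
struct Component
{
    string name;
    void delegate() run;
    bool stopped;
}

// Runs every live component once. A failure stops only the component
// that raised it; the names of the failed components are returned so the
// caller can decide whether to restart them.
string[] runAll(Component[] components)
{
    string[] failed;
    foreach (ref c; components)
    {
        if (c.stopped)
            continue;
        try
        {
            c.run();
        }
        catch (Throwable t)
        {
            stderr.writeln(c.name, " failed: ", t.msg);
            c.stopped = true;
            failed ~= c.name;
        }
    }
    return failed;
}

unittest
{
    int ticks;
    int[] none; // empty array, so indexing it is a range violation
    auto parts = [
        Component("telemetry", () { ticks++; }),
        Component("help-page", () { auto x = none[0]; }),
    ];

    auto first = runAll(parts);
    assert(first == ["help-page"]);   // only the buggy component is named

    auto second = runAll(parts);
    assert(second.length == 0);       // it is skipped from now on
    assert(ticks == 2);               // the healthy component kept running
    assert(parts[1].stopped && !parts[0].stopped);
}
```

Whether stopping, restarting, or aborting is the right reaction is left to the caller, which is the point being made: the policy is chosen per component rather than fixed by the language.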

 When the best moment to restart it comes is a matter of opinion, a 
 judgement between risks and opportunities.
 ...

I.e., the language shouldn't mandate it to be one way or the other.

 My personal opinion: it should be stopped as soon as a bug is detected.
 ...

Which is the right thing to do often enough.

 /Paolo
 
 [1] 
 http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry
 
 [2] 
 https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet

Jun 03 2017
