digitalmars.D - [RFC] Throwing an exception with null pointers
- Richard (Rikki) Andrew Cattermole (129/129) Apr 12 I've been looking once again at having an exception being thrown
- Ogion (3/5) Apr 13 Why `Exception` and not `Error`?
- Richard (Rikki) Andrew Cattermole (5/11) Apr 13 Neither of them.
- Richard (Rikki) Andrew Cattermole (2/15) Apr 13 err, thrown by read barriers and then caught.
- Derek Fawcus (14/34) Apr 13 From the unix/posix perspective, I'd say don't even try. Just
- Richard (Rikki) Andrew Cattermole (3/43) Apr 13 This is how it is implemented currently.
- user1234 (9/28) Apr 13 I think that you actually dont need any language addition at all.
- Derek Fawcus (22/44) Apr 13 As to language level support for nullable vs non-nullable
- a11e99z (12/14) Apr 14 null pointer in x86 is any pointer less than 0x00010000
- a11e99z (6/12) Apr 14 almost same problem is misaligned access:
- Richard (Rikki) Andrew Cattermole (7/22) Apr 14 They are both geared towards killing the process, from what I've read
- Richard (Rikki) Andrew Cattermole (15/26) Apr 14 The null page right, and how exactly did you get access to a value that
- Atila Neves (3/8) Apr 14 I would like to know why one would want this.
- Richard (Rikki) Andrew Cattermole (11/20) Apr 14 Imagine you have a web server that is handling 50k requests per second.
- Arafel (32/52) Apr 14 I won't get into the merits of the feature itself, but I have to say
- Richard (Rikki) Andrew Cattermole (36/46) Apr 14 It isn't a low probability.
- Derek Fawcus (16/18) Apr 14 It is not the dereference which is the issue, that is the
- Richard (Rikki) Andrew Cattermole (30/49) Apr 14 You are not the first to say this, and its indicative of not
- Derek Fawcus (35/45) Apr 14 I'd have to suggest you're chasing something which can not be
- Richard (Rikki) Andrew Cattermole (14/17) Apr 14 In my original post I proposed a three tier solution.
- Richard (Rikki) Andrew Cattermole (7/30) Apr 14 I should mention, like the assert handler you would have the ability to
- Steven Schveighoffer (20/30) Apr 14 This is the exact problem. The solution proposed here just
- Alexandru Ermicioi (6/23) Apr 15 This simply is not manageable. Sometimes it is better to have
- Paolo Invernizzi (13/37) Apr 15 Au contraire!
- Walter Bright (1/1) Apr 16 Thank you, Steven. This is correct.
- Derek Fawcus (19/20) Apr 16 Yup - I like the crash...
- Walter Bright (2/5) Apr 16 That's what templates are for!
- Dave P. (12/18) Apr 16 I think what people are complaining about isn’t that null
- Walter Bright (11/15) Apr 16 Annotation means more than one pointer type.
- Richard (Rikki) Andrew Cattermole (30/49) Apr 16 Annotating the type, and annotating the variable/expression are two
- Derek Fawcus (61/64) Apr 16 An example using it:
- Derek Fawcus (4/22) Apr 16 Having now got this to complain in the desired fashion, I'll now
- Richard (Rikki) Andrew Cattermole (19/41) Apr 16 After a quick play, I would suggest also passing ``--analyze`` if you
- Dave P. (23/40) Apr 16 There are three annotations: `_Nullable`, `_Nonnull` and
- Walter Bright (12/21) Apr 16 I've long since given up on memorizing the documentation of each paramet...
- Dave P. (14/22) Apr 17 It’s easier to goto-definition and see the annotations. Fancier
- Walter Bright (5/8) Apr 18 Same thing happened to me. When I pointed it out on HackerNews recently,...
- user1234 (12/16) Apr 17 Actually I have implemented a runtime detector in another
- Walter Bright (2/3) Apr 18 Yes, indeed they are.
- Derek Fawcus (22/25) Apr 16 One has nullable and nonnull pointers. One uses the former for
- Richard (Rikki) Andrew Cattermole (7/38) Apr 16 I've been implementing this specific thing in my fast DFA implementation...
- Walter Bright (1/1) Apr 16 Sounds like one must do a bunch of casting to make use of non-nullable p...
- Walter Bright (4/4) Apr 16 ```C
- Derek Fawcus (17/22) Apr 17 Yeah, but that is test code as I'm experimenting with the
- Derek Fawcus (15/21) Apr 16 I can't say I've played with them much, having come from C not
- Richard (Rikki) Andrew Cattermole (17/40) Apr 16 The current C++ analyzer for clang doesn't support that.
- Walter Bright (4/9) Apr 16 ```d
- Walter Bright (22/22) Apr 17 Here's something I spent 5 minutes on:
- Dave P. (14/38) Apr 17 First to support classes, you’d have to make a slight change
- Jonathan M Davis (39/53) Apr 17 I am in favor of making changes along the lines that Rikki is proposing ...
- Walter Bright (4/4) Apr 17 I'd like to know what those gdc and ldc transformations are, and whether...
- Jonathan M Davis (11/15) Apr 19 Unfortunately, my understanding isn't good enough to explain those detai...
- Johan (60/80) Apr 21 There is a way now to tell LLVM that dereferencing null is
- Richard (Rikki) Andrew Cattermole (19/24) Apr 21 I agree with what you're saying here, but I want to refine it a little b...
- Meta (12/36) Apr 17 When you work on a team with less skilled/meticulous teammates on
- Atila Neves (6/16) Apr 21 I can, last week. The process crashed, I ran `coredumpctl gdb`,
- Walter Bright (7/11) Apr 17 Sure, you'd want to overload the NonNull template with one that takes a ...
- Richard (Rikki) Andrew Cattermole (22/29) Apr 17 D is designed around the type state initialized, aka nullable.
- Walter Bright (12/22) Apr 18 It always looks simple in such examples, but then there are things like:
- Richard (Rikki) Andrew Cattermole (21/46) Apr 18 Yeah indirection.
- kdevel (3/15) Apr 19 Nobody.
- Ogion (2/26) Apr 18 Isn’t `ref` essentially a non-null pointer?
- Walter Bright (7/8) Apr 18 It's supposed to be. But you can write:
- Meta (3/11) Apr 19 How is this possible? Shouldn't dereferencing p crash the program
- Jonathan M Davis (51/63) Apr 19 It doesn't, because nothing is actually dereferenced. This is like if yo...
- kdevel (14/23) Apr 19 That is undefined behavior. In the C++ standard null references
- Jonathan M Davis (16/42) Apr 19 My point about non-virtual functions and derefencing wasn't really about
- kdevel (9/26) Apr 19 Of course it doesn't and I didn't write that. I wrote that it is
- Jonathan M Davis (17/44) Apr 19 If it's not doing any additional checks, then I don't understand your po...
- kdevel (49/63) Apr 19 int main ()
- Jonathan M Davis (15/38) Apr 19 In both cases it's a valid program where the programmer screwed up, and
- kdevel (17/36) Apr 19 Only the D version is valid. The C++ program violates the std.
- Jonathan M Davis (12/47) Apr 19 I see no practical difference in general, but I guess that the differenc...
- kdevel (2/3) Apr 20 I consider nonconforming generally inacceptable.
- Jonathan M Davis (23/26) Apr 20 Writing a program which doesn't behave properly is always a problem and
- kdevel (45/51) Apr 21 The problematic word is "behave". Only recently there was a
- Derek Fawcus (11/12) Apr 19 Not really, it is similar but not the same.
- Atila Neves (11/29) Apr 16 Possible mitigations:
- Richard (Rikki) Andrew Cattermole (32/64) Apr 16 The only thing you can reliably do on segfault is to kill the process.
- Richard (Rikki) Andrew Cattermole (7/7) Apr 16 I should follow on from this to explain why I care so much.
- Derek Fawcus (17/25) Apr 16 Exactly what are you referring to by "read barrier"?
- Richard (Rikki) Andrew Cattermole (6/37) Apr 16 Yes.
- Derek Fawcus (13/23) Apr 16 I'm not sure what you have in mind, what I have in mind is
- Richard (Rikki) Andrew Cattermole (43/72) Apr 16 ``clang --analyze -Xanalyzer -analyzer-output=text``
- Walter Bright (3/3) Apr 16 The correct solution is to restart the process.
- Richard (Rikki) Andrew Cattermole (11/15) Apr 16 Yes, we are in agreement on this situation.
- Walter Bright (8/8) Apr 16 The focus of safe code in D is preventing memory corruption, not null po...
- Dave P. (2/11) Apr 16 The 0 address on WASM is writable, which has burned me many times.
- Walter Bright (4/5) Apr 16 How that oversight was made seems incredible.
- Walter Bright (2/3) Apr 16 Sounds equivalent to D's Exception and Error hierarchies.
- Richard (Rikki) Andrew Cattermole (14/18) Apr 16 Its not.
- Walter Bright (8/13) Apr 16 Adam do.
- Walter Bright (61/61) Apr 16 I confess I don't understand the fear behind a null pointer.
- Richard (Rikki) Andrew Cattermole (73/149) Apr 16 Agreed.
- GrimMaple (18/19) Apr 21 FYI, some time ago I designed this
- Richard (Rikki) Andrew Cattermole (2/26) Apr 21 That is the Swift solution to the problem, more or less.
- GrimMaple (15/16) Apr 21 My take here is that null pointers (references/whatever) just
I've been looking once again at having an exception being thrown on null pointer dereferencing. However, the following can be extended to other hardware level exceptions.

I do not like the conclusion, but it is based upon facts that we do not control. We cannot rely upon hardware or kernel support for throwing of a null pointer exception. To do this we have to use read barriers, like we do for array bounds checks.

There are three levels of support needed:

1. Language support. It does not alter code generation and is available everywhere. Can be tuned to the user's threshold of pain and need for guarantees.
2. Altered codegen, which throws an exception via a read barrier just like a bounds check does.
3. Something has gone really wrong, the CPU has said NOPE, and a signal has fired to kill the process.

Read barriers can be optimized out thanks to language support via data flow analysis, and they can be turned on/off like bounds checks are.

For Posix we can handle any errors that occur and throw an exception; it's tricky but it is possible.

For Windows it'll result in an exception being thrown, which can be caught and handled... except we do not support the exception mechanism on Win64 for cleanup routines (dmd only), let alone catching. I've asked about this recently; this is not a bug, nor is it guaranteed by the language to work. Even if this were to work, signal handlers can be changed, and they can be a bit touchy at times.

We cannot rely on kernel level or cpu support to catch these errors. To have a 100% solution for within D code there is really only one option: read barriers. We already have them for bounds checks. And they can throw a D exception without any problems, plus they bypass the cpu/kernel guarantees, which can result in infinite loops. This catches logic problems, but not program corruption where pointers point to something that they shouldn't.

There is one major problem with a read barrier on pointers: how do you disable it?
With slices you can access the pointer directly and do the dereference in a way that by-passes it. Sadly we'd be stuck with either a storage class or attribute to turn it off. I know Walter would hate the proposal of a storage class, so that is a no go.

So how does .net handle it? As another Microsoft owned project, .net is a very good thing to study; it has the exact same problems that we do here.

The .net exceptions are split into managed and unmanaged exceptions. Unmanaged exceptions are what we are comparable to (but we don't support their method). These are not meant to be caught by .net, including stuff like null dereference exceptions; they kill the process. The managed exceptions include ones like null dereference for .net and are allowed to be caught. Quite importantly, in frameworks like asp.net this guarantees that non-framework code cannot crash the process even in the most extreme cases. This is possible because null is a valid pointer value, and a pointer cannot point into unmapped memory.

The guarantee of .net that you cannot corrupt a pointer also happens to be what a signal that causes a process crash is good at handling: corrupted pointers.

Nullability in .net is part of the type system, with assistance from data flow analysis to prevent you from doing bad things at compile time. It is a very involved process to upgrade all code to it, and from what I have seen, many people in the D community would be appalled at the notion that they have to explicitly state a pointer as being non-null or nullable. Worse, the typing of a pointer as non-null or nullable tends to be in the type system, but has the data flow analysis to infect other variables.

In C++, nullability is handled via lint level analysis without any language help. It is compiler specific analysis that can require not only opt-ing into it, but also turning on optimizations. We can do better than this. It is nowhere near a desirable solution.

A known good strategy for handling errors in a system is to have three solutions.
For handling pointer related issues we could have three solutions at play:

1. CPU/kernel kills the process via signal/exception handling.
2. A read barrier throws an exception; used when a null dereference occurs, not when pointer corruption occurs, i.e. dereference of unmapped memory.
3. Language level support.

All read barriers in D are optional, with differing levels of action when they error. It should be the same here also. I do not care about the default, although if the default is not on, it could cause problems in practice with, say, PhobosV3.

As already stated, forced typing is likely to annoy too many people, and any pointer typing that could solve that is already out as it is doing the classic managed vs unmanaged pointer typing. There must be a way for the programmer to acknowledge pointer nullability status without it being required or part of a type. This makes us fairly unique.

Since we cannot require attribution by default, we cannot have a 100% solution with just language level support. But we also do not want a common error to sit in the code and affect runtime if at all possible. Or at least in the cases that Adam Wilson and I care about regarding event loops.

Which means we are limited to local only information, similar to C++, except... we can store the results into the type system. The result is that more code is checked than C++, but also less than a fully annotated language. It is possible to opt into more advanced analysis and error when it cannot model your code.
Apr 12
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I've been looking once again at having an exception being thrown on null pointer dereferencing.

Why `Exception` and not `Error`?
Apr 13
On 13/04/2025 9:48 PM, Ogion wrote:
> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>
> Why `Exception` and not `Error`?

Neither of them.

The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.
Apr 13
On 14/04/2025 9:45 AM, Richard (Rikki) Andrew Cattermole wrote:
> On 13/04/2025 9:48 PM, Ogion wrote:
>> Why `Exception` and not `Error`?
>
> Neither of them.
>
> The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.

err, thrown by read barriers and then caught.
Apr 13
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I've been looking once again at having an exception being thrown on null pointer dereferencing. However the following can be extended to other hardware level exceptions.
>
> I do not like the conclusion, but it is based upon facts that we do not control. We cannot rely upon hardware or kernel support for throwing of a null pointer exception. To do this we have to use read barriers, like we do for array bounds checks.
>
> There are three levels of support needed:
>
> 1. Language support. It does not alter code generation and is available everywhere. Can be tuned to the user's threshold of pain and need for guarantees.
> 2. Altered codegen, which throws an exception via a read barrier just like a bounds check does.
> 3. Something has gone really wrong, the CPU has said NOPE, and a signal has fired to kill the process.

From the unix/posix perspective, I'd say don't even try. Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash. Debugging then depends upon examining the corefile, or using a debugger.

The only gain from catching it is to generate some pretty form of back trace before dying, and that is just as well (or better) handled by a debugger, or an external crash handling and backtrace generating corset/monitor/supervision process.

Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler. With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.
Apr 13
On 14/04/2025 1:07 AM, Derek Fawcus wrote:
> [...]
>
> From the unix/posix perspective, I'd say don't even try. Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash. Debugging then depends upon examining the corefile, or using a debugger.
>
> Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler. With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.

This is how it is implemented currently. Which is: don't touch it. It matches my analysis.
Apr 13
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>
> [...]
>
> To have a 100% solution for within D code there is really only one option: read barriers. We already have them for bounds checks. And they can throw a D exception without any problems, plus they bypass the cpu/kernel guarantees which can result in infinite loops. This catches logic problems, but not program corruption where pointers point to something that they shouldn't.
>
> There is one major problem with a read barrier on pointers: how do you disable it? With slices you can access the pointer directly and do the dereference in a way that by-passes it. Sadly we'd be stuck with either a storage class or attribute to turn it off.

I think that you actually don't need any language addition at all. The read barriers can be considered as implicit, "opt-in" contracts, and whether they are codegened can be controlled with a simple command line switch.

At first glance this system may appear costly. That is true, but there are actually many cases for which the compiler can determine that it does not have to add a barrier.
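[Editor's note: a C sketch of the kind of elision described above. `barrier_load` is an invented stand-in for the compiler-inserted check; in D the failing branch would throw instead of reporting through a flag.]

```c
#include <stddef.h>

/* Invented stand-in for the compiler-inserted read barrier: report
   failure instead of dereferencing null. */
static int barrier_load(const int *p, int *ok)
{
    if (p == NULL) {
        *ok = 0;
        return 0;
    }
    *ok = 1;
    return *p;
}

/* Nothing is known about q here, so the barrier has to stay. */
int load_unknown(const int *q, int *ok)
{
    return barrier_load(q, ok);
}

/* The guard dominates the dereference, so data flow analysis can prove
   q is non-null on the second path and drop the inserted check entirely. */
int load_guarded(const int *q)
{
    if (q == NULL)
        return 0;
    return *q;  /* provably non-null here: no barrier emitted */
}
```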
Apr 13
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> As already stated, forced typing is likely to annoy too many people, and any pointer typing that could solve that is already out as it is doing the classic managed vs unmanaged pointer typing. There must be a way for the programmer to acknowledge pointer nullability status without it being required or part of a type. This makes us fairly unique.
>
> Since we cannot require attribution by default, we cannot have a 100% solution with just language level support. But we also do not want a common error to sit in the code and affect runtime if at all possible. Or at least in the cases that Adam Wilson and I care about regarding event loops.
>
> Which means we are limited to local only information, similar to C++, except... we can store the results into the type system. The result is that more code is checked than C++, but also less than a fully annotated language. It is possible to opt into more advanced analysis and error when it cannot model your code.

As to language level support for nullable vs non-nullable pointers, without having used it yet, I believe I'd like to have such. Picking a default is an issue.

I probably need to play (in C) with the clang __nullable and _nonnull markers to see how well they work. From reading the GCC docs I can't see benefit from its mechanisms, as they serve to guide optimisation rather than checks / assertions at compile time and/or checks at runtime.

I think I really want something like Cyclone offered, with forms of non-null pointers and nullable pointers. Or maybe something like Odin/Zig offer, with default non-null pointers and optional nullable pointers, the latter requiring source guards (plus dataflow analysis).

As to how that translates to D, I'm not yet sure. However references alone are not the answer, as I want an explicit annotation at function call sites to indicate that a pointer/reference may be passed.
Hence I have a quibble with D's @safe mode not allowing passing pointers to locals; only mitigated by the 'scoped pointer' annotation when that preview flag is enabled.
Apr 13
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I've been looking once again at having an exception being thrown on null pointer dereferencing.

A null pointer in x86 is any pointer less than 0x00010000.

Also, what about successfully dereferencing 0x23a7b63c41704h827? It depends: whether the process has reserved that memory, committed it, mmapped it, etc. Failure for any "wrong" pointer should be the same as for 0x0000....000 (pure NULL), so a read barrier is the wrong option (maybe good for DEBUG only, for simple cases).

Silent killing of the process by the OS is also the wrong option; not every programmer is a kernel debugger/developer or windbg guru.

imo we need to dig into the .net source and grab their option.
Apr 14
On Monday, 14 April 2025 at 09:49:27 UTC, a11e99z wrote:
> null pointer in x86 is any pointer less than 0x00010000
>
> imo need to dig into .net source and grab their option

Almost the same problem is misaligned access: x86/x64 allows it (except for a few instructions), while ARM prohibits it (as far as I know).

It seems there are no other options except to handle kernel signals and Windows SEH.
Apr 14
On 14/04/2025 10:01 PM, a11e99z wrote:
> almost same problem is misaligned access: x86/x64 allows this except few instructions, ARM prohibits it (as far as I know)

The only way for this to occur in @safe code is program corruption.

> it seems there are no other options except to handle kernel signals and WinSEH

They are both geared towards killing the process; from what I've read it's really not a good idea to throw an exception using them. Even if we could rely upon the signal handler being what we think it is.

We do not support the MSVC exception mechanism for Windows 64-bit dmd, so even if we wanted to do this, we cannot.
Apr 14
On 14/04/2025 9:49 PM, a11e99z wrote:
> null pointer in x86 is any pointer less than 0x00010000

The null page, right. And how exactly did you get access to a value that isn't 0 or a valid pointer? Program corruption. Which should only be possible in non-@safe code.

> also what about successful dereferencing 0x23a7b63c41704h827? its depends: reserved this memory by process, committed, mmaped etc

How did you get this value? Program corruption.

> failure for any "wrong" pointer should be same as for 0x0000....000 (pure NULL)

Null (0) is a special value in D; these other values are simply assumed to be valid. Of course it's not possible to get these other values without calling out of @safe code, where program corruption can occur.

> imo need to dig into .net source and grab their option

.net is an application VM, and for all intents and purposes they will be injecting read barriers before each null dereference. They also have strong guarantees that you cannot have an invalid pointer; it must be null or point to a valid object. It isn't as simple as copying them.

Plus they have the type state analysis as part of the language to handle nullability!
Apr 14
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I've been looking once again at having an exception being thrown on null pointer dereferencing. However the following can be extended to other hardware level exceptions.
>
> [...]

I would like to know why one would want this.
Apr 14
On 15/04/2025 1:51 AM, Atila Neves wrote:
> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I've been looking once again at having an exception being thrown on null pointer dereferencing. However the following can be extended to other hardware level exceptions.
>>
>> [...]
>
> I would like to know why one would want this.

Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day.

In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds. It kills the entire server, losing you potentially the full $1 million before you can fix it.

How likely are you to keep using D, or to be willing to talk about using D positively, afterwards?

ASP.net guarantees that this will kill the task and will give the right response code. No process death.
Apr 14
On 14.04.25 16:22, Richard (Rikki) Andrew Cattermole wrote:
> On 15/04/2025 1:51 AM, Atila Neves wrote:
>> I would like to know why one would want this.
>
> Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds. It kills the entire server, losing you potentially the full $1 million before you can fix it. How likely are you to keep using D, or to be willing to talk about using D positively, afterwards?
>
> ASP.net guarantees that this will kill the task and will give the right response code. No process death.

I won't get into the merits of the feature itself, but I have to say that this example is poorly chosen, to say the least. In fact, it looks to me like a case of "when you only have a hammer, everything looks like a nail": not everything should be handled by the application itself.

As somebody coming rather from the "ops" side of "devops", let me tell you that there is a wide range of tools that you should be using **on top of your application** if you have an app that makes you 1M$ a day, including but not restricted to:

* A monitoring process to make sure the server is running (and healthy). Among this process's tasks are making sure that in case of a failure the main process is fully stopped, killing any leftover tasks, removing lock files, ensuring data sanity, etc., and then restarting the main server again.
* A HA system routing queries to a pool of several servers that are regularly polled for health status, assuming that the failure happens seldom enough that it's very unlikely to affect several backend servers at the same time.
* Some meatbag on-call 24/7 (or even on-site) who can at the very least restart the affected server (including the hardware) if it comes to that.

I mean, a service can fail for a number of reasons, including hardware issues, among which dereferencing a null pointer should be quite low in the scale of probabilities. Having a 1M$/day operation depend on your application's continued run after dereferencing a null pointer would seem to me... rather risky and short-sighted.

On top of that, there's the "small" issue that you can't really be sure what state the application has been left in. I certainly wouldn't want to risk any silent data corruption and would rather kill the process ASAP to start it again from a known good state.

Again, I'm not arguing for or against the feature itself, but I just think this example doesn't do it any help.
Apr 14
On 15/04/2025 2:55 AM, Arafel wrote:
> I mean, a service can fail for a number of reasons, including hardware issues, among which dereferencing a null pointer should be quite low in the scale of probabilities.

It isn't a low probability. The reason why all these application VM languages have been introducing nullability guarantees is because it has been a plague of problems. We have nothing currently.

“I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.”

https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/

> Having a 1M$/day operation depend on your application's continued run after dereferencing a null pointer would seem to me... rather risky and short-sighted.

You need to read my initial post. I concluded that once an instruction executes the dereference it is in fact dead. We are not in disagreement on this.

This is why the static analysis and read barrier are so important: they catch it before it ever happens. The program isn't corrupted by the signal at that point in time. Before that, thanks to ``@safe``, we can assume it is in a valid state, and it's just business logic that is wrong.

Yes, people will disagree with me on that, but the blame is purely on the non-``@safe`` code, which should be getting thoroughly vetted for this kind of thing (and even then I still want the read barriers and type state analysis to kick in, because this can never be correct).
There is a reason why my DIP makes stackless coroutines default to ``@safe``, not just allow you to default to ``@system`` like everything else.

> On top of that, there's the "small" issue that you can't really be sure what state the application has been left in. I certainly wouldn't want to risk any silent data corruption and would rather kill the process ASAP to start it again from a known good state.

By doing this you have killed all other tasks, which makes you lose money. Throw in, say, a scraper and you could still be down all day or more. It is entirely unnecessary downtime given accepted, widely used solutions. It would be bad engineering to ignore this.

It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die. If anything prevents that task from cleaning up, then yes, the process dies.
Apr 14
On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew Cattermole wrote:
> It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die.

It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash. The crash is what allows that bad state to be fixed.

Simply limping along (avoiding the deref) is sweeping the issue under the rug. One doesn't know what else within the complete system is at fault. The @safe annotation is not yet sufficient to ensure that the rest of the system is in a valid state.

Fail fast, and design the complete architecture with redundancy, so that an HA instance can take the (or a sub-set of the) load until the system can regain its redundant operating condition. That is exactly what we do with routers.
Apr 14
On 15/04/2025 3:42 AM, Derek Fawcus wrote:On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew Cattermole wrote:You are not the first to say this, and it's indicative of not understanding the scenario. Coroutines are used for business logic. They are supposed to have guarantees that they can always be cleaned up on framework-level exceptional events. That includes attempts at null dereferencing or out-of-bounds access of slices. They should never have the ability to corrupt the entire program. They need to be `@safe`. If `@safe` allows program corruption, then it needs fixing. If you call `@trusted` code that isn't shipped with the compiler, that isn't our fault it wasn't vetted. But that is what it exists for: so that you can do unsafe things and present a safe API. Any other scenario than coroutines will result in process death.It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die.It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash.The crash is what allows that bad state to be fixed. Simply limping along (avoiding the deref) is sweeping the issue under the rug. One doesn't know what else within the complete system is at fault.Except it isn't the entire system that could be bad. It's one single-threaded, business-logic-laden task. It is not library or framework code that is bad. A single piece of business logic, likely written by a person of graduate-level skill, failed to account for something. This is not the same thing as the entire program being corrupt. If some kind of coroutine isn't in use, the program still dies, just like today.The `@safe` annotation is not yet sufficient to ensure that the rest of the system is in a valid state.Please elaborate.
If `@safe` has a bug, it needs solving.Fail fast, and design the complete architecture so a redundant HA instance can take the (or a subset of the) load until the system can regain its redundant operating condition.That is exactly what we do with routers.While all of that is within my recommendations generally, it does not cover this particular scenario. This is far too common an issue, and within business logic, which will commonly be found inside a coroutine (including Fiber), it should not be bringing down the entire process. ASP.NET, thanks to .NET, offers this guarantee for a reason.
Apr 14
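To make the containment model under discussion concrete, here is a minimal sketch. `NullDereferenceException` and `runTask` are invented names for illustration, not existing druntime or framework APIs; the point is that the catch sits at the task boundary, so only the offending task dies while its siblings and the process continue.

```d
import std.stdio;

// Hypothetical: the exception a read barrier would throw instead of
// letting the CPU fault on a null dereference.
class NullDereferenceException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Hypothetical framework-level task wrapper: the failing task is torn
// down (its scope guards and destructors run during unwinding), while
// every other task and the process itself keep going.
void runTask(void delegate() task)
{
    try
        task();
    catch (NullDereferenceException e)
        writeln("task aborted, process survives: ", e.msg);
}

void main()
{
    runTask({ throw new NullDereferenceException("read barrier tripped"); });
    runTask({ writeln("other tasks unaffected"); });
}
```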
On Monday, 14 April 2025 at 16:12:35 UTC, Richard (Rikki) Andrew Cattermole wrote:You are not the first to say this, and it's indicative of not understanding the scenario. Coroutines are used for business logic. They are supposed to have guarantees that they can always be cleaned up on framework-level exceptional events. That includes attempts at null dereferencing or out-of-bounds access of slices. They should never have the ability to corrupt the entire program. They need to be `@safe`. If `@safe` allows program corruption, then it needs fixing.I'd have to suggest you're chasing something which can not be achieved. It is always possible for the program to get into a partial, or fully, unrecoverable state, where some portion of the system is inoperative, or operating incorrectly. The only way to recover in that case is to restart the program. The reason the program can not recover is that there is a bug, where some portion is able to drive into an unanticipated area, and there is not the correct logic to recover, as the author did not think of that scenario. I recently wrote a highly concurrent program in Go. This was in CSP style, taking care to only access slices and maps from one goroutine, and to not accidentally capture free variables in lambdas. So this manually covered the "safety" escapes which the Rust folks like to point to in Go. The "safety" provided being greater than D currently offers. Also nil pointers are present in Go, and will usually crash the complete program. One is able to catch panics if one desires, so being similar to your exception case. There were many goroutines, and a bunch dynamically started and stopped which ran "business logic". Despite that, it was possible (due to some bugs of mine) to get the system into a state where things could not recover, or which in some cases took 30 mins to recover. A flaw could be detected (via outside behaviour), and could be reasoned through by analysing the log files.
However for the error which did not clear after 30 mins, there was no way for the system to come back into a fully operational state without the program being restarted. A similar situation would be achievable in a Rust program. In trying to handle and recover from such things, you're up against a variant of Gödel's incompleteness theorem, and anyone offering a language which "solves" that is IMHO selling snake oil.
Apr 14
On 15/04/2025 8:53 AM, Derek Fawcus wrote:In trying to handle and recover from such things, you're up against a variant of Gödel's incompleteness theorem, and anyone offering a language which "solves" that is IMHO selling snake oil.In my original post I proposed a three-tier solution. The read barriers are secondary to the language guarantees via type state analysis. You need them when using a fast DFA engine that doesn't do full control-flow-graph analysis and ignores a variable when it cannot analyze it. But if you are ok with having a bit of pain in terms of what can be modeled, and can accept a bit of slowness, you won't need the read barriers. Unfortunately not everyone will accept the slower DFA, therefore it can't be on by default. I know this as it copies a couple of the perceived negative traits of DIP1000. So yes, we can solve this, but MT could still mess it up, hence the last option of the three-tier solution: signals killing the process.
Apr 14
On 15/04/2025 9:28 AM, Richard (Rikki) Andrew Cattermole wrote:On 15/04/2025 8:53 AM, Derek Fawcus wrote:I should mention that, like the assert handler, you would have the ability to configure the read barrier to do whatever you want at runtime. So if you prefer it to kill the process, you can. As to what the default would be? Idk. The benefit of having it is that we can likely have a stack trace, where there might not be one otherwise.In trying to handle and recover from such things, you're up against a variant of Gödel's incompleteness theorem, and anyone offering a language which "solves" that is IMHO selling snake oil.In my original post I proposed a three-tier solution. The read barriers are secondary to the language guarantees via type state analysis. You need them when using a fast DFA engine that doesn't do full control-flow-graph analysis and ignores a variable when it cannot analyze it. But if you are ok with having a bit of pain in terms of what can be modeled, and can accept a bit of slowness, you won't need the read barriers. Unfortunately not everyone will accept the slower DFA, therefore it can't be on by default. I know this as it copies a couple of the perceived negative traits of DIP1000. So yes, we can solve this, but MT could still mess it up, hence the last option of the three-tier solution: signals killing the process.
Apr 14
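A rough sketch of what such a configurable read barrier could look like, modeled on the assert-handler precedent mentioned above. Every name here (`nullReadHandler`, `readBarrier`) is an assumption for illustration, not a proposed or existing API:

```d
// Hypothetical handler, settable at runtime like the assert handler.
alias NullReadHandler = void function(string file, size_t line);

__gshared NullReadHandler nullReadHandler = function(string file, size_t line)
{
    // The default is an open question; throwing preserves a stack trace,
    // but a deployment could install a handler that aborts instead.
    throw new Exception("null pointer read", file, line);
};

// What the compiler might lower a guarded `*p` into.
ref T readBarrier(T)(T* p, string file = __FILE__, size_t line = __LINE__)
{
    if (p is null)
        nullReadHandler(file, line);
    return *p; // only reached when p is non-null (or the handler returned)
}

void main()
{
    int x = 42;
    assert(readBarrier(&x) == 42);   // non-null: passes straight through
    int* q = null;
    try { readBarrier(q); assert(false); }
    catch (Exception e) {}           // null: the handler threw
}
```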
On Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew Cattermole wrote:This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best. Trying to recover is the equivalent of a compiler resolving a syntax ambiguity with a random number generator. Null dereference?

1. Is it because I trusted a user value? => validate user input, rebuild, redeploy
2. Is it because I forgot to initialize something? => initialize it, rebuild, redeploy
3. Is it because I forgot to validate something? => do the validation properly, fix whatever was sending in invalid data, rebuild, redeploy
4. Is it something else? => thank you program, for crashing instead of corrupting everything. Now, time to find the memory corruption somewhere.

Similar flow chart for out-of-bounds errors. -SteveIt is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die.It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash.
Apr 14
On Tuesday, 15 April 2025 at 02:48:42 UTC, Steven Schveighoffer wrote:On Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:This simply is not manageable. Sometimes it is better to have semi-operating systems rather than ones that don't work at all because a minor thing in a tertiary module causes an NPE while a fix is in progress.On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew Cattermole wrote:This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best.It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die.It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash.
Apr 15
On Tuesday, 15 April 2025 at 07:23:14 UTC, Alexandru Ermicioi wrote:On Tuesday, 15 April 2025 at 02:48:42 UTC, Steven Schveighoffer wrote:Au contraire! That's exactly why today we have:

- kernels (or microkernels!)
- processes living in userland
- threads living in processes
- coroutines (or similar stuff, whatever variant and name they take) living on threads
- and last but not least, VMs, plenty of them.

I don't see a real use case for trying to recover a process from UB at all: go down the list and choose another layer. /POn Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:This simply is not manageable. Sometimes it is better to have semi-operating systems rather than ones that don't work at all because a minor thing in a tertiary module causes an NPE while a fix is in progress.On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew Cattermole wrote:This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best.It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die.It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash.
Apr 15
On Wednesday, 16 April 2025 at 18:19:58 UTC, Walter Bright wrote:Thank you, Steven. This is correct.Yup - I like the crash... However I do have an interest in being able to write code with distinct nullable and nonnull pointers, such that the compiler (or an SA tool) can complain when they're incorrectly confused. So passing (or assigning) a nullable pointer to a nonnull one without a prior check should generate a compile error or warning. That should only require function-local DFA. The reason to want it is simply that test cases may not exercise complete coverage for various paths when one only has the C-style pointer, and so it should allow for easy latent bug detection and fixes when one is not bypassing the type system. If one is bypassing the type system, then one takes the risks, but the SIGSEGV is still there to catch the bug. (Yes, I've programmed under DOS. I also took advantage of a protected mode OS (FlexOS) when available to prove and debug the code first. The TurboC-style 'detected a null pointer write' at program exit, while occasionally useful, was grossly inadequate.)
Apr 16
On 4/16/2025 11:43 AM, Derek Fawcus wrote:However I do have an interest in being able to write code with distinct nullable and nonnull pointers. That such that the compiler (or an SA tool) can complain when they're incorrectly confused.That's what templates are for!
Apr 16
On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:On 4/16/2025 11:43 AM, Derek Fawcus wrote:I think what people are complaining about isn’t that null pointers exist at all, it’s that every pointer and reference type has a null value instead of you opting into it. The type system fights you if you want to use it to prove or enforce certain values are nullable or not. Something that people don’t bring up is that clang’s nullability extension allows you to change what the default is. You put a `#pragma clang assume_nonnull begin` at the top of your C/C++/Objective-C code and you have to annotate only the nullable pointers. Most pointers in a program should be non-null and the nullable ones should be the exception that you have to annotate.However I do have an interest in being able to write code with distinct nullable and nonnull pointers. That such that the compiler (or an SA tool) can complain when they're incorrectly confused.That's what templates are for!
Apr 16
On 4/16/2025 12:57 PM, Dave P. wrote:You put a `#pragma clang assume_nonnull begin` at the top of your C/C++/Objective-C code and you have to annotate only the nullable pointers. Most pointers in a program should be non-null and the nullable ones should be the exception that you have to annotate.Annotation means more than one pointer type. Back in the old MSDOS days, there were 5 pointer types - near, far, stack, code and huge. Dealing with that is a gigantic mess - which pointer type does strlen() take? Or worse, strcpy()? Microsoft's Managed C++ has two pointer types with different syntax, a GC pointer and a non-GC pointer. The same problem - what pointer type does strcpy() accept? It's an ugly mess, and why I've avoided any such thing in D. I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?
Apr 16
On 17/04/2025 9:12 AM, Walter Bright wrote:On 4/16/2025 12:57 PM, Dave P. wrote:Annotating the type and annotating the variable/expression are two different things. In the DFA literature they have distinct properties and are applied differently. I read Principles of Program Analysis today; it was very interesting and did have some details on the subject (but not much). It also confirmed some things that I had already come up with independently, which was nice! From what I've seen, application VM languages annotate the type, whereas C++ annotates the variable. As of DIP1000, we annotate the variable, i.e. scope. From a link made previously in this thread, the state-of-the-art annotation of nullability in C++: https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html Very similar to what I'm wanting.You put a `#pragma clang assume_nonnull begin` at the top of your C/C++/Objective-C code and you have to annotate only the nullable pointers. Most pointers in a program should be non-null and the nullable ones should be the exception that you have to annotate.Annotation means more than one pointer type.Back in the old MSDOS days, there were 5 pointer types - near, far, stack, code and huge. Dealing with that is a gigantic mess - which pointer type does strlen() take? Or worse, strcpy()? Microsoft's Managed C++ has two pointer types with different syntax, a GC pointer and a non-GC pointer. The same problem - what pointer type does strcpy() accept?I genuinely would prefer throwing a full-fledged CFG DFA at this kind of thing, and only annotating the variable, not the type. It's a shame not everyone would accept that as a solution. It is forcing me to verify that there are no alternative solutions for these people. I know you and I, Walter, would be happy with a full CFG DFA as the resolution, but alas. I remain heavily concerned at the idea of boxing types in D, in any scenario.
It seems to spell an absolute mess in any attempts I have modeled mentally.It's an ugly mess, and why I've avoided any such thing in D. I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?Sentinels. They are used pretty heavily in data structures, such as head and foot nodes. My recommendation for data structure/algorithm book: https://www.amazon.com/Algorithms-Parts-1-4-Fundamentals-Structures/dp/0201314525
Apr 16
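The sentinel technique reads roughly like this; a minimal sketch (not from the book verbatim). Child links always point at a shared `nil` node, so traversal compares identities against the sentinel instead of ever holding a null pointer:

```d
struct Node
{
    int value;
    Node* left, right;
}

__gshared Node sentinel;   // shared stand-in for "no child"
__gshared Node* nil;       // every leaf link points here, never at null

shared static this()
{
    nil = &sentinel;
    sentinel.left = sentinel.right = nil;  // the sentinel loops to itself
}

Node* leaf(int v) { return new Node(v, nil, nil); }

bool contains(Node* cur, int v)
{
    while (cur !is nil)    // identity test against the sentinel
    {
        if (v == cur.value)
            return true;
        cur = v < cur.value ? cur.left : cur.right;
    }
    return false;
}

void main()
{
    auto root = leaf(10);
    root.left = leaf(5);
    root.right = leaf(20);
    assert(contains(root, 5) && !contains(root, 7));
}
```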
On Wednesday, 16 April 2025 at 21:30:58 UTC, Richard (Rikki) Andrew Cattermole wrote:From a link made previously in this thread, the state of the art annotation of nullability in C++: https://clang.llvm.org/docs/analyzer/developer-docs/nullability.htmlAn example using it:

```C
// Compile with -Wnullable-to-nonnull-conversion
#if defined(__clang__)
#define ASSUME_NONNULL_BEGIN _Pragma("clang assume_nonnull begin")
#define ASSUME_NONNULL_END   _Pragma("clang assume_nonnull end")
#else
#define ASSUME_NONNULL_BEGIN
#define ASSUME_NONNULL_END
#define __has_feature(x) (0)
#endif /* clang */

#if __has_feature (nullability)
#define NONNULL _Nonnull
#define NULLABLE _Nullable
#define NULL_UNSPECIFIED _Null_unspecified
#else
#define NONNULL
#define NULLABLE
#define NULL_UNSPECIFIED
#endif /* nullability */

ASSUME_NONNULL_BEGIN

int *someNonNull(int *);
int *someNullable(int * NULLABLE);

int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3)
{
    int a = 0;
    int * NONNULL ptr;

    a += *someNullable(arg1);
    a += *someNullable(arg2);
    a += *someNullable(arg3);

    ptr = arg1;
    ptr = arg2;
    ptr = arg3;

    a += *someNonNull(arg1);
    a += *someNonNull(arg2);
    a += *someNonNull(arg3);

    return a;
}

ASSUME_NONNULL_END
```

```
$ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
ttt.c:37:8: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
    ptr = arg3;
          ^
ttt.c:41:20: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
    a += *someNonNull(arg3);
                      ^
ttt.c:29:16: warning: variable 'ptr' set but not used [-Wunused-but-set-variable]
    int * NONNULL ptr;
                  ^
3 warnings generated.
```
Apr 16
On Wednesday, 16 April 2025 at 22:43:24 UTC, Derek Fawcus wrote:``` $ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c ttt.c:37:8: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion] ptr = arg3; ^ ttt.c:41:20: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion] a += *someNonNull(arg3); ^ ttt.c:29:16: warning: variable 'ptr' set but not used [-Wunused-but-set-variable] int * NONNULL ptr; ^ 3 warnings generated. ```Having now got this to complain in the desired fashion, I'll now be applying it to some code at work. More help in quashing sources of bugs.
Apr 16
On 17/04/2025 10:54 AM, Derek Fawcus wrote:On Wednesday, 16 April 2025 at 22:43:24 UTC, Derek Fawcus wrote:After a quick play, I would suggest also passing ``--analyze`` if you are not already doing so. It covers the case where you are not assuming non-null. I.e. it can see the difference between:

```c++
void test(int *p)
{
    if (!p)
        *p = 0; // warn
}
```

and

```c++
void test2(int *p)
{
    if (p)
        *p = 0; // ok
}
```

This is definitely DFA.``` $ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c ttt.c:37:8: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion] ptr = arg3; ^ ttt.c:41:20: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion] a += *someNonNull(arg3); ^ ttt.c:29:16: warning: variable 'ptr' set but not used [-Wunused-but-set-variable] int * NONNULL ptr; ^ 3 warnings generated. ```Having now got this to complain in the desired fashion, I'll now be applying it to some code at work. More help in quashing sources of bugs.
Apr 16
On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:On 4/16/2025 12:57 PM, Dave P. wrote:There are three annotations: `_Nullable`, `_Nonnull` and `_Null_unspecified`. `_Null_unspecified` can freely convert between the other two types. If you don’t annotate a pointer (and don’t change the default), then it is a `_Null_unspecified` for compatibility with existing code. `strlen` and company thus take `_Null_unspecified`. They *should* take `_Nonnull`, but it’s old code. For data structures, you mostly use `_Null_unspecified`, as the nullability is not as simple as "this pointer may be null or not"; it is a potentially dynamic invariant of your data structure. In D, you would annotate it as an `@system` variable so only `@trusted` code can modify it. Where the annotations are really valuable is for function arguments and returns. Can this argument be null or not? Now the compiler can help you instead of you having to memorize the documentation of every function you call. I’ve consistently applied these annotations to my C projects and it works wonders. Many classes of mistakes are caught at compile time. Some things are a dynamic property of your system, and for that there is a nullability sanitizer that can detect at runtime if a `_Nonnull` pointer gets a null value (usually from a data structure that was not properly initialized).You put a `#pragma clang assume_nonnull begin` at the top of your C/C++/Objective-C code and you have to annotate only the nullable pointers. Most pointers in a program should be non-null and the nullable ones should be the exception that you have to annotate.Annotation means more than one pointer type. Back in the old MSDOS days, there were 5 pointer types - near, far, stack, code and huge. Dealing with that is a gigantic mess - which pointer type does strlen() take? Or worse, strcpy()? Microsoft's Managed C++ has two pointer types with different syntax, a GC pointer and a non-GC pointer.
The same problem - what pointer type does strcpy() accept? It's an ugly mess, and why I've avoided any such thing in D. I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?
Apr 16
On 4/16/2025 2:49 PM, Dave P. wrote:Where the annotations are really valuable is for function arguments and return. Can this argument be null or not? Now the compiler can help you instead of you having to memorize the documentation of every function you call.I've long since given up on memorizing the documentation of each parameter. Too many functions! I just google it; it just takes a sec.I’ve consistently applied these annotations to my C projects and it works wonders. Many classes of mistakes are caught at compile time.I agree that compile-time detection is always better than runtime. I get a null pointer seg fault now and then. I look at the stack trace, and have it fixed in the same amount of time it takes you. I don't worry about it, because it is not a memory corruption issue. The time I spend tracking down a null pointer seg fault does not even register among the time I spend programming. You're not wrong. But it's a cost/benefit thing.Some things are a dynamic property of your system, and for that there is a nullability sanitizer that can detect at runtime if a `_Nonnull` pointer gets a null value (usually from a data structure that was not properly initialized).I don't see how a runtime detector can work better than a seg fault with stack trace.
Apr 16
On Thursday, 17 April 2025 at 05:31:48 UTC, Walter Bright wrote:On 4/16/2025 2:49 PM, Dave P. wrote:It’s easier to goto-definition and see the annotations. Fancier editors can even show it inline.[...]I've long since given up on memorizing the documentation of each parameter. Too many functions! I just google it, it just takes a sec.[...] I don't see how a runtime detector can work better than a seg fault with stack trace.The source of a null pointer can be very far from its eventual dereference, especially when stored in a structure. The nullability sanitizer also doesn’t kill your program by default; it just logs when a null pointer is stored in a nonnull variable. So you can see where the first null store is made and follow how it eventually moves to where it is dereferenced. --- It’s funny though: when I first started using this kind of thing, it caught loads of bugs. But after years of use, I realized it hadn’t caught any bugs in a long time. I had internalized the rules.
Apr 17
On 4/17/2025 12:21 AM, Dave P. wrote:It’s funny though, when I first started using this kind of thing it caught loads of bugs. But after years of use, I realized it hadn’t caught any bugs in a long time. I had internalized the rules.Same thing happened to me. When I pointed it out on HackerNews recently, I was lambasted for being "arrogant". LOL. I make different kinds of mistakes these days. Mostly due to failure to understand the problem I am trying to solve, rather than coding errors.
Apr 18
On Thursday, 17 April 2025 at 05:31:48 UTC, Walter Bright wrote:[...] I don't see how a runtime detector can work better than a seg fault with stack trace.Actually, I have implemented a runtime detector in another language, and that gives me -- quite rarely -- things like `temp.sx:11:7: runtime error, member read with null 'this'`. So you know directly where the problem is, and often there's no need even to run it again in gdb. That is implemented in that language's equivalent of D's DotVarExp. Concretely, it's about codegenning the same as for an assertion, just right before loading from the LHS. That's why, previously in the thread, I called that an "implicit contract". Aren't assertions the most primitive form of contracts?
Apr 17
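In D terms, the lowering user1234 describes might look like this source-to-source sketch. The exact check is an assumption about their implementation (which was for another language); the idea is simply that every member read carries an implicit assertion on its receiver:

```d
class C { int x; }

// Conceptually, the compiler rewrites every member read `c.x` as if the
// programmer had written the assertion immediately before the load:
int readX(C c, string file = __FILE__, size_t line = __LINE__)
{
    assert(c !is null, "member read with null `this`");  // implicit contract
    return c.x;
}

void main()
{
    auto c = new C;
    c.x = 7;
    assert(readX(c) == 7);
}
```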
On 4/17/2025 4:03 AM, user1234 wrote:Aren't assertions the most primitive form of contracts ?Yes, indeed they are.
Apr 18
On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?One has nullable and nonnull pointers. One uses the former for the circular data structures. Now the current ugly bit with clang is this (an excerpt of the earlier ttt.c example; `ptr` is the NONNULL local):

```C
int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3)
{
    if (arg3 != 0)
        ptr = arg3;
    return a;
}
```

```
$ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
ttt.c:36:9: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
        ptr = arg3;
        ^
```

i.e. there is no DFA, and one still has to cast the value in the assignment. So the clang support could stand some improvement.
Apr 16
On 17/04/2025 10:50 AM, Derek Fawcus wrote:On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:I've been implementing this specific thing in my fast DFA implementation recently. Truthiness is needed for it to work. However, you can still use DFA without supporting truthiness of variables. DFA is a very large subject domain.I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?One has nullable and nonnull pointers. One uses the former for the circular data structures. Now the current ugly bit with clang is this: ```C int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3) { if (arg3 != 0) ptr = arg3; return a; } ``` ``` $ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c ttt.c:36:9: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion] ptr = arg3; ^ ``` i.e. there is no DFA, and one still has to cast the value in the assignment. So the clang support could stand some improvement.
Apr 16
Sounds like one must do a bunch of casting to make use of non-nullable pointers?
Apr 16
```C
int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3) {
```

https://www.youtube.com/watch?v=vEr0EAcSWcc
Apr 16
On Thursday, 17 April 2025 at 05:09:41 UTC, Walter Bright wrote:```C int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3) { ``` https://www.youtube.com/watch?v=vEr0EAcSWccYeah, but that is test code as I'm experimenting with the mechanism, and trying to figure out how it can be used, and what (if any) value I perceive it as adding. I rather suspect the norm would be to use the pragma such that most uses are implicitly non-null, and only the exceptional nullable pointer would have to be so marked. On this particular point, my preferred form would be something similar to what Cyclone had, but not using its syntax forms. Possibly using the syntax from Zig. So the common case would be a non-null pointer, using '*' alone in its declaration. The nullable pointer would probably be declared as '?*' (or '*?' - not sure which). Then one could not assign from a nullable to a non-null unless DFA had already proved the value was not null. Except those would be non-backward-compatible changes, so one would somehow have to opt in to them.
Apr 17
On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:On 4/16/2025 11:43 AM, Derek Fawcus wrote:I can't say I've played with them much, having come from C, not C++. I see there is the ability to overload the unary '*' operator, and so can imagine how one could define a struct providing a non-null form of pointer. But just how awkward is that going to be for mixed forms of nullability in function definitions? Without trying, I suspect it will just get too awkward. E.g., how would the equivalent of these args end up in a D rendition:

```C
int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b);
```

However I do have an interest in being able to write code with distinct nullable and nonnull pointers. That such that the compiler (or an SA tool) can complain when they're incorrectly confused.That's what templates are for!
Apr 16
On 17/04/2025 11:08 AM, Derek Fawcus wrote:On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:The current C++ analyzer for clang doesn't support that.

```c++
int foo(char ** _Nullable a, int ** _Nonnull b);
```

Would be more accurate. For D, my design work for type state analysis would have it be:

```d
int foo(/*?initialized*/ char** a, ?nonnull int** b);
```

If you want to dereference twice, you need to perform a load + check.

```d
if (auto c = *b)
{
    int d = *c;
}
```

Some syntax sugar can lower to that.On 4/16/2025 11:43 AM, Derek Fawcus wrote:I can't say I've played with them much, having come from C, not C++. I see there is the ability to overload the unary '*' operator, and so can imagine how one could define a struct providing a non-null form of pointer. But just how awkward is that going to be for mixed forms of nullability in function definitions? Without trying, I suspect it will just get too awkward. E.g., how would the equivalent of these args end up in a D rendition: ```C int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b); ```However I do have an interest in being able to write code with distinct nullable and nonnull pointers. That such that the compiler (or an SA tool) can complain when they're incorrectly confused.That's what templates are for!
Apr 16
On 4/16/2025 4:08 PM, Derek Fawcus wrote:e.g., how would the equivalent of these args end up in a D rendition:

```C
int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b);
```

```d
int foo(_Nullable!(_Nonnull!char) a, _Nonnull!(_Nullable!int) b);
```
Apr 16
Here's something I spent 5 minutes on:

```d
struct NonNull(T) {
    T* p;
    T* ptr() { return p; }
    alias this = ptr;
}

int test(NonNull!int np) {
    int i = *np;
    int* p1 = np;
    int* p2 = np.ptr;
    np = p1; // Error: cannot implicitly convert expression `p1` of type `int*` to `NonNull!int`
    return i;
}
```

Note that NonNull can be used as a pointer, can be implicitly converted to a pointer, but a pointer cannot be implicitly converted to a NonNull. There's more window dressing one would want, like a constructor with a null check, but the basic idea looks workable and is not complicated.
Apr 17
On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:[...]First to support classes, you’d have to make a slight change

```d
struct NonNull(T) {
    T p;
    T ptr() { return p; }
    alias this = ptr;
}
```

It works, but the syntax/defaults is backwards. Why does the unusual case of a nullable pointer get the nice syntax while the common case gets the `NonNull!(int*)` syntax? Who is going to write that all over their code?
Apr 17
On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via Digitalmars-d wrote:[...]I am in favor of making changes along the lines that Rikki is proposing so that we can better handle failure in multi-threaded environments without having to kill the entire program (and in addition, there are cases where dereferencing null pointers is not actually memory-safe, because either that platform doesn't protect it or a compiler like ldc or gdc will optimize the code based on the fact that it treats null pointer dereferencing as undefined behavior).

That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced. And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than it being a random bug that occurs during execution. So, if someone feels the need to use non-null pointers of some kind all over the place, I'd be concerned about how they're writing their code such that it's even a problem - though it could easily be the case that they're just paranoid about it, which happens sometimes with a variety of issues and people. And as such, I really don't think that it's all that big a deal if a wrapper type is required to guarantee that a pointer isn't null. Most code shouldn't need anything of the sort.

However, as I understand it, there _are_ memory-safety issues with null pointers with ldc (and probably gdc as well) due to how they optimize. 
IIRC, the core problem is that they treat dereferencing a null pointer as undefined behavior, and that can have serious consequences with regards to what the code does when there actually is a null pointer. dmd doesn't have the same problem from what I understand simply because it's not as aggressive with its optimizations. So, Walter's stance makes sense based on what dmd is doing, but it doesn't necessarily make sense with D compilers in general. So, between that and the issues with platforms such as webasm, I am inclined to think that we should treat dereferencing pointers like we treat accessing elements of arrays and insert checks that the compiler can then optimize out. And we can provide flags to change that behavior just like we do with array bounds checking. But if we cannot guarantee that attempting to dereference a null pointer is always safe (and as I understand it, outside of using dmd, we can't), then that's a hole in safe. And no matter how rare dereferencing null is or isn't in practice, we can't have holes in safe and have it actually give guarantees about memory safety. - Jonathan M Davis
Apr 17
I'd like to know what those gdc and ldc transformations are, and whether they are controllable with a switch to their optimizers. I know there's a problem with WASM not faulting on a null dereference, but in another post I suggested a way to deal with it.
Apr 17
On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via Digitalmars-d wrote:I'd like to know what those gdc and ldc transformations are, and whether they are controllable with a switch to their optimizers. I know there's a problem with WASM not faulting on a null dereference, but in another post I suggested a way to deal with it.Unfortunately, my understanding isn't good enough to explain those details. I discussed it with Johan in the past, but I've never worked on ldc or with llvm (or on gdc/gcc), so I really don't know what is or isn't possible. However, from what I recall of what Johan said, we were kind of stuck, and llvm considered dereferencing null to be undefined behavior. It may be the case that there's some sort of way to control that (and llvm may have more capabilities in that regard since I last discussed it with Johan), but someone who actually knows llvm is going to have to answer those questions. And I don't know how gdc's situation differs either. - Jonathan M Davis
Apr 19
On Saturday, 19 April 2025 at 22:49:19 UTC, Jonathan M Davis wrote:On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via Digitalmars-d wrote:There is a way now to tell LLVM that dereferencing null is _defined_ (nota bene) behavior.[...]

So far I have not responded in this thread because I feel it is an old discussion, with old misunderstandings. There is confusion between dereferencing in the language, versus dereferencing by the CPU. What I think that C and C++ do very well is separate language behavior from implementation/CPU behavior, and only prescribe language behavior, no (or very little) implementation behavior. I feel D should do the same. Non-virtual method example, where (in my opinion) the dereference happens at call site, not inside the function:

```
class A {
    int a;

    final void foo() { // non-virtual
        a = 1; // no dereference here
    }
}

A a;
a.foo(); // <-- DEREFERENCE
```

During program execution, _with the current D implementation of classes and non-virtual methods_, the CPU will only "dereference" the `this` pointer to do the assignment to `a`. 
But that is only the case for our _current implementation_. For the D language behavior, it does not matter what the implementation does: same behavior should happen on any architecture/platform/execution model. If you want to fault on null-dereference, I believe you _have_ to add a null-check at every dereference at _language_ level (regardless of implementation details). Perhaps it does not impact performance very much (with optimizer enabled); I vaguely remember a paper from Microsoft where they tried this and did not see a big perf impact (if any).

Some notes to trigger you to think about distinguishing language behavior from CPU/implementation details:

- You don't _have_ to implement classes and virtual functions using a vptr/vtable, there are other options!
- There does not need to be a "stack" (implementation detail vocabulary). Some "CPUs" don't have a "stack", and instead do "local storage" (language vocabulary) in an alternative way. In fact, even on CPUs _with_ stack, it can help to not use it! (read about Address Sanitizer detection of stack-use-after-scope and ASan's "fake stack")
- Pointers don't have to be memory addresses (you probably already know that they are not physical addresses on common CPUs), but could probably be implemented as hashes/keys into a database as well. C does not define ordered comparison (e.g. > and <) for pointers (it's implementation defined, IIRC), except when they point into the same object (e.g. an array or struct). Why? Because what does it mean on segmented memory architectures (i.e. x86)?
- Distinguishing language from implementation behavior means that correct programs work the same on all kinds of different implementations (e.g. you can run your C++ program in a REPL, or run it in your browser through WASM).

cheers, Johan
Apr 21
On 22/04/2025 5:29 AM, Johan wrote:If you want to fault on null-dereference, I believe you /have/ to add a null-check at every dereference at /language/ level (regardless of implementation details). Perhaps it does not impact performance very much (with optimizer enabled); I vaguely remember a paper from Microsoft where they tried this and did not see a big perf impact (if any).I agree with what you're saying here, but I want to refine it a little bit. Every language dereference must have an _associated_ read barrier. What this means is:

```d
T* ptr;

readbarrier(ptr);
ptr.field1;
ptr.field2;

ptr = ...;

readbarrier(ptr);
ptr.field3;
```

A very simple bit of object tracking when inserting the check will eliminate a ton of these; tbf we should be doing that for array bounds checking if we are not already. Also the fast DFA which this would be used with would eliminate a ton of them, so performance should be a complete non-issue, given how ok we are with array bounds checks.
Apr 21
On Thursday, 17 April 2025 at 22:12:22 UTC, Jonathan M Davis wrote:On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via Digitalmars-d wrote:When you work on a team with less skilled/meticulous teammates on high performance, highly parallel software, you realize how amazing it would be to have a language where pointers are guaranteed to be non-null by default. I spent so much time fixing random annoying NPEs that other people wrote that pop up randomly in our test cluster. This feature is not for people like you. It is to protect your sanity from the terrible code that people who are not as principled will inevitably write. This is an absolute no-brainer IMO, and it's one thing that is really great about Rust.On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote: First to support classes, you’d have to make a slight change ```d struct NonNull(T) { T p; T ptr() { return p; } alias this = ptr; } ``` It works, but the syntax/defaults is backwards. Why does the unusual case of a nullable pointer get the nice syntax while the common case gets the `NonNull!(int*)` syntax? Who is going to write that all over their code?That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced. And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than it being a random bug that occurs during execution.
Apr 17
On Thursday, 17 April 2025 at 22:12:22 UTC, Jonathan M Davis wrote:On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via Digitalmars-d wrote:I can, last week. The process crashed, I ran `coredumpctl gdb`, immediately fixed the issue and carried on with my day. By which I mean I agree with you that I don't think it's a big deal either.On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced.And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than it being a random bug that occurs during execution.That's exactly what I did last week.
Apr 21
On 4/17/2025 10:36 AM, Dave P. wrote:First to support classes, you’d have to make a slight changeSure, you'd want to overload the NonNull template with one that takes a class type parameter.It works, but the syntax/defaults is backwards. Why does the unusual case of a nullable pointer get the nice syntax while the common case gets the `NonNull!(int*)` syntax? Who is going to write that all over their code?Backwards compatibility. The NonNull is the addition, the nullable is the existing. Changing the existing behavior would be a massive disruption. Anyhow, it's a good idea to see how far one can take the metaprogramming approach to what you want, before changing the core language.
Apr 17
On 18/04/2025 2:35 PM, Walter Bright wrote:It works, but the syntax/defaults is backwards. Why does the unusual case of a nullable pointer get the nice syntax while the common case gets the `NonNull!(int*)` syntax? Who is going to write that all over their code? Backwards compatibility. The NonNull is the addition, the nullable is the existing. Changing the existing behavior would be a massive disruption.D is designed around the type state initialized, aka nullable. This was a _very_ smart thing to do, before type state analysis was ever mainstream. Walter, this is by far one of the best design decisions you have ever made.

Trying to change the default type state for pointers to non-null would be absolutely horrific. It's possible to prove a variable is non-null, but if we start painting pointers themselves in the type system? Ughhhhhh, the pain application VM languages are having over this isn't worth it, in the context of D. We can do a lot better than that. If this doesn't work without annotation (in a single compilation run), we've failed.

```d
void main() {
    func(new int); // ok
    func(null);    // error
}

void func(int* ptr) {
    int v = *ptr;
}
```
Apr 17
On 4/17/2025 7:48 PM, Richard (Rikki) Andrew Cattermole wrote:

```d
void main() {
    func(new int); // ok
    func(null);    // error
}

void func(int* ptr) {
    int v = *ptr;
}
```

It always looks simple in such examples, but then there are things like:

```d
struct S {
    int a, b;
    int* p;
}

void main() {
    S s;
    funky(&s);
    func(s.p);
}
```

where trying to track information in structs gets complicated fast.
Apr 18
On 19/04/2025 6:11 PM, Walter Bright wrote:On 4/17/2025 7:48 PM, Richard (Rikki) Andrew Cattermole wrote:Yeah, indirection. It is not a solved problem, even if you throw the type system, with say a type qualifier, into the mix. https://kotlinlang.org/docs/java-to-kotlin-nullability-guide.html

"Tracking multiple levels of annotations for pointers pointing to pointers would make the checker more complicated, because this way a vector of nullability qualifiers would be needed to be tracked for each symbol. This is not a big caveat, since once the top level pointer is dereferenced, the symbol for the inner pointer will have the nullability information. The lack of multi level annotation tracking is only observable when multiple levels of pointers are passed to a function which has a parameter with multiple levels of annotations. So for now the checker supports the top level nullability qualifiers only:

```c
int * __nonnull * __nullable p;
int ** q = p;
takesStarNullableStarNullable(q);
```
"

https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html

It's actually a pretty good example of why I think supporting modelling of fields is not worth our time. Plus throw in multi-threading and boom shuckala, not modellable anymore.
Apr 18
On Thursday, 17 April 2025 at 17:36:49 UTC, Dave P. wrote:```d struct NonNull(T) { T p; T ptr() { return p; } alias this = ptr; } ``` It works, but the syntax/defaults is backwards. Why does the unusual case of a nullable pointer get the nice syntax while the common case gets the `NonNull!(int*)` syntax?+1Who is going to write that all over their code?Nobody.
Apr 19
On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:Here's something I spent 5 minutes on: ```d struct NonNull(T) { T* p; T* ptr() { return p; } alias this = ptr; } int test(NonNull!int np) { int i = *np; int* p1 = np; int* p2 = np.ptr; np = p1; // Error: cannot implicitly convert expression `p1` of type `int*` to `NonNull!int` return i; } ``` Note that NonNull can be used as a pointer, can be implicitly converted to a pointer, but a pointer cannot be implicitly converted to a NonNull. There's more window dressing one would want, like a constructor with a null check, but the basic idea looks workable and is not complicated.Isn’t `ref` essentially a non-null pointer?
Apr 18
On 4/18/2025 12:21 AM, Ogion wrote:Isn’t `ref` essentially a non-null pointer?It's supposed to be. But you can write:

```d
int* p = null;
ref r = *p;
```

and you get a null ref.
Apr 18
On Saturday, 19 April 2025 at 06:05:53 UTC, Walter Bright wrote:On 4/18/2025 12:21 AM, Ogion wrote:How is this possible? Shouldn't dereferencing p crash the program before r is initialized?Isn’t `ref` essentially a non-null pointer?It's supposed to be. But you can write: ```d int* p = null; ref r = *p; ``` and you get a null ref.
Apr 19
On Saturday, April 19, 2025 3:17:29 AM MDT Meta via Digitalmars-d wrote:On Saturday, 19 April 2025 at 06:05:53 UTC, Walter Bright wrote:It doesn't, because nothing is actually dereferenced. This is like if you have a null pointer to a struct and then call a member function on it, e.g.

```d
void main() {
    S* s;
    s.foo();
}

struct S {
    int i;
    void foo() { }
}
```

This does not crash, because s is never actually used. It's just passed to foo. Of course, if you then changed foo to something like

```d
void foo() {
    import std.stdio;
    writeln(i);
}
```

it would crash, because then it would need to dereference s to access its i member, but until it needs to access a member, there's no reason for any dereferencing to take place. The same happens with C++ classes as long as the function isn't virtual. Where you _do_ get it failing in basically any language would be with a class' virtual function, because the class reference needs to be dereferenced in order to get the correct function.

This is one of those bugs that can be _very_ confusing if you don't think it through, since you do naturally tend to assume that when you call a member function, the pointer is dereferenced, but if the function isn't virtual, there's no reason to dereference it to make the call. The function is just passed the pointer or reference as an invisible argument. So, you can end up with a segfault inside of your function instead of at the call site and get rather confused by it. It's happened to me a couple of times in my career, and it's initially been pretty confusing each time, even though after it was explained to me the first time, I understood it, because you just tend to think of calling a member function as dereferencing the object even though it doesn't actually have any reason to do so unless the function is virtual. 
And with int* p = null; ref r = *p; no dereferencing occurs, because the compiler is converting int* to ref int, and underneath the hood, ref int is just int*, so it's simply copying the value of the pointer. - Jonathan M Davis
Apr 19
On Saturday, 19 April 2025 at 10:35:01 UTC, Jonathan M Davis wrote:[...] because then it would need to dereference s to access its i member, but until it needs to access a member, there's no reason for any dereferencing to take place. The same happens with C++ classes as long as the function isn't virtual.That is undefined behavior. In the C++ standard null references have been carefully ruled out [1]. There is no standard conforming C++ program having null references.And with int* p = null; ref r = *p; no dereferencing occurs,In C++ this is a programming error. When creating a reference from a pointer the null check is necessary in order to uphold C++'s guarantee that references are actually bound to existing objects. [1] google.com?q="c++ reference from null pointer" - https://old.reddit.com/r/cpp/comments/80zm83/no_references_are_never_null/ - https://stackoverflow.com/questions/4364536/is-a-null-reference-possible
Apr 19
On Saturday, April 19, 2025 5:26:29 AM MDT kdevel via Digitalmars-d wrote:On Saturday, 19 April 2025 at 10:35:01 UTC, Jonathan M Davis wrote:My point about non-virtual functions and dereferencing wasn't really about references so much as about the fact that the compiler doesn't necessarily dereference when you think that you're telling it to dereference. It only does so when it actually needs to. And whatever is supposed to be defined behavior or not, I have seen pointers not be dereferenced when calling non-virtual functions - and when creating references from what they point to.That is undefined behavior. In the C++ standard null references have been carefully ruled out [1]. There is no standard conforming C++ program having null references.Well, if C++ now checks that pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before. Either way, unless the compiler inserts checks of some kind in order to try to ensure that a reference is never null, there's no reason to dereference a pointer or reference until the data it points to is actually used. And historically, no null checks were done for correctness. - Jonathan M Davis
Apr 19
On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis wrote:[...]Of course it doesn't and I didn't write that. I wrote that it is a programming error to use a ptr to initialize a reference when it is possible that the ptr is null. If refs in D were as strong as in C++ I would write [... int *p is potentially null ...] enforce (p); auto ref r = *p;Well, if C++ now checks that pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.And with int* p = null; ref r = *p; no dereferencing occurs,In C++ this is a programming error. When creating a reference from a pointer the null check it is necessary in order to uphold C++' guarantee that references are actually bound to existing objects. [...]
Apr 19
On Saturday, April 19, 2025 6:13:36 AM MDT kdevel via Digitalmars-d wrote:On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis wrote:If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null. It's the same programmer error as any time that you dereference a null pointer except that it doesn't actually dereference the pointer when you create the reference and instead blows up later when you attempt to use what it refers to, because that's when the actual dereferencing takes place. If C++ doesn't have additional checks, then it's not any stronger about guarantees with & than D is with ref. Meta was asking how it was possible that int* p = null; ref r = *p; would result in a null reference instead of blowing up, and I explained why it didn't blow up and pointed out that C++ has the exact same situation. And unless C++ has added additional checks (and it sounds like they haven't), then there's no real difference here between C++ and D. - Jonathan M Davis[...]Of course it doesn't and I didn't write that. I wrote that it is a programming error to use a ptr to initialize a reference when it is possible that the ptr is null. If refs in D were as strong as in C++ I would write [... int *p is potentially null ...] enforce (p); auto ref r = *p;Well, if C++ now checks that pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.And with int* p = null; ref r = *p; no dereferencing occurs,In C++ this is a programming error. When creating a reference from a pointer the null check it is necessary in order to uphold C++' guarantee that references are actually bound to existing objects. [...]
Apr 19
On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis wrote:

```c++
int main () { int *p = NULL; int &i = *p; }
```

That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,

```d
void main () {
    int *p = null;
    ref int i = *p; // DMD v2.111.0
}
```

is a valid program [3].[... int *p is potentially null ...] enforce (p); auto ref r = *p;If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null.It's the same programmer error as any time that you dereference a null pointer except that it doesn't actually dereference the pointer when you create the reference and instead blows up later when you attempt to use what it refers to, because that's when the actual dereferencing takes place.Assume the "dereference" of the pointer and the initialization of the reference happen in different translation units written by different programmers. I.e.

tu1.cc
```c++
void foo (int &i) { }
```

tu2.cc
```c++
int main () { int *p = NULL; foo (*p); }
```

versus

tu1.d
```d
void foo (ref int i) { }
```

tu2.d
```d
int main () { int *p = null; foo (*p); }
```

Then we have different responsibilities. In the C++ case the programmer of tu2.cc made a mistake while in the D case the code of tu2.d is legit. I would not call this situation "the same programmer error".If C++ doesn't have additional checks, then it's not any stronger about guarantees with & than D is with ref.As programmer of translation unit 1 my job is much easier if I use C++.

[3] https://dlang.org/spec/type.html#pointers "When a pointer to T is dereferenced, it must either contain a null value, or point to a valid object of type T."
Apr 19
On Saturday, April 19, 2025 8:23:09 AM MDT kdevel via Digitalmars-d wrote:On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis wrote:In both cases it's a valid program where the programmer screwed up, and they're going to get a segfault later on if the reference is ever accessed. If it weren't a valid program, it wouldn't compile. If you had a situation where a cast were being used to circumvent compiler checks, it could be argued that it wasn't valid, because the programmer was circumventing the compiler, but nothing is being circumvented here. Neither language has checks - either at compile time or at runtime - to catch this issue, so I don't see how it could be argued that the compiler is providing guarantees about this or that the program is invalid. In both cases, it's an error on the programmer's part, and in neither case is the language providing anything to prevent it or catch it. As far as I can see, the situation in both cases is identical. Maybe there's some difference in how the C++ spec talks about it, but there is no practical difference. - Jonathan M Davis
Apr 19
On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis wrote:Only the D version is valid. The C++ program violates the std. From the SO page there is a quote of the C++ 11 std draft which says in sec. "8.3.2 References": "A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. [...] — end note ]" You find nearly the same wording in sec. 11.3.2 of the C++17 std draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928) with the "dereferencing" replaced with "indirection".int main () { int *p = NULL; int &i = *p; } That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however, void main () { int *p = null; ref int i = *p; // DMD v2.111.0 } is a valid program [3].In both cases it's a valid program[...] If it weren't a valid program, it wouldn't compile.That is an interesting opinion.
Apr 19
On Saturday, April 19, 2025 5:51:58 PM MDT kdevel via Digitalmars-d wrote:On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis wrote:I see no practical difference in general, but I guess that the difference would be that if C++ says that the behavior is undefined, then it can let the optimizer do whatever it wants with it, whereas D has to say at least roughly what would happen, or the behavior would be undefined and thus screw up safe. Whatever assumptions the optimizer may make about it, they can't be anything that would violate memory safety. In practice though, particularly with unoptimized code, C++ and D are going to do the same thing here, and in both cases, the programmer screwed up, so their program is going to crash. And realistically, it'll likely do the same thing in most cases even with optimized code. - Jonathan M DavisOnly the D version is valid. The C++ program violates the std. From the SO page there is a quote of the C++ 11 std draft which says in sec. "8.3.2 References": "A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. [...] — end note ]" You find nearly the same wording in sec. 11.3.2 of the C++17 std draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928) with the "dereferencing" replaced with "indirection".int main () { int *p = NULL; int &i = *p; } That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however, void main () { int *p = null; ref int i = *p; // DMD v2.111.0 } is a valid program [3].In both cases it's a valid program
Apr 19
On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:I see no practical difference in general [...]I consider nonconformance generally unacceptable.
Apr 20
On Sunday, April 20, 2025 8:13:44 AM MDT kdevel via Digitalmars-d wrote:On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:Writing a program which doesn't behave properly is always a problem and should be considered unacceptable. And both having a program relying on undefined behavior and having a program which dereferences null are problems. The latter will crash the program. The only real difference there between C++ and D is that if the language states that it's undefined behavior for the reference to be null, then the optimizer can do screwy things in the case when it actually does happen instead of the program being guaranteed to crash when the null reference is dereferenced. So, that's why I say that I see no practical difference. If you create a reference from a null pointer, you have a bug whether the program is written in C++ or D. And outside of optimized builds (and likely in almost all cases even with optimized builds), what happens when you screw that up will be the same in both languages. In any case, we clearly both agree that if the programmer does this, they've screwed up, and I think that we're basically arguing over language here rather than an actual technical problem. Ultimately, the only real difference is what the language's optimizer is allowed to do when the programmer does screw it up, because the C++ spec says that it's undefined behavior, and the D spec can't say that and have references work in safe code, since safe code disallows undefined behavior in order to ensure memory safety. The program has a bug either way. - Jonathan M DavisI see no practical difference in general [...]I consider nonconformance generally unacceptable.
Apr 20
On Sunday, 20 April 2025 at 22:19:39 UTC, Jonathan M Davis wrote:The problematic word is "behave". Only recently there was a thread on reddit where the user Zde-G pinpointed the problem while discussing a "new name" for undefined behavior (UB) [5]: '90% of confusion about UB comes from the simple fact that something is called behavior. Defined, undefined, it doesn't matter: layman observes world behavior, layman starts thinking about what kind of behavior can there be. The mental model every programmer which observes that term for the first time is “some secret behavior which is too complex to write in the description of the language… but surely I can glean it from the compiler with some experiments”. This is entirely wrong mental model even for C and doubly so for Rust or Zig. And it takes insane amount of effort to teach **every single newcomer** that it's wrong model. I have seen **zero** exceptions. New name should talk about code, not about behavior. “Invalid code” or “forbidden code” or maybe “erroneous construct”, but something, anything which is not related to what happens in runtime. There are no runtime after UB, it's as simple as that. The only option if your code have UB is to go and fix the code… and yet the name doesn't include anything related to code at all and concentrates on entirely wrong thing.'I consider nonconforming generally inacceptable.Writing a program which doesn't behave properly is always a problem and should be consider unacceptable.[...] If you create a reference from a null pointer, you have a bug whether the program is written in C++ or D.That is not true. A D program like this: void main () { int *p = null; ref int i = *p; // DMD v2.111.0 } is a valid program and there is no UB, no crash and no bug. I already pointed this out earlier with reference to the D spec. [3]. [3] https://dlang.org/spec/type.html#pointers "When a pointer to T is dereferenced, it must either contain a null value, or point to a valid object of type T." 
[5] Zde-G comments on the blog post "UB Might Be a Wrong Term for Newer Languages": https://old.reddit.com/r/rust/comments/129mz8z/blog_post_ub_might_be_a_wrong_term_for_newer/jep231f/
Apr 21
On Friday, 18 April 2025 at 07:21:43 UTC, Ogion wrote:Isn’t `ref` essentially a non-null pointer?Not really; it is similar, but not the same. What pointers give, or rather what the syntax of initialising them gives, is easy local reasoning, due to having to explicitly use the address-of (&) operator. So any function in a different compilation unit is a lot easier to reason about if one does not have to worry about whether an argument may experience an under-the-covers change due to the parameter being defined as a reference. Sadly, that state already exists, so it is rather a case of water under the bridge.
Apr 19
On Monday, 14 April 2025 at 14:22:09 UTC, Richard (Rikki) Andrew Cattermole wrote:On 15/04/2025 1:51 AM, Atila Neves wrote:Possible mitigations: * Use `sigaction` to catch `SIGSEGV` and throw an exception in the handler. * Use a nullable/option type. * Address sanitizer. * Fuzzing the server (which one should do anyway). How is out of bounds access related to null pointers throwing exceptions?On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds.I've been looking once again at having an exception being thrown on null pointer dereferencing. However the following can be extended to other hardware level exceptions. [...]I would like to know why one would want this.How likely are you to keep using D, or willing to talk about using D positively afterwards?People write servers in C and C++ too.
Apr 16
On 16/04/2025 8:18 PM, Atila Neves wrote:On Monday, 14 April 2025 at 14:22:09 UTC, Richard (Rikki) Andrew Cattermole wrote:The only thing you can reliably do on segfault is to kill the process. From what I've read, they get awfully iffy, even with the workarounds. And that's just for Posix, Windows is an entirely different kettle of fish and is designed around exception handling instead, which dmd doesn't support!On 15/04/2025 1:51 AM, Atila Neves wrote:Possible mitigations: * Use `sigaction` to catch `SIGSEGV` and throw an exception in the handler.On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:Imagine you have a web server that is handling 50k requests per second. It makes you $1 million dollars a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds.I've been looking once again at having an exception being thrown on null pointer dereferencing. However the following can be extended to other hardware level exceptions. [...]I would like to know why one would want this.* Use a nullable/option type.While valid to box pointers, we would then need to disallow them in business logic functions. Very invasive, not my preference.* Address sanitizer.Slow at runtime, which kinda defeats the purpose.* Fuzzing the server (which one should do anyway).Absolutely, but there is too much state to kinda guarantee that it covers everything. And very few people will get it to that level (after all, people need significant amount of training to do it successfully).How is out of bounds access related to null pointers throwing exceptions?Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.Yes they do, just like they do in D. But they have something we do not have, a ton of static analysis. 
Check, select professional developers: https://survey.stackoverflow.co/2024/technology#most-popular-technologies I'm not going to say they each have a good solution to the problem, but they each have a solution that isn't just kill the process. The end of this foray into read barriers may be the conclusion that we cannot use them for this. What worries me is that I don't have the evidence to show that it won't work, and dismissing it without evidence does mean that we'll be forced to recommend full CFG DFA which is slow. If it can work, using the read barriers to fill in the gap of what a faster DFA can offer would be a much better user experience. At least as a default.How likely are you to keep using D, or willing to talk about using D positively afterwards?People write servers in C and C++ too
Apr 16
I should follow on from this to explain why I care so much. See Developer type: https://survey.stackoverflow.co/2024/developer-profile Back-end (which is what D could excel at) is second on the list, right next to full-stack which is first. This is the biggest area of growth possible for D, and we're missing key parts that are not only expected, but needed, to target the largest audience possible.
Apr 16
On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew Cattermole wrote:Exactly what are you referring to by "read barrier"? To me it has a specific technical meaning, related to memory access and if/when one access may pass another. It has nothing to do with exceptions, but rather the details of how a CPU architecture approaches superscalar memory accesses (reads and/or writes), and how they are (or may be) re-ordered. However I would not classify the bounds check performed as part of accessing a slice as a "read barrier", rather it is a manual range check. So by "read barrier" for nulls do you simply mean having the compiler generate a "compare to zero" instruction, followed by a "jump if zero" to some error path? If so, then while it may catch errors (and I have no objection to optionally generating such null checks); it is not IMO a means of error recovery - simply another means of forcing a crash.How is out of bounds access related to null pointers throwing exceptions?Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.
Apr 16
On 16/04/2025 11:38 PM, Derek Fawcus wrote:On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew Cattermole wrote:Yes. It would then call a compiler hook just like array bounds checks do.Exactly what are you referring to by "read barrier"? To me it has a specific technical meaning, related to memory access and if/when one access may pass another. It has nothing to do with exceptions, but rather the details of how a CPU architecture approaches superscalar memory accesses (reads and/or writes), and how they are (or may be) re-ordered. However I would not classify the bounds check performed as part of accessing a slice as a "read barrier", rather it is a manual range check. So by "read barrier" for nulls do you simply mean having the compiler generate a "compare to zero" instruction, followed by a "jump if zero" to some error path?How is out of bounds access related to null pointers throwing exceptions?Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.If so, then while it may catch errors (and I have no objection to optionally generating such null checks); it is not IMO a means of error recovery - simply another means of forcing a crash.This is why I think it's important that we make this configurable via a global function pointer. Like we do for asserts. It allows people to configure what it does rather than picking for them.
Apr 16
On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew Cattermole wrote:On 16/04/2025 8:18 PM, Atila Neves wrote:I'm not sure what you have in mind, what I have in mind is something like this: https://discourse.llvm.org/t/rfc-nullability-qualifiers/35672 https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html The checks here are performed in a distinct SA tool, not in the main compiler. However it catches the main erroneous cases - first two listed checks of second link:* Use a nullable/option type.While valid to box pointers, we would then need to disallow them in business logic functions.If a pointer p has a nullable annotation and no explicit null check or assert, we should warn in the following cases: - p gets implicitly converted into nonnull pointer, for example, we are passing it to a function that takes a nonnull parameter. - p gets dereferencedGiven how individual variable / fields have to be annotated, it probably does not need complete DFA, but only function local analysis for loads/stores/compares.
Apr 16
On 17/04/2025 1:41 AM, Derek Fawcus wrote:On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew Cattermole wrote:``clang --analyze -Xanalyzer -analyzer-output=text`` "While it’s somewhat exceptional for us to introduce new type qualifiers that don’t produce semantically distinct types, we feel that this is the only plausible design and implementation strategy for this feature: pushing nullability qualifiers into the type system semantically would cause significant changes to the language (e.g., overloading, partial specialization) and break ABI (due to name mangling) that would drastically reduce the number of potential users, and we feel that Clang’s support for maintaining type sugar throughout semantic analysis is generally good enough [6] to get the benefits of nullability annotations in our tools." It's available straight from clang, it annotates variables and is part of the type system, but also isn't affecting symbol lookup or introspection. It's part of the frontend, not the backend. Exactly what I want also. I'm swearing right now, I knew we were 20 years behind, I didn't realize that they are one stone's throw away from the end game. We can't get ahead of them at this point. The attributes are different to what I want in D however. For D I want us to solve all of type state analysis not just nullability.On 16/04/2025 8:18 PM, Atila Neves wrote:I'm not sure what you have in mind, what I have in mind is something like this: https://discourse.llvm.org/t/rfc-nullability-qualifiers/35672 https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html The checks here are performed in a distinct SA tool, not in the main compiler. However it catches the main erroneous cases - first two listed checks of second link:* Use a nullable/option type.While valid to box pointers, we would then need to disallow them in business logic functions.Fields no, it'll be hell if we were to start annotating them, Walter balked at that idea ages ago and he was right to.
I want stuff like this to work, without needing annotation:

```d
bool isNull(int* ptr) => ptr is null;

int* ptr;

if (!isNull(ptr))
    int v = *ptr; // ok
else
    int v = *ptr; // error
```

No annotations are needed when things are not virtual, or when the default case isn't complex (backwards gotos, for instance, need a full CFG as part of DFA).

```d
void main() {
    func(new int); // ok
    func(null); // error
}

void func(/*?nonnull*/ int* ptr) {
    int v = *ptr;
}
```
If a pointer p has a nullable annotation and no explicit null check or assert, we should warn in the following cases: - p gets implicitly converted into nonnull pointer, for example, we are passing it to a function that takes a nonnull parameter. - p gets dereferencedGiven how individual variable / fields have to be annotated, it probably does not need complete DFA, but only function local analysis for loads/stores/compares.
Apr 16
The correct solution is to restart the process. The null pointer dereference could be a symptom of a wild pointer writing all over the process space.
Apr 16
On 17/04/2025 6:38 AM, Walter Bright wrote:The correct solution is to restart the process. The null pointer dereference could be a symptom of a wild pointer writing all over the process space.Yes, we are in agreement on this situation. .net has a very strong guarantee that a pointer can only point to null or a valid instance of that type. The restrictions that safe places on a function do remove this as a possibility in D also. And this is where we are diverging: there is a subset which does have this guarantee, where a null dereference does indicate a logic error, and not program corruption. This is heavily present in web development, but rather rare in comparison to other types of projects.
Apr 16
The focus of safe code in D is preventing memory corruption, not null pointer dereference seg faults or other programming bugs. A null pointer deference may be a symptom of memory corruption or some logic bug, but it does not cause memory corruption. D has many aspects that reduce the likelihood of bugs (such as no variable shadowing), but that is not what safe is about. (Yes I know that if there's a constant offset to a null pointer larger than the guard page, it can cause memory corruption.)
Apr 16
On Wednesday, 16 April 2025 at 19:58:45 UTC, Walter Bright wrote:The focus of safe code in D is preventing memory corruption, not null pointer dereference seg faults or other programming bugs. A null pointer deference may be a symptom of memory corruption or some logic bug, but it does not cause memory corruption. D has many aspects that reduce the likelihood of bugs (such as no variable shadowing), but that is not what safe is about. (Yes I know that if there's a constant offset to a null pointer larger than the guard page, it can cause memory corruption.)The 0 address on WASM is writable, which has burned me many times.
Apr 16
On 4/16/2025 1:16 PM, Dave P. wrote:The 0 address on WASM is writable, which has burned me many times.How that oversight was made seems incredible. Anyhow, what you can do is allocate some memory at location 0, and fill it with 0xDEAD_BEEF. Then, periodically, check to see if those values changed.
Apr 16
On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:The .net exceptions are split into managed and unmanaged exceptions.Sounds equivalent to D's Exception and Error hierarchies.
Apr 16
On 17/04/2025 4:55 AM, Walter Bright wrote:On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:It's not. The unmanaged exceptions are from native code, then they get wrapped by .net so they can be caught. Including for null dereference. As far as I'm aware cleanup routines are not messed with. We've lately been discussing what to do with Error on Discord, and so far it seems like the discussion is going in the direction of it should either do what assert does and kill the process with a function pointer in the middle to allow configurability, or throw what I've dubbed a framework exception. A framework exception sits in the middle of the existing hierarchy, does cleanup, but doesn't affect nothrow. Manu wanted something like this recently for identical reasons that me and Adam do.The .net exceptions are split into managed and unmanaged exceptions.Sounds equivalent to D's Exception and Error hierarchies.
Apr 16
On 4/16/2025 10:08 AM, Richard (Rikki) Andrew Cattermole wrote:We've lately been discussing what to do with Error on Discord, and so far it seems like the discussion is going in the direction of it should either do what assert does and kill the process with a function pointer in the middle to allow configurability, or throw what I've dubbed a framework exception.You can configure what assert() does via a command line switch.Manu wanted something like this recently for identical reasons that me and Adam do. Adam and I discussed this extensively at the last Coffee Haus meeting. I haven't talked with Manu about it. Anyone can configure assert() to do whatever they want, after all, D is a systems programming language. But if it does something other than exit the process, you're on your own.
Apr 16
I confess I don't understand the fear behind a null pointer. A null pointer is a NaN (Not a Number) value for a pointer. It's similar (but not exactly the same behavior) as 0xFF is a NaN value for a character and NaN is a NaN value for a floating point value. It means the pointer is not pointing to a valid object. Therefore, it should not be dereferenced. To dereference a null pointer is: A BUG IN THE PROGRAM When a bug in the program is detected, the only correct course of action is: GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200 It's the same thing as `assert(condition)`. When the condition evaluates to `false`, there's a bug in the program. A bug in the program means the program has entered an unanticipated state. The notion that one can recover from this and continue running the program is only for toy programs. There is NO WAY to determine if continuing to run the program is safe or not. I did a lot of programming on MS-DOS. There is no memory protection there. Writing through a null pointer would scramble the operating system tables, which meant the operating system would do something terrible. There were many times when it literally scrambled my hard disk. (I made lots of backups.) If you haven't had this pleasure, it may be hard to realize what a godsend protected memory is. A null pointer no longer requires reinstalling the operating system. Your program simply quits with a stack trace. With the advent of protected mode, I immediately ceased all program development in real mode DOS. Instead, I'd fully debug it in protected mode, and then as the very last step I'd test it in real mode. Protected mode is the greatest invention ever for computer programs. When the hardware detects a null pointer dereference, it produces a seg fault, the program stops running and you get a stack trace which gives you the best chance ever of finding the cause of the seg fault. A lovely characteristic of seg faults is they come FOR FREE! There is zero cost to them. 
They don't slow your program down at all. They do not add bloat. It's all under the hood. The idea that a null pointer is a billion dollar mistake is just ludicrous to me. The real mistake is having unchecked arrays, which don't get hardware protection and lead to injection problems. Being unhappy about a null pointer seg fault is like complaining that the seatbelt left a bruise on your body as it saved you from your body being broken (this has happened to me, I always always wear that seatbelt!). Of course, it is better to detect a seg fault at compile time. Data Flow Analysis can help:

```d
int x = 1;

void main()
{
    int* p;
    if (x)
        *p = 3;
}
```

Compiling with `-O`, which enables Data Flow Analysis:

```
dmd -O test.d
Error: null dereference in function _Dmain
```

Unfortunately, DFA has its limitations that nobody has managed to solve (the halting problem), hence the need for runtime checks, which the hardware does nicely for you. Fortunately, D is powerful enough so you can make a non-nullable type. In summary, the notion that one can recover from an unanticipated null pointer dereference and continue running the program is a seriously bad idea. There are far better ways to make failsafe systems. Complaining about a seg fault is like complaining that a seatbelt left a bruise while saving you from being maimed.
Apr 16
On 17/04/2025 6:18 AM, Walter Bright wrote:I confess I don't understand the fear behind a null pointer. A null pointer is a NaN (Not a Number) value for a pointer. It's similar (but not exactly the same behavior) as 0xFF is a NaN value for a character and NaN is a NaN value for a floating point value.Agreed. But unlike floating point, pointer issues kill the process. They invalidate the task at hand.It means the pointer is not pointing to a valid object. Therefore, it should not be dereferenced.If you write purely safe code that isn't possible. Just like what .net guarantees.To dereference a null pointer is: A BUG IN THE PROGRAMAgreed, the task has not got the ability to continue and must stop. A task is not the same thing as a process.When a bug in the program is detected, the only correct course of action is: GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200 It's the same thing as `assert(condition)`. When the condition evaluates to `false`, there's a bug in the program.You are not going to like what the unittest runner is doing then. https://github.com/dlang/dmd/blob/d6602a6b0f658e8ec24005dc7f4bf51f037c2b18/druntime/src/core/runtime.d#L561A bug in the program means the program has entered an unanticipated state. The notion that one can recover from this and continue running the program is only for toy programs. There is NO WAY to determine if continuing to run the program is safe or not.Yes, that is certainly possible in a lot of cases. We are in total agreement that the default should always be to kill the process. The problem lies in a very specific scenario where safe is being used heavily, where logic errors are extremely common but memory errors are not. I want us to be 100% certain that a read barrier cannot function as a backup plan to DFA language features. 
If it can, it will give a better user experience than just DFA; we've seen what happens when you try to solve these kinds of problems exclusively with DFA, it shows up in DIP1000 not being able to be turned on by default. If the end result is that we have to recommend the slow DFA exclusively for production code then so be it. I want us to be certain that we have no other options.I did a lot of programming on MS-DOS. There is no memory protection there. Writing through a null pointer would scramble the operating system tables, which meant the operating system would do something terrible. There were many times when it literally scrambled my hard disk. (I made lots of backups.)As you know I'm into retro computers, so yeah I'm familiar with not having memory protection and the consequences thereof.If you haven't had this pleasure, it may be hard to realize what a godsend protected memory is. A null pointer no longer requires reinstalling the operating system. Your program simply quits with a stack trace. With the advent of protected mode, I immediately ceased all program development in real mode DOS. Instead, I'd fully debug it in protected mode, and then as the very last step I'd test it in real mode.I've read your story on this in the past and believed you the first time.Protected mode is the greatest invention ever for computer programs. When the hardware detects a null pointer dereference, it produces a seg fault, the program stops running and you get a stack trace which gives you the best chance ever of finding the cause of the seg fault.You don't always get a stack trace. Nor does it allow you to fully report to a reporting daemon what went wrong for diagnostics. What Windows does instead of a signal is to have it throw an exception that then gets caught right at the top. This then triggers the reporting daemon kicking in. It allows for catching, filtering and adding of more information to the report. Naturally we can't support it due to exceptions...
At the OS level things have progressed from simply segfaulting out, even in the native world. https://learn.microsoft.com/en-us/windows/win32/api/werapi/nf-werapi-werregisterruntimeexceptionmodule A lovely characteristic of seg faults is they come FOR FREE! There is zero cost to them. They don't slow your program down at all. They do not add bloat. It's all under the hood. The idea that a null pointer is a billion dollar mistake is just ludicrous to me. The real mistake is having unchecked arrays, which lead to injection problems.While I don't agree that it was a mistake (token values are just as bad), that is his name for it. I view it the same way as I view coroutine coloring. It's a feature to keep operating environments sane. But by doing so it causes pain and forces you to deal with the problem rather than let it go unnoticed. Have a read of the show notes: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/ "27:40 This led me to suggest that the null value is a member of every type, and a null check is required on every use of that reference variable, and it may be perhaps a billion dollar mistake." None of this is new! :)Being unhappy about a null pointer seg fault is like complaining that the seatbelt left a bruise on your body as it saved you from your body being broken (this has happened to me, I always always wear that seatbelt!).Never happened to me, and I still wear it. Doesn't mean I want to be in a car that is driven with hard stops that are in the driver's control to avoid.Of course, it is better to detect a seg fault at compile time. Data Flow Analysis can help:

```d
int x = 1;

void main()
{
    int* p;
    if (x)
        *p = 3;
}
```

Compiling with `-O`, which enables Data Flow Analysis:

```
dmd -O test.d
Error: null dereference in function _Dmain
```
Right, local information only.
Turns out even the C++ folks are messing around with frontend DFA for this :/ with cross-procedural information in the AST.Unfortunately, DFA has its limitations that nobody has managed to solve (the halting problem), hence the need for runtime checks, which the hardware does nicely for you. Fortunately, D is powerful enough so you can make a non-nullable type.I've considered the possibility of explicit boxing. With and without the compiler forcing it (by disallowing raw pointers and slices). Everything we can do with boxing using library types can be done better with the compiler. Including making sure that it actually happens. I've seen what happens if we force boxing rather than doing something in the language in my own stuff. The amount of errors I have with my mustuse error type is staggering. We gotta get a -betterC compatible solution to exceptions that isn't heap allocated or using unwinding tables etc. It would be absolutely poor engineering to try to convince anyone to box raw pointers, let alone make it the recommended or required solution as part of PhobosV3. There has to be a better way.In summary, the notion that one can recover from an unanticipated null pointer dereference and continue running the program is a seriously bad idea. There are far better ways to make failsafe systems. Complaining about a seg fault is like complaining that a seatbelt left a bruise while saving you from being maimed.Program != task. No one wants the task to continue after a null dereference occurs. We are not in disagreement. It must attempt to cleanup (if the segfault handler fires then straight to death the process goes) and die. We are not as far off as it might appear.
Apr 16
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:FYI, some time ago I designed this https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to provide compile-time null checks (that work if you `<Nullable>enable</Nullable>`). My intention was to include it in OpenD to work like this: When a compiler flag is enabled (eg -nullcheck), the code below:

```d
Object o = new Object();
```

would then be silently rewritten by the compiler as

```
NotNull!Object o = new Object();
```

thus enabling compile-time null checks. The solution is not perfect and needs some further compiler work (eg checking if some field is indirectly initialized by some func called in a constructor). Also, I lack dmd knowledge to insert this myself, so this didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.
Apr 21
On 21/04/2025 10:34 PM, GrimMaple wrote:On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:That is the Swift solution to the problem, more or less.FYI, some time ago I designed this https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to provide compile-time null checks (that work if you `<Nullable>enable</Nullable>`). My intention was to include it in OpenD to work like this: When a compiler flag is enabled (eg -nullcheck), the code below:

```d
Object o = new Object();
```

would then be silently rewritten by the compiler as

```
NotNull!Object o = new Object();
```

thus enabling compile-time null checks. The solution is not perfect and needs some further compiler work (eg checking if some field is indirectly initialized by some func called in a constructor). Also, I lack dmd knowledge to insert this myself, so this didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.
Apr 21
On Monday, 21 April 2025 at 10:45:15 UTC, Richard (Rikki) Andrew Cattermole wrote:That is the Swift solution to the problem, more or less.My take here is that null pointers (references/whatever) just _shouldn't be_, __period__. The compiler should disallow me to generate a null pointer, unless specifically asked for - like with `Nullable!MyType`. or, even better, `Optional!MyType` -- then we can finally ditch that "null pointer semantic" and use "correct" terminology, because most often a null pointer is just used as a quasi-optional type. Any memory allocations that end up in a null pointer, eg `new` running out of memory, should result in an Exception/Error, not returning null. I think that having `NotNull!T` provides a soft transition from having null to not having null at all -- it's fairly trivial to append `MaybeNull!` to any var type that needs it; and then it's fairly easy to just rename it to `Optional` c:
Apr 21