
digitalmars.D - [RFC] Throwing an exception with null pointers

reply Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
I've been looking once again at having an exception being thrown 
on null pointer dereferencing.
However the following can be extended to other hardware level 
exceptions.

I do not like the conclusion, but it is based upon facts that we 
do not control.



We cannot rely upon hardware or kernel support for throwing of a 
null pointer exception.
To do this we have to use read barriers, like we do for array 
bounds check.

There are three levels of support needed:

1. Language support: it does not alter code generation and is 
available everywhere.
     Can be tuned to the user's threshold of pain and need for 
guarantees.
2. Altering codegen, throws an exception via a read barrier just 
like a bounds check does.
3. Something has gone really wrong and CPU has said NOPE and a 
signal has fired to kill the process.

Read barriers can be optimized out thanks to language support via 
data flow analysis and they can be turned on/off like bounds 
checks are.





For Posix we can handle any errors that occur and throw an 
exception; it's tricky, but it is possible.

For Windows it'll result in an exception being thrown, which can 
be caught and handled... except we do not support the exception 
mechanism on Win64 for cleanup routines (dmd only), let alone 
catching.
I've asked about this recently: it is not a bug, nor is it 
guaranteed by the language to work.

Even if this were to work, signal handlers can be changed, and 
they can be a bit touchy at times.
We cannot rely on kernel level or cpu support to catch these 
errors.



To have a 100% solution for within D code there is really only 
one option: read barriers.
We already have them for bounds checks.
And they can throw a D exception without any problems; plus they 
bypass the CPU/kernel mechanisms, which can otherwise result in 
infinite loops.

This catches logic problems, but not program corruption where 
pointers point to something that they shouldn't.

There is one major problem with a read barrier on pointers: how 
do you disable it?
With slices you can access the pointer directly and do a 
dereference that bypasses the bounds check.
Sadly we'd be stuck with either a storage class or an attribute 
to turn it off.
I know Walter would hate the proposal of a storage class, so that 
is a no-go.



So how does .net handle it?
As another Microsoft owned project, .net is a very good thing to 
study, it has the exact same problems that we do here.

The .net exceptions are split into managed and unmanaged 
exceptions.
Unmanaged exceptions are what ours are comparable to (though we 
do not support their mechanism).
These are not meant to be caught by .net, including stuff like 
null dereference exceptions, they kill the process.

The managed exceptions include ones like null dereference for 
.net and are allowed to be caught.
Quite importantly in frameworks like asp.net it guarantees that 
non-framework code cannot crash the process even in the most 
extreme cases.
This is possible because null is a valid pointer value, and a 
pointer cannot point into unmapped memory.

The .net guarantee that you cannot corrupt a pointer also happens 
to match what a process-killing signal is good at handling: 
corrupted pointers.




.net's nullability is part of the type system, with assistance 
from data flow analysis, to prevent you from doing bad things at 
compile time.

It is a very involved process to upgrade all code to it, and from 
what I have seen, many people in the D community would be appalled 
at the notion that they have to explicitly state a pointer as 
being non-null or nullable.

Worse, marking a pointer as non-null or nullable tends to live in 
the type system, yet relies on data flow analysis to propagate to 
other variables.



In C++ nullability is handled via lint level analysis without any 
language help.

It is compiler-specific analysis that can require not only 
opting into it, but also turning on optimizations.

We can do better than this.
It is nowhere near a desirable solution.



A known good strategy for handling errors in a system is to have 
three solutions.
For handling pointer related issues we could have three solutions 
at play:

1. CPU/kernel kill process via signal/exception handling
2. Read barriers throw an exception; used when a null dereference 
occurs, not when pointer corruption occurs (i.e. a dereference of 
unmapped memory)
3. Language level support

All read barriers in D are optional, with differing levels of 
action when they error.
It should be the same here also.
I do not care about the default.
Although if the default is not on, it could cause problems in 
practice with, say, PhobosV3.



As already stated, forced typing is likely to annoy too many 
people, and any pointer typing that could solve that is already 
out, as it amounts to the classic managed vs unmanaged pointer 
typing.

There must be a way for the programmer to acknowledge pointer 
nullability status without it being required or being part of a 
type.
This makes us fairly unique.

Since we cannot require attribution by default, we cannot have a 
100% solution with just language-level support.
But we also do not want a common error to sit in the code and 
affect the runtime, if at all avoidable.
Or at least not in the cases that Adam Wilson and I care about 
regarding event loops.

Which means we are limited to local-only information, similar to 
C++, except... we can store the results in the type system.
The result is that more code is checked than in C++, but less 
than in a language with mandatory nullability typing.


It is possible to opt into more advanced analysis and error when 
it cannot model your code.

Apr 12
next sibling parent reply Ogion <ogion.art gmail.com> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
Why `Exception` and not `Error`?
Apr 13
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 13/04/2025 9:48 PM, Ogion wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
Why `Exception` and not `Error`?
Neither of them. The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.
Apr 13
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 14/04/2025 9:45 AM, Richard (Rikki) Andrew Cattermole wrote:
 On 13/04/2025 9:48 PM, Ogion wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
Why `Exception` and not `Error`?
Neither of them. The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.
err, thrown by read barriers and then caught.
Apr 13
prev sibling next sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
 However the following can be extended to other hardware level 
 exceptions.

 I do not like the conclusion, but it is based upon facts that 
 we do not control.



 We cannot rely upon hardware or kernel support for throwing of 
 a null pointer exception.
 To do this we have to use read barriers, like we do for array 
 bounds check.

 There are three levels of support needed:

 1. Language support, it does not alter code generation, 
 available everwhere.
     Can be tuned to the users threshold of pain and need for 
 guarantees.
 2. Altering codegen, throws an exception via a read barrier 
 just like a bounds check does.
 3. Something has gone really wrong and CPU has said NOPE and a 
 signal has fired to kill the process.
From the unix/posix perspective, I'd say don't even try. Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash. Debugging then depends upon examining the corefile, or using a debugger.

The only gain from catching it is to generate some pretty form of back trace before dying, and that is just as well (or better) handled by a debugger, or an external crash handling and backtrace generating corset/monitor/supervision process.

Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler. With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.
Apr 13
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 14/04/2025 1:07 AM, Derek Fawcus wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
 However the following can be extended to other hardware level exceptions.

 I do not like the conclusion, but it is based upon facts that we do 
 not control.



 We cannot rely upon hardware or kernel support for throwing of a null 
 pointer exception.
 To do this we have to use read barriers, like we do for array bounds 
 check.

 There are three levels of support needed:

 1. Language support, it does not alter code generation, available 
 everwhere.
     Can be tuned to the users threshold of pain and need for guarantees.
 2. Altering codegen, throws an exception via a read barrier just like 
 a bounds check does.
 3. Something has gone really wrong and CPU has said NOPE and a signal 
 has fired to kill the process.
From the unix/posix perspective, I'd say don't even try.  Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash.  Debugging then depends upon examining the corefile, or using a debugger. The only gain from catching it is to generate some pretty form of back trace before dying, and that is just as well (or better) handled by a debugger, or an external crash handling and backtrace generating corset/ monitor/supervision process. Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler. With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.
This is how it is implemented currently: don't touch it. That matches my analysis.
Apr 13
prev sibling next sibling parent user1234 <user1234 12.de> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.

 [...]
 


 To have a 100% solution for within D code there is really only 
 one option: read barriers.
 We already have them for bounds checks.
 And they can throw a D exception without any problems, plus 
 they bypass the cpu/kernel guarantees which can result in 
 infinite loops.

 This catches logic problems, but not program corruption where 
 pointers point to something that they shouldn't.

 There is one major problem with a read barrier on pointers, how 
 do you disable it?
 With slices you can access the pointer directly and do the 
 dereference that by-passes it.
 Sadly we'd be stuck with either a storage class or attribute to 
 turn it off.
I think that you actually don't need any language addition at all. The read barriers can be considered as implicit, "opt-in" contracts, and whether they are codegened can be controlled with a simple command line switch.

At first glance this system may appear costly. That is true, but there are actually many cases in which the compiler can determine that it does not have to add a barrier.
Apr 13
prev sibling next sibling parent Derek Fawcus <dfawcus+dlang employees.org> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:


 As already stated, forced typing is likely to annoy too many 
 people and any pointer typing that could solve that is already 
 out as it is doing the classic managed vs unmanaged pointer 
 typing.

 There must be a way for the programmer to acknowledge pointer 
 nullability status without it being required or be part of a 
 type.
 This makes us fairly unique.

 Since we cannot require attribution by default, we cannot have 
 a 100% solution with just language level support.
 But we also do not want to have a common error to be in the 
 code and effect runtime if at all possible.
 Or at least in cases where me and Adam Wilson care about 
 regarding eventloops.

 Which means we are limited to local only information, similar 
 to C++, except... we can store the results into the type system.
 The result is more code is checked than C++, but also less than 


 It is possible to opt into more advanced analysis and error 
 when it cannot model your code.

As to language level support for nullable vs non-nullable pointers, without having used it yet, I believe I'd like to have such. Picking a default is an issue.

I probably need to play (in C) with the clang __nullable and _nonnull markers to see how well they work. From reading the GCC docs I can't see benefit from its mechanisms, as they serve to guide optimisation rather than checks / assertions at compile time and/or checks at runtime.

I think I really want something like Cyclone offered, with forms of non-null pointers and nullable pointers. Or maybe something like Odin/Zig offer, with default non-null pointers and optional nullable pointers, the latter requiring source guards (plus dataflow analysis). As to how that translates to D, I'm not yet sure.

However references alone are not the answer, as I want an explicit annotation at function call sites to indicate that a pointer/reference may be passed. Hence I have a quibble with D safe mode not allowing passing pointers to locals; only mitigated by the 'scoped pointer' annotation when that preview flag is enabled.
Apr 13
prev sibling next sibling parent reply a11e99z <a a.com> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
A null pointer on x86 is any pointer less than 0x00010000.

Also, what about successfully dereferencing 0x23a7b63c41704h827? It depends: whether that memory was reserved by the process, committed, mmaped, etc. Failure for any "wrong" pointer should be the same as for 0x0000....000 (pure NULL), so a read barrier is the wrong option (maybe good for DEBUG only, for simple cases). Silent killing of the process by the OS is also the wrong option; not every programmer is a kernel debugger/developer or windbg guru. IMO we need to dig into the .net source and grab their option.
Apr 14
next sibling parent reply a11e99z <a a.com> writes:
On Monday, 14 April 2025 at 09:49:27 UTC, a11e99z wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
null pointer in x86 is any pointer less than 0x00010000 imo need to dig into .net source and grab their option
Almost the same problem is misaligned access: x86/x64 allows it (except for a few instructions), ARM prohibits it (as far as I know). It seems there are no other options except to handle kernel signals and Win SEH.
Apr 14
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 14/04/2025 10:01 PM, a11e99z wrote:
 On Monday, 14 April 2025 at 09:49:27 UTC, a11e99z wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
null pointer in x86 is any pointer less than 0x00010000 imo need to dig into .net source and grab their option
almost same problem is misaligned access: x86/x64 allows this except few instructions ARM - prohibit it (as I know)
The only way for this to occur in safe code is program corruption.
 it seems there are no other options except to handle kernel signals and 
 WinSEH
They are both geared towards killing the process; from what I've read it's really not a good idea to throw an exception using them. Even if we could rely upon the signal handler being what we think it is. We do not support the MSVC exception mechanism for 64-bit Windows in dmd, so even if we wanted to do this, we cannot.
Apr 14
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 14/04/2025 9:49 PM, a11e99z wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
null pointer in x86 is any pointer less than 0x00010000
The null page, right. And how exactly did you get access to a value that isn't 0 or a valid pointer? Program corruption. Which should only be possible in non-@safe code.
 also what about successful dereferencing 0x23a7b63c41704h827? its 
 depends: reserved this memory by process, committed, mmaped etc
How did you get this value? Program corruption.
 failure for any "wrong" pointer should be same as for 0x0000....000 
 (pure NULL)
Null (0) is a special value in D; these other values are simply assumed to be valid. Of course it's not possible to get these other values without calling out of @safe code, where program corruption can occur.
 imo need to dig into .net source and grab their option
.net is an application VM, and for all intents and purposes they will be injecting read barriers before each null dereference. They also have strong guarantees that you cannot have an invalid pointer: it must be null or point to a valid object. It isn't as simple as copying them. Plus they have type state analysis as part of the language to handle nullability!
Apr 14
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
 However the following can be extended to other hardware level 
 exceptions.

 [...]
I would like to know why one would want this.
Apr 14
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/04/2025 1:51 AM, Atila Neves wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
 However the following can be extended to other hardware level exceptions.

 [...]
I would like to know why one would want this.
Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day.

In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds. It kills the entire server, losing you potentially the full million dollars before you can fix it.

How likely are you to keep using D, or willing to talk about using D positively, afterwards?

ASP.net guarantees that this will kill the task and will give the right response code. No process death.
Apr 14
next sibling parent reply Arafel <er.krali gmail.com> writes:
On 14.04.25 16:22, Richard (Rikki) Andrew Cattermole wrote:
 On 15/04/2025 1:51 AM, Atila Neves wrote:
 I would like to know why one would want this.
Imagine you have a web server that is handling 50k requests per second. It makes you $1 million dollars a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds. It kills the entire server losing you potentially the full 1 million dollars before you can fix it. How likely are you to keep using D, or willing to talk about using D positively afterwards? ASP.net guarantees that this will kill the task and will give the right response code. No process death.
I won't get into the merits of the feature itself, but I have to say that this example is poorly chosen, to say the least. In fact, it looks to me like a case of "when you only have a hammer, everything looks like a nail": not everything should be handled by the application itself.

As somebody coming rather from the "ops" side of "devops", let me tell you that there is a wide range of tools that you should be using **on top of your application** if you have an app that makes you 1M$ a day, including but not restricted to:

* A monitoring process to make sure the server is running (and healthy). Among this process's tasks are making sure that in case of a failure the main process is fully stopped, killing any leftover tasks, removing lock files, ensuring data sanity, etc., and then restarting the main server again.
* A HA system routing queries to a pool of several servers that are regularly polled for health status, assuming that the failure happens seldom enough that it's very unlikely to affect several backend servers at the same time.
* Some meatbag on-call 24/7 (or even on-site) who can at the very least restart the affected server (including the hardware) if it comes to that.

I mean, a service can fail for a number of reasons, including hardware issues, among which dereferencing a null pointer should be quite low in the scale of probabilities. Having a 1M$/day operation depend on your application's continued run after dereferencing a null pointer would seem to me... rather risky and short-sighted.

On top of that, there's the "small" issue that you can't really be sure what state the application has been left in. I certainly wouldn't want to risk any silent data corruption and would rather kill the process ASAP to start it again from a known good state.

Again, I'm not arguing for or against the feature itself, but I just think this example doesn't do it any help.
Apr 14
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/04/2025 2:55 AM, Arafel wrote:
 I mean, a service can fail for a number of reasons, including hardware 
 issues, among which dereferencing a null pointer should be quite low in 
 the scale of probabilities.
It isn't a low probability. The reason why all these application VM languages have been introducing nullability guarantees is because it has been a plague of problems. We have nothing currently.

“I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.”

https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/
 Having a 1M$/day operation depend on your application's continued run 
 after dereferencing a null pointer would seem to me... rather risky and 
 sort-sighted.
You need to read my initial post. I concluded that once an instruction executes the dereference, the task is in fact dead. We are not in disagreement on this.

This is why the static analysis and read barrier are so important: they catch it before it ever happens. The program isn't corrupted by the signal at that point in time. Before that, thanks to @safe, we can assume it is in a valid state, and it's just business logic that is wrong. Yes, people will disagree with me on that, but the blame is purely on the non-@safe code, which should be getting thoroughly vetted for this kind of thing (and even then I still want the read barriers and type state analysis to kick in, because this can never be correct).

There is a reason why my DIP makes stackless coroutines default to @safe, rather than letting them default to @system like everything else.
 On top of that, there's the "small" issue that you can't really be sure 
 what state the application has been left in. I certainly wouldn't want 
 to risk any silent data corruption and would rather kill the process 
 ASAP to start it again from a known good state.
By doing this you have killed all other tasks, and killing those tasks loses you money. Throw in, say, a scraper and you could still be down all day or more. It is entirely unnecessary downtime given accepted, widely used solutions. It would be bad engineering to ignore this.

It is important to note that a task isn't always a process. But once an event like a null dereference occurs, that task must die. If anything prevents that task from cleaning up, then yes, the process dies.
Apr 14
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 It is important to note that a task isn't always a process. But 
 once an event like null dereference occurs that task must die.
It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state even without the crash.

The crash is what allows that bad state to be fixed. Simply limping along (avoiding the deref) is sweeping the issue under the rug. One doesn't know what else within the complete system is at fault. The @safe annotation is not yet sufficient to ensure that the rest of the system is in a valid state.

Fail fast, and design the complete architecture so a redundant HA instance can take the (or a sub-set of the) load until the system can regain its redundant operating condition. That is exactly what we do with routers.
Apr 14
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/04/2025 3:42 AM, Derek Fawcus wrote:
 On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 It is important to note that a task isn't always a process. But once 
 an event like null dereference occurs that task must die.
It is not the dereference which is the issue, that is the downstream symptom of an earlier problem.  If that reference is never supposed to be null, then the program is already in a non deterministic even without the crash.
You are not the first to say this, and it's indicative of not understanding the scenario.

Coroutines are used for business logic. They are supposed to have guarantees that they can always be cleaned up on framework-level exceptional events. That includes attempts at null dereferencing or out-of-bounds access of slices. They should never have the ability to corrupt the entire program. They need to be @safe.

If @safe allows program corruption, then it needs fixing. If you call @trusted code that isn't shipped with the compiler, it isn't our fault it wasn't vetted. But that is what @trusted exists for: so that you can do unsafe things and present a safe API.

Any scenario other than coroutines will result in a process death.
 The crash is what allows that bad state to be fixed.  Simply limping 
 along (aoiding the deref) is sweeping the issue under the rug.  One 
 doesn't know what else within the complete system is at fault.
Except it isn't the entire system that could be bad. It's one single-threaded, business-logic-laden task. It is not library or framework code that is bad. A single piece of business logic, likely written by someone of graduate-level skill, failed to account for something. This is not the same thing as the entire program being corrupt.

If some kind of coroutine isn't in use, the program still dies, just like today.
 The  safe annotation is not yet sufficient to ensure that the rest of 
 the system is in a valid state.
Please elaborate. If @safe has a bug, it needs solving.
 Fail fast, and design the complete architecture with redundant HA 
 instance can take the (or a sub-set of the) load until the system can 
 regain its redundant operating condition.That is exactly what we do with 
 routers.
While all of those are within my recommendations generally, they do not cover this particular scenario. This is far too common an issue, and within business logic, which will commonly be found inside a coroutine (including Fiber), it should not be bringing down the entire process. ASP.net, thanks to .net, offers this guarantee for a reason.
Apr 14
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Monday, 14 April 2025 at 16:12:35 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 You are not the first to say this, and its indicative of not 
 understanding the scenario.

 Coroutines are used for business logic.

 They are supposed to have guarantees where they can always be 
 cleaned up on framework level exceptional events. That includes 
 attempts at null dereferencing or out of bounds access of 
 slices.

 They should never have the ability to corrupt the entire 
 program. They need to be  safe.

 If  safe allows program corruption, then it needs fixing.
I'd have to suggest you're chasing something which can not be achieved.

It is always possible for the program to get into a partial, or fully, unrecoverable state, where some portion of the system is inoperative, or operating incorrectly. The only way to recover in that case is to restart the program. The reason the program can not recover is that there is a bug, where some portion is able to drive into an unanticipated area, and there is not the correct logic to recover, as the author did not think of that scenario.

I recently wrote a highly concurrent program in Go. This was in CSP style, taking care to only access slices and maps from one goroutine, and not accidentally capturing free variables in lambdas. So this manually covered the "safety" escapes which the Rust folks like to point to in Go. The "safety" provided being greater than D currently offers. Also nil pointers are present in Go, and will usually crash the complete program. One is able to catch panics if one desires, so being similar to your exception case.

There were many goroutines, and a bunch dynamically started and stopped which ran "business logic". Despite that, it was possible (due to some bugs of mine) to get the system into a state where things could not recover, or in some cases took 30 mins to recover. A flaw could be detected (via outside behaviour), and could be reasoned through by analysing the log files. However for the error which did not clear after 30 mins, there was no way for the system to come back into a full operational state without the program being restarted. A similar situation would be achievable in a Rust program.

In trying to handle and recover from such things, you're up against a variant of Gödel's incompleteness theorem, and anyone offering a language which "solves" that is IMHO selling snake oil.
Apr 14
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/04/2025 8:53 AM, Derek Fawcus wrote:
 In trying to handle and recover from such things, you're up against a 
 variant of Gödel's incompleteness theorem, and anyone offering a 
 language which "solves" that is IMHO selling snake oil.
In my original post I proposed a three-tier solution. The read barriers are secondary to the language guarantees via type state analysis.

You need them when using a fast DFA engine that doesn't do full control flow graph analysis and ignores a variable when it cannot analyze it. But if you are ok with having a bit of pain in terms of what can be modeled, and can accept a bit of slowness, you won't need the read barriers. Unfortunately not everyone will accept the slower DFA, therefore it can't be on by default. I know this because it copies a couple of the perceived negative traits of DIP1000.

So yes, we can solve this, but MT could still mess it up; hence the last of the three solutions: signals killing the process.
Apr 14
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/04/2025 9:28 AM, Richard (Rikki) Andrew Cattermole wrote:
 On 15/04/2025 8:53 AM, Derek Fawcus wrote:
 In trying to handle and recover from such things, you're up against a 
 variant of Gödel's incompleteness theorem, and anyone offering a 
 language which "solves" that is IMHO selling snake oil.
In my original post I proposed a three-tier solution. The read barriers are secondary to the language guarantees via type state analysis. You need them when using a fast DFA engine that doesn't do full control flow graph analysis and ignores a variable when it cannot analyze it. But if you are OK with having a bit of pain in terms of what can be modeled, and can accept a bit of slowness, you won't need the read barriers.

Unfortunately, not everyone will accept the slower DFA, therefore it can't be on by default. I know this because it copies a couple of the perceived negative traits of DIP1000.

So yes, we can solve this, but MT (multithreading) could still mess it up, hence the last option in the three-tier solution: signals killing the process.
I should mention that, like the assert handler, you would have the ability to configure the read barrier to do whatever you want at runtime. So if you prefer it to kill the process, you can. As for what the default would be? I don't know. The benefit of having it is that we can likely produce a stack trace where there might not be one otherwise.
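As a sketch of how such a configurable barrier might look (hypothetical names, C++ used purely for illustration; nothing here is an actual D runtime API):

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical handler type, analogous to D's assert handler, invoked by
// the compiler-emitted read barrier when a pointer about to be
// dereferenced turns out to be null.
using NullDerefHandler = void (*)(const char *file, int line);

// Default policy: throw, so the failing task can be torn down cleanly
// and a stack trace captured.
static void throwingHandler(const char *file, int line) {
    std::fprintf(stderr, "null dereference at %s:%d\n", file, line);
    throw std::runtime_error("null pointer dereference");
}

static NullDerefHandler g_handler = throwingHandler;

// The program can swap in its own policy at runtime,
// e.g. one that calls std::abort for those who prefer the process to die.
void setNullDerefHandler(NullDerefHandler h) { g_handler = h; }

// What the compiler would emit before each dereference it cannot prove
// safe, in the same spirit as an array bounds check.
template <typename T>
T &nullCheck(T *p, const char *file, int line) {
    if (p == nullptr)
        g_handler(file, line);
    return *p;
}
```

The default here throws so a task can die without taking the process with it; installing a handler that aborts recovers the kill-the-process behaviour.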
Apr 14
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:
 On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 It is important to note that a task isn't always a process. 
 But once an event like null dereference occurs that task must 
 die.
It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state, even without the crash.
This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best. Trying to recover is the equivalent of a compiler resolving a syntax ambiguity with a random number generator.

Null dereference?

1. Is it because I trusted a user value? => validate user input, rebuild, redeploy
2. Is it because I forgot to initialize something? => initialize it, rebuild, redeploy
3. Is it because I forgot to validate something? => do the validation properly, fix whatever it was sending in invalid data, rebuild, redeploy
4. Is it something else? => thank you program, for crashing instead of corrupting everything. Now, time to find the memory corruption somewhere.

Similar flow chart for out-of-bounds errors.

-Steve
Apr 14
next sibling parent reply Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Tuesday, 15 April 2025 at 02:48:42 UTC, Steven Schveighoffer 
wrote:
 On Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:
 On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 It is important to note that a task isn't always a process. 
 But once an event like null dereference occurs that task must 
 die.
It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state, even without the crash.
This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences, and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best.
This simply is not manageable. Sometimes it is better to have semi-operating systems rather than ones that don't work at all because a minor thing in a tertiary module causes an NPE while a fix is in progress.
Apr 15
parent Paolo Invernizzi <paolo.invernizzi gmail.com> writes:
On Tuesday, 15 April 2025 at 07:23:14 UTC, Alexandru Ermicioi 
wrote:
 On Tuesday, 15 April 2025 at 02:48:42 UTC, Steven Schveighoffer 
 wrote:
 On Monday, 14 April 2025 at 15:42:07 UTC, Derek Fawcus wrote:
 On Monday, 14 April 2025 at 15:24:37 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 It is important to note that a task isn't always a process. 
 But once an event like null dereference occurs that task 
 must die.
It is not the dereference which is the issue; that is the downstream symptom of an earlier problem. If that reference is never supposed to be null, then the program is already in a non-deterministic state, even without the crash.
This is the exact problem. The solution proposed here just doesn't understand what the actual problem is. Null dereferences, and index out-of-bounds are *programming errors*. You need to fix them in the program, not recover and hope for the best.
This simply is not manageable. Sometimes it is better to have semi-operating systems rather than ones that don't work at all because a minor thing in a tertiary module causes an NPE while a fix is in progress.
Au contraire! That's exactly why today we have:

- kernels (or microkernels!)
- processes living in userland
- threads living in processes
- coroutines (or similar stuff, whatever variant and name they take) living on threads
- and last but not least, VMs, plenty of them.

I don't see a real use case for trying to recover a process from UB at all: go down the list and choose another layer.

/P
Apr 15
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Thank you, Steven. This is correct.
Apr 16
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 18:19:58 UTC, Walter Bright wrote:
 Thank you, Steven. This is correct.
Yup - I like the crash...

However I do have an interest in being able to write code with distinct nullable and non-null pointers, such that the compiler (or an SA tool) can complain when they're incorrectly confused. So passing (or assigning) a nullable pointer to a non-null one without a prior check should generate a compile error or warning. That should only require function-local DFA.

The reason to want it is simply that test cases may not exercise complete coverage of the various paths when one only has the C-style pointer, so it should allow for easy latent-bug detection and fixes when one is not bypassing the type system. If one is bypassing the type system, then one takes the risks, but the SIGSEGV is still there to catch the bug.

(Yes, I've programmed under DOS. I also took advantage of a protected-mode OS (FlexOS), when available, to prove and debug the code first. The Turbo C style 'detected a null pointer write' at program exit, while occasionally useful, was grossly inadequate.)
Apr 16
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 11:43 AM, Derek Fawcus wrote:
 However I do have an interest in being able to write code with distinct
nullable 
 and nonnull pointers.  That such that the compiler (or an SA tool) can
complain 
 when they're incorrectly confused.
That's what templates are for!
Apr 16
next sibling parent reply Dave P. <dave287091 gmail.com> writes:
On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:
 On 4/16/2025 11:43 AM, Derek Fawcus wrote:
 However I do have an interest in being able to write code with 
 distinct nullable and nonnull pointers.  That such that the 
 compiler (or an SA tool) can complain when they're incorrectly 
 confused.
That's what templates are for!
I think what people are complaining about isn't that null pointers exist at all, it's that every pointer and reference type has a null value instead of you opting into it. The type system fights you if you want to use it to prove or enforce that certain values are nullable or not.

Something that people don't bring up is that clang's nullability extension allows you to change what the default is. You put a `#pragma clang assume_nonnull begin` at the top of your C/C++/Objective-C code and you have to annotate only the nullable pointers. Most pointers in a program should be non-null, and the nullable ones should be the exception that you have to annotate.
Apr 16
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 12:57 PM, Dave P. wrote:
 You put a `#pragma clang 
 assume_nonnull begin` at the top of your C/C++/Objective-C code and you have
to 
 annotate only the nullable pointers. Most pointers in a program should be 
 non-null and the nullable ones should be the exception that you have to
annotate.
Annotation means more than one pointer type.

Back in the old MSDOS days, there were 5 pointer types - near, far, stack, code and huge. Dealing with that is a gigantic mess - which pointer type does strlen() take? Or worse, strcpy()?

Microsoft's Managed C++ has two pointer types with different syntax, a GC pointer and a non-GC pointer. The same problem - what pointer type does strcpy() accept?

It's an ugly mess, and why I've avoided any such thing in D.

I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?
Apr 16
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 9:12 AM, Walter Bright wrote:
 On 4/16/2025 12:57 PM, Dave P. wrote:
 You put a `#pragma clang assume_nonnull begin` at the top of your C/C+ 
 +/Objective-C code and you have to annotate only the nullable 
 pointers. Most pointers in a program should be non-null and the 
 nullable ones should be the exception that you have to annotate.
Annotation means more than one pointer type.
Annotating the type and annotating the variable/expression are two different things. In the DFA literature they have distinct properties and are applied differently.

I read Principles of Program Analysis today; it was very interesting and did have some details on the subject (but not much). It also confirmed some things that I had already come up with independently, which was nice!

From what I've seen, application VM languages annotate the type, whereas C++ annotates the variable. As of DIP1000, we annotate the variable, i.e. scope.

From a link made previously in this thread, the state of the art annotation of nullability in C++: https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html

Very similar to what I'm wanting.
 Back in the old MSDOS days, there were 5 pointer types - near, far, 
 stack, code and huge. Dealing with that is a gigantic mess - which 
 pointer type does strlen() take? Or worse, strcpy()?
 
 Microsoft's Managed C++ has two pointer types with different syntax, a 
 GC pointer and a non-GC pointer. The same problem - what pointer type 
 does strcpy() accept?
I genuinely would prefer throwing a full-fledged CFG DFA at this kind of thing, and only annotating the variable, not the type.

It's a shame not everyone would accept that as a solution. It is forcing me to verify that there are no alternative solutions for these people. I know you and I, Walter, would be happy with a full CFG DFA as the resolution, but alas.

I remain heavily concerned at the idea of boxing types in D, in any scenario. It seems to spell an absolute mess in any attempt I have modeled mentally.
 It's an ugly mess, and why I've avoided any such thing in D.
 
 I'm curious - how does one traverse a binary tree with non-null 
 pointers? How does one create a circular data structure with non-null 
 pointers?
Sentinels. They are used pretty heavily in data structures, such as head and foot nodes. My recommendation for data structure/algorithm book: https://www.amazon.com/Algorithms-Parts-1-4-Fundamentals-Structures/dp/0201314525
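A minimal sketch of the sentinel answer to Walter's binary-tree question (illustrative C++; the names are invented here, not taken from the book):

```cpp
// One way to traverse a binary tree with "non-null" child pointers:
// a single shared sentinel node terminates every branch. Child links
// always point at a real Node object, so they could be typed non-null;
// "no child" is expressed as "points at the sentinel".
struct Node {
    int value;
    Node *left;   // never null: points at nil when there is no child
    Node *right;  // never null: points at nil when there is no child
};

// The sentinel points at itself, so even walking "past" it is harmless.
static Node nil = {0, &nil, &nil};

bool contains(Node *root, int key) {
    Node *cur = root;
    while (cur != &nil) {  // compare against the sentinel, never against null
        if (key == cur->value)
            return true;
        cur = (key < cur->value) ? cur->left : cur->right;
    }
    return false;
}
```

The same trick covers circular structures: the list/tree is seeded with a sentinel, so links are never null even when the structure is logically empty.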
Apr 16
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 21:30:58 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 From a link made previously in this thread, the state of the 
 art annotation of nullability in C++: 
 https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html
An example using it:

```C
// Compile with -Wnullable-to-nonnull-conversion
#if defined(__clang__)
#define ASSUME_NONNULL_BEGIN _Pragma("clang assume_nonnull begin")
#define ASSUME_NONNULL_END _Pragma("clang assume_nonnull end")
#else
#define ASSUME_NONNULL_BEGIN
#define ASSUME_NONNULL_END
#define __has_feature(x) (0)
#endif /* clang */

#if __has_feature (nullability)
#define NONNULL _Nonnull
#define NULLABLE _Nullable
#define NULL_UNSPECIFIED _Null_unspecified
#else
#define NONNULL
#define NULLABLE
#define NULL_UNSPECIFIED
#endif /* nullability */

ASSUME_NONNULL_BEGIN

int *someNonNull(int *);
int *someNullable(int * NULLABLE);

int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3)
{
    int a = 0;
    int * NONNULL ptr;

    a += *someNullable(arg1);
    a += *someNullable(arg2);
    a += *someNullable(arg3);

    ptr = arg1;
    ptr = arg2;
    ptr = arg3;

    a += *someNonNull(arg1);
    a += *someNonNull(arg2);
    a += *someNonNull(arg3);

    return a;
}

ASSUME_NONNULL_END
```

```
$ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
ttt.c:37:8: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
        ptr = arg3;
              ^
ttt.c:41:20: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
        a += *someNonNull(arg3);
                          ^
ttt.c:29:16: warning: variable 'ptr' set but not used [-Wunused-but-set-variable]
        int * NONNULL ptr;
                      ^
3 warnings generated.
```
Apr 16
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 22:43:24 UTC, Derek Fawcus wrote:
 ```
 $ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
 ttt.c:37:8: warning: implicit conversion from nullable pointer 
 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' 
 [-Wnullable-to-nonnull-conversion]
         ptr = arg3;
               ^
 ttt.c:41:20: warning: implicit conversion from nullable pointer 
 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' 
 [-Wnullable-to-nonnull-conversion]
         a += *someNonNull(arg3);
                           ^
 ttt.c:29:16: warning: variable 'ptr' set but not used 
 [-Wunused-but-set-variable]
         int * NONNULL ptr;
                       ^
 3 warnings generated.
 ```
Having now got this to complain in the desired fashion, I'll now be applying it to some code at work. More help in quashing sources of bugs.
Apr 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 10:54 AM, Derek Fawcus wrote:
 On Wednesday, 16 April 2025 at 22:43:24 UTC, Derek Fawcus wrote:
 ```
 $ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
 ttt.c:37:8: warning: implicit conversion from nullable pointer 'int * 
 _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable- 
 to-nonnull-conversion]
         ptr = arg3;
               ^
 ttt.c:41:20: warning: implicit conversion from nullable pointer 'int * 
 _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable- 
 to-nonnull-conversion]
         a += *someNonNull(arg3);
                           ^
 ttt.c:29:16: warning: variable 'ptr' set but not used [-Wunused-but- 
 set-variable]
         int * NONNULL ptr;
                       ^
 3 warnings generated.
 ```
Having now got this to complain in the desired fashion, I'll now be applying it to some code at work.  More help in quashing sources of bugs.
After a quick play, I would suggest also passing `--analyze` if you are not already doing so. It covers the case where you are not assuming non-null. I.e. it can see the difference between:

```c++
void test(int *p) {
    if (!p)
        *p = 0; // warn
}
```

and

```c++
void test2(int *p) {
    if (p)
        *p = 0; // ok
}
```

This is definitely DFA.
Apr 16
prev sibling next sibling parent reply Dave P. <dave287091 gmail.com> writes:
On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:
 On 4/16/2025 12:57 PM, Dave P. wrote:
 You put a `#pragma clang assume_nonnull begin` at the top of 
 your C/C++/Objective-C code and you have to annotate only the 
 nullable pointers. Most pointers in a program should be 
 non-null and the nullable ones should be the exception that 
 you have to annotate.
Annotation means more than one pointer type. Back in the old MSDOS days, there were 5 pointer types - near, far, stack, code and huge. Dealing with that is a gigantic mess - which pointer type does strlen() take? Or worse, strcpy()? Microsoft's Managed C++ has two pointer types with different syntax, a GC pointer and a non-GC pointer. The same problem - what pointer type does strcpy() accept? It's an ugly mess, and why I've avoided any such thing in D. I'm curious - how does one traverse a binary tree with non-null pointers? How does one create a circular data structure with non-null pointers?
There are three annotations: `_Nullable`, `_Nonnull` and `_Null_unspecified`. `_Null_unspecified` can freely convert between the other two types. If you don't annotate a pointer (and don't change the default), then it is `_Null_unspecified`, for compatibility with existing code. `strlen` and company thus take `_Null_unspecified`. They *should* take `_Nonnull`, but it's old code.

For data structures, you mostly use `_Null_unspecified`, as the nullability is not as simple as "this pointer may be null or not"; it is a potentially dynamic invariant of your data structure. In D, you would annotate it as an `@system` variable so only trusted code can modify it.

Where the annotations are really valuable is for function arguments and returns. Can this argument be null or not? Now the compiler can help you, instead of you having to memorize the documentation of every function you call. I've consistently applied these annotations to my C projects and it works wonders. Many classes of mistakes are caught at compile time.

Some things are a dynamic property of your system, and for that there is a nullability sanitizer that can detect at runtime if a `_Nonnull` pointer gets a null value (usually from a data structure that was not properly initialized).
Apr 16
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 2:49 PM, Dave P. wrote:
 Where the annotations are really valuable is for function arguments and
return. 
 Can this argument be null or not? Now the compiler can help you instead of you 
 having to memorize the documentation of every function you call.
I've long since given up on memorizing the documentation of each parameter. Too many functions! I just google it, it just takes a sec.
 I’ve consistently applied these annotations to my C projects and it works 
 wonders. Many classes of mistakes are caught at compile time.
I agree that compile time detection is always better than runtime. I get a null pointer seg fault now and then. I look at the stack trace, and have it fixed in the same amount of time it takes you. I don't worry about it, because it is not a memory corruption issue. The time I spend tracking down a null pointer seg fault does not even register among the time I spend programming. You're not wrong. But it's a cost/benefit thing.
 Somethings are a
 dynamic property of your system, and for that there is a nullability sanitizer
 that can detect at runtime if a `_Nonnull` pointer gets a null value (usually
 from a data structure that was not properly initialized).
I don't see how a runtime detector can work better than a seg fault with stack trace.
Apr 16
next sibling parent reply Dave P. <dave287091 gmail.com> writes:
On Thursday, 17 April 2025 at 05:31:48 UTC, Walter Bright wrote:
 On 4/16/2025 2:49 PM, Dave P. wrote:
 [...]
I've long since given up on memorizing the documentation of each parameter. Too many functions! I just google it, it just takes a sec.
It’s easier to goto-definition and see the annotations. Fancier editors can even show it inline.
 [...]
 I don't see how a runtime detector can work better than a seg 
 fault with stack trace.
The source of a null pointer can be very far from its eventual dereference, especially when stored in a structure. The nullability sanitizer also doesn't kill your program by default; it just logs when a null pointer is stored in a nonnull variable. So you can see where the first null store is made and follow how it eventually flows to where it is dereferenced.

---

It's funny though, when I first started using this kind of thing it caught loads of bugs. But after years of use, I realized it hadn't caught any bugs in a long time. I had internalized the rules.
Apr 17
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/17/2025 12:21 AM, Dave P. wrote:
 It’s funny though, when I first started using this kind of thing it caught
loads 
 of bugs. But after years of use, I realized it hadn’t caught any bugs in a
long 
 time. I had internalized the rules.
Same thing happened to me. When I pointed it out on HackerNews recently, I was lambasted for being "arrogant". LOL. I make different kinds of mistakes these days. Mostly due to failure to understand the problem I am trying to solve, rather than coding errors.
Apr 18
prev sibling parent reply user1234 <user1234 12.de> writes:
On Thursday, 17 April 2025 at 05:31:48 UTC, Walter Bright wrote:
 [...]
 I don't see how a runtime detector can work better than a seg 
 fault with stack trace.
Actually I have implemented a runtime detector in another language and that gives me -- quite rarely -- things like
 temp.sx:11:7: runtime error, member read with null `this`
So you know directly where the problem is, and often there's no need even to run again in gdb. That is implemented in the Dlang equivalent of the DotVarExp. Concretely, it's about codegening the same as for an assertion, just right before loading from the LHS. That's why, previously in the thread, I called that an "implicit contract". Aren't assertions the most primitive form of contracts?
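A rough sketch of that lowering (illustrative C++; the `checkThis` helper and the column number are invented stand-ins for the assertion the compiler would emit before the member load):

```cpp
#include <cstdio>
#include <cstdlib>

// Before every member read through a pointer, emit an assertion-like
// check carrying the exact source location, then perform the load.
template <typename T>
T *checkThis(T *p, const char *file, int line, int col) {
    if (p == nullptr) {
        // Mirrors the diagnostic quoted above: file:line:col plus message.
        std::fprintf(stderr,
                     "%s:%d:%d: runtime error, member read with null `this`\n",
                     file, line, col);
        std::abort();
    }
    return p;
}

struct S { int x; };

// What a member access `s.x` would compile to under such a scheme.
int readX(S *s) {
    return checkThis(s, __FILE__, __LINE__, 12)->x;
}
```

On the happy path this is just a compare-and-branch before the load; only the null case pays for the diagnostic.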
Apr 17
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/17/2025 4:03 AM, user1234 wrote:
 Aren't assertions the most primitive form of contracts ?
Yes, indeed they are.
Apr 18
prev sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:
 I'm curious - how does one traverse a binary tree with non-null 
 pointers? How does one create a circular data structure with 
 non-null pointers?
One has nullable and non-null pointers. One uses the former for the circular data structures.

Now the current ugly bit with clang is this:

```C
/* excerpt; ptr and a are declared as in the earlier example */
int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3)
{
    if (arg3 != 0)
        ptr = arg3;
    return a;
}
```

```
$ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
ttt.c:36:9: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
        ptr = arg3;
              ^
```

i.e. there is no DFA, and one still has to cast the value in the assignment. So the clang support could stand some improvement.
Apr 16
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 10:50 AM, Derek Fawcus wrote:
 On Wednesday, 16 April 2025 at 21:12:08 UTC, Walter Bright wrote:
 I'm curious - how does one traverse a binary tree with non-null 
 pointers? How does one create a circular data structure with non-null 
 pointers?
One has nullable and non-null pointers. One uses the former for the circular data structures.

Now the current ugly bit with clang is this:

```C
/* excerpt; ptr and a are declared as in the earlier example */
int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3)
{
    if (arg3 != 0)
        ptr = arg3;
    return a;
}
```

```
$ clang-14 -Wall -Wnullable-to-nonnull-conversion -c ttt.c
ttt.c:36:9: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
        ptr = arg3;
              ^
```

i.e. there is no DFA, and one still has to cast the value in the assignment. So the clang support could stand some improvement.
I've been implementing this specific thing in my fast DFA implementation recently. Truthiness is needed for it to work. However you can still be using DFA without supporting truthiness of variables. DFA is a very large subject domain.
Apr 16
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Sounds like one must do a bunch of casting to make use of non-nullable pointers?
Apr 16
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
```C
int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE arg3) {
```

https://www.youtube.com/watch?v=vEr0EAcSWcc
Apr 16
parent Derek Fawcus <dfawcus+dlang employees.org> writes:
On Thursday, 17 April 2025 at 05:09:41 UTC, Walter Bright wrote:
 ```C
 int foo(int * arg1, int * NULL_UNSPECIFIED arg2, int * NULLABLE 
 arg3) {
 ```

 https://www.youtube.com/watch?v=vEr0EAcSWcc
Yeah, but that is test code, as I'm experimenting with the mechanism, and trying to figure out how it can be used, and what (if any) value I perceive it as adding. I rather suspect the norm would be to use the pragma such that most uses are implicitly non-null, and only the exceptional nullable pointer would have to be so marked.

On this particular point, my preferred form would be something similar to what Cyclone had, but not using its syntax forms. Possibly using the syntax from Zig. So the common case would be a non-null pointer, using '*' alone in its declaration. The nullable pointer would probably be declared as '?*' (or '*?' - not sure which). Then for use, one could not assign from a nullable to a non-null unless DFA had already proved the value was not null.

Except those would be non-backward-compatible changes, so one would somehow have to opt in to them.
Apr 17
prev sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:
 On 4/16/2025 11:43 AM, Derek Fawcus wrote:
 However I do have an interest in being able to write code with 
 distinct nullable and nonnull pointers.  That such that the 
 compiler (or an SA tool) can complain when they're incorrectly 
 confused.
That's what templates are for!
I can't say I've played with them much, having come from C, not C++.

I see there is the ability to overload the unary '*' operator, and so can imagine how one could define a struct providing a non-null form of pointer. But just how awkward is that going to be for mixed forms of nullability in function definitions? Without trying, I suspect it will just get too awkward.

e.g., how would the equivalent of these args end up in a D rendition:

```C
int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b);
```
Apr 16
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 11:08 AM, Derek Fawcus wrote:
 On Wednesday, 16 April 2025 at 19:44:09 UTC, Walter Bright wrote:
 On 4/16/2025 11:43 AM, Derek Fawcus wrote:
 However I do have an interest in being able to write code with 
 distinct nullable and nonnull pointers.  That such that the compiler 
 (or an SA tool) can complain when they're incorrectly confused.
That's what templates are for!
I can't say I've played with them much, having come from C, not C++.

I see there is the ability to overload the unary '*' operator, and so can imagine how one could define a struct providing a non-null form of pointer. But just how awkward is that going to be for mixed forms of nullability in function definitions? Without trying, I suspect it will just get too awkward.

e.g., how would the equivalent of these args end up in a D rendition:

```C
int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b);
```
The current C++ analyzer for clang doesn't support that.

```c++
int foo(char ** _Nullable a, int ** _Nonnull b);
```

Would be more accurate.

For D, my design work for type state analysis would have it be:

```d
int foo(/*?initialized*/ char** a, ?nonnull int** b);
```

If you want to dereference twice, you need to perform a load + check.

```d
if (auto c = *b) {
    int d = *c;
}
```

Some syntax sugar can lower to that.
Apr 16
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 4:08 PM, Derek Fawcus wrote:
 e.g., how would the equivalent of this args end up in a D rendition:
 
 ```C
 int foo(char * _Nonnull * _Nullable a, int * _Nullable * _Nonnull b);
 ```
```d
int foo(_Nullable!(_Nonnull!char) a, _Nonnull!(_Nullable!int) b);
```
Apr 16
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Here's something I spent 5 minutes on:

```d
struct NonNull(T)
{
     T* p;
     T* ptr() { return p; }
     alias this = ptr;
}

int test(NonNull!int np)
{
     int i = *np;
     int* p1 = np;
     int* p2 = np.ptr;
     np = p1; // Error: cannot implicitly convert expression `p1` of type `int*` to `NonNull!int`
     return i;
}
```
Note that NonNull can be used as a pointer, can be implicitly converted to a 
pointer, but a pointer cannot be implicitly converted to a NonNull.

There's more window dressing one would want, like a constructor with a null 
check, but the basic idea looks workable and is not complicated.
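For comparison, a rough C++ analogue of the same idea, with the constructor null check added (purely illustrative; not a proposed API):

```cpp
#include <cassert>

// Same shape as the D sketch: a NonNull<T> implicitly converts *to* a raw
// pointer, but a raw pointer cannot implicitly become a NonNull. The
// explicit constructor is the "window dressing" that checks for null once,
// at the boundary, so dereferences need no further checks.
template <typename T>
struct NonNull {
    T *p;
    explicit NonNull(T *q) : p(q) { assert(q != nullptr); }
    operator T *() const { return p; }  // usable wherever a T* is expected
    T &operator*() const { return *p; }
};

int test(NonNull<int> np) {
    int i = *np;     // dereference like a pointer
    int *p1 = np;    // implicit conversion out: fine
    (void)p1;
    // np = p1;      // would not compile: no implicit T* -> NonNull<T>
    return i;
}
```

The asymmetry is the point: leaving the non-null world is free, entering it costs one explicit, checked conversion.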
Apr 17
next sibling parent reply Dave P. <dave287091 gmail.com> writes:
On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:
 Here's something I spent 5 minutes on:

 ```d
 struct NonNull(T)
 {
     T* p;
     T* ptr() { return p; }
     alias this = ptr;
 }

 int test(NonNull!int np)
 {
     int i = *np;
     int* p1 = np;
     int* p2 = np.ptr;
     np = p1; // Error: cannot implicitly convert expression `p1` of type `int*` to `NonNull!int`
     return i;
 }
 ```
 Note that NonNull can be used as a pointer, can be implicitly 
 converted to a pointer, but a pointer cannot be implicitly 
 converted to a NonNull.

 There's more window dressing one would want, like a constructor 
 with a null check, but the basic idea looks workable and is not 
 complicated.
First, to support classes, you'd have to make a slight change:

```d
struct NonNull(T)
{
    T p;
    T ptr() { return p; }
    alias this = ptr;
}
```

It works, but the syntax/defaults are backwards. Why does the unusual case of a nullable pointer get the nice syntax, while the common case gets the `NonNull!(int*)` syntax? Who is going to write that all over their code?
Apr 17
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via Digitalmars-d wrote:
 On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:
 First to support classes, you’d have to make a slight change
 ```d
 struct NonNull(T)
 {
      T p;
      T ptr() { return p; }
      alias this = ptr;
 }
 ```

 It works, but the syntax/defaults is backwards. Why does the
 unusual case of a nullable pointer get the nice syntax while the
 common case gets the `NonNull!(int*)` syntax? Who is going to
 write that all over their code?
I am in favor of making changes along the lines that Rikki is proposing so that we can better handle failure in multi-threaded environments without having to kill the entire program (and in addition, there are cases where dereferencing null pointers is not actually memory-safe, because either the platform doesn't protect against it, or a compiler like ldc or gdc will optimize the code based on the fact that it treats null pointer dereferencing as undefined behavior).

That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced. And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than being a random bug that occurs during execution. So, if someone feels the need to use non-null pointers of some kind all over the place, I'd be concerned about how they're writing their code such that it's even a problem - though it could easily be the case that they're just paranoid about it, which happens sometimes with a variety of issues and people. And as such, I really don't think that it's all that big a deal if a wrapper type is required to guarantee that a pointer isn't null. Most code shouldn't need anything of the sort.

However, as I understand it, there _are_ memory-safety issues with null pointers with ldc (and probably gdc as well) due to how they optimize. IIRC, the core problem is that they treat dereferencing a null pointer as undefined behavior, and that can have serious consequences with regards to what the code does when there actually is a null pointer. dmd doesn't have the same problem, from what I understand, simply because it's not as aggressive with its optimizations. So, Walter's stance makes sense based on what dmd is doing, but it doesn't necessarily make sense for D compilers in general.
So, between that and the issues with platforms such as webasm, I am inclined to think that we should treat dereferencing pointers like we treat accessing elements of arrays and insert checks that the compiler can then optimize out. And we can provide flags to change that behavior just like we do with array bounds checking. But if we cannot guarantee that attempting to dereference a null pointer is always safe (and as I understand it, outside of using dmd, we can't), then that's a hole in safe. And no matter how rare dereferencing null is or isn't in practice, we can't have holes in safe and have it actually give guarantees about memory safety. - Jonathan M Davis
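The "insert checks that the compiler can then optimize out" idea can be sketched roughly as follows. This is a minimal illustration in C++, not an actual compiler mechanism; `checkedDeref` and `NullDereferenceError` are hypothetical names, and a real implementation would let data flow analysis elide checks it can prove redundant, just as with bounds checks:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical sketch of compiler-inserted null checks, analogous to
// array bounds checks. NullDereferenceError and checkedDeref are
// illustrative names, not an existing API.
struct NullDereferenceError : std::runtime_error {
    NullDereferenceError() : std::runtime_error("null pointer dereference") {}
};

template <typename T>
T& checkedDeref(T* p) {
    if (p == nullptr)
        throw NullDereferenceError();  // read barrier: throw instead of faulting
    return *p;  // safe: p was proven non-null above
}
```

Conceptually, every language-level dereference would be lowered through such a barrier, and a build flag could disable the checks the same way bounds checking can be disabled.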
Apr 17
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
I'd like to know what those gdc and ldc transformations are, and whether they 
are controllable with a switch to their optimizers.

I know there's a problem with WASM not faulting on a null dereference, but in 
another post I suggested a way to deal with it.
Apr 17
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via Digitalmars-d
wrote:
 I'd like to know what those gdc and ldc transformations are, and whether they
 are controllable with a switch to their optimizers.

 I know there's a problem with WASM not faulting on a null dereference, but in
 another post I suggested a way to deal with it.
Unfortunately, my understanding isn't good enough to explain those details. I discussed it with Johan in the past, but I've never worked on ldc or with llvm (or on gdc/gcc), so I really don't know what is or isn't possible. However, from what I recall of what Johan said, we were kind of stuck, and llvm considered dereferencing null to be undefined behavior. It may be the case that there's some sort of way to control that (and llvm may have more capabilities in that regard since I last discussed it with Johan), but someone who actually knows llvm is going to have to answer those questions. And I don't know how gdc's situation differs either. - Jonathan M Davis
Apr 19
parent reply Johan <j j.nl> writes:
On Saturday, 19 April 2025 at 22:49:19 UTC, Jonathan M Davis 
wrote:
 On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via 
 Digitalmars-d wrote:
 I'd like to know what those gdc and ldc transformations are, 
 and whether they are controllable with a switch to their 
 optimizers.

 I know there's a problem with WASM not faulting on a null 
 dereference, but in another post I suggested a way to deal 
 with it.
Unfortunately, my understanding isn't good enough to explain those details. I discussed it with Johan in the past, but I've never worked on ldc or with llvm (or on gdc/gcc), so I really don't know what is or isn't possible. However, from what I recall of what Johan said, we were kind of stuck, and llvm considered dereferencing null to be undefined behavior.
There is a way now to tell LLVM that dereferencing null is _defined_ (nota bene) behavior.
 It may be the case that there's some sort of way to control 
 that (and llvm may have more capabilities in that regard since 
 I last discussed it with Johan), but someone who actually knows 
 llvm is going to have to answer those questions. And I don't 
 know how gdc's situation differs either.
So far I have not responded in this thread because I feel it is an old discussion, with old misunderstandings. There is confusion between dereferencing in the language versus dereferencing by the CPU. What I think C and C++ do very well is separate language behavior from implementation/CPU behavior, and only prescribe language behavior, with no (or very little) implementation behavior. I feel D should do the same.

Non-virtual method example, where (in my opinion) the dereference happens at the call site, not inside the function:

```
class A {
    int a;
    final void foo() { // non-virtual
        a = 1; // no dereference here
    }
}

A a;
a.foo(); // <-- DEREFERENCE
```

During program execution, _with the current D implementation of classes and non-virtual methods_, the CPU will only "dereference" the `this` pointer to do the assignment to `a`. But that is only the case for our _current implementation_. For the D language behavior, it does not matter what the implementation does: the same behavior should happen on any architecture/platform/execution model.

If you want to fault on null-dereference, I believe you _have_ to add a null-check at every dereference at the _language_ level (regardless of implementation details). Perhaps it does not impact performance very much (with the optimizer enabled); I vaguely remember a paper from Microsoft where they tried this and did not see a big perf impact (if any).

Some notes to trigger you to think about distinguishing language behavior from CPU/implementation details:

- You don't _have_ to implement classes and virtual functions using a vptr/vtable; there are other options!
- There does not need to be a "stack" (implementation detail vocabulary). Some "CPUs" don't have a "stack", and instead do "local storage" (language vocabulary) in an alternative way. In fact, even on CPUs _with_ a stack, it can help not to use it! (read about Address Sanitizer's detection of stack-use-after-scope and ASan's "fake stack")
- Pointers don't have to be memory addresses (you probably already know that they are not physical addresses on common CPUs), but could probably be implemented as hashes/keys into a database as well. C does not define ordered comparison (e.g. > and <) for pointers (it's implementation defined, IIRC), except when they point into the same object (e.g. an array or struct). Why? Because what does it mean on segmented memory architectures (e.g. x86)?
- Distinguishing language from implementation behavior means that correct programs work the same on all kinds of different implementations (e.g. you can run your C++ program in a REPL, or run it in your browser through WASM).

cheers,
Johan
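The kind of optimizer transformation at stake in the earlier LLVM discussion can be sketched as follows. This is an illustration, not a claim about any specific compiler version; `readThenCheck` is a hypothetical function, and the flag mentioned in the comment (clang/gcc's `-fno-delete-null-pointer-checks`, corresponding to LLVM's `null_pointer_is_valid` function attribute) is, to my understanding, the sort of knob that makes null dereference defined:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: because dereferencing null is UB in C++, an
// optimizer may infer p != nullptr from the load below and delete the
// later null check as dead code. Flags like clang/gcc's
// -fno-delete-null-pointer-checks (LLVM's null_pointer_is_valid
// attribute) disable that inference by declaring address zero valid.
int readThenCheck(int* p) {
    int v = *p;        // optimizer may conclude p cannot be null here...
    if (p == nullptr)  // ...and remove this branch entirely at -O2
        return -1;
    return v;
}
```

With the check deleted, a program that actually passes null no longer takes the "safe" branch, which is exactly the memory-safety concern raised above.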
Apr 21
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 22/04/2025 5:29 AM, Johan wrote:
 If you want to fault on null-dereference, I believe you /have/ to add a 
 null-check at every dereference at /language/ level (regardless of 
 implementation details). Perhaps it does not impact performance very 
 much (with optimizer enabled); I vaguely remember a paper from Microsoft 
 where they tried this and did not see a big perf impact (if any).
I agree with what you're saying here, but I want to refine it a little bit. Every language dereference must have an _associated_ read barrier. What this means is:

```d
T* ptr;

readbarrier(ptr);
ptr.field1;
ptr.field2;

ptr = ...;
readbarrier(ptr);
ptr.field3;
```

A very simple bit of object tracking when inserting the checks will eliminate a ton of these; to be fair, we should be doing that for array bounds checking if we are not already. Also, the fast DFA which this would be used with would eliminate a ton of them, so performance should be a complete non-issue, given how OK we are with array bounds checks.
Apr 21
prev sibling next sibling parent Meta <jared771 gmail.com> writes:
On Thursday, 17 April 2025 at 22:12:22 UTC, Jonathan M Davis 
wrote:
 On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via 
 Digitalmars-d wrote:
 On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright 
 wrote:
 First to support classes, you’d have to make a slight change
 ```d
 struct NonNull(T)
 {
      T p;
      T ptr() { return p; }
      alias this = ptr;
 }
 ```

 It works, but the syntax/defaults is backwards. Why does the 
 unusual case of a nullable pointer get the nice syntax while 
 the common case gets the `NonNull!(int*)` syntax? Who is going 
 to write that all over their code?
That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced. And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than it being a random bug that occurs during execution.
When you work on a team with less skilled/meticulous teammates on high-performance, highly parallel software, you realize how amazing it would be to have a language where pointers are guaranteed to be non-null by default. I spent so much time fixing annoying NPEs that other people wrote and that popped up randomly in our test cluster. This feature is not for people like you. It is to protect your sanity from the terrible code that people who are not as principled will inevitably write. This is an absolute no-brainer IMO, and it's one thing that is really great about Rust.
Apr 17
prev sibling parent Atila Neves <atila.neves gmail.com> writes:
On Thursday, 17 April 2025 at 22:12:22 UTC, Jonathan M Davis 
wrote:
 On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via 
 Digitalmars-d wrote:
 On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright 
 wrote:
That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced.
I can, last week. The process crashed, I ran `coredumpctl gdb`, immediately fixed the issue and carried on with my day. By which I mean I agree with you that I don't think it's a big deal either.
 And when I have, it's usually because I used a class and forgot 
 to initialize it, which blows up very quickly in testing rather 
 than it being a random bug that occurs during execution.
That's exactly what I did last week.
Apr 21
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/17/2025 10:36 AM, Dave P. wrote:
 First to support classes, you’d have to make a slight change
Sure, you'd want to overload the NonNull template with one that takes a class type parameter.
 It works, but the syntax/defaults is backwards. Why does the unusual case of a 
 nullable pointer get the nice syntax while the common case gets the 
 `NonNull!(int*)` syntax? Who is going to write that all over their code?
Backwards compatibility. The NonNull is the addition, the nullable is the existing. Changing the existing behavior would be a massive disruption. Anyhow, it's a good idea to see how far one can take the metaprogramming approach to what you want, before changing the core language.
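As a rough illustration of how far the metaprogramming approach can go without language changes, here is a minimal C++ analogue of such a wrapper. The names and details are illustrative only, not a proposed design: convertible *to* a raw pointer, but constructible only through a runtime null check.

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative sketch of a library-level NonNull wrapper. The class
// name and the choice of exception are assumptions for the example.
template <typename T>
class NonNull {
    T* p;
public:
    explicit NonNull(T* q) : p(q) {
        if (q == nullptr)
            throw std::invalid_argument("NonNull constructed from null");
    }
    T& operator*() const { return *p; }
    operator T*() const { return p; }  // implicit out-conversion only
};
```

As with the D version, the asymmetry is the point: getting a raw pointer out is implicit, but getting a NonNull in requires going through the checked constructor.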
Apr 17
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 18/04/2025 2:35 PM, Walter Bright wrote:
     It works, but the syntax/defaults is backwards. Why does the unusual
     case of a nullable pointer get the nice syntax while the common case
     gets the |NonNull!(int*)| syntax? Who is going to write that all
     over their code?
 
 Backwards compatibility. The NonNull is the addition, the nullable is 
 the existing. Changing the existing behavior would be a massive disruption.
D is designed around the type state initialized, aka nullable. This was a _very_ smart thing to do before type state analysis was ever mainstream. Walter, this is by far one of the best design decisions you have ever made.

Trying to change the default type state for pointers to non-null would be absolutely horrific. It's possible to prove a variable is non-null, but if we start painting pointers themselves in the type system? Ughhhhhh, the pain application VM languages are having over this isn't worth it, in the context of D. We can do a lot better than that.

If this doesn't work without annotation (in a single compilation run), we've failed.

```d
void main() {
    func(new int); // ok
    func(null); // error
}

void func(int* ptr) {
    int v = *ptr;
}
```
Apr 17
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/17/2025 7:48 PM, Richard (Rikki) Andrew Cattermole wrote:
 ```d
 void main() {
      func(new int); // ok
      func(null); // error
 }
 
 void func(int* ptr) {
      int v = *ptr;
 }
 ```
It always looks simple in such examples, but then there are things like:

```d
struct S {
    int a, b;
    int* p;
}

void main() {
    S s;
    funky(&s);
    func(s.p);
}
```

where trying to track information in structs gets complicated fast.
Apr 18
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 19/04/2025 6:11 PM, Walter Bright wrote:
 On 4/17/2025 7:48 PM, Richard (Rikki) Andrew Cattermole wrote:
 ```d
 void main() {
      func(new int); // ok
      func(null); // error
 }

 void func(int* ptr) {
      int v = *ptr;
 }
 ```
It always looks simple in such examples, but then there are things like:

```d
struct S {
    int a, b;
    int* p;
}

void main() {
    S s;
    funky(&s);
    func(s.p);
}
```

where trying to track information in structs gets complicated fast.
Yeah, indirection. It is not a solved problem, even if you throw the type system with, say, a type qualifier into the mix.

https://kotlinlang.org/docs/java-to-kotlin-nullability-guide.html

"Tracking multiple levels of annotations for pointers pointing to pointers would make the checker more complicated, because this way a vector of nullability qualifiers would need to be tracked for each symbol. This is not a big caveat, since once the top level pointer is dereferenced, the symbol for the inner pointer will have the nullability information. The lack of multi level annotation tracking is only observable when multiple levels of pointers are passed to a function which has a parameter with multiple levels of annotations. So for now the checker supports the top level nullability qualifiers only:

    int * __nonnull * __nullable p;
    int ** q = p;
    takesStarNullableStarNullable(q);"

https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html

It's actually a pretty good example of why I think supporting modelling of fields is not worth our time. Plus throw in multi-threading and boom shuckala, not modellable anymore.
Apr 18
prev sibling parent kdevel <kdevel vogtner.de> writes:
On Thursday, 17 April 2025 at 17:36:49 UTC, Dave P. wrote:
 ```d
 struct NonNull(T)
 {
     T p;
     T ptr() { return p; }
     alias this = ptr;
 }
 ```

 It works, but the syntax/defaults is backwards. Why does the 
 unusual case of a nullable pointer get the nice syntax while 
 the common case gets the `NonNull!(int*)` syntax?
+1
 Who is going to write that all over their code?
Nobody.
Apr 19
prev sibling parent reply Ogion <ogion.art gmail.com> writes:
On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:
 Here's something I spent 5 minutes on:

 ```d
 struct NonNull(T)
 {
     T* p;
     T* ptr() { return p; }
     alias this = ptr;
 }

 int test(NonNull!int np)
 {
     int i = *np;
     int* p1 = np;
     int* p2 = np.ptr;
     np = p1; // Error: cannot implicitly convert expression 
 `p1` of type `int*` to `NonNull!int`
     return i;
 }
 ```
 Note that NonNull can be used as a pointer, can be implicitly 
 converted to a pointer, but a pointer cannot be implicitly 
 converted to a NonNull.

 There's more window dressing one would want, like a constructor 
 with a null check, but the basic idea looks workable and is not 
 complicated.
Isn’t `ref` essentially a non-null pointer?
Apr 18
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/18/2025 12:21 AM, Ogion wrote:
 Isn’t `ref` essentially a non-null pointer?
It's supposed to be. But you can write:

```d
int* p = null;
ref r = *p;
```

and you get a null ref.
Apr 18
parent reply Meta <jared771 gmail.com> writes:
On Saturday, 19 April 2025 at 06:05:53 UTC, Walter Bright wrote:
 On 4/18/2025 12:21 AM, Ogion wrote:
 Isn’t `ref` essentially a non-null pointer?
It's supposed to be. But you can write:

```d
int* p = null;
ref r = *p;
```

and you get a null ref.
How is this possible? Shouldn't dereferencing p crash the program before r is initialized?
Apr 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 19, 2025 3:17:29 AM MDT Meta via Digitalmars-d wrote:
 On Saturday, 19 April 2025 at 06:05:53 UTC, Walter Bright wrote:
 On 4/18/2025 12:21 AM, Ogion wrote:
 Isn’t `ref` essentially a non-null pointer?
It's supposed to be. But you can write:

```d
int* p = null;
ref r = *p;
```

and you get a null ref.
How is this possible? Shouldn't dereferencing p crash the program before r is initialized?
It doesn't, because nothing is actually dereferenced. This is like if you have a null pointer to a struct and then call a member function on it, e.g.

```
void main() {
    S* s;
    s.foo();
}

struct S {
    int i;

    void foo() {
    }
}
```

This does not crash, because s is never actually used. It's just passed to foo. Of course, if you then changed foo to something like

```
void foo() {
    import std.stdio;
    writeln(i);
}
```

it would crash, because then it would need to dereference s to access its i member, but until it needs to access a member, there's no reason for any dereferencing to take place. The same happens with C++ classes as long as the function isn't virtual.

Where you _do_ get it failing in basically any language would be with a class' virtual function, because the class reference needs to be dereferenced in order to get the correct function.

This is one of those bugs that can be _very_ confusing if you don't think it through, since you naturally tend to assume that when you call a member function, the pointer is dereferenced, but if the function isn't virtual, there's no reason to dereference it to make the call. The function is just passed the pointer or reference as an invisible argument. So, you can end up with a segfault inside of your function instead of at the call site and get rather confused by it. It's happened to me a couple of times in my career, and it's initially been pretty confusing each time, even though after it was explained to me the first time, I understood it, because you just tend to think of calling a member function as dereferencing the object even though it doesn't actually have any reason to do so unless the function is virtual.

And with

    int* p = null;
    ref r = *p;

no dereferencing occurs, because the compiler is converting int* to ref int, and underneath the hood, ref int is just int*, so it's simply copying the value of the pointer.

- Jonathan M Davis
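The hidden-argument point can also be restated in well-defined C++ using free functions (an actual member call on a null object is undefined behavior in C++, so the functions below are illustrative stand-ins for non-virtual methods):

```cpp
#include <cassert>

// A non-virtual member call behaves like a free function receiving the
// object pointer as an invisible argument: the pointer value is merely
// copied into the callee, and nothing faults until a member is read.
struct S { int i; };

int ignoresObject(S*) { return 7; }           // never touches the object
int readsMember(S* self) { return self->i; }  // the real dereference
```

Calling `ignoresObject(nullptr)` is fine precisely because no dereference occurs; `readsMember(nullptr)` is where the crash (or undefined behavior) would actually happen, inside the callee rather than at the call site.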
Apr 19
parent reply kdevel <kdevel vogtner.de> writes:
On Saturday, 19 April 2025 at 10:35:01 UTC, Jonathan M Davis 
wrote:
 [...] because then it would need to dereference s to access its 
 i member, but until it needs to access a member, there's no 
 reason for any dereferencing to take place.

 The same happens with C++ classes as long as the function isn't 
 virtual.
That is undefined behavior. In the C++ standard null references have been carefully ruled out [1]. There is no standard conforming C++ program having null references.
 And with

 int* p = null;
 ref r = *p;

 no dereferencing occurs,
In C++ this is a programming error. When creating a reference from a pointer, a null check is necessary in order to uphold C++'s guarantee that references are actually bound to existing objects.

[1] google.com?q="c++ reference from null pointer"
- https://old.reddit.com/r/cpp/comments/80zm83/no_references_are_never_null/
- https://stackoverflow.com/questions/4364536/is-a-null-reference-possible
Apr 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 19, 2025 5:26:29 AM MDT kdevel via Digitalmars-d wrote:
 On Saturday, 19 April 2025 at 10:35:01 UTC, Jonathan M Davis
 wrote:
 [...] because then it would need to dereference s to access its
 i member, but until it needs to access a member, there's no
 reason for any dereferencing to take place.

 The same happens with C++ classes as long as the function isn't
 virtual.
That is undefined behavior. In the C++ standard null references have been carefully ruled out [1]. There is no standard conforming C++ program having null references.
My point about non-virtual functions and dereferencing wasn't really about references so much as about the fact that the compiler doesn't necessarily dereference when you think that you're telling it to dereference. It only does so when it actually needs to. And whatever is supposed to be defined behavior or not, I have seen pointers not be dereferenced when calling non-virtual functions - and when creating references from what they point to.
 And with

 int* p = null;
 ref r = *p;

 no dereferencing occurs,
In C++ this is a programming error. When creating a reference from a pointer the null check it is necessary in order to uphold C++' guarantee that references are actually bound to existing objects. [1] google.com?q="c++ reference from null pointer" - https://old.reddit.com/r/cpp/comments/80zm83/no_references_are_never_null/ - https://stackoverflow.com/questions/4364536/is-a-null-reference-possible
Well, if C++ now checks that the pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before. Either way, unless the compiler inserts checks of some kind in order to try to ensure that a reference is never null, there's no reason to dereference a pointer or reference until the data it points to is actually used. And historically, no null checks were done for correctness. - Jonathan M Davis
Apr 19
parent reply kdevel <kdevel vogtner.de> writes:
On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis 
wrote:
 [...]
 And with

 int* p = null;
 ref r = *p;

 no dereferencing occurs,
In C++ this is a programming error. When creating a reference from a pointer the null check it is necessary in order to uphold C++' guarantee that references are actually bound to existing objects. [...]
Well, if C++ now checks that pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.
Of course it doesn't, and I didn't write that. I wrote that it is a programming error to use a ptr to initialize a reference when it is possible that the ptr is null. If refs in D were as strong as in C++, I would write:

    [... int *p is potentially null ...]
    enforce (p);
    auto ref r = *p;
Apr 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 19, 2025 6:13:36 AM MDT kdevel via Digitalmars-d wrote:
 On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis
 wrote:
 [...]
 And with

 int* p = null;
 ref r = *p;

 no dereferencing occurs,
In C++ this is a programming error. When creating a reference from a pointer the null check it is necessary in order to uphold C++' guarantee that references are actually bound to existing objects. [...]
Well, if C++ now checks that pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.
Of course it doesn't, and I didn't write that. I wrote that it is a programming error to use a ptr to initialize a reference when it is possible that the ptr is null. If refs in D were as strong as in C++, I would write:

    [... int *p is potentially null ...]
    enforce (p);
    auto ref r = *p;
If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null. It's the same programmer error as any time that you dereference a null pointer except that it doesn't actually dereference the pointer when you create the reference and instead blows up later when you attempt to use what it refers to, because that's when the actual dereferencing takes place. If C++ doesn't have additional checks, then it's not any stronger about guarantees with & than D is with ref. Meta was asking how it was possible that int* p = null; ref r = *p; would result in a null reference instead of blowing up, and I explained why it didn't blow up and pointed out that C++ has the exact same situation. And unless C++ has added additional checks (and it sounds like they haven't), then there's no real difference here between C++ and D. - Jonathan M Davis
Apr 19
parent reply kdevel <kdevel vogtner.de> writes:
On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis 
wrote:
      [... int *p is potentially null ...]
      enforce (p);
      auto ref r = *p;
If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null.
    int main ()
    {
       int *p = NULL;
       int &i = *p;
    }

That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,

    void main ()
    {
       int *p = null;
       ref int i = *p; // DMD v2.111.0
    }

is a valid program [3].
 It's the same programmer error as any time that you dereference 
 a null pointer except that it doesn't actually dereference the 
 pointer when you create the reference and instead blows up 
 later when you attempt to use what it refers to, because that's 
 when the actual dereferencing takes place.
Assume the "dereference" of the pointer and the initialization of the reference happen in different translation units written by different programmers. I.e.

tu1.cc

    void foo (int &i)
    {
    }

tu2.cc

    int main ()
    {
       int *p = NULL;
       foo (*p);
    }

versus

tu1.d

    void foo (ref int i)
    {
    }

tu2.d

    int main ()
    {
       int *p = null;
       foo (*p);
    }

Then we have different responsibilities. In the C++ case the programmer of tu2.cc made a mistake, while in the D case the code of tu2.d is legit. I would not call this situation "the same programmer error".
 If C++ doesn't have additional checks, then it's not any 
 stronger about guarantees with & than D is with ref.
As programmer of translation unit 1 my job is much easier if I use C++. [3] https://dlang.org/spec/type.html#pointers "When a pointer to T is dereferenced, it must either contain a null value, or point to a valid object of type T."
Apr 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 19, 2025 8:23:09 AM MDT kdevel via Digitalmars-d wrote:
 On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis
 wrote:
      [... int *p is potentially null ...]
      enforce (p);
      auto ref r = *p;
If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null.
    int main ()
    {
       int *p = NULL;
       int &i = *p;
    }

That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,

    void main ()
    {
       int *p = null;
       ref int i = *p; // DMD v2.111.0
    }

is a valid program [3].
In both cases it's a valid program where the programmer screwed up, and they're going to get a segfault later on if the reference is ever accessed. If it weren't a valid program, it wouldn't compile. If you had a situation where a cast were being used to circumvent compiler checks, it could be argued that it wasn't valid, because the programmer was circumventing the compiler, but nothing is being circumvented here. Neither language has checks - either at compile time or at runtime - to catch this issue, so I don't see how it could be argued that the compiler is providing guarantees about this or that the program is invalid. In both cases, it's an error on the programmer's part, and in neither case is the language providing anything to prevent it or catch it. As far as I can see, the situation in both cases is identical. Maybe there's some difference in how the C++ spec talks about it, but there is no practical difference. - Jonathan M Davis
Apr 19
parent reply kdevel <kdevel vogtner.de> writes:
On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis 
wrote:
     int main ()
     {
        int *p = NULL;
        int &i = *p;
     }

 That is an error (mistake) only in C++ because the reference 
 is not initialized with a valid initializer. In D, however,

     void main ()
     {
        int *p = null;
        ref int i = *p; // DMD v2.111.0
     }

 is a valid program [3].
In both cases it's a valid program
Only the D version is valid. The C++ program violates the std. From the SO page there is a quote of the C++ 11 std draft which says in sec. "8.3.2 References": "A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. [...] — end note ]" You find nearly the same wording in sec. 11.3.2 of the C++17 std draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928) with the "dereferencing" replaced with "indirection".
 [...] If it weren't a valid program, it wouldn't compile.
That is an interesting opinion.
Apr 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 19, 2025 5:51:58 PM MDT kdevel via Digitalmars-d wrote:
 On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis
 wrote:
     int main ()
     {
        int *p = NULL;
        int &i = *p;
     }

 That is an error (mistake) only in C++ because the reference
 is not initialized with a valid initializer. In D, however,

     void main ()
     {
        int *p = null;
        ref int i = *p; // DMD v2.111.0
     }

 is a valid program [3].
In both cases it's a valid program
Only the D version is valid. The C++ program violates the std. From the SO page there is a quote of the C++ 11 std draft which says in sec. "8.3.2 References": "A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. [...] — end note ]" You find nearly the same wording in sec. 11.3.2 of the C++17 std draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928) with the "dereferencing" replaced with "indirection".
I see no practical difference in general, but I guess that the difference would be that if C++ says that the behavior is undefined, then it can let the optimizer do whatever it wants with it, whereas D has to say at least roughly what would happen, or the behavior would be undefined and thus screw up safe. Whatever assumptions the optimizer may make about it, they can't be anything that would violate memory safety. In practice though, particularly with unoptimized code, C++ and D are going to do the same thing here, and in both cases, the programmer screwed up, so their program is going to crash. And realistically, it'll likely do the same thing in most cases even with optimized code. - Jonathan M Davis
Apr 19
parent reply kdevel <kdevel vogtner.de> writes:
On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:
 I see no practical difference in general [...]
I consider nonconforming code generally unacceptable.
Apr 20
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, April 20, 2025 8:13:44 AM MDT kdevel via Digitalmars-d wrote:
 On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:
 I see no practical difference in general [...]
I consider nonconforming code generally unacceptable.
Writing a program which doesn't behave properly is always a problem and should be considered unacceptable. And both having a program relying on undefined behavior and having a program which dereferences null are problems. The latter will crash the program. The only real difference there between C++ and D is that if the language states that it's undefined behavior for the reference to be null, then the optimizer can do screwy things in the case when it actually does happen, instead of the program being guaranteed to crash when the null reference is dereferenced. So, that's why I say that I see no practical difference. If you create a reference from a null pointer, you have a bug whether the program is written in C++ or D. And outside of optimized builds (and likely in almost all cases even with optimized builds), what happens when you screw that up will be the same in both languages.

In any case, we clearly both agree that if the programmer does this, they've screwed up, and I think that we're basically arguing over language here rather than an actual technical problem. Ultimately, the only real difference is what the language's optimizer is allowed to do when the programmer does screw it up, because the C++ spec says that it's undefined behavior, and the D spec can't say that and have references work in safe code, since safe code disallows undefined behavior in order to ensure memory safety. The program has a bug either way.

- Jonathan M Davis
Apr 20
parent kdevel <kdevel vogtner.de> writes:
On Sunday, 20 April 2025 at 22:19:39 UTC, Jonathan M Davis wrote:
 I consider nonconformance generally unacceptable.
Writing a program which doesn't behave properly is always a problem and should be considered unacceptable.
The problematic word is "behave". Only recently there was a thread on reddit where the user Zde-G pinpointed the problem while discussing a "new name" for undefined behavior (UB) [5]:

'90% of confusion about UB comes from the simple fact that something is called behavior. Defined, undefined, it doesn't matter: layman observes world behavior, layman starts thinking about what kind of behavior can there be. The mental model every programmer which observes that term for the first time is “some secret behavior which is too complex to write in the description of the language… but surely I can glean it from the compiler with some experiments”. This is entirely wrong mental model even for C and doubly so for Rust or Zig. And it takes insane amount of effort to teach **every single newcomer** that it's wrong model. I have seen **zero** exceptions.

New name should talk about code, not about behavior. “Invalid code” or “forbidden code” or maybe “erroneous construct”, but something, anything which is not related to what happens in runtime. There are no runtime after UB, it's as simple as that. The only option if your code have UB is to go and fix the code… and yet the name doesn't include anything related to code at all and concentrates on entirely wrong thing.'
 [...]

 If you create a reference from a null pointer, you have a bug 
 whether the program is written in C++ or D.
That is not true. A D program like this:

```d
void main ()
{
    int *p = null;
    ref int i = *p; // DMD v2.111.0
}
```

is a valid program and there is no UB, no crash and no bug. I already pointed this out earlier with reference to the D spec [3].

[3] https://dlang.org/spec/type.html#pointers "When a pointer to T is dereferenced, it must either contain a null value, or point to a valid object of type T."

[5] Zde-G's comment on the blog post "UB Might Be a Wrong Term for Newer Languages": https://old.reddit.com/r/rust/comments/129mz8z/blog_post_ub_might_be_a_wrong_term_for_newer/jep231f/
Apr 21
prev sibling parent Derek Fawcus <dfawcus+dlang employees.org> writes:
On Friday, 18 April 2025 at 07:21:43 UTC, Ogion wrote:
 Isn’t `ref` essentially a non-null pointer?
Not really; it is similar but not the same. The thing pointers give, or rather the syntax of initialising them gives, is easy local reasoning, due to having to explicitly use the address-of (&) operator. So any function in a different compilation unit is a lot easier to reason about if one does not have to worry about whether an argument may experience an under-the-covers change due to the parameter being defined as a reference.

Sadly, that state already exists, so it is rather a case of water under the bridge.
Apr 19
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Monday, 14 April 2025 at 14:22:09 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 15/04/2025 1:51 AM, Atila Neves wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 I've been looking once again at having an exception being 
 thrown on null pointer dereferencing.
 However the following can be extended to other hardware level 
 exceptions.

 [...]
I would like to know why one would want this.
Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds.
Possible mitigations:

* Use `sigaction` to catch `SIGSEGV` and throw an exception in the handler.
* Use a nullable/option type.
* Address sanitizer.
* Fuzzing the server (which one should do anyway).

How is out of bounds access related to null pointers throwing exceptions?
 How likely are you to keep using D, or willing to talk about 
 using D positively afterwards?
People write servers in C and C++ too.
Apr 16
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/04/2025 8:18 PM, Atila Neves wrote:
 On Monday, 14 April 2025 at 14:22:09 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 15/04/2025 1:51 AM, Atila Neves wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 I've been looking once again at having an exception being thrown on 
 null pointer dereferencing.
 However the following can be extended to other hardware level 
 exceptions.

 [...]
I would like to know why one would want this.
Imagine you have a web server that is handling 50k requests per second. It makes you $1 million a day. In it, you accidentally have some bad business logic that results in a null dereference or indexing a slice out of bounds.
Possible mitigations:

* Use `sigaction` to catch `SIGSEGV` and throw an exception in the handler.
The only thing you can reliably do on segfault is to kill the process. From what I've read, signal handlers get awfully iffy, even with the workarounds. And that's just for Posix; Windows is an entirely different kettle of fish and is designed around exception handling instead, which dmd doesn't support!
 * Use a nullable/option type.
While it is valid to box pointers, we would then need to disallow raw pointers in business logic functions. Very invasive, not my preference.
 * Address sanitizer.
Slow at runtime, which kinda defeats the purpose.
 * Fuzzing the server (which one should do anyway).
Absolutely, but there is too much state to guarantee that it covers everything. And very few people will get it to that level (after all, people need a significant amount of training to do it successfully).
 How is out of bounds access related to null pointers throwing exceptions?
Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.
 How likely are you to keep using D, or willing to talk about using D 
 positively afterwards?
People write servers in C and C++ too
Yes they do, just like they do in D. But they have something we do not have: a ton of static analysis. See (select professional developers): https://survey.stackoverflow.co/2024/technology#most-popular-technologies

I'm not going to say they each have a good solution to the problem, but they each have a solution that isn't just "kill the process".

The end of this foray into read barriers may be the conclusion that we cannot use them for this. What worries me is that I don't have the evidence to show that it won't work, and dismissing it without evidence does mean that we'll be forced to recommend full CFG DFA, which is slow.

If it can work, using the read barriers to fill in the gap of what a faster DFA can offer would be a much better user experience. At least as a default.
Apr 16
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
I should follow on from this to explain why I care so much.

See Developer type: https://survey.stackoverflow.co/2024/developer-profile

Back-end (which is what D could excel at) is second on the list, right 
next to full-stack which is first.

This is the biggest area of growth possible for D, and we're missing key 
parts that are not only expected, but needed to reach the largest target 
audience possible.
Apr 16
prev sibling next sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 How is out of bounds access related to null pointers throwing 
 exceptions?
Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.
Exactly what are you referring to by "read barrier"?

To me it has a specific technical meaning, related to memory access and if/when one access may pass another. It has nothing to do with exceptions, but rather the details of how a CPU architecture approaches superscalar memory accesses (reads and/or writes), and how they are (or may be) re-ordered. However I would not classify the bounds check performed as part of accessing a slice as a "read barrier"; rather it is a manual range check.

So by "read barrier" for nulls do you simply mean having the compiler generate a "compare to zero" instruction, followed by a "jump if zero" to some error path?

If so, then while it may catch errors (and I have no objection to optionally generating such null checks), it is not IMO a means of error recovery - simply another means of forcing a crash.
Apr 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/04/2025 11:38 PM, Derek Fawcus wrote:
 On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 How is out of bounds access related to null pointers throwing 
 exceptions?
Out of bounds on a slice uses a read barrier to throw an exception. A read barrier to prevent dereferencing a null pointer is exactly the same concept. One is 0 or 1. Second is 0 or N.
Exactly what are you referring to by "read barrier"? To me it has a specific technical meaning, related to memory access and if/when one access may pass another.  It has nothing to do with exceptions, but rather the details of how a CPU architecture approaches superscalar memory accesses (reads and/or writes), and how they are (or may be) re-ordered. However I would not classify the bounds check performed as part of accessing a slice as a "read barrier"; rather it is a manual range check. So by "read barrier" for nulls do you simply mean having the compiler generate a "compare to zero" instruction, followed by a "jump if zero" to some error path?
Yes. It would then call a compiler hook just like array bounds checks do.
 If so, then while it may catch errors (and I have no objection to 
 optionally generating such null checks); it is not IMO a means of error 
 recovery - simply another means of forcing a crash.
This is why I think it's important that we make this configurable via a global function pointer, like we do for asserts. It allows people to configure what it does rather than picking for them.
Apr 16
prev sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 16/04/2025 8:18 PM, Atila Neves wrote:

 * Use a nullable/option type.
While valid to box pointers, we would then need to disallow them in business logic functions.
I'm not sure what you have in mind; what I have in mind is something like this:

https://discourse.llvm.org/t/rfc-nullability-qualifiers/35672
https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html

The checks here are performed in a distinct static analysis (SA) tool, not in the main compiler. However it catches the main erroneous cases - the first two listed checks of the second link:
 If a pointer p has a nullable annotation and no explicit null 
 check or assert, we should warn in the following cases:

-    p gets implicitly converted into nonnull pointer, for 
example, we are passing it to a function that takes a nonnull 
parameter.

-    p gets dereferenced
Given how individual variable / fields have to be annotated, it probably does not need complete DFA, but only function local analysis for loads/stores/compares.
Apr 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 1:41 AM, Derek Fawcus wrote:
 On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 16/04/2025 8:18 PM, Atila Neves wrote:

 * Use a nullable/option type.
While valid to box pointers, we would then need to disallow them in business logic functions.
I'm not sure what you have in mind, what I have in mind is something like this:   https://discourse.llvm.org/t/rfc-nullability-qualifiers/35672 https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html The checks here are performed in a distinct SA tool, not in the main compiler.  However it catches the main erroneous cases - first two listed checks of second link:
``clang --analyze -Xanalyzer -analyzer-output=text``

"While it’s somewhat exceptional for us to introduce new type qualifiers that don’t produce semantically distinct types, we feel that this is the only plausible design and implementation strategy for this feature: pushing nullability qualifiers into the type system semantically would cause significant changes to the language (e.g., overloading, partial specialization) and break ABI (due to name mangling) that would drastically reduce the number of potential users, and we feel that Clang’s support for maintaining type sugar throughout semantic analysis is generally good enough [6] to get the benefits of nullability annotations in our tools."

It's available straight from clang; it annotates variables and is part of the type system, but also isn't affecting symbol lookup or introspection. It's part of the frontend, not the backend. Exactly what I want also.

I'm swearing right now. I knew we were 20 years behind; I didn't realize that they are one stone's throw away from the end game. We can't get ahead of them at this point.

The attributes are different to what I want in D however. For D I want us to solve all of type state analysis, not just nullability.
 If a pointer p has a nullable annotation and no explicit null check or 
 assert, we should warn in the following cases:

 -    p gets implicitly converted into nonnull pointer, for example, we 
 are passing it to a function that takes a nonnull parameter.

 -    p gets dereferenced
Given how individual variable / fields have to be annotated, it probably does not need complete DFA, but only function local analysis for loads/ stores/compares.
Fields no, it'll be hell if we were to start annotating them; Walter balked at that idea ages ago and he was right to.

I want stuff like this to work, without needing annotation:

```d
bool isNull(int* ptr) => ptr is null;

int* ptr;
if (!isNull(ptr))
{
    int v = *ptr; // ok
}
else
{
    int v = *ptr; // error
}
```

No annotations should be needed when things are not virtual or when the default case is not complex (backwards gotos, for example, need a full CFG as part of DFA).

```d
void main() {
    func(new int); // ok
    func(null); // error
}

void func(/*?nonnull*/ int* ptr) {
    int v = *ptr;
}
```
Apr 16
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
The correct solution is to restart the process.

The null pointer dereference could be a symptom of a wild pointer writing all 
over the process space.
Apr 16
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 6:38 AM, Walter Bright wrote:
 The correct solution is to restart the process.
 
 The null pointer dereference could be a symptom of a wild pointer 
 writing all over the process space.
Yes, we are in agreement on this situation.

.net has a very strong guarantee that a pointer can only point to null or a valid instance of that type. The restrictions that safe places on a function do remove this as a possibility in D also.

And this is where we are diverging: there is a subset of code which does have this guarantee, where a null dereference does indicate a logic error, and not program corruption. This is heavily present in web development, but rather rare in comparison in other types of projects.
Apr 16
parent reply Walter Bright <newshound2 digitalmars.com> writes:
The focus of safe code in D is preventing memory corruption, not null pointer 
dereference seg faults or other programming bugs.

A null pointer deference may be a symptom of memory corruption or some logic 
bug, but it does not cause memory corruption.

D has many aspects that reduce the likelihood of bugs (such as no variable 
shadowing), but that is not what safe is about.

(Yes I know that if there's a constant offset to a null pointer larger than the 
guard page, it can cause memory corruption.)
Apr 16
parent reply Dave P. <dave287091 gmail.com> writes:
On Wednesday, 16 April 2025 at 19:58:45 UTC, Walter Bright wrote:
 The focus of safe code in D is preventing memory corruption, 
 not null pointer dereference seg faults or other programming 
 bugs.

 A null pointer deference may be a symptom of memory corruption 
 or some logic bug, but it does not cause memory corruption.

 D has many aspects that reduce the likelihood of bugs (such as 
 no variable shadowing), but that is not what safe is about.

 (Yes I know that if there's a constant offset to a null pointer 
 larger than the guard page, it can cause memory corruption.)
The 0 address on WASM is writable, which has burned me many times.
Apr 16
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 1:16 PM, Dave P. wrote:
 The 0 address on WASM is writable, which has burned me many times.
How that oversight was made seems incredible. Anyhow, what you can do is allocate some memory at location 0, and fill it with 0xDEAD_BEEF. Then, periodically, check to see if those values changed.
Apr 16
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:
 The .net exceptions are split into managed and unmanaged exceptions.
Sounds equivalent to D's Exception and Error hierarchies.
Apr 16
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 4:55 AM, Walter Bright wrote:
 On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:
 The .net exceptions are split into managed and unmanaged exceptions.
Sounds equivalent to D's Exception and Error hierarchies.
It's not. The unmanaged exceptions come from native code and then get wrapped by .net so they can be caught, including for null dereference. As far as I'm aware, cleanup routines are not messed with.

We've lately been discussing what to do with Error on Discord, and so far it seems like the discussion is going in the direction that it should either do what assert does and kill the process, with a function pointer in the middle to allow configurability, or throw what I've dubbed a framework exception.

A framework exception sits in the middle of the existing hierarchy, does cleanup, but doesn't affect nothrow. Manu wanted something like this recently, for identical reasons to those Adam and I have.
Apr 16
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/16/2025 10:08 AM, Richard (Rikki) Andrew Cattermole wrote:
 We've lately been discussing what to do with Error on Discord, and so far it 
 seems like the discussion is going in the direction of it should either do what 
 assert does and kill the process with a function pointer in the middle to allow 
 configurability, or throw what I've dubbed a framework exception.
You can configure what assert() does via a command line switch.
 Manu wanted something like this recently for identical reasons that me and 
 Adam do.

Adam and I discussed this extensively at the last Coffee Haus meeting. I haven't talked with Manu about it.

Anyone can configure assert() to do whatever they want; after all, D is a systems programming language. But if it does something other than exit the process, you're on your own.
Apr 16
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
I confess I don't understand the fear behind a null pointer.

A null pointer is a NaN (Not a Number) value for a pointer. It's similar (but 
not exactly the same behavior) as 0xFF is a NaN value for a character and NaN
is 
a NaN value for a floating point value.

It means the pointer is not pointing to a valid object. Therefore, it should
not 
be dereferenced. To dereference a null pointer is:

A BUG IN THE PROGRAM

When a bug in the program is detected, the only correct course of action is:

GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200

It's the same thing as `assert(condition)`. When the condition evaluates to 
`false`, there's a bug in the program.

A bug in the program means the program has entered an unanticipated state. The 
notion that one can recover from this and continue running the program is only 
for toy programs. There is NO WAY to determine if continuing to run the program 
is safe or not.

I did a lot of programming on MS-DOS. There is no memory protection there. 
Writing through a null pointer would scramble the operating system tables,
which 
meant the operating system would do something terrible. There were many times 
when it literally scrambled my hard disk. (I made lots of backups.)

If you haven't had this pleasure, it may be hard to realize what a godsend 
protected memory is. A null pointer no longer requires reinstalling the 
operating system. Your program simply quits with a stack trace.

With the advent of protected mode, I immediately ceased all program development 
in real mode DOS. Instead, I'd fully debug it in protected mode, and then as
the 
very last step I'd test it in real mode.

Protected mode is the greatest invention ever for computer programs. When the 
hardware detects a null pointer dereference, it produces a seg fault, the 
program stops running and you get a stack trace which gives you the best chance 
ever of finding the cause of the seg fault.

A lovely characteristic of seg faults is they come FOR FREE! There is zero cost 
to them. They don't slow your program down at all. They do not add bloat. It's 
all under the hood.

The idea that a null pointer is a billion dollar mistake is just ludicrous to 
me. The real mistake is having unchecked arrays, which don't get hardware 
protection and are the source of buffer overflow and injection problems.

Being unhappy about a null pointer seg fault is like complaining that the 
seatbelt left a bruise on your body as it saved you from your body being broken 
(this has happened to me, I always always wear that seatbelt!).

Of course, it is better to detect a seg fault at compile time. Data Flow 
Analysis can help:

```d
int x = 1;
void main()
{
     int* p;
     if (x) *p = 3;
}
```
Compiling with `-O`, which enables Data Flow Analysis:
```
dmd -O test.d
Error: null dereference in function _Dmain
```
Unfortunately, DFA has its limitations that nobody has managed to solve (the 
halting problem), hence the need for runtime checks, which the hardware does 
nicely for you.

Fortunately, D is powerful enough so you can make a non-nullable type.

In summary, the notion that one can recover from an unanticipated null pointer 
dereference and continue running the program is a seriously bad idea. There are 
far better ways to make failsafe systems. Complaining about a seg fault is like 
complaining that a seatbelt left a bruise while saving you from being maimed.
Apr 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2025 6:18 AM, Walter Bright wrote:
 I confess I don't understand the fear behind a null pointer.
 
 A null pointer is a NaN (Not a Number) value for a pointer. It's similar 
 (but not exactly the same behavior) as 0xFF is a NaN value for a 
 character and NaN is a NaN value for a floating point value.
Agreed. But unlike floating point NaNs, pointer issues kill the process. They invalidate the task at hand.
 It means the pointer is not pointing to a valid object. Therefore, it 
 should not be dereferenced.
If you write purely safe code that isn't possible. Just like what .net guarantees.
 To dereference a null pointer is:
 
 A BUG IN THE PROGRAM
Agreed, the task has not got the ability to continue and must stop. A task is not the same thing as a process.
 When a bug in the program is detected, the only correct course of action 
 is:
 
 GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200
 
 It's the same thing as `assert(condition)`. When the condition evaluates 
 to `false`, there's a bug in the program.
You are not going to like what the unittest runner is doing then. https://github.com/dlang/dmd/blob/d6602a6b0f658e8ec24005dc7f4bf51f037c2b18/druntime/src/core/runtime.d#L561
 A bug in the program means the program has entered an unanticipated 
 state. The notion that one can recover from this and continue running 
 the program is only for toy programs. There is NO WAY to determine if 
 continuing to run the program is safe or not.
Yes, that is certainly possible in a lot of cases. We are in total agreement that the default should always be to kill the process.

The problem lies in a very specific scenario where safe is being used heavily, where logic errors are extremely common but memory errors are not.

I want us to be 100% certain that a read barrier cannot function as a backup plan to DFA language features. If it can, it will give a better user experience than DFA alone. We've seen what happens when you try to solve these kinds of problems exclusively with DFA; it shows up in how DIP1000 cannot be turned on by default.

If the end result is that we have to recommend the slow DFA exclusively for production code, then so be it. I want us to be certain that we have no other options.
 I did a lot of programming on MS-DOS. There is no memory protection 
 there. Writing through a null pointer would scramble the operating 
 system tables, which meant the operating system would do something 
 terrible. There were many times when it literally scrambled my hard 
 disk. (I made lots of backups.)
As you know I'm into retro computers, so yeah I'm familiar with not having memory protection and the consequences thereof.
 If you haven't had this pleasure, it may be hard to realize what a 
 godsend protected memory is. A null pointer no longer requires 
 reinstalling the operating system. Your program simply quits with a 
 stack trace.
 
 With the advent of protected mode, I immediately ceased all program 
 development in real mode DOS. Instead, I'd fully debug it in protected 
 mode, and then as the very last step I'd test it in real mode.
I've read your story on this in the past and believed you the first time.
 Protected mode is the greatest invention ever for computer programs. 
 When the hardware detects a null pointer dereference, it produces a seg 
 fault, the program stops running and you get a stack trace which gives 
 you the best chance ever of finding the cause of the seg fault.
You don't always get a stack trace. Nor does a segfault allow you to fully report what went wrong to a reporting daemon for diagnostics.

What Windows does instead of a signal is have the fault throw an exception that then gets caught right at the top; this triggers the reporting daemon kicking in. It allows for catching, filtering and adding of more information to the report. Naturally we can't support it due to exceptions...

At the OS level things have progressed beyond simply segfaulting out, even in the native world. https://learn.microsoft.com/en-us/windows/win32/api/werapi/nf-werapi-werregisterruntimeexceptionmodule
 A lovely characteristic of seg faults is they come FOR FREE! There is 
 zero cost to them. They don't slow your program down at all. They do not 
 add bloat. It's all under the hood.
 
 The idea that a null pointer is a billion dollar mistake is just 
 ludicrous to me. The real mistake is having unchecked arrays, which 
 don't get hardware protection and are the source of buffer overflow and 
 injection problems.
While I don't agree that it was a mistake (token values are just as bad), that is his name for it. I view it the same way as I view coroutine coloring: it's a feature to keep operating environments sane. But by doing so it causes pain and forces you to deal with the problem rather than letting it go unnoticed.

Have a read of the show notes: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/

"27:40 This led me to suggest that the null value is a member of every type, and a null check is required on every use of that reference variable, and it may be perhaps a billion dollar mistake."

None of this is new! :)
 Being unhappy about a null pointer seg fault is like complaining that 
 the seatbelt left a bruise on your body as it saved you from your body 
 being broken (this has happened to me, I always always wear that 
 seatbelt!).
Never happened to me, and I still wear it. That doesn't mean I want to be in a car whose driver makes hard stops that were within their control to avoid.
 Of course, it is better to detect a seg fault at compile time. Data Flow 
 Analysis can help:
 
 ```d
 int x = 1;
 void main()
 {
      int* p;
      if (x) *p = 3;
 }
 ```
 Compiling with `-O`, which enables Data Flow Analysis:
 ```
 dmd -O test.d
 Error: null dereference in function _Dmain
 ```
Right, local information only. Turns out even the C++ folks are messing around with frontend DFA for this :/ With cross-procedural information in the AST.
 Unfortunately, DFA has its limitations that nobody has managed to solve 
 (the halting problem), hence the need for runtime checks, which the 
 hardware does nicely for you.
 
 Fortunately, D is powerful enough so you can make a non-nullable type.
I've considered the possibility of explicit boxing, with and without the compiler forcing it (by disallowing raw pointers and slices). Everything we can do with boxing using library types can be done better by the compiler, including making sure that it actually happens.

I've seen what happens if we force boxing rather than doing something in the language in my own stuff: the number of errors I get with my mustuse error type is staggering. We've got to get a -betterC compatible solution to exceptions that isn't heap allocated or using unwinding tables etc.

It would be absolutely poor engineering to try to convince anyone to box raw pointers, let alone making it the recommended or required solution as part of PhobosV3. There has to be a better way.
 In summary, the notion that one can recover from an unanticipated null 
 pointer dereference and continue running the program is a seriously bad 
 idea. There are far better ways to make failsafe systems. Complaining 
 about a seg fault is like complaining that a seatbelt left a bruise 
 while saving you from being maimed.
Program != task. No one wants the task to continue after a null dereference occurs; we are not in disagreement. It must attempt to clean up (if the segfault handler fires, then straight to death the process goes) and die. We are not as far apart as it might appear.
Apr 16
prev sibling parent reply GrimMaple <grimmaple95 gmail.com> writes:
On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) 
Andrew Cattermole wrote:

FYI, some time ago I designed this https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to provide compile-time null checks (that work like C#'s `<Nullable>enable</Nullable>`). My intention was to include it in OpenD to work like this: when a compiler flag is enabled (eg -nullcheck), the code below:

```d
Object o = new Object();
```

would be silently rewritten by the compiler as

```d
NotNull!Object o = new Object();
```

thus enabling compile-time null checks. The solution is not perfect and needs some further compiler work (eg checking if some field is indirectly initialized by some func called in a constructor). Also, I lack the dmd knowledge to insert this myself, so this didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.
Apr 21
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 21/04/2025 10:34 PM, GrimMaple wrote:
 On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:

FYI, some time ago I designed this https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to provide compile-time null checks (that work like C#'s `<Nullable>enable</Nullable>`). My intention was to include it in OpenD to work like this: when a compiler flag is enabled (eg -nullcheck), the code below:

```d
Object o = new Object();
```

would be silently rewritten by the compiler as

```d
NotNull!Object o = new Object();
```

thus enabling compile-time null checks. The solution is not perfect and needs some further compiler work (eg checking if some field is indirectly initialized by some func called in a constructor). Also, I lack the dmd knowledge to insert this myself, so this didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.
That is the Swift solution to the problem, more or less.
Apr 21
parent GrimMaple <grimmaple95 gmail.com> writes:
On Monday, 21 April 2025 at 10:45:15 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 That is the Swift solution to the problem, more or less.
My take here is that null pointers (references/whatever) just _shouldn't be_, __period__. The compiler should disallow me to generate a null pointer unless specifically asked for, like with `Nullable!MyType` -- or, even better, `Optional!MyType` -- then we can finally ditch that "null pointer semantic" and use "correct" terminology, because most often a null pointer is just used as a quasi-optional type.

Any memory allocations that end up in a null pointer, eg `new` running out of memory, should result in an Exception/Error, not returning null.

I think that having `NotNull!T` provides a soft transition from having null to not having null at all -- it's fairly trivial to append `MaybeNull!` to any var type that needs it; and then it's fairly easy to just rename it to `Optional` c:
Apr 21