digitalmars.D - Allocator-aware safe reference counting is still not possible

reply Paul Backus <snarwin gmail.com> writes:
D has made a lot of progress recently on memory safety with 
`-preview=dip1000`, thanks in no small part to [the work of 
Dennis Korpel][1]. This progress has in turn enabled the creation 
of [`SafeRefCounted`][2] by Ate Eskola, which will hopefully be 
available in the next release of Phobos.

The next logical step on this journey is a version of 
`SafeRefCounted` with support for `std.experimental.allocator`. 
Unfortunately, this step is where we run into a roadblock.

`SafeRefCounted` is allowed to make a `@trusted` call to `free` when 
it knows it holds the only pointer to its payload, because it 
knows (from the C standard) that `free` will not corrupt memory 
when called under those circumstances.

However, an allocator-aware version of `SafeRefCounted` that 
calls a generic `Allocator.deallocate` function instead of free 
specifically has *literally no idea* what that function will do, 
and therefore cannot mark that call as `@trusted`, ever, under 
any circumstances.
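
A minimal sketch of the `free` case, for illustration only. `Box` is a hypothetical single-owner type invented here, not the Phobos `SafeRefCounted` implementation; it only shows why a `@trusted` call to `free` can be justified when the caller controls every pointer to the block.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical single-owner box (an assumption for illustration,
// not the Phobos SafeRefCounted code).
struct Box
{
    private int* payload;   // the only pointer to the malloc'd block

    @disable this(this);    // no copies, so no second Box can alias `payload`

    this(int value) @trusted
    {
        payload = cast(int*) malloc(int.sizeof);
        assert(payload !is null, "out of memory");
        *payload = value;
    }

    // Returns the value by copy, so no pointer into the block escapes.
    int get() const @safe { return *payload; }

    ~this() @trusted
    {
        // Justified only because (a) `payload` is known to be unaliased and
        // (b) the C standard specifies what `free` does. With a generic
        // `Allocator.deallocate`, (b) is not something this code can know.
        free(payload);
        payload = null;
    }
}

void example() @safe
{
    auto b = Box(42);
    assert(b.get() == 42);
}   // `b` is destroyed here; the @trusted destructor frees the block
```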

The only solution is to somehow allow `deallocate` (and by 
extension `free`) to have a `@safe` interface on its own, which 
isn't possible in the current D language. At minimum, it would 
require something like an [`isolated` qualifier][3] (h/t 
deadalnix for the link), which would guarantee that a pointer is 
the only pointer to a particular block of memory. Some form of 
ownership/borrow checking would also work, of course.

In any case, this is not something that can be solved in library 
code. A language change is necessary.

[1]: 
https://github.com/dlang/dmd/pulls?q=is%3Apr+author%3Adkorpel+is%3Aclosed+scope
[2]: https://github.com/dlang/phobos/pull/8368
[3]: 
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/msr-tr-2012-79.pdf
Sep 25 2022
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 This progress has in turn enabled the creation of 
 `SafeRefCounted` by Ate Eskola, which will hopefully be 
 available in the next release of Phobos.
Well Atila started it, so he has a bit of credit. I "stole" his PR at some point.
 The next logical step on this journey is a version of 
 `SafeRefCounted` with support for `std.experimental.allocator`. 
 Unfortunately, this step is where we run into a roadblock.

 `SafeRefCounted` is allowed to make a `@trusted` call to `free` 
 when it knows it holds the only pointer to its payload, because 
 it knows (from the C standard) that `free` will not corrupt 
 memory when called under those circumstances.

 However, an allocator-aware version of `SafeRefCounted` that 
 calls a generic `Allocator.deallocate` function instead of free 
 specifically has *literally no idea* what that function will 
 do, and therefore cannot mark that call as `@trusted`, ever, 
 under any circumstances.
I think it can. We need to agree on what the deallocator can and cannot do. If the deallocator then does something disallowed, then it's the deallocator that's to blame, not `SafeRefCounted`. The deallocator, unless it's an intentional memory leaker or the GC, cannot be `@safe` anyway, so there's no way for the user to cause UB without writing incorrect `@trusted` or `@system` code.
 The only solution is to somehow allow `deallocate` (and by 
 extension `free`) to have a `@safe` interface on its own, which 
 isn't possible in the current D language. At minimum, it would 
 require something like an [`isolated` qualifier][3] (h/t 
 deadalnix for the link), which would guarantee that a pointer 
 is the only pointer to a particular block of memory. Some form 
 of ownership/borrow checking would also work, of course.
I do concur that reading how `isolated` works is worth it for anyone thinking about how to improve `@live`.
Sep 25 2022
parent reply Paul Backus <snarwin gmail.com> writes:
On Sunday, 25 September 2022 at 12:48:14 UTC, Dukc wrote:
 On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 However, an allocator-aware version of `SafeRefCounted` that 
 calls a generic `Allocator.deallocate` function instead of 
 free specifically has *literally no idea* what that function 
 will do, and therefore cannot mark that call as `@trusted`, 
 ever, under any circumstances.
I think it can. We need to agree on what the deallocator can and cannot do. If the deallocator then does something disallowed, then it's the deallocator that's to blame, not `SafeRefCounted`.
This is "safety by convention"--the exact thing we're trying to get away from by using `@safe`. It can work in example code and small projects, but it doesn't scale, because the effort required to maintain safety scales exponentially with program size (proportional to the number of code paths). To make this work, we need the compiler to *enforce* the rules about what the deallocator can and cannot do.
Sep 25 2022
parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 25 September 2022 at 13:12:11 UTC, Paul Backus wrote:
 To make this work, we need the compiler to *enforce* the rules 
 about what the deallocator can and cannot do.
Why? The deallocator is going to be `@system` anyway. This means that it does not matter whether there is an allocator-enabled reference counter. In either case it's safety by convention inside the deallocators, but compiler-enforced safety in client code, assuming the deallocator is written correctly.
Sep 25 2022
parent reply Paul Backus <snarwin gmail.com> writes:
On Sunday, 25 September 2022 at 14:04:31 UTC, Dukc wrote:
 On Sunday, 25 September 2022 at 13:12:11 UTC, Paul Backus wrote:
 To make this work, we need the compiler to *enforce* the rules 
 about what the deallocator can and cannot do.
Why? The deallocator is going to be `@system` anyway.
If you have `isolated`, the deallocator can be made `@safe` by having it take an `isolated` pointer as its argument. You are right that it is not enough just to have the compiler enforce the rules on the deallocator--the calling code also has to *know* that the rules are being enforced.
Sep 25 2022
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 25 September 2022 at 14:23:22 UTC, Paul Backus wrote:
 On Sunday, 25 September 2022 at 14:04:31 UTC, Dukc wrote:
 On Sunday, 25 September 2022 at 13:12:11 UTC, Paul Backus 
 wrote:
 To make this work, we need the compiler to *enforce* the 
 rules about what the deallocator can and cannot do.
Why? The deallocator is going to be `@system` anyway.
If you have `isolated`, the deallocator can be made `@safe` by having it take an `isolated` pointer as its argument.
A language change like that is ideal in the long term. My point, however, is that we're not blocked on waiting for one. An allocator-enabled safe reference counter is possible with the present language (except perhaps for some yet-undiscovered issue). The downside is that the deallocators will have to be reviewed against not only causing UB by themselves, but also against leaking a pointer so that it can be reached by `@safe` code.
Sep 25 2022
parent reply Paul Backus <snarwin gmail.com> writes:
On Sunday, 25 September 2022 at 14:45:38 UTC, Dukc wrote:
 A language change like that is ideal in the long term. My 
 point, however, is that we're not blocked on waiting for one. 
 An allocator-enabled safe reference counter is possible with 
 the present language (except perhaps for some yet-undiscovered 
 issue). The downside is that the deallocators will have to be 
 reviewed against not only causing UB by themselves, but also 
 against leaking a pointer so that it can be reached by `@safe` 
 code.
Well, of course, if you are willing to write unsound `@trusted` code, anything can be made `@safe`. :)
Sep 25 2022
parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 25 September 2022 at 14:52:09 UTC, Paul Backus wrote:
 On Sunday, 25 September 2022 at 14:45:38 UTC, Dukc wrote:
 A language change like that is ideal in the long term. My 
 point, however, is that we're not blocked on waiting for one. 
 An allocator-enabled safe reference counter is possible with 
 the present language (except perhaps for some yet-undiscovered 
 issue). The downside is that the deallocators will have to be 
 reviewed against not only causing UB by themselves, but also 
 against leaking a pointer so that it can be reached by `@safe` 
 code.
Well, of course, if you are willing to write unsound `@trusted` code, anything can be made `@safe`. :)
The `SafeRefCounted` destructor wouldn't be any less sound than it is now. If `free` did something it's not supposed to, the `@trusted` attribute on the destructor would be invalid. It isn't any different with a custom deallocator.
Sep 25 2022
parent Paul Backus <snarwin gmail.com> writes:
On Sunday, 25 September 2022 at 15:07:04 UTC, Dukc wrote:
 On Sunday, 25 September 2022 at 14:52:09 UTC, Paul Backus wrote:
 Well, of course, if you are willing to write unsound 
 `@trusted` code, anything can be made `@safe`. :)
The `SafeRefCounted` destructor wouldn't be any less sound than it is now. If `free` did something it's not supposed to, the `@trusted` attribute on the destructor would be invalid. It isn't any different with a custom deallocator.
`free`'s interface is defined by the C language standard. There are a lot of safeguards in place, both social and technical, to prevent a non-conformant implementation of `free` from making its way into your program. Therefore, it is not a big risk for `SafeRefCounted`'s `@trusted` destructor to rely on the behavior of `free` conforming to the C language standard. No such safeguards exist to prevent users of a hypothetical allocator-aware `SafeRefCounted` from passing in a custom deallocator that violates safety when called from the same destructor. So the risk of allowing memory corruption in `@safe` code is much, much higher when using a custom deallocator than it is when using `free`.
Sep 25 2022
prev sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Sunday, 25 September 2022 at 14:23:22 UTC, Paul Backus wrote:
 On Sunday, 25 September 2022 at 14:04:31 UTC, Dukc wrote:
 On Sunday, 25 September 2022 at 13:12:11 UTC, Paul Backus 
 wrote:
 To make this work, we need the compiler to *enforce* the 
 rules about what the deallocator can and cannot do.
Why? The deallocator is going to be `@system` anyway.
If you have `isolated`, the deallocator can be made `@safe` by having it take an `isolated` pointer as its argument.
That would be nice, but isn't that tantamount to only being able to deallocate something when you can prove there are no other aliases? At which point you might as well use that proof and have the compiler call deallocate for you ;)
Sep 25 2022
parent Paul Backus <snarwin gmail.com> writes:
On Sunday, 25 September 2022 at 17:04:51 UTC, Sebastiaan Koppe 
wrote:
 On Sunday, 25 September 2022 at 14:23:22 UTC, Paul Backus wrote:
 If you have `isolated`, the deallocator can be made `@safe` by 
 having it take an `isolated` pointer as its argument.
That would be nice, but isn't that tantamount to only being able to deallocate something when you can prove there are no other aliases?
That's a necessary condition for being able to deallocate memory in `@safe` code, period. You can only do it if it doesn't create any dangling pointers.
 At which point you might as well use that proof and have the 
 compiler call deallocate for you ;)
Not necessarily. The proof is allowed to rely on runtime information, like reference counts, which the compiler does not necessarily know about. In these cases, the programmer can cast the pointer to `isolated` in `@trusted` code (similar to how you can cast away `shared` in `@trusted` code after locking a mutex).
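
The `shared` half of that analogy can be sketched in today's D. `isolated` does not exist in the language, so it is not shown; the names `counter`, `counterLock`, `increment`, and `readCounter` are made up for this example. The `@trusted` cast is justified by a runtime fact (the lock is held) that the compiler cannot see, much as a reference count of 1 would justify a cast to `isolated`.

```d
import core.sync.mutex : Mutex;

// Hypothetical globals for illustration.
shared int counter;
shared Mutex counterLock;

shared static this()
{
    counterLock = new shared Mutex();
}

void increment() @trusted
{
    counterLock.lock();
    scope (exit) counterLock.unlock();

    // While the lock is held, no other thread accesses `counter` (by the
    // convention of this module), so treating it as unshared is justified.
    // The compiler cannot check this; that is what @trusted asserts.
    int* p = cast(int*) &counter;
    ++*p;
}

int readCounter() @trusted
{
    counterLock.lock();
    scope (exit) counterLock.unlock();
    return *cast(int*) &counter;
}
```

Callers in `@safe` code see only `increment` and `readCounter`; the cast and the reasoning behind it stay inside the two `@trusted` functions.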
Sep 25 2022
prev sibling next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 [snip]
 The only solution is to somehow allow `deallocate` (and by 
 extension `free`) to have a `@safe` interface on its own, which 
 isn't possible in the current D language. At minimum, it would 
 require something like an [`isolated` qualifier][3] (h/t 
 deadalnix for the link), which would guarantee that a pointer 
 is the only pointer to a particular block of memory. Some form 
 of ownership/borrow checking would also work, of course.
 [snip]
It makes sense to me, though I'm not an expert. Another way to think about it is some way to incorporate ownership into the type system. Walter hasn't been convinced yet that `@live` isn't sufficient. I think this would need to be proven to his satisfaction before he starts considering alternatives. Maybe it can't hurt to flesh this argument out further? Nevertheless, I think being able to do allocator-aware safe reference counting should be a necessary condition before moving to safe by default.
Sep 26 2022
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 26.09.22 14:23, jmh530 wrote:
 
 Walter hasn't been convinced yet that `@live` isn't sufficient.
To be fair, I don't think Walter ever made the claim that `@live` is sufficient.
Sep 26 2022
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 D has made a lot of progress recently on memory safety with 
 `-preview=dip1000`, thanks in no small part to [the work of 
 Dennis Korpel][1]. This progress has in turn enabled the 
 creation of [`SafeRefCounted`][2] by Ate Eskola, which will 
 hopefully be available in the next release of Phobos.

 [...]
Couldn't it be `@safe` iff the particular allocator's deallocate is `@safe` (or missing)?
Apr 14 2023
next sibling parent Paul Backus <snarwin gmail.com> writes:
On Friday, 14 April 2023 at 13:42:15 UTC, Atila Neves wrote:
 On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 D has made a lot of progress recently on memory safety with 
 `-preview=dip1000`, thanks in no small part to [the work of 
 Dennis Korpel][1]. This progress has in turn enabled the 
 creation of [`SafeRefCounted`][2] by Ate Eskola, which will 
 hopefully be available in the next release of Phobos.

 [...]
Couldn't it be `@safe` iff the particular allocator's deallocate is `@safe` (or missing)?
Yes. The obvious follow-up question is, "what does it take to make a `deallocate` method `@safe`?" And the answer is: it takes `isolated`, or some other way to restrict aliasing in `@safe` code.

As Timon [1] and others [2][3] have helpfully explained, now that we have `@system` variables from DIP 1035, it is possible to do this without adding new language features, although the UX is not ideal. So, the current next step on the TODO list is to design a new allocator API that takes advantage of these techniques to make `deallocate` `@safe`.

[1] https://forum.dlang.org/post/tr9j1h$1fvd$1@digitalmars.com
[2] https://forum.dlang.org/post/xggosoodlcegitocruwf@forum.dlang.org
[3] https://forum.dlang.org/post/gdkikaklqyvxdyklvmug@forum.dlang.org
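
The "`@safe` iff `deallocate` is `@safe` (or missing)" part can be sketched with ordinary attribute inference. Everything named here (`Owner`, `SafeAlloc`, `SystemAlloc`, `NoFreeAlloc`) is hypothetical and not part of `std.experimental.allocator`. The sketch only shows how safety propagates from the allocator to the owner; it does not address the harder question above of how a `deallocate` that really frees memory could be soundly marked `@safe`.

```d
// Hypothetical owner type; members of a templated struct get their
// safety attributes inferred from their bodies.
struct Owner(Allocator)
{
    private Allocator alloc;
    private void[] block;

    // No explicit attribute: inferred @safe only if everything called
    // here is @safe (or if there is nothing to call).
    ~this()
    {
        static if (__traits(hasMember, Allocator, "deallocate"))
            cast(void) alloc.deallocate(block);
    }
}

// Stub allocators for illustration; none of them manages real memory.
struct SafeAlloc   { bool deallocate(void[] b) @safe   { return true; } }
struct SystemAlloc { bool deallocate(void[] b) @system { return true; } }
struct NoFreeAlloc { /* no deallocate at all, e.g. a leaking allocator */ }

void useSafe()   @safe { Owner!SafeAlloc o; }
void useNoFree() @safe { Owner!NoFreeAlloc o; }

// With a @system deallocate, destroying the owner is rejected in @safe code.
static assert(!__traits(compiles, () @safe { Owner!SystemAlloc o; }));
```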
Apr 14 2023
prev sibling parent Dukc <ajieskola gmail.com> writes:
On Friday, 14 April 2023 at 13:42:15 UTC, Atila Neves wrote:
 On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 D has made a lot of progress recently on memory safety with 
 `-preview=dip1000`, thanks in no small part to [the work of 
 Dennis Korpel][1]. This progress has in turn enabled the 
 creation of [`SafeRefCounted`][2] by Ate Eskola, which will 
 hopefully be available in the next release of Phobos.

 [...]
Couldn't it be `@safe` iff the particular allocator's deallocate is `@safe` (or missing)?
An interesting question. In principle, you COULD make a `@safe` allocator that allocates out of a static memory block. You are only getting and returning `void[]` slices, which in itself isn't `@system`. What makes it dangerous is that those void slices are then used as storage for arbitrary types. So if your `@safe` allocator doesn't do what it's supposed to, you can end up overwriting live pointers, because the allocation machinery does `@trusted` casts that rely on the custom allocator behaving right. In practice it's probably going to be a problem. Maybe the allocator should instead return some wrapper type over `void[]` that can only be created or destructed in `@system` code.
Apr 14 2023