digitalmars.D - Allocator-aware safe reference counting is still not possible

reply Atila Neves <atila.neves gmail.com> writes:
https://forum.dlang.org/post/jsuraddtynhjoaikqprs forum.dlang.org

On Sunday, 25 September 2022 at 12:03:08 UTC, Paul Backus wrote:
 D has made a lot of progress recently on memory safety with 
 `-preview=dip1000`, thanks in no small part to [the work of 
 Dennis Korpel][1]. This progress has in turn enabled the 
 creation of [`SafeRefCounted`][2] by Ate Eskola, which will 
 hopefully be available in the next release of Phobos.

 The next logical step on this journey is a version of 
 `SafeRefCounted` with support for `std.experimental.allocator`. 
 Unfortunately, this step is where we run into a roadblock.

 `SafeRefCounted` is allowed to make a `@trusted` call to `free` 
 when it knows it holds the only pointer to its payload, because 
 it knows (from the C standard) that `free` will not corrupt 
 memory when called under those circumstances.

 However, an allocator-aware version of `SafeRefCounted` that 
 calls a generic `Allocator.deallocate` function instead of free 
 specifically has *literally no idea* what that function will 
 do, and therefore cannot mark that call as `@trusted`, ever, 
 under any circumstances.

 The only solution is to somehow allow `deallocate` (and by 
 extension `free`) to have a `@safe` interface on its own, which 
 isn't possible in the current D language. At minimum, it would 
 require something like an [`isolated` qualifier][3] (h/t 
 deadalnix for the link), which would guarantee that a pointer 
 is the only pointer to a particular block of memory. Some form 
 of ownership/borrow checking would also work, of course.

 In any case, this is not something that can be solved in 
 library code. A language change is necessary.

 [1]: 
 https://github.com/dlang/dmd/pulls?q=is%3Apr+author%3Adkorpel+is%3Aclosed+scope
 [2]: https://github.com/dlang/phobos/pull/8368
 [3]: 
 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/msr-tr-2012-79.pdf
I'm pretty much convinced we need isolated. This is very similar to why the language as it exists today doesn't allow a library author to write a vector type that can be appended to, which... is the main reason one would use a vector to begin with. Some allocators (GC?) might have a safe deallocate function but most (all except the GC?) can't due to aliasing, and that requires isolated.
Jan 22 2023
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/01/2023 4:28 AM, Atila Neves wrote:
 I'm pretty much convinced we need isolated.
I'm not. When I first got a link to that paper I certainly didn't understand even the basic concepts. Variable based borrow checker is much easier to understand comparatively.

My general feeling is allocators get used in two scenarios:

- Controlled: this is your self-contained data structure type scenario with RC. Safe, because if it wasn't, the data structure wouldn't work.
- Uncontrolled: no lifetimes, either global heavy or only used within a function body, which means fat slices and pointers (no thanks). Unsafe, and cannot be made safe (due to things like globals).

So there is no point in trying to make memory lifetimes of uncontrolled usage safe, because you shouldn't be doing this! Use a data structure instead.

That just leaves controlled, where @localsafe would be desirable (so you could call the @system RCAllocator API). And having a better lifetime management strategy to go with RC (i.e. eliding and order of destruction via a borrow checker).

Throw in value type exceptions as well, and ROM-aware RC hooks; we'd be in a good place I think for this.
Jan 22 2023
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Sunday, 22 January 2023 at 15:50:27 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 23/01/2023 4:28 AM, Atila Neves wrote:
 I'm pretty much convinced we need isolated.
I'm not. When I first got a link to that paper I certainly didn't understand even the basic concepts. Variable based borrow checker is much easier to understand comparatively.
A borrow checker would also work. I think `isolated` is probably a better fit for D, but it's certainly not the only option.
 That just leaves controlled, where @localsafe would be 
 desirable (so you could call @system RCAllocator api).
Can you explain more about @localsafe? I don't understand how this is different from Dukc's proposal in the linked thread.
Jan 22 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/01/2023 10:31 AM, Paul Backus wrote:
 Can you explain more about @localsafe? I don't understand how this is 
 different from Dukc's proposal in the linked thread.
It's @safe except you can call non-@safe functions. Same goes for @nogc and pure.

Basically it verifies that you didn't do something stupid without limiting what you can call (like callbacks).

This is something I've been wanting for a while now due to potential mistakes with creating contexts with callbacks.
Jan 22 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 07:06:04 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 23/01/2023 10:31 AM, Paul Backus wrote:
 Can you explain more about @localsafe? I don't understand how 
 this is different from Dukc's proposal in the linked thread.
It's @safe except you can call non-@safe functions. Same goes for @nogc and pure.
In this context, that makes it no different from @trusted.

The problem is that, in a generic allocator-aware container, if you write a @trusted/@localsafe call to RCAllocator.deallocate, there is nothing to stop someone from writing a custom allocator with a deallocate function like this:

    struct NaughtyAllocator
    {
        // ...

        @system void deallocate(void[] block)
        {
            corruptMemory();
        }
    }

...and then RCAllocator.deallocate will dispatch to this function, and you will end up corrupting memory in @safe code.
Jan 23 2023
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Yes, a bad allocator is still a bad allocator. There is nothing we can 
do to guard against that. Only something like address sanitizer could 
prevent bad things from happening.

Unfortunately there is also nothing stopping the implementation in 
Phobos or libc from doing the same thing either. It's not really worth 
considering at this level. Either by mistake or on purpose a memory 
allocator can corrupt memory without the D compiler being able to 
discover it; @safe has nothing to do with it.
Jan 23 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 16:39:07 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Yes, a bad allocator is still a bad allocator. There is nothing 
 we can do to guard against that. Only something like address 
 sanitizer could prevent bad things from happening.

 Unfortunately there is also nothing stopping the implementation 
 in Phobos or libc from doing the same thing either. It's not 
 really worth considering at this level. Either by mistake or on 
 purpose a memory allocator can corrupt memory without the D 
 compiler being able to discover it; @safe has nothing to do 
 with it.
Please read the original thread linked in Atila's first post. It is not very long, and I responded to these exact objections in that thread already.
Jan 23 2023
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/01/2023 5:41 AM, Paul Backus wrote:
 Please read the original thread linked in Atila's first post. It is not 
 very long, and I responded to these exact objections in that thread 
 already.
Yeah I read it at the time, but did so again at your request.
Jan 23 2023
prev sibling parent reply Dukc <ajieskola gmail.com> writes:
On Monday, 23 January 2023 at 16:33:11 UTC, Paul Backus wrote:
 The problem is that, in a generic allocator-aware container, if 
 you write a @trusted/@localsafe call to RCAllocator.deallocate, 
 there is nothing to stop someone from writing a custom 
 allocator with a deallocate function like this:

 struct NaughtyAllocator
 {
     // ...

     @system void deallocate(void[] block)
     {
          corruptMemory();
     }
 }

 ...and then RCAllocator.deallocate will dispatch to this 
 function, and you will end up corrupting memory in @safe code.
Yes... but note that this does require writing `@system` code wrong. In that sense it's no different from someone providing

```D
@trusted void naughtyFunction() => corruptMemory();
```

The difference is that in the allocator there's no user-provided `@trusted` attribute you can point to and say "this one has the problem".

We could require that any allocator that's supposed to be safe for reference counting must have a member function `@trusted @disable void refCountCertificate();`, with `@trusted` attribute being checked for. That way it's not possible to corrupt memory in `@safe` code without writing at least one `@trusted` function.
Jan 23 2023
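For readers following along, the certificate idea in the post above could be checked at compile time roughly like this. It is only a sketch under assumptions: `hasAllocatorCertificate`, `CertifiedAllocator` and `PlainAllocator` are made-up names, the `@disable` from the proposal is left out to keep the trait query simple, and nothing like this exists in Phobos.

```d
import std.algorithm.searching : canFind;

// Hypothetical trait: true if A declares a @trusted refCountCertificate member.
template hasAllocatorCertificate(A)
{
    static if (__traits(hasMember, A, "refCountCertificate"))
    {
        // getFunctionAttributes yields strings such as "@trusted" or "@system".
        enum string[] attrs = [__traits(getFunctionAttributes, A.refCountCertificate)];
        enum hasAllocatorCertificate = attrs.canFind("@trusted");
    }
    else
        enum hasAllocatorCertificate = false;
}

// Hypothetical allocator whose author vouches for deallocate.
struct CertifiedAllocator
{
    @system void deallocate(void[] block) {}
    @trusted void refCountCertificate() {}
}

// Hypothetical allocator without a certificate.
struct PlainAllocator
{
    @system void deallocate(void[] block) {}
}

static assert(hasAllocatorCertificate!CertifiedAllocator);
static assert(!hasAllocatorCertificate!PlainAllocator);

void main() {}
```

A container could then use `if (hasAllocatorCertificate!Allocator)` as a template constraint before making its own `@trusted` call to `deallocate`.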
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 17:33:11 UTC, Dukc wrote:
 Yes... but note that this does require writing `@system` code 
 wrong. In that sense it's no different from someone providing
 ```D
 @trusted void naughtyFunction() => corruptMemory();
 ```
The difference is that @system code is not callable from @safe, but @trusted is. You can be as wrong as you want in @system code, and it will not compromise @safe code unless someone uses @trusted improperly.
 The difference is that in the allocator there's no 
 user-provided `@trusted` attribute you can point to and say 
 "this one has the problem".
In Richard Cattermole's hypothetical, the @trusted (or @localsafe) attribute is in the implementation of the container; for example:

    struct Vector
    {
        RCAllocator allocator;
        void[] memory;

        // ...

        ~this()
        {
            // ...
            () @trusted { allocator.deallocate(memory); }();
        }
    }
 We could require that any allocator that's supposed to be safe 
 for reference counting must have a member function `@trusted 
 @disable void refCountCertificate();`, with `@trusted` 
 attribute being checked for. That way it's not possible to 
 corrupt memory in `@safe` code without writing at least one 
 `@trusted` function.
This would technically work, but I do not think I could look someone in the eye who was new to D and explain it without dying of embarrassment.
Jan 23 2023
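To make the point of the post above concrete, here is a small self-contained sketch of how a `@trusted` call inside a generic container lets any allocator through. `Holder`, `useHolder` and the `corrupted` flag are invented for illustration; the flag merely stands in for real memory corruption.

```d
// Hypothetical allocator; setting a flag stands in for corrupting memory.
struct NaughtyAllocator
{
    static bool corrupted;

    @system void deallocate(void[] block)
    {
        corrupted = true;
    }
}

// Hypothetical generic container that vouches for whatever allocator it is given.
struct Holder(Allocator)
{
    Allocator allocator;
    void[] memory;

    ~this() @safe
    {
        // The compiler accepts this for every Allocator, well-behaved or not.
        () @trusted { allocator.deallocate(memory); }();
    }
}

// @safe code cannot call the @system function directly...
static assert(!__traits(compiles, () @safe {
    NaughtyAllocator a;
    a.deallocate(null);
}));

// ...but it can use the container, whose destructor calls it anyway.
void useHolder() @safe
{
    Holder!NaughtyAllocator holder;
}

void main()
{
    useHolder();
    assert(NaughtyAllocator.corrupted);
}
```

The only `@trusted` in sight belongs to the container author, who has no way of knowing what `deallocate` does for a user-supplied `Allocator`.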
parent reply Dukc <ajieskola gmail.com> writes:
On Monday, 23 January 2023 at 17:44:03 UTC, Paul Backus wrote:
 We could require that any allocator that's supposed to be safe 
 for reference counting must have a member function `@trusted 
 @disable void refCountCertificate();`, with `@trusted` 
 attribute being checked for. That way it's not possible to 
 corrupt memory in `@safe` code without writing at least one 
 `@trusted` function.
This would technically work, but I do not think I could look someone in the eye who was new to D and explain it without dying of embarrassment.
Now when I think of it, it probably should be called an allocator certificate and used for all allocators. `@trusted` required only if any of the other allocation primitives are `@system`. We have no reason to support allocators which might corrupt memory if they free their own memory after all, so no reason to name the certificate after reference counting.

Anyway, what's the problem? Perhaps we can improve this solution somehow. I agree with you that we do not want to wait for any major language-level changes, considering how long it took for DIP1000 to become stable.
Jan 24 2023
parent reply Dukc <ajieskola gmail.com> writes:
On Tuesday, 24 January 2023 at 14:51:06 UTC, Dukc wrote:
 I agree with you that we do not want to wait for any major 
 language-level changes, considering how long it took for 
 DIP1000 to become stable.
Also I suspect the benefit-to-complexity ratio of `isolated` or a complete borrow checker would be poor. It fits Rust, since it's a dedicated systems programming language. D is an application programming and scripting language as much as it's a systems language, so for us it isn't as good a fit. After all, GC solves the majority of memory safety issues already, and DIP1000 can solve a big part, if not the majority of what isn't solved by the GC.
Jan 24 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 24 January 2023 at 15:03:46 UTC, Dukc wrote:
 On Tuesday, 24 January 2023 at 14:51:06 UTC, Dukc wrote:
 I agree with you that we do not want to wait for any major 
 language-level changes, considering how long it took for 
 DIP1000 to become stable.
Also I suspect the benefit-to-complexity ratio of `isolated` or a complete borrow checker would be poor. It fits Rust, since it's a dedicated systems programming language. D is an application programming and scripting language as much as it's a systems language, so for us it isn't as good a fit.
It's worth noting that D already has a bunch of special-case language rules that would be unified by adding `isolated`.

For example: return values of so-called ["Pure factory functions"][1] are allowed to implicitly convert to `immutable`, because "all mutable memory returned by the call cannot be referenced by any other part of the program"--or, in other words, the return value is `isolated`.

There's a similar rule for `new` expressions, although it doesn't seem to be written down anywhere. You can see it at work in code like this:

```d
// converts from mutable to immutable
immutable(int[][]) a = new int[][](3, 3);
```

Replacing all of these special cases with a single consistent set of rules would help simplify the language.

[1]: https://dlang.org/spec/function.html#pure-factory-functions
Jan 24 2023
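The pure factory function rule mentioned in the post above can be tried out with a few lines. This is a minimal sketch; `makeSquares` and the variable names are made up for the example.

```d
// A pure function whose parameters carry no mutable indirections:
// nothing else can hold a reference to the array it returns.
int[] makeSquares(size_t n) pure
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

void main()
{
    // Accepted without a cast because the result is known to be unique.
    immutable int[] squares = makeSquares(4);
    assert(squares == [0, 1, 4, 9]);

    // An ordinary mutable array, which may be aliased, does not convert.
    int[] aliased = [1, 2, 3];
    static assert(!__traits(compiles, { immutable int[] bad = aliased; }));
}
```

The compiler is effectively reasoning about uniqueness here, but only for this one pattern; an `isolated` qualifier would let the same reasoning be expressed in the type system.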
parent reply Dukc <ajieskola gmail.com> writes:
On Tuesday, 24 January 2023 at 16:11:00 UTC, Paul Backus wrote:
 It's worth noting that D already has a bunch of special-case 
 language rules that would be unified by adding `isolated`.

 [snip]

 Replacing all of these special cases with a single consistent 
 set of rules would help simplify the language.
That's precisely what made the bottom type such a great DIP from the perspective of use. However, as with the bottom type, it isn't any easier to write the DIP nor implement nor debug it. Meaning, in the long run we might well want `isolated`, but allocator-aware reference counter should not be waiting for one - which I think is what you said.
Jan 24 2023
parent Dukc <ajieskola gmail.com> writes:
On Tuesday, 24 January 2023 at 16:21:37 UTC, Dukc wrote:
 Meaning, in the long run we might well want `isolated`, but 
 allocator-aware reference counter should not be waiting for one 
 - which I think is what you said.
Oh sorry, I remembered Paul's earlier post wrong. He wrote:
 I agree that it's bad UX, but what's the alternative?
 Implement a borrow checker in D? It'll take 5-10 years and
 won't even work properly when it's done. At that point, you
 may as well just tell people to switch to Rust.
..but that was referring to the `borrow` function, not allocators.
Jan 24 2023
prev sibling next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Sunday, 22 January 2023 at 15:50:27 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 23/01/2023 4:28 AM, Atila Neves wrote:
 I'm pretty much convinced we need isolated.
I'm not. When I first got a link to that paper I certainly didn't understand even the basic concepts. Variable based borrow checker is much easier to understand comparatively.
I can’t claim to be an expert, but I’ve read some on isolated and some smart people here like the idea. The borrow checker certainly has some similarities. With the borrow checker, you can have an unlimited number of const references or a single mutable reference. Isolated means you can at most have a single mutable way of accessing some data. So it’s missing that either/or-ness of the borrow checker, if I understand it correctly. Another question is whether an affine type or qualifier is better than @live for handling the borrow checker behavior. If such a thing existed would there be a demand for isolated?
Jan 22 2023
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/01/2023 3:07 PM, jmh530 wrote:
 Another question is whether an affine type or qualifier is better than 
 @live for handling the borrow checker behavior. If such a thing existed 
 would there be a demand for isolated?
Yes I believe a type qualifier (scope), would be better suited towards a borrow checker than any other solution.

Essentially a borrow checker just says, an owning reference lifetime must exceed that of a borrowed reference. It guarantees the right order of death for them. It's so simple, we already have a ton of logic to support this with DIP1000! We don't need new syntax, just some smarter semantics surrounding owning/borrowing.

Just to be clear, I don't think @live solves any problem that the D community has. It is useless.

Here is how much @live is used (note we know it is severely incomplete):

https://issues.dlang.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED&component=dmd&f0=OP&f1=OP&f2=assigned_to&f3=CP&f4=CP&j1=OR&list_id=243775&o2=substring&query_format=advanced&short_desc=%40live&short_desc_type=allwordssubstr&v2=live

None. Not one bug report. Nobody uses it today. Because we don't need it.
Jan 22 2023
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Sunday, 22 January 2023 at 15:50:27 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 23/01/2023 4:28 AM, Atila Neves wrote:
 [...]
I'm not. When I first got a link to that paper I certainly didn't understand even the basic concepts. Variable based borrow checker is much easier to understand comparatively. [...]
I don't understand your post. I've never seen anything resembling the "uncontrolled" above. I think the way to use allocators is via smart pointers/containers and iff it's strictly required.
Jan 23 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/01/2023 9:31 PM, Atila Neves wrote:
 On Sunday, 22 January 2023 at 15:50:27 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 23/01/2023 4:28 AM, Atila Neves wrote:
 [...]
I'm not. When I first got a link to that paper I certainly didn't understand even the basic concepts. Variable based borrow checker is much easier to understand comparatively. [...]
I don't understand your post. I've never seen anything resembling the "uncontrolled" above. I think the way to use allocators is via smart pointers/containers and iff it's strictly required.
I have written uncontrolled allocator usage. It's also in druntime/phobos plenty. After all it's basically just a life cycle that is known, but the compiler can't prove anything useful related to it. See any usage of malloc/free (including internally to the GC).

But yes, I think the way to go is some sort of controlled representation for normal usage (which requires borrowing, which @live does not solve).

```d
Vector!int vector;
vector ~= 3;

auto borrowed = vector[0];
func(borrowed);

void func(scope ref int value) {

}
```

Basically right now we're missing the lifetime checks surrounding borrowed & function parameter. Everything else is do-able right now, even if it isn't as cheap as it could be (like RC eliding).
Jan 23 2023
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 08:49:50 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Basically right now we're missing the lifetime checks 
 surrounding borrowed & function parameter. Everything else is 
 do-able right now, even if it isn't as cheap as it could be 
 (like RC eliding).
Have you seen the borrow method [1] used by SafeRefCounted? It is already possible, in the current D language, to prevent a container or smart pointer from leaking references. The syntax is awkward, because you have to use a callback, but it can be done.

Lifetime issues are not the blocker here. The blocker is being able to give deallocate/free a safe interface, so that it can be used safely in a generic or polymorphic context, where the specific implementation is not known in advance.

[1]: https://dlang.org/library/std/typecons/borrow.html
Jan 23 2023
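For anyone who has not used the callback style referred to in the post above, a minimal sketch looks like this. It assumes a Phobos release that ships `safeRefCounted` and `borrow` in `std.typecons`; the variable names are made up.

```d
import std.typecons : borrow, safeRefCounted;

void main()
{
    auto counter = safeRefCounted(5);

    // The payload is only handed out for the duration of the callback.
    int doubled = counter.borrow!(x => x * 2);
    assert(doubled == 10);

    // A ref parameter lets the callback modify the payload in place.
    counter.borrow!((ref x) { x += 1; });
    assert(counter.borrow!(x => x) == 6);
}
```

Because the reference never leaves the callback, the container stays in control of the payload's lifetime. The price is the awkward syntax mentioned in the post.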
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/01/2023 5:39 AM, Paul Backus wrote:
 On Monday, 23 January 2023 at 08:49:50 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Basically right now we're missing the lifetime checks surrounding 
 borrowed & function parameter. Everything else is do-able right now, 
 even if it isn't as cheap as it could be (like RC eliding).
Have you seen the borrow method [1] used by SafeRefCounted? It is already possible, in the current D language, to prevent a container or smart pointer from leaking references. The syntax is awkward, because you have to use a callback, but it can be done.
Yes I'm aware of this method. I've talked about it with Robert Schadek during last DConf Online who argued that this is the only way forward. This is not the way forward as far as I'm concerned. It is a major step backwards in usability as it is not how people work with arrays or data types in general.
 Lifetime issues are not the blocker here. The blocker is being able to 
 give deallocate/free a safe interface, so that it can be used safely in 
 a generic or polymorphic context, where the specific implementation is 
 not known in advance.
I can't agree with that. We need to move people away from calling into allocators directly! They are an advanced concept that is easy to get wrong, especially when creating them, which is what composing them like std.experimental.allocator does.

Ultimately, I think it's OK for allocators to not be safe; they are simply too easy to get wrong without any way to prevent this at the compiler level. You use them when you want to do something more advanced without any hand holding. Use data structures like a vector type to make it safe instead, which is why I think lifetime is the only thing blocking it atm.
Jan 23 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 16:59:16 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 24/01/2023 5:39 AM, Paul Backus wrote:
 Have you seen the borrow method [1] used by SafeRefCounted? It 
 is already possible, in the current D language, to prevent a 
 container or smart pointer from leaking references. The syntax 
 is awkward, because you have to use a callback, but it can be 
 done.
Yes I'm aware of this method. I've talked about it with Robert Schadek during last DConf Online who argued that this is the only way forward. This is not the way forward as far as I'm concerned. It is a major step backwards in usability as it is not how people work with arrays or data types in general.
I agree that it's bad UX, but what's the alternative? Implement a borrow checker in D? It'll take 5-10 years and won't even work properly when it's done. At that point, you may as well just tell people to switch to Rust.
 Lifetime issues are not the blocker here. The blocker is being 
 able to give deallocate/free a safe interface, so that it can 
 be used safely in a generic or polymorphic context, where the 
 specific implementation is not known in advance.
I can't agree with that. We need to move people away from calling into allocators directly! They are an advanced concept that is easy to get wrong, especially when creating them, which is what composing them like std.experimental.allocator does.

Ultimately, I think it's OK for allocators to not be safe; they are simply too easy to get wrong without any way to prevent this at the compiler level. You use them when you want to do something more advanced without any hand holding. Use data structures like a vector type to make it safe instead, which is why I think lifetime is the only thing blocking it atm.
I am not advocating for typical D users to call into allocators directly. The people who benefit from allocators having a safe interface are authors of generic container and smart pointer libraries. It is currently *impossible* to write a safe vector type that accepts a user-supplied allocator. You can only do it if you hard-code a dependency on a specific allocator (or a specific predefined set of allocators) whose behavior you know in advance. If your argument is that we should do exactly that, and simply give up on supporting user-supplied allocators entirely, then I can accept that as a reasonable position. Certainly it would be the quickest and easiest way to un-block progress on containers, and we can always revisit the issue in the future if necessary.
Jan 23 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/01/2023 6:32 AM, Paul Backus wrote:
 On Monday, 23 January 2023 at 16:59:16 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 24/01/2023 5:39 AM, Paul Backus wrote:
 Have you seen the borrow method [1] used by SafeRefCounted? It is 
 already possible, in the current D language, to prevent a container 
 or smart pointer from leaking references. The syntax is awkward, 
 because you have to use a callback, but it can be done.
Yes I'm aware of this method. I've talked about it with Robert Schadek during last DConf Online who argued that this is the only way forward. This is not the way forward as far as I'm concerned. It is a major step backwards in usability as it is not how people work with arrays or data types in general.
I agree that it's bad UX, but what's the alternative? Implement a borrow checker in D? It'll take 5-10 years and won't even work properly when it's done. At that point, you may as well just tell people to switch to Rust.
A lot of it is already done with DIP1000.

When you call a function and pass a borrowed value in, that part is complete.

What we are missing is the initial borrow action and guaranteeing it being tied to another object in terms of life.

This only really needs to cover within a function body, so the DFA here should actually be really easy.

Perhaps something like this (note ref not actually required for pointer types):

```d
struct Thing(T) {
	ref T get() scope { ... }
}

{
	Thing thing;
	scope ref got = thing.get; // owner = thing
	func(got); // ok parameter is scope

	thing = Thing.init;// Error: thing must out live got variable, but thing is being assigned to.
	return got; // Error: scope variable thing must out live got variable, but got is being returned.
}

void func(scope ref T value) {}
```

For the rest, I'm glad that we are converging on a possible position :)
Jan 23 2023
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 23 January 2023 at 17:49:20 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 A lot of it is already done with DIP1000.

 When you call a function and pass a borrowed value in, that 
 part is complete.

 What we are missing is the initial borrow action and 
 guaranteeing it being tied to another object in terms of life.

 This only really needs to cover within a function body, so the 
 DFA here should actually be really easy.

 Perhaps something like this (note ref not actually required for 
 pointer types):

 [...]
I agree that this would work, but I'm not convinced it can be done as an incremental change on top of the existing `scope` system--I think we would end up having to essentially redesign and reimplement `scope` from scratch by the time we were done. Then again, it's not obvious that implementing `isolated` would be *less* work than reimplementing `scope` from scratch, so maybe it's just as good a proposal. :)
Jan 23 2023
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/01/2023 8:04 AM, Paul Backus wrote:
 Then again, it's not obvious that implementing `isolated` would be 
 *less* work than reimplementing `scope` from scratch, so maybe it's just 
 as good a proposal. 😄
Lol yeah. Realistically somebody should give it a go, see how things turn out, only way to see what the true cost of any approach would be.
Jan 23 2023
prev sibling parent reply Dukc <ajieskola gmail.com> writes:
On Monday, 23 January 2023 at 17:49:20 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Perhaps something like this (note ref not actually required for 
 pointer types):

 ```d
 struct Thing(T) {
 	ref T get() scope { ... }
 }

 {
 	Thing thing;
 	scope ref got = thing.get; // owner = thing
 	func(got); // ok parameter is scope

 	thing = Thing.init;// Error: thing must out live got variable, but thing is being assigned to.
 	return got; // Error: scope variable thing must out live got variable, but got is being returned.
 }

 void func(scope ref T value) {}
 ```

 For the rest, I'm glad that we are converging on a possible 
 position :)
I'm afraid it's more complicated than you think.

`thing` might have its destructor called before the end of `got` lifetime. The language could pretty trivially prevent doing that directly, but what if you have a `scope` pointer to `thing` and call the destructor via it? Or a `scope SumType!(Thing*, int*)[5]` variable, that may contain references to both `thing`s and ints?

These are probably solvable, but the solution is going to be at least as complex as `@live`, if not more so.
Jan 24 2023
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 25/01/2023 3:39 AM, Dukc wrote:
 I'm afraid it's more complicated than you think.
I expect it will be complicated. DFA for this sort of thing always is.
 `thing` might have its destructor called before the end of `got` 
 lifetime. The language could pretty trivially prevent doing that 
 directly, but what if you have a `scope` pointer to `thing` and call the 
 destructor via it? or a `scope SumType!(Thing*, int*)[5]` variable, that 
 may contain both references to both `thing`s and ints?
Yes, you need to track the 'real' owner for memory and ensure the right order of variable destruction.
 These are probably solvable, but the solution is going to be at least as 
 complex as `@live`, if not more so.
Considering @live isn't complete, I'd argue implementing @live is more complicated than @live ;)

The singular difference is @live is opt-in, function by function. This on the other hand isn't, which means its guarantees are actually real for memory safety.
Jan 24 2023
prev sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Monday, 23 January 2023 at 08:49:50 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 ```d
 Vector!int vector;
 vector ~= 3;

 auto borrowed = vector[0];
 func(borrowed);

 void func(scope ref int value) {
 	
 }
 ```

 Basically right now we're missing the lifetime checks 
 surrounding borrowed & function parameter. Everything else is 
 do-able right now, even if it isn't as cheap as it could be 
 (like RC eliding).
I'm writing a language with borrowing and ref counting (Neat), and this is not a valid borrow. Basically you don't want to take on borrowing with variables that are mutable by default, because then you're asking for things like:

```d
Vector!int vector;
vector ~= 3;

void evil() { vector = Vector!int.init; }

auto borrowed = vector[0];
func(borrowed);

void func(scope ref int value) {
	// destroy the last non-borrowed reference to vector, where is your God now?
	evil;
}
```
Jan 25 2023
parent FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 26 January 2023 at 06:55:41 UTC, FeepingCreature 
wrote:
 On Monday, 23 January 2023 at 08:49:50 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 ```d
 Vector!int vector;
 vector ~= 3;

 auto borrowed = vector[0];
 func(borrowed);

 void func(scope ref int value) {
 	
 }
 ```

 Basically right now we're missing the lifetime checks 
 surrounding borrowed & function parameter. Everything else is 
 do-able right now, even if it isn't as cheap as it could be 
 (like RC eliding).
I'm writing a language with borrowing and ref counting (Neat), and this is not a valid borrow. Basically you don't want to take on borrowing with variables that are mutable by default, because then you're asking for things like:

```d
Vector!int vector;
vector ~= 3;

void evil() { vector = Vector!int.init; }

auto borrowed = vector[0];
func(borrowed);

void func(scope ref int value) {
	// destroy the last non-borrowed reference to vector, where is your God now?
	evil;
}
```
Addendum: This idiom is impossible in Rust because the equivalent of `void evil()` already captures `vector`. But nested functions are an essential part of D. This is why you cannot bolt borrowing onto a language that has been designed around a garbage collector; it needs support at every level. (Like variables being rvalue by default, cough.)
Jan 25 2023
prev sibling next sibling parent reply RTM <riven baryonides.ru> writes:
On Sunday, 22 January 2023 at 15:28:53 UTC, Atila Neves wrote:
 I'm pretty much convinced we need isolated.
Seems redundant to dip1021 (@live).
Jan 22 2023
parent reply Nick Treleaven <nick geany.org> writes:
On Sunday, 22 January 2023 at 21:11:31 UTC, RTM wrote:
 On Sunday, 22 January 2023 at 15:28:53 UTC, Atila Neves wrote:
 I'm pretty much convinced we need isolated.
Seems redundant to dip1021 (@live).
From the @live docs:
 Multiple borrower pointers can simultaneously exist if all of 
 them are pointers to read only (const or immutable) data
AFAIU isolated needs to guarantee no other references point to the data, so that we can call `free` on it without dangling references. That applies even if the data is immutable, so we can put immutable data on the heap.
Jan 24 2023
parent jmh530 <john.michael.hall gmail.com> writes:
On Tuesday, 24 January 2023 at 21:01:04 UTC, Nick Treleaven wrote:
 [snip]

 AFAIU isolated needs to guarantee no other references point to 
 the data, so that we can call `free` on it without dangling 
 references. That applies even if the data is immutable, so we 
 can put immutable data on the heap.
I would think that a hypothetical isolated would apply to mutable data or pointers to mutable data. If the data is const or immutable, then D's transitivity implies that all pointers to it are read only.
Jan 24 2023
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Sunday, 22 January 2023 at 15:28:53 UTC, Atila Neves wrote:
 I'm pretty much convinced we need isolated. This is very 
 similar to why the language as it exists today doesn't allow a 
 library author to write a vector type that can be appended to, 
 which... is the main reason one would use a vector to begin 
 with.

 Some allocators (GC?) might have a @safe deallocate function
 but most (all except the GC?) can't due to aliasing, and that 
 requires isolated.
`isolated` would be nice, but for now we can model it with a struct so that this works:

```d
class Mallocator : IAllocator
{
    import core.stdc.stdlib : free, malloc;

    void* safeAllocate(size_t n) @trusted
    {
        return malloc(n);
    }

    void safeDeallocate(Isolated!(void*) ip) @trusted
    {
        ip.unwrap.free;
    }
}

void main()
{
    IAllocator a = new Mallocator;
    scope m = a.safeAllocate(4);
    auto ip = (() @trusted => assumeIsolated(a.safeAllocate(4)))();
    a.safeDeallocate(ip.move);
    assert(ip.unwrap == null);
}
```

Working code: https://github.com/ntrel/stuff/blob/master/typecons/isolated.d

Isolated could go in std.typecons.
Jan 28 2023
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 1/28/23 16:56, Nick Treleaven wrote:
 On Sunday, 22 January 2023 at 15:28:53 UTC, Atila Neves wrote:
 I'm pretty much convinced we need isolated. This is very similar to 
 why the language as it exists today doesn't allow a library author to 
 write a vector type that can be appended to, which... is the main 
 reason one would use a vector to begin with.

 Some allocators (GC?) might have a @safe deallocate function but most
 (all except the GC?) can't due to aliasing, and that requires isolated.
`isolated` would be nice, but for now we can model it with a struct so that this works:

```d
class Mallocator : IAllocator
{
    import core.stdc.stdlib : free, malloc;

    void* safeAllocate(size_t n) @trusted
    {
        return malloc(n);
    }

    void safeDeallocate(Isolated!(void*) ip) @trusted
    {
        ip.unwrap.free;
    }
}

void main()
{
    IAllocator a = new Mallocator;
    scope m = a.safeAllocate(4);
    auto ip = (() @trusted => assumeIsolated(a.safeAllocate(4)))();
    a.safeDeallocate(ip.move);
    assert(ip.unwrap == null);
}
```

Working code: https://github.com/ntrel/stuff/blob/master/typecons/isolated.d

Isolated could go in std.typecons.
Isolated is not sufficient, you also have to guarantee the pointer was allocated with `malloc`.
Jan 29 2023
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Monday, 30 January 2023 at 01:07:32 UTC, Timon Gehr wrote:
 Working code:
 https://github.com/ntrel/stuff/blob/master/typecons/isolated.d
 
 Isolated could go in std.typecons.
Isolated is not sufficient, you also have to guarantee the pointer was allocated with `malloc`.
This could be accomplished with building a wrapper type over `malloc`ed pointers. `@safe` `free` would accept only them, not any isolated pointer.
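
A minimal sketch of what such a wrapper might look like. The names `MallocBlock`, `safeMalloc` and `safeFree` are made up for illustration, and the sketch assumes the pointer field can really be kept out of reach of `@safe` code (for example with a DIP 1035 `@system` variable); plain `private` only protects it from other modules.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical wrapper: the only way to get a non-null one is `safeMalloc`.
struct MallocBlock
{
    // Assumption: under DIP 1035 this would be declared `@system` so that
    // @safe code in this module cannot overwrite it either.
    private void* ptr;

    // No copies, so two wrappers can never own the same block.
    @disable this(this);

    bool isNull() const @safe pure nothrow @nogc
    {
        return ptr is null;
    }
}

/// Allocates with `malloc` and records that fact in the return type.
MallocBlock safeMalloc(size_t n) @trusted nothrow @nogc
{
    return MallocBlock(malloc(n));
}

/// Accepts only `MallocBlock`, never a raw pointer, and nulls it afterwards.
void safeFree(ref MallocBlock block) @trusted nothrow @nogc
{
    free(block.ptr); // free(null) is a no-op
    block.ptr = null;
}

void main() @safe
{
    auto block = safeMalloc(16);
    safeFree(block);
    assert(block.isNull);
    safeFree(block); // harmless: the pointer was already nulled
}
```

The sketch deliberately offers no way to read the raw pointer; any real design would still need a borrowing story for actually using the memory.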
Jan 30 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 1:17 AM, Dukc wrote:
 On Monday, 30 January 2023 at 01:07:32 UTC, Timon Gehr wrote:
 Working code:
 https://github.com/ntrel/stuff/blob/master/typecons/isolated.d

 Isolated could go in std.typecons.
Isolated is not sufficient, you also have to guarantee the pointer was allocated with `malloc`.
This could be accomplished with building a wrapper type over `malloc`ed pointers. `@safe` `free` would accept only them, not any isolated pointer.
This is just horrible. You might as well call it DynamicArray which is exactly what I recommend you do, use data structures and not call allocators directly! Of course we could add ``@require(AllocatorAware)`` and call it a day. That would at least force people to audit their code and realize hey... this isn't something I should be using directly but still allow passing allocators around.
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 12:30:01 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 31/01/2023 1:17 AM, Dukc wrote:
 
 This could be accomplished with building a wrapper type over 
 `malloc`ed pointers. `@safe` `free` would accept only them,
 not any isolated pointer.
This is just horrible. You might as well call it DynamicArray which is exactly what I recommend you do, use data structures and not call allocators directly!
I don't understand why you keep bringing this up--it's totally beside the point. Obviously most users should not need to use the allocator API directly. However, if you are *implementing* a data structure like a dynamic array, and you want to support user-supplied custom allocators, then the only way your data structure can be @safe is if the allocator API uses this kind of wrapper type to present a @safe interface.
Jan 30 2023
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 5:27 AM, Paul Backus wrote:
 I don't understand why you keep bringing this up--it's totally beside 
 the point. Obviously most users should not need to use the allocator API 
 directly.
In the above case, with @system you could pass in whatever pointer you want or extract the pointer out and lose any protection that the given struct could provide. If the struct is done simply, you could do this with @safe easily as well :/ So not really appropriate for an allocator to return. So you remain in audit only territory of it.
 However, if you are *implementing* a data structure like a dynamic 
 array, and you want to support user-supplied custom allocators, then the 
 only way your data structure can be @safe is if the allocator API uses
 this kind of wrapper type to present a @safe interface.
No? The data structure is the one doing the lifetime management. It takes over from the language to keep the guarantees in check. Ultimately, data structures and algorithms typically need auditing to make sure they are doing what they are supposed to be doing. I strongly believe adding a whole lifetime tracking feature in what should be library code won't have the ROI in adding it to the language. On that note, Dennis has done some work to replace some scope inference which should offer the infrastructure required to plug the last big hole in lifetime tracking of borrowed memory from a data structure :) https://github.com/dlang/dmd/pull/14492
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 17:01:38 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 31/01/2023 5:27 AM, Paul Backus wrote:
 I don't understand why you keep bringing this up--it's totally 
 beside the point. Obviously most users should not need to use 
 the allocator API directly.
In the above case, with @system you could pass in whatever pointer you want or extract the pointer out and lose any protection that the given struct could provide. If the struct is done simply, you could do this with @safe easily as well :/ So not really appropriate for an allocator to return. So you remain in audit only territory of it.
We can use DIP 1035's @system variables to ensure that the wrapper struct's internals are not meddled with.
 However, if you are *implementing* a data structure like a 
 dynamic array, and you want to support user-supplied custom 
 allocators, then the only way your data structure can be @safe
 is if the allocator API uses this kind of wrapper type to
 present a @safe interface.
No? The data structure is the one doing the lifetime management. It takes over from the language to keep the guarantees in check.
It seems as though you have completely failed to grasp the essential point of my original post [1].

In order for a data structure to provide these guarantees to its users, the allocator must, in turn, provide certain guarantees *to the data structure*. And if the data structure does not know in advance which allocator it will be using (e.g., if it is calling a generic RCAllocator.deallocate function), then those guarantees must be *encoded in the type system* somehow, so that @safe code cannot accidentally break them.

[1] https://forum.dlang.org/post/jsuraddtynhjoaikqprs forum.dlang.org
Jan 30 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 6:18 AM, Paul Backus wrote:
 It seems as though you have completely failed to grasp the essential 
 point of my original post [1].
 
 In order for a data structure to provide these guarantees to its users, 
 the allocator must, in turn, provide certain guarantees *to the data 
 structure*. And if the data structure does not know in advance which 
 allocator it will be using (e.g., if it is calling a generic 
 RCAllocator.deallocate function), then those guarantees must be *encoded 
 in the type system* somehow, so that @safe code cannot accidentally
 break them.
Unfortunately I have indeed completely grasped it. I genuinely do not believe it will have the ROI that others seem to think it will. Walter has shown that he does not want the DFA that I strongly suspect is required to do this properly and therefore trying to solve this will only result in wasted effort when we could get almost there without it within our (ok sometimes artificial) limitations.
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 17:22:56 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Unfortunately I have indeed completely grasped it.

 I genuinely do not believe it will have the ROI that others 
 seem to think it will.

 Walter has shown that he does not want the DFA that I strongly 
 suspect is required to do this properly and therefore trying to 
 solve this will only result in wasted effort when we could get 
 almost there without it within our (ok sometimes artificial) 
 limitations.
As far as I am aware, it is impossible to have all three of the following:

1. @safe containers.
2. User-supplied allocators.
3. No language changes.

Given Walter and Atila's stance on memory safety, (1) is non-negotiable, so the question is whether we prefer (1)+(2) (implement isolated or something similar) or (1)+(3) (do not allow users to define their own allocators). I myself am not sure which of these would have better ROI.

In this post, it sounds as though you are advocating for (1)+(3). But in your previous post [1], you replied to my claim that (1)+(2) requires not-(3) with "No? The data structure is the one doing the lifetime management," which suggested to me that you believe (1)+(2)+(3) is actually possible somehow.

If you agree with me that (1)+(2)+(3) is impossible, then I have no objections to your other claims. If you believe that (1)+(2)+(3) is possible, on the other hand, then I would very much like to hear how.

[1] https://forum.dlang.org/post/tr8t5i$2gt5$1 digitalmars.com
Jan 30 2023
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
We'll have language changes for helping with this, I can't see us not 
making them if opportunities arise ;) Especially if they are small and 
have multiple use cases.

Now, 1. and 2. I want to differentiate between effectively @safe, and
actually machine check-able @safe.

I do not believe we will ever have fully machine check able. That means 
significant DFA, we really don't have the required people to design and 
implement this. DIP1000 is a good example of this, since it doesn't 
support indirection with multiple lifetimes being involved in a variable.

Effectively @safe means as much code is machine checked, but we want to
isolate to library code the unsafe parts, where we ensure those 
guarantees for what we can by auditing instead.

So no I don't think we can have 1/2 without hiring some people, but we
can get close enough to it with what resources we do have, just by
telling people to not use something and push them instead to use things
that do offer it as long as you don't do something outright stupid (which
let's face it, they probably won't be using anything other than the
default global allocator).
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 17:51:52 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Effectively @safe means as much code is machine checked, but we
 want to isolate to library code the unsafe parts, where we 
 ensure those guarantees for what we can by auditing instead.
Yes; I assumed that this went without saying. Even with language features like an isolated qualifier to help us, it will still be necessary for the data structures and allocators to use @trusted code internally.
 So no I don't think we can have 1/2 without hiring some people,
 but we can get close enough to it with what resources we do
 have, just by telling people to not use something and push them
 instead to use things that do offer it as long as you don't do
 something outright stupid (which let's face it, they probably
 won't be using anything other than the default global
 allocator).
I am afraid that this description is far too vague for me to understand what you have in mind here. Are you advocating for (1)+(3), (2)+(3), or maybe some hybrid of both? Like, if you use one of the Officially Blessed allocators, the container will be @safe, and if you use a 3rd-party custom allocator, it'll be @system?
Jan 30 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 7:19 AM, Paul Backus wrote:
 So no I don't think we can have 1/2 without hiring some people, but we
 can get close enough to it with what resources we do have, just by
 telling people to not use something and push them instead to use
 things that do offer it as long as you don't do something outright stupid
 (which let's face it, they probably won't be using anything other than
 the default global allocator).
I am afraid that this description is far too vague for me to understand what you have in mind here. Are you advocating for (1)+(3), (2)+(3), or maybe some hybrid of both? Like, if you use one of the Officially Blessed allocators, the container will be @safe, and if you use a 3rd-party custom allocator, it'll be @system?
If we hired some people then yes 1&2 (but not 3). Otherwise we are stuck with changing the goal posts from perfect to good enough for the time being. As Andrei used to say: perfect is the enemy of the good; which is unfortunately what I think we need to strongly consider here. We have what appears to be a pretty decent path forward for 'good'.

It shouldn't matter what allocator library you use. Either it works or it doesn't. The compiler shouldn't care whose code it is, only the lifetime patterns.
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 18:33:57 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 If we hired some people then yes 1&2 (but not 3). Otherwise we 
 are stuck with changing the goal posts from perfect to good 
 enough for the time being. As Andrei used to say: perfect is the
 enemy of the good; which is unfortunately what I think we need 
 to strongly consider here. We have what appears to be a pretty 
 decent path forward for 'good'.
I'm not opposed to the idea of moving the goalposts in principle, I am just not sure where you are proposing that we move them to. :) It seems like every time I ask, I get a different answer.
 It shouldn't matter what allocator library you use. Either it 
 works or it doesn't. The compiler shouldn't care whose code it 
 is, only the lifetime patterns.
For example: I agree with this paragraph in principle, but isn't this one of the goalposts that you were just saying we should be willing to move?
Jan 30 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 7:49 AM, Paul Backus wrote:
 On Monday, 30 January 2023 at 18:33:57 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 If we hired some people then yes 1&2 (but not 3). Otherwise we are 
 stuck with changing the goal posts from perfect to good enough for the 
 time being. As Andrei used to say: perfect is the enemy of the good;
 which is unfortunately what I think we need to strongly consider here. 
 We have what appears to be a pretty decent path forward for 'good'.
I'm not opposed to the idea of moving the goalposts in principle, I am just not sure where you are proposing that we move them to. :) It seems like every time I ask, I get a different answer.
I'm not expressing myself clearly enough but the goal posts I'm recommending are the same as at the start. Check as much as possible with @safe, but let data structures be responsible for as much of the life time guarantees wrt. allocators as possible.

Perfect solution: all code is mechanically checked.

Good solution: library and under the hood code gets audited, user code of the library code gets mechanically checked.

Let's have a good solution, then try to figure out if we can remove restrictions and get a perfect solution in the library code as well.
 It shouldn't matter what allocator library you use. Either it works or 
 it doesn't. The compiler shouldn't care whose code it is, only the 
 lifetime patterns.
For example: I agree with this paragraph in principle, but isn't this one of the goalposts that you were just saying we should be willing to move?
I'm half and half. Stuff like @localsafe would mean that I think that allocators should all be @system in API. I'm genuinely reconsidering it for my own because of this thread.

But in terms of hard coding into the compiler std.allocators vs sidero.base.allocators yeah no. It shouldn't do that, which is what the quoted statement was about. If both API's match, then the compiler should treat them the same way in terms of lifetime management.
Jan 30 2023
next sibling parent Dukc <ajieskola gmail.com> writes:
On Monday, 30 January 2023 at 18:59:38 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Perfect solution: all code is mechanically checked.

 Good solution: library and under the hood code gets audited, 
 user code of the library code gets mechanically checked.
I believe we all have considered the latter good enough from the get go. It does not matter whether the language primitives or Phobos functions give the guarantees. All that matters is what the user can write without dabbling with `@system` or `@trusted` code him/herself.

I'd even argue it's not necessarily even worse than the "perfect" solution. Phobos can have bugs, but so can the language-level checker.
Jan 30 2023
prev sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 18:59:38 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 I'm not expressing myself clearly enough but the goal posts I'm 
 recommending are the same as at the start. Check as much as 
 possible with @safe, but let data structures be responsible for
 as much of the life time guarantees wrt. allocators as possible.
[...]
 I'm half and half. Stuff like @localsafe would mean that I
 think that allocators should all be @system in API. I'm
 genuinely reconsidering it for my own because of this thread.

 But in terms of hard coding into the compiler std.allocators vs 
 sidero.base.allocators yeah no. It shouldn't do that, which is 
 what the quoted statement was about. If both API's match, then 
 the compiler should treat them the same way in terms of 
 lifetime management.
So, what, we should allow @safe data structures to call @system allocators and simply *assume* that the allocator implementation won't do anything weird? And if someone writes their own allocator implementation, and doesn't manually check that their @system code provides the guarantees that the data structures expect, we're ok with them getting memory corruption in @safe code?

To me this seems antithetical to the entire idea of @safe. It's what Walter calls "programming by convention," where the only thing standing between the user and disaster is programmer discipline. I do not think he or Atila would be willing to accept it (nor would I accept it in their place).
Jan 30 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 10:34 AM, Paul Backus wrote:
 So, what, we should allow @safe data structures to call @system
 allocators and simply *assume* that the allocator implementation won't
 do anything weird? And if someone writes their own allocator
 implementation, and doesn't manually check that their @system code
 provides the guarantees that the data structures expect, we're ok with
 them getting memory corruption in @safe code?
If you can show me papers where mechanical checking of memory allocators takes place without the use of proofing assistants, I'll be very interested in it.

My understanding is that the state of the art barely touches upon this subject even with proofing assistants, let alone without. I.e. https://surface.syr.edu/eecs_techreports/182/

So yes, if the state of the art literally requires proof assistants to verify that a memory allocator is doing the right thing at the bare minimum, then it absolutely is out of scope of D for the time being. But hey, if there is a known good solution here that we can implement, let's do that.
Jan 30 2023
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 21:54:44 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 31/01/2023 10:34 AM, Paul Backus wrote:
 So, what, we should allow @safe data structures to call
 @system allocators and simply *assume* that the allocator
 implementation won't do anything weird? And if someone writes
 their own allocator implementation, and doesn't manually check
 that their @system code provides the guarantees that the data
 structures expect, we're ok with them getting memory
 corruption in @safe code?
If you can show me papers where mechanical checking of memory allocators take place without the use of proofing assistants, I'll be very interested in it.
Once again, it seems that I have utterly failed to communicate the fundamental premise of this discussion.

This has nothing to do with verifying the allocator's implementation (which I agree must be done by hand). The problem is, if you are attempting to write a @safe data structure

1. You need the allocator to provide you with certain guarantees (i.e., it is safe to call deallocate as long as you satisfy preconditions A, B, and C).
2. If you accept arbitrary allocators supplied by users, you have no way to tell whether they provide those guarantees or not.

Forget the author of the allocator--he is completely irrelevant. What should the *data structure author* do in this scenario?

One possible answer is, "the data structure author should just assume that the guarantees are provided." This answer is not acceptable because it allows memory corruption to occur in @safe code.

Another possible answer is, "the data structure author should look for some kind of 'flag' or 'certificate' on the allocator that indicates it provides the necessary guarantees." This is a better answer, but not ideal, because it requires manual verification not just of @trusted code but also @system code, and the responsibility for that verification is divided between the data structure author and the allocator author.

Another possible answer is, "the data structure author should make a whitelist of allocators that he personally knows provide the necessary guarantees, and only trust allocators on that list." This is similar to the previous answer, but it places all of the responsibility for verification on the author of the @trusted code (i.e., the data structure author), and does not rely on allocator authors to manually verify their @system code.

Another possible answer is, "the data structure author should not accept arbitrary allocators from users in the first place." This is a more restrictive compromise, but unlike the previous ones, it allows the data structure author to be certain that his @trusted code will not cause memory corruption.

Another possible answer is, "the data structure author should rely on the allocator author to provide a @safe interface." This is the best answer, but unfortunately it is not possible without new language features.

What I would like to hear from you is, what is *your* answer to this question? What do you think the author of the *data structure* should do?
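
To make the whitelist option a bit more concrete, here is a rough sketch. `Buffer`, `isVetted`, `VettedAllocators` and `ThirdPartyAllocator` are hypothetical names; the only real APIs assumed are `Mallocator` and `GCAllocator` from `std.experimental.allocator` (with their `instance`, `allocate` and `deallocate` members) and `staticIndexOf` from `std.meta`. Which allocators belong on the list is exactly the manual audit the data structure author would have to do.

```d
import std.experimental.allocator.gc_allocator : GCAllocator;
import std.experimental.allocator.mallocator : Mallocator;
import std.meta : AliasSeq, staticIndexOf;

// Allocators the container author has personally audited (illustrative list).
alias VettedAllocators = AliasSeq!(Mallocator, GCAllocator);

// True only for allocator types that appear on the list above.
enum bool isVetted(A) = staticIndexOf!(A, VettedAllocators) != -1;

// Hypothetical owning buffer that refuses to instantiate for unknown allocators.
struct Buffer(A) if (isVetted!A)
{
    private void[] mem;

    // No copies, so the buffer holds the only reference to its block.
    @disable this(this);

    this(size_t n) @trusted
    {
        mem = A.instance.allocate(n);
    }

    // @trusted is justified only because A was audited by the container author.
    ~this() @trusted
    {
        if (mem !is null)
            A.instance.deallocate(mem);
        mem = null;
    }

    size_t length() const @safe pure nothrow @nogc
    {
        return mem.length;
    }
}

// Stand-in for a user-supplied allocator that is not on the list.
struct ThirdPartyAllocator {}

void main() @safe
{
    auto b = Buffer!Mallocator(64);
    assert(b.length == 64);

    // Unlisted allocators are rejected at compile time.
    static assert(!__traits(compiles, Buffer!ThirdPartyAllocator));
}
```

The cost is visible in the last line: a user cannot plug in their own allocator without editing the container's list.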
Jan 30 2023
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 31/01/2023 11:28 AM, Paul Backus wrote:
 What I would like to hear from you is, what is *your* answer to this
 question? What do you think the author of the *data structure* should do?
Pray that it will error or work correctly.

Unless Walter changes his opinion of DFA or says let's add a proofing engine to dmd, I do not think we are fixing this situation.

I want to see this solved. But we have very little to lean on in the literature; if we manage to do it, good chance somebody is getting a PhD.
Jan 30 2023
prev sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Monday, 30 January 2023 at 22:28:56 UTC, Paul Backus wrote:
 On Monday, 30 January 2023 at 21:54:44 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 On 31/01/2023 10:34 AM, Paul Backus wrote:
 So, what, we should allow @safe data structures to call
 @system allocators and simply *assume* that the allocator
 implementation won't do anything weird? And if someone writes
 their own allocator implementation, and doesn't manually
 check that their @system code provides the guarantees that
 the data structures expect, we're ok with them getting memory
 corruption in @safe code?
If you can show me papers where mechanical checking of memory allocators take place without the use of proofing assistants, I'll be very interested in it.
Once again, it seems that I have utterly failed to communicate the fundamental premise of this discussion. This has nothing to do with verifying the allocator's implementation (which I agree must be done by hand). [snip]
But spelling it out in more detail will help for any future DIP. I think part of the problem is the whole "can't prove a negative" of this. Or rather, being able to show that this is the smallest language change to enable @safe allocator-aware reference counting.

I was watching Walter's Q&A from the recent DConf and one point he makes is that things are much easier for him to understand with code examples. Probably goes for others as well. I think being able to show simple versions of various attempts at @safe allocator-aware RC approaches that either don't work or are awkward to use might go a long way to convincing people.
Jan 31 2023
prev sibling next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Monday, 30 January 2023 at 17:36:59 UTC, Paul Backus wrote:
 As far as I am aware, it is impossible to have all three of the 
 following:

 1. @safe containers.
 2. User-supplied allocators.
 3. No language changes.
Earlier I presented the idea to require a `@trusted` "certificate" function from the allocator. You agreed that it satisfies all of these in principle, but you wouldn't consider it in practice because it'd be embarrassing to explain for an outsider. Can you elaborate why? Maybe we can have some further ideas.
Jan 30 2023
parent Paul Backus <snarwin gmail.com> writes:
On Monday, 30 January 2023 at 19:34:05 UTC, Dukc wrote:
 On Monday, 30 January 2023 at 17:36:59 UTC, Paul Backus wrote:
 As far as I am aware, it is impossible to have all three of 
 the following:

 1. @safe containers.
 2. User-supplied allocators.
 3. No language changes.
Earlier I presented the idea to require a `@trusted` "certificate" function from the allocator. You agreed that it satisfies all of these in principle, but you wouldn't consider it in practice because it'd be embarrassing to explain for an outsider. Can you elaborate why? Maybe we can have some further ideas.
The fundamental problem with it is that there is no enforcement mechanism (other than convention) preventing an allocator from having a `@trusted` certificate and misbehaving anyway. In particular, anyone who works on the allocator after its initial implementation has to "just know" somehow (documentation? comments?) that changes to its `@system` functions must be reviewed for compliance with the `@trusted` certificate.

Requiring manual verification of `@system` code is something we are trying to move away from in D (see DIP 1035), and adopting this scheme would be a step backwards.
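
A small sketch may help show why the certificate is only a convention. All names here (`certifyDeallocate`, `isCertified`, `CertifiedMalloc`, `PlainMalloc`, `MallocImpl`, `Buffer`) are invented for illustration and are not part of any existing allocator API. Both allocators share exactly the same `@system` code; the only difference the container can see is the presence of an empty `@trusted` function.

```d
import core.stdc.stdlib : free, malloc;

// Shared @system implementation, mixed into both allocators below.
mixin template MallocImpl()
{
    static void[] allocate(size_t n) @system nothrow @nogc
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    static void deallocate(void[] block) @system nothrow @nogc
    {
        free(block.ptr);
    }
}

// No certificate: containers must treat deallocate as unverified.
struct PlainMalloc
{
    mixin MallocImpl;
}

// With certificate: the empty @trusted function is the author's promise that
// deallocate is memory-safe for unaliased blocks obtained from allocate.
struct CertifiedMalloc
{
    mixin MallocImpl;
    static void certifyDeallocate() @trusted nothrow @nogc {}
}

// True only if A.certifyDeallocate exists and is callable from @safe code.
enum bool isCertified(A) = __traits(compiles, () @safe { A.certifyDeallocate(); });

static assert(isCertified!CertifiedMalloc);
static assert(!isCertified!PlainMalloc);

// Hypothetical container that trusts any allocator carrying the certificate.
struct Buffer(A) if (isCertified!A)
{
    private void[] mem;
    @disable this(this);

    this(size_t n) @trusted { mem = A.allocate(n); }
    ~this() @trusted { A.deallocate(mem); mem = null; }
}

void main() @safe
{
    auto b = Buffer!CertifiedMalloc(32);
    static assert(!__traits(compiles, Buffer!PlainMalloc));
}
```

Nothing in the compiler connects `certifyDeallocate` to the body of `deallocate`: a later edit to `MallocImpl` changes what is being promised without touching the `@trusted` function at all, which is the maintenance hazard described above.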
Jan 30 2023
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 1/30/23 18:36, Paul Backus wrote:
 
 As far as I am aware, it is impossible to have all three of the following:
 
 1. @safe containers.
 2. User-supplied allocators.
 3. No language changes.
Well, we have @system variables now, so we can have poor man's typestate together with poor man's move semantics [1], like already proposed by ntrel and Dukc.

Why is this scheme not workable? Isn't this exactly the kind of problem (non-trivial memory safety invariant) we invented `@system` variables to solve?

(With sumtype, I guess you can even move the flags to runtime (at the cost of template bloat exponential in the number of flags) to get poor man's dependent type state.)

[1]:

import core.stdc.stdlib;

void main(){
    import std.stdio;
    void foo()@safe{
        auto ptr0=fancyMalloc(16);
        writeln(ptr0.borrow((scope ptr){ return cast(int)ptr; }));
        fancyFree(ptr0);
    }
    foo();
    void bar()@safe{
        auto ptr0=fancyMalloc(16);
        auto ptr1=ptr0.withAliasing.leak; // ok, leaking is safe
        writeln(ptr1);
    }
    bar();
    void baz()@safe{
        auto ptr0=fancyMalloc(16);
        auto ptr1=ptr0.withAliasing;
        // fancyFree(ptr1); // error, not isolated
    }
    baz();
    void qux()@safe{
        auto ptr0=fancyMalloc(16);
        auto ptr1=ptr0.withAliasing;
        // auto ptr2=ptr1.unsafeAddFlags!(PointerFlags.isolated); // error, unsafe
        // fancyFree(ptr2); // (ok)
    }
    qux();
    void flarp()@trusted{
        auto ptr0=fancyMalloc(16);
        auto ptr1=ptr0.withAliasing;
        auto ptr2=ptr1.unsafeAddFlags!(PointerFlags.isolated); // ok, and we can check it is fine
        fancyFree(ptr2); // (ok)
    }
    flarp();
    void bongo()@safe{
        auto ptr1=function()@trusted{
            auto ptr0=malloc(16);
            return ptr0.unsafeAddFlags!(PointerFlags.mallocd|PointerFlags.isolated); // ok, we can tell it is mallocd and isolated
        }();
        fancyFree(ptr1); // ok
    }
    bongo();
}

enum PointerFlags{
    none,
    mallocd=1,
    isolated=2,
}

struct Pointer(T,PointerFlags flags){
    private @system T* ptr;
    Pointer!(T,flags&~PointerFlags.isolated) withAliasing()@trusted{
        auto result=ptr;
        ptr=null;
        return typeof(return)(result);
    }
    static if(!(flags&PointerFlags.isolated)){
        T* leak()@trusted{
            auto result=ptr;
            ptr=null;
            return result;
        }
    }
    auto borrow(R)(scope R delegate(scope T*)@safe dg)@trusted{
        scope local=ptr;
        ptr=null;
        scope(exit) ptr=local;
        return dg(local);
    }
    auto borrow(R)(scope R delegate(scope T*)@system dg)@system{
        scope local=ptr;
        ptr=null;
        scope(exit) ptr=local;
        return dg(ptr);
    }
}

Pointer!(T,flags) unsafeAddFlags(PointerFlags flags,T)(ref T* ptr)@system{
    auto result=ptr;
    ptr=null;
    return typeof(return)(result);
}

Pointer!(T,newFlags|oldFlags) unsafeAddFlags(PointerFlags newFlags,T,PointerFlags oldFlags)(ref Pointer!(T,oldFlags) ptr)@system{
    auto result=ptr.ptr;
    ptr.ptr=null;
    return unsafeAddFlags!(newFlags|oldFlags)(result);
}

Pointer!(void, PointerFlags.mallocd|PointerFlags.isolated) fancyMalloc(size_t size)@trusted{
    return typeof(return)(malloc(size));
}

void fancyFree(ref Pointer!(void, PointerFlags.mallocd|PointerFlags.isolated) ptr)@trusted{
    if(!ptr.ptr) return;
    free(ptr.ptr);
    ptr.ptr=null;
}
Jan 30 2023
parent Dukc <ajieskola gmail.com> writes:
On Monday, 30 January 2023 at 23:14:57 UTC, Timon Gehr wrote:
 Well, we have @system variables now, so we can have poor man's 
 typestate together with poor man's move semantics [1], like 
 already proposed by ntrel and Dukc.

 Why is this scheme not workable? Isn't this exactly the kind of 
 problem (non-trivial memory safety invariant) we invented 
 `@system` variables to solve?

 (With sumtype, I guess you can even move the flags to runtime 
 (at the cost of template bloat exponential in the number of 
 flags) to get poor man's dependent type state.)

 [1]: [snip]
Great, thanks for building this proof of concept! It indeed looks like the way to go to me if we agree that whitelists or certificates aren't thorough enough. I definitely want a solution that requires only minor or no language changes, and this might well be it. Of course, we still have to look for weaknesses in this scheme. `SafeRefCounted` sure had its share when it was still in the works, although I'm sure you're better than me at foreseeing them.
Jan 31 2023
prev sibling parent reply patescross <patescross outlook.com> writes:
I'm pretty much convinced we need isolated. This is very similar 
to why the language as it exists today doesn't allow a library 
author to write a vector type that can be appended to, which... 
is the main reason one would use a vector to begin with.
Feb 04 2023
parent Paul Backus <snarwin gmail.com> writes:
On Saturday, 4 February 2023 at 11:33:18 UTC, patescross wrote:
 I'm pretty much convinced we need isolated. This is very 
 similar to why the language as it exists today doesn't allow a 
 library author to write a vector type that can be appended to, 
 which... is the main reason one would use a vector to begin 
 with.
Actually, Timon Gehr has pointed out a clever way that this can be done in library code, using the newly-added `-preview=systemVariables` feature: https://forum.dlang.org/post/tr9j1h$1fvd$1@digitalmars.com Doing this makes the allocator API quite ugly and cumbersome to work with, but as Richard Cattermole pointed out in his earlier messages, most users should not have to work with allocators directly.
Feb 04 2023
prev sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Monday, 30 January 2023 at 01:07:32 UTC, Timon Gehr wrote:
 On 1/28/23 16:56, Nick Treleaven wrote:
 [snip]
 Isolated could go in std.typecons.
Isolated is not sufficient, you also have to guarantee the pointer was allocated with `malloc`.
Could you explain in a bit more detail why this is the case? Thanks.
Jan 30 2023
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 1/30/23 13:19, jmh530 wrote:
 On Monday, 30 January 2023 at 01:07:32 UTC, Timon Gehr wrote:
 On 1/28/23 16:56, Nick Treleaven wrote:
 [snip]
 Isolated could go in std.typecons.
Isolated is not sufficient, you also have to guarantee the pointer was allocated with `malloc`.
Could you explain in a bit more detail why this is the case? Thanks.
https://en.cppreference.com/w/c/memory/free

"The behavior is undefined if the value of ptr does not equal a value returned earlier by malloc(), calloc(), realloc(), or aligned_alloc()."

Undefined behavior is not memory safe.
Jan 30 2023
parent jmh530 <john.michael.hall gmail.com> writes:
On Monday, 30 January 2023 at 12:26:09 UTC, Timon Gehr wrote:
 [snip]

 https://en.cppreference.com/w/c/memory/free

 "The behavior is undefined if the value of ptr does not equal a 
 value returned earlier by malloc(), calloc(), realloc(), or 
 aligned_alloc()."

 Undefined behavior is not memory safe.
Thanks.
Jan 30 2023
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Saturday, 28 January 2023 at 15:56:54 UTC, Nick Treleaven 
wrote:
 On Sunday, 22 January 2023 at 15:28:53 UTC, Atila Neves wrote:
 [...]
`isolated` would be nice, but for now we can model it with a struct so that this works:

```d
class Mallocator : IAllocator
{
    import core.stdc.stdlib : free, malloc;

    void* safeAllocate(size_t n) @trusted
    {
        return malloc(n);
    }

    void safeDeallocate(Isolated!(void*) ip) @trusted
    {
        ip.unwrap.free;
    }
}

void main()
{
    IAllocator a = new Mallocator;
    scope m = a.safeAllocate(4);
    auto ip = (() @trusted => assumeIsolated(a.safeAllocate(4)))();
    a.safeDeallocate(ip.move);
    assert(ip.unwrap == null);
}
```

Working code: https://github.com/ntrel/stuff/blob/master/typecons/isolated.d

Isolated could go in std.typecons.
I don't understand how this presents a safe interface.
Feb 01 2023
parent Dukc <ajieskola gmail.com> writes:
On Wednesday, 1 February 2023 at 17:46:17 UTC, Atila Neves wrote:
 On Saturday, 28 January 2023 at 15:56:54 UTC, Nick Treleaven 
 wrote:
 Working code:
 https://github.com/ntrel/stuff/blob/master/typecons/isolated.d

 Isolated could go in std.typecons.
I don't understand how this presents a safe interface.
See Timon's last post for demonstration.
Feb 01 2023