digitalmars.D - Should (p - q) be disallowed in safe code?
- Walter Bright (30/30) Dec 31 Consider:
- Richard (Rikki) Andrew Cattermole (3/40) Dec 31 Make it ptrdiff_t not size_t, and I'm happy.
- Walter Bright (2/3) Dec 31 My bad.
- Richard (Rikki) Andrew Cattermole (9/13) Dec 31 I wasn't correcting you, I was saying what I wanted it to do.
- user1234 (14/28) Jan 01 Yes but this has nothing to do with the subtraction. You simply
- Richard (Rikki) Andrew Cattermole (7/39) Jan 01 That sounds completely overkill.
- Vladimir Panteleev (5/6) Jan 01 I don't think so. An expression which calculates a `size_t` (or
- Timon Gehr (6/15) Jan 01 In C, subtracting pointers to different memory objects is undefined
- Walter Bright (3/9) Jan 01 Since the same front end is used, I'd be surprised if they behaved diffe...
- Timon Gehr (21/35) Jan 01 Well, that was the point. Vladimir had said that pointer subtraction is
- Walter Bright (16/37) Jan 01 That "can do whatever it wants" is a correct interpretation, but it woul...
- Richard (Rikki) Andrew Cattermole (12/15) Jan 01 That sounds a little like you're wanting to make safety designate which
- Walter Bright (13/24) Jan 01 Right. Just disallow the p-q.
- Walter Bright (4/8) Jan 02 The original C semantics were designed to map directly onto PDP-11 CPU i...
- Timon Gehr (56/108) Jan 01 Usually if the UB does anything of note it is because an attacker is
- Nick Treleaven (7/20) Jan 03 Point 6 here:
- claptrap (5/12) Jan 01 I use this all the time to iterate multiple arrays in lockstep.
- Walter Bright (3/10) Jan 01 @safe code doesn't allow pointer arithmetic, and so such code would have...
- Paul Backus (13/23) Jan 01 Pointer *subtraction* is allowed in @safe code because the result
- Timon Gehr (4/7) Jan 01 There are plenty of unsafe operations whose result is considered to be
- Walter Bright (6/19) Jan 01 Yes, that code would be safe, but it would also be garbage.
- Timon Gehr (13/24) Jan 01 It's one question whether you have to mark it `@trusted` for it to type
- Walter Bright (8/20) Jan 01 My proposal would not affect that - the frontend would diagnose p-q as a...
- Timon Gehr (12/44) Jan 01 I understand, but your latest point was "just put `@trusted` on it".
- Walter Bright (15/30) Jan 02 Let's step back a bit. I expect it to behave as a C backend would. More
- Timon Gehr (47/95) Jan 02 There is some conversion going on here that you did not mention, and in
- Walter Bright (7/7) Jan 02 It seems we are in full agreement that p-q should be disallowed in @safe...
- Timon Gehr (23/34) Jan 02 I am happy with each of these two outcomes:
- Walter Bright (16/35) Jan 02 You are obviously correct. But using known computers, it is not a memory...
- Timon Gehr (37/86) Jan 02 Underlying this admission is your utterly wrong claim, namely that it is...
- Walter Bright (48/71) Jan 02 The bottom line here is why are we arguing about this? Haven't we agreed...
- Richard (Rikki) Andrew Cattermole (2/12) Jan 02 Web assembly is segmented :(
- Walter Bright (2/3) Jan 02 They should have talked to me first!
- Richard (Rikki) Andrew Cattermole (15/19) Jan 02 I understand why they are doing it.
- Walter Bright (2/4) Jan 03 LDC has a webassembly back end, so it is not a killer.
- Richard (Rikki) Andrew Cattermole (7/12) Jan 03 You have misunderstood the situation.
- Walter Bright (2/9) Jan 03 I don't see why calls to `new` cannot be redirected to whatever WASM doe...
- Richard (Rikki) Andrew Cattermole (10/22) Jan 03 You can't do pointer arithmetic with WasmGC.
- Walter Bright (9/19) Jan 03 ```
- Richard (Rikki) Andrew Cattermole (41/65) Jan 03 Loading and storing fields work.
- Walter Bright (2/2) Jan 04 Thanks for the explanation. What it suggest to me is that a subset of D ...
- Richard (Rikki) Andrew Cattermole (6/8) Jan 04 Yes it should do, but it'll be limited enough that people will get
- H. S. Teoh (37/48) Jan 04 In the past I've managed to get a rudimentary (highly-hacked) druntime
- Adam D. Ruppe (3/5) Jan 04 https://dpldocs.info/this-week-in-arsd/Blog.Posted_2024_10_25.html
- Walter Bright (2/3) Jan 04 It'd be little different from the WASM targets for C++.
- Adam Ruppe (3/4) Jan 04 Just use opend, we ported druntime to wasm and make cross compilation
- Timon Gehr (13/16) Jan 03 You brought up some tangential points that I think are based on flawed
- Walter Bright (22/24) Jan 02 The current documentation says:
- Walter Bright (1/1) Jan 02 https://github.com/dlang/dlang.org/pull/4358
- Timon Gehr (14/23) Jan 02 This is what people intuitively assume will happen, but UB is not this.
- Walter Bright (11/22) Jan 03 I've said that it was memory safe all along. I've said this proposal is
- Timon Gehr (20/50) Jan 03 You have both said that and then also said that I am obviously correct
- Walter Bright (2/2) Jan 03 I'm just going to make p-q an error in @safe code. If someone still want...
- jmh530 (3/7) Jan 02 To what extent can D know when pointers are known to point to the
- Walter Bright (4/11) Jan 02 Some can be done trivially, such as (&c - &d). More can be discovered wi...
- Richard (Rikki) Andrew Cattermole (30/43) Jan 02 Four days ago I started implementing value tracking.
- Walter Bright (1/1) Jan 03 Sorry about that. If you ask me, it may be possible, but it sure looks i...
- Richard (Rikki) Andrew Cattermole (3/5) Jan 03 Its not your fault, its a hard problem that I don't think has any good
- Walter Bright (1/1) Jan 03 https://github.com/dlang/dmd/pull/22348
- jmh530 (4/5) Jan 04 2024 edition?
Consider:
```d
@safe
size_t distance(int* p, int* q) => p - q;
```
The difficulty here is when p and q may not be pointing into the same memory
object. If they're not, the result is nonsense:
```d
int a;
int b;
size_t distance = &b - &a;
```
The address relationship between `a` and `b` is implementation-defined, and
code like this would almost certainly be a bug.
Where this could be valid:
```d
struct S
{
    int a, b;
}
S s;
size_t distance = &s.b - &s.a;
```
So this would be valid, as the two pointers are known to point to the same
memory object.
A corollary to this would be disallowing < <= > >= comparisons between pointers.
p-q is commonplace in C code, where one traverses a loop. But in D code the
preferred way would be to use arrays.
Thoughts?
P.S. I don't recall ever having a bug with misusing `p-q`. Has anyone?
Dec 31
Make it ptrdiff_t not size_t, and I'm happy.
The loops might go bad, but hey that is what static analyzers are for ;)
Dec 31
On 12/31/2025 10:54 PM, Richard (Rikki) Andrew Cattermole wrote:
> Make it ptrdiff_t not size_t, and I'm happy.

My bad.
Dec 31
On 01/01/2026 8:19 PM, Walter Bright wrote:
> On 12/31/2025 10:54 PM, Richard (Rikki) Andrew Cattermole wrote:
>> Make it ptrdiff_t not size_t, and I'm happy.
> My bad.

I wasn't correcting you, I was saying what I wanted it to do.
```d
void func(void* a, void* b) {
    ptrdiff_t diff = b - a;
    // size_t diff = b - a; ERROR
    assert(diff >= 0, "ARGUMENTS BACKWARDS");
}
```
Dec 31
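Editor's sketch of the signedness point in the post above. The helper names `signedDistance` and `unsignedDistance` are hypothetical and not from the thread; the sketch assumes the current rules, where a pointer difference is a `ptrdiff_t` that converts implicitly to `size_t`, and both pointers are assumed to point into the same array.

```d
// Hypothetical helpers for illustration only.

// Pointer subtraction yields a ptrdiff_t: the number of elements
// between the two pointers, which is negative when p comes before q.
ptrdiff_t signedDistance(const(int)* p, const(int)* q)
{
    return p - q;
}

// The same difference returned as size_t compiles today through the
// implicit signed-to-unsigned conversion, so a negative distance
// silently wraps around to a very large value.
size_t unsignedDistance(const(int)* p, const(int)* q)
{
    return p - q;
}
```

With "arguments backwards", the signed version reports a negative count that an `assert(diff >= 0)` can catch, while the unsigned version cannot distinguish the mistake from a huge valid distance.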
On Thursday, 1 January 2026 at 07:27:04 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I wasn't correcting you, I was saying what I wanted it to do.
> ```d
> void func(void* a, void* b) {
>     ptrdiff_t diff = b - a;
>     // size_t diff = b - a; ERROR
>     assert(diff >= 0, "ARGUMENTS BACKWARDS");
> }
> ```

Yes, but this has nothing to do with the subtraction. You simply hit the implicit coercion rules there:
```d
ptrdiff_t a;
size_t b;
a = b;
b = a;
```
You need some kind of tracking/DFA/VRP to put restrictions in place. Same remark about the initial question, I would say. You can imagine some code based on aliasing where `p - q` is in the end totally fine. However, `@safe` is a good deal, I would say.
Jan 01
On 01/01/2026 9:24 PM, user1234 wrote:
> Yes, but this has nothing to do with the subtraction. You simply hit the implicit coercion rules there:
> ```d
> ptrdiff_t a;
> size_t b;
> a = b;
> b = a;
> ```
> You need some kind of tracking/DFA/VRP to put restrictions in place. Same remark about the initial question, I would say. You can imagine some code based on aliasing where `p - q` is in the end totally fine. However, `@safe` is a good deal, I would say.

That sounds completely overkill.

Disable implicit unsigned <-> signed conversion for a subtraction of pointers.

It doesn't need to be inter-statement and definitely does not need to solve exact values for the pointers, which may only be known at runtime with 100% certainty.
Jan 01
On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
> Thoughts?

I don't think so. An expression which calculates a `size_t` (or `ptrdiff_t`) value without side effects is memory-safe. What you do with the index (valid or not) would be scrutinized by the usual rules.
Jan 01
On 1/1/26 10:10, Vladimir Panteleev wrote:
> On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
>> Thoughts?
> I don't think so. An expression which calculates a `size_t` (or `ptrdiff_t`) value without side effects is memory-safe. What you do with the index (valid or not) would be scrutinized by the usual rules.

In C, subtracting pointers to different memory objects is undefined behavior, hence side-effecting. Subtracting pointers can be `@safe` iff it is always defined behavior. (Even if the defined behavior is to yield a nonsense value.)

I am not sure how GDC and LDC are currently treating this.
Jan 01
On 1/1/2026 7:18 AM, Timon Gehr wrote:
> In C, subtracting pointers to different memory objects is undefined behavior, hence side-effecting. Subtracting pointers can be `@safe` iff it is always defined behavior. (Even if the defined behavior is to yield a nonsense value.)

Getting a nonsense value is memory safe, but is almost certainly a bug.

> I am not sure how GDC and LDC are currently treating this.

Since the same front end is used, I'd be surprised if they behaved differently.
Jan 01
On 1/1/26 18:18, Walter Bright wrote:
> On 1/1/2026 7:18 AM, Timon Gehr wrote:
>> In C, subtracting pointers to different memory objects is undefined behavior, hence side-effecting. Subtracting pointers can be `@safe` iff it is always defined behavior. (Even if the defined behavior is to yield a nonsense value.)
> Getting a nonsense value is memory safe, but is almost certainly a bug. ...

Well, that was the point. Vladimir had said that pointer subtraction is free of side effects, but UB would *be* the side effect. And AFAIU according to the C standard it can be UB. This does not merely mean that the result could be nonsensical (which, for the record, would *not* be a bug in C), it means the program can do whatever it wants.

As long as it is defined behavior in D, keeping it `@safe` is perfectly fine. But differences to C and C++ may nevertheless trip up some implementations and violate memory safety, as backends were developed with C and C++ in mind.

`@safe` has to be consistent with the backend semantics. This means either making certain constructs `@system` or ensuring all backends compile them safely, or having a broken `@safe`.

>> I am not sure how GDC and LDC are currently treating this.
> Since the same front end is used, I'd be surprised if they behaved differently.

UB is mostly a glue/backend thing, it's about what the code *means*, not about how it is type checked by the frontend. And backends are often biased towards C and C++ semantics.

There are other cases, e.g., with DMD a null dereference may be a guaranteed segfault, but I think it's likely UB with GDC and LDC. It seems LDC even has the flag `-fno-delete-null-pointer-checks` to turn off UB on null pointer dereference, which would indeed indicate it is UB by default.
Jan 01
On 1/1/2026 12:47 PM, Timon Gehr wrote:
> Well, that was the point. Vladimir had said that pointer subtraction is free of side effects, but UB would *be* the side effect. And AFAIU according to the C standard it can be UB. This does not merely mean that the result could be nonsensical (which, for the record, would *not* be a bug in C), it means the program can do whatever it wants.

That "can do whatever it wants" is a correct interpretation, but it would be insane to deliberately set up a system that launched nuclear missiles upon encountering UB.

I also object to common optimizations that interpret UB as license to delete the offending code path.

> As long as it is defined behavior in D, keeping it `@safe` is perfectly fine. But differences to C and C++ may nevertheless trip up some implementations and violate memory safety, as backends were developed with C and C++ in mind. `@safe` has to be consistent with the backend semantics. This means either making certain constructs `@system` or ensuring all backends compile them safely, or having a broken `@safe`.

The backends do not, to my knowledge, have any awareness of @safe or @system.

I don't see a scenario where not allowing p-q in @safe code would have any effect on the backend.

> UB is mostly a glue/backend thing, it's about what the code *means*, not about how it is type checked by the frontend. And backends are often biased towards C and C++ semantics. There are other cases, e.g., with DMD a null dereference may be a guaranteed segfault, but I think it's likely UB with GDC and LDC. It seems LDC even has the flag `-fno-delete-null-pointer-checks` to turn off UB on null pointer dereference, which would indeed indicate it is UB by default.

DMD's optimizer can detect null pointer dereferences as a result of copy propagation, etc., and always gives a compile time error when it does.

Otherwise, it just dereferences the null pointer and whatever the CPU does with it happens.

What the proposal in this thread is about is extending the @safe semantics to not just be about memory safety, but about checking for common bugs where rewriting the code slightly to avoid it is practical.
Jan 01
On 02/01/2026 10:53 AM, Walter Bright wrote:
> What the proposal in this thread is about is extending the @safe semantics to not just be about memory safety, but about checking for common bugs where rewriting the code slightly to avoid it is practical.

That sounds a little like you're wanting to make safety designate which functions need static analysis.

However in this case it isn't required to make it disallow the operation:

``for (auto q = &array[0]; p - q; ++q)``

If ``p - q`` is a /signed/ integer that cannot implicitly cast to unsigned, it will never iterate.

Both ``size_t`` and ``ptrdiff_t`` should be built in types that cannot implicitly cast off them. Making them aliases was a mistake.

Note: @safe even with this upgrade does not track aliasing or not-aliasing. So if it is positive there is no way to know if it is the same object.
Jan 01
On 1/1/2026 4:42 PM, Richard (Rikki) Andrew Cattermole wrote:
> That sounds a little like you're wanting to make safety designate which functions need static analysis.

Possibly, but not for the p-q case.

> However in this case it isn't required to make it disallow the operation: ``for (auto q = &array[0]; p - q; ++q)``

Right. Just disallow the p-q.

> If ``p - q`` is a /signed/ integer that cannot implicitly cast to unsigned, it will never iterate.

The loop works correctly whether the difference is signed or not.

> Both ``size_t`` and ``ptrdiff_t`` should be built in types that cannot implicitly cast off them. Making them aliases was a mistake.

Andrei proposed something like that years ago. Trying to write code with this collection of same-but-different types turns out to be an awful soup, resulting in lots of casting.

I've used a language that did not have implicit casting. The result was casts everywhere, which winds up *increasing* the number of hidden bugs. C has a well-designed implicit casting system (it isn't perfect) that is a lot more flexible when one, for instance, wants to change the type of an integer.

> Note: @safe even with this upgrade does not track aliasing or not-aliasing. So if it is positive there is no way to know if it is the same object.

That's right. That's why p-q would be disallowed, whether or not it points to the same memory object.
Jan 01
On 1/1/2026 5:55 PM, Walter Bright wrote:
> I've used a language that did not have implicit casting. The result was casts everywhere, which winds up *increasing* the number of hidden bugs. C has a well-designed implicit casting system (it isn't perfect) that is a lot more flexible when one, for instance, wants to change the type of an integer.

The original C semantics were designed to map directly onto PDP-11 CPU instructions. Along with C's rise to dominance in the 80s, the design of other CPUs was adjusted to match C semantics.
Jan 02
On 1/1/26 22:53, Walter Bright wrote:
> On 1/1/2026 12:47 PM, Timon Gehr wrote:
>> Well, that was the point. Vladimir had said that pointer subtraction is free of side effects, but UB would *be* the side effect. And AFAIU according to the C standard it can be UB. This does not merely mean that the result could be nonsensical (which, for the record, would *not* be a bug in C), it means the program can do whatever it wants.
> That "can do whatever it wants" is a correct interpretation, but it would be insane to deliberately set up a system that launched nuclear missiles upon encountering UB. ...

Usually if the UB does anything of note it is because an attacker is exploiting the hole in the language semantics to, deliberately, make the program do something actively malicious that was never intended by its author. It's a common misunderstanding that these kinds of scenarios are hypothetical. UB just sucks in this way, whether you deliberately want it to or not.

> I also object to common optimizations that interpret UB as license to delete the offending code path. ...

Whether it is deliberately interpreted in any way or not, the code has to do _SOMETHING_ if the UB condition actually occurs, and it's unclear how you would specify that optimizations are supposed to maintain that specific behavior when it may not even be clear at a point in the optimization pipeline what it will end up being in the end. Often enough, it will end up being an exploitable weakness.

If you want optimizers to preserve the behavior of code that has UB in it, you have to turn any potential of UB into an optimization blocker. The optimizer's intermediate representation just does not carry the final machine semantics for expressions with UB. It's a fundamental problem of any language design with UB, not some sort of conspiracy by evil compiler developers.

UB exists because language designers and programmers want/need power (runnable low-level fast code fast) without responsibility (using formal methods such as advanced type systems).

>> As long as it is defined behavior in D, keeping it `@safe` is perfectly fine. But differences to C and C++ may nevertheless trip up some implementations and violate memory safety, as backends were developed with C and C++ in mind. `@safe` has to be consistent with the backend semantics. This means either making certain constructs `@system` or ensuring all backends compile them safely, or having a broken `@safe`.
> The backends do not, to my knowledge, have any awareness of @safe or @system.

I think this is true yet completely irrelevant to what I was saying.

> I don't see a scenario where not allowing p-q in @safe code would have any effect on the backend. ...

Not the point at all. I was reacting to Vladimir's statement that:

> I don't think so. An expression which calculates a `size_t` (or `ptrdiff_t`) value without side effects is memory-safe. What you do with the index (valid or not) would be scrutinized by the usual rules.

My point was that:

a) There does not actually seem to be any explicit documentation in the D spec about pointer subtraction. If there is, I have not found it.
b) In some popular languages, `p-q` is UB if `p` and `q` point to different memory objects.
c) It's hence possible that some D backends give UB to this expression when according to your intention they should not.
d) This scenario is not implausible, I think it already happens for null pointer dereferences that code that the frontend says is `@safe` is treated as UB by some of the backends.

>> UB is mostly a glue/backend thing, it's about what the code *means*, not about how it is type checked by the frontend. And backends are often biased towards C and C++ semantics. There are other cases, e.g., with DMD a null dereference may be a guaranteed segfault, but I think it's likely UB with GDC and LDC. It seems LDC even has the flag `-fno-delete-null-pointer-checks` to turn off UB on null pointer dereference, which would indeed indicate it is UB by default.

The flag is somewhat of a misnomer, you might have to actually look into its documentation.

> DMD's optimizer can detect null pointer dereferences as a result of copy propagation, etc., and always gives a compile time error when it does.

Sure, when you can prove that a piece of code is always wrong to execute, you can do that (and I think it's a good idea). Often you however can't.

> Otherwise, it just dereferences the null pointer and whatever the CPU does with it happens. ...

And hence you are now stuck treating pointer dereferences as a side-effecting operation. Some backends don't like doing that. Another issue is that some targets will not trap at all and just treat 0 as a valid memory address. (Less relevant for DMD's supported targets.)

> What the proposal in this thread is about is extending the @safe semantics to not just be about memory safety, but about checking for common bugs where rewriting the code slightly to avoid it is practical.

I understand, but there are more than two positions here.

Your position: `p-q` is memory safe yet might be error prone and we might want to start banning error prone constructs in `@safe` code even though it was originally meant to be strictly about memory safety.

Vladimir's position: `p-q` is memory safe, hence there is no need to reject it in `@safe` code.

My position: Wait, is `p-q` even _currently implemented_ in a memory safe way? Where is it documented? What are the backends doing? There are already cases where `@safe` code is treated as UB by some backends and `p-q` might be among these cases.
Jan 01
On Friday, 2 January 2026 at 02:00:29 UTC, Timon Gehr wrote:
> a) There does not actually seem to be any explicit documentation in the D spec about pointer subtraction. If there is, I have not found it.

Point 6 here: https://dlang.org/spec/expression.html#pointer_arithmetic

> If both operands are pointers, and the operator is -, the pointers are subtracted and the result is divided by the size of the type pointed to by the operands.

It sounds like we need to put in an undefined behaviour note. What about *RelExpression* on pointers, UB or not?

> b) In some popular languages, `p-q` is UB if `p` and `q` point to different memory objects.
> c) It's hence possible that some D backends give UB to this expression when according to your intention they should not.
> d) This scenario is not implausible, I think it already happens for null pointer dereferences that code that the frontend says is `@safe` is treated as UB by some of the backends.

Yes, as of May, that requirement is in the spec: https://dlang.org/spec/function.html#null-dereferences
Jan 03
On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
> Consider:
> ```d
> @safe
> size_t distance(int* p, int* q) => p - q;
> ```
> The difficulty here is when p and q may not be pointing into the same memory object. If they're not, the result is nonsense:

I use this all the time to iterate multiple arrays in lockstep.

size_t offset = q-p;

You access q with "p[offset]", and you just iterate p.

I tried to avoid using it but it is just faster sometimes.
Jan 01
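Editor's sketch of the lockstep pattern described in the post above, next to an index-based version that stays within `@safe`. The function names `addIntoPtr` and `addInto` are hypothetical. Whether the pointer version is well defined when `p` and `q` point into different arrays is exactly what the rest of the thread disputes, so treat it as an illustration of the idiom, not as an endorsement.

```d
// Hypothetical illustration of the offset idiom; @system by default.
// Only p is advanced, and the second sequence is reached through a
// fixed element offset computed once with pointer subtraction.
void addIntoPtr(int* p, const(int)* q, size_t n)
{
    const ptrdiff_t offset = q - p;
    foreach (_; 0 .. n)
    {
        *p += p[offset]; // same element that q would point to
        ++p;
    }
}

// Index-based alternative: the slices carry their own bounds, so no
// pointer subtraction or pointer indexing is needed.
@safe void addInto(int[] dst, const(int)[] src)
{
    assert(dst.length == src.length);
    foreach (i, ref d; dst)
        d += src[i];
}
```

The index-based form pays for bounds checks unless they are disabled or optimized away, which may be the speed difference the post alludes to.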
On 1/1/2026 7:16 AM, claptrap wrote:
> I use this all the time to iterate multiple arrays in lockstep. size_t offset = q-p; you access q with "p[offset]", and you just iterate p. I tried to avoid using it but it is just faster sometimes.

@safe code doesn't allow pointer arithmetic, and so such code would have to be marked @trusted anyway.
Jan 01
On Thursday, 1 January 2026 at 17:20:33 UTC, Walter Bright wrote:
> On 1/1/2026 7:16 AM, claptrap wrote:
>> I use this all the time to iterate multiple arrays in lockstep. size_t offset = q-p; you access q with "p[offset]", and you just iterate p. I tried to avoid using it but it is just faster sometimes.
> @safe code doesn't allow pointer arithmetic, and so such code would have to be marked @trusted anyway.

Pointer *subtraction* is allowed in @safe code because the result is an integer, and all integers are [safe values][1].

For example, this compiles using the latest release of DMD:
```d
import std.stdio;

void main() @safe
{
    int* p = new int, q = new int;
    writeln(q - p);
}
```
[1]: https://dlang.org/spec/function.html#safe-values
Jan 01
On 1/1/26 18:56, Paul Backus wrote:
> Pointer *subtraction* is allowed in @safe code because the result is an integer, and all integers are [safe values][1].

There are plenty of unsafe operations whose result is considered to be an integer by the type checker, so I don't think this justification is sufficient.
Jan 01
On 1/1/2026 9:56 AM, Paul Backus wrote:
> For example, this compiles using the latest release of DMD:
> ```d
> import std.stdio;
>
> void main() @safe
> {
>     int* p = new int, q = new int;
>     writeln(q - p);
> }
> ```
> [1]: https://dlang.org/spec/function.html#safe-values

Yes, that code would be safe, but it would also be garbage.

I was referring more to code like this:
```d
char* p = &array[array.length];
for (auto q = &array[0]; p - q; ++q)
    ...
```
Jan 01
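Editor's sketch of how a loop of the shape shown in the post above is usually written over a slice, so that no `p - q` termination test is needed. The function `countSpaces` and its task are hypothetical, chosen only to give the loop a body.

```d
// Hypothetical example: count the spaces in a character array.
// foreach over the slice visits each element once and stops at the
// slice's own length, so no end pointer is formed or subtracted.
@safe size_t countSpaces(const(char)[] array)
{
    size_t n;
    foreach (c; array)
    {
        if (c == ' ')
            ++n;
    }
    return n;
}
```

This is the "use arrays" style the original post calls preferred; it compiles as `@safe` because it never manipulates raw pointers.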
On 1/1/26 18:20, Walter Bright wrote:
> On 1/1/2026 7:16 AM, claptrap wrote:
>> I use this all the time to iterate multiple arrays in lockstep. size_t offset = q-p; you access q with "p[offset]", and you just iterate p. I tried to avoid using it but it is just faster sometimes.
> @safe code doesn't allow pointer arithmetic, and so such code would have to be marked @trusted anyway.

It's one question whether you have to mark it `@trusted` for it to type check, it's another question whether you are allowed to mark it `@trusted` (i.e., whether it is actually memory safe).

In C, I think adding an integer to a pointer aiming to get a result that points to a different memory object entirely would just be UB.

I.e., there is some potential for `q+(p-q)` to do something other than give you `p` unless glue code and backends are careful to handle it as intended.

As far as I can tell, such nuances are not documented in the D spec and so it would be defensible for GDC and LDC to assume it's just supposed to mimic C behavior. Whether the backends actually do end up breaking assumptions like they would be allowed to is still another question.
Jan 01
On 1/1/2026 12:57 PM, Timon Gehr wrote:
> It's one question whether you have to mark it `@trusted` for it to type check, it's another question whether you are allowed to mark it `@trusted` (i.e., whether it is actually memory safe).

@trusted only applies to the interface, not the code itself.

> In C, I think adding an integer to a pointer aiming to get a result that points to a different memory object entirely would just be UB.

You're right.

> I.e., there is some potential for `q+(p-q)` to do something other than give you `p` unless glue code and backends are careful to handle it as intended.

My proposal would not affect that - the frontend would diagnose p-q as an error in @safe code.

> As far as I can tell, such nuances are not documented in the D spec and so it would be defensible for GDC and LDC to assume it's just supposed to mimic C behavior. Whether the backends actually do end up breaking assumptions like they would be allowed to is still another question.

I designed the semantics of D fully aware of the reality that the usable backends (including mine) were designed for C, and that to not do so would be language suicide. (And it's not just the backends, there are the debuggers, etc.)
Jan 01
On 1/1/26 23:14, Walter Bright wrote:
> On 1/1/2026 12:57 PM, Timon Gehr wrote:
>> It's one question whether you have to mark it `@trusted` for it to type check, it's another question whether you are allowed to mark it `@trusted` (i.e., whether it is actually memory safe).
> @trusted only applies to the interface, not the code itself.
>> In C, I think adding an integer to a pointer aiming to get a result that points to a different memory object entirely would just be UB.
> You're right.
>> I.e., there is some potential for `q+(p-q)` to do something other than give you `p` unless glue code and backends are careful to handle it as intended.
> My proposal would not affect that - the frontend would diagnose p-q as an error in @safe code. ...

I understand, but your latest point was "just put `@trusted` on it".

Let's say the frontend now treats `p-q` as `@system`, and there is not even any documentation of what its semantics is supposed to be.

Do you believe with this background, alternative backends will in the future be more likely to:

- treat `p-q` as UB when different memory objects are involved
- treat `p-q` as defined behavior when different memory objects are involved

I just think the overall effect of this will be to cause confusion about what is allowed among all parties involved. I think it's better to stick to banning language constructs from `@safe` if they can actually exhibit UB.

>> As far as I can tell, such nuances are not documented in the D spec and so it would be defensible for GDC and LDC to assume it's just supposed to mimic C behavior. Whether the backends actually do end up breaking assumptions like they would be allowed to is still another question.
> I designed the semantics of D fully aware of the reality that the usable backends (including mine) were designed for C, and that to not do so would be language suicide. (And it's not just the backends, there are the debuggers, etc.)

And yet it seems for `p-q` you differed.
Jan 01
On 1/1/2026 6:11 PM, Timon Gehr wrote:
> I understand, but your latest point was "just put `@trusted` on it". Let's say the frontend now treats `p-q` as `@system`, and there is not even any documentation of what its semantics is supposed to be.

Its semantics are: subtract q from p and divide by the size of the pointed-to type.

> Do you believe with this background, alternative backends will in the future be more likely to:
> - treat `p-q` as UB when different memory objects are involved
> - treat `p-q` as defined behavior when different memory objects are involved

Let's step back a bit. I expect it to behave as a C backend would.

More precisely, I have read the C/C++ memory model specification. It is very carefully written and well done. I requested a license to copy it to use in the D specification, but my request was ignored. I could rewrite it to an equivalent definition, but that's a lot of work.

But still, D is going to adhere to it. It works, everyone understands it, and the existing backends are carefully tuned to match it.

All my proposal does is disallow pointer subtraction in @safe code. Code generation is not affected in any material way.

It's the same thing as disallowing p+=1 in @safe code.

The memory model does not change.

> I just think the overall effect of this will be to cause confusion about what is allowed among all parties involved. I think it's better to stick to banning language constructs from `@safe` if they can actually exhibit UB.

Isn't that what I proposed?

> And yet it seems for `p-q` you differed.

How did I differ? I am confused.
Jan 02
On 1/2/26 18:53, Walter Bright wrote:On 1/1/2026 6:11 PM, Timon Gehr wrote:There is some conversion going on here that you did not mention, and in C the subtraction is sometimes invalid. I understand how to subtract pointers in e.g. x86 assembly, but the abstract semantics in a high-level language is a different thing. E.g., there is no such thing as a "memory object" at the assembly level.On 1/1/26 23:14, Walter Bright wrote: I understand, but your latest point was "just put ` trusted` on it". Let's say the frontend now treats `p-q` as ` system`, and there is not even any documentation of what its semantics is supposed to be.It's semantics are subtract q from p and divide by the size of the pointed to type. ...(What any given C backend does _in practice_ is yet another question.) But it seems you'd like it to be UB sometimes. Then it must be ` system`.Do you believe with this background, alternative backends will in the future be more likely to: - treat `p-q` as UB when different memory objects are involved - treat `p-q` as defined behavior when different memory objects are involvedLet's step back a bit. I expect it to behave as a C backend would.More precisely, I have read the C/C++ memory model specification. It is very carefully written and well done. I requested a license to copy it to use in the D specification, but my request was ignored. I could rewrite it to an equivalent definition, but that's a lot of work. ...That's not really the point of contention, if you are saying "D pointer arithmetic semantics is like C", that's a sufficient specification as far as I am concerned. And then it immediately follows that `p-q` cannot be allowed in ` safe` code.But still, D is going to adhere to it. It works, everyone understands it, and the existing backends are carefully tuned to match it. ...Ok.All my proposal does is disallow pointer subtraction in safe code. Code generation is not affected in any material way. 
...The point of contention is really not whether banning a construct will affect codegen. The actual dependency is: type checking <- semantics -> codegen However, if you allow `p-q` in ` safe` code, assuming logical consistency, we can infer an intent about semantics that will put certain restrictions on code generation.It's the same thing as disallowing p+=1 in safe code.Maybe to you this is the same, but to me `p-q` and `p+=1` are materially different: one yields an integer, the other one yields a potentially invalid pointer. It is conceivable _in principle_ to have a language semantics where `p-q` is defined behavior.The memory model does not change. ...That's fine, but to allow `p-q` in ` safe` code with C semantics is inconsistent with _the definition of ` safe`_. And now you are saying that this is the _current behavior_. It seems something is broken, and fixing it is a _design problem_. There are two different ways to fix it: - Make cross-memory-object `p-q` implementation-defined (as you claimed in your OP was already the case), differing from C. - Make cross-memory-object `p-q` UB (as you are claiming now is already the case), then ban `p-q` from ` safe` code. You can't ignore the intended semantics of your programming constructs when deciding if they can be ` safe`, even if changing the type checker to consider something ` safe` or not does not have a material effect on code generation by itself.I am not able to tell, which is the problem. You are saying contradictory things. You so far made all of these claims: - cross-memory-object `p-q` is implementation-defined in D - `p-q` in D is like in C - cross-memory-object `p-q` is UB in C. One of these three statements must be false. I think the last one is correct.I just think the overall effect of this will be to cause confusion about what is allowed among all parties involved. I think it's better to stick to banning language constructs from ` safe` if they can actually exhibit UB.Isn't that what I proposed? 
...`p-q` is sometimes UB in C and hence not memory safe. You said `p-q` is memory safe in D. Hence it would have to be different. There is no such thing as "UB yet memory safe".And yet it seems for `p-q` you differed.How did I differ? I am confused.
Jan 02
It seems we are in full agreement that p-q should be disallowed in @safe code, which is my proposal here. BTW, p-q is not a memory safety issue. At worst you get an integer result that is an unpredictable value. Yes, I am suggesting expanding the scope of @safe. `i<<j` can also result in nonsense if `j>=32`. But it is not unsafe. Given the pervasiveness of C, it would be insanity for a CPU to do anything other than seg fault or produce a random result.
Jan 02
On 1/2/26 22:03, Walter Bright wrote:
> It seems we are in full agreement that p-q should be disallowed in @safe code, which is my proposal here.
> ...

I am happy with each of these two outcomes:

1. `p-q` is `@safe`, implementation-defined.
2. `p-q` can be UB, must be `@system`.

So, works for me.

> BTW, p-q is not a memory safety issue.

Any type of UB is a memory safety issue.

> At worst you get an integer result that is an unpredictable value.

No, _at worst_ you get e.g. the nuclear launch thing you mentioned (or worse). Undefined semantics is starkly distinct from nondeterministic semantics. Any assumption that any type of UB is benign must rely on additional information about specific backends. So what you claim may be true with DMD, but that is about the extent of it.

> Yes, I am suggesting expanding the scope of @safe.
> ...

As long as `@safe` code consistently bans UB, the discussion of whether banning UB from `@safe` code is an expansion of the scope of `@safe` is mostly a philosophical one. I am happy if `@safe` code disallows language constructs that can cause UB. I think you are completely wrong to claim that this is an expansion of the scope of `@safe`, but I will not lose any sleep over that part.

> `i<<j` can also result in nonsense if `j>=32`.

It is UB in C.

> But it is not unsafe.

UB implies unsafe. If it is not unsafe, it is not UB (e.g., in Java it is safe and hence not UB).

> Given the pervasiveness of C, it would be insanity for a CPU to do anything other than seg fault or produce a random result.

I would expect a CPU to just do `i<<(j&31)`. The C abstract machine is however not the CPU.
Jan 02
On 1/2/2026 2:54 PM, Timon Gehr wrote:On 1/2/26 22:03, Walter Bright wrote:You are obviously correct. But using known computers, it is not a memory safety measure. I don't see any reason anyone would implement p-q such that it trashes memory or sets the CPU on fire. Maybe what actually happens should be documented, to make it "implementation defined", but I'm not in a position to authoritatively document what CPUs do. Dereferencing random pointers, on the other hand, can realistically corrupt memory. This is why pointer arithmetic is not allowed in safe code.It seems we are in full agreement that p-q should be disallowed in safe code, which is my proposal here. ...I am happy with each of these two outcomes: 1. `p-q` is ` safe`, implementation-defined. 2. `p-q` can be UB, must be ` system`. So, works for me.BTW, p-q is not a memory safety issue.Any type of UB is a memory safety issue.Any assumption that any type of UB is benign must rely on additional information about specific backends. So what you claim may be true with DMD, but that is about the extent of it.I can't see a professionally designed CPU catching fire or corrupting memory by subtracting two unrelated pointers. One would have to add more transistors to make that happen. Nobody would buy such a machine. Current CPUs are what they are. We live with that, and we trade off performance for some level of unpredictable failure.I would expect a CPU to just do `i<<(j&31)`.The X86_64 and Aarch64 give different results, I ran into that bug.The C abstract machine is however not the CPU.CPU design has very much followed C semantics since the 80s. Unfortunately, the C spec didn't nail down certain behaviors, and so we have different behaviors.
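The `i<<j` disagreement above can be sidestepped in source code by masking the shift count explicitly, which is the behavior Timon says he would expect from a CPU. A minimal sketch, assuming 32-bit operands; the helper name `shiftLeftMasked` is hypothetical and not part of druntime or Phobos:

```d
/// Hypothetical helper: shifts a 32-bit value left with the count kept in 0 .. 31.
uint shiftLeftMasked(uint value, uint count) @safe pure nothrow @nogc
{
    // Masking with 31 keeps the count below the operand width,
    // so the result no longer depends on what the target does for large counts.
    return value << (count & 31);
}

unittest
{
    assert(shiftLeftMasked(1, 1) == 2);
    assert(shiftLeftMasked(1, 33) == 2); // 33 & 31 == 1
    assert(shiftLeftMasked(1, 32) == 1); // 32 & 31 == 0
}
```

Whether masking is the result a given program actually wants is a separate question; the sketch only shows that the count can be kept in range without relying on the hardware.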
Jan 02
On 1/3/26 01:00, Walter Bright wrote:On 1/2/2026 2:54 PM, Timon Gehr wrote:Underlying this admission is your utterly wrong claim, namely that it is a theoretical issue without practical significance.On 1/2/26 22:03, Walter Bright wrote:You are obviously correct.It seems we are in full agreement that p-q should be disallowed in safe code, which is my proposal here. ...I am happy with each of these two outcomes: 1. `p-q` is ` safe`, implementation-defined. 2. `p-q` can be UB, must be ` system`. So, works for me.BTW, p-q is not a memory safety issue.Any type of UB is a memory safety issue.But using known computers, it is not a memory safety measure.What "known computers" are doing at the machine level is only part of the puzzle. You can't just ignore "known compilers". This is not about hardware.I don't see any reason anyone would implement p-q such that it trashes memory or sets the CPU on fire.Compiler passes just do what they do, assuming things like that if you see `p-q` then `p` and `q` are pointing to the same memory object. Garbage in, garbage out. Wrong assumptions entering optimizers can and do cause befuddling miscompilation. The optimizer does not care to explicitly trash your memory on `p-q`, it's just a side effect of completely disregarding the case where `p` and `q` are unrelated.Maybe what actually happens should be documented, to make it "implementation defined", but I'm not in a position to authoritatively document what CPUs do. ...UB does not care about what CPUs do. Even saying "it will do whatever the CPU does in this and this situation" is much, much safer than saying "this is UB". However, most backends made for C will not be able to implement this semantics while still performing optimizations.Dereferencing random pointers, on the other hand, can realistically corrupt memory. This is why pointer arithmetic is not allowed in safe code. 
...`p-q` in a C program can _realistically_ corrupt memory even if the CPU will never corrupt memory when subtracting addresses. This is not just a theoretical problem, UB is UB and it has caused problems in practice.You absolutely can make a more efficient CPU by adding UB to it that can cause it to destroy itself or corrupt other components of the system if you run the wrong program. Professionals just indeed don't do that, because for some reason hardware reliability is taken seriously while software reliability is not. CPUs come with manufacturer warranties, software comes with EULAs that read "ABSOLUTELY NO WARRANTY OF FITNESS FOR ANY PARTICULAR PURPOSE". CPU manufacturers are using formal methods to verify their designs.Any assumption that any type of UB is benign must rely on additional information about specific backends. So what you claim may be true with DMD, but that is about the extent of it.I can't see a professionally designed CPU catching fire or corrupting memory by subtracting two unrelated pointers. One would have to add more transistors to make that happen.Nobody would buy such a machine. ...This is not about the CPU, it's about compilers.Current CPUs are what they are. We live with that, and we trade off performance for some level of unpredictable failure.The CPU does not have a concept of "memory object" or "different memory objects". It usually does not even distinguish addresses from other machine-word integers.I would expect a CPU to just do `i<<(j&31)`.The X86_64 and Aarch64 give different results, I ran into that bug.The C abstract machine is however not the CPU.CPU design has very much followed C semantics since the 80s.Unfortunately, the C spec didn't nail down certain behaviors, and so we have different behaviors.This is analogous to implementation-defined behavior, not undefined behavior. The C spec has undefined behavior, it is not saying "do what the CPU does", it is saying "do whatever is expedient, e.g. 
so to make the program run fast".
Jan 02
The bottom line here is why are we arguing about this? Haven't we agreed that p-q should be disallowed in @safe code?

The rest of this message you can ignore if you like.

---------------------

On 1/2/2026 4:46 PM, Timon Gehr wrote:
> This is not about hardware.

Good, we can move on from that issue!

> The optimizer does not care to explicitly trash your memory on `p-q`, it's just a side effect of completely disregarding the case where `p` and `q` are unrelated.

```
int i,j;
p = &i;
q = &j;
x = p - q;
```

The compiler can detect that p-q would be undefined behavior. A sane compiler would issue an error message upon such detection. Note that the C11 spec says not doing a "shall" means undefined behavior. Taking that literally means any syntax/semantic error in your code can legitimately cause the compiler to generate undefined behavior. But not a sane compiler.

And yes, I oppose optimizers that detect UB and just delete it. That's a disservice to the users, who find out the hard way about this behavior, rather than getting a useful error message.

If the compiler does not detect that error (which will be most cases), then it will do the reasonable thing and just subtract the two numbers, which will not cause memory corruption in any mainstream CPU.

> `p-q` in a C program can _realistically_ corrupt memory even if the CPU will never corrupt memory when subtracting addresses. This is not just a theoretical problem, UB is UB and it has caused problems in practice.

I know, but I haven't seen an example of it for `p-q`. It would be interesting if you could devise one! The UB problems I've seen were for other constructions.

> You absolutely can make a more efficient CPU by adding UB to it that can cause it to destroy itself or corrupt other components of the system if you run the wrong program.

I don't know if that is possible for `p-q`. It's just a subtraction. It may very well be possible for other UBs.

> Professionals just indeed don't do that, because for some reason hardware reliability is taken seriously while software reliability is not.

The reason is pretty simple. Remember the disaster with the Intel Pentium floating point bug? Wow was that expensive! I bore some of that cost because I had to add workarounds to the code generator. Software updates are a lot cheaper than having to pry out everyone's CPU chip and replace it, and even so, compilers had to assume they were running on a bad CPU.

> CPUs come with manufacturer warranties, software comes with EULAs that read "ABSOLUTELY NO WARRANTY OF FITNESS FOR ANY PARTICULAR PURPOSE".

The software industry would cease to exist without that clause.

> CPU manufacturers are using formal methods to verify their designs.

Formal methods have bugs, too. Though I agree that formal methods are highly useful. I know how to set up DFA and such and get them right, but I can't say I have expertise in formal methods. For example, I don't know how to prove that DFA converges to a solution, though I know it does, because the paper I learned it from says they proved it :-) and have never found it to not be true.

Full disclosure: I have no formal education in computer science, which you have surely inferred by now!

> The CPU does not have a concept of "memory object" or "different memory objects". It usually does not even distinguish addresses from other machine-word integers.

It does with the segmented memory system of the IBM PC, and the banked memory card add-ons. I wrote a software virtual memory system using banked memory and segment registers. You didn't really want to use an offset larger than the memory allocated to that segment! But those designs are all obsolete now and irrelevant.

The commercial reality is that, starting in the 80s, CPU designs changed to be very friendly to actual C behavior.

>> Unfortunately, the C spec didn't nail down certain behaviors, and so we have different behaviors.
> This is analogous to implementation-defined behavior, not undefined behavior. The C spec has undefined behavior, it is not saying "do what the CPU does", it is saying "do whatever is expedient, e.g. so to make the program run fast".

The C spec doesn't say anything about expedience or speed (that I recall).
Jan 02
On 03/01/2026 3:29 PM, Walter Bright wrote:
>> The CPU does not have a concept of "memory object" or "different memory objects". It usually does not even distinguish addresses from other machine-word integers.
> It does with the segmented memory system of the IBM PC, and the banked memory card add-ons. I wrote a software virtual memory system using banked memory and segment registers. You didn't really want to use an offset larger than the memory allocated to that segment! But those designs are all obsolete now and irrelevant.

Web assembly is segmented :(
Jan 02
On 1/2/2026 6:37 PM, Richard (Rikki) Andrew Cattermole wrote:
> Web assembly is segmented :(

They should have talked to me first!
Jan 02
On 03/01/2026 4:24 PM, Walter Bright wrote:On 1/2/2026 6:37 PM, Richard (Rikki) Andrew Cattermole wrote:I understand why they are doing it. Its not like a traditional cpu ISA, its all typed. The killer though for D is you can't get a pointer with whatever offset you want into a GC object. There are some improvements being worked on: https://github.com/WebAssembly/memory-control/blob/main/proposals/memory-control/Overview.md https://github.com/WebAssembly/multibyte-array-access/blob/main/proposals/multibyte-array-access/Overview.md But what we'd need to take full advantage is a reference type that can point to whatever segment of memory + an arbitrary offset, and do arithmetic on it. Possible to do that due to it all being typed and JIT'd. Funnily enough I watched a video on Web Assembly's GC today, left a comment about how its a bit of a disappointment that it is DOA for us. https://www.youtube.com/watch?v=nbqjDEaRkVIWeb assembly is segmented :(They should have talked to me first!
Jan 02
On 1/2/2026 7:57 PM, Richard (Rikki) Andrew Cattermole wrote:
> The killer though for D is you can't get a pointer with whatever offset you want into a GC object.

LDC has a webassembly back end, so it is not a killer.
Jan 03
On 04/01/2026 8:30 AM, Walter Bright wrote:On 1/2/2026 7:57 PM, Richard (Rikki) Andrew Cattermole wrote:You have misunderstood the situation. As far as GC is concerned we are on our own and stuck on linear memory aka sbrk, we cannot use WasmGC with our pointers. Upstream ldc does not have runtime supported and I'm not sure I'd even suggest the -betterC support as acceptable. Having the target enabled isn't the same thing as being a supported target.The killer though for D is you can't get a pointer with whatever offset you want into a GC object.LDC has a webassembly back end, so it is not a killer.
Jan 03
On 1/3/2026 6:03 PM, Richard (Rikki) Andrew Cattermole wrote:
> As far as GC is concerned we are on our own and stuck on linear memory aka sbrk, we cannot use WasmGC with our pointers. Upstream ldc does not have runtime supported and I'm not sure I'd even suggest the -betterC support as acceptable. Having the target enabled isn't the same thing as being a supported target.

I don't see why calls to `new` cannot be redirected to whatever WASM does?
Jan 03
On 04/01/2026 3:30 PM, Walter Bright wrote:On 1/3/2026 6:03 PM, Richard (Rikki) Andrew Cattermole wrote:You can't do pointer arithmetic with WasmGC. No subtraction, no getting pointers to fields, nothing like that. That is the GC offering currently. For the linear memory, its a memory mapper only, sbrk. Oh and you can have multiple linear memories that you have to keep track what the offset is actually for when dereferencing. They are typed entirely differently, you cannot mix them. It is exactly like near vs far pointers. Basically you're on your own as a compiler developer.As far as GC is concerned we are on our own and stuck on linear memory aka sbrk, we cannot use WasmGC with our pointers. Upstream ldc does not have runtime supported and I'm not sure I'd even suggest the -betterC support as acceptable. Having the target enabled isn't the same thing as being a supported target.I don't see why calls to `new` cannot be redirected to whatever WASM does?
Jan 03
On 1/3/2026 7:30 PM, Richard (Rikki) Andrew Cattermole wrote:``` struct S { int a; } S* s = new S(); s.a = 3; ``` What's the problem?I don't see why calls to `new` cannot be redirected to whatever WASM does?You can't do pointer arithmetic with WasmGC. No subtraction, no getting pointers to fields, nothing like that. That is the GC offering currently.For the linear memory, its a memory mapper only, sbrk. Oh and you can have multiple linear memories that you have to keep track what the offset is actually for when dereferencing.I don't get it.They are typed entirely differently, you cannot mix them. It is exactly like near vs far pointers.??
Jan 03
On 04/01/2026 6:28 PM, Walter Bright wrote:On 1/3/2026 7:30 PM, Richard (Rikki) Andrew Cattermole wrote:Loading and storing fields work. But this doesn't when using the Wasm GC: ``int* ptr = &s.a;`` Or this: ```d func(s.a); void func(ref int); `````` struct S { int a; } S* s = new S(); s.a = 3; ``` What's the problem?I don't see why calls to `new` cannot be redirected to whatever WASM does?You can't do pointer arithmetic with WasmGC. No subtraction, no getting pointers to fields, nothing like that. That is the GC offering currently.Ahhh, I see. Here is the man page for bsd 2.11 which is the last BSD (that is actively in use by retro community) to run on PDP-11: https://man.freebsd.org/cgi/man.cgi?query=sbrk&apropos=0&sektion=0&manpath=2.11+BSD&arch=default&format=html It was removed in Posix 2001. https://en.wikipedia.org/wiki/Sbrk This is how memory is mapped into a process and then is cut up and returned by memory allocators like malloc. For an example of this, open K&R C Programming Language 2nd edition to page 185. It has an example malloc implementation that uses sbrk. This isn't how its done today, these days memory is mapped using mmap instead, sbrk isn't an option. Web assembly folks however decided to do it the way we did it in the 70's before MMU's were a thing.For the linear memory, its a memory mapper only, sbrk. Oh and you can have multiple linear memories that you have to keep track what the offset is actually for when dereferencing.I don't get it.Okay, I think I understand the confusion. Web assembly isn't a byte code like x86 is. It is fully typed, a reference to a GC object is different to a pointer to linear memory (sbrk). In pseudo code: ``` struct S { int field; } linear(S*) l = cast(S*)linear_alloc(4); l++; // ok gc(S) g = new S; int* ptr = &g.field; // error no instruction to do this l = g; // error g = l; // error l - l; // ok g - g; // error no instruction to do this ```They are typed entirely differently, you cannot mix them. 
It is exactly like near vs far pointers.??
Jan 03
Thanks for the explanation. What it suggests to me is that a subset of D will work perfectly fine with WASM.
Jan 04
On 05/01/2026 1:33 PM, Walter Bright wrote:
> Thanks for the explanation. What it suggests to me is that a subset of D will work perfectly fine with WASM.

Yes it should do, but it'll be limited enough that people will get annoyed with it rather fast, at the very least I would. Not worth my time building up a new backend and a new runtime for it. The whole time it'll be nope can't do that, or that or that... Ugh me no like.
Jan 04
On Mon, Jan 05, 2026 at 01:48:25PM +1300, Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:On 05/01/2026 1:33 PM, Walter Bright wrote:In the past I've managed to get a rudimentary (highly-hacked) druntime running in WASM, with bare-minimum support for module ctors and JS interface. It's quite comfortable to use, except for memory allocation. Problem is, either you have to port the entire GC implementation to WASM, which will take up a LOT of code (i.e., slow loading of your project over the web), and require gobs of memory to run (your memory requirements will go way up, even for the simplest of modules), or you have to live with completely no GC, or some hackish in-between. For things like frame-based animated games, you could get away with per-frame allocation, i.e., allocate everything statically before the main loop, then during the main loop all allocations only last until the end of the frame, after that it's thrown away. While it will work, it lacks the comfort of programming with full GC support. You couldn't just use standard D features like delegates and closures without worrying about lifetime issues. Things like threading and other advanced features will of course be very limited as well. I haven't had the motivation to actually port the GC to WASM, because it adds so much code that it becomes a bigger project than the target app itself. I ended up going back to JS for web projects just to avoid having to grapple with these issues. Dreamed about writing a D to JS translator, actually, just haven't gotten around to it yet. :-P WASM GC is a thing, but it requires treating GC references as separate types from normal pointers, which D's memory model just doesn't fit in well with. Host-managed GC'd memory is also treated differently from linear memory; the layout of the object must be known to the host so generic pointers and unions are unsupported. 
It also requires LLVM support if you're using LDC, but AFAIK LLVM doesn't have full support for WASM GC yet. IOW, WASM GC imposes restrictions that are incompatible with D's memory model, so it will be very hard to work with it. The only alternative for full GC support is to port the GC itself into WASM. As I said, it greatly increases the payload size and memory requirements, and also won't be as efficient as the host browser's GC. All in all, a suboptimal situation. T -- First Rule of History: History doesn't repeat itself -- historians merely repeat each other.Thanks for the explanation. What it suggest to me is that a subset of D will work perfectly fine with WASM.Yes it should do, but it'll be limited enough that people will get annoyed with it rather fast, at the very least I would. Not worth my time building up a new backend and a new runtime for it. The whole time it'll be nope can't do that, or that or that... Ugh me no like.
Jan 04
On Monday, 5 January 2026 at 01:17:11 UTC, H. S. Teoh wrote:
> In the past I've managed to get a rudimentary (highly-hacked) druntime running in WASM

https://dpldocs.info/this-week-in-arsd/Blog.Posted_2024_10_25.html

OpenD did most of it successfully in 2024.
Jan 04
On 1/4/2026 4:48 PM, Richard (Rikki) Andrew Cattermole wrote:
> The whole time it'll be nope can't do that, or that or that... Ugh me no like.

It'd be little different from the WASM targets for C++.
Jan 04
> I haven't had the motivation to actually port the GC to WASM,

Just use opend, we ported druntime to wasm and make cross compilation easy. See https://dpldocs.info/this-week-in-arsd/Blog.Posted_2024_10_25.html

Yeah it is a lil bloated, like a megabyte download, but meh.
Jan 04
On 1/3/26 03:29, Walter Bright wrote:The bottom line here is why are we arguing about this?You brought up some tangential points that I think are based on flawed reasoning.Haven't we agreed that p-q should be disallowed in safe code?With the semantics you clarified it is intended to have, it must indeed be ` system`.And yes, I oppose optimizers that detect UB and just delete it.The optimizers don't crave or need your approval, all they need is your specification that it is UB. You are thereby inviting them to do this. Your UB is their dead code. And it helps them delete real dead code that they otherwise would not be able to detect. They will not stop doing this unless the language stops giving them UB to exploit. If you don't mean UB, don't say UB. There are some claims in your last post with which I disagree, but as I said, I will not sacrifice sleep in order to argue against everything.
Jan 03
On 1/1/2026 6:11 PM, Timon Gehr wrote:
> Let's say the frontend now treats `p-q` as `@system`, and there is not even any documentation of what its semantics is supposed to be.

The current documentation says:

"If both operands are pointers, and the operator is -, the pointers are subtracted and the result is divided by the size of the type pointed to by the operands. In this calculation the assumed size of void is one byte. It is an error if the pointers point to different types. The type of the result is ptrdiff_t."

https://dlang.org/spec/expression.html#pointer_arithmetic

C11 says:

"When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements."

and:

"The behavior is undefined in the following circumstances: A ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint is violated (clause 4)."

In general, it is not possible for the compiler to ensure two pointers point to the same object without expensive instrumentation added to the code. The practical effect is to assume they do, subtract the values, and divide by the size of the type.

The only thing D can do is in @safe code simply disallow p-q, as there are good alternatives to do the equivalent thing.
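For the case both quoted specifications agree on, where the two pointers refer to elements of the same array, the subtraction yields the difference of the subscripts. A minimal sketch of that case; the helper `indexDistance` is a hypothetical name used only for illustration:

```d
/// Hypothetical helper: element distance between two indices of one slice.
ptrdiff_t indexDistance(int[] arr, size_t i, size_t j) @system
{
    // Both pointers are derived from the same slice, and the indexing is
    // bounds-checked, so this is the well-defined same-array case.
    int* p = &arr[i];
    int* q = &arr[j];
    auto d = q - p; // byte difference divided by int.sizeof
    static assert(is(typeof(d) == ptrdiff_t)); // result type per the D spec text quoted above
    return d;
}

unittest
{
    int[] a = [10, 20, 30, 40, 50, 60, 70, 80];
    assert(indexDistance(a, 1, 6) == 5);
    assert(indexDistance(a, 6, 1) == -5);
}
```

When the indices are already known, as here, `j - i` gives the same number without any pointers, which is one of the alternatives available to `@safe` code.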
Jan 02
On 1/2/26 20:51, Walter Bright wrote:... In general, it is not possible for the compiler to ensure two pointers point to the same object without expensive instrumentation added to the code. The practical effect is to assume they do, subtract the values, and divide by the size of the type. ...This is what people intuitively assume will happen, but UB is not this. UB is notoriously prone to confusing programmers as well as compiler writers. One memory-safe alternative semantics to UB would be: if arguments point to different memory objects, you may get any result value. This can still be implemented by your "practical effect" above, but now it's memory safe (in isolation). You could have an even stronger semantics that also guarantees things like `p is p+(q-p)`. The source and target semantics are the only things the optimizer really tends to care about.The only thing D can do is in safe code simply disallow p-q, as there are good alternatives to do the equivalent thing.It's absolutely not the only possible thing. It is just one way (the C way) to deal with the issue.
Jan 02
On 1/2/2026 3:17 PM, Timon Gehr wrote:One memory-safe alternative semantics to UB would be: if arguments point to different memory objects, you may get any result value. This can still be implemented by your "practical effect" above, but now it's memory safe (in isolation).I've said that it was memory safe all along. I've said this proposal is extending safe beyond memory safety to include bug detection of other things like p-q.You could have an even stronger semantics that also guarantees things like `p is p+(q-p)`.I don't see any particular use for recognizing that. There are an infinite number of patterns that can be recognized, it's only useful to recognize ones that occur commonly. BTW, the optimizer does recognize i+(j-i) as being just j, and it will do it for pointers to 1 byte objects. For pointers to int, the intermediate code looks like p+((q-p)/4)*4 which is not recognized.Please explain the other ways.The only thing D can do is in safe code simply disallow p-q, as there are good alternatives to do the equivalent thing.It's absolutely not the only possible thing. It is just one way (the C way) to deal with the issue.
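The intermediate form `p+((q-p)/4)*4` mentioned above can be modeled with plain integers to see when it collapses to `q`. This is only an arithmetic sketch with made-up address values; `roundTrip` is a hypothetical name, and it says nothing about what any particular optimizer does:

```d
/// Hypothetical integer model of p + ((q - p) / size) * size,
/// with plain numbers standing in for addresses.
long roundTrip(long p, long q, long size) @safe pure nothrow @nogc
{
    // Integer division truncates, so any remainder of (q - p) / size is lost.
    return p + ((q - p) / size) * size;
}

unittest
{
    // Byte distance is a multiple of the element size: the round trip yields q.
    assert(roundTrip(1000, 1020, 4) == 1020);
    // Byte distance is not a multiple of 4: 22 / 4 == 5, so the result is 1000 + 20.
    assert(roundTrip(1000, 1022, 4) == 1020);
    // With 1-byte elements nothing is lost.
    assert(roundTrip(1000, 1022, 1) == 1022);
}
```

So the expression equals `q` only when the byte distance is a multiple of the element size, which always holds for 1-byte elements and otherwise requires both pointers to be aligned relative to each other.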
Jan 03
On 1/3/26 20:28, Walter Bright wrote:
> On 1/2/2026 3:17 PM, Timon Gehr wrote:
>> One memory-safe alternative semantics to UB would be: if arguments point to different memory objects, you may get any result value. This can still be implemented by your "practical effect" above, but now it's memory safe (in isolation).
> I've said that it was memory safe all along.
You have both said that and then also said that I am obviously correct when I say that UB is not memory safe. `p-q` is sometimes UB in C.
> I've said this proposal is extending safe beyond memory safety to include bug detection of other things like p-q. ...
You said that, it's false. This is not merely a bug, it is UB if you copy the C semantics.
>> You could have an even stronger semantics that also guarantees things like `p is p+(q-p)`.
> I don't see any particular use for recognizing that.
It was an example of what even stronger semantics would possibly guarantee, not a suggestion to detect something as a special case. In this thread, you have often shot off on a tangent and then made a moot unrelated claim along that tangent. I have to assume these are all the result of misunderstandings.
> There are an infinite number of patterns that can be recognized, it's only useful to recognize ones that occur commonly. ...
There was literally someone in this thread who said they commonly rely on this particular pattern and that it gives them better performance.
> BTW, the optimizer does recognize i+(j-i) as being just j, and it will do it for pointers to 1 byte objects. For pointers to int, the intermediate code looks like p+((q-p)/4)*4 which is not recognized. ...
The point was not about whether it is actually just the expression `j`, the point was whether it will always result in the same value as `j`, no matter how it got there (assuming the pointers were properly aligned).
>>> The only thing D can do is in safe code simply disallow p-q, as there are good alternatives to do the equivalent thing.
>> It's absolutely not the only possible thing. It is just one way (the C way) to deal with the issue.
> Please explain the other ways.
I already have, in that same post. What is important for a `@safe` construct is that it always has defined behavior. That behavior can be nondeterministic, it just can't be undefined. You can for example say: the semantics is that it will yield an arbitrary value or crash. That would be a memory safe semantics. UB is not.
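For illustration, the round-trip property under discussion can be sketched in D roughly as follows. This is a minimal sketch under stated assumptions, not compiler output: `roundTrips` and `scaledRoundTrip` are hypothetical names, and the second function only models the scaled byte arithmetic `((q-p)/4)*4` on plain integer addresses for 4-byte elements.

```d
// Hypothetical helper: checks `p + (q - p) is q` for int pointers.
// Marked @system because it does pointer arithmetic; it is only
// meaningful when p and q point into the same array.
bool roundTrips(int* p, int* q) @system
{
    const ptrdiff_t n = q - p; // distance in elements, not bytes
    return p + n is q;
}

// Hypothetical model of the byte-level form ((q-p)/4)*4, using integer
// addresses only. The scaling round-trips exactly when the byte
// distance is a multiple of int.sizeof, i.e. the pointers are aligned
// relative to each other.
bool scaledRoundTrip(size_t pAddr, size_t qAddr) @safe
{
    const ptrdiff_t bytes = cast(ptrdiff_t)(qAddr - pAddr);
    const ptrdiff_t size = int.sizeof;
    return (bytes / size) * size == bytes;
}
```

With two pointers into the same `int[]`, `roundTrips` should hold in either direction, while `scaledRoundTrip` fails for a byte distance such as 14 that is not a multiple of 4.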
Jan 03
I'm just going to make p-q an error in @safe code. If someone still wants to do it, they can cast the pointers to size_t, or make it @trusted/@system code.
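A minimal sketch of what the size_t workaround might look like, assuming pointer-to-integer casts stay permitted in safe code. The helper name `elementDistance` is hypothetical and not part of any library, and the result is only a meaningful element count when both pointers refer to the same array.

```d
// Hypothetical helper: element distance between two int pointers,
// computed on integer addresses instead of with p - q.
ptrdiff_t elementDistance(const(int)* p, const(int)* q) @safe nothrow @nogc
{
    // Pointer-to-integer casts yield plain numbers; no pointer is
    // ever created from an integer here.
    const size_t a = cast(size_t) p;
    const size_t b = cast(size_t) q;

    // The unsigned subtraction may wrap; reinterpreting the result as
    // ptrdiff_t recovers the signed byte distance, which is then
    // scaled down to an element count.
    return cast(ptrdiff_t)(a - b) / cast(ptrdiff_t) int.sizeof;
}
```

For example, with `auto arr = new int[](8);`, calling `elementDistance(&arr[5], &arr[2])` yields 3, and swapping the arguments yields -3.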
Jan 03
On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
> [snip] So this would be valid, as the two pointers are known to point to the same memory object. [snip]
To what extent can D know when pointers are known to point to the same object?
Jan 02
On 1/2/2026 11:20 AM, jmh530 wrote:
> On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
>> [snip] So this would be valid, as the two pointers are known to point to the same memory object. [snip]
> To what extent can D know when pointers are known to point to the same object?
Some can be done trivially, such as (&c - &d). More can be discovered with DFA (Data Flow Analysis), but not really that much. Just like not many cases of null dereference can be unambiguously discovered with DFA.
Jan 02
On 03/01/2026 9:21 AM, Walter Bright wrote:
> On 1/2/2026 11:20 AM, jmh530 wrote:
>> On Thursday, 1 January 2026 at 06:15:09 UTC, Walter Bright wrote:
>>> [snip] So this would be valid, as the two pointers are known to point to the same memory object. [snip]
>> To what extent can D know when pointers are known to point to the same object?
> Some can be done trivially, such as (&c - &d). More can be discovered with DFA (Data Flow Analysis), but not really that much. Just like not many cases of null dereference can be unambiguously discovered with DFA.
Four days ago I started implementing value tracking. I can get objects like:
```d
int a, b;
int* ptr = condition ? &a : &b;
```
And see that ptr could be either a or b. It's also possible to see:
```d
int* ptr = new int, oldObj = ptr;
foreach(i; 0 .. 10) {
    ptr = new int;
}
assert(ptr !is oldObj);
```
But what you can't do with DFA alone:
```d
void func(int* a, int* b) {
    assert(a is b); // how can I know this?
}
```
While I'd love to have knowledge that pointers are not from the same object, or are, realistically it's beyond what can be annotated on code explicitly, let alone inferred. I've been trying to solve for this for well over a year and have not made progress on it. The best you can really hope for here, I suspect, is ownership transfer and modelling it in the function that borrows from it, as well as new and stack allocations etc.
Jan 02
Sorry about that. If you ask me, it may be possible, but it sure looks impractical.
Jan 03
On 04/01/2026 3:13 PM, Walter Bright wrote:
> Sorry about that. If you ask me, it may be possible, but it sure looks impractical.
It's not your fault, it's a hard problem that I don't think has any good answers. I expect that I would've found one by now if it wasn't.
Jan 03
On Sunday, 4 January 2026 at 06:57:11 UTC, Walter Bright wrote:
> https://github.com/dlang/dmd/pull/22348
2024 edition? And I recall a discussion of a procedure in place for checking for breakage in projects. Is there anything formal on this front?
Jan 04









"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> 