digitalmars.D - New safe rule: defensive closures

reply Steven Schveighoffer <schveiguy gmail.com> writes:
We have this cool feature in D where if you try to take the address of a 
function, and the compiler can't prove that it doesn't escape where the 
actual stack frame is, it allocates the stack frame on the GC heap, and 
then it becomes a "closure".
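
For readers unfamiliar with the existing behavior, here is a minimal sketch of it. The names `makeCounter` and `count` are made up for illustration; the point is only that the returned delegate keeps using a local after the function has returned, so the frame has to live on the GC heap.

```d
import std.stdio : writeln;

// The returned delegate captures the local `count`. Because the delegate
// outlives the call, the compiler places makeCounter's frame on the GC heap.
int delegate() @safe makeCounter() @safe
{
    int count = 0;
    return () => ++count;
}

void main() @safe
{
    auto next = makeCounter();
    assert(next() == 1); // `count` is still alive after makeCounter returned
    assert(next() == 2);
    writeln("ok");
}
```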

Why can't we just do this in `@safe` code whenever it has the same 
problem? Consider this code snippet:

```d
void bar(int[]) @safe;
void foo() @safe
{
    int[5] arr;
    bar(arr[]);
}
```

Currently, without DIP1000 enabled, this compiles, and if `bar` 
squirrels away the array, you have a memory issue.

With DIP1000, this becomes an error, the compiler yells at you to put 
scope on the bar parameter (and then you can't squirrel it away).

But what if instead, with DIP1000 seeing that, it just says now we have 
a closure situation, and allocates `foo`'s frame on the heap.
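
The effect being proposed can be roughly approximated by hand today by passing `bar` a heap copy. This is only a sketch under assumptions: the global `stash` and the body of `bar` are hypothetical, and an explicit `.dup` copies just the array, whereas the idea above would move the whole frame of `foo` to the heap.

```d
int[] stash; // hypothetical module-level storage

void bar(int[] a) @safe
{
    stash = a; // bar "squirrels away" the slice
}

void foo() @safe
{
    int[5] arr = [1, 2, 3, 4, 5];
    bar(arr[].dup); // GC-allocated copy, independent of foo's stack frame
}

void main() @safe
{
    foo();
    assert(stash == [1, 2, 3, 4, 5]); // still valid after foo has returned
}
```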

A sufficiently smart optimizer might be able to detect that actually 
`bar` doesn't squirrel it away, and will still allocate on the stack.

If you want to ensure this doesn't happen, just like with other 
closures, you annotate with `@nogc`. And then you can have suggestions 
about putting scope on `bar`'s parameter.
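
As a sketch of how this already works for delegates: a `scope` delegate parameter lets a `@nogc` caller pass a capturing lambda without any closure allocation. The names `apply` and `noClosure` are hypothetical.

```d
// Accepts a delegate without retaining it; `scope` promises it won't escape.
int apply(scope int delegate() @safe @nogc dg) @safe @nogc
{
    return dg();
}

// No closure is needed here: the lambda is only handed to a `scope` parameter,
// so `x` can stay on the stack and the function can be @nogc.
int noClosure() @safe @nogc
{
    int x = 41;
    return apply(() => x + 1);
}

void main() @safe
{
    assert(noClosure() == 42);
}
```

If `noClosure` returned the delegate instead of calling it through `apply`, a GC closure would be required and the `@nogc` annotation would turn that into a compile error.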

This gives us safe code that may not perform as expected, but at least 
it *is safe*. And it doesn't spew endless errors to the user. Consider 
that there are already so many cases where closures are allocated, with 
std.algorithm and lambdas, and mostly nobody bats an eye.

Just another possible idea for debate.

-Steve
May 27 2022
next sibling parent deadalnix <deadalnix gmail.com> writes:
On Friday, 27 May 2022 at 22:16:30 UTC, Steven Schveighoffer 
wrote:
 Currently, without DIP1000 enabled, this compiles, and if `bar` 
 squirrels away the array, you have a memory issue.
The fundamental problem here, is that proving whether it squirrels away or not is actually very difficult, and therefore, you really don't want to rely on this, especially if the end result is as drastic as making most functions closures.
May 27 2022
prev sibling next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 27 May 2022 at 22:16:30 UTC, Steven Schveighoffer 
wrote:
 But what if instead, with DIP1000 seeing that, it just says now 
 we have a closure situation, and allocates `foo`'s frame on the 
 heap.
Object-oriented languages have often used only closures conceptually (no assumptions about a contiguous stack), which allows for a high degree of concurrency etc. It is a common strategy for high-level languages, yes.
May 27 2022
prev sibling next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Friday, 27 May 2022 at 22:16:30 UTC, Steven Schveighoffer 
wrote:
 Why can't we just do this in `@safe` code whenever it has the 
 same problem? Consider this code snippet:

 ```d
 void bar(int[]) @safe;
 void foo() @safe
 {
    int[5] arr;
    bar(arr[]);
 }
 ```
 But what if instead, with DIP1000 seeing that, it just says now 
 we have a closure situation, and allocates `foo`'s frame on the 
 heap.
Good brainstorming. But I don't think it can work:

```D
void foo(ref int[5] arr)
{
    // It's not up to this function to decide where
    // arr resides in memory. Thus we can't make a
    // closure out of this one without breaking expected
    // behaviour.
    bar(arr[]);
}
```
May 28 2022
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 28 May 2022 at 08:12:53 UTC, Dukc wrote:
 Good brainstorming. But I don't think it can work:
The root cause is that Walter does not want @safe to be a high-level feature; as a result, it will be easier to write @system code than @safe code no matter what features he comes up with! If one instead frames @safe as easy high-level programming and adds some reasonable constraints and owning pointers/objects, then things will fall into place. Then optimize. Without that it will be a never-ending tail-chasing dance. Or you will end up with something that constantly gets in your way, like Rust. And that makes @system or a switch to Rust more attractive, so nothing to gain there...
May 28 2022
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 28 May 2022 at 10:24:40 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 28 May 2022 at 08:12:53 UTC, Dukc wrote:
 Good brainstorming. But I don't think it can work:
 The root cause is that Walter does not want @safe to be a high level feature, as a result it will be easier to write @system code than @safe no matter what features he comes up with!!
`@safe` code is, on average, higher level than `@system` code, in both theory and practice. You are right in the sense that you can do everything in `@system` that you can do in `@safe` and more, so it's more expressive and higher level in the same sense C90 is "higher-level" than C99. And just like it may not be worth it to add proper function prototypes to an already well-tested C90 codebase, it's not always worthwhile to add `@safe` to a stable D codebase that is not going to see any major refactoring.

But woe to one who chooses `@system` just to "be higher level". While it does enable slightly shorter code in some cases, much more time will be lost debugging the error-prone practices it leads to. It's just as "good" a choice as not using function prototypes in a new C codebase because it saves some LOC.
May 28 2022
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 28 May 2022 at 10:59:58 UTC, Dukc wrote:
 But woe to one who chooses `@system` just to "be higher level". 
 While it does enable slightly shorter code in some cases, much 
 more time will be lost debugging the error-prone practices it 
 leads to. It's just as "good" choice as not using function 
 prototypes in a new C codebase because it saves some LOC.
What I mean is that DSP code will most likely be @system, but the idea is to limit @safe to high-level constructs and put the burden on @system code calling @safe code (e.g. by providing owning pointers and other useful constructs). That has value. I am not sure there is all that much value in making low-level code safe; if you look at the status quo for research languages... that is a tall mountain to climb and not a pretty view.
May 28 2022
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/28/22 4:12 AM, Dukc wrote:
 On Friday, 27 May 2022 at 22:16:30 UTC, Steven Schveighoffer wrote:
 Why can't we just do this in `@safe` code whenever it has the same 
 problem? Consider this code snippet:

 ```d
 void bar(int[]) @safe;
 void foo() @safe
 {
    int[5] arr;
    bar(arr[]);
 }
 ```
 But what if instead, with DIP1000 seeing that, it just says now we 
 have a closure situation, and allocates `foo`'s frame on the heap.
 Good brainstorming. But I don't think it can work:

 ```D
 void foo(ref int[5] arr)
 {
     // It's not up to this function to decide where
     // arr resides in memory. Thus we can't make a
     // closure out of this one without breaking expected
     // behaviour.
     bar(arr[]);
 }
 ```

Well, we have to decide if taking a value by ref means it should be allocated, or if it should be scope (like DIP1000). If we go with the latter, then you start getting error messages, and we are kind of back to square one. If we go with the former, then any simple use of struct methods is going to allocate a closure.

So yeah, that pretty much destroys this idea.

-Steve
May 28 2022
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 28 May 2022 at 14:50:39 UTC, Steven Schveighoffer 
wrote:
 
 Good brainstorming. But I don't think it can work:
 
 ```D
 void foo(ref int[5] arr)
 { // It's not up to this function to decide where
    // arr resides in memory. Thus we can't make a
    // closure out of this one without breaking expected
    // behaviour.
    bar(arr[]);
 }
 ```
 
 Well, we have to decide if taking a value by ref means it should be allocated, or if it should be scope (like DIP1000). If we go with the latter, then you start getting error messages, and we are kind of back to square one. If we go with the former, then any simple use of struct methods is going to allocate a closure.

 So yeah, that pretty much destroys this idea.

 -Steve

I'm necrobumping an old thread, because I made a discovery that I think brings this idea back to the table. The closures we already have suffer from the same problem I wrote about here! Behold:

```D
int delegate(int) @safe escape1(scope int* x) @safe
{
    return (int y) => *x += y;
}

int delegate(int) @safe escape2(ref int x) @safe
{
    return (int y) => x += y;
}
```

These compile, but they shouldn't. They should require annotating their parameters with `return`.

Now in itself this is just a DIP1000 bug, but consider the situation once it's fixed. If you create a pointer to a local variable and try to return it, you get an error, but if you create a delegate using it and return it, you get a closure. This is a language inconsistency for no good reason.

So we won't get rid of DIP1000 with this observation, but for the sake of consistency maybe we still should reconsider the idea this thread is about.
Feb 15
parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 15 February 2024 at 14:48:56 UTC, Dukc wrote:
 So we won't get rid of DIP1000 with this observation, but for 
 sake of consistency maybe we still should reconsider the idea 
 this thread is about.
No, let's not. This idea is terrible.

Consider: what happens if you're compiling a function with inferred attributes and you encounter some code that escapes the address of a local variable? Do you...

1. Allocate a closure and infer `@safe`?
2. Not allocate a closure and infer `@system`?

What if you're compiling code with an explicit `@system` attribute? Do you still allocate a closure then? What if the programmer wants the less safe, more performant behavior? What about in `@trusted` code?

Either you consistently use closures everywhere and introduce huge performance regressions into existing `@system` code, or you end up in a world where the exact same code can have completely different semantics depending on whether it's `@safe`/`@system`/`@trusted`/inferred. I don't think either of those outcomes is acceptable.
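
For context, attribute inference as it works today can be observed with `std.traits.isSafe`. This is a minimal sketch, assuming a build without DIP1000; the names `stash`, `twice` and the global `g` are hypothetical. It shows only the current behavior (an escaping address makes the inferred function `@system`), not what either option above would do.

```d
import std.traits : isSafe;

int* g; // hypothetical module-level pointer

// Template functions have their attributes inferred from the body.
void stash(T)(T v)
{
    T local = v;
    g = &local; // escapes the address of a local, so @safe cannot be inferred
}

T twice(T)(T v)
{
    return v + v; // nothing unsafe here, so @safe is inferred
}

static assert(!isSafe!(stash!int));
static assert(isSafe!(twice!int));

void main() {}
```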
Feb 15
parent Dukc <ajieskola gmail.com> writes:
On Thursday, 15 February 2024 at 19:35:21 UTC, Paul Backus wrote:
 Either you consistently use closures everywhere and introduce 
 huge performance regressions into existing `@system` code, or 
 you end up in a world where the exact same code can have 
 completely different semantics depending on whether it's 
 `@safe`/`@system`/`@trusted`/inferred. I don't think either of 
 those outcomes is acceptable.
If we're to do either, it should be closures everywhere, I think.

At first I thought that maybe it depends. I frankly don't know what precisely triggers the creation of the closure. If it's done only when the delegate is returned or assigned to the heap or static memory, would the regression really be that huge? After all, any code that is returning pointers to locals is returning pointers to expiring stack frames and therefore the code isn't working.

Still, there's probably too much code out there that is temporarily assigning pointers to locals into the heap. Considering the foundation's pledge to favour backwards compatibility, on balance I have to concur with your assessment: a bad idea.

Now another option would be to stop the automatic closure creation for delegates too, but that'd be an even bigger breakage. It's not worth any serious consideration.

Of the evils we have to pick from, language inconsistency it is.
Feb 16
prev sibling parent Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Friday, 27 May 2022 at 22:16:30 UTC, Steven Schveighoffer 
wrote:
 Consider that there are already so many cases where closures 
 are allocated, with std.algorithm and lambdas, and mostly 
 nobody bats an eye.
Nobody bats an eye when it's DMD doing something as silly as allocating closures on the heap, because low performance is pretty much expected and normal for it. But if this problem also affects GDC, then I'm very much concerned about it: https://forum.dlang.org/thread/myiqlzkghnnyykbyksga@forum.dlang.org
May 30 2022