
digitalmars.D - Let's get the semantic around closure fixed.

reply deadalnix <deadalnix gmail.com> writes:
Long story short: https://issues.dlang.org/show_bug.cgi?id=21929

Closures do not respect scope the way they should. Let's fix it.
May 18
next sibling parent reply Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix it.
Time to consider by-value captures? Closures (specifically the way they interact with the GC) seem problematic in general, not just here.
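For reference, by-value capture can already be emulated today with an immediately-invoked function literal: the parameter gets its own copy in a fresh closure per call. A minimal sketch (variable names are made up):

```d
import std.stdio;

void main() {
    int delegate()[] dgs;
    foreach (i; 0 .. 3) {
        // emulate a by-value capture: i is passed as an argument, and the
        // returned delegate closes over its own copy of the parameter
        dgs ~= ((int v) => () => v)(i);
    }
    foreach (dg; dgs)
        writeln(dg()); // 0, then 1, then 2
}
```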
May 18
next sibling parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 17:40:17 UTC, Max Haughton wrote:
 On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: 
 https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix it.
Time to consider by-value captures? Closures (specifically the way they interact with the GC) seem problematic in general, not just here.
Does a delegate do anything more than retaining a pointer to the stack record? Anyway, it should not escape the scope it references. So escape analysis is your friend.
May 18
next sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 17:53:20 UTC, Ola Fosheim Grostad 
wrote:

Forget all that... Mixed up with local functions... :P
May 18
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 18 May 2021 at 17:53:20 UTC, Ola Fosheim Grostad 
wrote:
 Does a delegate do anything more than retaining a pointer to 
 the stack record?
 Anyway, it should not escape the scope it references. So escape 
 analysis is your friend.
The compiler does escape analysis and pre-emptively allocates any variables that escape through closures on the GC heap. The bug here is that, for variables declared in loop bodies, the compiler *should* allocate a new copy on the heap for each loop iteration, but instead it only allocates one copy that's reused across all iterations.
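A minimal sketch of the issue being described (variable names hypothetical):

```d
import std.stdio;

void main() {
    int delegate()[] dgs;
    foreach (i; 0 .. 3) {
        int x = i;        // declared in the loop body
        dgs ~= () => x;   // each delegate should get a fresh heap copy of x
    }
    // Expected: 0, 1, 2 -- one closure copy per iteration.
    // With the bug, all three delegates share a single copy and print
    // the value from the last iteration instead.
    foreach (dg; dgs)
        writeln(dg());
}
```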
May 18
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 18:51:40 UTC, Paul Backus wrote:
 On Tuesday, 18 May 2021 at 17:53:20 UTC, Ola Fosheim Grostad 
 wrote:
 Does a delegate do anything more than retaining a pointer to 
 the stack record?
 Anyway, it should not escape the scope it references. So 
 escape analysis is your friend.
The compiler does escape analysis and pre-emptively allocates any variables that escape through closures on the GC heap. The bug here is that, for variables declared in loop bodies, the compiler *should* allocate a new copy on the heap for each loop iteration, but instead it only allocates one copy that's reused across all iterations.
Yes, absolutely. I don't use delegates much (well, I did in D1, but that is a long time ago), but the difference between point 3 and point 7 in the language reference is tricky to grasp. So if you bind a local function to a delegate parameter you don't get a closure, but if you assign to a delegate variable you do get a closure? "7. Delegates to non-static nested functions contain two pieces of data: the pointer to the stack frame of the lexically enclosing function (called the context pointer) and the address of the function."
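Point 7 can be observed directly: D exposes the two pieces as the delegate's .ptr and .funcptr properties. A small sketch:

```d
import std.stdio;

void main() {
    int x = 7;
    int local() { return x; }
    auto dg = &local;
    // per point 7: a delegate is a context pointer plus a function pointer
    writeln(dg.ptr !is null);     // the context pointer (enclosing frame)
    writeln(dg.funcptr !is null); // the address of the function
    writeln(dg());                // 7
}
```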
May 18
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/18/21 3:04 PM, Ola Fosheim Grostad wrote:
 On Tuesday, 18 May 2021 at 18:51:40 UTC, Paul Backus wrote:
 On Tuesday, 18 May 2021 at 17:53:20 UTC, Ola Fosheim Grostad wrote:
 Does a delegate do anything more than retaining a pointer to the 
 stack record?
 Anyway, it should not escape the scope it references. So escape 
 analysis is your friend.
The compiler does escape analysis and pre-emptively allocates any variables that escape through closures on the GC heap. The bug here is that, for variables declared in loop bodies, the compiler *should* allocate a new copy on the heap for each loop iteration, but instead it only allocates one copy that's reused across all iterations.
Yes, absolutely. I don't use delegates much (well, I did in D1, but that is a long time ago), but the difference between point 3 and point 7 in the language reference is tricky to grasp. So if you bind a local function to a delegate parameter you don't get a closure, but if you assign to a delegate variable you do get a closure? "7. Delegates to non-static nested functions contain two pieces of data: the pointer to the stack frame of the lexically enclosing function (called the context pointer) and the address of the function."
There's also the issue that if you have a scoped variable that has a destructor, the value will be destroyed (and probably unusable) if you call the delegate from outside the scope. -Steve
May 18
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 19:20:55 UTC, Steven Schveighoffer 
wrote:
 On 5/18/21 3:04 PM, Ola Fosheim Grostad wrote:
 There's also the issue that if you have a scoped variable that 
 has a destructor, the value will be destroyed (and probably 
 unusable) if you call the delegate from outside the scope.
Ouch. Ok, so in OO languages like Simula, all scopes are heap-closures and there is no stack, which kinda changes the game. I guess Javascript does the same conceptually but the JIT perhaps extracts uncaptured variables and puts those on a stack as an optimization? (My guess) But how does a function with a delegate parameter know if it is safe to store the delegate or not?
May 18
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/18/21 3:32 PM, Ola Fosheim Grostad wrote:
 On Tuesday, 18 May 2021 at 19:20:55 UTC, Steven Schveighoffer wrote:
 On 5/18/21 3:04 PM, Ola Fosheim Grostad wrote:
 There's also the issue that if you have a scoped variable that has a 
 destructor, the value will be destroyed (and probably unusable) if you 
 call the delegate from outside the scope.
Ouch. Ok, so in OO languages like Simula, all scopes are heap-closures and there is no stack, which kinda changes the game. I guess Javascript does the same conceptually but the JIT perhaps extracts uncaptured variables and puts those on a stack as an optimization? (My guess) But how does a function with a delegate parameter know if it is safe to store the delegate or not?
Shouldn't matter. The compiler should not compile code that allows you to use a dangling struct (i.e. a destroyed struct). In fact, it used to be this way, but there was a "hack" introduced to allow it to compile.

See: https://issues.dlang.org/show_bug.cgi?id=15952

And the change that allowed it (clearly identified as a hack): https://github.com/dlang/dmd/pull/5292/files#diff-a0928b0b76375204c6f58973fb3f2748e9e614394d5f4b0d8fa3cb20eb5a96c9R757-R760

I don't pretend to understand most of this; it was other sleuths (mostly Paul Backus) who discovered this. -Steve
May 18
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 20:01:15 UTC, Steven Schveighoffer 
wrote:
 I don't pretend to understand most of this, it was other 
 sleuths (mostly Paul Backus) that discovered this.
I am not sure if my understanding of the language reference is correct, but I get the feeling this is an area where one just has to try different combinations and see what happens.
May 18
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/18/21 4:07 PM, Ola Fosheim Grostad wrote:
 On Tuesday, 18 May 2021 at 20:01:15 UTC, Steven Schveighoffer wrote:
 I don't pretend to understand most of this, it was other sleuths 
 (mostly Paul Backus) that discovered this.
I am not sure if my understanding of the language reference is correct, but I get a feeling this is an area where one just have to try different combinations and see what happens.
No, it was correct before the hack. Code which captured a struct that would be destroyed outside the scope just wouldn't compile. Now it does. An example:

import std.stdio;

struct S {
    bool destroyed = false;
    ~this() { destroyed = true; }
}

void main() {
    void delegate() dg;
    {
        S s;
        dg = { writeln("destroyed = ", s.destroyed); };
        dg(); // destroyed = false
    }
    dg(); // destroyed = true
}

So basically, depending on when you call the delegate, the thing could be invalid. Not a big deal (maybe?) for a boolean, but could cause real problems for other things. And the user expectation is that when you capture the variable, it's how it was when you captured it. At least it should live as long as the delegate is alive, no? -Steve
May 18
next sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 20:26:26 UTC, Steven Schveighoffer 
wrote:
 So basically, depending on when you call the delegate, the 
 thing could be invalid. Not a big deal (maybe?) for a boolean, 
 but could cause real problems for other things. And the user 
 expectation is that when you capture the variable, it's how it 
 was when you captured it. At least it should live as long as 
 the delegate is alive, no?
Yes, OO languages usually don't have destructors, so... Hm. I could see how this can go wrong: what if the captured object was "made" from an object in an outer scope that assumes the inner scope object is destructed before its lifetime is up? I think it helps if we forget about the stack and think of scopes as objects on a GC heap with links to the parent scope; they are kept alive as long as they are reachable, then destructed. But then we need to maintain the destruction order so that inner scopes are destructed first. (What I meant in my previous post was that I need to experiment with delegate parameters and see how it prevents stuff from escaping, to fully grok it :-)
May 18
prev sibling parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 20:26:26 UTC, Steven Schveighoffer 
wrote:
 On 5/18/21 4:07 PM, Ola Fosheim Grostad wrote:
 No, it was correct before the hack. Code which captured a 
 struct that would be destroyed outside the scope just wouldn't 
 compile. Now it does.
Btw, in C++ lambdas must not outlive captured references; if you want that, you need to make a copy, aka capture by value. That is a clean solution to the destructor problem, as the destructor will be called when you expect it to. I guess that is the only sensible solution.
May 18
parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 21:11:08 UTC, Ola Fosheim Grostad 
wrote:
 On Tuesday, 18 May 2021 at 20:26:26 UTC, Steven Schveighoffer 
 wrote:
 On 5/18/21 4:07 PM, Ola Fosheim Grostad wrote:
 No, it was correct before the hack. Code which captured a 
 struct that would be destroyed outside the scope just wouldn't 
 compile. Now it does.
Btw in C++ lambdas should not outlive captured references, if you want that you need to make a copy, aka capture by value. That is a clean solution to the destructor problem as the destructor will be called when you expect it to. I guess that is the only sensible solution.
Another correct solution is to track destruction and use a runtime liveness test before allowing object access. This would be more of a high level feature though. So you capture an object by reference, but you are not allowed to access a dead object.
May 18
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Tuesday, 18 May 2021 at 17:53:20 UTC, Ola Fosheim Grostad 
wrote:
 On Tuesday, 18 May 2021 at 17:40:17 UTC, Max Haughton wrote:
 On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: 
 https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix 
 it.
Time to consider by-value captures? Closures (specifically the way they interact with the GC) seem problematic in general, not just here.
Does a delegate do anything more than retaining a pointer to the stack record? Anyway, it should not escape the scope it references. So escape analysis is your friend.
Yes, they allocate the closure on the heap.
May 18
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 18 May 2021 at 17:40:17 UTC, Max Haughton wrote:
 On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: 
 https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix it.
Time to consider by-value captures? Closures (specifically the way they interact with the GC) seem problematic in general, not just here.
No need, but a new closure is needed for each loop iteration in which there is a capture.
May 18
parent Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 18 May 2021 at 19:37:54 UTC, deadalnix wrote:
 On Tuesday, 18 May 2021 at 17:40:17 UTC, Max Haughton wrote:
 On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: 
 https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix 
 it.
Time to consider by-value captures? Closures (specifically the way they interact with the GC) seem problematic in general, not just here.
No need, but a new closure is needed for each loop iteration in which there is a capture.
https://d.godbolt.org/z/r4TKPa946 this pattern shouldn't go through the GC for example. If the delegate can fit a pointer it can fit an int, so going through the GC n times is a waste even before a proper solution is found.
May 18
prev sibling next sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix it.
Btw, JavaScript is irrelevant here: "var" always binds names to the outermost scope at the beginning of the function; where it appears in the text is only to be more reader friendly.
May 18
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/18/2021 9:47 AM, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929
 
 Closure do not respect scope the way they should. Let's fix it.
The simplest solution would be to disallow delegates referencing variables in scopes other than function scope.
May 18
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/18/2021 9:47 AM, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929
 
 Closure do not respect scope the way they should. Let's fix it.
Let's rewrite it to something that does not use closures:

int test() @safe
{
    int j;
    int*[20] ps;
    for (int i = 0; i < 10; i++)
    {
        ps[j++] = &i;
    }
    for (int i = 0; i < 10; i++)
    {
        int index = i;
        ps[j++] = &index;
    }
    int x;
    foreach (p; ps)
    {
        x += *p;
    }
    return x;
}

This code is equivalent in terms of what is happening with references and scopes. Compiling it with -dip1000 yields:

Error: address of variable i assigned to ps with longer lifetime
Error: address of variable index assigned to ps with longer lifetime

Which is pragmatically what the behavior of the delegate example would be, because the delegate is also storing a pointer to the variable.
May 18
parent reply Walter Bright <newshound2 digitalmars.com> writes:
The general rule for determining "what should happen here" when there are 
abstractions around pointers (such as arrays, delegates, refs, outs, class 
references, etc.), is to rewrite it in explicit terms of those pointers. The 
(sometimes baffling) behavior is then exposed for what it actually is, and the 
behavior should match.

Granted, there is a special kludge in the compiler to sometimes put the 
variables referenced by the delegate into a closure allocated by the gc, but 
that doesn't account for variables that go out of scope before the function 
scope ends. There is no process to make a closure for them, and adding such a 
capability is likely much more complication than added value, and so should 
just be an error.
May 18
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 03:09:03 UTC, Walter Bright wrote:
 The general rule for determining "what should happen here" when 
 there are abstractions around pointers (such as arrays, 
 delegates, refs, outs, class references, etc.), is to rewrite 
 it in explicit terms of those pointers. The (sometimes 
 baffling) behavior is then exposed for what it actually is, and 
 the behavior should match.
No, this is definitely wrong. Language constructs provide invariants that both the developers and the compiler can rely on. These invariants ensure that the codebase can scale to larger sizes while keeping bugs, complexity, extensibility and so on in check.

It is crucial to look at language constructs through that lens, or, very quickly, one ends up with sets of invariants that vanish to nothing because they are not provided consistently by the various language constructs. This results in more complexity for the programmer, more bugs, and slower programs, because the runtime and compiler cannot leverage the invariants either.

Thinking through the way it is all implemented with pointers is definitely useful too, but simply as a tool to know what can be implemented efficiently and what cannot, what can be optimized easily and what cannot, etc. It is not very useful as a design tool, as it leads to unsound language constructs. In fact, not even C++ works this way, as they went to great lengths to define a virtual machine that does not exist that would execute the C++ in their spec.
 Granted, there is a special kludge in the compiler to sometimes 
 put the variables referenced by the delegate into a closure 
 allocated by the gc, but that doesn't account for variables 
 that go out of scope before the function scope ends. There is 
 no process to make a closure for them, and adding such a 
 capability is likely much more complication than added value, 
 and so should just be an error.
I find it surprising that you call this a kludge. This is how pretty much every language except C++ does it. It is proven. Without this, and without the ability to capture by value like in C++, delegates are effectively useless. This is not a kludge; this is the very thing that makes delegates useful at all.

That being said, the DIP1000 analysis you mention is a useful tool here. If nothing escapes, then it is possible for the compiler to promote the closure to the stack rather than the heap.

This is where attacking the problem from first principles helps. It is not about the pointers, it is about the invariants. If the compiler can find a better way to implement these invariants given a set of conditions, then great.
May 19
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/19/2021 12:36 AM, deadalnix wrote:
 This is where attacking the problem from first principles helps. It is not
about 
 the pointers, it is about the invariants. If the compiler can find a better
way 
 to implement these invariants given a set of conditions, then great.
The thing about metaprogramming is users can build things by aggregating simpler pieces (like pointers). If the compiler has special semantics for a higher level type that cannot be assembled from simpler pieces, then the language has composability problems. (This problem has shown up with the special semantics given to associative arrays.)
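To illustrate the point about aggregating simpler pieces: a closure can be assembled by hand from a heap-allocated environment plus a plain function pointer. A sketch (all names here are made up):

```d
import std.stdio;

// the environment a closure would capture, allocated explicitly
struct Env { int x; }

int readX(Env* env) { return env.x; }

// a hand-rolled "delegate": context pointer + function pointer
struct HandDg {
    Env* ctx;
    int function(Env*) fn;
    int opCall() { return fn(ctx); }
}

void main() {
    auto dg = HandDg(new Env(7), &readX);
    writeln(dg()); // 7
}
```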
May 19
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 07:53:49 UTC, Walter Bright wrote:
 On 5/19/2021 12:36 AM, deadalnix wrote:
 This is where attacking the problem from first principles 
 helps. It is not about the pointers, it is about the 
 invariants. If the compiler can find a better way to implement 
 these invariants given a set of conditions, then great.
The thing about metaprogramming is users can build things by aggregating simpler pieces (like pointers). If the compiler has special semantics for a higher level type that cannot be assembled from simpler pieces, then the language has composability problems. (This problem has shown up with the special semantics given to associative arrays.)
Composability comes from the invariants you can rely on, and more precisely, from the fact that these invariants do not impose constraints on each other.
May 19
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 11:06:29 UTC, deadalnix wrote:
 Composability comes from the invariants you can rely on, and 
 more precisely, the fact that these invariant do not impose 
 constraints on each others.
To go further: in this specific case, there is no composability. The notions of loops, immutability, and closures are colliding with each other. They do not compose because they don't propose a set of invariants which are independent from each other but, on the contrary, step on each other.
May 19
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/19/2021 4:08 AM, deadalnix wrote:
 The notion of loops, immutability, ad closure at colliding with each others. 
 they do not compose because they don't propose a set of invariant which are 
 independent from each other, but, on the other hand, step on each others.
I don't really know what you mean by invariants in this context. Can you enumerate what invariants you propose for delegates?
May 19
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/18/21 12:47 PM, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929
 
 Closure do not respect scope the way they should. Let's fix it.
Thinking about how this would have to be implemented:

1. If you want to access a variable in a scope from both the closure and the function itself, the variable has to be allocated on the heap.

2. We need one allocation PER loop. If we do this the way normal closures are done (i.e. allocate before the scope is entered), this would be insanely costly for a loop.

3. It *could* allocate on demand. Basically, reserve stack space for the captured variables, and keep a pointer to that stack space. When a closure is used, copy that stack space to a heap allocation, and switch the pointer to that heap block (so the function itself also refers to the same data). This might be a reasonable tradeoff. But it has some limitations -- like if you ever take the address of one of these variables, that would also have to generate the allocated closure.

Of course, with Walter's chosen fix, only allowing capture of non-scoped variables, all of this is moot. I kind of feel like that's a much simpler (even if less convenient) solution.

And also, of course, can we please fix the issue where destroyed structs are accessible from a delegate? -Steve
May 19
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven Schveighoffer 
wrote:
 Thinking about how this would have to be implemented:

 1. If you want to access a variable in a scope from both the 
 closure and the function itself, the variable has to be 
 allocated on the heap.
This is definitely what D guarantees.
 2. We need one allocation PER loop. If we do this the way 
 normal closures are done (i.e. allocate before the scope is 
 entered), this would be insanely costly for a loop.
This is costly, but it is also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...). This is also consistent with what other languages do. And it is consistent with the fact that D allows iterating over loops using opDispatch, which should already exhibit this behavior, because it is a function under the hood.
 3. It *could* allocate on demand. Basically, reserve stack 
 space for the captured variables, have a pointer to that stack 
 space. When a closure is used, copy that stack space to a heap 
 allocation, and switch the pointer to that heap block (so the 
 function itself also refers to the same data). This might be a 
 reasonable tradeoff. But it has some limitations -- like if you 
 ever take the address of one of these variables, that would 
 also have to generate the allocated closure.
I suspect this will open a can of worms of edge cases.
May 19
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/19/21 1:26 PM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven Schveighoffer wrote:
 Thinking about how this would have to be implemented:

 1. If you want to access a variable in a scope from both the closure 
 and the function itself, the variable has to be allocated on the heap.
This is definitively what D guarantees.
Yes, it is guaranteed... if it compiles. For sure the current behavior is junk and unsafe.
 
 2. We need one allocation PER loop. If we do this the way normal 
 closures are done (i.e. allocate before the scope is entered), this 
 would be insanely costly for a loop.
This is costly, but also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...). This is also consistent with what other languages do.
Again, costly as long as it compiles. If a la Walter's suggestion it no longer compiles, then it's moot.
 
 This is also consistent with the fact that D allow to iterate over loops 
 using opDispatch, which already should exhibit this behavior, because it 
 is a function under the hood.
You mean opApply? Not necessarily, if the delegate parameter is scope (and it should be).
 
 3. It *could* allocate on demand. Basically, reserve stack space for 
 the captured variables, have a pointer to that stack space. When a 
 closure is used, copy that stack space to a heap allocation, and 
 switch the pointer to that heap block (so the function itself also 
 refers to the same data). This might be a reasonable tradeoff. But it 
 has some limitations -- like if you ever take the address of one of 
 these variables, that would also have to generate the allocated closure.
I suspect this will open a can of worm of edge cases.
I don't think a can of worms is opened, but it's not easy to implement for sure. I'm not suggesting that we follow this path. I'm just thinking about "What's the most performant way we can implement closures used inside loops". If a loop *rarely* allocates a closure (i.e. only one element actually allocates a closure), then allocating defensively seems super-costly. -Steve
May 19
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 18:43:24 UTC, Steven Schveighoffer 
wrote:
 You mean opApply? Not necessarily, if the delegate parameter is 
 scope (and it should be).
In all cases, if the closure doesn't escape, it can stay on heap. This is what compiler optimization do.
May 19
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/19/21 3:05 PM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 18:43:24 UTC, Steven Schveighoffer wrote:
 You mean opApply? Not necessarily, if the delegate parameter is scope 
 (and it should be).
In all cases, if the closure doesn't escape, it can stay on heap. This is what compiler optimization do.
This results in code that only compiles when optimized. -Steve
May 19
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 19:48:05 UTC, Steven Schveighoffer 
wrote:
 On 5/19/21 3:05 PM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 18:43:24 UTC, Steven 
 Schveighoffer wrote:
 You mean opApply? Not necessarily, if the delegate parameter 
 is scope (and it should be).
In all cases, if the closure doesn't escape, it can stay on heap. This is what compiler optimization do.
This results in code that only compiles when optimized. -Steve
No, that results in code that looks like it's always allocating on the heap but in fact doesn't if it doesn't need to.
May 19
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/19/21 4:20 PM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 19:48:05 UTC, Steven Schveighoffer wrote:
 On 5/19/21 3:05 PM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 18:43:24 UTC, Steven Schveighoffer wrote:
 You mean opApply? Not necessarily, if the delegate parameter is 
 scope (and it should be).
In all cases, if the closure doesn't escape, it can stay on heap. This is what compiler optimization do.
This results in code that only compiles when optimized.
No that result in code that looks like it's always allocating on the heap, but in fact doesn't if it doesn't need to.
Sorry, I misread that, it looked like you were saying in all cases it could stay on the stack (you did mean to write stack, right?), but missed the qualifier "if the closure doesn't escape". -Steve
May 19
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 18:43:24 UTC, Steven Schveighoffer 
wrote:
 I don't think a can of worms is opened, but it's not easy to 
 implement for sure. I'm not suggesting that we follow this 
 path. I'm just thinking about "What's the most performant way 
 we can implement closures used inside loops". If a loop 
 *rarely* allocates a closure (i.e. only one element actually 
 allocates a closure), then allocating defensively seems 
 super-costly.
There are going to be a ton of situations where the address of the variable becomes visible in some fashion.
May 19
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/19/2021 10:26 AM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven Schveighoffer wrote:
 2. We need one allocation PER loop. If we do this the way normal closures are 
 done (i.e. allocate before the scope is entered), this would be insanely 
 costly for a loop.
This is costly, but also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...). This is also consistent with what other languages do.
Languages like D also need to be useful, not just correct. Having a hidden allocation per loop will be completely unexpected for such a simple looking loop for a lot of people. That includes pretty much all of *us*, too. I doubt users will be happy when they eventually discover this is the reason their D program runs like sludge on Pluto and consumes all the memory in their system. If they discover the reason at all, and don't just dismiss D as unusable.

The workaround, for the users, is to simply move that referenced variable from an inner scope to function scope.

It's best to just return a compile error for such cases rather than go to very expensive efforts to make every combination of features work. It's similar to the decision to give an error for certain operations on vectors if the hardware won't support it, rather than emulate. Emulation will necessarily be very, very slow. Give users the opportunity to fix hidden and extreme slowdowns in code rather than hide them. A systems programming language ought to behave this way.
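The hoisting workaround described above, moving the captured variable from the loop body to function scope, can be sketched as follows (the sharing across iterations then becomes explicit and intentional):

```d
import std.stdio;

void main() {
    int delegate() dg;
    int index;              // hoisted from the loop body to function scope
    foreach (i; 0 .. 10) {
        index = i;
        dg = () => index;   // one closure for the whole function, by design
    }
    writeln(dg()); // 9: all iterations shared the single hoisted variable
}
```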
May 19
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 Having a hidden allocation per loop will be completely 
 unexpected for such a simple looking loop for a lot of people. 
 That includes pretty much all of *us*, too.
Citation needed. It is fairly well known that closures and objects are pretty interchangeable, so the allocation should surprise nobody. This is a very common pattern in several languages. And even ones that don't do this have workarounds - a function returning a function that gets called to capture the arguments (this works in D as well btw) - since the allocation is kinda the point of a closure. Whereas the current behavior surprises most everybody AND is pretty useless.
May 19
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/19/21 3:29 PM, Adam D. Ruppe wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 Having a hidden allocation per loop will be completely unexpected for 
 such a simple looking loop for a lot of people. That includes pretty 
 much all of *us*, too.
Citation needed. It is fairly well known that closures and objects are pretty interchangeable, so the allocation should surprise nobody. This is a very common pattern in several languages. And even ones that don't do this have workarounds - a function returning a function that gets called to capture the arguments (this works in D as well btw) - since the allocation is kinda the point of a closure.
e.g.:

foreach(i; someLargeThing)
{
   if(Clock.currTime.year == 2020) // i.e. never
     dg = {return i;};
}

If we defensively allocate for the delegate, this is going to allocate on every iteration of someLargeThing, even though it's very rare that it will need to.
 
 Whereas the current behavior surprises most everybody AND is pretty 
 useless.
Nobody disagrees. The disagreement here is whether we should make the behavior work as expected at all costs, or invalidate the behavior completely because it's too costly. -Steve
May 19
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 19 May 2021 at 19:48:52 UTC, Steven Schveighoffer 
wrote:
 If we defensively allocate for the delegate, this is going to 
 allocate every iteration of someLargeThing, even though it's 
 very rare that it will need to.
Yeah, it could just allocate when the assignment is made for cases like that, which is what the current dg = ((i)=>(){return i;})(i); pattern does. Which I actually don't mind at all myself.
May 19
prev sibling next sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 19 May 2021 at 19:48:52 UTC, Steven Schveighoffer 
wrote:
 e.g.:

 foreach(i; someLargeThing)
 {
    if(Clock.currTime.year == 2020)// i.e. never
      dg = {return  i;};
 }

 If we defensively allocate for the delegate, this is going to 
 allocate every iteration of someLargeThing, even though it's 
 very rare that it will need to.
Why not just use a backend intrinsic? The frontend does not have to know what the backend will do? Leave it to the implementation...
May 19
prev sibling parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 19 May 2021 at 19:48:52 UTC, Steven Schveighoffer 
wrote:
 Nobody disagrees. What the disagreement here is, whether we 
 should make the behavior work as expected at all costs, or 
 invalidate the behavior completely because it's too costly.

 -Steve
Make it work as expected. If it turns out to be a performance bottleneck for some applications, they can always work around it, as obviously they had done until now. Being conscious about performance trade-offs is important in language design, but at the same time, just because someone can create a fork bomb with just several lines of code doesn't mean that we should disallow every type of dynamic memory allocation. For every misuse of a sound language feature, there are plenty more valid usages.
May 19
next sibling parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 19 May 2021 at 20:08:02 UTC, Petar Kirov 
[ZombineDev] wrote:
 Being conscious about performance trade-offs is important in 
 language design, but at the same time, just because someone can 
 create a fork bomb with just several lines of code doesn't mean 
 that we should disallow every type of dynamic memory 
 allocation. For every misuse of a sound language feature, there 
 are plenty more valid usages.
A trade-off is to issue a warning and provide a warning silencer.
May 19
parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 19 May 2021 at 20:14:18 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 19 May 2021 at 20:08:02 UTC, Petar Kirov 
 [ZombineDev] wrote:
 Being conscious about performance trade-offs is important in 
 language design, but at the same time, just because someone 
 can create a fork bomb with just several lines of code doesn't 
 mean that we should disallow every type of dynamic memory 
 allocation. For every misuse of a sound language feature, 
 there are plenty more valid usages.
A trade-off is to issue a warning and provide a warning silencer.
I see no point in having closure allocations causing compiler warnings. That's what profilers are for. Every application has different characteristics. Just because a newbie can write code that ends up generating a ton of GC garbage doesn't mean that closure allocations would even register on the performance radar of many applications.
May 19
next sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 19 May 2021 at 20:27:59 UTC, Petar Kirov 
[ZombineDev] wrote:
 On Wednesday, 19 May 2021 at 20:14:18 UTC, Ola Fosheim Grostad 
 wrote:
 On Wednesday, 19 May 2021 at 20:08:02 UTC, Petar Kirov 
 [ZombineDev] wrote:
 Being conscious about performance trade-offs is important in 
 language design, but at the same time, just because someone 
 can create a fork bomb with just several lines of code 
 doesn't mean that we should disallow every type of dynamic 
 memory allocation. For every misuse of a sound language 
 feature, there are plenty more valid usages.
A trade-off is to issue a warning and provide a warning silencer.
I see no point in having closure allocations causing compiler warnings. That's what profilers are for. Every application has different characteristics. Just because a newbie can write code that ends up generating a ton of GC garbage doesn't mean that closure allocations would even register on the performance radar of many applications.
That said, there's the [`-vgc`][0] compiler switch, which prints during compilation all parts of the program that may cause a GC allocation. My point is that GC allocations shouldn't cause errors/warnings outside of `@nogc` code as we have plenty of tools to diagnose performance bugs. [0]: https://dlang.org/dmd-linux.html#switch-vgc
May 19
prev sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 19 May 2021 at 20:27:59 UTC, Petar Kirov 
[ZombineDev] wrote:
 I see no point in having closure allocations causing compiler 
 warnings. That's what profilers are for. Every application has 
 different characteristics. Just because a newbie can write code
Ok, but if the alternative is an error...
May 19
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/19/2021 1:08 PM, Petar Kirov [ZombineDev] wrote:
 Make it work as expected. If it turns out to be a performance bottleneck for 
 some applications, they can always work around it, as obviously they had done 
 until now.
The point was that people will not realize they will have created a potentially very large performance bottleneck with an innocuous bit of code. This is a design pattern that should be avoided.
 Being conscious about performance trade-offs is important in language design, 
 but at the same time, just because someone can create a fork bomb with just 
 several lines of code doesn't mean that we should disallow every type of
dynamic 
 memory allocation. For every misuse of a sound language feature, there are 
 plenty more valid usages.
Yeah, well, I tend to bear the brunt of the unhappiness when these things go wrong. A fair amount of D's design decisions grew from discussions with programming lead engineers having problems with their less experienced devs making poor tradeoffs. I'm sure you've heard some of my rants against macros, version conditionals being simple identifiers instead of expressions, etc. D has many ways of getting past guardrails, but those need to be conscious decisions. Having no guardrails is not good design.
May 19
next sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Thursday, 20 May 2021 at 01:21:34 UTC, Walter Bright wrote:
 On 5/19/2021 1:08 PM, Petar Kirov [ZombineDev] wrote:
 Make it work as expected. If it turns out to be a performance 
 bottleneck for some applications, they can always work around 
 it, as obviously they had done until now.
The point was that people will not realize they will have created a potentially very large performance bottleneck with an innocuous bit of code. This is a design pattern that should be avoided.
If they didn't realize it's a performance bottleneck, then it likely wasn't important enough to profile ;) People who value performance are often willing to go to great lengths to achieve it. I'm saying that this is not a hard problem to diagnose, so we shouldn't force a type system hole on everyone (see Paul's post for an example: https://forum.dlang.org/post/bscrrwjvxqaydbohdjuw forum.dlang.org) just because someone may misuse the feature and create a perf bottleneck (I'm doubtful that this would rank anywhere high on the list of possible problems in most programs).
 Being conscious about performance trade-offs is important in 
 language design, but at the same time, just because someone 
 can create a fork bomb with just several lines of code doesn't 
 mean that we should disallow every type of dynamic memory 
 allocation. For every misuse a of sound language feature, 
 there are plenty more valid usages.
Yeah, well, I tend to bear the brunt of the unhappiness when these things go wrong. A fair amount of D's design decisions grew from discussions with programming lead engineers having problems with their less experienced devs making poor tradeoffs. I'm sure you've heard some of my rants against macros, version conditionals being simple identifiers instead of expressions, etc. D has many ways of getting past guardrails, but those need to be conscious decisions. Having no guardrails is not good design.
I strongly agree with the sentiment that language/library design should guide users into making the right choice. The "pit of success" and all that. In this instance, however, we don't have a "limitation for the greater good", but a series of implementation (or design) issues that break D's type system. Let's focus on cleaning up the foundation of the language; the longer we wait, the more painful the transition will be.
May 20
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 20 May 2021 at 01:21:34 UTC, Walter Bright wrote:
 On 5/19/2021 1:08 PM, Petar Kirov [ZombineDev] wrote:
 Make it work as expected. If it turns out to be a performance 
 bottleneck for some applications, they can always work around 
 it, as obviously they had done until now.
The point was that people will not realize they will have created a potentially very large performance bottleneck with an innocuous bit of code. This is a design pattern that should be avoided.
I don't expect this to be a huge deal in practice. There are many reasons for this I could go over, but the strongest argument is that virtually every other language out there does it and it doesn't seem to be a major issue.
 Being conscious about performance trade-offs is important in 
 language design, but at the same time, just because someone 
 can create a fork bomb with just several lines of code doesn't 
 mean that we should disallow every type of dynamic memory 
 allocation. For every misuse a of sound language feature, 
 there are plenty more valid usages.
Yeah, well, I tend to bear the brunt of the unhappiness when these things go wrong. A fair amount of D's design decisions grew from discussions with programming lead engineers having problems with their less experienced devs making poor tradeoffs. I'm sure you've heard some of my rants against macros, version conditionals being simple identifiers instead of expressions, etc. D has many ways of getting past guardrails, but those need to be conscious decisions. Having no guardrails is not good design.
You are making a categorical error here. The delegate issue is fundamentally different. The trade-off being discussed is between something that might be slow in some cases and something that is outright broken. While we can all agree that possibly slow or confusing is bad, and that the argument stands for macros and the like, when the alternative is to produce something broken, it simply does not make sense. Possibly slow can still be useful, depending on the specifics of the situation. Broken is never useful.
May 20
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/19/2021 12:29 PM, Adam D. Ruppe wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 Having a hidden allocation per loop will be completely unexpected for such a 
 simple looking loop for a lot of people. That includes pretty much all of 
 *us*, too.
Citation needed.
Ok, I had to read it over very closely to realize what was happening. I've also done programming language tech support for nearly 40 years now. I have a pretty good feel for what people readily grasp and what they don't.
May 19
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 On 5/19/2021 10:26 AM, deadalnix wrote:
 On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven 
 Schveighoffer wrote:
 2. We need one allocation PER loop. If we do this the way 
 normal closures are done (i.e. allocate before the scope is 
 entered), this would be insanely costly for a loop.
This is costly, but also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...). This is also consistent with what other languages do.
Languages like D also need to be useful, not just correct. Having a hidden allocation per loop will be completely unexpected for such a simple looking loop for a lot of people. That includes pretty much all of *us*, too.
It is not surprising that taking a closure would allocate on the heap if the closure escapes. This is done for functions, this is done in every single programming language out there but D, and the compiler can remove the allocation if it detects that things don't escape. In fact, even in C++, you'll find yourself with an allocation per loop if you do:

std::vector<std::function<void()>> funs;
for (int i = 0; i < 10; i++) {
    funs.push_back([i]() { printf("%d\n", i); });
}

The instantiation of std::function here will allocate.
May 19
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 19 May 2021 at 20:19:22 UTC, deadalnix wrote:
 In fact, even in C++, you'll find yourself with an allocation 
 per loop if you do:

 std::vector<std::function<void()>> funs;
 for (int i = 0; i < 10; i++) {
     funs.push_back([i]() { printf("%d\n", i); });
 }

 The instantiation of std::function here will allocate.
I think it is implementation defined how large the internal buffer in std::function is? So it will allocate if the capture is too large? But yeah, it is ugly, I never use it. Usually one can avoid it.
May 19
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 21:56:19 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 19 May 2021 at 20:19:22 UTC, deadalnix wrote:
 In fact, even in C++, you'll find yourself with an allocation 
 per loop if you do:

 std::vector<std::function<void()>> funs;
 for (int i = 0; i < 10; i++) {
     funs.push_back([i]() { printf("%d\n", i); });
 }

 The instantiation of std::function here will allocate.
I think it is implementation defined how large the internal buffer in std::function is? So it will allocate if it is too large? But yeah, it is ugly, I never use it. Usually one can avoid it..
It is as large as it needs to be because you can capture arbitrarily large objects. If the capture is small enough, some implementations of std::function can do small object optimization and store it in place. In-place storage is only guaranteed for raw function pointers.
May 19
prev sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 Languages like D also need to be useful, not just correct. 
 Having a hidden allocation per loop will be completely 
 unexpected for such a simple looking loop for a lot of people. 
 That includes pretty much all of *us*, too.
If closures causing "hidden" allocations is problematic, from a language-design perspective, then it's problematic whether it occurs inside a loop or not. Either we should (a) deprecate and remove GC-allocated closures entirely, or (b) make them work correctly in all cases.
 It's best to just return a compile error for such cases rather 
 than go to very expensive efforts to make every combination of 
 features work.
This is the worst of both worlds: we still pay the price of having "hidden" allocations in our code, but we do not even get the benefit of having properly-implemented closures in return.
May 19
next sibling parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 19 May 2021 at 20:56:10 UTC, Paul Backus wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 Languages like D also need to be useful, not just correct. 
 Having a hidden allocation per loop will be completely 
 unexpected for such a simple looking loop for a lot of people. 
 That includes pretty much all of *us*, too.
If closures causing "hidden" allocations is problematic, from a language-design perspective, then it's problematic whether it occurs inside a loop or not. Either we should (a) deprecate and remove GC-allocated closures entirely, or (b) make them work correctly in all cases.
 It's best to just return a compile error for such cases rather 
 than go to very expensive efforts to make every combination of 
 features work.
This is the worst of both worlds: we still pay the price of having "hidden" allocations in our code, but we do not even get the benefit of having properly-implemented closures in return.
I couldn't have phrased this better. Thanks.
May 19
prev sibling parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 19 May 2021 at 20:56:10 UTC, Paul Backus wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 If closures causing "hidden" allocations is problematic, from a 
 language-design perspective, then it's problematic whether it 
 occurs inside a loop or not. Either we should (a) deprecate and 
 remove GC-allocated closures entirely, or (b) make them work 
 correctly in all cases.
Or better, acknowledge that there is a difference between low level and high level projects (or libraries), and let low level programmers get warnings that they can silence while allowing high level programmers to have an easy life.
May 20
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 20 May 2021 at 09:42:20 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 19 May 2021 at 20:56:10 UTC, Paul Backus wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 If closures causing "hidden" allocations is problematic, from 
 a language-design perspective, then it's problematic whether 
 it occurs inside a loop or not. Either we should (a) deprecate 
 and remove GC-allocated closures entirely, or (b) make them 
 work correctly in all cases.
Or better, acknowledge that there is a difference between low level and high level projects (or libraries), and let low level programmers get warnings that they can silence while allowing high level programmers to have an easy life.
Low-level programmers who want to find/avoid GC allocations in their programs already have plenty of tools to do so: @nogc, -vgc, and -profile=gc.
May 20
parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 11:00:55 UTC, Paul Backus wrote:
 On Thursday, 20 May 2021 at 09:42:20 UTC, Ola Fosheim Grostad 
 wrote:
 Low-level programmers who want to find/avoid GC allocations in 
 their programs already have plenty of tools to do so: @nogc, 
 -vgc, and -profile=gc.
That sets the bar low; delegates should also be able to work without the GC as an option.
May 20
prev sibling parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Thursday, 20 May 2021 at 09:42:20 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 19 May 2021 at 20:56:10 UTC, Paul Backus wrote:
 On Wednesday, 19 May 2021 at 19:01:59 UTC, Walter Bright wrote:
 If closures causing "hidden" allocations is problematic, from 
 a language-design perspective, then it's problematic whether 
 it occurs inside a loop or not. Either we should (a) deprecate 
 and remove GC-allocated closures entirely, or (b) make them 
 work correctly in all cases.
Or better, acknowledge that there is a difference between low level and high level projects (or libraries), and let low level programmers get warnings that they can silence while allowing high level programmers to have an easy life.
I strongly agree with both you and Paul. One of D's biggest strengths in my experience is that it's not good for just one area, but many, each with its own challenges. If I were to write a kernel module, I wouldn't even consider using the GC or linking druntime, while for scripting (which D is surprisingly good at) I would never bother with manual memory management or smart pointers. There are plenty of languages that force a single "right" solution; we don't need to copy their limitations.

If people don't want to use any parts of druntime that may incur run-time or non-optional code-size cost (e.g. `Object.factory`) they can always use `-betterC` in their build scripts. If they just don't want the GC, then there's the `@nogc` function attribute and the `-vgc` compiler switch. We should also make it so you can put function attributes before `module` declarations to enforce them transitively for the whole module, so that you can write `@nogc module foo;` once and for all and not have to bother putting it on every function. There's probably some percentage of anti-GC people who tried D and were put off by the "GC tax" of having to annotate every single function with `@nogc` (there are more efficient ways to annotate multiple symbols with a given set of attributes, but they probably haven't taken the time to learn them).

For the past 5-7 years, I can't think of a single new language feature that required the GC. We have been going down the @nogc / DasBetterC road long enough (*). I think we shouldn't "skip leg day" any more and should improve the high-level side of D a bit. Failing that, we should remove heap-allocated closures from the language, as keeping them working wrong brings more harm than not having them at all.

(*) Obviously, I don't mean we should stop improving in that direction (quite the opposite), but that we can afford to improve in other directions as well, without diminishing our current strengths.
May 20
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 11:01:56 UTC, Petar Kirov 
[ZombineDev] wrote:
 On Thursday, 20 May 2021 at 09:42:20 UTC, Ola Fosheim Grostad 
 wrote:
 For past past 5-7 years, I can't think of a single new language 
 feature that required the GC. We have been going the @nogc / 
 DasBetterC road long enough (*). I think we shouldn't "skip leg 
 day" any more and we should improve a bit the high-level side 
 of D. Failing that, we should remove heap-allocated closures 
 from the language as it brings more harm to keep them working 
 wrong than not having them at all.

 (*) Obviously, I don't mean we should stop improving in that 
 direction (quite the opposite), but that we can afford to 
 improve in other directions as well, without diminishing our 
 current strengths.
But delegates have to work without a GC too, and C++ types don't like hidden allocations at all. What they do instead is utilize template parameters to receive the lambda, so effectively the closure is stored in the "delegate". The more I think about this, the more I dislike separate compilation (to asm; to IR is ok). If you ditch that, maybe you could use static analysis and store the closure in the delegate object.
May 20
next sibling parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 11:40:24 UTC, Ola Fosheim Grostad 
wrote:
 The more I think about this the more I dislike separate 
 compilation (to asm, to IR is ok). If you ditch that maybe you 
 could use static analysis and store the closure in the delegate 
 object.
In my experience, lambdas often capture few symbols, so it is a bit silly to heap allocate 16 bytes... But std::function is not a good solution as its internal buffer is fixed, which is wasteful.
May 20
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 20 May 2021 at 11:40:24 UTC, Ola Fosheim Grostad 
wrote:
 But delegates have to work without a GC too
Well they do... sort of. You can always take the address of a struct member function and now you have your @nogc delegate. Of course the difficulty is the receiving function has no way of knowing if that void* it received is a struct or an automatically captured variable set or what. And the capture list takes a little work, but there are tricks like putting it all in a struct. I wrote about this not too long ago: http://dpldocs.info/this-week-in-d/Blog.Posted_2021_03_01.html#tip-of-the-week However, the delegate itself is less useful than a functor or interface unless you must pass it to existing code. And then, unless it is a `scope` receiver, you're asking for a leak anyway, again because that void* is unknown to the caller (which is why this is possible at all, but it also leaves you a bit stuck). It would be kinda cool if the compiler would magically pack small types into that void* sometimes. Since it is opaque to the caller, it could actually pack in a captured int or two right there and be a by-value delegate.
May 20
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 12:10:31 UTC, Adam D. Ruppe wrote:
 It would be kinda cool if the compiler would magically pack 
 small types into that void* sometimes. Since it is opaque to 
 the caller it could actually pack in a captured int or two 
 right there and be a by-value delegate.
Yes, what C++ lacks is a way to type a lambda (function object) before it is defined. A solution could be to have a way to say: this delegate should be able to hold 2 ints and 1 double; then it would have buffer space for that and there would be no need to allocate. Libraries could provide aliases that are shorter, obviously.
May 20
parent reply deadalnix <deadalnix gmail.com> writes:
On Thursday, 20 May 2021 at 12:42:51 UTC, Ola Fosheim Grostad 
wrote:
 On Thursday, 20 May 2021 at 12:10:31 UTC, Adam D. Ruppe wrote:
 It would be kinda cool if the compiler would magically pack 
 small types into that void* sometimes. Since it is opaque to 
 the caller it could actually pack in a captured int or two 
 right there and be a by-value delegate.
Yes, what C++ lacks is a way to type a lambda (function object) before it is defined. A solution could be to have a way to say: this delegate should be able to hold 2 ints and 1 double, then it would have buffer space for that and there wold be no need to allocate. Libraries could provide aliases that are shorter, obviously.
You can do this with functors.
May 20
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 12:53:08 UTC, deadalnix wrote:
 You can do this with functors.
Yes, but the point is to declare a delegate with internal closure buffer without knowing what it receives?
May 20
parent reply deadalnix <deadalnix gmail.com> writes:
On Thursday, 20 May 2021 at 13:12:30 UTC, Ola Fosheim Grostad 
wrote:
 On Thursday, 20 May 2021 at 12:53:08 UTC, deadalnix wrote:
 You can do this with functors.
Yes, but the point is to declare a delegate with internal closure buffer without knowing what it receives?
It is more tricky, but some implementations of std::function do that. If what's captured is small enough, they store it in place; if it is larger, they allocate. This is not mandated by the standard, and the size threshold past which they allocate is implementation defined, not under the user's control.
May 20
parent Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 20 May 2021 at 13:48:01 UTC, deadalnix wrote:
 It is more tricky, but some implementation of std::function do 
 that. If what's captured is small enough, they store it in 
 place, if it is larger, they allocate.
Yes, but sizeof std::function is always the same?
 It is not mandated by the standard and the size after which 
 they'll allocate is implementation defined when they do, not 
 under the user's control.
True, but D could be smarter and do something similar while allowing the size to vary, so that you can save memory. And if you only assign once, then the cost of having a larger buffer in the delegate is smaller than if you do many assignments. D can be smarter than C++ because it can generate IR for all D files, I think?
May 20
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/19/21 9:02 AM, Steven Schveighoffer wrote:

 Of course, with Walter's chosen fix, only allowing capture of non-scoped 
 variables, all of this is moot. I kind of feel like that's a much 
 simpler (even if less convenient) solution.
After reading a lot of this discussion, I have changed my mind. We should implement the "correct" thing even if it performs poorly. While Walter's solution gets the compiler off the hook, it doesn't square with the fact that closures are already hidden allocations, so consistency dictates we deal with inner allocations the same way. We need one heap block per scope that has captured variables. Expensive, but I don't see a way around it. Hopefully optimizers and scope delegates can alleviate performance issues. -Steve
May 20
parent reply TheGag96 <thegag96 gmail.com> writes:
On Thursday, 20 May 2021 at 12:31:00 UTC, Steven Schveighoffer 
wrote:
 On 5/19/21 9:02 AM, Steven Schveighoffer wrote:

 Of course, with Walter's chosen fix, only allowing capture of 
 non-scoped variables, all of this is moot. I kind of feel like 
 that's a much simpler (even if less convenient) solution.
After reading a lot of this discussion, I have changed my mind. We should implement the "correct" thing even if it performs poorly. While Walter's solution gets the compiler out of responsibility, it doesn't square with the fact that closures are already hidden allocations, so consistency dictates we deal with inner allocations the same way. We need one heap block per scope that has captured variables. Expensive, but I don't see a way around it. Hopefully optimizers and scope delegates can alleviate performance issues. -Steve
Yeah... Honestly, that getting-around-immutable thing seems like the nail in the coffin for the current behavior. Hopefully making it work "correctly" won't be too painful... The delegate-related thing I really want improved is being able to capture local variables in places like:

```d
int i = 3;
someRange.map!(x => x.thing == i).each!writeln;
```

...without needing the GC, since we "know" that `i` doesn't escape. Dunno if that's a pipe dream, though.
May 20
next sibling parent reply Max Haughton <maxhaton gmail.com> writes:
On Friday, 21 May 2021 at 00:31:52 UTC, TheGag96 wrote:
 On Thursday, 20 May 2021 at 12:31:00 UTC, Steven Schveighoffer 
 wrote:
 On 5/19/21 9:02 AM, Steven Schveighoffer wrote:

 Of course, with Walter's chosen fix, only allowing capture of 
 non-scoped variables, all of this is moot. I kind of feel 
 like that's a much simpler (even if less convenient) solution.
After reading a lot of this discussion, I have changed my mind. We should implement the "correct" thing even if it performs poorly. While Walter's solution gets the compiler out of responsibility, it doesn't square with the fact that closures are already hidden allocations, so consistency dictates we deal with inner allocations the same way. We need one heap block per scope that has captured variables. Expensive, but I don't see a way around it. Hopefully optimizers and scope delegates can alleviate performance issues. -Steve
Yeah... Honestly, that getting-around-immutable thing seems like the nail in the coffin for the current behavior. Hopefully making it work "correctly" won't be too painful... The delegate-related thing I really want improved is being able to capture local variables in places like:

```d
int i = 3;
someRange.map!(x => x.thing == i).each!writeln;
```

...without needing the GC, since we "know" that `i` doesn't escape. Dunno if that's a pipe dream, though.
This has to be an aim. It's simply stupid that using map in the way it's intended to be used results in a GC allocation (there are workarounds, but this is missing the point entirely). I know it won't be easy, but quite frankly if it's not possible that's a knock on us and our infrastructure - if we can't do big and important things properly we need to change that too.
May 20
parent reply Q. Schroll <qs.il.paperinik gmail.com> writes:
On Friday, 21 May 2021 at 00:54:27 UTC, Max Haughton wrote:
 On Friday, 21 May 2021 at 00:31:52 UTC, TheGag96 wrote:
 The delegate-related thing I really want improved is being 
 able to capture local variables in places like:

 ```d
 int i = 3;
 someRange.map!(x => x.thing == i).each!writeln;
 ```

 ...without needing the GC, since we "know" that `i` doesn't 
 escape. Dunno if that's a pipe dream, though.
This has to be an aim. It's simply stupid that using map in the way it's intended to be used results in a GC allocation (there are workarounds, but this is missing the point entirely). I know it won't be easy, but quite frankly if it's not possible that's a knock on us and our infrastructure - if we can't do big and important things properly we need to change that too.
I was a little surprised that it needs the GC in the first place. It's a template parameter after all. Here, it even seems a runtime delegate parameter would shine since it could be marked `scope`. If nothing else goes on, an allocation is not necessary. It's even `@nogc`. (Maybe the lambda bound to the alias could have `scope` implied?)
May 27
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 28 May 2021 at 02:41:27 UTC, Q. Schroll wrote:
 I was a little surprised that it needs the GC in the first 
 place. It's a template parameter after all.
It isn't the lambda that allocates per se, it is the `MapResult` struct that cannot necessarily be scope since it doesn't know if the range will be stored or returned or whatever. Though perhaps if the returned map result inherited the lifetime of the captured variables it could work, just that gets complicated.
May 27
parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 28 May 2021 at 02:47:41 UTC, Adam D. Ruppe wrote:
 On Friday, 28 May 2021 at 02:41:27 UTC, Q. Schroll wrote:
 I was a little surprised that it needs the GC in the first 
 place. It's a template parameter after all.
It isn't the lambda that allocates per se, it is the `MapResult` struct that cannot necessarily be scope since it doesn't know if the range will be stored or returned or whatever. Though perhaps if the returned map result inherited the lifetime of the captured variables it could work, just that gets complicated.
I suspect the compiler should be able to see through it after a few rounds of inlining. If there are so many layers that it doesn't, I suspect one allocation isn't going to be your bottleneck.
May 28
parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 28 May 2021 at 13:47:32 UTC, deadalnix wrote:
 On Friday, 28 May 2021 at 02:47:41 UTC, Adam D. Ruppe wrote:
 On Friday, 28 May 2021 at 02:41:27 UTC, Q. Schroll wrote:
 I was a little surprised that it needs the GC in the first 
 place. It's a template parameter after all.
It isn't the lambda that allocates per se, it is the `MapResult` struct that cannot necessarily be scope since it doesn't know if the range will be stored or returned or whatever. Though perhaps if the returned map result inherited the lifetime of the captured variables it could work, just that gets complicated.
I suspect the compiler should be able to see through it after a few rounds of inlining. If there are so many layers that it doesn't, I suspect one allocation isn't going to be your bottleneck.
`ldc -O` is able to elide the allocation: <https://d.godbolt.org/z/9aoMo9hbe> However, the code still does not qualify as `@nogc`, because `@nogc` analysis is done prior to optimization.
May 28
next sibling parent Max Haughton <maxhaton gmail.com> writes:
On Friday, 28 May 2021 at 14:29:50 UTC, Paul Backus wrote:
 On Friday, 28 May 2021 at 13:47:32 UTC, deadalnix wrote:
 On Friday, 28 May 2021 at 02:47:41 UTC, Adam D. Ruppe wrote:
 On Friday, 28 May 2021 at 02:41:27 UTC, Q. Schroll wrote:
 I was a little surprised that it needs the GC in the first 
 place. It's a template parameter after all.
It isn't the lambda that allocates per se, it is the `MapResult` struct that cannot necessarily be scope since it doesn't know if the range will be stored or returned or whatever. Though perhaps if the returned map result inherited the lifetime of the captured variables it could work, just that gets complicated.
I suspect the compiler should be able to see through it after a few rounds of inlining. If there are so many layers that it doesn't, I suspect one allocation isn't going to be your bottleneck.
`ldc -O` is able to elide the allocation: <https://d.godbolt.org/z/9aoMo9hbe> However, the code still does not qualify as `@nogc`, because `@nogc` analysis is done prior to optimization.
Some more context (i.e. how LDC goes about doing this):

https://d.godbolt.org/z/jsPdoTxY1
https://github.com/ldc-developers/ldc/blob/master/gen/passes/GarbageCollect2Stack.cpp

GCC does not do this optimization (yet?), but it does for `malloc` and `new` in C++.
May 28
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 28 May 2021 at 14:29:50 UTC, Paul Backus wrote:
 `ldc -O` is able to elide the allocation: 
 <https://d.godbolt.org/z/9aoMo9hbe>

 However, the code still does not qualify as `@nogc`, because 
 `@nogc` analysis is done prior to optimization.
I've long argued that `@nogc` needs to trigger on leaks, not allocations, and this is one more example as to why. And tracking leaks isn't as crazy as it sounds when you track ownership: a leak is a transfer of ownership to the GC.
May 30
parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 30 May 2021 at 21:09:36 UTC, deadalnix wrote:
 I've long argued that `@nogc` needs to trigger on leaks, not 
 allocations, and this is one more example as to why.

 And tracking leaks isn't as crazy as it sounds when you track 
 ownership: a leak is a transfer of ownership to the GC.
I don't disagree that tracking ownership through the type system is desirable, but it isn't possible without explicit ownership. Shape analysis gets tricky very fast.
May 31
parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 31 May 2021 at 15:08:34 UTC, Ola Fosheim Grøstad wrote:
 On Sunday, 30 May 2021 at 21:09:36 UTC, deadalnix wrote:
 I've long argued that `@nogc` needs to trigger on leaks, not 
 allocations, and this is one more example as to why.

 And tracking leaks isn't as crazy as it sounds when you track 
 ownership: a leak is a transfer of ownership to the GC.
I don't disagree that tracking ownership through the type system is desirable, but it isn't possible without explicit ownership. Shape analysis gets tricky very fast.
I think it is unavoidable. DIP1000 and the like are just doing exactly that while pretending they aren't, and it is causing a parsing-XML-with-regex kind of problem: it looks like it'll actually work, but it doesn't.
May 31
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Monday, 31 May 2021 at 23:37:21 UTC, deadalnix wrote:
 I think it is unavoidable. DIP1000 and the like are just doing 
 exactly that while pretending they aren't, and it is causing a 
 parsing-XML-with-regex kind of problem: it looks like 
 it'll actually work, but it doesn't.
True, no point in avoiding the machinery if that is what one is aiming for. Better to take the full machinery with explicit annotations, and then provide syntactical sugar for the common case if that is desirable. Going the other way will most likely be messy.
May 31
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 21 May 2021 at 00:31:52 UTC, TheGag96 wrote:
 Dunno if that's a pipe dream, though.
Once it is all inlined, the backend ought to be able to see what is going on here and elide the allocations. `@nogc` might not pass it, though, since that check runs before all those optimizations are performed... But I just checked: `ldc -O` does indeed optimize it out today.
May 20
prev sibling parent reply Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:
 Long story short: https://issues.dlang.org/show_bug.cgi?id=21929

 Closure do not respect scope the way they should. Let's fix it.
After Walter's post I definitely see what is happening.

```d
for (int i = 0; i < 10; i++) {
    int index = i;
    dgs ~= () {
        import std.stdio;
        writeln(index);
    };
}
```

When this loop concludes, the value of `i` is 10 and the value of `index` is 9 (as shown in your output). This is because within the `for` logic, `i` was increased and the condition `10 < 10` evaluated to false. This means the `for` body is not executed again, leaving `index` at 9. I don't know what compiler magic you would expect to be "correct" here. We can't say `i` should be 9, as the loop would not have exited then. We certainly don't want `index` to be 10, as that would mean the loop body executed one more time than it was defined to. Untested:

```d
int i;
int index;
for (i = 0; i < 10; i++) {
    index = i;
    dgs ~= () {
        import std.stdio;
        writeln(i);
        writeln(index);
    };
}
```
May 19
parent Paul Backus <snarwin gmail.com> writes:
On Wednesday, 19 May 2021 at 14:24:57 UTC, Jesse Phillips wrote:
 ```dlang
 for (int i = 0; i < 10; i++) {
         int index = i;
         dgs ~= () {
             import std.stdio;
             writeln(index);
         };
     }
 ```

 When this loop concludes, the value of `i` is 10 and the value 
 of index is 9 (as shown from your output).

 This is because within the `for` logic `i` was increased and it 
 determined `10 < 10` is false. This means the `for`body is not 
 executed again leaving `index` at 9.

 I don't know why compiler magic you would expect is "correct" 
 here. We can't say `i` should be 9 as the loop would not have 
 exited then. We certainly don't want `index` to be 10 as that 
 would mean the loop expected on more time than it was defined 
 to.
A local variable's lifetime starts at its declaration and ends at the closing brace of the scope where it's declared:

```d
void main()
{
    int x; // start of x's lifetime
    {
        int y; // start of y's lifetime
    } // end of y's lifetime
    int z; // start of z's lifetime
} // end of x's and z's lifetimes
```

This also applies to variables inside loops:

```d
void main()
{
    foreach (i; 0 .. 10)
    {
        int x; // start of x's lifetime
    } // end of x's lifetime
}
```

We can see that this is the case by declaring a variable with a destructor inside a loop:

```d
import std.stdio;

struct S
{
    ~this() { writeln("destroyed"); }
}

void main()
{
    foreach (i; 0 .. 10)
    {
        S s; // start of s's lifetime
    } // end of s's lifetime
}
```

The above program prints "destroyed" 10 times. At the start of each loop iteration, a new instance of `s` is initialized; at the end of each iteration, it is destroyed.

Normally, an instance of a variable declared inside a loop cannot outlive the loop iteration in which it was created, so the compiler is free to reuse the same memory for each instance. We can verify that it does so by printing out the address of each instance:

```d
import std.stdio;

struct S
{
    ~this() { writeln("destroyed ", &this); }
}

void main()
{
    foreach (i; 0 .. 10)
    {
        S s;
    }
}
```

On `run.dlang.io`, this prints "destroyed 7FFE478D283C" 10 times.

However, when an instance of a variable declared inside a loop is captured in a closure, it becomes possible to access that instance even after the loop iteration that created it has finished. In this case, the lifetimes of the instances may overlap, and it is no longer a valid optimization to reuse the same memory for each one. We can see this most clearly by declaring the variable in the loop `immutable`:

```d
void main()
{
    int delegate()[10] dgs;

    foreach (i; 0 .. 10)
    {
        immutable index = i;
        dgs[i] = () => index;
        assert(dgs[i]() == i);
    }

    foreach (i; 0 .. 10)
    {
        // if this fails, something has mutated immutable data!
        assert(dgs[i]() == i);
    }
}
```

If you run the above program, you will see that the assert in the second loop does, in fact, fail. By using the same memory to store each instance of `index`, the compiler has generated incorrect code that allows us to observe mutation of `immutable` data--something that the language spec itself says is undefined behavior.

In order to compile this code correctly, the compiler *must* allocate a separate location in memory for each instance of `index`. Those locations can be either on the stack (if the closure does not outlive the function) or on the heap; the important part is that they cannot overlap.
May 19