digitalmars.D.learn - Strange closure behaviour

Emmanuelle <VuLXn6DBW PPtUm7TvV6nsw.com> writes:

Take a look at this code:

---
import std.stdio;

void main()
{
     alias Func = void delegate(int);

     int[][] nums = new int[][5];
     Func[] funcs;
     foreach (x; 0 .. 5) {
         funcs ~= (int i) { nums[x] ~= i; };
     }

     foreach (i, func; funcs) {
         func(cast(int) i);
     }

     writeln(nums);
}
---

(https://run.dlang.io/is/oMjNRL)

The output is:

---
[[], [], [], [], [0, 1, 2, 3, 4]]
---

Personally, this makes no sense to me. This is the result I was 
expecting:

---
[[0], [1], [2], [3], [4]]
---

Why is it "locking" the bound `x` to the last element? It seems 
like the compiler is overwriting the closure for `x`, somehow. 
So, I'm wondering why D is doing that. Is it a compiler bug? Or 
is this the expected behaviour?

Jun 14 2019

Adam D. Ruppe <destructionator gmail.com> writes:

On Saturday, 15 June 2019 at 00:24:52 UTC, Emmanuelle wrote:
 Is it a compiler bug?

Yup, a very longstanding bug.

You can work around it by wrapping it all in another layer of 
function which you immediately call (which is fairly common in 
javascript):

         funcs ~= ((x) => (int i) { nums[x] ~= i; })(x);

Or maybe less confusingly written long form:

         funcs ~= (delegate(x) {
             return (int i) { nums[x] ~= i; };
         })(x);

You write a function that returns your actual function and 
immediately call it with the loop variable, which explicitly 
makes a copy of it.

Jun 14 2019

Emmanuelle <VuLXn6DBW PPtUm7TvV6nsw.com> writes:

On Saturday, 15 June 2019 at 00:30:43 UTC, Adam D. Ruppe wrote:
 On Saturday, 15 June 2019 at 00:24:52 UTC, Emmanuelle wrote:
 Is it a compiler bug?

 Yup, a very longstanding bug.

 You can work around it by wrapping it all in another layer of 
 function which you immediately call (which is fairly common in 
 javascript):

         funcs ~= ((x) => (int i) { nums[x] ~= i; })(x);

 Or maybe less confusingly written long form:

         funcs ~= (delegate(x) {
             return (int i) { nums[x] ~= i; };
         })(x);

 You write a function that returns your actual function, and 
 immediately calls it with the loop variable, which will 
 explicitly make a copy of it.

Oh, I see. Unfortunate that it's a longstanding compiler bug, but 
at least the rather awkward workaround will do. Thank you!

Jun 14 2019

Rémy Mouëza <remy.moueza gmail.com> writes:

On Saturday, 15 June 2019 at 01:21:46 UTC, Emmanuelle wrote:
 Oh, I see. Unfortunate that it's a longstanding compiler bug, 
 but at least the rather awkward workaround will do. Thank you!

I don't know if we can call this a compiler bug. The same 
behavior happens in Python. The logic is that the variable `x` is 
captured by reference: the closure's context contains a 
pointer/reference to `x`, so whenever `x` is updated outside of 
the closure, the closure sees the modified value. Hence the 
seemingly strange behavior.
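
A minimal Python sketch of the capture logic described above, for illustration only (the name `capture_demo` is made up for this example). The lambda holds on to the variable `x` itself rather than the value it had when the lambda was created:

```python
def capture_demo():
    """Return what a closure sees before and after its captured variable changes."""
    x = 1
    read_x = lambda: x   # captures the variable x, not the value 1
    before = read_x()    # reads x now: 1
    x = 2                # rebinding x outside the closure...
    after = read_x()     # ...is visible through the closure: 2
    return before, after

print(capture_demo())  # (1, 2)
```

Note that this only shows one variable being updated in one scope; whether a loop variable counts as "one variable" is a separate question that depends on the language's scoping rules.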

Adam's workaround ensures that the closure captures a temporary 
`x` variable on the stack: a copy will be made instead of taking 
a reference, since a pointer to `x` would be dangling once the 
`delegate(x){...}` returns.

Most of the time, we want a pointer/reference to the enclosed 
variables in our closures. Note that C++17 allows one to select 
the capture mode; the following link lists 8 of them: 
https://en.cppreference.com/w/cpp/language/lambda#Lambda_capture

D offers a convenient default that works most of the time. The 
trade-off is having to deal with the creation of several closures 
referencing a variable being modified in a single scope, like the 
incremented `x` of the for loop.

That said, I wouldn't mind having the compiler deal with that 
case: detecting that `x` is within a for loop and making copies 
of it in the closures' contexts.

Jun 15 2019

Adam D. Ruppe <destructionator gmail.com> writes:

On Saturday, 15 June 2019 at 16:29:29 UTC, Rémy Mouëza wrote:
 I don't know if we can tell this is a compiler bug.

I can't remember where the key fact was, but I used to agree with 
you (several languages work this same way, and it makes a lot of 
sense for ease of the implementation), but someone convinced me 
otherwise by pointing to the language of the D spec.

I just can't find that reference right now...

It is worth noting too that the current behavior also opens up a 
hole in the immutable promises; the loop variable can be passed 
as immutable to the outside via a delegate, but then modified 
afterward, which is unambiguously a bug.

Regardless of bug vs spec, it isn't implemented and I wouldn't 
expect that to change any time soon, so it is good to just learn 
the wrapper function technique :) (and it is useful in those 
other languages too)

Jun 15 2019

Emmanuelle <VuLXn6DBW PPtUm7TvV6nsw.com> writes:

On Saturday, 15 June 2019 at 16:29:29 UTC, Rémy Mouëza wrote:
 I don't know if we can tell this is a compiler bug. The same 
 behavior happens in Python. The logic being variable `x` is 
 captured by the closure. That closure's context will contain a 
 pointer/reference to x. Whenever x is updated outside of the 
 closure, the context still points to the modified x. Hence the 
 seemingly strange behavior.

I come from Ruby, where it works as I expected, so I assumed all 
languages would work like that; but then, D surprised me, and 
now, Python too, and apparently a whole bunch of other languages 
(which is honestly kinda disheartening since I like throwing 
lambdas everywhere.)

Jun 15 2019

Timon Gehr <timon.gehr gmx.ch> writes:

On 15.06.19 18:29, Rémy Mouëza wrote:
 I don't know if we can tell this is a compiler bug.

It's a bug. It's memory corruption. Different objects with overlapping 
lifetimes use the same memory location.

 The same behavior happens in Python.

No, it's not the same. Python has no sensible notion of variable scope.

>>> for i in range(3): pass
...
>>> print(i)
2

Yuck.

 The logic being variable `x` is captured by the 
 closure. That closure's context will contain a pointer/reference to x. 
 Whenever x is updated outside of the closure, the context still points 
 to the modified x. Hence the seemingly strange behavior.
 ...

It's not the same instance of the variable. Foreach loop variables are 
local to the loop body. They may both be called `x`, but they are not 
the same. It's most obvious with `immutable` variables.

 Adam's workaround ensures that the closure captures a temporary `x` 
 variable on the stack: a copy will be made instead of taking a 
 reference, since a pointer to `x` would be dangling once the 
 `delegate(x){...}` returns.
 
 Most of the time, we want a pointer/reference to the enclosed variables 
 in our closures. Note that C++ 17 allows one to select the capture mode: 
 the following link lists 8 of them: 
 https://en.cppreference.com/w/cpp/language/lambda#Lambda_capture.
 ...

No, this is not an issue of by value vs by reference. All captures in D 
are by reference, yet the behavior is wrong.

 D offers a convenient default that works most of the time. The trade-off 
 is having to deal with the creation of several closures referencing a 
 variable being modified in a single scope, like the incremented `x` of 
 the for loop.
 ...

Capturing by reference may be a convenient default, but even when 
capturing by reference the behavior is wrong.

Jun 15 2019

Rémy Mouëza <remy.moueza gmail.com> writes:

On Sunday, 16 June 2019 at 01:36:38 UTC, Timon Gehr wrote:
 It's a bug. It's memory corruption. Different objects with 
 overlapping lifetimes use the same memory location.

Okay. Seen that way, it is clear to me why it's a bug.

 ...
 No, it's not the same. Python has no sensible notion of 
 variable scope.

 >>> for i in range(3): pass
 ...
 >>> print(i)
 2

 Yuck.

I got confused by this Python behavior:

ls = []
for i in range(0, 5):
    ls.append(lambda x: x + i)
for fun in ls:
    print(fun(0))

This prints:
4
4
4
4
4
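
For comparison, here is a sketch of the wrapper function technique from earlier in the thread applied to the Python snippet above. This is only an illustration under the assumption that the goal is one distinct `i` per lambda; `build_adders` is a made-up name:

```python
def build_adders():
    """Build the five lambdas, giving each one its own copy of i."""
    ls = []
    for i in range(0, 5):
        # The outer lambda is called immediately with the current i.
        # Its parameter is a fresh variable, and that is what the
        # inner lambda captures instead of the shared loop variable.
        ls.append((lambda i: lambda x: x + i)(i))
    return ls

for fun in build_adders():
    print(fun(0))
```

With the wrapper in place this prints 0, 1, 2, 3, 4 on separate lines instead of five 4s.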

Jun 16 2019
