www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - From stack delegates to true closures

reply "Søren J. Løvborg" <web kwi.dk> writes:
I'm aware this has been discussed before, but I'd like to bring it up again 
(sorry!): D really should have true closures, or "stack delegates that can 
be returned" if you will.

First of all, they're a very useful feature, allowing short and concise 
code.
Secondly, they'd be a good D feature to point at whenever faced with the 
question, "Oh, but what's D got that Java/C++ hasn't?"
But mainly, because D is already so close to supporting them with stack 
delegates (aka dynamic closures).

The problem with stack delegates is that they're only valid as long as one 
does not return from the function in which they were created, which is a 
rather serious limitation, and (it seems to me) the only thing standing in 
the way of true closures.

It's been argued previously that implementing true closures would add a lot 
of overhead to the existing stack delegates -- the cost of copying the 
entire stackframe has been mentioned as a problem. However, it seems to me 
that this can be reduced to the much smaller cost of allocating a block of 
memory, having the garbage collector free it later, and an additional level 
of indirection.

Contrived example: createFunc returns a stack delegate, which takes an "out" 
MyType argument and initializes it to a value calculated at the time 
createFunc was run.

void delegate(out MyType) createFunc()
{
    MyType foo, bar;
    /* do stuff with foo and bar */
    return delegate void(out MyType dst) { dst = foo; };
}

This compiles fine, but doens't work, since foo is long gone from the stack 
at the time the returned delegate is invoked. In order for it to work, foo 
will have to be allocated on the heap, rather than on the stack. (Static 
variables aren't thread-safe.) Simply copying the value of foo to a newly 
allocated block of memory isn't a good solution, since 1) MyType might be a 
very large type (say, int[100000]) and 2) changes to one copy does not 
propagate to the other.

It seems to me that these problems can be avoided, however, by never 
allocating foo on the stack. In other words, the above code should be 
translated into something like

void delegate(out MyType) createFunc()
{
    struct _EXT { MyType foo; };
    _EXT* _ext = new _EXT;

    MyType bar;
    /* do stuff with _EXT.foo and bar */
    return delegate void(out MyType dst) { dst = _EXT.foo; };
}

where, instead of wrapping the soon-to-be invalid stack pointer in the 
delegate, the compiler should wrap the actual _ext pointer, adjusting the 
delegate code as necessary.

The delegate runs as fast as ever, and this would prevent any unnecessary 
copying, reducing the overhead to that of an allocation (peanuts, I hope), 
an extra level of indirection when accessing foo from within createFunc 
(which the compiler should be able to optimize away in most cases, as the 
_ext pointer can be cached in a register), and the cost of garbage 
collecting the _EXT storage.

I guess the GC'ing could cause trouble if/when delegates and function 
pointers are merged, and delegates might end up being stored in 
non-GC-managed memory (for Windows API callbacks, e.g.) -- however, since 
_ext will remain on the stack until the function returns, this new kind of 
delegate would still be valid for at least as long as the current kind.

The allocating/GC overhead might prove troublesome if a lot of delegates 
were allocated, although one probably shouldn't use delegates in bottleneck 
code anyway (in that respect, function calls aren't exactly cheap either).

One option to work around this might be to add new syntax, say "new 
delegate" for this new behaviour, if implemented.

Any comments? Am I the only one who'd find this useful? Is there already 
similar functionality, and I've just not found it? (If so, please 
elaborate!)

Søren J. Løvborg
web kwi.dk 
Jan 31 2006
next sibling parent reply "Kris" <fu bar.com> writes:
Why not just do something like this instead (untested) ?

void delegate (out MyType) createFunc()
{
      class Foo
      {
            MyType foo;

            void func (out MyType dst) {dst = foo;}
      }
      return & (new Foo).func;
}

I presume the anonymous-class support in D would make that a bit more 
concise, but the general strategy should work?




"Søren J. Løvborg" <web kwi.dk> wrote in message 
news:drod3j$1jm0$1 digitaldaemon.com...
 I'm aware this has been discussed before, but I'd like to bring it up 
 again (sorry!): D really should have true closures, or "stack delegates 
 that can be returned" if you will.

 First of all, they're a very useful feature, allowing short and concise 
 code.
 Secondly, they'd be a good D feature to point at whenever faced with the 
 question, "Oh, but what's D got that Java/C++ hasn't?"
 But mainly, because D is already so close to supporting them with stack 
 delegates (aka dynamic closures).

 The problem with stack delegates is that they're only valid as long as one 
 does not return from the function in which they were created, which is a 
 rather serious limitation, and (it seems to me) the only thing standing in 
 the way of true closures.

 It's been argued previously that implementing true closures would add a 
 lot of overhead to the existing stack delegates -- the cost of copying the 
 entire stackframe has been mentioned as a problem. However, it seems to me 
 that this can be reduced to the much smaller cost of allocating a block of 
 memory, having the garbage collector free it later, and an additional 
 level of indirection.

 Contrived example: createFunc returns a stack delegate, which takes an 
 "out" MyType argument and initializes it to a value calculated at the time 
 createFunc was run.

 void delegate(out MyType) createFunc()
 {
    MyType foo, bar;
    /* do stuff with foo and bar */
    return delegate void(out MyType dst) { dst = foo; };
 }

 This compiles fine, but doens't work, since foo is long gone from the 
 stack at the time the returned delegate is invoked. In order for it to 
 work, foo will have to be allocated on the heap, rather than on the stack. 
 (Static variables aren't thread-safe.) Simply copying the value of foo to 
 a newly allocated block of memory isn't a good solution, since 1) MyType 
 might be a very large type (say, int[100000]) and 2) changes to one copy 
 does not propagate to the other.

 It seems to me that these problems can be avoided, however, by never 
 allocating foo on the stack. In other words, the above code should be 
 translated into something like

 void delegate(out MyType) createFunc()
 {
    struct _EXT { MyType foo; };
    _EXT* _ext = new _EXT;

    MyType bar;
    /* do stuff with _EXT.foo and bar */
    return delegate void(out MyType dst) { dst = _EXT.foo; };
 }

 where, instead of wrapping the soon-to-be invalid stack pointer in the 
 delegate, the compiler should wrap the actual _ext pointer, adjusting the 
 delegate code as necessary.

 The delegate runs as fast as ever, and this would prevent any unnecessary 
 copying, reducing the overhead to that of an allocation (peanuts, I hope), 
 an extra level of indirection when accessing foo from within createFunc 
 (which the compiler should be able to optimize away in most cases, as the 
 _ext pointer can be cached in a register), and the cost of garbage 
 collecting the _EXT storage.

 I guess the GC'ing could cause trouble if/when delegates and function 
 pointers are merged, and delegates might end up being stored in 
 non-GC-managed memory (for Windows API callbacks, e.g.) -- however, since 
 _ext will remain on the stack until the function returns, this new kind of 
 delegate would still be valid for at least as long as the current kind.

 The allocating/GC overhead might prove troublesome if a lot of delegates 
 were allocated, although one probably shouldn't use delegates in 
 bottleneck code anyway (in that respect, function calls aren't exactly 
 cheap either).

 One option to work around this might be to add new syntax, say "new 
 delegate" for this new behaviour, if implemented.

 Any comments? Am I the only one who'd find this useful? Is there already 
 similar functionality, and I've just not found it? (If so, please 
 elaborate!)

 Søren J. Løvborg
 web kwi.dk
 
Jan 31 2006
parent "Søren J. Løvborg" <web kwi.dk> writes:
"Kris" <fu bar.com> wrote:
 Why not just do something like this instead (untested) ?

 void delegate (out MyType) createFunc()
 {
      class Foo
      {
            MyType foo;

            void func (out MyType dst) {dst = foo;}
      }
      return & (new Foo).func;
 }
That is essentially how I imagined true closures to be implemented (except for any extra garbage that classes might bring along with them; I don't know if the current D implementation includes a VMT for classes with neither parent nor children, for instance). However, the above code is quite verbose compared to void delegate (out MyType) createFunc() { MyType foo; return delegate void(out MyType dst) {dst = foo;}; } Søren J. Løvborg web kwi.dk
Feb 02 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Søren J. Løvborg wrote:
 I'm aware this has been discussed before, but I'd like to bring it up again 
 (sorry!): D really should have true closures, or "stack delegates that can 
 be returned" if you will.
I just had a crazy vision of a stackless version of D where closures simply reference the needed data via a pointer and let the GC worry about cleaning it up. Might be an interesting experiment if nothing else. Sean
Jan 31 2006
parent John Reimer <terminal.node gmail.com> writes:
Sean Kelly wrote:
 Søren J. Løvborg wrote:
 I'm aware this has been discussed before, but I'd like to bring it up 
 again (sorry!): D really should have true closures, or "stack 
 delegates that can be returned" if you will.
I just had a crazy vision of a stackless version of D where closures simply reference the needed data via a pointer and let the GC worry about cleaning it up. Might be an interesting experiment if nothing else. Sean
I like the idea, actually. D gives a taste of what is possible, but it doesn't quite go all the way. Once I read up on what real dynamic closures were all about, my imagination was piqued: I realized D really just pretends at dynamic closures. ( The idea of closures seem to have drifted over from the functional languages ). Kris' suggestion would work, of course. But that way doesn't look near as elegant. It looks like a workaround for what should be left to real dynamic closures (sorry, Kris ;) ). But perhaps looks are deceiving. -JJR
Jan 31 2006
prev sibling next sibling parent reply "Walter Bright" <newshound digitalmars.com> writes:
I think static closures are very useful and are a logical step for D (since 
they add power without changing syntax at all), just not now. 
Feb 02 2006
next sibling parent "Søren J. Løvborg" <web kwi.dk> writes:
"Walter Bright" <newshound digitalmars.com> wrote:
I think static closures are very useful and are a logical step for D (since 
they add power without changing syntax at all), just not now.
Great! I'll look forward to their inclusion, in 2.0 or whenever it's ready. Søren J. Løvborg web kwi.dk
Feb 04 2006
prev sibling parent reply John Reimer <terminal.node gmail.com> writes:
Walter Bright wrote:
 I think static closures are very useful and are a logical step for D (since 
 they add power without changing syntax at all), just not now. 
 
 
Woah! Wait a minute, what's the difference between static closures and dynamic closures? I think I got my terms mixed up in my post to this thread. Apparently D supports dynamic closures already (stack closures?). Are static closures what is being referred to as "true" closures then? -JJR
Feb 04 2006
parent reply "Walter Bright" <newshound digitalmars.com> writes:
"John Reimer" <terminal.node gmail.com> wrote in message 
news:ds2pr8$2cb1$1 digitaldaemon.com...
 Walter Bright wrote:
 I think static closures are very useful and are a logical step for D 
 (since they add power without changing syntax at all), just not now.
Woah! Wait a minute, what's the difference between static closures and dynamic closures? I think I got my terms mixed up in my post to this thread. Apparently D supports dynamic closures already (stack closures?). Are static closures what is being referred to as "true" closures then?
That's my understanding.
Feb 08 2006
parent Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Walter Bright wrote:
 "John Reimer" <terminal.node gmail.com> wrote in message 
 news:ds2pr8$2cb1$1 digitaldaemon.com...
 Walter Bright wrote:
 I think static closures are very useful and are a logical step for D 
 (since they add power without changing syntax at all), just not now.
Woah! Wait a minute, what's the difference between static closures and dynamic closures? I think I got my terms mixed up in my post to this thread. Apparently D supports dynamic closures already (stack closures?). Are static closures what is being referred to as "true" closures then?
That's my understanding.
Sounds like a very bad name term then. "True" closures imply allocating the function frame/context on the heap, which is all but static. -- Bruno Medeiros - CS/E student "Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
Feb 09 2006
prev sibling parent Eugene K. <Eugene_member pathlink.com> writes:
In article <drod3j$1jm0$1 digitaldaemon.com>, S�ren J. L�vborg says...
I'm aware this has been discussed before, but I'd like to bring it up again 
(sorry!): D really should have true closures, or "stack delegates that can 
be returned" if you will.

First of all, they're a very useful feature, allowing short and concise 
code.
One option to work around this might be to add new syntax, say "new 
delegate" for this new behaviour, if implemented.

Any comments? Am I the only one who'd find this useful? Is there already 
similar functionality, and I've just not found it? (If so, please 
elaborate!)

S�ren J. L�vborg
web kwi.dk 
It is very useful as I see it. I recently looked at D at found it very promising (being disappointed in C++, after all these years). Currently, I use Python in my projects, and these "true" closures are quite natural in Python: def make_adder(n): def _adder(m): return n + m return _adder adder = make_adder(3) print adder(2) gives 5. (Sorry for this alien language snippet). Looking for something faster than Python, I tried to do that in D, but no way: import std.stdio; int delegate(int) make_adder(int what) { int _adder(int num) { return what + num; } return _adder; } void main() { int delegate(int) adder; adder = make_adder(3); writefln("%d", adder(2)); // Oops! the stack frame is gone :( } And worse, this doesn't work too: int delegate(int) make_adder(int what) { int *pw = new int; *pw = what; int _adder(int num) { return *pw + num; // Doesn't work either - the *pw is corrupted // after exiting the stack frame } return _adder; } About syntax, it would be convenient it I had another storage modifier, like "managed": int delegate(int) make_adder(managed int what) { int _adder(int num) { return what + num; // "What" is allocated on the heap // and managed with GC } return _adder; } Just my $0.02, and sorry for bad English. -- Eugene
Feb 10 2006