
digitalmars.D - Dynamic Closure + Lazy Arguments = Performance Killer?

reply Jason House <jason.james.house gmail.com> writes:
I ported some monte carlo simulation code from Java to D2, and performance is
horrible.

34% of the execution time is used by std.random.uniform. To my great surprise,
25% of the execution  time is memory allocation (and collection) from that
random call. The only candidate source I see is a call to ensure with lazy
arguments. The memory allocation occurs at the start of the UniformDistribution
call. I assume this is dynamic closure kicking in.

Can anyone verify that this is the case?

600000 memory allocations per second really kills performance!
Oct 24 2008
next sibling parent reply Gregor Richards <Richards codu.org> writes:
Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and performance is
horrible.
 
 34% of the execution time is used by std.random.uniform. To my great surprise,
25% of the execution  time is memory allocation (and collection) from that
random call. The only candidate source I see is a call to ensure with lazy
arguments. The memory allocation occurs at the start of the UniformDistribution
call. I assume this is dynamic closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
Java has a much better garbage collector than D, as it doesn't need to be conservative. - Gregor Richards
Oct 24 2008
parent Jason House <jason.james.house gmail.com> writes:
Gregor Richards Wrote:

 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and performance is
horrible.
 
 34% of the execution time is used by std.random.uniform. To my great surprise,
25% of the execution  time is memory allocation (and collection) from that
random call. The only candidate source I see is a call to ensure with lazy
arguments. The memory allocation occurs at the start of the UniformDistribution
call. I assume this is dynamic closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
Java has a much better garbage collector than D, as it doesn't need to be conservative. - Gregor Richards
The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger. This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.
Oct 24 2008
prev sibling next sibling parent reply Frank Benoit <keinfarbton googlemail.com> writes:
Jason House schrieb:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.
 
 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
This has been written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. Nested functions passed as callbacks are an important and performant technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.
Oct 24 2008
next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 7:23 AM, Frank Benoit
<keinfarbton googlemail.com> wrote:
 Jason House schrieb:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
It was written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. The nested functions passed as callback are an important and performance technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.
Not to mention that, among the top problems plaguing D2 currently, it should be one of the easier things to fix. Far easier than figuring out overhauls for operator overloading, or construction syntax, or how ranges should work, or forward reference, or figuring out how 'shared' should work, or merging Tango and Phobos. Compared to those it's pretty easy to solve this one! Personally I think no alloc should be the default, with different syntax to get a full closure. Using the "new" keyword somehow makes sense to me. --bb
Oct 24 2008
prev sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Frank Benoit wrote:
 Jason House schrieb:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
It was written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. The nested functions passed as callback are an important and performance technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.
Agreed. Is it in bugzilla?
Oct 24 2008
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jason House:

 34% of the execution time is used by std.random.uniform.
The Kiss generator in Tango is much faster, and there's an even faster (but still good) random number generator around...
 Can anyone verify that this is the case?
Can you show us a minimal working example that we can test?

-------------------

Frank Benoit:
 The D2 full closure "feature" effectively removes it and makes D2 less
 attractive IMHO.
The first simple solution is to allow adding "scope" to closures so they don't use the heap (but I don't know how to do that in every situation, and it makes the already long syntax of lambdas even longer). But probably the best way for D to become more functional is to grow some more optimizing capabilities, so that closures aren't a problem anymore. (Normal functional programming is often full of functions that move everywhere; often they are closures, but only virtually.) There are many ways to perform such optimizations, but in a language mostly based on side effects it's less easy.

Bye,
bearophile
Oct 24 2008
parent reply Jason House <jason.james.house gmail.com> writes:
bearophile wrote:

 Jason House:
 
 34% of the execution time is used by std.random.uniform.
Kiss of Tango is much faster, and there's a much faster still (but good still) rnd generator around...
 Can anyone verify that this is the case?
Can you show us a working minimal code I/we can test? ------------------- Frank Benoit:
The D2 full closure "feature" effectively removes it and makes D2 less
attractive IMHO.<
The first simple solution is to add the possibility of adding "scope" to closures to not use the heap (but I don't know how to do that in every situation, and it makes the already long syntax of lambdas even longer). But the probably best way for D to become more functional (and normal functional programming is often full of functions that move everywhere, often they are closures, but only virtually) is to grow some more optimizing capabilities, so closures aren't a problem anymore. There are many ways to perform such optimizations (but in a language mostly based on side effects it's less easy). Bye, bearophile
The following spends 90% of its time in _d_alloc_memory:

void bar(lazy int i){}
void foo(int i){ bar(i); }
void main(){ foreach(int i; 1..1000000) foo(i); }

Compiling with -O -release reduces it to 88% :)
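For readers following along, here is a minimal sketch of what a lazy parameter does, which is why the compiler has to wrap the argument expression in a hidden delegate at every call site. The names evaluations, next, ignore and twice are made up for illustration and are not from the code discussed above.

```d
import std.stdio;

int evaluations; // counts how often the lazy argument expression actually runs

int next() { return ++evaluations; }

// Never reads x, so the argument expression is never evaluated.
void ignore(lazy int x) {}

// Reads x twice, so the argument expression is evaluated twice.
int twice(lazy int x) { return x + x; }

void main()
{
    ignore(next());
    writeln(evaluations);   // the call above did not run next()
    writeln(twice(next())); // next() runs once per read of x
    writeln(evaluations);
}
```

Because the callee decides if and when the expression runs, the caller must hand over something callable that can still reach the caller's local variables. Whether that context lives on the stack or on the heap is exactly what this thread is about.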
Oct 24 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page; it takes just a few seconds to run on D1):
http://en.wikipedia.org/wiki/Man_or_boy_test

What syntax can we use to avoid heap allocation? A few ideas:

void bar(lazy int i){} // like D1
void bar(scope lazy int i){} // like D1
void bar(closure int i){} // like current D2

Bye,
bearophile
Oct 24 2008
parent reply Jason House <jason.james.house gmail.com> writes:
bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2 Bye, bearophile
I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.
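A minimal sketch of the "scope on input delegates" idea, written against later D2 compilers, where a scope delegate parameter tells the compiler that the callee does not keep the delegate, so the caller's frame can stay on the stack. The @nogc attribute is used here only as a compile-time check that no GC allocation happens. The function names callTwice and sumWithOffset are hypothetical.

```d
import std.stdio;

// 'scope' promises that dg is not stored anywhere that outlives this call.
int callTwice(scope int delegate() @nogc dg) @nogc
{
    return dg() + dg();
}

// @nogc makes the compiler reject this function if a closure had to be
// allocated on the GC heap for the delegate literal below.
int sumWithOffset(int offset) @nogc
{
    int local = 10;
    // The literal captures 'local' and 'offset' from this stack frame.
    return callTwice(() => local + offset);
}

void main()
{
    writeln(sumWithOffset(1));
}
```

If the scope annotation is dropped from the parameter, the compiler can no longer assume the delegate stays within the call, and the same code should then be rejected under @nogc because a heap closure would be required.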
Oct 25 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 5:24 PM, Jason House
<jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to require
some kind of declaration on the caller's side when the compiler can't prove
safety. It's best for ambiguous cases to be a warning (error). It also makes
the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax. But apparently nobody who knows anything about what's actually going to happen is involved in this discussion, so I think I'll just pipe down for now. --bb
Oct 25 2008
next sibling parent Jason House <jason.james.house gmail.com> writes:
Bill Baxter Wrote:

 On Sat, Oct 25, 2008 at 5:24 PM, Jason House
 <jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to require
some kind of declaration on the caller's side when the compiler can't prove
safety. It's best for ambiguous cases to be a warning (error). It also makes
the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax. But apparently nobody who knows anything about what's actually going to happen is involved in this discussion, so I think I'll just pipe down for now. --bb
While I agree that should be the default, I've already seen plenty of D1 code that incorrectly used stack-based closures. It really depends on your usage patterns. I do a lot of inter-thread communication in D1.
Oct 25 2008
prev sibling parent reply Lars Ivar Igesund <larsivar igesund.net> writes:
Bill Baxter wrote:

 On Sat, Oct 25, 2008 at 5:24 PM, Jason House
 <jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to
 require some kind of declaration on the caller's side when the compiler
 can't prove safety. It's best for ambiguous cases to be a warning
 (error). It also makes the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Oct 25 2008
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund  
<larsivar igesund.net> wrote:

 Bill Baxter wrote:

 On Sat, Oct 25, 2008 at 5:24 PM, Jason House
 <jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to
 require some kind of declaration on the caller's side when the compiler
 can't prove safety. It's best for ambiguous cases to be a warning
 (error). It also makes the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.
I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication. I also think that scope and heap-allocated delegates should have different types, so that no implicit cast from a scope delegate to a heap one is possible. In that case, a callee that receives the delegate could demand that it be heap-allocated (because it stores it, for example).
Oct 25 2008
next sibling parent Lars Ivar Igesund <larsivar igesund.net> writes:
Denis Koroskin wrote:

 On Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund
 <larsivar igesund.net> wrote:
 
 Bill Baxter wrote:

 On Sat, Oct 25, 2008 at 5:24 PM, Jason House
 <jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to
 require some kind of declaration on the caller's side when the compiler
 can't prove safety. It's best for ambiguous cases to be a warning
 (error). It also makes the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.
I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication. I also think that scope and heap-allocated delegates should have different types so that no imlicit casting from scope delegate to heap one would be possible. In this case callee function that recieves the delegate might demand the delegate to be heap-allocated (because it stores it, for example).
I definitely agree with this.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Oct 25 2008
prev sibling next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sat, Oct 25, 2008 at 10:44 AM, Denis Koroskin <2korden gmail.com> wrote:
 I also think that scope and heap-allocated delegates should have different
 types so that no imlicit casting from scope delegate to heap one would be
 possible. In this case callee function that recieves the delegate might
 demand the delegate to be heap-allocated (because it stores it, for
 example).
Fantastic. That also neatly solves the "returning a delegate" problem; it simply becomes illegal to return a scope delegate.
Oct 25 2008
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
 On Sat, Oct 25, 2008 at 10:44 AM, Denis Koroskin <2korden gmail.com> wrote:
 I also think that scope and heap-allocated delegates should have different
 types so that no imlicit casting from scope delegate to heap one would be
 possible. In this case callee function that recieves the delegate might
 demand the delegate to be heap-allocated (because it stores it, for
 example).
How would this work? For example:
-----
struct Struct {
    // fields...
    void foo() {
        // body
    }
}

void bar(Struct* p) {
    auto dg = &p.foo; // stack-based or heap-based delegate?
    // do stuff with dg
}
-----
? (There's no way to know if *p is a heap-based or stack-based struct)

Jarrett Billingsley wrote:
 Fantastic.  That also neatly solves the "returning a delegate"
 problem; it simply becomes illegal to return a scope delegate.
Even if "scope delegate" becomes a different type, sometimes such a "scope delegate" is perfectly safe to return:
-----
alias scope void delegate() Dg;

Dg foo(Dg dg) {
    return dg; // Why would this be illegal?
}
-----
Oct 25 2008
next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 25 Oct 2008 21:17:34 +0400, Frits van Bommel  
<fvbommel remwovexcapss.nl> wrote:

 On Sat, Oct 25, 2008 at 10:44 AM, Denis Koroskin <2korden gmail.com>  
 wrote:
 I also think that scope and heap-allocated delegates should have  
 different
 types so that no imlicit casting from scope delegate to heap one would  
 be
 possible. In this case callee function that recieves the delegate might
 demand the delegate to be heap-allocated (because it stores it, for
 example).
How would this work? For example: ----- struct Struct { // fields... void foo() { // body } } void bar(Struct* p) { auto dg = &p.foo; // stack-based or heap-based delegate? // do stuff with dg } ----- ?
Good question! First, let's expand the code:

void bar(Struct* p) {
    void delegate() dg;
    dg.ptr = p;
    dg.funcptr = &Struct.foo;
    // do stuff with dg
}

So, here is the question: is this a "stack-based or heap-based delegate"? I.e., may we return it from the function and pass it to functions that need a heap-based delegate, or not? Yes, we may obviously return it and call it outside of the function, so from this point of view it is indeed a "heap-allocated delegate", even if nothing is actually allocated.

But someone might say that it is unsafe to call this dg because at some point the object may cease to exist. To respond to this, let's rewrite the code to make it truly heap-allocated and see whether it gets any safer:

void bar(Struct* p) {
    void foo() {
        p.foo();
    }
    auto dg = &foo;
}

Now dg is heap-allocated (in the sense that the space for its local variables is allocated on the heap). May we return this delegate from the function? Yes. Is it any safer? No. They are absolutely the same.
      auto dg = &p.foo;	// stack-based or heap-based delegate?
Heap-based one, even if no actual allocation took place.
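The expansion above can be tried out with a small runnable sketch. It assumes a Struct with one int field, and the helper name bind is hypothetical. It shows that a delegate built by hand from .ptr and .funcptr matches the one produced by &p.foo, and that nothing in the delegate records where *p lives.

```d
import std.stdio;

struct Struct
{
    int value;
    int foo() { return value; }
}

// Build the delegate by hand from its two halves, as in the expansion above.
int delegate() bind(Struct* p)
{
    int delegate() dg;
    dg.ptr = p;               // context pointer: the object itself
    dg.funcptr = &Struct.foo; // code pointer: the member function
    return dg;
}

void main()
{
    auto p = new Struct(42);  // heap-allocated here, but bind() cannot tell
    auto viaAddressOf = &p.foo;
    auto viaFields = bind(p);

    writeln(viaAddressOf.ptr is viaFields.ptr);         // same context pointer
    writeln(viaAddressOf.funcptr is viaFields.funcptr); // same code pointer
    writeln(viaFields());                               // calls p.foo()
}
```

The delegate is only as long-lived as the object behind .ptr, which is the point being made: returning it is legal either way, and its safety depends on the lifetime of *p rather than on any allocation done for the delegate.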
Oct 25 2008
prev sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sat, Oct 25, 2008 at 1:17 PM, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:
 Fantastic.  That also neatly solves the "returning a delegate"
 problem; it simply becomes illegal to return a scope delegate.
Even if "scope delegate" becomes a different type, sometimes such a "scope delegate"s is perfectly safe to return: ----- alias scope void delegate() dg; Dg foo(Dg dg) { return dg; // Why would this be illegal? } -----
Clarification - it would be an error to return a scope delegate from the scope in which it was declared. Currently the behavior you mention (passing a scope delegate into a function then returning it) doesn't even exist for scope classes - parameters cannot be "scope". I would imagine, though, that if a parameter were scope, a function would be able to return that parameter, and in fact that would be the only way to return a scope reference (delegate or class) from a function.
Oct 25 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Denis Koroskin" wrote
 On Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund 
 <larsivar igesund.net> wrote:

 Bill Baxter wrote:

 On Sat, Oct 25, 2008 at 5:24 PM, Jason House
 <jason.james.house gmail.com> wrote:
 bearophile Wrote:

 Jason House:
 The following spends 90% of its time in _d_alloc_memory
 void bar(lazy int i){}
 void foo(int i){ bar(i); }
 void main(){ foreach(int i; 1..1000000) foo(i); }
 Compiling with -O -release reduces it to 88% :)
I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2
This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
 I would assume a fix would be to add scope to input delegates and to
 require some kind of declaration on the caller's side when the compiler
 can't prove safety. It's best for ambiguous cases to be a warning
 (error). It also makes the code easier for readers to follow.
I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.
I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication. I also think that scope and heap-allocated delegates should have different types so that no imlicit casting from scope delegate to heap one would be possible. In this case callee function that recieves the delegate might demand the delegate to be heap-allocated (because it stores it, for example).
I've been thinking about this solution, and I think the decision to allocate on the stack or on the heap should be left up to the developer, and no types should be assigned. Think about an example like this:

class DelegateCaller
{
    private int delegate() _foo;
    this(int delegate() foo) { _foo = foo; }
    int callit() { return _foo(); }
}

int f1()
{
    int x() { return 5; }
    scope dc = new DelegateCaller(&x); // allocate on stack
    return dc.callit() * dc.callit();
}

DelegateCaller f2()
{
    int x() { return 5; }
    return new DelegateCaller(&x); // allocate on heap
}

So what type should DelegateCaller._foo be?

I think the only real solution to this, aside from compiler analysis (which introduces all kinds of problems), is to declare that all delegates are stack or heap allocated by default, and allow the developer to deviate by declaring the delegate as the opposite.

As I think most function delegates are expected to be stack allocated, it makes sense to me that stack delegates should be the default.

As a suggestion for syntax, I'd say heap-allocated delegates should use the new keyword somehow:

return new DelegateCaller(new(&x));

One issue to determine is how heap-allocated delegates are done. Should there be only one heap allocation per function call, or one per instantiation? If so, what happens if you change data in the function after instantiation? The difference is significant if you create multiple delegates:

int delegate() foo[];

int i = 0;
int getI() { return i; }

foo ~= new(&getI);

i++;
foo ~= new(&getI);
i++;
for(int j = 0; j < foo.length; j++)
{
    writefln(foo[j]());
}

What should be the correct output?

0
1

or

0
2

or

2
2

-Steve
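For comparison, a sketch of the same experiment using D2's existing heap closures, with plain &getI instead of the proposed new(&getI) syntax (which does not exist). The wrapper function makeGetters is an assumption added so the delegates escape and a closure is actually needed.

```d
import std.stdio;

int delegate()[] makeGetters()
{
    int delegate()[] foo;

    int i = 0;
    int getI() { return i; }

    foo ~= &getI; // taken while i == 0
    i++;
    foo ~= &getI; // taken while i == 1
    i++;

    // Both delegates escape, so the frame holding i is moved to the heap.
    return foo;
}

void main()
{
    foreach (dg; makeGetters())
        writeln(dg());
}
```

With this scheme there is one heap frame per call of the enclosing function, shared by every delegate created in it, so both delegates see the final value of i. That corresponds to the "2 2" answer, not a snapshot taken at creation time.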
Oct 26 2008
parent "Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 11:38 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 I also think that scope and heap-allocated delegates should have different
 types so that no imlicit casting from scope delegate to heap one would be
 possible. In this case callee function that recieves the delegate might
 demand the delegate to be heap-allocated (because it stores it, for
 example).
I've been thinking about this solution, and I think the decision to allocate scope or heap should be left up to the developer, and no types should be assigned. Think about an example like this: class DelegateCaller { private delegate int _foo(); this(int delegate() foo) { _foo = foo; } int callit() { return _foo();} } int f1() { int x() { return 5; } scope dc = new DelegateCaller(&x); // allocate on stack return dc.callit() * dc.callit(); } DelegateCaller f2() { int x() { return 5;} return new DelegateCaller(&x); // allocate on heap } So what type should DelegateCaller._foo be?
Ok, so that's a good example where only the caller knows that heap allocation is necessary, and we already discussed a case where only the callee knows it's necessary.
 I think the only real solution to this, aside from compiler analysis (which
 introduces all kinds of problems), is to declare all delegates are stack or
 heap allocated by default, and allow the developer to deviate by declaring
 the delegate as opposite.
It seems to me that from the two cases above, a good solution might be to make stack the default but to allow *either* the callee *or* the caller to request that that default be overridden.
 As I think most function delegates are expected to be stack allocated, it
 makes sense to me that stack delegates should be the default.
 As a suggestion for syntax, I'd say heap-allocated delegates should use the
 new keyword somehow:

 return new DelegateCaller(new(&x));

 One issue to determine is how heap-allocated delegates are done.  Should
 there be only one heap allocation per function call, or one per
 instantiation?  If so, what happens if you change data in the function after
 instantiation? The difference is significant if you create multiple
 delegates:
 int delegate() foo[];

 int i = 0;
 int getI() { return i; }

 foo ~= new(&getI);

 i++;
 foo ~= new(&getI);
 i++;
 for(int j = 0; j < foo.length; j++)
 {
   writefln(foo[j]);
 }

 What should be the correct output?

 0
 1
Without thinking about implementation or the current behavior at all, this is the output I would expect from a full closure. It should capture the state at the time of its creation.

With the either/or proposal you'll need another rule, I think. If you have a case like this:

void longTermDelegateKeeper(new int delegate() dg) { ... } // here "new" means heap required
...
int i = 0;
int getI() { return i; }
int delegate() foo[];
foo ~= &getI;
i++;
longTermDelegateKeeper(foo[0]); // <- what happens?

Here there are two options for the "what happens" line, I think:

1) The stack delegate returned by foo[0] triggers an implicit allocation and copying of the current stack variables (so foo[0]() will return "1").
2) Compiler error: "Heap delegate expected".

Option 2 makes the caller explicitly create a heap delegate out of the stack delegate, and by doing that forces the caller to examine which state he really meant to capture in that delegate. Did he want it to capture the i==1 state or did he want it to capture i==0? And in a loop context it will force the developer to notice that he's triggering implicit allocations inside a loop when he may not mean to.

It also would make it possible to recognize allocations just by looking at code locally. Aside from these D2 delegates, I think it's always possible to tell by looking at D code where the allocations are. Setting a .length or doing ~= are not obviously (and not necessarily) allocations, but if you see one then you can guess that allocation is involved. I don't really want to end up with a situation where I have to guess if the code I'm looking at is doing allocation just by calling a function that itself doesn't do any allocation either.

Finally: do stack and heap delegates really need to be distinct types? Maybe not. Maybe a run-time check would be good enough. If there's some kind of isHeapDelegate(dg) check available, then library writers could use that. The compiler wouldn't catch the error, but it might be sufficient to catch it at runtime in order to avoid the pain of introducing more types.

--bb
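
For reference, here is a minimal sketch of the same experiment written with plain &getI, since the new(&getI) syntax above is only a proposal. It assumes a D2 compiler that allocates one heap frame per call of the enclosing function and shares it between every delegate created in that call, which is how D2 closures behave as far as I know. The function name makeDelegates is made up for this sketch.

```d
import std.stdio;

// Hypothetical helper: builds two delegates over the same local variable
// and lets them escape, so the compiler has to turn the frame into a closure.
int delegate()[] makeDelegates()
{
    int delegate()[] foo;
    int i = 0;
    int getI() { return i; }

    foo ~= &getI;   // first delegate, created while i == 0
    i++;
    foo ~= &getI;   // second delegate, created while i == 1
    i++;
    return foo;     // both delegates refer to the same frame, where i == 2
}

void main()
{
    foreach (dg; makeDelegates())
        writeln(dg());   // prints 2, then 2
}
```

Under that assumption the answer is the third one (2 2): the delegates capture the variable, not a snapshot of its value. Getting "0 1" would need a copy of i made at the point where each delegate is created.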
Oct 26 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.
 
 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
std.random does not use dynamic memory allocation. Walter is almost done implementing static closures. Andrei
Oct 24 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 9:59 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
std.random does not use dynamic memory allocation.
Well the suggestion is that it may be using dynamic memory allocation without intending to because of the dynamic closures. Are you saying that is definitely not the case?
 Walter is almost done implementing static closures.
Excellent! So what strategy is being used? I hope it's static by default, dynamic on request, but your wording suggests otherwise. --bb
Oct 24 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:59 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
std.random does not use dynamic memory allocation.
Well the suggestion is that it may be using dynamic memory allocation without intending to because of the dynamic closures. Are you saying that is definitely not the case?
I don't think there's any delegate in use in std.random.
 Walter is almost done implementing static closures.
Excellent! So what strategy is being used? I hope it's static by default, dynamic on request, but your wording suggests otherwise.
I forgot. Andrei
Oct 24 2008
parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu wrote:

 I don't think there's any delegate in use in std.random.
Lazy arguments are delegates, and enforce uses lazy arguments
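
A small sketch of what that means in practice. The function check below is a hypothetical stand-in for enforce, written only for this illustration, and describeFailure and the counter are made-up names. The point it shows is that a lazy argument is wrapped up by the caller and only evaluated if the callee actually reads it; whether building that wrapper costs a heap allocation is the compiler behavior being discussed in this thread, and the sketch does not measure it.

```d
import std.stdio;

int evaluations = 0;

// Counts how often the message expression is really evaluated.
string describeFailure()
{
    ++evaluations;
    return "value out of range";
}

// Hypothetical stand-in for enforce: msg is a lazy parameter, so the
// caller's expression is passed along unevaluated.
bool check(bool condition, lazy string msg)
{
    if (!condition)
        throw new Exception(msg);   // msg is evaluated only on this path
    return condition;
}

void main()
{
    foreach (n; 1 .. 4)
        check(n > 0, describeFailure());   // condition always holds
    writeln(evaluations);                  // prints 0
}
```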
Oct 24 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Andrei Alexandrescu wrote:
 
 I don't think there's any delegate in use in std.random.
Lazy arguments are delegates, and enforce uses lazy arguments
Yikes, I see. Andrei
Oct 24 2008
prev sibling parent Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu wrote:

 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and
 performance is horrible.
 
 34% of the execution time is used by std.random.uniform. To my great
 surprise, 25% of the execution  time is memory allocation (and
 collection) from that random call. The only candidate source I see is
 a call to ensure with lazy arguments. The memory allocation occurs at
 the start of the UniformDistribution call. I assume this is dynamic
 closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
std.random does not use dynamic memory allocation.
This is exactly why so many have complained about the dynamic closure implementation. You did not intend to use dynamic memory allocation, but it definitely does. A program with nothing but a loop that calls uniform will show it plain as day in the profiler. (I'm using callgrind)
 Walter is almost done 
 implementing static closures.
Ooh... Can you elaborate on that?
Oct 24 2008
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Jason House Wrote:

 Gregor Richards Wrote:
 
 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and performance is
 horrible.
 
 34% of the execution time is used by std.random.uniform. To my great surprise,
 25% of the execution time is memory allocation (and collection) from that
 random call. The only candidate source I see is a call to ensure with lazy
 arguments. The memory allocation occurs at the start of the UniformDistribution
 call. I assume this is dynamic closure kicking in.
 
 Can anyone verify that this is the case?
 
 600000 memory allocations per second really kills performance!
Java has a much better garbage collector than D, as it doesn't need to be conservative. - Gregor Richards
The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger. This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.
I was wrong about the 4x thing. I have bad hardware. After fixing the accidental allocation and running both the D and Java version on the same box, they're only 1% different.
Oct 28 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Jason House" wrote
 Jason House Wrote:

 Gregor Richards Wrote:

 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and 
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great 
 surprise, 25% of the execution  time is memory allocation (and 
 collection) from that random call. The only candidate source I see is 
 a call to ensure with lazy arguments. The memory allocation occurs at 
 the start of the UniformDistribution call. I assume this is dynamic 
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
Java has a much better garbage collector than D, as it doesn't need to be conservative. - Gregor Richards
The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger. This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.
I was wrong about the 4x thing. I have bad hardware. After fixing the accidental allocation and running both the D and Java version on the same box, they're only 1% different.
When you say 'fixing the accidental allocation' you mean removing the case where a dynamic closure was allocated? I just want to make sure that is clear. -Steve
Oct 28 2008
parent Jason House <jason.james.house gmail.com> writes:
Steven Schveighoffer Wrote:

 "Jason House" wrote
 Jason House Wrote:

 Gregor Richards Wrote:

 Jason House wrote:
 I ported some monte carlo simulation code from Java to D2, and 
 performance is horrible.

 34% of the execution time is used by std.random.uniform. To my great 
 surprise, 25% of the execution  time is memory allocation (and 
 collection) from that random call. The only candidate source I see is 
 a call to ensure with lazy arguments. The memory allocation occurs at 
 the start of the UniformDistribution call. I assume this is dynamic 
 closure kicking in.

 Can anyone verify that this is the case?

 600000 memory allocations per second really kills performance!
Java has a much better garbage collector than D, as it doesn't need to be conservative. - Gregor Richards
The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger. This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.
I was wrong about the 4x thing. I have bad hardware. After fixing the accidental allocation and running both the D and Java version on the same box, they're only 1% different.
When you say 'fixing the accidental allocation' you mean removing the case where a dynamic closure was allocated? I just want to make sure that is clear. -Steve
Yes. You are right. The allocation of the dynamic closure was the only performance problem, and consumed 25% of my execution time. I called it accidental because Andrei was unaware that he had done it.
Oct 29 2008
prev sibling parent Russell Lewis <webmaster villagersonline.com> writes:
Objective 1: Make the heap vs. stack variables explicit
Objective 2: Make it impossible to return or store a static (stack) delegate
Objective 3: Don't require decorators on lambda expressions.

Solution:
- Variables are on stack by default
- Use modifier "heap" to put a variable on the heap
- Delegates can be normal (storable) or "scope" (can't live beyond the 
scope of our function), and the type is inferred BASED ON WHAT VARIABLES 
YOU ACCESS.


EXAMPLE CODE

void foo(scope void delegate()) {...}
void bar(void delegate()) {...}

void main()
{
   int a;
   heap int b;

   foo({ a = 1; });	// legal.
   bar({ b = 2; });	// legal.  bar could store dg, but b is on heap
   foo({ b = 3; });	// legal.  ok to pass non-scope dg to
			//         scope argument
   bar({ a = 4; });	// SYNTAX ERROR
			// delegate is scope b/c a is on stack, but
			// argument to bar isn't scope.
}

END CODE


Thoughts?
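
For comparison, a minimal sketch of the foo half of this proposal in D as it exists in later compilers, which, as far as I know, accept scope on a delegate parameter as a promise that the delegate is not stored past the call. The heap modifier above has no counterpart that I am aware of, so it is left out. The names callNow and demo are made up for this sketch.

```d
import std.stdio;

// `scope` states that dg will not outlive this call,
// so the callee may use it but must not store it.
void callNow(scope void delegate() dg)
{
    dg();
}

int demo()
{
    int a;                 // ordinary stack variable
    callNow({ a = 1; });   // delegate literal writes to the caller's local
    return a;
}

void main()
{
    writeln(demo());       // prints 1
}
```

The difference from the proposal is where the decision is written down: here the callee's parameter carries the marker, and nothing is inferred from which variables the delegate body touches.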
Nov 03 2008