digitalmars.D - Strange performance behavior
- Marius Muja <mariusm cs.ubc.ca> Sep 19 2007
- Bill Baxter <dnewsgroup billbaxter.com> Sep 19 2007
- Bill Baxter <dnewsgroup billbaxter.com> Sep 19 2007
- Bruce Adams <tortoise_74 yeah.who.co.uk> Sep 19 2007
- Bill Baxter <dnewsgroup billbaxter.com> Sep 19 2007
- Bruce Adams <tortoise_74 yeah.who.co.uk> Sep 20 2007
- "Jarrett Billingsley" <kb3ctd2 yahoo.com> Sep 20 2007
- "Jarrett Billingsley" <kb3ctd2 yahoo.com> Sep 20 2007
- Bill Baxter <dnewsgroup billbaxter.com> Sep 20 2007
- Sean Kelly <sean f4.ca> Sep 20 2007
- Chris Nicholson-Sauls <ibisbasenji gmail.com> Sep 23 2007
- BCS <ao pathlink.com> Sep 23 2007
- Marius Muja <mariusm cs.ubc.ca> Sep 19 2007
I have noticed the following strange (at least for me) performance
behavior with one of my programs. It is a program that does some
scientific computations and while trying to optimize it I noticed that
the code from case_B below executes faster (almost twice as fast) as the
code in case_A. This is a bit counterintuitive for me, since in case _B
there is also the cost of the function call (or should be the same if
the function is inlined).
Can anybody shed some light on why it's behaving this way?
case_A:
-------------------------------
foreach (i,index; indices) {
foreach (k, inout value; centers[belongs_to[i]])
value += vecs[index][k];
}
----------------------------------
case_B:
-------------------------------
void addTo(T,U)(T[] a, U[] b) {
foreach(index, inout value; a) {
value += b[index];
}
}
....
foreach (i,index; indices) {
addTo(centers[belongs_to[i]],vecs[index]);
}
_______________________________
Marius
Sep 19 2007
Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
My guess would be the compiler's failure to optimize[*] away the [index] indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; } --bb
Sep 19 2007
Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
My guess would be the compiler's failure to optimize[*] away the [index] indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
Sep 19 2007
Bill Baxter Wrote:Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
My guess would be the compiler's failure to optimize[*] away the [index] indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
<shields up> This is exactly the sort of thing where const would help :) </shields down>
Sep 19 2007
Bruce Adams wrote:Bill Baxter Wrote:Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
<shields up> This is exactly the sort of thing where const would help :) </shields down>
Uh oh... you should have left the shields up. Now you're unprotected... --bb
Sep 19 2007
Bill Baxter Wrote:Bruce Adams wrote:Bill Baxter Wrote:Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
<shields up> This is exactly the sort of thing where const would help :) </shields down>
Uh oh... you should have left the shields up. Now you're unprotected... --bb
I was invariant but now I'm just const but you still can't modify me through this interface :p
Sep 20 2007
"Bruce Adams" <tortoise_74 yeah.who.co.uk> wrote in message news:fct9gb$uhd$1 digitalmars.com...I was invariant but now I'm just const but you still can't modify me through this interface :p
*(cast(void*)&BruceAdams) = 0; Helloooo, void*.
Sep 20 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> wrote in message news:fcton6$1rda$1 digitalmars.com...*(cast(void*)&BruceAdams) = 0;
Shit. *(cast(int*)cast(void*)&BruceAdams) = 0;
Sep 20 2007
Bruce Adams wrote:Bill Baxter Wrote:Bruce Adams wrote:Bill Baxter Wrote:Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
This is exactly the sort of thing where const would help :) </shields down>
--bb
I was invariant but now I'm just const but you still can't modify me through this interface :p
Ooh, you better watch out. Somebody's going to put you in a cast for making such volatile statements. But still, if it's a pure literal expression of how you feel, then I won't take exception. ugh. :-P --bb
Sep 20 2007
Bill Baxter wrote:Ooh, you better watch out. Somebody's going to put you in a cast for making such volatile statements. But still, if it's a pure literal expression of how you feel, then I won't take exception.
Baaaaad :-) Sean
Sep 20 2007
Sean Kelly wrote:Bill Baxter wrote:Ooh, you better watch out. Somebody's going to put you in a cast for making such volatile statements. But still, if it's a pure literal expression of how you feel, then I won't take exception.
Baaaaad :-) Sean
(mhi:104424) Exception caught: AbstractConceptOverflowException: too many puns from mhi/arch/cogito/video/trans.d(782): Lingua!("us-en/sl").translate from mhi/arch/cogito/video/inp.d(342): Aleph!(NatAleph.LATIN).input in mhi/arch/cogito/main.d(501): Video!(170, 60).scan You raised an exception in my Mass Hallucination Interface... how dare you. Probably have a memory leak now... -- Chris Nicholson-Sauls
Sep 23 2007
Reply to Chris Nicholson-Sauls,Sean Kelly wrote:Bill Baxter wrote:Ooh, you better watch out. Somebody's going to put you in a cast for making such volatile statements. But still, if it's a pure literal expression of how you feel, then I won't take exception.
Sean
many puns from mhi/arch/cogito/video/trans.d(782): Lingua!("us-en/sl").translate from mhi/arch/cogito/video/inp.d(342): Aleph!(NatAleph.LATIN).input in mhi/arch/cogito/main.d(501): Video!(170, 60).scan You raised an exception in my Mass Hallucination Interface... how dare you. Probably have a memory leak now...
Just thought I'd correct the subject line :b
Sep 23 2007
Bill Baxter wrote:Bill Baxter wrote:Marius Muja wrote:I have noticed the following strange (at least for me) performance behavior with one of my programs. It is a program that does some scientific computations and while trying to optimize it I noticed that the code from case_B below executes faster (almost twice as fast) as the code in case_A. This is a bit counterintuitive for me, since in case _B there is also the cost of the function call (or should be the same if the function is inlined). Can anybody shed some light on why it's behaving this way? case_A: ------------------------------- foreach (i,index; indices) { foreach (k, inout value; centers[belongs_to[i]]) value += vecs[index][k]; } ---------------------------------- case_B: ------------------------------- void addTo(T,U)(T[] a, U[] b) { foreach(index, inout value; a) { value += b[index]; } } .... foreach (i,index; indices) { addTo(centers[belongs_to[i]],vecs[index]); } _______________________________
My guess would be the compiler's failure to optimize[*] away the [index] indexing in A. So you do 2 lookups per iteration rather than just one. If so then this should be just as fast as case_B: foreach (i,index; indices) { auto vecs_i = vecs[index]; foreach (k, inout value; centers[belongs_to[i]]) value += vecs_i[k]; }
[*] It may not really be a failure of the optimizer -- maybe it's unreasonably difficult to determine absolutely that vecs won't changed by the operation '+= vecs[index][k]'. --bb
Yes, that was it! Thanks for clearing this up for me.
Sep 19 2007









"Jarrett Billingsley" <kb3ctd2 yahoo.com> 