digitalmars.D.learn - Why is for() less efficient than foreach?

reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
Benchmarking for() against foreach():

/////////
enum size_t maxarray = 500_000;

double[maxarray] a, b, c, d;

void main()
{
     import std.stdio;
     import std.datetime;

     import std.random;
     for (int n = 0; n < maxarray; n++)
     {
         a[n] = uniform01;
         b[n] = uniform01;
         c[n] = uniform01 + 0.1;
     }

     void overhead() {}

     void foreach_loop()
     {
         foreach(n, elem; d[])
             elem = a[n] * b[n] / c[n];
     }

     void for_loop()
     {
         for (int n = 0; n < maxarray; n++)
             d[n] = a[n] * b[n] / c[n];
     }

     auto r = benchmark!(overhead,
                         foreach_loop,
                         for_loop)(10_000);

     import std.conv : to;
     foreach (i, d; r)
         writeln("Function ", i, " took: ", d.to!Duration);
}
/////////


Depending on the machine this is run on, for() performs a factor 
3-8 slower than foreach(). Can someone explain this to me? Or, 
taking for() as the norm, how can foreach() be so blazingly fast?
Thanks!
Feb 10 2017
next sibling parent reply biozic <dransic gmail.com> writes:
On Friday, 10 February 2017 at 12:39:50 UTC, Bastiaan Veelo wrote:
     void foreach_loop()
     {
         foreach(n, elem; d[])
             elem = a[n] * b[n] / c[n];
     }
It's fast because the result of the operation (elem) is discarded on each iteration, so it is probably optimized away. Try:
```
void foreach_loop()
{
    foreach(n, ref elem; d[])
        elem = a[n] * b[n] / c[n];
}
```
You can also do:
```
d = a[] * b[] / c[];
```
with no loop statement at all.
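
To make the difference concrete, here is a minimal sketch of by-value versus `ref` iteration. The function names `fillByValue` and `fillByRef` and the sample data are made up for illustration and are not part of the original benchmark.

```d
import std.stdio : writeln;

// By-value loop variable: each `elem` is a copy of dst[n],
// so the assignment never reaches the array.
void fillByValue(double[] dst, const double[] src)
{
    foreach (n, elem; dst)
        elem = src[n] * 2.0;
}

// `ref` loop variable: `elem` aliases dst[n], so the write is stored.
void fillByRef(double[] dst, const double[] src)
{
    foreach (n, ref elem; dst)
        elem = src[n] * 2.0;
}

void main()
{
    double[] src = [1.0, 2.0, 3.0];
    auto byValue = new double[](src.length);
    auto byRef = new double[](src.length);
    byValue[] = 0.0; // double.init is NaN, so zero the arrays explicitly
    byRef[] = 0.0;

    fillByValue(byValue, src);
    fillByRef(byRef, src);

    writeln(byValue); // still all zeros
    writeln(byRef);   // holds the doubled values
}
```

Since the by-value loop has no observable effect, an optimizer is free to drop it, which would be consistent with the timings in the original post.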
Feb 10 2017
parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Friday, 10 February 2017 at 12:57:38 UTC, biozic wrote:
 On Friday, 10 February 2017 at 12:39:50 UTC, Bastiaan Veelo 
 wrote:
     void foreach_loop()
     {
         foreach(n, elem; d[])
             elem = a[n] * b[n] / c[n];
     }
 It's fast because the result of the operation (elem) is discarded on each iteration, so it is probably optimized away. Try:
 ```
 void foreach_loop()
 {
     foreach(n, ref elem; d[])
         elem = a[n] * b[n] / c[n];
 }
 ```
Hah, of course.
 You can also do:
 ```
 d = a[] * b[] / c[];
 ```
 with no loop statement at all.
Nice. Thanks.
Feb 10 2017
prev sibling next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 10 February 2017 at 12:39:50 UTC, Bastiaan Veelo wrote:
 Benchmarking for() against foreach():

 /////////
 enum size_t maxarray = 500_000;

 double[maxarray] a, b, c, d;

 void main()
 {
     import std.stdio;
     import std.datetime;

     import std.random;
     for (int n = 0; n < maxarray; n++)
     {
         a[n] = uniform01;
         b[n] = uniform01;
         c[n] = uniform01 + 0.1;
     }

     void overhead() {}

     void foreach_loop()
     {
         foreach(n, elem; d[])
             elem = a[n] * b[n] / c[n];
     }

     void for_loop()
     {
         for (int n = 0; n < maxarray; n++)
             d[n] = a[n] * b[n] / c[n];
     }

     auto r = benchmark!(overhead,
                         foreach_loop,
                         for_loop)(10_000);

     import std.conv : to;
     foreach (i, d; r)
         writeln("Function ", i, " took: ", d.to!Duration);
 }
 /////////


 Depending on the machine this is run on, for() performs a 
 factor 3-8 slower than foreach(). Can someone explain this to 
 me? Or, taking for() as the norm, how can foreach() be so 
 blazingly fast?
 Thanks!
The foreach loop behaves differently. It does not modify d. If you want it to modify the array you have to use a ref elem. If you do you will see that foreach is a little slower.
Feb 10 2017
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Friday, 10 February 2017 at 12:58:19 UTC, Stefan Koch wrote:
 If you want it to modify the array you have to use a ref elem.
 If you do you will see that foreach is a little slower.
Thanks, I should have spotted that. Bastiaan.
Feb 10 2017
parent Dukc <ajieskola gmail.com> writes:
On Friday, 10 February 2017 at 13:33:55 UTC, Bastiaan Veelo wrote:
 Thanks, I should have spotted that.

 Bastiaan.
No, you don't even have to spot things like that, if you assert() the result, that is. (Not a rant; half of us probably wouldn't have bothered.)
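
A rough sketch of what such a check could look like, using small stand-in arrays instead of the 500_000-element ones. The helper `holdsExpected` is hypothetical, and `isClose` from `std.math` is assumed to be available (it is in recent Phobos releases).

```d
import std.math : isClose;

// Hypothetical helper: true when every d[n] holds a[n] * b[n] / c[n].
bool holdsExpected(const double[] d, const double[] a,
                   const double[] b, const double[] c)
{
    foreach (n, value; d)
        if (!isClose(value, a[n] * b[n] / c[n]))
            return false;
    return true;
}

void main()
{
    double[] a = [6.0, 1.5], b = [2.0, 4.0], c = [3.0, 2.0];
    auto d = new double[](a.length); // elements start out as NaN

    // The by-value loop from the original post leaves d untouched...
    foreach (n, elem; d)
        elem = a[n] * b[n] / c[n];
    assert(!holdsExpected(d, a, b, c));

    // ...while the ref version actually stores the results.
    foreach (n, ref elem; d)
        elem = a[n] * b[n] / c[n];
    assert(holdsExpected(d, a, b, c));
}
```

Note that ordinary assert() checks are compiled out with -release, so a check like this only fires in a non-release build.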
Feb 10 2017
prev sibling parent reply evilrat <evilrat666 gmail.com> writes:
On Friday, 10 February 2017 at 12:39:50 UTC, Bastiaan Veelo wrote:
 Depending on the machine this is run on, for() performs a 
 factor 3-8 slower than foreach(). Can someone explain this to 
 me? Or, taking for() as the norm, how can foreach() be so 
 blazingly fast?
 Thanks!
On my machine (AMD FX-8350) actually almost no difference

DMD 2.073
dmd -run loops.d -release
Function 0 took: 16 μs and 5 hnsecs
Function 1 took: 57 secs, 424 ms, and 555 μs
Function 2 took: 53 secs, 494 ms, 709 μs, and 8 hnsecs

LDC 1.1.0-beta6
ldc2 -run loops.d -release -o3
Using Visual C++: C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC
Function 0 took: 25 μs and 5 hnsecs
Function 1 took: 53 secs, 253 μs, and 8 hnsecs
Function 2 took: 56 secs, 76 ms, 656 μs, and 4 hnsecs
Feb 10 2017
parent evilrat <evilrat666 gmail.com> writes:
On Friday, 10 February 2017 at 13:13:24 UTC, evilrat wrote:
 On my machine (AMD FX-8350) actually almost no difference
oops, it skips flags with -run -_- sorry

dmd loops.d -release
Function 0 took: 16 μs and 5 hnsecs
Function 1 took: 55 secs, 262 ms, 844 μs, and 6 hnsecs
Function 2 took: 56 secs, 564 ms, 231 μs, and 6 hnsecs

ldc2 loops.d -release
Function 0 took: 25 μs and 5 hnsecs
Function 1 took: 46 secs, 757 ms, 889 μs, and 7 hnsecs
Function 2 took: 23 secs, 895 ms, 410 μs, and 3 hnsecs

dmd loops.d -m64 -release
Function 0 took: 24 μs and 7 hnsecs
Function 1 took: 27 secs, 752 ms, 952 μs, and 5 hnsecs
Function 2 took: 36 secs, 550 ms, 295 μs, and 4 hnsecs

ldc2 loops.d -m64 -release
Function 0 took: 25 μs
Function 1 took: 47 secs, 456 ms, 65 μs, and 4 hnsecs
Function 2 took: 26 secs, 583 ms, 880 μs, and 1 hnsec

Building with LDC and any optimization flag completely removes the empty call and the for loop.
Feb 10 2017