digitalmars.D.learn - performance cost of sample conversion

Psychological Cleanup <Help Saving.World> writes:

If I have a non-double buffer and temporarily convert to double, 
then convert back, do I save many cycles rather than just using a 
double buffer? I know it will be a lot more memory, but I'm 
specifically talking about the cycles spent converting to and from 
double vs. no conversion.

Using a double for everything gives the highest precision and 
makes things much easier, but is that the way to go, or does it 
cost quite a bit in performance?

Sep 06 2017

Ali Çehreli <acehreli yahoo.com> writes:

On 09/06/2017 07:06 PM, Psychological Cleanup wrote:
 If I have a non-double buffer and temporarily convert to double, then
 convert back, do I save many cycles rather than just using a double
 buffer? I know it will be a lot more memory, but I'm specifically talking
 about the cycles spent converting to and from double vs. no conversion.

 Using a double for everything gives the highest precision and makes
 things much easier, but is that the way to go, or does it cost quite a
 bit in performance?

You have to measure. Here's a start:

import std.conv;
import std.range;
import std.datetime;
import std.stdio;

double workWithDouble(double d) {
     return d * d / 7;
}

void workWithFloats(float[] floats) {
     foreach (ref f; floats) {
         f = workWithDouble(f).to!float;
     }
}

void workWithDoubles(double[] doubles) {
     foreach (ref d; doubles) {
         d = workWithDouble(d);
     }
}

void main() {
     foreach (n; [ 1_000, 1_000_000, 10_000_000 ]) {
         const beg = -1f;
         const end = 1f;
         const step = (end - beg) / n;
         auto floats = iota(beg, end, step).array;
         auto doubles = iota(double(beg), end, step).array;
         {
             auto sw = StopWatch(AutoStart.yes);
             workWithDoubles(doubles);
             writefln("%10s no   conversion: %10s usecs", n, sw.peek().usecs);
         }
         {
             auto sw = StopWatch(AutoStart.yes);
             workWithFloats(floats);
             writefln("%10s with conversion: %10s usecs", n, sw.peek().usecs);
         }
     }
}

Conversion seems to be more costly:

       1000 no   conversion:         27 usecs
       1000 with conversion:         40 usecs
    1000000 no   conversion:       1715 usecs
    1000000 with conversion:       5412 usecs
   10000000 no   conversion:      16280 usecs
   10000000 with conversion:      47190 usecs

Ali

Sep 06 2017

Psychological Cleanup <Help Saving.World> writes:

Thanks. My results:

dmd x86 debug
     asserts on the line `auto floats = iota(beg, end, step).array;`

dmd x64 debug
       1000 no   conversion:         15 usecs
       1000 with conversion:          5 usecs
    1000000 no   conversion:       2824 usecs
    1000000 with conversion:       5689 usecs
   10000000 no   conversion:      24148 usecs
   10000000 with conversion:      56335 usecs

dmd x86 release
       1000 no   conversion:          1 usecs
       1000 with conversion:          1 usecs
    1000000 no   conversion:       1903 usecs
    1000000 with conversion:       1262 usecs
   10000000 no   conversion:      19156 usecs
   10000000 with conversion:      12831 usecs

dmd x64 release
       1000 no   conversion:          4 usecs
       1000 with conversion:         17 usecs
    1000000 no   conversion:       4531 usecs
    1000000 with conversion:       4516 usecs
   10000000 no   conversion:      45928 usecs
   10000000 with conversion:      46080 usecs

ldc x86 debug
       1000 no   conversion:          3 usecs
       1000 with conversion:         32 usecs
    1000000 no   conversion:       3563 usecs
    1000000 with conversion:      19240 usecs
   10000000 no   conversion:      35986 usecs
   10000000 with conversion:     192025 usecs

ldc x64 debug
       1000 no   conversion:          2 usecs
       1000 with conversion:         10 usecs
    1000000 no   conversion:       2855 usecs
    1000000 with conversion:      10309 usecs
   10000000 no   conversion:      28254 usecs
   10000000 with conversion:     101380 usecs

ldc x86 release
       1000 no   conversion:          0 usecs
       1000 with conversion:          0 usecs
    1000000 no   conversion:       1280 usecs
    1000000 with conversion:        532 usecs
   10000000 no   conversion:      10403 usecs
   10000000 with conversion:       5752 usecs

ldc x64 release
       1000 no   conversion:          0 usecs
       1000 with conversion:          1 usecs
    1000000 no   conversion:        887 usecs
    1000000 with conversion:        550 usecs
   10000000 no   conversion:      10730 usecs
   10000000 with conversion:       5482 usecs

The results are strange; sometimes the conversion wins.

Sep 06 2017

Johan Engelen <j j.nl> writes:

On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
 You have to measure.

Indeed.

 Here's a start:

The program has way too many things pre-defined, and the 
semantics are such that workWithDoubles can be completely 
eliminated... So you are not measuring what you want to be 
measuring.
Make stuff depend on argc, and print the result of calculations 
or do something else such that the calculation must be performed. 
When measuring without LTO, attaching @weak to the 
workWith* functions will probably work too. (pragma(inline, false) does 
not prevent reasoning about the function.)

-Johan

Sep 07 2017
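
A minimal sketch of the advice in the last post, offered as one possible starting point rather than a definitive benchmark. It takes the element count from the command line and prints a checksum of each buffer, so the calculations have an observable result. The use of std.datetime.stopwatch (the thread's code uses the older StopWatch from std.datetime), the default count of 1_000_000, the index-based input generation, and the checksum output are all assumptions made for illustration; the workWith* functions are copied from the program earlier in the thread.

```d
import std.algorithm : map, sum;
import std.array : array;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.range : iota;
import std.stdio : writefln;

double workWithDouble(double d) {
    return d * d / 7;
}

void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = workWithDouble(f).to!float;
    }
}

void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

void main(string[] args) {
    // Assumption: the element count is read at run time (the default is arbitrary),
    // so the input size is not a compile-time constant.
    size_t n = args.length > 1 ? args[1].to!size_t : 1_000_000;

    // Build the inputs from integer indices, covering [-1, 1),
    // instead of using iota with a floating-point step.
    auto doubles = iota(n).map!(i => -1.0 + 2.0 * i / n).array;
    auto floats = doubles.map!(d => d.to!float).array;

    {
        auto sw = StopWatch(AutoStart.yes);
        workWithDoubles(doubles);
        const usecs = sw.peek.total!"usecs";
        // Printing a checksum makes the computed values observable.
        writefln("%10s no   conversion: %10s usecs (sum %s)", n, usecs, doubles.sum);
    }
    {
        auto sw = StopWatch(AutoStart.yes);
        workWithFloats(floats);
        const usecs = sw.peek.total!"usecs";
        writefln("%10s with conversion: %10s usecs (sum %s)", n, usecs, floats.sum);
    }
}
```

Run it with the element count as the first argument, for example 10000000, and compare debug and release builds separately. Timings from a single run are noisy, so repeating each measurement several times is advisable before drawing conclusions.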
