## digitalmars.D.learn - performance cost of sample conversion

Psychological Cleanup <Help Saving.World> writes:
```If I have a non-double buffer and temporarily convert to double,
then convert back, do I save many cycles rather than just using a
double buffer? I know it will be a lot more memory, but I'm
specifically talking about the cycles in converting to and from
vs no conversion.

Using a double for everything gives the highest precision and
makes things much easier, but is that the way to go or does it
cost quite a bit in performance?
```
Sep 06
Ali Çehreli <acehreli yahoo.com> writes:
```On 09/06/2017 07:06 PM, Psychological Cleanup wrote:
[...]

You have to measure. Here's a start:

import std.conv;
import std.range;
import std.datetime; // StopWatch; newer Phobos releases provide it in std.datetime.stopwatch
import std.stdio;

// The computation under test, always carried out in double precision.
double workWithDouble(double d) {
    return d * d / 7;
}

// float buffer: each element is widened to double, processed, and narrowed back.
void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = workWithDouble(f).to!float;
    }
}

// double buffer: no conversion needed.
void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

void main() {
    foreach (n; [ 1_000, 1_000_000, 10_000_000 ]) {
        const beg = -1f;
        const end = 1f;
        const step = (end - beg) / n;
        auto floats = iota(beg, end, step).array;
        auto doubles = iota(double(beg), end, step).array;
        {
            auto sw = StopWatch(AutoStart.yes);
            workWithDoubles(doubles);
            writefln("%10s no   conversion: %10s usecs", n,
                     sw.peek().usecs);
        }
        {
            auto sw = StopWatch(AutoStart.yes);
            workWithFloats(floats);
            writefln("%10s with conversion: %10s usecs", n,
                     sw.peek().usecs);
        }
    }
}

Conversion seems to be more costly:

      1000 no   conversion:         27 usecs
      1000 with conversion:         40 usecs
   1000000 no   conversion:       1715 usecs
   1000000 with conversion:       5412 usecs
  10000000 no   conversion:      16280 usecs
  10000000 with conversion:      47190 usecs

Ali
```
Sep 06
Psychological Cleanup <Help Saving.World> writes:
```On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
[...]

Thanks. My results:

dmd x86 debug
asserts on the line `auto floats = iota(beg, end, step).array;`

dmd x64 debug
      1000 no   conversion:         15 usecs
      1000 with conversion:          5 usecs
   1000000 no   conversion:       2824 usecs
   1000000 with conversion:       5689 usecs
  10000000 no   conversion:      24148 usecs
  10000000 with conversion:      56335 usecs

dmd x86 release
      1000 no   conversion:          1 usecs
      1000 with conversion:          1 usecs
   1000000 no   conversion:       1903 usecs
   1000000 with conversion:       1262 usecs
  10000000 no   conversion:      19156 usecs
  10000000 with conversion:      12831 usecs

dmd x64 release
      1000 no   conversion:          4 usecs
      1000 with conversion:         17 usecs
   1000000 no   conversion:       4531 usecs
   1000000 with conversion:       4516 usecs
  10000000 no   conversion:      45928 usecs
  10000000 with conversion:      46080 usecs

ldc x86 debug
      1000 no   conversion:          3 usecs
      1000 with conversion:         32 usecs
   1000000 no   conversion:       3563 usecs
   1000000 with conversion:      19240 usecs
  10000000 no   conversion:      35986 usecs
  10000000 with conversion:     192025 usecs

ldc x64 debug
      1000 no   conversion:          2 usecs
      1000 with conversion:         10 usecs
   1000000 no   conversion:       2855 usecs
   1000000 with conversion:      10309 usecs
  10000000 no   conversion:      28254 usecs
  10000000 with conversion:     101380 usecs

ldc x86 release
      1000 no   conversion:          0 usecs
      1000 with conversion:          0 usecs
   1000000 no   conversion:       1280 usecs
   1000000 with conversion:        532 usecs
  10000000 no   conversion:      10403 usecs
  10000000 with conversion:       5752 usecs

ldc x64 release
      1000 no   conversion:          0 usecs
      1000 with conversion:          1 usecs
   1000000 no   conversion:        887 usecs
   1000000 with conversion:        550 usecs
  10000000 no   conversion:      10730 usecs
  10000000 with conversion:       5482 usecs

The results are strange: sometimes the conversion wins.
```
Sep 06
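The original question notes that a double buffer takes more memory. As a rough aid for reading the timings above, the sketch below prints the footprint of each buffer at the three sizes used in the benchmark. `bufferBytes` is a hypothetical helper, not something from the thread, and whether memory traffic actually explains any of the release-build numbers is an assumption that this thread does not measure.

```d
import std.stdio : writefln;

// Hypothetical helper: bytes occupied by n elements of type T.
size_t bufferBytes(T)(size_t n) {
    return n * T.sizeof;
}

void main() {
    // Same element counts as the benchmark in the thread.
    foreach (n; [1_000, 1_000_000, 10_000_000]) {
        writefln("%10s elements: float %10s bytes, double %10s bytes",
                 n, bufferBytes!float(n), bufferBytes!double(n));
    }
}
```

At the largest size the double buffer is twice the size of the float one, so a fair comparison probably needs to account for cache effects as well as conversion instructions.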
Johan Engelen <j j.nl> writes:
```On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
You have to measure.

Indeed.

Here's a start:

The program has way too many things pre-defined, and the
semantics are such that workWithDoubles can be completely
eliminated... So you are not measuring what you want to be
measuring.
Make stuff depend on argc, and print the result of calculations
or do something else such that the calculation must be performed.
When measuring without LTO, attaching @weak to the workWith*
functions will probably work too (pragma(inline, false) does
not prevent the compiler from reasoning about the function).

-Johan
```
Sep 07
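A minimal sketch of the changes suggested in the last post, assuming a Phobos release that ships `std.datetime.stopwatch`: the element count depends on the number of command-line arguments, and a sum of each processed buffer is printed so the results are observable. `makeBuffer` is a hypothetical helper that replaces the `iota(...).array` setup, and the base count of 1_000_000 is an arbitrary choice. This is not a definitive benchmark; an optimizer may still simplify more than expected, so inspecting the generated code is the only way to be sure what is measured.

```d
import std.algorithm : sum;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Same computation as in the thread.
double workWithDouble(double d) {
    return d * d / 7;
}

void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = workWithDouble(f).to!float;
    }
}

void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

// Hypothetical helper (not from the thread): n values spread evenly over [-1, 1).
T[] makeBuffer(T)(size_t n) {
    auto buf = new T[n];
    foreach (i, ref x; buf) {
        x = cast(T)(-1.0 + 2.0 * i / n);
    }
    return buf;
}

void main(string[] args) {
    // args.length is only known at run time, so n is not a compile-time constant.
    const n = 1_000_000 * args.length;
    auto floats = makeBuffer!float(n);
    auto doubles = makeBuffer!double(n);
    {
        auto sw = StopWatch(AutoStart.yes);
        workWithDoubles(doubles);
        const usecs = sw.peek.total!"usecs";
        // Printing a value derived from the buffer makes the work observable.
        writefln("%10s no   conversion: %10s usecs (sum %s)", n, usecs, doubles.sum);
    }
    {
        auto sw = StopWatch(AutoStart.yes);
        workWithFloats(floats);
        const usecs = sw.peek.total!"usecs";
        writefln("%10s with conversion: %10s usecs (sum %s)", n, usecs, floats.sum);
    }
}
```

Passing extra command-line arguments scales the buffer size, which gives a cheap way to check that the timings grow with the amount of work.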