
digitalmars.D.learn - performance cost of sample conversion

reply Psychological Cleanup <Help Saving.World> writes:
If I have a non-double buffer and temporarily convert to double, 
then convert back, do I save many cycles compared to just using a 
double buffer? I know it will be a lot more memory, but I'm 
specifically talking about the cycles spent converting to and from 
vs no conversion.

Using a double for everything gives the highest precision and 
makes things much easier, but is that the way to go, or does it 
cost quite a bit in performance?
Sep 06 2017
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 09/06/2017 07:06 PM, Psychological Cleanup wrote:
 If I have a non-double buffer and temporarily convert to double, then
 convert back, do I save many cycles compared to just using a double
 buffer? I know it will be a lot more memory, but I'm specifically talking
 about the cycles spent converting to and from vs no conversion.

 Using a double for everything gives the highest precision and makes
 things much easier, but is that the way to go, or does it cost quite a
 bit in performance?
You have to measure. Here's a start:

import std.conv;
import std.range;
import std.datetime;
import std.stdio;

double workWithDouble(double d) {
    return d * d / 7;
}

void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = workWithDouble(f).to!float;
    }
}

void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

void main() {
    foreach (n; [ 1_000, 1_000_000, 10_000_000 ]) {
        const beg = -1f;
        const end = 1f;
        const step = (end - beg) / n;

        auto floats = iota(beg, end, step).array;
        auto doubles = iota(double(beg), end, step).array;

        {
            auto sw = StopWatch(AutoStart.yes);
            workWithDoubles(doubles);
            writefln("%10s no conversion: %10s usecs", n, sw.peek().usecs);
        }

        {
            auto sw = StopWatch(AutoStart.yes);
            workWithFloats(floats);
            writefln("%10s with conversion: %10s usecs", n, sw.peek().usecs);
        }
    }
}

Conversion seems to be more costly:

      1000 no conversion:         27 usecs
      1000 with conversion:       40 usecs
   1000000 no conversion:       1715 usecs
   1000000 with conversion:     5412 usecs
  10000000 no conversion:      16280 usecs
  10000000 with conversion:    47190 usecs

Ali
Sep 06 2017
next sibling parent Psychological Cleanup <Help Saving.World> writes:
On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
 You have to measure. Here's a start: [...]
Thanks. My results:

dmd x86 debug
asserts on the line `auto floats = iota(beg, end, step).array;`

dmd x64 debug
      1000 no conversion:         15 usecs
      1000 with conversion:        5 usecs
   1000000 no conversion:       2824 usecs
   1000000 with conversion:     5689 usecs
  10000000 no conversion:      24148 usecs
  10000000 with conversion:    56335 usecs

dmd release x86
      1000 no conversion:          1 usecs
      1000 with conversion:        1 usecs
   1000000 no conversion:       1903 usecs
   1000000 with conversion:     1262 usecs
  10000000 no conversion:      19156 usecs
  10000000 with conversion:    12831 usecs

dmd release x64
      1000 no conversion:          4 usecs
      1000 with conversion:       17 usecs
   1000000 no conversion:       4531 usecs
   1000000 with conversion:     4516 usecs
  10000000 no conversion:      45928 usecs
  10000000 with conversion:    46080 usecs

ldc x86 debug
      1000 no conversion:          3 usecs
      1000 with conversion:       32 usecs
   1000000 no conversion:       3563 usecs
   1000000 with conversion:    19240 usecs
  10000000 no conversion:      35986 usecs
  10000000 with conversion:   192025 usecs

ldc x64 debug
      1000 no conversion:          2 usecs
      1000 with conversion:       10 usecs
   1000000 no conversion:       2855 usecs
   1000000 with conversion:    10309 usecs
  10000000 no conversion:      28254 usecs
  10000000 with conversion:   101380 usecs

ldc x86 release
      1000 no conversion:          0 usecs
      1000 with conversion:        0 usecs
   1000000 no conversion:       1280 usecs
   1000000 with conversion:      532 usecs
  10000000 no conversion:      10403 usecs
  10000000 with conversion:     5752 usecs

ldc x64 release
      1000 no conversion:          0 usecs
      1000 with conversion:        1 usecs
   1000000 no conversion:        887 usecs
   1000000 with conversion:      550 usecs
  10000000 no conversion:      10730 usecs
  10000000 with conversion:     5482 usecs

The results are strange; sometimes the conversion wins.
Sep 06 2017
prev sibling parent Johan Engelen <j j.nl> writes:
On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
 You have to measure.
Indeed.
 Here's a start:
The program has way too many things pre-defined, and the semantics are such that workWithDoubles can be completely eliminated... So you are not measuring what you want to be measuring.

Make stuff depend on argc, and print the result of calculations or do something else such that the calculation must be performed. When measuring without LTO, probably attaching weak onto the workWith* functions will work too. (pragma(inline, false) does not prevent reasoning about the function)

-Johan
Sep 07 2017
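
The advice in the last reply can be sketched roughly as follows. This is a hedged illustration, not code posted by anyone in the thread. It assumes a compiler recent enough to ship std.datetime.stopwatch (the original benchmark used the older StopWatch from std.datetime), it takes the element count from the command line so the input size is not known at compile time, and it prints a checksum of each buffer so the computed values are actually used. The default count and the checksum are illustrative choices only, and they do not guarantee that an optimizer leaves the loops untouched.

```d
import std.algorithm : sum;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

double workWithDouble(double d)
{
    return d * d / 7;
}

// Widen each float to double, compute, then narrow back to float.
void workWithFloats(float[] floats)
{
    foreach (ref f; floats)
        f = workWithDouble(f).to!float;
}

// Same computation on a buffer that is already double.
void workWithDoubles(double[] doubles)
{
    foreach (ref d; doubles)
        d = workWithDouble(d);
}

void main(string[] args)
{
    // Assumption: the element count is the first argument; the default
    // below is arbitrary and only used when no argument is given.
    const n = args.length > 1 ? args[1].to!size_t : 1_000_000;

    // Fill both buffers with the same values spread over [-1, 1).
    auto doubles = new double[n];
    auto floats = new float[n];
    foreach (i; 0 .. n)
    {
        doubles[i] = -1.0 + 2.0 * i / n;
        floats[i] = cast(float) doubles[i];
    }

    auto sw = StopWatch(AutoStart.yes);
    workWithDoubles(doubles);
    const usecsDoubles = sw.peek.total!"usecs";

    sw.reset(); // the stopwatch keeps running after reset
    workWithFloats(floats);
    const usecsFloats = sw.peek.total!"usecs";

    // Printing a checksum makes the results observable.
    writefln("%10s   no conversion: %10s usecs (checksum %s)",
             n, usecsDoubles, doubles.sum);
    writefln("%10s with conversion: %10s usecs (checksum %s)",
             n, usecsFloats, floats.sum);
}
```

Running it with different counts on the command line and comparing debug and release builds gives numbers that can be set against the tables above. A single run per configuration is still noisy, so repeating each measurement is advisable before drawing conclusions.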