digitalmars.D - Benchmarking suite

reply "qznc" <qznc web.de> writes:
Since issue 13487 [0] seems to rot away, I started something on 
my own.
Made a benchmark script and inserted C/C++/D programs for 
comparison.

However, various programs are broken, as you see in the example 
report [1].
The D code is at least 7 years old. I only fixed compile errors.
The C/C++ programs were selected quite randomly.

It should be easy to check out the repo [2] and run the benchmarks 
yourself
as long as you run on Linux:

   git clone git@github.com:qznc/d-shootout.git
   cd d-shootout
   ./benchmark.d --quickly
   xdg-open index.html

Maybe somebody has already fixed or improved benchmark programs?



[0] https://issues.dlang.org/show_bug.cgi?id=13487
[1] https://qznc.github.io/d-shootout/
[2] https://github.com/qznc/d-shootout
Aug 29 2015
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Aug-2015 15:05, qznc wrote:
 Since issue 13487 [0] seems to rot away, I started something on my own.
 Made a benchmark script and inserted C/C++/D programs for comparison.

 However, various programs are broken, as you see in the example report [1].
 The D code is at least 7 years old. I only fixed compile errors.
 The C/C++ programs were selected quite randomly.

 It should be easy to check out the repo [2] and run the benchmarks yourself
 as long as you run on Linux:

    git clone git@github.com:qznc/d-shootout.git
    cd d-shootout
    ./benchmark.d --quickly
    xdg-open index.html

 Maybe somebody has already fixed or improved benchmark programs?
Well, here is the regex-dna one with 3 versions including C-T regex:

https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d

Could be trivially parallelized with std.parallelism.

-- Dmitry Olshansky
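A minimal sketch of what parallelizing the pattern counting with std.parallelism might look like. This is not taken from d_dna.d; the function name `countVariants` and the sample data are made up for illustration, and the patterns are compiled at run time to keep the example short.

```d
import std.parallelism : parallel;
import std.range : walkLength;
import std.regex : matchAll, regex;
import std.stdio : writefln;

/// Counts case-insensitive matches of each pattern in `seq`,
/// handling the patterns in parallel on the default task pool.
size_t[] countVariants(string seq, string[] patterns)
{
    auto counts = new size_t[patterns.length];
    // Each iteration writes only to its own slot, so no locking is needed.
    foreach (i, p; parallel(patterns))
        counts[i] = seq.matchAll(regex(p, "i")).walkLength;
    return counts;
}

void main()
{
    // Made-up sample data, not the real benchmark input.
    auto seq = "agggtaaaCCtttaccctGGAGGGTAAA";
    auto patterns = ["agggtaaa|tttaccct", "ggg"];
    foreach (i, n; countVariants(seq, patterns))
        writefln("%s: %s", patterns[i], n);
}
```

Whether this pays off depends on the input size and the number of patterns; for tiny inputs the task pool overhead will likely dominate.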
Aug 29 2015
parent reply "qznc" <qznc web.de> writes:
On Saturday, 29 August 2015 at 12:35:14 UTC, Dmitry Olshansky 
wrote:
 Well, here is the regex-dna one with 3 versions including C-T 
 regex:

 https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d
Thanks Dmitry! Which version should be used?
Aug 29 2015
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Aug-2015 21:14, qznc wrote:
 On Saturday, 29 August 2015 at 12:35:14 UTC, Dmitry Olshansky wrote:
 Well, here is the regex-dna one with 3 versions including C-T regex:

 https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d
Thanks Dmitry! Which version should be used?
I'd try all of them, I think C-T was the fastest (as it should). -- Dmitry Olshansky
Aug 29 2015
parent reply "qznc" <qznc web.de> writes:
On Saturday, 29 August 2015 at 19:17:47 UTC, Dmitry Olshansky 
wrote:
 On 29-Aug-2015 21:14, qznc wrote:
 On Saturday, 29 August 2015 at 12:35:14 UTC, Dmitry Olshansky 
 wrote:
 Well, here is the regex-dna one with 3 versions including C-T 
 regex:

 https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d
Thanks Dmitry! Which version should be used?
I'd try all of them, I think C-T was the fastest (as it should).
Yes, C-T is fastest. Even dmd is faster than C/C++ now. :)
Aug 30 2015
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 30-Aug-2015 16:21, qznc wrote:
 On Saturday, 29 August 2015 at 19:17:47 UTC, Dmitry Olshansky wrote:
 On 29-Aug-2015 21:14, qznc wrote:
 On Saturday, 29 August 2015 at 12:35:14 UTC, Dmitry Olshansky wrote:
 Well, here is the regex-dna one with 3 versions including C-T regex:

 https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d
Thanks Dmitry! Which version should be used?
I'd try all of them, I think C-T was the fastest (as it should).
Yes, C-T is fastest. Even dmd is faster than C/C++ now. :)
Was one of the first benchmarks where std.regex destroyed the competition. It may still do so ;) -- Dmitry Olshansky
Aug 30 2015
parent reply "qznc" <qznc web.de> writes:
On Sunday, 30 August 2015 at 14:56:34 UTC, Dmitry Olshansky wrote:
 Was one of the first benchmarks where std.regex destroyed the 
 competition. It may still do so ;)
Rust has compile-time regex as well now. http://doc.rust-lang.org/regex/regex/index.html#the-regex!-macro
Aug 30 2015
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 30-Aug-2015 19:57, qznc wrote:
 On Sunday, 30 August 2015 at 14:56:34 UTC, Dmitry Olshansky wrote:
 Was one of the first benchmarks where std.regex destroyed the
 competition. It may still do so ;)
Rust has compile-time regex as well now. http://doc.rust-lang.org/regex/regex/index.html#the-regex!-macro
Yeah, I've seen that. Last year they were just catching up; they may have production-quality stuff by now. -- Dmitry Olshansky
Aug 30 2015
prev sibling parent reply "qznc" <qznc web.de> writes:
On Sunday, 30 August 2015 at 13:21:42 UTC, qznc wrote:
 On Saturday, 29 August 2015 at 19:17:47 UTC, Dmitry Olshansky 
 wrote:
 On 29-Aug-2015 21:14, qznc wrote:
 On Saturday, 29 August 2015 at 12:35:14 UTC, Dmitry Olshansky 
 wrote:
 Well, here is the regex-dna one with 3 versions including 
 C-T regex:

 https://github.com/DmitryOlshansky/FReD/blob/master/bench/regex-dna/d_dna.d
Thanks Dmitry! Which version should be used?
I'd try all of them, I think C-T was the fastest (as it should).
Yes, C-T is fastest. Even dmd is faster than C/C++ now. :)
Unfortunately, I have to take that back. C is faster than D even with compile-time regexes.

I used the short running benchmarks first, where compile-time regex wins, probably because it saves some startup time. For large data, C is faster. It uses the regex engine from TCL. Maybe std.regex has just space for optimization?

I updated the benchmark results: https://qznc.github.io/d-shootout/
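To make the startup-versus-throughput distinction concrete, here is a rough sketch that times a run-time compiled regex against ctRegex on a tiny and a large input. The pattern, the input, and the helper names are illustrative assumptions, and a single StopWatch reading is far too crude for real conclusions.

```d
import std.array : replicate;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.range : walkLength;
import std.regex : ctRegex, matchAll, regex;
import std.stdio : writefln;

enum pattern = "agggtaaa|tttaccct";

/// Pattern compiled while the program runs (setup cost paid on first use).
size_t countRuntime(string text)
{
    return text.matchAll(regex(pattern)).walkLength;
}

/// Pattern turned into D code at compile time (slower build, no setup at run time).
size_t countCompileTime(string text)
{
    return text.matchAll(ctRegex!pattern).walkLength;
}

void main()
{
    foreach (copies; [1, 100_000])
    {
        auto text = "agggtaaaCCtttaccct".replicate(copies);
        auto sw = StopWatch(AutoStart.yes);
        auto a = countRuntime(text);
        auto tRuntime = sw.peek;
        sw.reset();
        auto b = countCompileTime(text);
        auto tCompile = sw.peek;
        assert(a == b); // both engines must agree on the count
        writefln("%s copies: run-time %s, compile-time %s",
                 copies, tRuntime, tCompile);
    }
}
```

Note that std.regex caches recently compiled patterns, so only the first call to `countRuntime` pays the pattern compilation cost in this sketch.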
Sep 07 2015
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 07-Sep-2015 11:29, qznc wrote:
 Maybe std.regex has just space for optimization?
Sure thing, see WIP here (~25% faster but not yet complete): https://github.com/D-Programming-Language/phobos/pull/3314 -- Dmitry Olshansky
Sep 07 2015
next sibling parent "Suliman" <evermind live.ru> writes:
On Monday, 7 September 2015 at 08:33:33 UTC, Dmitry Olshansky 
wrote:
 On 07-Sep-2015 11:29, qznc wrote:
 Maybe std.regex has just space for optimization?
Sure thing, see WIP here (~25% faster but not yet complete): https://github.com/D-Programming-Language/phobos/pull/3314
Could anybody add C# version of examples?
Sep 07 2015
prev sibling parent "Jack Stouffer" <jack jackstouffer.com> writes:
On Monday, 7 September 2015 at 08:33:33 UTC, Dmitry Olshansky 
wrote:
 On 07-Sep-2015 11:29, qznc wrote:
 Maybe std.regex has just space for optimization?
Sure thing, see WIP here (~25% faster but not yet complete): https://github.com/D-Programming-Language/phobos/pull/3314
It's been over a year since the last commit on that. Any updates?
Sep 08 2015
prev sibling next sibling parent "qznc" <qznc web.de> writes:
On Saturday, 29 August 2015 at 12:05:18 UTC, qznc wrote:
 Maybe somebody has already fixed or improved benchmark programs?
As of now, most things work. Only meteor.d is broken: it crashes at runtime. LDC and GDC sometimes fail because they lag behind dmd. regexdna.cpp fails because re2 is not available via Ubuntu apt.

Many benchmarks need some performance tuning, though. We should not lose to C/C++. Some benchmarks are suspiciously fast, which probably means they are wrong.
Aug 30 2015
prev sibling next sibling parent reply "Robert burner Schadek" <rburners gmail.com> writes:
Why not go really big. aka:

http://forum.dlang.org/post/vzcvwrbqpeamtnopmtoa@forum.dlang.org
Sep 08 2015
parent reply "qznc" <qznc web.de> writes:
On Tuesday, 8 September 2015 at 08:24:43 UTC, Robert burner 
Schadek wrote:
 Why not go really big. aka:

 http://forum.dlang.org/post/vzcvwrbqpeamtnopmtoa@forum.dlang.org
You suggest creating a benchmark suite from all the unittests in Phobos?

I don't think this is a good idea. Most programs don't make good performance tests. Even the Benchmarks Game / Shootout benchmarks are partially stupid. For example, threadring measures context switching. The best strategy is "use pthreads and pthread mutex and restrict to one core". It only shows how well your language can access the pthread API. The context switching is done by Linux. The pidigits program basically measures libGMP performance.

I'm all for adding more programs to the benchmark suite, but they should be carefully selected to measure different aspects. I don't understand all the programs well enough to know what is lacking. Probably some memory management aspects. Maybe some concurrency stuff.
Sep 08 2015
next sibling parent "qznc" <qznc web.de> writes:
On Tuesday, 8 September 2015 at 09:27:13 UTC, qznc wrote:
 On Tuesday, 8 September 2015 at 08:24:43 UTC, Robert burner 
 Schadek wrote:
 Why not go really big. aka:

 http://forum.dlang.org/post/vzcvwrbqpeamtnopmtoa@forum.dlang.org
 You suggest creating a benchmark suite from all the unittests in Phobos? I don't think this is a good idea. Most programs don't make good performance tests.
Read your PR more closely now. We already agree on that. :P Still a difference: You propose a method to measure Phobos performance and detect regressions. My shootout wants to compare D and C/C++ performance.
Sep 08 2015
prev sibling parent reply "Isaac Gouy" <igouy2 yahoo.com> writes:
On Tuesday, 8 September 2015 at 09:27:13 UTC, qznc wrote:

 For example, threadring measures context switching.
thread-ring has aged badly. It was added when the measurements were only made on single-core hardware, and Erlang's huge number of lightweight processes seemed interesting ;-) It's been many years since the thread-ring measurements were included in the summary charts.
 The pidigits program basically measures libGMP performance.
And arbitrary precision arithmetic without libGMP :-)
Sep 08 2015
parent "qznc" <qznc web.de> writes:
On Tuesday, 8 September 2015 at 18:53:02 UTC, Isaac Gouy wrote:
 On Tuesday, 8 September 2015 at 09:27:13 UTC, qznc wrote:

 For example, threadring measures context switching.
thread-ring has aged badly. It was added when the measurements were only made on single-core hardware, and Erlang's huge number of lightweight processes seemed interesting ;-) It's been many years since the thread-ring measurements were included in the summary charts.
It is interesting that Erlang and others are considered "preemptive" threads. Afaik the Erlang runtime does not interrupt processes.
 The pidigits program basically measures libGMP performance.
And arbitrary precision arithmetic without libGMP :-)
In this comparison it is actually interesting, because D has its own bignum implementation in the standard library. It holds up well against libGMP.
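For readers who have not used it, the bignum type in question is std.bigint.BigInt. A small sketch of exact big-number arithmetic with it follows; the factorial is only a stand-in workload chosen for brevity, not the pidigits algorithm.

```d
import std.bigint : BigInt;
import std.stdio : writeln;

/// Computes n! exactly with Phobos' arbitrary-precision integer type.
BigInt factorial(uint n)
{
    auto result = BigInt(1);
    foreach (i; 2 .. n + 1)
        result *= i; // BigInt multiplies with built-in integers directly
    return result;
}

void main()
{
    // 30! needs far more than 64 bits, so a built-in integer would overflow.
    writeln(factorial(30));
}
```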
Sep 08 2015
prev sibling parent reply "Isaac Gouy" <igouy2 yahoo.com> writes:
On Saturday, 29 August 2015 at 12:05:18 UTC, qznc wrote:

 I started something on my own.
Kudos to qznc!
 The C/C++ programs were selected quite randomly.
Note: There are separate C and C++ programs shown on the benchmarks game -- so for something like regex-dna there's a C program using the C library written for Tcl and there's a C++ program using the re2 library. FWIW, doing both would make your comparison a little broader.
Sep 08 2015
parent reply "qznc" <qznc web.de> writes:
On Tuesday, 8 September 2015 at 18:41:10 UTC, Isaac Gouy wrote:
 On Saturday, 29 August 2015 at 12:05:18 UTC, qznc wrote:

 I started something on my own.
Kudos to qznc!
 The C/C++ programs were selected quite randomly.
Note: There are separate C and C++ programs shown on the benchmarks game -- so for something like regex-dna there's a C program using the C library written for Tcl and there's a C++ program using the re library. fwiw Doing both would make your comparison a little broader.
Yes. I'm not sure how to structure this whole suite. The general goal is "D claims that it can match C/C++ in performance, let's have some actual numbers". So far D mostly disappoints in terms of performance. There are at least three interesting variations "fastest parallel programs", "fastest sequential programs" and "short idiomatic programs". Probably all of them should be compared.
Sep 08 2015
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Tuesday, 8 September 2015 at 21:11:15 UTC, qznc wrote:
 Yes. I'm not sure how to structure this whole suite. The 
 general goal is "D claims that it can match C/C++ in 
 performance, let's have some actual numbers". So far D mostly 
 disappoints in terms of performance.
The most interesting thing to test is how they fare with high level optimization, not low level optimization. So make sure the implementation is similar...
Sep 08 2015
parent reply "qznc" <qznc web.de> writes:
On Tuesday, 8 September 2015 at 23:20:05 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 8 September 2015 at 21:11:15 UTC, qznc wrote:
 Yes. I'm not sure how to structure this whole suite. The 
 general goal is "D claims that it can match C/C++ in 
 performance, let's have some actual numbers". So far D mostly 
 disappoints in terms of performance.
The most interesting thing to test is how they fare with high level optimization, not low level optimization. So make sure the implementation is similar...
I'm not sure if I understand you correctly. What is "high level" and "low level" optimization?

What I want to know is a) how fast is "idiomatic" D code (using ranges etc.) compared to "idiomatic" C/C++ and b) how do they compare if you push performance to the limits (code beauty be damned).

For a) you want a similar implementation, although C/C++ will almost certainly always lose in terms of length and convenience.

For b) we don't care. C/C++ is free to use builtins, pragmas, and whatnot. If for loops are faster than ranges in D, then we will use for loops here.
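As an illustration of the a) versus b) split, here is the same toy computation written once with ranges and once with a plain for loop. The task (sum of the squares of the even numbers below n) and both function names are made up for this sketch; which variant is faster depends on the compiler and flags and would have to be measured.

```d
import std.algorithm.iteration : filter, map, sum;
import std.range : iota;
import std.stdio : writeln;

/// "Idiomatic" variant: a lazy range pipeline.
long sumEvenSquaresRanges(int n)
{
    return iota(0, n)
        .filter!(x => x % 2 == 0)
        .map!(x => cast(long) x * x)
        .sum;
}

/// "Performance first" variant: a hand-written loop.
long sumEvenSquaresLoop(int n)
{
    long total = 0;
    for (int x = 0; x < n; x += 2)
        total += cast(long) x * x;
    return total;
}

void main()
{
    // Both variants must produce the same result.
    writeln(sumEvenSquaresRanges(10), " ", sumEvenSquaresLoop(10));
}
```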
Sep 09 2015
parent reply "Ola Fosheim Grøstad" writes:
On Wednesday, 9 September 2015 at 07:59:48 UTC, qznc wrote:
 I'm not sure if I understand you correctly. What is "high 
 level" and "low level" optimization?
Low-level optimizations are local optimizations, which will basically be the same if you use the same backend (like LLVM). Comparing those would just measure the programmer's approach and not the compiler.
 What I want to know is a) how fast is "idiomatic" D code (using 
 ranges etc) compared to "idiomatic" C/C++ and b) how do they 
 compare if you push performance to the limits (code beauty be 
 damned).

 For a) you want a similar implementation although C/C++ will 
 most certainly always lose in terms of length and convenience.

 For b) we don't care. C/C++ is free to use builtins, pragmas, 
 and whatnot. If for loops are faster than ranges in D, then we 
 will use for loops here.
The problem with a) is that in C++ there are many libraries and you'll have a hard time finding comparable alternatives on both sides... People don't stick to the standard libraries in C++.

For b) C++ will be slightly faster because of things like modular arithmetic and OpenMP support.

I think the better approach is to write up the same algorithms in a high level fashion (using generic templates on both sides) from the ground up using the same constructs and measure the ability to optimize. Otherwise you end up comparing apples and oranges in a rather subjective manner.
Sep 09 2015
parent reply "qznc" <qznc web.de> writes:
On Wednesday, 9 September 2015 at 09:56:10 UTC, Ola Fosheim 
Grøstad wrote:
 I think the better approach is to write up the same algorithms 
 in a high level fashion (using generic templates on both sides) 
 from the ground up using the same constructs and measure the 
 ability to optimize.
That is a good idea, if you want to measure compiler optimizations. Ideally g++ and gdc should always yield the same performance then?

However, it does answer the wrong question imho. Suppose you consider using D with C/C++ as the stable alternative. D lures you with its high level features. However, you know that you will have to really optimize some hot spots sooner or later. Will D impose a penalty on you and C/C++ could have provided better performance?

Walter argues that there is no technical reason why D should be slower than C/C++. My experience with the benchmarks says there seem to be such penalties. For example, there is no __builtin_ia32_cmplepd or __builtin_ia32_movmskpd like gcc has.
Sep 09 2015
next sibling parent reply Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 9 September 2015 at 16:00, qznc via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On Wednesday, 9 September 2015 at 09:56:10 UTC, Ola Fosheim Grøstad wrote:
 I think the better approach is to write up the same algorithms in a high
 level fashion (using generic templates on both sides) from the ground up
 using the same constructs and measure the ability to optimize.
 That is a good idea, if you want to measure compiler optimizations.
 Ideally g++ and gdc should always yield the same performance then?

 However, it does answer the wrong question imho. Suppose you consider
 using D with C/C++ as the stable alternative. D lures you with its high
 level features. However, you know that you will have to really optimize
 some hot spots sooner or later. Will D impose a penalty on you and C/C++
 could have provided better performance?

 Walter argues that there is no technical reason why D should be slower
 than C/C++. My experience with the benchmarks says, there seem to be such
 penalties. For example, there is no __builtin_ia32_cmplepd or
 __builtin_ia32_movmskpd like gcc has.
import gcc.builtins; // OK, cheating. :-)
Sep 09 2015
parent "qznc" <qznc web.de> writes:
On Wednesday, 9 September 2015 at 14:09:36 UTC, Iain Buclaw wrote:
 import gcc.builtins;  // OK, cheating. :-)
Thanks, I did not know this. :) I would not consider it cheating. Using builtins in C is not portable C11 either. It also shows off how D does versions.
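A sketch of how the "versions" remark could play out in code: use a GCC builtin when compiling with GDC and fall back to druntime otherwise. The wrapper name `countBits` is hypothetical, and the sketch assumes that gcc.builtins exposes `__builtin_popcount` on the target, which should be checked against the GDC in use.

```d
import std.stdio : writeln;

version (GNU)
{
    // GDC only: this module exposes GCC's builtins to D code.
    import gcc.builtins;

    /// Counts set bits using the GCC builtin (assumed to be available).
    int countBits(uint x) { return __builtin_popcount(x); }
}
else
{
    // DMD and LDC: portable fallback from druntime.
    import core.bitop : popcnt;

    /// Counts set bits using druntime's popcnt.
    int countBits(uint x) { return popcnt(x); }
}

void main()
{
    writeln(countBits(0b1011)); // three bits are set
}
```

The call sites stay identical for all compilers; only the version block decides which implementation is compiled in.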
Sep 09 2015
prev sibling parent "Ola Fosheim Grøstad" writes:
On Wednesday, 9 September 2015 at 14:00:07 UTC, qznc wrote:
 That is a good idea, if you want to measure compiler 
 optimizations. Ideally g++ and gdc should always yield the same 
 performance then?
Hopefully. As I understand it, GCC uses a high-level IR. If performance is equal, that is a pretty strong argument to get people to adopt D, provided the rest of the language is polished. And you could measure regressions.
 Suppose you consider using D with C/C++ as the stable 
 alternative. D lures you with its high level features. However, 
 you know that you will have to really optimize some hot spots 
 sooner or later. Will D impose a penalty on you and C/C++ could 
 have provided better performance?

 Walter argues that there is no technical reason why D should be 
 slower than C/C++. My experience with the benchmarks says, 
 there seem to be such penalties. For example, there is no 
 __builtin_ia32_cmplepd or __builtin_ia32_movmskpd like gcc has.
Ok, I see your point. You want to measure maximum throughput for critical applications that might benefit from language-specific intrinsics.

Multithreaded applications could probably show some differences too, due to TLS/shared... Maybe some kind of actor-based benchmark. Essentially running thousands of fibers with lots of intercommunication.
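A very small sketch of what the core of such a fiber benchmark might look like, using core.thread.Fiber and a token passed around a ring. The `Ring` class and its round-robin scheduling are illustrative assumptions; a real actor benchmark would need message queues and a proper scheduler.

```d
import core.thread : Fiber;
import std.stdio : writeln;

/// A ring of fibers that hand a token to each other.
/// Requires n > 0.
final class Ring
{
    private size_t remaining;  // hops still to be made
    private size_t lastHolder; // id of the fiber that last held the token
    private Fiber[] fibers;

    this(size_t n, size_t hops)
    {
        remaining = hops;
        foreach (id; 0 .. n)
            fibers ~= makeFiber(id);
    }

    // A separate method so that each closure captures its own `id`.
    private Fiber makeFiber(size_t id)
    {
        return new Fiber({
            while (true)
            {
                lastHolder = id; // take the token
                --remaining;     // one hop done
                Fiber.yield();   // switch back to the scheduler
            }
        });
    }

    /// Resumes the fibers round-robin until all hops are used up
    /// and returns the id of the last token holder.
    size_t run()
    {
        size_t current = 0;
        while (remaining > 0)
        {
            fibers[current].call();
            current = (current + 1) % fibers.length;
        }
        return lastHolder;
    }
}

void main()
{
    // 1000 fibers, 10000 hops; the last holder is (hops - 1) % n.
    writeln(new Ring(1_000, 10_000).run());
}
```

All switching here happens on one thread, so this measures fiber context switches only, not TLS or shared-memory effects.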
Sep 09 2015
prev sibling next sibling parent "Isaac Gouy" <igouy2 yahoo.com> writes:
On Tuesday, 8 September 2015 at 21:06:26 UTC, qznc wrote:

 Afaik the Erlang runtime does not interrupt processes.
Depends what you mean by "processes" :-)
 In this comparison it is actually interesting, because D has 
 its own bignum implementation in the standard library.
There you go!

On Tuesday, 8 September 2015 at 21:11:15 UTC, qznc wrote:
 The general goal is "D claims that it can match C/C++ in 
 performance, let's have some actual numbers".
- You're only dealing with 3 programming languages, although more than 3 language implementations.
- Those programming languages are intended to be used for similar tasks.
- You'll correctly be seen as a D language advocate, so your presentation needs to show that you accept advice on how to improve the C and C++ programs.
- "short idiomatic programs" is difficult because the tradeoff between performance and "idiomatic" is so subjective, and you will correctly be seen as a D language advocate :-)

When asked, one of the C++ program contributors to the benchmarks game did try to write some "shorter" C++ programs, see:

http://benchmarksgame.alioth.debian.org/u64/code-used-time-used-shapes.php#shortest
Sep 09 2015
prev sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 2015-09-08 at 21:11 +0000, qznc via Digitalmars-d wrote:
 
[…]
 Yes. I'm not sure how to structure this whole suite. The general
 goal is "D claims that it can match C/C++ in performance, let's
 have some actual numbers". So far D mostly disappoints in terms
 of performance.
 
 […]
Which D compilers are you testing with?

I have found DMD to be OK but often a bit slow compared to C++ codes using GCC and Clang. However, LDC and GDC generally create executables from D codes that are at least as fast as C++ with GCC and Clang.

--
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Sep 09 2015
parent "qznc" <qznc web.de> writes:
On Wednesday, 9 September 2015 at 16:58:41 UTC, Russel Winder 
wrote:
 On Tue, 2015-09-08 at 21:11 +0000, qznc via Digitalmars-d wrote:
 
[…]
 Yes. I'm not sure how to structure this whole suite. The 
 general goal is "D claims that it can match C/C++ in 
 performance, let's have some actual numbers". So far D mostly 
 disappoints in terms of performance.
 
 […]
Which D compilers are you testing with?
The benchmark records all versions: https://qznc.github.io/d-shootout/
 I have found DMD to be OK but often a bit slow compared to C++ 
 codes
 using GCC and Clang. However, LDC and GDC generally create 
 executables
 from D codes that are at least as fast as C++ with GCC and 
 Clang.
Yes, that is the general opinion. However, I have a hard time matching the obscenely tuned C++ programs.
Sep 09 2015