
digitalmars.D - Programming language benchmark

reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Hi,

I've just found this benchmark which includes GDC and LDC compilers. I 
don't know why there's no DMD flavor, though. Here's the link:

http://attractivechaos.wordpress.com/2011/06/22/my-programming-language-benchmark-analyses/

It clearly shows that D is really fast (comparable to C), which I like a 
lot! However, the RegExp implementation is missing from the D benchmark. The 
author "could not get it to work".

Piotr
Jun 22 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Piotr Szturmaj:

 I don't know why there's no DMD flavor, though.
The author of that bench has said:
"I have not evaluated DMD because I am running the programs on a Linux server I
have no control of. The “libc” is quite old and incompatible with the binary
release of dmd."
 It clearly shows that D is really fast (comparable to C) which I like a lot!
Only LDC and GDC. DMD is not up to GCC/ICC/LLVM even on integer math benchmarks.

Bye,
bearophile
Jun 22 2011
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 22.06.2011 15:41, bearophile wrote:
 Piotr Szturmaj:

 I don't know why there's no DMD flavor, though.
The author of that bench has said:
 I have not evaluated DMD because I am running the programs on a Linux server I
have no control of. The “libc” is quite old and incompatible with the binary
release of dmd.<
 It clearly shows that D is really fast (comparable to C) which I like a lot!
Yeah, if that's supposed to show anything. I personally dislike the way the
author benchmarks regexes anyway, e.g. perl:

    while (<>) {
        chomp;
        print $_, "\n" if /$re/;
    }

chomp?? And printing each line will bias this test by the performance of the
text printing facilities. Same thing with C: gets, puts and chomping that
have nothing to do with pattern matching:

    while (fgets(buf, BUF_SIZE - 1, stdin)) {
        ++l;
        for (q = buf; *q; ++q);
        if (q > buf) *(q-1) = 0; // was that trimming '\n'?
        if (regexec(&r, buf, 10, match, 0) != REG_NOMATCH)
            puts(buf);
    }

 Only LDC and GDC. DMD is not up to GCC/ICC/LLVM even on integer math
benchmarks.
Mm, proof link? :)
 Bye,
 bearophile
-- Dmitry Olshansky
Jun 22 2011
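
To illustrate the objection about I/O skewing the regex numbers, here is a minimal C sketch that keeps printing out of the measured path. It assumes the lines are already in memory with their newlines stripped, and it only counts matches. The function name count_matches and the choice of POSIX regcomp/regexec with REG_EXTENDED | REG_NOSUB are assumptions made for this example; they are not taken from the benchmark's source.

```c
#include <regex.h>
#include <stddef.h>

/* Count how many of the n lines match the POSIX extended regex `pattern`.
   Returns -1 if the pattern fails to compile. No reading or printing
   happens here, so timing this call measures the regex engine alone. */
long count_matches(const char *pattern, char **lines, size_t n) {
    regex_t re;
    /* REG_NOSUB: only a yes/no answer is needed, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    long hits = 0;
    for (size_t i = 0; i < n; ++i) {
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            ++hits;
    }
    regfree(&re);
    return hits;
}
```

A benchmark built this way would read the whole input first, start the timer, call count_matches, stop the timer, and print the count once at the end.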
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/22/11 6:52 AM, Dmitry Olshansky wrote:
 for (q = buf; *q; ++q); if (q > buf) *(q-1) = 0; // was that trimming '\n'?
Now that's an interesting line.

Andrei
Jun 22 2011
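
The thread does not spell out what is interesting about that line. One possible reading is that it erases the last byte of the buffer whatever that byte is, so a final line without a trailing newline (or a line cut short by the buffer size) loses a real character. The sketch below reproduces the idiom next to a variant that checks for '\n' first; the function names trim_last_byte and trim_newline are made up for this example.

```c
#include <string.h>

/* The benchmark's idiom: walk to the terminator, then erase the last
   byte unconditionally. */
void trim_last_byte(char *buf) {
    char *q;
    for (q = buf; *q; ++q)
        ;
    if (q > buf)
        *(q - 1) = 0;
}

/* Variant: erase the last byte only when it really is a newline. */
void trim_newline(char *buf) {
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
}
```

On "abc\n" both functions leave "abc". On "abc" the first leaves "ab" while the second leaves the string untouched.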
prev sibling parent Don <nospam nospam.com> writes:
bearophile wrote:
 Piotr Szturmaj:
 
 I don't know why there's no DMD flavor, though.
The author of that bench has said:
 I have not evaluated DMD because I am running the programs on a Linux server I
have no control of. The “libc” is quite old and incompatible with the binary
release of dmd.<
 It clearly shows that D is really fast (comparable to C) which I like a lot!
Only LDC and GDC. DMD is not up to GCC/ICC/LLVM even on integer math benchmarks.
I've never heard that claim before. Do you have evidence for that? If it is true, there's a strong possibility that it's a small, fixable issue (for example, DMD used to have terrible performance for ulong multiplication).
Jun 23 2011
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Don:

Sorry for my slow answer, I was quite busy for days.


 I've never heard that claim before. Do you have evidence for that?
I compare/convert code to D every day, so I am aware that D code compiled with DMD is often slower than C/C++ code compiled with GCC. For some years I have even kept a collection of snippets of slow code. But I am also aware that the low performance has many different causes, like some missing inlining, missing loop unrolling, etc., so spotting a clear and small case of integer arithmetic code that causes a slowdown, to give you evidence, is not easy. So I am sorry for my overly broad claim.
 If it is true, there's a strong possibility that it's a small, fixable issue
 (for example, DMD used to have terrible performance for ulong multiplication).
You are right, the case I'm going to show is a precise problem that's fixable.

-----------------------

// C code
#include "limits.h"
#include "stdio.h"

int divideBySeven(int x) {
    return x / 7;
}

int main() {
    int i = INT_MAX;
    int r;
    while (i--)
        r = divideBySeven(i);
    printf("%d\n", r);
    return 0;
}

-----------------------

// D code
import core.stdc.stdio : printf; // C printf from druntime

int divideBySeven(int x) {
    return x / 7;
}

void main() {
    int i = int.max;
    int r;
    while (i--)
        r = divideBySeven(i);
    printf("%d\n", r);
}

-----------------------

Asm from the C version:

_divideBySeven:
    pushl   %ebx
    movl    $-1840700269, %ebx
    movl    8(%esp), %ecx
    movl    %ebx, %eax
    popl    %ebx
    imull   %ecx
    leal    (%edx,%ecx), %eax
    sarl    $31, %ecx
    sarl    $2, %eax
    subl    %ecx, %eax
    ret

_main:
    leal    4(%esp), %ecx
    andl    $-16, %esp
    pushl   -4(%ecx)
    pushl   %ebx
    movl    $-1840700269, %ebx
    pushl   %ecx
    subl    $20, %esp
    call    ___main
    movl    $2147483646, %ecx
    .p2align 4,,10
L4:
    movl    %ecx, %eax
    imull   %ebx
    movl    %ecx, %eax
    addl    %ecx, %edx
    sarl    $31, %eax
    sarl    $2, %edx
    decl    %ecx
    subl    %eax, %edx
    cmpl    $-1, %ecx
    jne     L4
    movl    %edx, 4(%esp)
    movl    $LC0, (%esp)
    call    _printf
    addl    $20, %esp
    xorl    %eax, %eax
    popl    %ecx
    popl    %ebx
    leal    -4(%ecx), %esp
    ret
    .def    _printf;    .scl    2;    .type   32;    .endef

-----------------------

Asm from the D version:

_D9int_div_d13divideBySevenFiZi comdat
        mov     ECX,7
        cdq
        idiv    ECX
        ret

__Dmain comdat
L0:     push    EAX
        push    EBX
        mov     EBX,07FFFFFFFh
        push    ESI
        xor     ESI,ESI
        test    EBX,EBX
        lea     EBX,-1[EBX]
        je      L24
L11:    mov     EAX,EBX
        mov     ECX,7
        cdq
        idiv    ECX
        test    EBX,EBX
        mov     ESI,EAX
        lea     EBX,-1[EBX]
        jne     L11
L24:    push    ESI
        mov     EDX,offset FLAT:_DATA
        push    EDX
        call    near ptr _printf
        add     ESP,8
        xor     EAX,EAX
        pop     ESI
        pop     EBX
        pop     ECX
        ret

-----------------------

For a more real case see:
http://d.puremagic.com/issues/show_bug.cgi?id=5607

Bye,
bearophile
Jun 24 2011
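
The GCC listing in the post above replaces the idiv with a multiply by the constant -1840700269 followed by an add, two shifts and a subtract. As a rough sketch of what that sequence computes, the same steps can be written in C. The function name div7_magic is made up for this example, and the code assumes that >> on a negative signed value is an arithmetic shift (true of the common compilers, but implementation-defined in C).

```c
#include <stdint.h>

/* Signed division by 7 without a divide instruction, mirroring the
   imull / add / sarl / subl sequence in the GCC output. */
int32_t div7_magic(int32_t x) {
    const int32_t magic = -1840700269;   /* 0x92492493 read as signed */

    /* High 32 bits of the 64-bit product: what imull leaves in EDX.
       Assumes an arithmetic right shift for negative values. */
    int32_t hi = (int32_t)(((int64_t)x * magic) >> 32);

    int32_t q = (hi + x) >> 2;   /* addl, then sarl $2 */
    q -= x >> 31;                /* x >> 31 is -1 for negative x, so this
                                    adds 1 and rounds toward zero */
    return q;
}
```

The DMD listing keeps the idiv inside the loop instead, which is the slow path the post is pointing at.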
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Don:
 
 Sorry for my slow answer, I was quite busy for days.
 
 
 I've never heard that claim before. Do you have evidence for that?
I compare/convert code to D every day, so I am aware that D code compiled with DMD is often slower than C/C++ code compiled with GCC. Since some years I even keep a collection of snippets of slow code. But I am also aware that the low performance has many different causes, like some missing inlining, missing loop unrolling, etc, so spotting a clear and small case of integer arithmetic code that causes a slow down, to give you evidence, is not easy. So I am sorry for my overly broad claim.
It is true in general that DMD's inliner is not very good. I _suspect_
that it is the primary cause of most instances of poor integer
performance. It's actually part of the front-end, not the back-end. So
many of those performance problems won't apply to DMC.

It's also true that the DMD/DMC instruction scheduler doesn't schedule
for modern processors. But last I checked, GCC wasn't really much better
in practice (you have to be almost perfect to get a benefit from
instruction scheduling these days, the hardware does a very good job on
unscheduled code).

Otherwise, I don't think there's any major optimisation it misses. But
it's quite likely that it misses several very specific minor optimizations.
 If it is true, there's a strong possibility that it's a small, fixable issue
(for example, DMD used to have terrible performance for ulong multiplication).<
You are right, the case I'm going to show is a precise problem that's fixable.
[snip]
 -----------------------
 
 For a more real case see:
 http://d.puremagic.com/issues/show_bug.cgi?id=5607
Thanks, that's helpful. It's a major speed difference (factor of 20, maybe) so it wouldn't have to occur very often to be noticeable.
Jun 26 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Don wrote:
 For a more real case see:
 http://d.puremagic.com/issues/show_bug.cgi?id=5607
 Thanks, that's helpful. It's a major speed difference (factor of 20, maybe) so
 it wouldn't have to occur very often to be noticeable.
You may also want to have a look at this paper:

http://www.agner.org/optimize/optimizing_cpp.pdf

I don't know if it still accurately reflects the current state though.

Interestingly, it says that DMC is already able to perform the optimization
requested by bearophile.

A table starting on page 73 is quite specific about which optimizations the
DMC backend is lacking.

Cheers,
-Timon
Jun 26 2011
parent reply Don <nospam nospam.com> writes:
Timon Gehr wrote:
 You may also want to have a look at this paper:

 http://www.agner.org/optimize/optimizing_cpp.pdf

 I don't know if it still accurately reflects the current state though.
It's a little out of date; DMD now does a couple of things it didn't do when Agner did the testing. Incidentally I contributed a bit to that paper <g>.
 Interestingly, it says that DMC is already able to perform the optimization
 requested by bearophile.
 
 A table starting on page 73 is quite specific about which optimizations the
 DMC backend is lacking.
 
 Cheers,
 -Timon
Jun 26 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2011 2:24 PM, Don wrote:
 You may also want to have a look at this paper:

 http://www.agner.org/optimize/optimizing_cpp.pdf

 I don't know if it still accurately reflects the current state though.
It's a little out of date, DMD now does a couple of things it didn't do when Agner did the testing. Incidentally I contributed a bit to that paper <g>.
The table has several errors wrt DMC++. For example, DMC++ certainly does function inlining, constant propagation and the branch optimizations.
Jun 26 2011
parent Caligo <iteronvexor gmail.com> writes:
Kind of off topic, but a good place to get benchmark results for many
of the programming languages is Sphere Online Judge:
http://www.spoj.pl/problems/classical/

They accept solutions in D, but not many have been submitted.  I found a few:

http://www.spoj.pl/ranks/FCTRL/lang=D
http://www.spoj.pl/ranks/HASHIT/lang=D
http://www.spoj.pl/ranks/ONP/lang=D

Most of the fastest solutions are in C++, but D is pretty close.
Maybe we could start submitting solutions :-)
Jun 28 2011