
digitalmars.D - D for scientific computing

"Alan" <geouke gmail.com> writes:
I saw an old thread from 2004 while doing a google search that 
discussed D and scientific computing and was looking for some 
more recent information or opinions from people who have used it 
for such purposes.

I am a graduate student and my thesis work is in numerical 
modeling. While I have some experience using Fortran and C, I am 
not obligated to use any particular language for my work. I like 
the design goals behind D and the syntax. I would like to 
know if D can compete with C or Fortran for numerical work.

Is anyone out there using D for heavy numeric work?
Jan 23 2013
"Stephan" <stephan_schiffels mac.com> writes:
On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

Hi Alan,

I use D to build a fairly large project to analyze whole genome sequences from multiple individuals. I will actually upload things to Bitbucket soon, and I will let people on this forum know. I use it as a straight replacement for C++, which means I use it for all the numeric work for which I previously used C and C++.

To name a few highlights: You can very easily adapt the code samples from Numerical Recipes, 3rd edition (which are in C++) to D, with much more convenient built-in arrays and associative arrays. Also, you can link the GNU Scientific Library (GSL) straight into your D code. This actually had some bugs in previous versions of the compiler, but now it is really flawless. I use the GSL vector class to do very fast matrix-matrix multiplications with GSL's BLAS interface. Also, I use GSL's special functions.

So I think D is ideal for scientific developers who start new projects. We don't have to convince huge teams to venture into a new language. We can just pick the best there is :-)

Stephan
Jan 23 2013
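
As a rough illustration of the C linkage mentioned in the post above: a C function is declared in D with extern (C) and can then be called directly. The sketch below binds the C math library's sqrt only so that it builds without extra libraries; the helper name rootViaC is made up for this example. A GSL function such as gsl_sf_bessel_J0 would be declared in the same way, with the GSL libraries passed to the linker.

```d
// Hand-written binding to a C function. The C math library is normally
// linked into D programs already, so no extra flags are needed here.
extern (C) double sqrt(double x) nothrow @nogc;

// A GSL binding has the same shape (not linked in this sketch):
//   extern (C) double gsl_sf_bessel_J0(double x);
// and would need the GSL libraries on the link line, for example
// -L-lgsl -L-lgslcblas with dmd on Linux (an assumption about the setup).

// Hypothetical helper: calls straight into C, with no wrapper layer.
double rootViaC(double x) nothrow @nogc
{
    return sqrt(x);
}
```

Because D arrays expose a .ptr and a .length, passing numeric buffers to C or Fortran routines usually needs no copying either.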
"Stephan" <stephan_schiffels mac.com> writes:
On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

I actually forgot the main argument for me: D allows safe multithreading right out of the box, which I think is a huge advantage. I never wrote anything multithreaded until I started with D. I recommend Andrei's book; check out the two chapters that are linked on the D page.

Stephan
Jan 23 2013
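
A minimal sketch of what multithreaded numeric code can look like in D, assuming Phobos's std.parallelism is the kind of out-of-the-box support meant above (the post does not name a module). The function name squares is invented for the example.

```d
import std.parallelism : parallel;

// Fill an array with i*i, letting the default task pool split the
// elements across worker threads. Each iteration writes only its own
// element, so no locking is needed.
double[] squares(size_t n)
{
    auto result = new double[n];
    foreach (i, ref x; parallel(result))
        x = cast(double) i * cast(double) i;
    return result;
}
```

Note that the compiler does not verify that a parallel foreach body is free of data races; keeping each iteration independent, as here, is the programmer's job.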
"Joshua Niehus" <jm.niehus gmail.com> writes:
On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 to know if D can compete with C or Fortran for numerical work.

https://github.com/kyllingstad/scid

You don't need to compete; you can take established "good and fast" FORTRAN/C code and use it within your own D program.
Jan 23 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2013 6:36 PM, Rob T wrote:
 BTW the D version of my sqlite3 lib is at least 1/3 smaller than the C++
 version, and not only is it smaller, but it is far more flexible due to the use
 of templates (I just could not make much use out of C++ templates). A reduction
 like that is very significant. For large projects, it's a drastic reduction in
 development costs and perhaps more so in long term maintenance costs.

Interesting. I found the same percentage reduction in translating C++ code to D.
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/25/2013 9:45 AM, Rob T wrote:
 I wonder what the main reasons are for the reduction?

Some reasons:

1. elimination of .h files
2. array & string handling was so much more straightforward
3. elimination of need for many constructors and code to initialize things
4. easier cleanup with scope statement
5. templates are much more concise
6. a lot of boilerplate member functions are simply unnecessary in D
7. static if eliminates a lot of template source bloat

 Have you ever translated from D to C++?

Haven't tried that!
Jan 25 2013
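
Two of the reasons listed above, the scope statement and static if, can be sketched in a few lines. This is only an illustration, not code from the translation being discussed; the names describe and cleanupOrder are invented.

```d
import std.traits : isFloatingPoint, isIntegral;

// static if selects a branch at compile time, where C++ of that era
// would typically need separate template specializations.
string describe(T)(T value)
{
    static if (isFloatingPoint!T)
        return "floating point";
    else static if (isIntegral!T)
        return "integer";
    else
        return "other";
}

// scope (exit) registers cleanup that runs when the enclosing block is
// left, whether normally or by an exception.
int[] cleanupOrder()
{
    int[] log;
    {
        log ~= 1;
        scope (exit) log ~= 3; // deferred until the block ends
        log ~= 2;
    }
    return log;
}
```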
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/24/13 5:21 AM, deadalnix wrote:
 On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow wrote:
 On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus wrote:
 You don't need to compete, you can take established "good and fast"
 FORTRAN/C code and use it within your own D program.

I forgot to add: If you're doing new stuff then D can be as fast as anything else, provided the algorithm is sound, optimizers turned on, sprinkle in a lil assembly, etc...

And use nothrow when it's applicable; Found with a sudoku solver how much nothrow was making an impact on the algorithm speed.

Do you know why? It shouldn't.

More code motion. It's a classic in C++ code as well (where it's more difficult to detect).

Andrei
Jan 24 2013
"Joshua Niehus" <jm.niehus gmail.com> writes:
On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

I forgot to add: If you're doing new stuff then D can be as fast as anything else, provided the algorithm is sound, optimizers turned on, sprinkle in a lil assembly, etc...
Jan 23 2013
"Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

I forgot to add: If you're doing new stuff then D can be as fast as anything else, provided the algorithm is sound, optimizers turned on, sprinkle in a lil assembly, etc...

And use nothrow when it's applicable; Found with a sudoku solver how much nothrow was making an impact on the algorithm speed.
Jan 23 2013
"Rob T" <alanb ucora.com> writes:
On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

I forgot to add: If you're doing new stuff then D can be as fast as anything else, provided the algorithm is sound, optimizers turned on, sprinkle in a lil assembly, etc...

.. also don't forget that there's a garbage collector, which can have a huge impact on performance if you are doing a lot of memory allocations. The GC is adjustable to a degree, so performance problems can be solved provided that you are aware of them.

For example, I wrote a sqlite3 library in D, and for large SELECT returns it was 3 times slower than an almost identical C++ implementation. The performance difference was resolved by disabling the GC prior to running the query and re-enabling it afterwards. It was an easy fix, only two lines of code in one function.

BTW the D version of my sqlite3 lib is at least 1/3 smaller than the C++ version, and not only is it smaller, but it is far more flexible due to the use of templates (I just could not make much use out of C++ templates). A reduction like that is very significant. For large projects, it's a drastic reduction in development costs and perhaps more so in long term maintenance costs.

--rt
Jan 23 2013
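
The two-line fix described above is not shown in the post, but it presumably resembles the following sketch, which uses GC.disable and GC.enable from core.memory. The function buildRows and its allocation loop are stand-ins for the sqlite3 query code, which is not available here.

```d
import core.memory : GC;

// Hypothetical allocation-heavy routine standing in for a large SELECT.
int[][] buildRows(size_t rows, size_t cols)
{
    GC.disable();             // suspend automatic collection cycles
    scope (exit) GC.enable(); // re-enable on every exit path

    int[][] table;
    foreach (r; 0 .. rows)
        table ~= new int[cols]; // many small allocations
    return table;
}
```

Memory is still allocated from the GC heap while collection is disabled, so this trades peak memory use for fewer pauses; the runtime may also still collect if it would otherwise run out of memory.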
"Alan" <geouke gmail.com> writes:
My project will be working with the USGS Modflow model (a 
finite-difference model for groundwater flow written in Fortran). 
Thankfully, it works with text input and output. So, the bit of 
the program I will be writing (which will talk back and forth to 
modflow through text) can be written in whatever language I 
choose.

I guess since the source code is available it might make sense to 
modify it to take i/o directly as a function called from D (if D 
is happy to do that).

I ordered Andrei's book a couple days ago and am oddly excited 
about it arriving. I have worked a bit with Fortran, C, and C++. 
C++ needs to die. C is lacking a couple useful features (rather, 
it is more that they are not made as convenient to implement). 
Fortran does not have as many intrinsic functions and easy access 
to do "cool stuff" :P

I am glad to hear that D has been crunching numbers at a 
reasonable rate for you guys. My concern was that I will 
potentially be working on projects that might take a week or two 
to run on a small computer cluster and I do not want it to take 
four weeks to run something that I could run in two had I written 
it in Fortran.
Jan 23 2013
"Nicolas Sicard" <dransic gmail.com> writes:
On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

The different D compilers available don't generate numeric code of the same quality, depending on the algorithms and data structures used. I have found in one of my projects that LDC produces code that is up to 5x or even 10x faster than DMD (though the average difference is less spectacular).
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2013 2:08 AM, Nicolas Sicard wrote:
 The different D compilers available don't generate numeric code of the same
 quality, depending on the algorithms and data structures used. I have found in
 one of my projects that LDC produces code that is up to 5x or even 10x faster
 than DMD (though the average difference is less spectacular).

If you use the 64 bit model, dmd will use SIMD instructions for float and double, which are much faster.
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2013 2:41 AM, Joseph Rushton Wakeling wrote:
 That's been a fairly consistent speed difference for a long time.  And yes, I'm
 using 64-bit.

Is that with floating point code, or otherwise?
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2013 8:36 AM, H. S. Teoh wrote:
 Nevertheless, I also have made the same observation that code produced
 by gdc consistently outperforms code produced by dmd. Usually by about
 20-30%, sometimes as much as 50-60%, IME. That's a pretty big
 discrepancy for me, esp. when I'm doing compute-intensive geometric
 computations.

Do you mean floating point code? 32 or 64 bit?
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2013 1:13 PM, H. S. Teoh wrote:
 On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
 On 1/24/2013 8:36 AM, H. S. Teoh wrote:
 Nevertheless, I also have made the same observation that code
 produced by gdc consistently outperforms code produced by dmd.
 Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
 pretty big discrepancy for me, esp. when I'm doing compute-intensive
 geometric computations.

Do you mean floating point code? 32 or 64 bit?

Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.

Next, are you using floats, doubles, or reals?
Jan 24 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 1/25/2013 11:46 AM, H. S. Teoh wrote:
 So I think at this
 point it's fair to say that GDC's back end produces superior code in
 terms of performance.

If you're feeling ambitious, taking a closer look to see why would be most interesting.
Jan 25 2013
"deadalnix" <deadalnix gmail.com> writes:
On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow wrote:
 On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus 
 wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

I forgot to add: If you're doing new stuff then D can be as fast as anything else, provided the algorithm is sound, optimizers turned on, sprinkle in a lil assembly, etc...

And use nothrow when it's applicable; Found with a sudoku solver how much nothrow was making an impact on the algorithm speed.

Do you know why? It shouldn't.
Jan 24 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions for float and
 double, which are much faster.

I generally find that dmd-compiled programs run at about half the speed of those built with gdc or ldc (the latter seem pretty much equivalent these days, some programs run faster compiled with one, some with the other). That's running off latest GitHub source for all compilers. That's been a fairly consistent speed difference for a long time. And yes, I'm using 64-bit.
Jan 24 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/24/2013 11:49 AM, Walter Bright wrote:
 Is that with floating point code, or otherwise?

Yes, quite heavily floating-point. I did once have a brief go at writing some entirely integer-based number-crunching code just to see if it made any difference, but I think other priorities intervened ... :-)
Jan 24 2013
"Nicolas Sicard" <dransic gmail.com> writes:
On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

I generally find that dmd-compiled programs run at about half the speed of those built with gdc or ldc (the latter seem pretty much equivalent these days, some programs run faster compiled with one, some with the other). That's running off latest GitHub source for all compilers. That's been a fairly consistent speed difference for a long time. And yes, I'm using 64-bit.

Same for me. The difference between ldc and dmd seems to be mainly due to optimizing and especially inlining (see http://d.puremagic.com/issues/show_bug.cgi?id=9320 for an example in that matter).
Jan 24 2013
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

I generally find that dmd-compiled programs run at about half the speed of those built with gdc or ldc (the latter seem pretty much equivalent these days, some programs run faster compiled with one, some with the other). That's running off latest GitHub source for all compilers. That's been a fairly consistent speed difference for a long time. And yes, I'm using 64-bit.

I had similar experience with all my numerical code. gdc and ldc trade places but dmd is always solidly behind. Walter, I know you like working with the current backend and you understand it etc..., but this gives dmd a bus factor of 1 and is slowing down code in the process.
Jan 24 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 25, 2013 at 04:09:25PM +0100, John Colvin wrote:
 On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
On 25 January 2013 10:27, John Colvin
<john.loughran.colvin gmail.com>wrote:


Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release
-inline -O" is more comparable.

But then you'd have to do gdc -O3 -frelease. :-)

Ah yes, of course :)

Hmm. I didn't realize that dmd has a separate switch for function inlining. Well, here's the updated numbers:

On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
 Both reals and floats. Well, let's get some real measurements.
 Here's a quick run-through of various test programs I have lying
 around:

 Test program #1 (iterating 2-variable function over grid),
 uses reals:
 - Test case with n=400:
         Using DMD:      ~8 seconds (consistently)
         Using GDC:      ~6 seconds (consistently)
         * So the DMD version is 33% slower than the GDC version.
           (That is, 8/6*100 = 133%, so 33% slower.)

Updated: DMD version with -inline takes ~7 seconds consistently, so we have 7/6*100 = 116%, so 16% slower.

 - Test case with n=600:
         Using DMD:      ~27 seconds (consistently)
         Using GDC:      ~19 seconds (consistently)
         * So the DMD version is 42% slower than the GDC version.

Updated: DMD version with -inline takes ~24 seconds consistently, so 26% slower.

 Test program #2 (terrain generation simulator), uses floats:
 (The running time of this one depends on the RNG, so I fixed the seed
 value in order to make a fair comparison.)
 - Test case with seed=380170304, n=20 with water & wind simulation:
         Using DMD:      ~10 seconds (consistently)
         Using GDC:      ~7 seconds (consistently)
         * So the DMD version is 42% slower than the GDC version.

Updated: DMD version with -inline takes ~8 seconds consistently, so 14% slower.

 - Test case with seed=380170304, n=25 with water & wind simulation:
         Using DMD:      ~14 seconds (consistently)
         Using GDC:      ~9 seconds (consistently)
         * So the DMD version is 55% slower than the GDC version.

Updated: DMD version with -inline takes ~11 seconds consistently, so 22% slower.

 Test program #3 (enumeration of coordinates of n-dimensional polytopes),
 uses reals:
 - All permutations and changes of sign of <1,2,3,4,5,6,7>:
         Using DMD:      ~4 seconds (consistently)
         Using GDC:      ~3 seconds (consistently)
         * So the DMD version is 33% slower than the GDC version.

Updated: DMD version with -inline still takes ~4 seconds, so no significant change here.

 - All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
         Using DMD:      ~41 seconds (consistently)
         Using GDC:      ~27 seconds (consistently)
         * So the DMD version is 51% slower than the GDC version.

Updated: DMD version with -inline takes about 36 seconds on average, so about 33% slower.

 - Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
         Using DMD:      ~40 seconds (consistently)
         Using GDC:      ~27 seconds (consistently)
         * So the DMD version is 48% slower than the GDC version.

Updated: DMD version with -inline takes about 38 seconds, so 41% slower.

Conclusions:
- The performance gap is smaller than previously thought, but it's
  still present.
- I will be using -inline with dmd aggressively.
- What other dmd options am I missing that will bring dmd on par
  with gdc -O3 (if there are any)?


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.
Jan 25 2013
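
The "N% slower" figures in the post above follow from dividing the DMD time by the GDC time. They appear to be truncated rather than rounded (7/6 is given as 116%, not 117%), which integer division reproduces. A small sketch, with percentSlower as an invented name and whole-second timings assumed:

```d
// Percentage by which the DMD build is slower than the GDC build.
// Integer division truncates, which matches the figures quoted in the
// post (e.g. 7 s vs. 6 s gives 16, not 17).
int percentSlower(int dmdSeconds, int gdcSeconds) pure nothrow @safe
{
    return dmdSeconds * 100 / gdcSeconds - 100;
}
```

Since the timings are themselves rounded to whole seconds, a difference of one percentage point in either direction is well within measurement noise.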
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:
 Conclusions:
 - The performance gap is smaller than previously thought, but it's
   still present.
 - I will be using -inline with dmd aggressively.
 - What other dmd options am I missing that will bring dmd on par
   with gdc -O3 (if there are any)?

I have sometimes found that using -release and -noboundscheck made a bigger difference to dmd than to gdc. The corresponding gdc options are -frelease and -fno-bounds-check. Comparing performance without -release isn't that meaningful.
Jan 25 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/24/2013 02:11 PM, John Colvin wrote:
 Walter, I know you like working with the current backend and you understand it
 etc..., but this gives dmd a bus factor of 1 and is slowing down code in the
 process.

Honestly, I don't feel this is too strong an issue. The point of dmd is to be a reference compiler -- speed is nice if it's possible, but not the most important consideration. The most important thing is that new frontend updates can get merged quickly into ldc/gdc, so that there is no time lag between new feature development and their incorporation into other compilers.
Jan 24 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 24, 2013 at 05:28:16PM +0100, Joseph Rushton Wakeling wrote:
 On 01/24/2013 02:11 PM, John Colvin wrote:
Walter, I know you like working with the current backend and you
understand it etc..., but this gives dmd a bus factor of 1 and is
slowing down code in the process.

Honestly, I don't feel this is too strong an issue. The point of dmd is to be a reference compiler -- speed is nice if it's possible, but not the most important consideration.

I think it would be ideal if the dmd front end were more isolated from the back end, so that it's easier to port to gdc/ldc (i.e. it can happen in a matter of days after a dmd release, not, say, weeks or months). But I believe Walter has already said that patches to this effect are welcome, so I can only see the situation improving in the future.

Nevertheless, I also have made the same observation that code produced by gdc consistently outperforms code produced by dmd. Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a pretty big discrepancy for me, esp. when I'm doing compute-intensive geometric computations.

 The most important thing is that new frontend updates can get merged
 quickly into ldc/gdc, so that there is no time lag between new
 feature development and their incorporation into other compilers.

Agreed.


T

-- 
It is of the new things that men tire --- of fashions and proposals and improvements and change. It is the old things that startle and intoxicate. It is the old things that are young. -- G.K. Chesterton
Jan 24 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 25, 2013 at 05:50:21PM +0100, John Colvin wrote:
 On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:

Conclusions:
- The performance gap is smaller than previously thought, but it's
  still present.
- I will be using -inline with dmd aggressively.
- What other dmd options am I missing that will bring dmd on par
  with gdc -O3 (if there are any)?


T

I have sometimes found that using -release and -noboundscheck made a bigger difference to dmd than to gdc. The corresponding gdc options are -frelease and -fno-bounds-check. Comparing performance without -release isn't that meaningful.

Alright. So to make the comparison fair(er), I recompiled test program #1 (iterating 2-variable function on grid) with:

        dmd -O -inline -m64 -release -noboundscheck
        gdc -O3 -m64 -frelease -fno-bounds-check

Here are the new results for test program #1, using n=600:

        With DMD:       15 seconds (average of 4 runs)
        With GDC:       11 seconds (average of 4 runs)

There's still a 36% performance difference.

I did the same thing for test program #2 (terrain generation simulation), using seed=380170304, with wind & water simulation, and n=30 (I increased the iteration count to make measurement noise less prominent). Here's the new results:

        With DMD:       11 seconds (average of 4 runs)
        With GDC:       9 seconds (average of 4 runs)

So a gap of 22% is still present.

I'm running into a DMD bug for test program #3 (linker error when compiling with -release -O -noboundscheck -inline), so I don't have the test results for that yet. I'll try to figure out what's causing the linker error and post the results later.

In the meantime, it's clear that GDC is still showing significant performance improvement over DMD. There is a _consistent_ 20-30% difference in performance in all of the tests so far. So I think at this point it's fair to say that GDC's back end produces superior code in terms of performance.

(I will note, though, that GDC produces larger executables than DMD, sometimes much larger, so space-wise, there is some price to pay.)


T

-- 
Chance favours the prepared mind. -- Louis Pasteur
Jan 25 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/26/2013 02:37 AM, Walter Bright wrote:
 If you're feeling ambitious, taking a closer look to see why would be most
 interesting.

It's nice if DMD can produce faster code, but in the short term I'd rather see priority being given to making the frontend/druntime more easily portable to different backends. The speed issues of DMD have never bothered me, precisely because GDC and LDC exist -- and besides speed, there's also the issue of target architectures. The problem is rather having to wait for bugfixes and new features to propagate to the D compilers which already solved the speed and architecture issues.
Jan 26 2013
"mist" <none none.none> writes:
On Saturday, 26 January 2013 at 15:17:18 UTC, Joseph Rushton 
Wakeling wrote:
 ...

++

Once the situation with front-end bugs and stability is settled, I see zero reasons to use the dmd back end, and spending effort on its optimization does not feel pragmatic.
Jan 26 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/26/2013 04:26 PM, mist wrote:
 Once the situation with front-end bugs and stability is settled, I see zero reasons
 to use the dmd back end, and spending effort on its optimization does not feel
 pragmatic.

Actually, I feel somewhat the contrary. When the problem of frontend/runtime portability has been solved, then it makes plenty of sense to look at DMD speed and backend issues. Improving DMD is always a good thing -- it's just a question of priorities.
Jan 26 2013
"mist" <none none.none> writes:
Yes, of course, we all have our own preferences, that is fine :) 
I mean something slightly different: front-end work benefits users of all the major 
compilers, not only one group, and is thus more important.
Jan 26 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/26/2013 04:43 PM, mist wrote:
 Yes, of course, we all have our own preferences, that is fine :) I mean something
 slightly different: front-end work benefits users of all the major compilers, not only
 one group, and is thus more important.

Yup, agree. :-)
Jan 26 2013
Marco Leise <Marco.Leise gmx.de> writes:
I use DMD and GDC. DMD for debug builds and (since it is the
reference compiler) to ensure language conformance. GDC for
performance tests and release.
In other words, I don't expect the W3 reference browser to be
the fastest, but to set the required standard for HTML
interpretation.
If you asked me, I'd keep all smart compiler optimizations
out of DMD for the sake of stability, compilation speed and
maintenance effort.
Some of what GCC does is amazing, but probably requires heaps
of difficult-to-read code. (I once saw it SSE optimize my code
where I was using a 4-byte struct with 3 used bytes that I did
computations on in a loop.)

-- 
Marco
Jan 26 2013
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 24 January 2013 at 16:28:29 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 02:11 PM, John Colvin wrote:
 Walter, I know you like working with the current backend and 
 you understand it
 etc..., but this gives dmd a bus factor of 1 and is slowing 
 down code in the
 process.

Honestly, I don't feel this is too strong an issue. The point of dmd is to be a reference compiler -- speed is nice if it's possible, but not the most important consideration.

Fair point, I guess the reference doesn't have to be the fastest.
 The most important thing is that new frontend updates can get 
 merged quickly into ldc/gdc, so that there is no time lag 
 between new feature development and their incorporation into 
 other compilers.

This would be really great.
Jan 24 2013
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 24 January 2013 at 10:21:47 UTC, deadalnix wrote:
 On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow 
 wrote:
 And use nothrow when it's applicable; Found with a sudoku 
 solver how much nothrow was making an impact on the algorithm 
 speed.

Do you know why? It shouldn't.

As mentioned somewhere, with nothrow the compiler can drop various checks and support for exceptions (assert/ensure don't throw exceptions, they throw errors instead). How big this overhead is I'm not sure, but the run time of my code went from some 30 seconds down to 7 or so. But nothrow can't be used everywhere.
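
For anyone who hasn't used the attribute, here is a minimal sketch of what marking hot-path functions nothrow looks like. The function names and the sudoku-style row check are made up for illustration (this is not the actual solver), and whether it buys any speed will depend on the compiler and the code, so measure before relying on it. The compiler rejects a nothrow function that calls anything which may throw an Exception, which is why it can't be applied everywhere.

```d
import std.stdio : writeln;

// Hypothetical helper: true if `digit` does not yet appear in `row`.
// nothrow promises that no Exception can escape from this function.
bool isFree(const int[] row, int digit) pure nothrow @safe
{
    foreach (cell; row)
        if (cell == digit)
            return false;
    return true;
}

// Hypothetical helper: counts the digits 1..9 that could still go in `row`.
int countCandidates(const int[] row) pure nothrow @safe
{
    int n = 0;
    foreach (digit; 1 .. 10)
        if (isFree(row, digit))
            ++n;
    return n;
}

void main()
{
    // 0 marks an empty cell; 5, 3 and 7 are already placed.
    int[] row = [5, 3, 0, 0, 7, 0, 0, 0, 0];
    writeln(countCandidates(row));
}
```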
Jan 24 2013
prev sibling next sibling parent "Rob T" <alanb ucora.com> writes:
On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

I generally find that dmd-compiled programs run at about half the speed of those built with gdc or ldc (the latter seem pretty much equivalent these days, some programs run faster compiled with one, some with the other). That's running off latest GitHub source for all compilers. That's been a fairly consistent speed difference for a long time. And yes, I'm using 64-bit.

You are taking care to compare with full optimization flag settings? I'm sure you are, but I ask just in case. --rt
Jan 24 2013
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
There is also Plot2Kill, that David Simcha developed for his own
thesis, to do 2D drawings:

https://github.com/dsimcha/Plot2kill


I used it 1 (2?) years ago, and it worked well. It was quite nice to
be able to generate / manipulate data in D, and then to keep the same
powerful language for graphs.
Jan 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
 On 1/24/2013 8:36 AM, H. S. Teoh wrote:
Nevertheless, I also have made the same observation that code
produced by gdc consistently outperforms code produced by dmd.
Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
pretty big discrepancy for me, esp. when I'm doing compute-intensive
geometric computations.

Do you mean floating point code? 32 or 64 bit?

Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.


T

-- 
The irony is that Bill Gates claims to be making a stable operating system and Linus Torvalds claims to be trying to take over the world. -- Anonymous
Jan 24 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/24/2013 10:05 PM, Rob T wrote:
 You are taking care to compare with full optimization flag settings? I'm sure
 you are, but I ask just in case.

I use -O -release -inline typically (I use the dmd-ish interfaces for gdc and ldc as well). Absent any optimizations, executables seem to run at about the same speed no matter what compiler is used.

Interestingly, at least on the code that I just tested with, the different compilers react differently to different optimizations: dmd gains much less from -O than gdmd, and ldmd2 gains much more than both of the others.

Adding -inline doesn't seem to affect executable speed at all (this is probably a quirk of the particular code I'm testing with). Adding -release speeds up executables about as much as -O (for dmd and gdmd) and maybe makes a slight additional speedup for ldmd2.

With -O -release -inline, executables compiled with gdmd and ldmd2 seem to run at about the same speed. Interestingly, using -release alone results in about the same executable speed for both gdmd and ldmd2, but using -O alone means ldmd2-compiled executables are as fast as gdmd-compiled executables compiled with both -O and -release. That surely means that these identical DFLAGS translate in practice into different underlying optimizations depending on the compiler.

Of course, these are very casual and trivial tests using a single piece of code -- here if you want to repeat the tests: https://github.com/WebDrake/Dregs -- but they reflect my typical experience with the different D compilers.
Jan 24 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/25/2013 01:02 AM, Joseph Rushton Wakeling wrote:
 but they reflect my typical experience with the different D compilers.

The caveat here is that these results are typical for _number-crunching_ code. If the dominant factor in your program's speed is e.g. console output, you'll find the differences between the compilers much less noticeable.

For example: I have a piece of code that implements a Monte Carlo simulation and prints an update of its status at each time step -- with -O -release -inline flags, this runs in about 23s with gdmd, 25s with ldmd2 and 28s with dmd. If I remove the writef statements, leaving just the number-crunching part, it runs in about 4s with gdmd, 7s with ldmd2 and 14s (!) with dmd.
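
A rough sketch of how the two parts can be timed separately inside one program, assuming a reasonably recent Phobos that provides std.datetime.stopwatch. The crunch function is only a stand-in for the number-crunching part (it is not the Monte Carlo code mentioned above), and the step counts are arbitrary.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Stand-in for the number-crunching part: a partial harmonic sum.
double crunch(size_t steps) pure nothrow @safe
{
    double acc = 0.0;
    foreach (i; 0 .. steps)
        acc += 1.0 / (1.0 + i);
    return acc;
}

void main()
{
    enum steps = 1_000_000;

    // Time the computation on its own.
    auto sw = StopWatch(AutoStart.yes);
    immutable result = crunch(steps);
    immutable computeTime = sw.peek;

    // Restart from zero and time the console output on its own.
    sw.reset();
    foreach (i; 0 .. 1_000)
        writefln("step %s: %s", i, result);
    immutable outputTime = sw.peek;

    writefln("compute: %s, output: %s", computeTime, outputTime);
}
```

Comparing the two durations across compilers shows whether a difference comes from the generated code or from the output path.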
Jan 24 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/24/2013 05:36 PM, H. S. Teoh wrote:
 I think it would be ideal if the dmd front end were more isolated from
 the back end, so that it's easier to port to gdc/ldc (i.e. it can happen
 in the matter of days after a dmd release, not, say, weeks or months).

Case in point -- today I got bitten by this issue: http://forum.dlang.org/thread/sntkmtabuhuctcbnlsgq@forum.dlang.org

AFAICT it's fixed in 2.061, and it certainly doesn't show up when compiling with latest-git dmd. But as 2.061 isn't yet merged into ldc or gdc, both of these compilers are temporarily out of commission ...
 But I believe Walter has already said that patches to this effect are
 welcome, so I can only see the situation improve in the future.

Yes, my impression too, and I know that some people have been putting work towards it.
Jan 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 24, 2013 at 03:18:01PM -0800, Walter Bright wrote:
 On 1/24/2013 1:13 PM, H. S. Teoh wrote:
On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
On 1/24/2013 8:36 AM, H. S. Teoh wrote:
Nevertheless, I also have made the same observation that code
produced by gdc consistently outperforms code produced by dmd.
Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
pretty big discrepancy for me, esp. when I'm doing compute-intensive
geometric computations.

Do you mean floating point code? 32 or 64 bit?

Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.

Next, are you using floats, doubles, or reals?

Both reals and floats. Well, let's get some real measurements. Here's a quick run-through of various test programs I have lying around:

Test program #1 (iterating 2-variable function over grid), uses reals:
- Test case with n=400:
        Using DMD:      ~8 seconds (consistently)
        Using GDC:      ~6 seconds (consistently)
        * So the DMD version is 33% slower than the GDC version.
          (That is, 8/6*100 = 133%, so 33% slower.)
- Test case with n=600:
        Using DMD:      ~27 seconds (consistently)
        Using GDC:      ~19 seconds (consistently)
        * So the DMD version is 42% slower than the GDC version.

Test program #2 (terrain generation simulator), uses floats:
(The running time of this one depends on the RNG, so I fixed the seed value in order to make a fair comparison.)
- Test case with seed=380170304, n=20 with water & wind simulation:
        Using DMD:      ~10 seconds (consistently)
        Using GDC:      ~7 seconds (consistently)
        * So the DMD version is 42% slower than the GDC version.
- Test case with seed=380170304, n=25 with water & wind simulation:
        Using DMD:      ~14 seconds (consistently)
        Using GDC:      ~9 seconds (consistently)
        * So the DMD version is 55% slower than the GDC version.

Test program #3 (enumeration of coordinates of n-dimensional polytopes), uses reals:
- All permutations and changes of sign of <1,2,3,4,5,6,7>:
        Using DMD:      ~4 seconds (consistently)
        Using GDC:      ~3 seconds (consistently)
        * So the DMD version is 33% slower than the GDC version.
- All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
        Using DMD:      ~41 seconds (consistently)
        Using GDC:      ~27 seconds (consistently)
        * So the DMD version is 51% slower than the GDC version.
- Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
        Using DMD:      ~40 seconds (consistently)
        Using GDC:      ~27 seconds (consistently)
        * So the DMD version is 48% slower than the GDC version.

All test programs were compiled with dmd -O for the DMD version, and gdc -O3 for the GDC version. The source code is unchanged between the two compilers, and there are no version()'s that depend on a particular compiler. The measurements stated above are averages of about 3-4 runs.

As you can see, the performance difference between the two is pretty clear. I'm pretty sure this isn't only because of floating point operations, because the above test programs all use a lot of inner loops, and GDC does some pretty sophisticated loop unrolling and other such optimizations.


T

-- 
Two wrongs don't make a right; but three rights do make a left...
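
For reference, the "N% slower" figures can be recomputed from the quoted timings with a few lines of D. This is only a sketch: the struct and function names are invented for illustration, and the timings are the rounded values listed above. The figures in the post appear to be truncated rather than rounded (for example, 10/7 gives about 42.9%, listed as 42%).

```d
import std.stdio : writefln;

// One DMD/GDC timing pair in seconds (hypothetical helper type).
struct Timing
{
    string label;
    double dmd;
    double gdc;
}

// Percentage by which `slow` exceeds `fast`: (slow / fast - 1) * 100.
double percentSlower(double slow, double fast) pure nothrow @safe
{
    return (slow / fast - 1.0) * 100.0;
}

void main()
{
    // Timings copied from the measurements quoted above.
    immutable Timing[] timings = [
        Timing("#1 n=400", 8.0, 6.0),
        Timing("#1 n=600", 27.0, 19.0),
        Timing("#2 n=20", 10.0, 7.0),
        Timing("#2 n=25", 14.0, 9.0),
        Timing("#3 <1..7>", 4.0, 3.0),
        Timing("#3 <1..7,7>", 41.0, 27.0),
        Timing("#3 <1..8> even", 40.0, 27.0),
    ];

    foreach (t; timings)
        writefln("%-15s DMD is %.1f%% slower than GDC",
                 t.label, percentSlower(t.dmd, t.gdc));
}
```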
Jan 24 2013
prev sibling next sibling parent "lomereiter" <lomereiter gmail.com> writes:
On Friday, 25 January 2013 at 00:25:46 UTC, Joseph Rushton 
Wakeling wrote:
 If I remove the writef statements, leaving just the 
 number-crunching part, it runs in about 4s with gdmd, 7s with 
 ldmd2 and 14s (!) with dmd.

From my experience, writef and friends are substantially slower than printf. I wouldn't recommend using them for output-intensive applications. And of course, the best option is to avoid any format string parsing altogether, using only fwrite calls.
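
To make the three options concrete, here is a small sketch that writes the same line each way. The rawLine helper is hypothetical, and no timings are claimed here: which variant is faster will depend on the Phobos version and the C runtime, so it is worth measuring on your own setup.

```d
import core.stdc.stdio : fwrite, printf, stdout;
import std.conv : to;
import std.stdio : writef;

// Builds the line by plain concatenation, so no format string is parsed.
// (Hypothetical helper, named here only for illustration.)
string rawLine(int step)
{
    return "step " ~ to!string(step) ~ "\n";
}

void main()
{
    immutable int step = 42;

    // 1. Phobos formatted output.
    writef("step %s\n", step);

    // 2. C formatted output.
    printf("step %d\n", step);

    // 3. Pre-built bytes handed straight to the C stream.
    immutable line = rawLine(step);
    fwrite(line.ptr, 1, line.length, stdout);
}
```

All three calls go through the same C stdout stream, so the lines come out in program order.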
Jan 24 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
 All test programs were compiled with dmd -O for the DMD version, and gdc -O3 for the GDC version.

Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.
Jan 25 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 25 January 2013 10:27, John Colvin <john.loughran.colvin gmail.com> wrote:
 Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.

But then you'd have to do gdc -O3 -frelease. :-)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 25 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
 On 25 January 2013 10:27, John Colvin <john.loughran.colvin gmail.com> wrote:
 Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is more comparable.

But then you'd have to do gdc -O3 -frelease. :-)

Ah yes, of course :)
Jan 25 2013
prev sibling parent "Rob T" <alanb ucora.com> writes:
On Thursday, 24 January 2013 at 10:17:50 UTC, Walter Bright wrote:
 On 1/23/2013 6:36 PM, Rob T wrote:
 BTW the D version of my sqlite3 lib is at least 1/3 smaller 
 than the C++
 version, and not only is it smaller, but it is far more 
 flexible due to the use
 of templates (I just could not make much use out of C++ 
 templates). A reduction
 like that is very significant. For large projects, it's a 
 drastic reduction in
 development costs and perhaps more so in long term maintenance 
 costs.

Interesting. I found the same percentage reduction in translating C++ code to D.

I wonder what the main reasons are for the reduction?

I did make my D version of the sqlite3 lib slightly better by removing some redundancies, but that had only a ~100 line effect on the size difference. I know that the basic design is pretty much the same, so there's no radical design change that would account for the difference.

It could be that I did a better job in subtle ways when converting over from C++ because of the experience gained from the original work, for example I think the error detection and reporting I have in the D version is much simpler, and likely accounts for some of the size difference. The question though, is could I have implemented the same changes in the C++ version just as easily? I'm not so sure about that because when I program in D, it "feels" better in terms of being much less tedious to work with, so there must be more going on than just a few design choices.

I also find that I get into these "ah ha" moments, where I realize that I don't have to do much of anything extra to make something new work - hard to explain without real examples, but I know I run into these when working with D more so than when working with C++.

An interesting test would be to translate a D program into a C++ one, to see if the C++ version will shrink due to subtle improvements, but I think that would be very difficult to do if there are templates involved. You just cannot make heavy use out of templates in C++ like you can in D.

Have you ever translated from D to C++?

--rt
Jan 25 2013