digitalmars.D - D for scientific computing


"Alan" <geouke gmail.com> writes:

I saw an old thread from 2004 while doing a google search that 
discussed D and scientific computing and was looking for some 
more recent information or opinions from people who have used it 
for such purposes.

I am a graduate student and my thesis work is in numerical 
modeling. While I have some experience using Fortran and C, I am 
not obligated to use any particular language for my work. I like 
the design goals behind D and the syntax. I would like to 
know if D can compete with C or Fortran for numerical work.

Is anyone out there using D for heavy numeric work?

Jan 23 2013

"Stephan" <stephan_schiffels mac.com> writes:

On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

Hi Alan,

I use D to build a fairly large project to analyze whole genome 
sequences from multiple individuals. I will upload it to 
Bitbucket soon and will let people on this forum know. 
I use it as a straight replacement for C++, meaning I use it 
for all the numeric work I previously did in C and C++.

To name a few highlights:
You can very easily adapt the code samples from Numerical Recipes 
3rd edition (which are in C++) to D, with much more convenient 
built-in arrays and associative arrays.
Also, you can link the GNU Scientific Library (GSL) straight into 
your D code. This had some bugs in previous versions of 
the compiler, but now it works flawlessly.
I use GSL's vector type to do very fast matrix-matrix 
multiplications through GSL's BLAS interface. Also, I use GSL's 
special functions.

So I think D is ideal for scientific developers that start new 
projects. We don't have to convince huge teams to venture into a 
new language. We can just pick the best there is :-)

Stephan

Jan 23 2013

"Stephan" <stephan_schiffels mac.com> writes:

On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

I actually forgot the main argument for me: D allows safe 
multithreading right out of the box, which is a huge advantage I 
think. I never wrote anything multithreaded before, until I 
started in D. I recommend Andrei's book; check out the two 
chapters that are linked on the D page.
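
A minimal sketch of what data-parallel code can look like with std.parallelism from the standard library. This is only one of D's concurrency facilities and an assumption about what is meant above; the names square, fillSquareRoots and sumOfSquares are made up for illustration.

```d
import std.algorithm : map;
import std.math : sqrt;
import std.parallelism : parallel, taskPool;
import std.range : iota;

/// Hypothetical helper: squares an index as a double.
double square(size_t i)
{
    return cast(double) i * cast(double) i;
}

/// Fills data[i] with sqrt(i); iterations are spread over the default thread pool.
void fillSquareRoots(double[] data)
{
    foreach (i, ref x; parallel(data))
        x = sqrt(cast(double) i);
}

/// Sums the squares of 0 .. n-1 with a parallel reduction.
/// The seed 0.0 is the identity for addition, as the reduction requires.
double sumOfSquares(size_t n)
{
    return taskPool.reduce!"a + b"(0.0, iota(n).map!square);
}
```

Each loop iteration writes only its own element, so no locking is needed here; code that shares mutable state between threads would need explicit synchronization or message passing instead.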

Stephan

Jan 23 2013

"Joshua Niehus" <jm.niehus gmail.com> writes:

On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 to know if D can compete with C or Fortran for numerical work.

https://github.com/kyllingstad/scid

You don't need to compete, you can take established "good and 
fast" FORTRAN/C code and use it within your own D program.
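
As a rough illustration of reusing compiled C code from D, the sketch below declares a function from the C math library with extern (C) and calls it directly. hypot is a real libm function; norm2 is a hypothetical wrapper added only for this example, and it assumes the C math library is linked in, which is the usual default.

```d
// Declaration only: the implementation is the existing compiled C routine.
extern (C) nothrow double hypot(double x, double y);

/// Euclidean length of the vector (x, y), delegating to the C function.
double norm2(double x, double y)
{
    return hypot(x, y);
}
```

The same mechanism applies to any C library whose functions are declared this way, as long as the library is passed to the linker.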

Jan 23 2013

"Joshua Niehus" <jm.niehus gmail.com> writes:

On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

I forgot to add:
If you are doing new stuff then D can be as fast as anything else, 
provided the algorithm is sound, optimizers turned on, sprinkle 
in a lil assembly, etc...

Jan 23 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

 I forgot to add:
 If you are doing new stuff then D can be as fast as anything else, 
 provided the algorithm is sound, optimizers turned on, sprinkle 
 in a lil assembly, etc...

  And use nothrow when it's applicable; Found with a sudoku solver 
how much nothrow was making an impact on the algorithm speed.

Jan 23 2013

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow wrote:
 On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus 
 wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

 I forgot to add:
 If you are doing new stuff then D can be as fast as anything else, 
 provided the algorithm is sound, optimizers turned on, 
 sprinkle in a lil assembly, etc...

  And use nothrow when it's applicable; Found with a sudoku 
 solver how much nothrow was making an impact on the algorithm 
 speed.

Do you know why ? It shouldn't.

Jan 24 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/24/13 5:21 AM, deadalnix wrote:
 On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow wrote:
 On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus wrote:
 You don't need to compete, you can take established "good and fast"
 FORTRAN/C code and use it within your own D program.

 I forgot to add:
 If you are doing new stuff then D can be as fast as anything else,
 provided the algorithm is sound, optimizers turned on, sprinkle in a
 lil assembly, etc...

 And use nothrow when it's applicable; Found with a sudoku solver how
 much nothrow was making an impact on the algorithm speed.

 Do you know why ? It shouldn't.

More code motion. It's a classic in C++ code as well (where it's more 
difficult to detect).

Andrei

Jan 24 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 24 January 2013 at 10:21:47 UTC, deadalnix wrote:
 On Thursday, 24 January 2013 at 02:19:06 UTC, Era Scarecrow 
 wrote:
 And use nothrow when it's applicable; Found with a sudoku 
 solver how much nothrow was making an impact on the algorithm 
 speed.

 Do you know why? It shouldn't.

  As mentioned somewhere, with nothrow the compiler can drop 
various checks and support for exceptions (assert/ensure don't 
throw exceptions, they throw errors instead).

  How big this overhead is I'm not sure, but the run time of my 
code went from some 30 seconds down to 7 or so. But nothrow can't 
be used everywhere.
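
For reference, a small sketch of what marking functions nothrow looks like. The function names are hypothetical and not taken from the sudoku solver; whether the attribute makes a measurable speed difference depends on the compiler and the code.

```d
/// Counts the set bits in mask; nothrow promises that no Exception escapes.
nothrow pure @safe uint countCandidates(uint mask)
{
    uint n = 0;
    while (mask)
    {
        mask &= mask - 1; // clear the lowest set bit
        ++n;
    }
    return n;
}

/// A nothrow caller: calling a function that may throw an Exception here
/// would be rejected at compile time unless it were wrapped in try/catch.
nothrow @safe uint totalCandidates(const uint[] masks)
{
    uint total = 0;
    foreach (m; masks)
        total += countCandidates(m);
    return total;
}
```

The compiler checks the attribute transitively, which is why nothrow cannot be applied to code that calls throwing library functions without handling the exceptions.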

Jan 24 2013

"Rob T" <alanb ucora.com> writes:

On Thursday, 24 January 2013 at 00:35:13 UTC, Joshua Niehus wrote:
 On Thursday, 24 January 2013 at 00:29:15 UTC, Joshua Niehus 
 wrote:
 You don't need to compete, you can take established "good and 
 fast" FORTRAN/C code and use it within your own D program.

 I forgot to add:
 If you are doing new stuff then D can be as fast as anything else, 
 provided the algorithm is sound, optimizers turned on, sprinkle 
 in a lil assembly, etc...

.. also don't forget that there's a garbage collector which can 
have a huge impact on performance if you are doing a lot of 
memory allocations. The GC is adjustable to a degree, so 
performance problems can be solved provided that you are aware of 
them.

For example, I wrote a sqlite3 library in D, and for large SELECT 
returns it was 3 times slower than an almost identical C++ 
implementation.

The performance difference was resolved by disabling the GC prior 
to running the query and re-enabling afterwards. It was an easy 
fix, only two lines of code in one function.
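
A sketch of that pattern using core.memory.GC from the runtime. buildRows is a hypothetical stand-in for the query function, since the sqlite3 library itself is not shown here; scope (exit) is used so the collector is re-enabled even if an exception is thrown, which is a small variation on a plain disable/enable pair.

```d
import core.memory : GC;

/// Builds n small rows with automatic collections paused for the duration of the call.
int[][] buildRows(size_t n)
{
    GC.disable();             // suppress automatic collections from here on
    scope (exit) GC.enable(); // runs on every exit path, including exceptions

    int[][] rows;
    rows.reserve(n);
    foreach (i; 0 .. n)
        rows ~= [cast(int) i, cast(int) i * 2]; // allocation-heavy work
    return rows;
}
```

Note that GC.disable only suppresses automatic collections; the runtime may still collect if it would otherwise run out of memory.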

BTW the D version of my sqlite3 lib is at least 1/3 smaller than 
the C++ version, and not only is it smaller, but it is far more 
flexible due to the use of templates (I just could not make much 
use out of C++ templates). A reduction like that is very 
significant. For large projects, it's a drastic reduction in 
development costs and perhaps more so in long term maintenance 
costs.

--rt

Jan 23 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/23/2013 6:36 PM, Rob T wrote:
 BTW the D version of my sqlite3 lib is at least 1/3 smaller than the C++
 version, and not only is it smaller, but it is far more flexible due to the use
 of templates (I just could not make much use out of C++ templates). A reduction
 like that is very significant. For large projects, it's a drastic reduction in
 development costs and perhaps more so in long term maintenance costs.

Interesting. I found the same percentage reduction in translating C++ code to D.

Jan 24 2013

"Rob T" <alanb ucora.com> writes:

On Thursday, 24 January 2013 at 10:17:50 UTC, Walter Bright wrote:
 On 1/23/2013 6:36 PM, Rob T wrote:
 BTW the D version of my sqlite3 lib is at least 1/3 smaller 
 than the C++
 version, and not only is it smaller, but it is far more 
 flexible due to the use
 of templates (I just could not make much use out of C++ 
 templates). A reduction
 like that is very significant. For large projects, it's a 
 drastic reduction in
 development costs and perhaps more so in long term maintenance 
 costs.

 Interesting. I found the same percentage reduction in 
 translating C++ code to D.

I wonder what the main reasons are for the reduction? I did make 
my D version of the sqlite3 lib slightly better by removing some 
redundancies, but that had only a ~100 line effect on the size 
difference. I know that the basic design is pretty much the same, 
so there's no radical design change that would account for the 
difference.

It could be that I did a better job in subtle ways when 
converting over from C++ because of the experience gained from 
the original work, for example I think the error detection and 
reporting I have in the D version is much simpler, and likely 
accounts for some of the size difference. The question though, is 
could I have implemented the same changes in the C++ version just 
as easily? I'm not so sure about that because when I program in 
D, it "feels" better in terms of being much less tedious to work 
with, so there must be more going on than just a few design 
choices. I also find that I get into these "ah ha" moments, where 
I realize that I don't have to do much of anything extra to make 
something new work - hard to explain without real examples, but I 
know I run into these when working with D more so than when 
working with C++.

An interesting test would be to translate a D program into a C++ 
one, to see if the C++ version will shrink due to subtle 
improvements, but I think that would be very difficult to do if 
there are templates involved. You just cannot make heavy use out 
of templates in C++ like you can in D.

Have you ever translated from D to C++?

--rt

Jan 25 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/25/2013 9:45 AM, Rob T wrote:
 I wonder what the main reasons are for the reduction?

Some reasons:

1. elimination of .h files
2. array & string handling was so much more straightforward
3. elimination of need for many constructors and code to initialize things
4. easier cleanup with scope statement
5. templates are much more concise
6. a lot of boilerplate member functions are simply unnecessary in D
7. static if eliminates a lot of template source bloat
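
To make items 4 and 7 concrete, here is a small sketch; the function names are invented for illustration and the code is not taken from any of the translated projects.

```d
import std.traits : isFloatingPoint;

/// One template body for all types; static if picks the branch at compile
/// time, so no separate specializations are needed.
string describe(T)(T value)
{
    static if (isFloatingPoint!T)
        return "floating point";
    else static if (is(T : long))
        return "integer";
    else
        return "other";
}

/// Records events in order to show when a scope (exit) statement runs.
string[] traceScopeExit()
{
    string[] log;
    {
        log ~= "acquire";
        scope (exit) log ~= "release"; // cleanup sits next to the acquisition
        log ~= "work";
    }
    return log;
}
```

The scope (exit) statement runs when the enclosing block is left, so the log comes back in the order acquire, work, release.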

 Have you ever translated from D to C++?

Haven't tried that!

Jan 25 2013

"Alan" <geouke gmail.com> writes:

My project will be working with the USGS Modflow model (a 
finite-difference model for groundwater flow written in Fortran). 
Thankfully, it works with text input and output. So, the bit of 
the program I will be writing (which will talk back and forth to 
Modflow through text) can be written in whatever language I 
choose.

I guess since the source code is available it might make sense to 
modify it to take i/o directly as a function called from D (if D 
is happy to do that).

I ordered Andrei's book a couple days ago and am oddly excited 
about it arriving. I have worked a bit with Fortran, C, and C++. 
C++ needs to die. C is lacking a couple useful features (rather, 
it is more that they are not made as convenient to implement). 
Fortran does not have as many intrinsic functions and easy access 
to do "cool stuff" :P

I am glad to hear that D has been crunching numbers at a 
reasonable rate for you guys. My concern was that I will 
potentially be working on projects that might take a week or two 
to run on a small computer cluster and I do not want it to take 
four weeks to run something that I could run in two had I written 
it in Fortran.

Jan 23 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

There is also Plot2Kill, that David Simcha developed for his own
thesis, to do 2D drawings:

https://github.com/dsimcha/Plot2kill


I used it 1 (2?) years ago, and it worked well. It was quite nice to
be able to generate / manipulate data in D, and then to keep the same
powerful language for graphs.

Jan 24 2013

"Nicolas Sicard" <dransic gmail.com> writes:

On Wednesday, 23 January 2013 at 22:39:04 UTC, Alan wrote:
 I saw an old thread from 2004 while doing a google search that 
 discussed D and scientific computing and was looking for some 
 more recent information or opinions from people who have used 
 it for such purposes.

 I am a graduate student and my thesis work is in numerical 
 modeling. While I have some experience using Fortran and C, I 
 am not obligated to use any particular language for my work. I 
 like the design goals behind D and the syntax. I would like 
 to know if D can compete with C or Fortran for numerical work.

 Is anyone out there using D for heavy numeric work?

The different D compilers available don't generate numeric code 
of the same quality, depending on the algorithms and data 
structures used. I have found in one of my projects that LDC 
produces code that is up to 5x or even 10x faster than DMD 
(though the average difference is less spectacular).

Jan 24 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/24/2013 2:08 AM, Nicolas Sicard wrote:
 The different D compilers available don't generate numeric code of the same
 quality, depending on the algorithms and data structures used. I have found in
 one of my projects that LDC produces code that is up to 5x or even 10x faster
 than DMD (though the average difference is less spectacular).

If you use the 64 bit model, dmd will use SIMD instructions for float and 
double, which are much faster.

Jan 24 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions for float and
 double, which are much faster.

I generally find that dmd-compiled programs run at about half the speed of those
built with gdc or ldc (the latter seem pretty much equivalent these days, some
programs run faster compiled with one, some with the other).  That's running off
latest GitHub source for all compilers.

That's been a fairly consistent speed difference for a long time.  And yes, I'm 
using 64-bit.

Jan 24 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/24/2013 2:41 AM, Joseph Rushton Wakeling wrote:
 That's been a fairly consistent speed difference for a long time.  And yes, I'm
 using 64-bit.

Is that with floating point code, or otherwise?

Jan 24 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/24/2013 11:49 AM, Walter Bright wrote:
 Is that with floating point code, or otherwise?

Yes, quite heavily floating-point.  I did once have a brief go at writing some 
entirely integer-based number-crunching code just to see if it made any 
difference, but I think other priorities intervened ... :-)

Jan 24 2013

"Nicolas Sicard" <dransic gmail.com> writes:

On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

 I generally find that dmd-compiled programs run at about half 
 the speed of those built with gdc or ldc (the latter seem 
 pretty much equivalent these days, some programs run faster 
 compiled with one, some with the other).  That's running off 
 latest GitHub source for all compilers.

 That's been a fairly consistent speed difference for a long 
 time.  And yes, I'm using 64-bit.

Same for me. The difference between ldc and dmd seems to be 
mainly due to optimizing and especially inlining (see 
http://d.puremagic.com/issues/show_bug.cgi?id=9320 for an example 
in that matter).

Jan 24 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

 I generally find that dmd-compiled programs run at about half 
 the speed of those built with gdc or ldc (the latter seem 
 pretty much equivalent these days, some programs run faster 
 compiled with one, some with the other).  That's running off 
 latest GitHub source for all compilers.

 That's been a fairly consistent speed difference for a long 
 time.  And yes, I'm using 64-bit.

I had similar experience with all my numerical code. gdc and ldc 
trade places but dmd is always solidly behind.

Walter, I know you like working with the current backend and you 
understand it etc..., but this gives dmd a bus factor of 1 and is 
slowing down code in the process.

Jan 24 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/24/2013 02:11 PM, John Colvin wrote:
 Walter, I know you like working with the current backend and you understand it
 etc..., but this gives dmd a bus factor of 1 and is slowing down code in the
 process.

Honestly, I don't feel this is too strong an issue.  The point of dmd is to be a
reference compiler -- speed is nice if it's possible, but not the most important
consideration.

The most important thing is that new frontend updates can get merged quickly 
into ldc/gdc, so that there is no time lag between new feature development and 
their incorporation into other compilers.

Jan 24 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 24 January 2013 at 16:28:29 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 02:11 PM, John Colvin wrote:
 Walter, I know you like working with the current backend and 
 you understand it
 etc..., but this gives dmd a bus factor of 1 and is slowing 
 down code in the
 process.

 Honestly, I don't feel this is too strong an issue.  The point 
 of dmd is to be a reference compiler -- speed is nice if it's 
 possible, but not the most important consideration.

Fair point, I guess the reference doesn't have to be fastest.

 The most important thing is that new frontend updates can get 
 merged quickly into ldc/gdc, so that there is no time lag 
 between new feature development and their incorporation into 
 other compilers.

This would be really great.

Jan 24 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 24, 2013 at 05:28:16PM +0100, Joseph Rushton Wakeling wrote:
 On 01/24/2013 02:11 PM, John Colvin wrote:
Walter, I know you like working with the current backend and you
understand it etc..., but this gives dmd a bus factor of 1 and is
slowing down code in the process.

 
 Honestly, I don't feel this is too strong an issue.  The point of
 dmd is to be a reference compiler -- speed is nice if it's possible,
 but not the most important consideration.

I think it would be ideal if the dmd front end were more isolated from
the back end, so that it's easier to port to gdc/ldc (i.e. it can happen
in the matter of days after a dmd release, not, say, weeks or months).

But I believe Walter has already said that patches to this effect are
welcome, so I can only see the situation improve in the future.

Nevertheless, I also have made the same observation that code produced
by gdc consistently outperforms code produced by dmd. Usually by about
20-30%, sometimes as much as 50-60%, IME. That's a pretty big
discrepancy for me, esp. when I'm doing compute-intensive geometric
computations.


 The most important thing is that new frontend updates can get merged
 quickly into ldc/gdc, so that there is no time lag between new
 feature development and their incorporation into other compilers.

Agreed.


T

-- 
It is of the new things that men tire --- of fashions and proposals and
improvements and change. It is the old things that startle and intoxicate. It
is the old things that are young. -- G.K. Chesterton

Jan 24 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/24/2013 8:36 AM, H. S. Teoh wrote:
 Nevertheless, I also have made the same observation that code produced
 by gdc consistently outperforms code produced by dmd. Usually by about
 20-30%, sometimes as much as 50-60%, IME. That's a pretty big
 discrepancy for me, esp. when I'm doing compute-intensive geometric
 computations.

Do you mean floating point code? 32 or 64 bit?

Jan 24 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
 On 1/24/2013 8:36 AM, H. S. Teoh wrote:
Nevertheless, I also have made the same observation that code
produced by gdc consistently outperforms code produced by dmd.
Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
pretty big discrepancy for me, esp. when I'm doing compute-intensive
geometric computations.

 
 Do you mean floating point code? 32 or 64 bit?

Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.


T

-- 
The irony is that Bill Gates claims to be making a stable operating
system and Linus Torvalds claims to be trying to take over the world. --
Anonymous

Jan 24 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/24/2013 1:13 PM, H. S. Teoh wrote:
 On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
 On 1/24/2013 8:36 AM, H. S. Teoh wrote:
 Nevertheless, I also have made the same observation that code
 produced by gdc consistently outperforms code produced by dmd.
 Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
 pretty big discrepancy for me, esp. when I'm doing compute-intensive
 geometric computations.

 Do you mean floating point code? 32 or 64 bit?

 Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.

Next, are you using floats, doubles, or reals?

Jan 24 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 24, 2013 at 03:18:01PM -0800, Walter Bright wrote:
 On 1/24/2013 1:13 PM, H. S. Teoh wrote:
On Thu, Jan 24, 2013 at 12:15:07PM -0800, Walter Bright wrote:
On 1/24/2013 8:36 AM, H. S. Teoh wrote:
Nevertheless, I also have made the same observation that code
produced by gdc consistently outperforms code produced by dmd.
Usually by about 20-30%, sometimes as much as 50-60%, IME. That's a
pretty big discrepancy for me, esp. when I'm doing compute-intensive
geometric computations.

Do you mean floating point code? 32 or 64 bit?

Floating-point, 64-bit, tested on dmd -O vs. gdc -O3.

 
 Next, are you using floats, doubles, or reals?

Both reals and floats. Well, let's get some real measurements. Here's a
quick run-through of various test programs I have lying around:


uses reals:
- Test case with n=400:
	Using DMD:	~8 seconds (consistently)
	Using GDC:	~6 seconds (consistently)
	* So the DMD version is 33% slower than the GDC version.
	  (That is, 8/6*100 = 133%, so 33% slower.)

- Test case with n=600:
	Using DMD:	~27 seconds (consistently)
	Using GDC:	~19 seconds (consistently)
	* So the DMD version is 42% slower than the GDC version.



(The running time of this one depends on the RNG, so I fixed the seed
value in order to make a fair comparison.)
- Test case with seed=380170304, n=20 with water & wind simulation:
	Using DMD:	~10 seconds (consistently)
	Using GDC:	~7 seconds (consistently)
	* So the DMD version is 42% slower than the GDC version.

- Test case with seed=380170304, n=25 with water & wind simulation:
	Using DMD:	~14 seconds (consistently)
	Using GDC:	~9 seconds (consistently)
	* So the DMD version is 55% slower than the GDC version.



uses reals:
- All permutations and changes of sign of <1,2,3,4,5,6,7>:
	Using DMD:	~4 seconds (consistently)
	Using GDC:	~3 seconds (consistently)
	* So the DMD version is 33% slower than the GDC version.

- All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
	Using DMD:	~41 seconds (consistently)
	Using GDC:	~27 seconds (consistently)
	* So the DMD version is 51% slower than the GDC version.

- Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
	Using DMD:	~40 seconds (consistently)
	Using GDC:	~27 seconds (consistently)
	* So the DMD version is 48% slower than the GDC version.


All test programs were compiled with dmd -O for the DMD version, and gdc
-O3 for the GDC version. The source code is unchanged between the two
compilers, and there are no version()'s that depend on a particular
compiler. The measurements stated above are averages of about 3-4 runs.

As you can see, the performance difference between the two is pretty
clear.  I'm pretty sure this isn't only because of floating point
operations, because the above test programs all use a lot of inner
loops, and GDC does some pretty sophisticated loop unrolling and other
such optimizations.
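
For anyone checking the percentages, the arithmetic can be sketched as below. The function names are hypothetical; the truncating variant is included because the quoted figures look truncated rather than rounded (14/9 gives 55.6, reported as 55), which is an observation and not something stated in the measurements.

```d
/// Percentage by which the slower run time exceeds the faster one.
double percentSlower(double slowSeconds, double fastSeconds)
{
    return (slowSeconds / fastSeconds - 1.0) * 100.0;
}

/// Same figure with the fractional part dropped.
int percentSlowerTruncated(double slowSeconds, double fastSeconds)
{
    return cast(int) percentSlower(slowSeconds, fastSeconds);
}
```

For example, percentSlowerTruncated(8, 6) gives 33 and percentSlowerTruncated(41, 27) gives 51, in line with the numbers listed for those test cases.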


T

-- 
Two wrongs don't make a right; but three rights do make a left...

Jan 24 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
 All test programs were compiled with dmd -O for the DMD 
 version, and gdc
 -O3 for the GDC version. The source code is unchanged between 
 the two
 compilers, and there are no version()'s that depend on a 
 particular
 compiler. The measurements stated above are averages of about 
 3-4 runs.

 As you can see, the performance difference is between the two 
 is pretty
 clear.  I'm pretty sure this isn't only because of floating 
 point
 operations, because the above test programs all use a lot of 
 inner
 loops, and GDC does some pretty sophisticated loop unrolling 
 and other
 such optimizations.


 T

Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release 
-inline -O" is more comparable.

Jan 25 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On 25 January 2013 10:27, John Colvin <john.loughran.colvin gmail.com>wrote:

 [...]

 Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release -inline -O" is
 more comparable.


But then you'd have to do gdc -O3 -frelease. :-)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Jan 25 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
 On 25 January 2013 10:27, John Colvin 
 <john.loughran.colvin gmail.com>wrote:

 [...]

 Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release 
 -inline -O" is
 more comparable.


 But then you'd have to do gdc -O3 -frelease. :-)

Ah yes, of course :)

Jan 25 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 25, 2013 at 04:09:25PM +0100, John Colvin wrote:
 On Friday, 25 January 2013 at 13:38:03 UTC, Iain Buclaw wrote:
On 25 January 2013 10:27, John Colvin
<john.loughran.colvin gmail.com>wrote:


[...]
Comparing dmd -O and gdc -O3 is hardly fair. "dmd -release
-inline -O" is more comparable.


But then you'd have to do gdc -O3 -frelease. :-)

 
 Ah yes, of course :)

Hmm. I didn't realize that dmd has a separate switch for function
inlining. Well, here's the updated numbers:


On Friday, 25 January 2013 at 01:41:12 UTC, H. S. Teoh wrote:
Both reals and floats. Well, let's get some real measurements.
Here's a quick run-through of various test programs I have lying
around:


uses reals:
- Test case with n=400:
        Using DMD:      ~8 seconds (consistently)
        Using GDC:      ~6 seconds (consistently)
        * So the DMD version is 33% slower than the GDC version.
          (That is, 8/6*100 = 133%, so 33% slower.)




Updated: DMD version with -inline takes ~7 seconds consistently, so we
have 7/6*100 = 116%, so 16% slower.


- Test case with n=600:
        Using DMD:      ~27 seconds (consistently)
        Using GDC:      ~19 seconds (consistently)
        * So the DMD version is 42% slower than the GDC version.




Updated: DMD version with -inline takes ~24 seconds consistently, so 26%
slower.



(The running time of this one depends on the RNG, so I fixed the seed
value in order to make a fair comparison.)
- Test case with seed=380170304, n=20 with water & wind simulation:
        Using DMD:      ~10 seconds (consistently)
        Using GDC:      ~7 seconds (consistently)
        * So the DMD version is 42% slower than the GDC version.




Updated: DMD version with -inline takes ~8 seconds consistently, so 14%
slower.


- Test case with seed=380170304, n=25 with water & wind simulation:
        Using DMD:      ~14 seconds (consistently)
        Using GDC:      ~9 seconds (consistently)
        * So the DMD version is 55% slower than the GDC version.




Updated: DMD version with -inline takes ~11 seconds consistently, so
22% slower.



polytopes),
uses reals:
- All permutations and changes of sign of <1,2,3,4,5,6,7>:
        Using DMD:      ~4 seconds (consistently)
        Using GDC:      ~3 seconds (consistently)
        * So the DMD version is 33% slower than the GDC version.




Updated: DMD version with -inline still takes ~4 seconds, so no
significant change here.


- All permutations and changes of sign of <1,2,3,4,5,6,7,7>:
        Using DMD:      ~41 seconds (consistently)
        Using GDC:      ~27 seconds (consistently)
        * So the DMD version is 51% slower than the GDC version.




Updated: DMD version with -inline takes about 36 seconds on average, so
about 33% slower.


- Even permutations and all changes of sign of <1,2,3,4,5,6,7,8>:
        Using DMD:      ~40 seconds (consistently)
        Using GDC:      ~27 seconds (consistently)
        * So the DMD version is 48% slower than the GDC version.




Updated: DMD version with -inline takes about 38 seconds, so 41% slower.

Conclusions:
- The performance gap is smaller than previously thought, but it's still
  present.
- I will be using -inline with dmd aggressively.
- What other dmd options am I missing that will bring dmd on par with
  gdc -O3 (if there are any)?
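
The percentages above can be re-derived from the raw timings. Here is a minimal sketch in Python; the function name `percent_slower` is made up for illustration, and the numbers are the DMD -inline and GDC timings quoted in this post. The post rounds loosely, so the last digit of a result may differ by one from the figure stated above.

```python
def percent_slower(slow_seconds, fast_seconds):
    """Return how much longer the slow run takes, as a percentage of the fast run."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0

# (label, DMD seconds with -inline, GDC seconds), copied from the figures above
timings = [
    ("n=400", 7, 6),
    ("n=600", 24, 19),
    ("seed=380170304, n=20", 8, 7),
    ("seed=380170304, n=25", 11, 9),
    ("<1,2,3,4,5,6,7>", 4, 3),
    ("<1,2,3,4,5,6,7,7>", 36, 27),
    ("<1,2,3,4,5,6,7,8>, even permutations", 38, 27),
]

for label, dmd, gdc in timings:
    print(f"{label}: DMD is {percent_slower(dmd, gdc):.1f}% slower")
```

Since the timings are whole seconds, a one-second measurement error on the short runs shifts the percentage by 10 points or more, so the longer runs are the more trustworthy ones.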


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.

Jan 25 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:
 [...]
 Conclusions:
 - The performance gap is smaller than previously thought, but it's still
   present.
 - I will be using -inline with dmd aggressively.
 - What other dmd options am I missing that will bring dmd on par with
   gdc -O3 (if there are any)?

I have sometimes found that using -release and -noboundscheck 
made a bigger difference to dmd than to gdc. The corresponding 
gdc options are -frelease and -fno-bounds-check.

Comparing performance without -release isn't that meaningful.

Jan 25 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 25, 2013 at 05:50:21PM +0100, John Colvin wrote:
 On Friday, 25 January 2013 at 16:09:00 UTC, H. S. Teoh wrote:

[...]
Conclusions:
- The performance gap is smaller than previously thought, but it's
  still present.
- I will be using -inline with dmd aggressively.
- What other dmd options am I missing that will bring dmd on par
  with gdc -O3 (if there are any)?


T

 
 I have sometimes found that using -release and -noboundscheck made a
 bigger difference to dmd than to gdc. The corresponding gdc options
 are -frelease and -fno-bounds-check
 
 Comparing performance without -release isn't that meaningful.

Alright. So to make the comparison fair(er), I recompiled test program


	dmd -O -inline -m64 -release -noboundscheck
	gdc -O3 -m64 -frelease -fno-bounds-check



	With DMD: 15 seconds (average of 4 runs)
	With GDC: 11 seconds (average of 4 runs)

There's still a 36% performance difference.
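
An "average of 4 runs" measurement like the one above can be scripted. This is only a sketch, not the procedure actually used for these numbers: `average_runtime` is a hypothetical helper, and the binary names in the usage note are placeholders for the same program built with each compiler.

```python
import subprocess
import time

def average_runtime(command, runs=4):
    """Run `command` (a list of arguments) `runs` times and return the
    mean wall-clock time in seconds. Program output is discarded."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        total += time.perf_counter() - start
    return total / runs
```

Usage would look like `average_runtime(["./test_dmd"])` and `average_runtime(["./test_gdc"])`, where both (placeholder) binaries come from identical source. Wall-clock time includes process start-up, which is negligible for runs of several seconds.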


simulation), using seed=380170304, with wind & water simulation, and
n=30 (I increased the iteration count to make measurement noise less
prominent). Here's the new results:

	With DMD: 11 seconds (average of 4 runs)
	With GDC: 9 seconds (average of 4 runs)

So a gap of 22% is still present.


compiling with -release -O -noboundscheck -inline), so I don't have the
test results for that yet. I'll try to figure out what's causing the
linker error and post the results later.

In the meantime, it's clear that GDC is still showing significant
performance improvement over DMD.  There is a _consistent_ 20-30%
difference in performance in all of the tests so far. So I think at this
point it's fair to say that GDC's back end produces superior code in
terms of performance.  (I will note, though, that GDC produces larger
executables than DMD, sometimes much larger, so space-wise, there is
some price to pay.)


T

-- 
Chance favours the prepared mind. -- Louis Pasteur

Jan 25 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 1/25/2013 11:46 AM, H. S. Teoh wrote:
 So I think at this
 point it's fair to say that GDC's back end produces superior code in
 terms of performance.

If you're feeling ambitious, taking a closer look to see why would be most 
interesting.

Jan 25 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/26/2013 02:37 AM, Walter Bright wrote:
 If you're feeling ambitious, taking a closer look to see why would be most
 interesting.

It's nice if DMD can produce faster code, but in the short term I'd rather see 
priority being given to making the frontend/druntime more easily portable to 
different backends.

The speed issues of DMD have never bothered me, precisely because GDC and LDC 
exist -- and besides speed, there's also the issue of target architectures. The 
problem is rather having to wait for bugfixes and new features to propagate to 
the D compilers which already solved the speed and architecture issues.

Jan 26 2013

"mist" <none none.none> writes:

On Saturday, 26 January 2013 at 15:17:18 UTC, Joseph Rushton 
Wakeling wrote:
 ...

++

Once situation with front-end bugs and stability is settled, I 
see zero reasons to use dmd back-end and spending efforts on its 
optimization feels not pragmatical.

Jan 26 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/26/2013 04:26 PM, mist wrote:
 Once situation with front-end bugs and stability is settled, I see zero reasons
 to use dmd back-end and spending efforts on its optimization feels not
pragmatical.

Actually, I feel somewhat the contrary.  When the problem of frontend/runtime 
portability has been solved, then it makes plenty of sense to look at DMD speed 
and backend issues.  Improving DMD is always a good thing -- it's just a 
question of priorities.

Jan 26 2013

"mist" <none none.none> writes:

Yes, of course, we all have our own preferences, that is fine :) 
I mean a bit different thing: front-end efforts affect all major 
compiler lovers, not only one group and thus are more important.

Jan 26 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/26/2013 04:43 PM, mist wrote:
 Yes, of course, we all have our own preferences, that is fine :) I mean a bit
 different thing: front-end efforts affect all major compiler lovers, not only
 one group and thus are more important.

Yup, agree. :-)

Jan 26 2013

Marco Leise <Marco.Leise gmx.de> writes:

I use DMD and GDC. DMD for debug builds and (since it is the
reference compiler) to ensure language conformance. GDC for
performance tests and release.
In other terms, I don't expect the W3 reference browser to be
the fastest, but to set the required standard for HTML
interpretation.
If you asked me, I'd keep all smart compiler optimizations
out of DMD for sake of stability, compilation speed and
maintenance effort.
Some of what GCC does is amazing, but probably requires heaps
of difficult to read code. (I once saw it SSE optimize my code
where I was using a 4-byte struct with 3 used bytes that I did
computations on in a loop.)

-- 
Marco

Jan 26 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/24/2013 05:36 PM, H. S. Teoh wrote:
 I think it would be ideal if the dmd front end were more isolated from
 the back end, so that it's easier to port to gdc/ldc (i.e. it can happen
 in the matter of days after a dmd release, not, say, weeks or months).

Case in point -- today I got bitten by this issue:
http://forum.dlang.org/thread/sntkmtabuhuctcbnlsgq forum.dlang.org

AFAICT it's fixed in 2.061, and it certainly doesn't show up when compiling with 
latest-git dmd.  But as 2.061 isn't yet merged into ldc or gdc, both of these 
compilers are temporarily out of commission ...

 But I believe Walter has already said that patches to this effect are
 welcome, so I can only see the situation improve in the future.

Yes, my impression too, and I know that some people have been putting work 
towards it.

Jan 24 2013

"Rob T" <alanb ucora.com> writes:

On Thursday, 24 January 2013 at 10:42:10 UTC, Joseph Rushton 
Wakeling wrote:
 On 01/24/2013 11:16 AM, Walter Bright wrote:
 If you use the 64 bit model, dmd will use SIMD instructions 
 for float and
 double, which are much faster.

 I generally find that dmd-compiled programs run at about half 
 the speed of those built with gdc or ldc (the latter seem 
 pretty much equivalent these days, some programs run faster 
 compiled with one, some with the other).  That's running off 
 latest GitHub source for all compilers.

 That's been a fairly consistent speed difference for a long 
 time.  And yes, I'm using 64-bit.

You are taking care to compare with full optimization flag 
settings? I'm sure you are, but I ask just in case.

--rt

Jan 24 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/24/2013 10:05 PM, Rob T wrote:
 You are taking care to compare with full optimization flag settings? I'm sure
 you are, but I ask just in case.

I use -O -release -inline typically (I use the dmd-ish interfaces for gdc and 
ldc as well).

Absent any optimizations, executables seem to run at about the same speed no 
matter what compiler is used.  Interestingly, at least on the code that I just 
tested with, the different compilers react differently to different 
optimizations: dmd gains much less from -O than gdmd, and ldmd2 gains much more 
than both of the others.  Adding -inline doesn't seem to affect executable speed 
at all (this is probably a quirk of the particular code I'm testing with). 
Adding -release speeds up executables about as much as -O (for dmd and gdmd) and 
maybe makes a slight additional speedup for ldmd2.

With -O -release -inline, executables compiled with gdmd and ldmd2 seem to run 
at about the same speed.  Interestingly, using -release alone results in about 
the same executable speed for both gdmd and ldmd2, but using -O alone means 
ldmd2-compiled executables are as fast as gdmd-compiled executables compiled 
with both -O and -release.

That surely means that these identical DFLAGS translate in practice into 
different underlying optimizations depending on the compiler.
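
To repeat this kind of comparison systematically, one could enumerate every combination of the three flags for each dmd-style driver. The sketch below only prints the command lines and does not run any compiler; the source file name `test.d` and the helper names are placeholders, not part of the tests described here.

```python
from itertools import combinations

DRIVERS = ["dmd", "gdmd", "ldmd2"]     # dmd-style drivers mentioned in this post
FLAGS = ["-O", "-release", "-inline"]  # the flags being compared

def flag_subsets(flags):
    """Yield every subset of `flags`, from no flags up to all of them."""
    for size in range(len(flags) + 1):
        for subset in combinations(flags, size):
            yield list(subset)

def build_commands(source="test.d"):
    """Return one compile command (as an argument list) per driver and flag subset."""
    return [[driver, *subset, source]
            for driver in DRIVERS
            for subset in flag_subsets(FLAGS)]

for command in build_commands():
    print(" ".join(command))
```

Three flags give 8 subsets, so the full matrix is 24 builds. Timing each one separately shows which flag matters for which compiler.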

Of course, these are very casual and trivial tests using a single piece of code 
-- here if you want to repeat the tests: https://github.com/WebDrake/Dregs -- 
but they reflect my typical experience with the different D compilers.

Jan 24 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 01/25/2013 01:02 AM, Joseph Rushton Wakeling wrote:
 but they reflect my typical experience with the different D compilers.

The caveat here is that these results are typical for _number-crunching_ code. 
If the dominant factor in your program's speed is e.g. console output, you'll 
find the differences between the compilers much less noticeable.  For example: I 
have a piece of code that implements a Monte Carlo simulation and prints an 
update of its status at each time step -- with -O -release -inline flags, this 
runs in about 23s with gdmd, 25 with ldmd2 and 28 with dmd.

If I remove the writef statements, leaving just the number-crunching part, it 
runs in about 4s with gdmd, 7s with ldmd2 and 14s (!) with dmd.
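
A rough way to read these numbers is to treat the difference between the two runs as the cost of the output calls. That assumes the two costs simply add up, which is only an approximation. The helper name `output_share` is made up for this sketch.

```python
# (driver, seconds with writef output, seconds without), from the figures above
measurements = [("gdmd", 23, 4), ("ldmd2", 25, 7), ("dmd", 28, 14)]

def output_share(with_output, without_output):
    """Fraction of the total runtime attributed to the removed output calls."""
    return (with_output - without_output) / with_output

for driver, total, compute in measurements:
    share = output_share(total, compute)
    print(f"{driver}: {total - compute}s of {total}s is output ({share:.0%})")
```

Under that assumption, output accounts for at least half of every run, which would explain why a 3.5x gap in number-crunching (14s vs 4s) shrinks to roughly 1.2x (28s vs 23s) once the printing is included.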

Jan 24 2013

"lomereiter" <lomereiter gmail.com> writes:

On Friday, 25 January 2013 at 00:25:46 UTC, Joseph Rushton 
Wakeling wrote:
 If I remove the writef statements, leaving just the 
 number-crunching part, it runs in about 4s with gdmd, 7s with 
 ldmd2 and 14s (!) with dmd.

 From my experience, writef and friends are substantially slower 
than printf. I wouldn't recommend using them for output-intensive 
applications. And of course, the best option is to avoid any 
format string parsing altogether, using only fwrite calls.

Jan 24 2013
