
digitalmars.D - dmd-x64

reply alkor <alkor au.ru> writes:
has anybody seen the 64-bit version of the dmd compiler?
Dec 21 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
alkor:

 has anybody seen the 64-bit version of the dmd compiler?
I can't see it. It must be absent. Bye, bearophile
Dec 21 2009
parent reply alkor <alkor au.ru> writes:
that's bad
d's good enough to make real projects, but the compiler MUST support linux x64 as
a target platform

i believe it's time to make 64-bit code generation

is it possible to take the back-end (i.e. code generation) from gcc, or is it too
complicated?
Dec 22 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
alkor wrote:
 that's bad
 d's good enough to make real projects, but the compiler MUST support linux x64 as
a target platform
 
 i believe it's time to make 64-bit code generation
 
 is it possible to take the back-end (i.e. code generation) from gcc, or is it too
complicated?
Look at gdc and ldc; both can target x86_64. gdc tends to lag behind (a lot) in the dmd front end, ldc not as much.
Dec 22 2009
parent reply Matt <revcompgeek gmail.com> writes:
On 12/22/09 2:34 AM, Travis Boucher wrote:
 alkor wrote:
 that's bad
 d's good enough to make real projects, but the compiler MUST support
 linux x64 as a target platform

 i believe it's time to make 64-bit code generation

 is it possible to take the back-end (i.e. code generation) from gcc, or
 is it too complicated?
Look at gdc and ldc; both can target x86_64. gdc tends to lag behind (a lot) in the dmd front end, ldc not as much.
GDC is being maintained again. See http://bitbucket.org/goshawk/gdc/wiki/Home They are up to DMD 1.043 and there has been significant activity recently. It could take a while for them to get fully caught up, but they are making good progress.
Dec 22 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
Matt wrote:
 On 12/22/09 2:34 AM, Travis Boucher wrote:
 alkor wrote:
 that's bad
 d's good enough to make real projects, but the compiler MUST support
 linux x64 as a target platform

 i believe it's time to make 64-bit code generation

 is it possible to take the back-end (i.e. code generation) from gcc, or
 is it too complicated?
Look at gdc and ldc; both can target x86_64. gdc tends to lag behind (a lot) in the dmd front end, ldc not as much.
GDC is being maintained again. See http://bitbucket.org/goshawk/gdc/wiki/Home They are up to DMD 1.043 and there has been significant activity recently. It could take a while for them to get fully caught up, but they are making good progress.
gdc is still lagging quite a bit; I've been following the goshawk branch. The problem is that he has to deal with both the major DMD changes (in 2 different D versions) and the big changes in GCC, so maintaining gdc is an annoying process, since there isn't a bit of support on either end of the bridge. (DM does what's best for DM, and gcc won't accept a language like D, even though it has more similarities to C/C++ than Java/Fortran/Ada do.)

ldc on the other hand has a great structure which promotes using it as a backend for a different front end; however, it doesn't (yet) generate code nearly as good as gcc's. dmd's focus seems to be more about being a reference compiler than a flexible compiler that generates great code.

Personally, I still use an old gdc based on GCC 4.1.3 and DMD 1.020, because it happens to be the one that best supports my platform (FreeBSD/amd64). The only real issues I run into are a few problems with CTFE and dsss/rebuild's handling of a few compiler errors (e.g. writefln("..."; results in rebuild exploding).
Dec 22 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Travis Boucher:
 ldc on the other hand has a great structure which promotes using it as a 
 backend for a different front end; however, it doesn't (yet) generate code 
 nearly as good as gcc's.
Can you explain what you mean? Bye, bearophile
Dec 22 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
bearophile wrote:
 Travis Boucher:
 ldc on the other hand has a great structure which promotes using it as a 
 backend for a different front end; however, it doesn't (yet) generate code 
 nearly as good as gcc's.
Can you explain what you mean? Bye, bearophile
llvm has been designed for use by code analyzers, compiler development, IDEs, etc. The APIs are well documented and well thought out, as is its IR (which is an assembler-like language itself). It is easy to use small parts of llvm due to its modular structure. Although its design promotes all sorts of optimization techniques, it's still pretty young (compared to gcc) and just doesn't have all of the optimization machinery gcc has.

gcc has evolved over a long time, and contains a lot of legacy cruft. Its IR changes on a (somewhat) regular basis, and its internals are a big hairy intertwined mess. Trying to learn one small part of how GCC works often involves learning how a lot of other unrelated things work. However, since it is so mature, many different optimization techniques have been developed, and continue to be developed as the underlying hardware changes. It also supports generating code for a huge number of targets.

When I say 'ldc' above, I really mean 'llvm' in general.
Dec 22 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Travis Boucher:
 Although its design 
 promotes all sorts of optimization techniques, it's still pretty young 
 (compared to gcc) and just doesn't have all of the optimization stuff 
 gcc has.
I have already done hundreds of tests and benchmarks with LDC and llvm-gcc, and I'm starting to understand its optimizations. I am still mostly ignorant of LLVM, but I'm giving a bit of help tuning it; this improvement was motivated by me: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html

Compared to GCC, LLVM lacks vectorization (this can be important for certain heavy numerical computing code) and profile-guided optimization (this is usually less important; it's uncommon for it to give more than a 5-25% performance improvement), but it has link-time optimization, which gcc lacks (about as important as profile-guided optimization or a little more). LLVM still produces bad X86 floating point code, but its int/FP SSE code is about as good as GCC's or better (though it's not vectorized, so far). GCC is older and knows a few extra small optimization tricks, but in most situations they don't create a large difference in performance; they are often quite specific.

So overall LLVM may sometimes produce slightly slower code, but in many situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply. If performance differences of that size are important to you, then you may want to use the Intel compiler instead of GCC, because it's sometimes better than GCC.

Bye,
bearophile
Dec 22 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from bearophile (bearophileHUGS lycos.com)'s article
 So overall LLVM may sometimes produce slightly slower code, but in many
situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply. If performance differences of that size are important to you, then you may want to use the Intel compiler instead of GCC, because it's sometimes better than GCC.
 Bye,
 bearophile
Does Intel even make compilers for any language outside the horribly crufty legacy language category (C, C++, Fortran)?
Dec 22 2009
parent bearophile <bearophileHUGS lycos.com> writes:
dsimcha:
 Does Intel even make compilers for any language outside the horribly crufty
 legacy language category (C, C++, Fortran)?
Mostly C++/Fortran. The problem is, those crufty legacy languages probably aren't going away in the next 20 years :-) Bye, bearophile
Dec 23 2009
prev sibling next sibling parent Travis Boucher <boucher.travis gmail.com> writes:
bearophile wrote:
 Travis Boucher:
 Although its design 
 promotes all sorts of optimization techniques, it's still pretty young 
 (compared to gcc) and just doesn't have all of the optimization stuff 
 gcc has.
I have already done hundreds of tests and benchmarks with LDC and llvm-gcc, and I'm starting to understand its optimizations. I am still mostly ignorant of LLVM, but I'm giving a bit of help tuning it; this improvement was motivated by me: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html Compared to GCC, LLVM lacks vectorization (this can be important for certain heavy numerical computing code) and profile-guided optimization (this is usually less important; it's uncommon for it to give more than a 5-25% performance improvement), but it has link-time optimization, which gcc lacks (about as important as profile-guided optimization or a little more). LLVM still produces bad X86 floating point code, but its int/FP SSE code is about as good as GCC's or better (though it's not vectorized, so far). GCC is older and knows a few extra small optimization tricks, but in most situations they don't create a large difference in performance; they are often quite specific. So overall LLVM may sometimes produce slightly slower code, but in many situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply. If performance differences of that size are important to you, then you may want to use the Intel compiler instead of GCC, because it's sometimes better than GCC. Bye, bearophile
I am not trying to get into the benchmark game; for every example of gcc generating better code than llvm, there could be an example of llvm generating better code than gcc. What I was trying to state are the overall differences between the two:

- ldc supports newer versions of the dmd front end than gdc.
- gdc tends to generate better code than ldc (in many cases).
- gdc supports more targets (the code generator, not the runtime).

I personally use an old gdc because it works for what I need. I'd like to switch to ldc, but there is limited support for my target platform.
Dec 22 2009
prev sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
bearophile, on December 23 at 00:13 you wrote:
 Compared to GCC LLVM lacks vectorization (this can be important for
 certain heavy numerical computing code), profile-guided optimization
 (this is usually less important, it's uncommon that it gives more than
 5-25% performance improvement)
I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.
 but it has link-time optimization, which gcc lacks (about as important
 as profile-guided optimization or a little more).
And GCC has LTO too, see: http://gcc.gnu.org/wiki/LinkTimeOptimization

I'm not arguing that GCC is way better than LLVM; I just wanted to add some missing information to this thread. I really think they are very close (sometimes one is better, sometimes the other), but LLVM is very young compared to GCC, so it's very promising that they are so close to GCC in so little time (and using less memory and CPU time).

-- 
Leandro Lucarella (AKA luca) http://llucax.com.ar/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
If you want to be alone, just be alone
If you want to watch the sea, just watch the sea
But do it now, timing is the answer, do it now
Timing is the answer to success
Dec 23 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Leandro Lucarella:

 bearophile, on December 23 at 00:13 you wrote:
 Compared to GCC LLVM lacks vectorization (this can be important for
 certain heavy numerical computing code), profile-guided optimization
 (this is usually less important, it's uncommon that it gives more than
 5-25% performance improvement)
I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.
Vectorization can improve the performance of certain code by 2X or 3X+ (typical example: matrix multiplication done right). Performance differences start to matter in practice when they are 2X or more; in most situations users aren't able to appreciate a 20% performance improvement in an application. (But small improvements are important for the compiler devs because they are cumulative, so many small improvements may eventually add up to a significant difference.) Regarding the accuracy of those numbers, it's not easy to tell how accurate they are, because they are quite sensitive to the details of the code.
 but it has link-time optimization, which gcc lacks (about as important
 as profile-guided optimization or a little more).
And GCC have LTO too, see: http://gcc.gnu.org/wiki/LinkTimeOptimization
Oh, nice, I have not tried this yet. Is this going into GCC 4.5? LLVM's LTO is pretty good; I don't know if GCC implements it equally well (I fear that the answer is negative).
 I'm not arguing that GCC is way better than LLVM, just wanted to add some
 lacking information to this thread.
Thank you.
 I really think they are very close,
 sometimes one is better, sometimes the other is better), but LLVM is very
 young compared to GCC so it's very promising that they are so close to GCC
 in so little time (and using less memory and CPU time).
LLVM devs are also very nice people; they help me when I have a problem, and they even implement large changes I ask for, often quickly. Helping them is fun. This means the compiler will probably keep improving for some time, because in open source projects the quality of the community is important. Bye, bearophile
Dec 23 2009
next sibling parent Leandro Lucarella <llucax gmail.com> writes:
bearophile, on December 23 at 12:02 you wrote:
 Leandro Lucarella:
 
 bearophile, on December 23 at 00:13 you wrote:
 Compared to GCC LLVM lacks vectorization (this can be important for
 certain heavy numerical computing code), profile-guided optimization
 (this is usually less important, it's uncommon that it gives more than
 5-25% performance improvement)
I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.
Vectorization can improve the performance of certain code by 2X or 3X+ (typical example: matrix multiplication done right). Performance differences start to matter in practice when they are 2X or more; in most situations users aren't able to appreciate a 20% performance improvement in an application. (But small improvements are important for the compiler devs because they are cumulative, so many small improvements may eventually add up to a significant difference.)
Well, you are talking about a single user, but for servers, if you have to provide a minimum quality of service, a 20% difference means you can serve 20% more people, for example (not that people would have to wait 0.2 secs more, because that is not an option).
 but it has link-time optimization, which gcc lacks (about as important
 as profile-guided optimization or a little more).
And GCC have LTO too, see: http://gcc.gnu.org/wiki/LinkTimeOptimization
Oh, nice, I have not tried this yet. Is this going into GCC 4.5? LLVM's LTO is pretty good; I don't know if GCC implements it equally well (I fear that the answer is negative).
I think it will be in GCC 4.5 but I don't know the details.

-- 
Leandro Lucarella (AKA luca) http://llucax.com.ar/
Dec 23 2009
prev sibling parent reply retard <re tard.com.invalid> writes:
Wed, 23 Dec 2009 12:02:53 -0500, bearophile wrote:

 Leandro Lucarella:
 
 bearophile, el 23 de diciembre a las 00:13 me escribiste:
 Compared to GCC LLVM lacks vectorization (this can be important for
 certain heavy numerical computing code), profile-guided optimization
 (this is usually less important, it's uncommon that it gives more
 than 5-25% performance improvement)
I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.
Vectorization can improve the performance of certain code by 2X or 3X+ (typical example: matrix multiplication done right). Performance differences start to matter in practice when they are 2X or more; in most situations users aren't able to appreciate a 20% performance improvement in an application. (But small improvements are important for the compiler devs because they are cumulative, so many small improvements may eventually add up to a significant difference.)
Aren't able to appreciate? Where are those numbers pulled from? Autovectorization mostly deals with expression optimizations in loops. You can easily calculate how much faster some code runs when it uses e.g. SSE2 instructions instead of plain old x86 instructions.
 LLVM devs are also very nice people, they help me when I have a problem,
 and they even implement large changes I ask them, often in a short
 enough time. Helping them is fun. This means that probably the compiler
 will keep improving for some more time, because in open source projects
 the quality of the community is important.
And GCC devs aren't nice people? They won't help you if you have a problem? Helping them isn't fun? GCC won't keep improving because it's open source? You make no sense. How much do the LLVM devs pay you for advertising them?
Dec 23 2009
parent reply Pelle Månsson <pelle.mansson gmail.com> writes:
On 12/23/2009 10:40 PM, retard wrote:
 Wed, 23 Dec 2009 12:02:53 -0500, bearophile wrote:

 Leandro Lucarella:

 bearophile, el 23 de diciembre a las 00:13 me escribiste:
 Compared to GCC LLVM lacks vectorization (this can be important for
 certain heavy numerical computing code), profile-guided optimization
 (this is usually less important, it's uncommon that it gives more
 than 5-25% performance improvement)
I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.
Vectorization can improve the performance of certain code by 2X or 3X+ (typical example: matrix multiplication done right). Performance differences start to matter in practice when they are 2X or more; in most situations users aren't able to appreciate a 20% performance improvement in an application. (But small improvements are important for the compiler devs because they are cumulative, so many small improvements may eventually add up to a significant difference.)
Aren't able to appreciate? Where are those numbers pulled from? Autovectorization mostly deals with expression optimizations in loops. You can easily calculate how much faster some code runs when it uses e.g. SSE2 instructions instead of plain old x86 instructions.
I think you missed the point; he did say vectorization is a big deal. The numbers on profile-guided optimization seem a bit odd, though.
 LLVM devs are also very nice people, they help me when I have a problem,
 and they even implement large changes I ask them, often in a short
 enough time. Helping them is fun. This means that probably the compiler
 will keep improving for some more time, because in open source projects
 the quality of the community is important.
And GCC devs aren't nice people? They won't help you if you have a problem? Helping them isn't fun? GCC won't keep improving because it's open source? You make no sense. How much do the LLVM devs pay you for advertising them?
LLVM is way younger than GCC. In my experiments, I get mostly better performance out of clang than out of gcc. Working with LLVM seems like more fun to me.
Dec 23 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Pelle Månsson:
 The numbers on profile-guided optimization seem a bit odd though.
You are right. It's not easy to give average numbers for any kind of C or C++ software. In benchmark-like code I've seen up to 20-25% improvements, but I assume that in much larger programs the situation is different. If you try to compute a true average, the percentage of improvement is probably lower, like 5% or less. It's a feature useful for hot spots in the code. Bye, bearophile
Dec 23 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 You are right. It's not easy to give average numbers for any kind of
 C or C++ software. In benchmark-like code I've seen up to 20-25%
 improvements, but I assume that in much larger programs the situation
 is different. Probably if you try to compute a true average, the
 average percentage of improvement is lower, like 5% or less. It's a
 feature useful for hot spots of the code.
Small benchmarks tend to have a high 'beta', or variance from the norm. The results in actual applications tend to be much closer together.
Dec 23 2009
parent reply retard <re tard.com.invalid> writes:
Wed, 23 Dec 2009 17:04:49 -0800, Walter Bright wrote:

 bearophile wrote:
 You are right. It's not easy to give average numbers for any kind of C
 or C++ software. In benchmark-like code I've seen up to 20-25%
 improvements, but I assume that in much larger programs the situation
 is different. Probably if you try to compute a true average, the
 average percentage of improvement is lower, like 5% or less. It's a
 feature useful for hot spots of the code.
Small benchmarks tend to have a high 'beta', or variance from the norm. The results in actual applications tend to be much closer together.
It's difficult to measure overall performance improvements in applications like image manipulation software or sound wave editors. E.g. if a complex effect now takes 2 seconds instead of 4 hours, but all GUI event processing is 100% slower, over a workday the application might only be 10% faster overall. The user spends much more time in the interactive part of the code. From what I've read, bearophile mostly uses synthetic tests.
Dec 23 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 It's difficult to measure overall performance improvements in 
 applications like image manipulation software or sound wave editors. E.g. 
 if a complex effect now takes 2 seconds instead of 4 hours, 
 but all GUI event processing is 100% slower, over a workday the 
 application might only be 10% faster overall. The user spends much more 
 time in the interactive part of the code. From what I've read, bearophile 
 mostly uses synthetic tests.
I find that benchmarks are useful in figuring out new ways to optimize code, but not very useful in predicting the performance of a compiler on any of my apps.
Dec 23 2009
parent reply alkor <alkor au.ru> writes:
oh ... i stirred up a holy war, sorry

each language has its weak & strong features
e.g. need high performance? - use asm and pay with development time

but i'm looking for a new-generation language (not c++ - it's too "dirty") w real
objects & templates and powerful multi-threading features
e.g. thread-local storage (TLS) and some concurrency features from c++0x

so, Walter, is it possible to expand the set of D's multi-threading features?

Walter Bright Wrote:

 retard wrote:
 It's difficult to measure overall performance improvements in 
 applications like image manipulation software or sound wave editors. E.g. 
 if a complex effect now takes 2 seconds instead of 4 hours, 
 but all GUI event processing is 100% slower, over a workday the 
 application might only be 10% faster overall. The user spends much more 
 time in the interactive part of the code. From what I've read, bearophile 
 mostly uses synthetic tests.
I find that benchmarks are useful in figuring out new ways to optimize code, but not very useful in predicting the performance of a compiler on any of my apps.
Dec 24 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
alkor wrote:
 but i'm looking for a new-generation language (not c++ - it's too "dirty") w real
objects & templates and powerful multi-threading features
 e.g. thread-local storage (TLS) and some concurrency features from c++0x
 
 so, Walter, is it possible to expand the set of D's multi-threading features?
D already has TLS. What exactly do you need?
Dec 24 2009
parent reply alkor <alkor au.ru> writes:
 D already has TLS. What exactly do you need?
hmm ... i don't think so. i've found the following info: http://www.digitalmars.com/d/2.0/cpp0x.html#local-classes http://www.digitalmars.com/d/2.0/migrate-to-shared.html but "shared data" is not TLS, or i misunderstand something. could you give a TLS example?
Dec 24 2009
next sibling parent Pelle Månsson <pelle.mansson gmail.com> writes:
On 12/24/2009 11:44 AM, alkor wrote:
 D already has TLS. What exactly do you need?
hmm ... i don't think so. i've found the following info: http://www.digitalmars.com/d/2.0/cpp0x.html#local-classes http://www.digitalmars.com/d/2.0/migrate-to-shared.html but "shared data" is not TLS, or i misunderstand something. could you give a TLS example?
int i;

void main() {
}

compile with -vtls. :)
Dec 24 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 24 Dec 2009 13:44:41 +0300, alkor <alkor au.ru> wrote:

 D already has TLS. What exactly do you need?
hmm ... i don't think so. i've found the following info: http://www.digitalmars.com/d/2.0/cpp0x.html#local-classes http://www.digitalmars.com/d/2.0/migrate-to-shared.html but "shared data" is not TLS, or i misunderstand something. could you give a TLS example?
"Shared data" is something which is *shared* between threads. That's the exact opposite of TLS (thread-*local* storage). In D2, everything is thread-local by default:

int foo;        // thread-local
shared int bar; // shared among threads
Dec 24 2009
prev sibling next sibling parent reply alkor <alkor au.ru> writes:
i've tested g++, gdc & dmd on an ordinary task - processing compressed data
using zlib
all compilers were built from source, target - gentoo x32 i686

the c++ & d code are simple & alike

but dmd makes a faster result than g++
and gdc loses to g++ 'cause gdc doesn't have any optimization options

gdc makes slower code than dmd and doesn't support d 2.0, so it's useless

so ... i'm waiting for dmd x64

== Repost the article of Travis Boucher (boucher.travis gmail.com)
== Posted at 2009/12/23 01:51 to digitalmars.D

parent Travis Boucher <boucher.travis gmail.com> writes:
alkor wrote:
 i've tested g++, gdc & dmd on an ordinary task - processing compressed data
using zlib
 all compilers were built from source, target - gentoo x32 i686
 
 the c++ & d code are simple & alike
 
 but dmd makes a faster result than g++
 and gdc loses to g++ 'cause gdc doesn't have any optimization options
 
 gdc makes slower code than dmd and doesn't support d 2.0, so it's useless
 
 so ... i'm waiting for dmd x64
 
If you can't get gdc to generate optimized code, then you are using it wrong.
Dec 23 2009
prev sibling parent reply alkor <alkor au.ru> writes:
maybe i'm doing something wrong, but for example:

$ cat main.d
int main () {
    return 0;
}

$dmd -O -release -ofmain-dmd main.d
$gdc -O3 main.d -o main-gdc
$ ls -l main-dmd main-gdc
-rwxr-xr-x 1 alkor alkor 123439 Dec 23 14:06 main-dmd
-rwxr-xr-x 1 alkor alkor 609363 Dec 23 14:06 main-gdc

why is main-gdc 5 times bigger than main-dmd?

every test shows dmd's superiority over gdc (and gcc)
dmd rules :)

== Repost the article of Travis Boucher (boucher.travis gmail.com)
== Posted at 2009/12/23 04:48 to digitalmars.D

If you can't get gdc to generate optimized code, then you are using it
wrong.
Dec 23 2009
parent reply "Jérôme M. Berger" <jeberger free.fr> writes:
alkor wrote:
 maybe i'm doing something wrong, but for example:

 $ cat main.d
 int main () {
     return 0;
 }

 $dmd -O -release -ofmain-dmd main.d
 $gdc -O3 main.d -o main-gdc
 $ ls -l main-dmd main-gdc
 -rwxr-xr-x 1 alkor alkor 123439 Dec 23 14:06 main-dmd
 -rwxr-xr-x 1 alkor alkor 609363 Dec 23 14:06 main-gdc

 why is main-gdc 5 times bigger than main-dmd?

Because the dmd-built executable is stripped. Try adding "-s" to the gdc command line, or use gdmd with the same options as dmd. Moreover, since you are trying to optimize for space rather than performance, you should use -Os (or at least -O2) rather than -O3.
 every test shows dmd's superiority over gdc (and gcc)
 dmd rules :)

dmd doesn't even work on my computer. End of story :) Jerome

-- 
mailto:jeberger free.fr http://jeberger.free.fr Jabber: jeberger jabber.fr
Dec 23 2009
parent reply alkor <alor au.ru> writes:
oh no - neither file is stripped

after strip the difference is 2.3 times
$ strip main-gdc main-dmd
$ ls -l main-dmd main-gdc
-rwxr-xr-x 1 alkor alkor  65088 Dec 23 16:44 main-dmd
-rwxr-xr-x 1 alkor alkor 155784 Dec 23 16:44 main-gdc

and main-gdc required libgcc_s.so.1

$ ldd main-gdc
        linux-gate.so.1 =>  (0xffffe000)
        libm.so.6 => /lib/libm.so.6 (0xb7ee7000)
        libgcc_s.so.1 => /usr/local/lib/libgcc_s.so.1 (0xb7edc000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xb7ec4000)
        libc.so.6 => /lib/libc.so.6 (0xb7d89000)
        /lib/ld-linux.so.2 (0xb7f2d000)

--- the next test - math performance ---
module test.performance;

import std.stdio;

const int MAX = 10000000;

int main () {
    int[] a = new int[MAX];
    int[] b = new int[MAX];
    double[] c = new double[MAX];

    for (auto i = 0; i < MAX; i++) {
        a[i] = i | 0xa1c0;
        b[i] = i | 0xbadbad;
    }

    for (auto i = 0; i < MAX; i++) {
        c[i] = (a[i] & 0x10) ? cast(double)a[i] * b[i] * b[i]
                             : cast(double)a[i] * a[i] * b[i];
    }
    writefln("init a[9555000] 0x%08X", a[9555000]);
    writefln("init b[9555000] 0x%08X", b[9555000]);
    writefln("init c[9555000] %f", c[9555000]);
    return 0;
}

$ dmd -O -release -oftest-dmd test-performance.d && strip test-dmd
$ time ./test-dmd
init a[9555000] 0x0091EDF8
init b[9555000] 0x00BBDFBD
init c[9555000] 1449827528761239666688.000000

real    0m0.722s
user    0m0.552s
sys     0m0.168s

$ gdc  -O3 test-performance.d -o test-gdc && strip test-gdc
$ time ./test-gdc
init a[9555000] 0x0091EDF8
init b[9555000] 0x00BBDFBD
init c[9555000] 1449827528761239666688.000000

real    0m0.786s
user    0m0.628s
sys     0m0.152s

so, dmd's code optimization rules :)
Walter made a nice language & a good compiler - it's true

Jérôme M. Berger Wrote:

 alkor wrote:
 maybe i'm doing something wrong, but for example:
 
 $ cat main.d
 int main () {
     return 0;
 }
 
 $dmd -O -release -ofmain-dmd main.d
 $gdc -O3 main.d -o main-gdc
 $ ls -l main-dmd main-gdc
 -rwxr-xr-x 1 alkor alkor 123439 Dec 23 14:06 main-dmd
 -rwxr-xr-x 1 alkor alkor 609363 Dec 23 14:06 main-gdc
 
 why is main-gdc 5 times bigger than main-dmd?
 
Because the dmd-built executable is stripped. Try to add "-s" to the gdc command line or use gdmd with the same options as dmd. Moreover, since you are trying to optimize for space rather than performance, you should use -Os (or at least -O2) rather than -O3.
 every test shows dmd's superiority over gdc (and gcc)
 dmd rules :)
 
dmd doesn't even work on my computer. End of story :) Jerome -- mailto:jeberger free.fr http://jeberger.free.fr Jabber: jeberger jabber.fr
Dec 23 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
alkor wrote:
 $ dmd -O -release -oftest-dmd test-performance.d && strip test-dmd
 $ gdc  -O3 test-performance.d -o test-gdc && strip test-gdc
 so, dmd's code optimization rules 
 Walter made nice lang & good compiler - it's true
 
Add -frelease to gdc (if you want a fair comparison), and look at the code generated rather than running a micro benchmark on something that takes a fraction of a second to run.
Dec 23 2009
parent reply alkor <alkor au.ru> writes:
thanks, with -frelease gdc makes a good result - faster than dmd's, and normal
size

Travis Boucher Wrote:

 alkor wrote:
 $ dmd -O -release -oftest-dmd test-performance.d && strip test-dmd
 $ gdc  -O3 test-performance.d -o test-gdc && strip test-gdc
 so, dmd's code optimization rules 
 Walter made nice lang & good compiler - it's true
 
Add -frelease to gdc (if you want a fair comparison), and look at the code generated rather than running a micro benchmark on something that takes a fraction of a second to run.
Dec 23 2009
parent Travis Boucher <boucher.travis gmail.com> writes:
alkor wrote:
 thanks, with -frelease gdc makes a good result - faster than dmd's, and normal
size
 
That's because -frelease removes certain array bounds checks, assertion testing, and I think a few other things.
Dec 23 2009