
digitalmars.D - Is there any language that native-compiles faster than D?

reply Per Nordlöw <per.nordlow gmail.com> writes:
After having evaluated the compilation speed of D compared to 
other languages at

     https://github.com/nordlow/compiler-benchmark

I wonder; is there any language that compiles to native code 
anywhere nearly as fast or faster than D, except C?

If so it most likely needs to use a backend other than LLVM.

I believe Jai is supposed to do that but it hasn't been released 
yet.
Aug 20 2020
next sibling parent reply kinke <noone nowhere.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?

 If so it most likely needs to use a backend other than LLVM.

 I believe Jai is supposed to do that but it hasn't been 
 released yet.
Pardon me, but that code seems anything but representative to me: no structs, no classes, no control flow, just a few integer additions and calls. It even makes D look worse than it is, simply because object.d is imported but totally unused.

E.g., on my Win64 box with DMD 2.093, compiling this:

-----
int add_int_n0_h0(int x) { return x + 15440; }
int add_int_n0(int x) { return x + add_int_n0_h0(x) + 95485; }
int add_int_n1_h0(int x) { return x + 37523; }
int add_int_n1(int x) { return x + add_int_n1_h0(x) + 92492; }
int add_int_n2_h0(int x) { return x + 39239; }
int add_int_n2(int x) { return x + add_int_n2_h0(x) + 12248; }

int main() {
    int int_sum = 0;
    int_sum += add_int_n0(0);
    int_sum += add_int_n1(1);
    int_sum += add_int_n2(2);
    return int_sum;
}
-----

with `dmd -o- bla.d` takes about 37 ms, while creating an empty object.d and compiling with `dmd -o- bla.d object.d` takes 24 ms. There are no default includes for C, so this would make the comparison fairer.

Additionally, generics and templates are completely different concepts and can't be compared.
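The benchmark inputs above follow a simple mechanical pattern, so they are easy to regenerate at any size. A minimal Python sketch of such a generator (function names and the random seeding here are illustrative, not taken from the actual compiler-benchmark scripts):

```python
# Sketch of a generator for the add_int_* benchmark pattern shown above.
# gen_d_source and its parameter names are hypothetical, not the real
# compiler-benchmark code.
import random


def gen_d_source(function_count: int) -> str:
    random.seed(0)  # deterministic constants, so runs are reproducible
    lines = []
    for n in range(function_count):
        a = random.randrange(100_000)
        b = random.randrange(100_000)
        lines.append(f"int add_int_n{n}_h0(int x) {{ return x + {a}; }}")
        lines.append(f"int add_int_n{n}(int x) {{ return x + add_int_n{n}_h0(x) + {b}; }}")
    lines.append("int main() {")
    lines.append("    int int_sum = 0;")
    for n in range(function_count):
        lines.append(f"    int_sum += add_int_n{n}({n});")
    lines.append("    return int_sum;")
    lines.append("}")
    return "\n".join(lines)


print(gen_d_source(3))
```

Writing the result to a file and timing `dmd -o-` on it reproduces the kind of measurement discussed in this thread.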
Aug 20 2020
next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 21:21:39 UTC, kinke wrote:
 with `dmd -o- bla.d` takes about 37ms, while creating an empty 
 object.d and compiling with `dmd -o- bla.d object.d` takes 
 24ms. There's no default includes for C, so this would make the 
 comparison more fair.
Thanks. But for really large files that won't make that big of a difference. I'll add the generation of the object.d file as well.
 Additionally, generics and templates are completely different 
 concepts and can't be compared.
I'm aware of that. But I wanted to start somewhere and expand from there.
Aug 20 2020
next sibling parent reply kinke <noone nowhere.com> writes:
On Thursday, 20 August 2020 at 21:34:54 UTC, Per Nordlöw wrote:
 But for really large files that won't make that big of 
 difference.
Of course not, but your benchmark is as tiny as it gets. ;)
 I'll add the generation of the object.d file aswell.
For linking, a D main requires the _d_cmain template imported by object.d, so you'll have to make it extern(C) in that case.
 I'm aware of that. But I wanted to start somewhere and expand 
 from there.
I don't think languages can be compared like that. One would probably need a reasonably sized, non-contrived project and port it to each language, exploiting each language's features, but then runtime and standard libraries would play a significant role as well (object.d kinda already does). Another strong suit of D, its module system and compiling multiple modules at once, isn't reflected either.
Aug 20 2020
next sibling parent kinke <noone nowhere.com> writes:
On Thursday, 20 August 2020 at 21:49:16 UTC, kinke wrote:
 Of course not, but your benchmark is as tiny as it gets. ;)
Ah sorry, I just now saw that the table was generated with `--function-count=200 --function-depth=450`.
Aug 20 2020
prev sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 21:49:16 UTC, kinke wrote:
 For linking, a D main requires the _d_cmain template imported 
 by object.d, so you'll have to make it extern(C) in that case.
Moreover, I just realized I should probably add a test case with `-betterC` as well. Or maybe make it the default until `-betterC` is no longer sufficient.
Aug 20 2020
prev sibling parent kinke <noone nowhere.com> writes:
I've just seen that you use ldmd2, not ldc2 directly. This makes 
quite a difference for such tiny code, in my case (with -c, not 
-o-) something like 60ms vs. 42ms.
Aug 20 2020
prev sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 21:21:39 UTC, kinke wrote:
 Pardon me, but that code seems everything but remotely 
 representative to me - no structs, no classes, no control flow, 
 just a few integer additions and calls.
I warmly welcome suggestions on how to improve the relevance of these tests. ;)
Aug 21 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 8/20/20 4:50 PM, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to other 
 languages at
 
      https://github.com/nordlow/compiler-benchmark
 
 I wonder; is there any language that compiles to native code anywhere 
 nearly as fast or faster than D, except C?
 
 If so it most likely needs to use a backend other than LLVM.
 
 I believe Jai is supposed to do that but it hasn't been released yet.
I tried Python a while ago, the build-run cycle for a simple program was about the same. For Perl it was faster.
Aug 20 2020
next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 22:20:19 UTC, Andrei Alexandrescu 
wrote:
 I tried Python a while ago, the build-run cycle for a simple 
 program was about the same. For Perl it was faster.
The same as what? D? That entirely depends on the complexity of the Python program. What kind of app/program were you testing?
Aug 20 2020
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 8/20/20 6:35 PM, Per Nordlöw wrote:
 On Thursday, 20 August 2020 at 22:20:19 UTC, Andrei Alexandrescu wrote:
 I tried Python a while ago, the build-run cycle for a simple program 
 was about the same. For Perl it was faster.
 The same as what? D? That entirely depends on the complexity of 
 the Python program. What kind of app/program were you testing?
It was a simple wc program with rdmd versus python.
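For reference, a word-count comparison like that is easy to reproduce. Below is a minimal Python wc sketch (an assumption about the shape of the test, not the actual program used), which `rdmd` would be racing against on the D side:

```python
# Minimal word-count (wc) sketch: the Python side of a build-run
# comparison where rdmd compiles and runs the D equivalent.
# Hypothetical illustration, not the actual test program.
import sys


def wc(text: str) -> tuple[int, int, int]:
    """Return (lines, words, bytes), like the Unix wc utility."""
    return (text.count("\n"), len(text.split()), len(text.encode()))


sample = "the quick brown fox\njumps over the lazy dog\n"
print(wc(sample))  # (2, 9, 44)
```

Since Python skips a separate compile-and-link step, the interesting number is how close `rdmd`'s compile-then-run total gets to this.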
Aug 20 2020
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 20 August 2020 at 22:20:19 UTC, Andrei Alexandrescu 
wrote:
 On 8/20/20 4:50 PM, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at
 
      https://github.com/nordlow/compiler-benchmark
 
 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
 
 If so it most likely needs to use a backend other than LLVM.
 
 I believe Jai is supposed to do that but it hasn't been 
 released yet.
I tried Python a while ago, the build-run cycle for a simple program was about the same. For Perl it was faster.
Neither perl nor python compile their code by default.
Aug 20 2020
parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 20 August 2020 at 23:16:40 UTC, Stefan Koch wrote:
 On Thursday, 20 August 2020 at 22:20:19 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 4:50 PM, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at
 
      https://github.com/nordlow/compiler-benchmark
 
 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
 
 If so it most likely needs to use a backend other than LLVM.
 
 I believe Jai is supposed to do that but it hasn't been 
 released yet.
I tried Python a while ago, the build-run cycle for a simple program was about the same. For Perl it was faster.
Neither perl nor python compile their code by default.
Yes, they do. Just not to x86.
Aug 26 2020
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 26 August 2020 at 09:19:47 UTC, Atila Neves wrote:
 On Thursday, 20 August 2020 at 23:16:40 UTC, Stefan Koch wrote:
 On Thursday, 20 August 2020 at 22:20:19 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 4:50 PM, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared 
 to other languages at
 
      https://github.com/nordlow/compiler-benchmark
 
 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
 
 If so it most likely needs to use a backend other than LLVM.
 
 I believe Jai is supposed to do that but it hasn't been 
 released yet.
I tried Python a while ago, the build-run cycle for a simple program was about the same. For Perl it was faster.
Neither perl nor python compile their code by default.
Yes, they do. Just not to x86.
If they compile to bytecode, that's still not native code, which is what was asked, if I understood the question correctly.
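Python in particular does compile, just to bytecode rather than native code, and the standard library exposes that step directly:

```python
# Python source is compiled to a bytecode code object before the
# interpreter executes it; `compile` exposes that step explicitly,
# and `dis` shows the resulting bytecode in readable form.
import dis
import types

code = compile("x = 1 + 2", "<example>", "exec")

# The result is a code object holding bytecode, not machine code.
assert isinstance(code, types.CodeType)
print(type(code.co_code).__name__)  # bytes: the raw bytecode buffer
dis.dis(code)                       # human-readable bytecode listing
```

So "compiles fast" is true for CPython, but the output is interpreted bytecode, not the native code the original question was about.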
Aug 26 2020
prev sibling next sibling parent reply Guillaume Piolat <first.name gmail.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
Object Pascal / Delphi has been very fast in the past, perhaps the existing Pascal compilers still are.
Aug 20 2020
parent reply oddp <oddp posteo.de> writes:
On 2020-08-21 00:31, Guillaume Piolat via Digitalmars-d wrote:
 On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to other languages
at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code anywhere nearly
as fast or faster 
 than D, except C?
Object Pascal / Delphi has been very fast in the past, perhaps the existing Pascal compilers still are.
Same goes for good old Ada. It has blazingly fast compilation times in conjunction with gcc-gnat, but I can't tell whether there has been any significant progress on the llvm-gnat front in recent months.
Aug 20 2020
parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 23:57:09 UTC, oddp wrote:
 Same goes for good old ada. It has blazing fast compilation 
 times in conjunction with gcc-gnat,
Nope. I just added support for it in compiler-benchmark. It is crazy slow:

./benchmark --languages=D,Ada --function-count=20 --function-depth=450 --run-count=1

gives

| Lang-uage | Oper-ation | Temp-lated | Time [s/fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Build | No | 0.150 | 1.0 [D] | v2.093.1-541-ge54c041a4 | `dmd` |
| Ada | Build | No | 9.031 | 60.2 [D] | 10.2.0 | `gnat-10` |
Aug 22 2020
parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 23 August 2020 at 00:06:02 UTC, Per Nordlöw wrote:
 On Thursday, 20 August 2020 at 23:57:09 UTC, oddp wrote:
 Same goes for good old ada. It has blazing fast compilation 
 times in conjunction with gcc-gnat,
 Nope. I just added support for it in compiler-benchmark. It is 
 crazy slow:

 ./benchmark --languages=D,Ada --function-count=20 --function-depth=450 --run-count=1

 gives

 | Lang-uage | Oper-ation | Temp-lated | Time [s/fn] | Slowdown vs [Best] | Version | Exec |
 | :---: | :---: | --- | :---: | :---: | :---: | :---: |
 | D | Build | No | 0.150 | 1.0 [D] | v2.093.1-541-ge54c041a4 | `dmd` |
 | Ada | Build | No | 9.031 | 60.2 [D] | 10.2.0 | `gnat-10` |
Correction, should be

| Lang-uage | Oper-ation | Temp-lated | Time [s/fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Build | No | 0.163 | 1.0 [D] | v2.093.1-541-ge54c041a4 | `dmd` |
| Ada | Build | No | 1.596 | 9.8 [D] | 10.2.0 | `gnat-10` |

Builds are 10x slower than D.
Aug 22 2020
parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 23 August 2020 at 00:18:44 UTC, Per Nordlöw wrote:
 Builds are 10x slower than D.
And the slowdown of Ada compared to D gets larger with function size.
Aug 22 2020
parent reply oddp <oddp posteo.de> writes:
On 2020-08-23 02:20, Per Nordlöw via Digitalmars-d wrote:
 On Sunday, 23 August 2020 at 00:18:44 UTC, Per Nordlöw wrote:
 Builds are 10x slower than D.
And the slowdown of Ada compared to D gets larger with function size.
It might get even slower if you apply:

-        f.write(Tm('''   GNAT.OS_Lib.OS_Exit(Integer(${T}));
+        f.write(Tm('''   GNAT.OS_Lib.OS_Exit(Integer(${T}_sum));

Currently, we're seeing:

main.adb:9049:32: invalid use of subtype mark in expression or call
gnatmake: "generated/ada/main.adb" compilation error
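For context, the patch above changes which name the generated Ada passes to `Integer(...)`. Assuming `Tm` is an alias for Python's `string.Template` (an assumption; the actual alias in compiler-benchmark may differ), the substitution behaves like this:

```python
# Sketch of the template substitution in the patch above, assuming
# Tm aliases Python's string.Template (hedged: the real alias in
# compiler-benchmark may differ).
from string import Template as Tm

# Fixed line: ${T} expands to the type name and "_sum" stays literal,
# so the generated Ada exits with the accumulated sum variable.
fixed = Tm('GNAT.OS_Lib.OS_Exit(Integer(${T}_sum));').substitute(T="int")
print(fixed)  # GNAT.OS_Lib.OS_Exit(Integer(int_sum));

# Buggy line: passed the bare type name to Integer(), which is the
# "invalid use of subtype mark" GNAT complains about.
buggy = Tm('GNAT.OS_Lib.OS_Exit(Integer(${T}));').substitute(T="int")
print(buggy)  # GNAT.OS_Lib.OS_Exit(Integer(int));
```

The braced `${T}` form matters here: it lets the substitution stop before the literal `_sum` suffix.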
Aug 23 2020
next sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 23 August 2020 at 07:36:19 UTC, oddp wrote:
 It might get even slower if you apply:

 -        f.write(Tm('''   GNAT.OS_Lib.OS_Exit(Integer(${T}));
 +        f.write(Tm('''   
 GNAT.OS_Lib.OS_Exit(Integer(${T}_sum));

 Currently, we're seeing:

 main.adb:9049:32: invalid use of subtype mark in expression or 
 call
 gnatmake: "generated/ada/main.adb" compilation error
Fixed. Thanks!
Aug 23 2020
prev sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 23 August 2020 at 07:36:19 UTC, oddp wrote:
 It might get even slower if you apply:

 -        f.write(Tm('''   GNAT.OS_Lib.OS_Exit(Integer(${T}));
 +        f.write(Tm('''   
 GNAT.OS_Lib.OS_Exit(Integer(${T}_sum));

 Currently, we're seeing:

 main.adb:9049:32: invalid use of subtype mark in expression or 
 call
 gnatmake: "generated/ada/main.adb" compilation error
Indeed, it does! :)

./benchmark --languages=D,Ada --function-count=20 --function-depth=450 --run-count=1

gives

| Lang-uage | Oper-ation | Temp-lated | Time [s/fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Build | No | 0.170 | 1.0 [D] | v2.093.1-541-ge54c041a4 | `dmd` |
| Ada | Build | No | 9.065 | 53.3 [D] | 10.2.0 | `gnat-10` |

!
Aug 23 2020
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
Yes. D1. :)
Aug 20 2020
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2020-08-20 22:50, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to other 
 languages at
 
      https://github.com/nordlow/compiler-benchmark
 
 I wonder; is there any language that compiles to native code anywhere 
 nearly as fast or faster than D, except C?
I'm surprised that you only have gccgo and not the reference/official (or whatever it's called) implementation. That one uses a fully custom tool chain, i.e. custom compiler, custom assembler, custom object format and custom linker. I'm sure it's faster than gccgo, and it might be faster than D as well.

There's some work on a new Rust backend [1] as well. Not sure if that's usable yet.

What about Nim and Vala, don't they count since they're generating C code?

[1] https://jason-williams.co.uk/a-possible-new-backend-for-rust (first hit on Google)

--
/Jacob Carlborg
Aug 20 2020
next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 06:20:18 UTC, Jacob Carlborg wrote:
 I'm surprised that you only have gccgo and not the 
 reference/official (or whatever it's called) implementation. 
 That one uses a fully custom tool chain, i.e. custom compiler, 
 custom assembler, custom object format and custom linker. I'm 
 sure it's faster than gccgo and it might be faster than D as 
 well.
Well it's only a matter of having the time to add it. What's the easiest way to install the reference/official toolchain on Ubuntu?
Aug 21 2020
parent reply Jacob Carlborg <doob me.com> writes:
On Friday, 21 August 2020 at 11:36:51 UTC, Per Nordlöw wrote:

 Well it's only a matter of having the time to add it.
Fair enough. I would have guessed that the reference compiler is the most commonly used, and it's famous for being fast, most likely faster than DMD. Therefore I would expect anyone comparing compilation speed to pick the reference compiler first, not gccgo.
 What's the easiest way to install the reference/official 
 toolchain on Ubuntu?
For the latest version, download the binaries from here: https://golang.org/dl/. Otherwise, install the "golang" package using the package manager.

--
/Jacob Carlborg
Aug 21 2020
next sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 14:37:35 UTC, Jacob Carlborg wrote:
 For the latest version, download the binaries from here: 
 https://golang.org/dl/. Otherwise install the "golang" package 
 using the package manager.

 --
 /Jacob Carlborg
According to https://github.com/golang/go/wiki/Ubuntu

    sudo add-apt-repository ppa:longsleep/golang-backports
    sudo apt update
    sudo apt install golang-go

will get you Go 1.15 on Ubuntu 20.04.
Aug 21 2020
prev sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 14:37:35 UTC, Jacob Carlborg wrote:
 For the latest version, download the binaries from here: 
 https://golang.org/dl/. Otherwise install the "golang" package 
 using the package manager.

 --
 /Jacob Carlborg
I updated compiler-benchmark with support for the reference Go compiler. I'll update the numbers now; it will take a while. In the meantime you can evaluate it yourself using, for instance,

    ./benchmark --languages=D,Go --function-count=200 --function-depth=450 --run-count=1

DMD is still far ahead of Go as well; about 2.5x faster on check and 10x faster on build for my contrived example.

I also added a script

    ./install-compilers.sh

you can use to install most of the compilers I use in my tests on Ubuntu 20.04.
Aug 21 2020
next sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 23:08:05 UTC, Per Nordlöw wrote:
 I also added a script
     ./install-compilers.sh
Renamed it to the more verbose ./install-compilers-on-ubuntu-20.04.sh
Aug 21 2020
prev sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 23:08:05 UTC, Per Nordlöw wrote:
 ./benchmark --languages=D,Go --function-count=200 
 --function-depth=450 --run-count=1

 DMD is still far ahead of Go aswell; about 2.5x faster on check 
 and 10x faster on build for my contrived example.
Here's the preliminary Markdown-formatted table with the reference Go compiler version 1.15 added:

| Lang-uage | Oper-ation | Temp-lated | Time [s/fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Check | No | 0.634 | 1.0 [D] | v2.093.1-538-ge9c22d712 | `dmd` |
| D | Check | No | 0.691 | 1.1 [D] | 1.23.0 | `ldmd2` |
| D | Check | Yes | 1.600 | 2.5 [D] | v2.093.1-538-ge9c22d712 | `dmd` |
| D | Check | Yes | 1.647 | 2.6 [D] | 1.23.0 | `ldmd2` |
| D | Build | No | 1.518 | 1.0 [D] | v2.093.1-538-ge9c22d712 | `dmd` |
| D | Build | No | 17.536 | 11.6 [D] | 1.23.0 | `ldmd2` |
| D | Build | Yes | 2.696 | 1.8 [D] | v2.093.1-538-ge9c22d712 | `dmd` |
| D | Build | Yes | 18.178 | 12.0 [D] | 1.23.0 | `ldmd2` |
| Go | Check | No | 1.554 | 2.5 [D] | 1.15 | `gotype` |
| Go | Check | No | 2.232 | 3.5 [D] | 9.3.0 | `gccgo-9` |
| Go | Check | No | 2.244 | 3.5 [D] | 10.2.0 | `gccgo-10` |
| Go | Build | No | 13.717 | 9.0 [D] | 1.15 | `go` |
| Go | Build | No | 53.743 | 35.4 [D] | 9.3.0 | `gccgo-9` |
| Go | Build | No | 57.711 | 38.0 [D] | 10.2.0 | `gccgo-10` |
Aug 21 2020
prev sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 06:20:18 UTC, Jacob Carlborg wrote:
 What about Nim and Vala, don't they count since they're 
 generating C code?
I haven't included backends that generate C because I don't think they are relevant to this metric as dmd is faster than both GCC and Clang at compiling C-style D code.
Aug 21 2020
prev sibling next sibling parent reply Mike James <foo bar.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?

 If so it most likely needs to use a backend other than LLVM.

 I believe Jai is supposed to do that but it hasn't been 
 released yet.
Turbo Pascal ;-)

It compiled so fast at a demonstration to customers that they thought it was broken...

-=mike=-
Aug 21 2020
next sibling parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 11:42:51 UTC, Mike James wrote:
 Turbo Pascal ;-)

 It compiled so fast at a demonstration to customers they 
 thought it was broke...
Ahh, yes, I remember! It was my first language (in high school). :)

Techniques outlined here: https://prog21.dadgum.com/47.html

See also: https://www.reddit.com/r/programming/comments/1fpu6u/how_could_turbo_pascal_be_so_fast/
Aug 21 2020
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 21 August 2020 at 11:42:51 UTC, Mike James wrote:
 It compiled so fast at a demonstration to customers that they 
 thought it was broken...
That was me using D1 again after a while; it is virtually instant even for medium-sized programs.

For my little webassembly demo I made a couple of weeks ago, when you do the "try it yourself" thing, it actually runs ldc on the input and sends the output to the browser right there. No caching or anything fancy... but since it compiles in 40 ms you probably barely notice.
Aug 21 2020
parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 12:14:35 UTC, Adam D. Ruppe wrote:
 For my little webassembly demo I made a couple weeks ago, when 
 you do the "try it yourself" thing, it actually runs ldc on the 
 input and sends the output to the browser right there. No 
 caching or anything fancy... but since it compiles in 40 ms you 
 probably barely notice.
How big is that program? Are you saying ldc's webassembly backend is much faster than its native (x86) backend?
Aug 21 2020
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 21 August 2020 at 12:51:29 UTC, Per Nordlöw wrote:
 How big is that program?
Small, < 1000 lines, it is a little tetris game.
 Are you saying ldc's webassembly backend is much faster than 
 its native (x86) backend?
No, the difference is probably because my custom druntime and stdlib are more minimal. D as a language remains fast, but the stdlib has gotten slower to compile as time goes on.
Aug 21 2020
prev sibling next sibling parent reply James Lu <jamtlu gmail.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?

 If so it most likely needs to use a backend other than LLVM.

 I believe Jai is supposed to do that but it hasn't been 
 released yet.
V8 JavaScript compiles faster:

$ d8 --always-opt --trace-opt --single-threaded --no-compilation-cache mandelbrot.js | ts -s "%.s"
0.000025 [compiling method 0x3fc50821068d <JSFunction (sfi = 0x3fc508210165)> using TurboFan]
0.001651 [optimizing 0x3fc50821068d <JSFunction (sfi = 0x3fc508210165)> - took 4.455, 41.945, 0.052 ms]
0.001707 [optimizing 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> because --always-opt]
0.001736 [compiling method 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> using TurboFan]
0.001763 [optimizing 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> - took 0.125, 0.284, 0.019 ms]
0.001789 [optimizing 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> because --always-opt]
0.001817 [compiling method 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> using TurboFan]
0.001842 [optimizing 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> - took 0.167, 1.197, 0.022 ms]
0.001868 [optimizing 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> because --always-opt]
0.001892 [compiling method 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> using TurboFan]
0.001916 [optimizing 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> - took 0.125, 0.421, 0.025 ms]
0.002093 [optimizing 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> because --always-opt]
0.002337 [compiling method 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> using TurboFan]
0.002365 [optimizing 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> - took 0.134, 0.550, 0.023 ms]
0.002389 [optimizing 0x3fc508085ba1 <JSFunction add (sfi = 0x3fc5082102d1)> because --always-opt]
0.002498 [compiling method 0x3fc508085ba1 <JSFunction add (sfi = 0x3fc5082102d1)> using TurboFan]

--single-threaded disables compilation background tasks
--always-opt makes V8 immediately compile the function without profiling

Timestamps thanks to "ts" from moreutils.
$ time dmd -c mandelbrot.d

real    0m0.507s
user    0m0.416s
sys     0m0.094s

V8 compiles 202x faster. -c omits linking, which can be slow. That's using the struct/double version. (I also ran it with timestamps on the verbose version; writing to disk accounted for a negligible amount of time.)

Obviously, take these results with a grain of salt, since --always-opt makes the compiled program very slow. But still, even with it off, V8 can compile very fast.

---

LDC2 takes 1.695 seconds to compile with no flags, and 2.126 seconds with -O2. QuickJS takes around 0.36 seconds to compile, but the resulting program is unbearably slow.

However, I suspect that by the time the fastest and most balanced D compiler has finished compiling and executing the program, V8 will already have finished executing it.
Aug 25 2020
next sibling parent reply James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 01:13:47 UTC, James Lu wrote:
 However, I suspect by the time the fastest and most balanced D 
 compiler finishes compiling the program and executing it, V8 
 will already have finished executing it.
DMD -O doesn't make a significant difference over DMD, clocking in at 12 seconds total.

LDC2 -O/-O2 has the best compilation-execution total, clocking in at 5.4 seconds. Subtracting the link time gives 5.0 seconds. Its code generation phase (measured as the time between -v's "code" step and the link command) takes 2.6 seconds.

In contrast, compilation+execution for V8 JavaScript is 2.0 seconds. V8's --trace-opt says it compiles each function once, and compilation (SSA creation, optimization, and code generation) takes a grand total of 0.027 seconds.

LDC's code generation is over 100 times slower. Surely there are opportunities for profiling and optimization here. (dmd's code generation is too bad to count.)
Aug 25 2020
next sibling parent James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 01:31:01 UTC, James Lu wrote:
 LDC's code generation is over 100 times slower. Surely there's
Sorry, approximately 100 times slower, not over 100 times slower.
Aug 25 2020
prev sibling parent James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 01:31:01 UTC, James Lu wrote:
 On Wednesday, 26 August 2020 at 01:13:47 UTC, James Lu wrote:
 However, I suspect by the time the fastest and most balanced D 
 compiler finishes compiling the program and executing it, V8 
 will already have finished executing it.
 DMD -O doesn't make a significant difference over DMD, clocking 
 in at 12 seconds total.

 LDC2 -O/-O2 has the best compilation-execution total, clocking 
 in at 5.4 seconds. Subtracting the link time gives 5.0 seconds. 
 Its code generation phase (measured as the time between -v's 
 "code" step and the link command) takes 2.6 seconds.

 In contrast, compilation+execution for V8 JavaScript is 2.0 
 seconds. V8's --trace-opt says it compiles each function once, 
 and compilation (SSA creation, optimization, and code 
 generation) takes a grand total of 0.027 seconds.

 LDC's code generation is over 100 times slower. Surely there 
 are opportunities for profiling and optimization here. (dmd's 
 code generation is too bad to count.)
I wonder if anyone in the D community has the expertise to modify or rewrite DMD's backend so that its generated code is at most 1.5-2x slower at normal, non-SIMD tasks, something like a poor man's LuaJIT or V8, while retaining the compilation speed.
Aug 25 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:
 
 $ d8 --always-opt --trace-opt --single-threaded --no-compilation-cache 
 mandelbrot.js | ts -s "%.s"
Interesting. What is the result of that compilation? A traditional binary file, or a webassembly?
Aug 26 2020
next sibling parent James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 13:16:08 UTC, Andrei Alexandrescu 
wrote:
 On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:
 
 $ d8 --always-opt --trace-opt --single-threaded 
 --no-compilation-cache mandelbrot.js | ts -s "%.s"
Interesting. What is the result of that compilation? A traditional binary file, or a webassembly?
It results in machine code (in my case, x86) in memory.
Aug 26 2020
prev sibling parent James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 13:16:08 UTC, Andrei Alexandrescu 
wrote:
 On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:
 
 $ d8 --always-opt --trace-opt --single-threaded 
 --no-compilation-cache mandelbrot.js | ts -s "%.s"
Interesting. What is the result of that compilation? A traditional binary file, or a webassembly?
In an earlier post in this thread, I added up V8's internal timings to suggest that its codegen phase (SSA building, SSA optimization, conversion to machine code) runs 100 times faster than LLVM on similar code in D.
Aug 26 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/25/20 9:13 PM, James Lu wrote:
 On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to other 
 languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code anywhere 
 nearly as fast or faster than D, except C?

 If so it most likely needs to use a backend other than LLVM.

 I believe Jai is supposed to do that but it hasn't been released yet.
 V8 JavaScript compiles faster:

 $ d8 --always-opt --trace-opt --single-threaded --no-compilation-cache mandelbrot.js | ts -s "%.s"
 0.000025 [compiling method 0x3fc50821068d <JSFunction (sfi = 0x3fc508210165)> using TurboFan]
 0.001651 [optimizing 0x3fc50821068d <JSFunction (sfi = 0x3fc508210165)> - took 4.455, 41.945, 0.052 ms]
 0.001707 [optimizing 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> because --always-opt]
 0.001736 [compiling method 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> using TurboFan]
 0.001763 [optimizing 0x3fc508085b65 <JSFunction Complex (sfi = 0x3fc508210261)> - took 0.125, 0.284, 0.019 ms]
 0.001789 [optimizing 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> because --always-opt]
 0.001817 [compiling method 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> using TurboFan]
 0.001842 [optimizing 0x3fc508211665 <JSFunction iterate_mandelbrot (sfi = 0x3fc508210229)> - took 0.167, 1.197, 0.022 ms]
 0.001868 [optimizing 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> because --always-opt]
 0.001892 [compiling method 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> using TurboFan]
 0.001916 [optimizing 0x3fc508085b85 <JSFunction abs (sfi = 0x3fc508210299)> - took 0.125, 0.421, 0.025 ms]
 0.002093 [optimizing 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> because --always-opt]
 0.002337 [compiling method 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> using TurboFan]
 0.002365 [optimizing 0x3fc508085bbd <JSFunction mul (sfi = 0x3fc508210309)> - took 0.134, 0.550, 0.023 ms]
 0.002389 [optimizing 0x3fc508085ba1 <JSFunction add (sfi = 0x3fc5082102d1)> because --always-opt]
 0.002498 [compiling method 0x3fc508085ba1 <JSFunction add (sfi = 0x3fc5082102d1)> using TurboFan]
I just want to point out that ts is not an accurate timestamping system. The shell is starting ts and d8 simultaneously, and it's very possible that d8 has done a lot of stuff by the time ts gets around to deciding what timestamp 0 is.

In fact, d8 could be completely finished, and all its output buffered in the pipe, before ts does anything. Especially when these times are so short.

Not saying the data is wrong, but I am not certain this is proof. Use the shell's time builtin.

-Steve
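A minimal Python sketch of the suggested approach: wrap the whole child process in the measured window, so startup time and all buffered output are counted, avoiding the `ts` pipe race entirely. (Illustrative only; the commented-out dmd invocation is just an example, not a measurement from this thread.)

```python
import subprocess
import time

def time_command(argv, runs=5):
    """Time a command by timing the subprocess call itself, so the
    child's startup and all of its output fall inside the window."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best  # best-of-N damps scheduler and cache noise

# Example (placeholder command):
# print(time_command(["dmd", "-o-", "bla.d"]))
```

Best-of-N is used instead of the mean because one-off scheduler hiccups only ever make a run slower, never faster.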
Aug 26 2020
parent reply James Lu <jamtlu gmail.com> writes:
On Wednesday, 26 August 2020 at 14:45:48 UTC, Steven 
Schveighoffer wrote:
 On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:
 
 $ d8 --always-opt --trace-opt --single-threaded
 I just want to point out that ts is not an accurate timestamping 
 system. The shell is starting ts and d8 simultaneously, and it's 
 very possible that d8 has done a lot of stuff by the time ts gets 
 around to deciding what timestamp 0 is.
 
 In fact, d8 could be completely finished, and all its output 
 buffered in the pipe, before ts does anything. Especially when 
 these times are so short.
 
 Not saying the data is wrong, but I am not certain this is proof. 
 Use shell time builtin.
 
 -Steve
$ time d8 --always-opt --trace-opt --single-threaded --no-compilation-cache mandelbrot.js
[compiling method 0x182b08210471 <JSFunction (sfi = 0x182b08210129)> using TurboFan]
[optimizing 0x182b08210471 <JSFunction (sfi = 0x182b08210129)> - took 0.888, 1.326, 0.026 ms]
[optimizing 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> because --always-opt]
[compiling method 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> using TurboFan]
[optimizing 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> - took 0.401, 3.229, 0.044 ms]
[optimizing 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> because --always-opt]
[compiling method 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> using TurboFan]
[optimizing 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> - took 0.138, 0.296, 0.028 ms]
[optimizing 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> because --always-opt]
[compiling method 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> using TurboFan]
[optimizing 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> - took 0.228, 1.619, 0.035 ms]
[optimizing 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> because --always-opt]
[compiling method 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> using TurboFan]
[optimizing 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> - took 0.213, 0.502, 0.033 ms]
[optimizing 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> because --always-opt]
[compiling method 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> using TurboFan]
[optimizing 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> - took 0.183, 0.643, 0.030 ms]
[optimizing 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> because --always-opt]
[compiling method 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> using TurboFan]
[optimizing 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> - took 0.150, 0.449, 0.041 ms]

real	0m0.052s
user	0m0.022s
sys	0m0.021s

--always-opt makes V8 compile the function the first time it is called, so we can ignore interpreter overhead. I changed the code to quit after one function call, so we would measure how long it takes to compile and run the compilation.

For the sake of transparency, I modified some of the code to move it into a main function to ensure it would compile the code. Surprisingly, doing this reduced compilation time.

Here is the exact code I used: https://gist.github.com/CrazyPython/3552e1405dbb4b640810f6443cd0a015
Aug 26 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/26/20 12:00 PM, James Lu wrote:
 On Wednesday, 26 August 2020 at 14:45:48 UTC, Steven Schveighoffer wrote:
 On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:

 $ d8 --always-opt --trace-opt --single-threaded
 I just want to point out that ts is not an accurate timestamping 
 system. The shell is starting ts and d8 simultaneously, and it's 
 very possible that d8 has done a lot of stuff by the time ts gets 
 around to deciding what timestamp 0 is.
 
 In fact, d8 could be completely finished, and all its output 
 buffered in the pipe, before ts does anything. Especially when 
 these times are so short.
 
 Not saying the data is wrong, but I am not certain this is proof. 
 Use shell time builtin.
 $ time d8 --always-opt --trace-opt --single-threaded --no-compilation-cache mandelbrot.js
 [compiling method 0x182b08210471 <JSFunction (sfi = 0x182b08210129)> using TurboFan]
 [optimizing 0x182b08210471 <JSFunction (sfi = 0x182b08210129)> - took 0.888, 1.326, 0.026 ms]
 [optimizing 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> because --always-opt]
 [compiling method 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> using TurboFan]
 [optimizing 0x182b0821073d <JSFunction main (sfi = 0x182b082101e5)> - took 0.401, 3.229, 0.044 ms]
 [optimizing 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> because --always-opt]
 [compiling method 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> using TurboFan]
 [optimizing 0x182b08085bdd <JSFunction Complex (sfi = 0x182b0821021d)> - took 0.138, 0.296, 0.028 ms]
 [optimizing 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> because --always-opt]
 [compiling method 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> using TurboFan]
 [optimizing 0x182b08210709 <JSFunction iterate_mandelbrot (sfi = 0x182b082101ad)> - took 0.228, 1.619, 0.035 ms]
 [optimizing 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> because --always-opt]
 [compiling method 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> using TurboFan]
 [optimizing 0x182b08085bfd <JSFunction abs (sfi = 0x182b08210255)> - took 0.213, 0.502, 0.033 ms]
 [optimizing 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> because --always-opt]
 [compiling method 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> using TurboFan]
 [optimizing 0x182b08085c35 <JSFunction mul (sfi = 0x182b082102c5)> - took 0.183, 0.643, 0.030 ms]
 [optimizing 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> because --always-opt]
 [compiling method 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> using TurboFan]
 [optimizing 0x182b08085c19 <JSFunction add (sfi = 0x182b0821028d)> - took 0.150, 0.449, 0.041 ms]
 
 real	0m0.052s
 user	0m0.022s
 sys	0m0.021s
 
 --always-opt makes V8 compile the function the first time it is 
 called, so we can ignore interpreter overhead. I changed the code 
 to quit after one function call, so we would measure how long it 
 takes to compile and run the compilation.
 
 For the sake of transparency, I modified some of the code to move 
 it into a main function to ensure it would compile the code. 
 Surprisingly, doing this reduced compilation time.
 
 Here is the exact code I used: 
 https://gist.github.com/CrazyPython/3552e1405dbb4b640810f6443cd0a015
Thanks, that's really impressive.

D has some significant overhead which might explain some of the discrepancy. Compiling an empty main function with dmd takes 0.128 seconds on my system, but of course comparing my system to yours isn't going to be useful.

Compiling an empty main with -betterC takes 0.065 seconds, which means about half the overhead is spent compiling D runtime setup things?

It is difficult to assign what is overhead and what is compilation, especially for a JIT compiler, so the claim of "100x" may not be accurate, especially with these low timings. Still, it definitely seems faster than D.

-Steve
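The overhead-vs-compilation question can be made concrete by measuring total compile time at two different module sizes and solving for the fixed per-invocation cost. A hedged Python sketch of that idea, with purely illustrative numbers (they are not measurements from this thread):

```python
def split_overhead(t_small, n_small, t_large, n_large):
    """Fit t = overhead + per_fn * n through two (time, #functions)
    measurements. Returns (fixed overhead, marginal cost per function)."""
    per_fn = (t_large - t_small) / (n_large - n_small)
    overhead = t_small - per_fn * n_small
    return overhead, per_fn

# Illustrative numbers only: 0.065 s for an empty module,
# 0.565 s for a generated module with 1000 small functions.
overhead, per_fn = split_overhead(0.065, 0, 0.565, 1000)
print(overhead, per_fn)  # roughly 0.065 s fixed, 0.0005 s per function
```

With more than two sample sizes a least-squares fit over the same model gives the intercept (process startup, runtime imports) separately from the slope (actual per-function compilation work).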
Aug 26 2020
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 26 August 2020 at 16:24:03 UTC, Steven 
Schveighoffer wrote:
 On 8/26/20 12:00 PM, James Lu wrote:
 On Wednesday, 26 August 2020 at 14:45:48 UTC, Steven 
 Schveighoffer wrote:
 On 8/25/20 9:13 PM, James Lu wrote:
 V8 JavaScript compiles faster:
 Compiling an empty main function with dmd takes 0.128 seconds 
 on my system, but of course comparing my system to yours isn't 
 going to be useful.

 Compiling with -betterC an empty main takes 0.065 seconds, 
 which means about half the overhead is spent compiling D 
 runtime setup things?
Weird. On my machine both take the same amount of time: ~16ms.
Aug 27 2020
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Aug 26, 2020 at 01:31:01AM +0000, James Lu via Digitalmars-d wrote:
[...]
 DMD -O doesn't make a significant difference over DMD, clocking in at
 12 seconds total.
[...]

DMD's optimizer is a joke compared to modern optimizing backends like LDC/LLVM or GCC. These days I don't even look at DMD for anything remotely performance-related. I consistently get 15-20% faster executables from LDC than from DMD (even without any optimization flags!), and for compute-heavy programs with -O2/-O3, the difference can be up to 40-50%.

Now that LDC releases are closely tracking DMD releases, I honestly have lost interest in DMD codegen quality, and only use DMD for rapid prototyping during development. For everything else, LDC is my go-to compiler.

(And don't even get me started on backend codegen bugs triggered by -O and/or -inline. After getting bitten a few times by a couple of those, I stay away from dmd -O / dmd -inline like the plague. If I want optimization, I use LDC instead.)

On Wed, Aug 26, 2020 at 01:38:27AM +0000, James Lu via Digitalmars-d wrote:
[...]
 I wonder if anyone in the D community has the expertise to change
 modify or rewrite DMD's backend to be up to be at most 1.5-2x slower
 at normal, non-SIMD tasks, up to a poor version of LuaJIT or V8 while
 retaining the speed.
Supposedly Walter is one of the only people who understands the backend well enough to be able to make significant improvements to it. However, Walter is busy with other D-related stuff (important language-level stuff), and we really don't want his time to be spent optimizing a backend that, to be frank, almost nobody is interested in these days.

(I'm willing to be pleasantly surprised, though. If Walter can singlehandedly clean up DMD's optimizer and hone it at least to the same ballpark as LDC/GDC, then I'll be all ears. But I'm not holding my breath.)

T

-- 
People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
Aug 25 2020
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 I wonder if anyone in the D community has the expertise to change
 modify or rewrite DMD's backend to be up to be at most 1.5-2x slower
 at normal, non-SIMD tasks, up to a poor version of LuaJIT or V8 while
 retaining the speed.
Supposedly Walter is one of the only people who understands the backend well enough to be able to make significant improvements to it.
that's why there is no reason to "improve" the current DMD backend at all. it is much easier to throw it away and write a brand new one, SSA-based. i bet that a bog-standard SSA with a linear register allocator will generate code at least as good as DMD -O, but it will be faster, and more maintainable.

it is also easy to retarget, because most analysis (and even spilling, partially) is done on the SSA level, and you only have to port the instruction selector. so no problem maintaining backends for x86, x86_64 and arm (even in the same executable). also, the same backend can be used to jit ctfe code later.

now we only need somebody to do it.
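For readers unfamiliar with the term: SSA (static single assignment) form just means every variable is assigned exactly once, with later assignments renamed to fresh versions. A toy Python sketch of renaming straight-line three-address code into SSA (no control flow, hence no phi nodes; illustrative only, nothing like a real backend):

```python
def to_ssa(instructions):
    """Rename destinations so each variable is assigned exactly once.
    Instructions are (dest, op, src1, src2) tuples; sources are
    variable names (which must be defined earlier) or int constants."""
    version = {}  # variable name -> latest version number

    def use(v):
        if isinstance(v, int):
            return v  # constants pass through unchanged
        return f"{v}.{version[v]}"  # reference the latest version

    out = []
    for dest, op, a, b in instructions:
        a, b = use(a), use(b)  # rename uses before the new definition
        version[dest] = version.get(dest, -1) + 1
        out.append((f"{dest}.{version[dest]}", op, a, b))
    return out

code = [("x", "add", 1, 2),
        ("x", "add", "x", 3),    # reassignment becomes x.1
        ("y", "mul", "x", "x")]
print(to_ssa(code))
```

With control flow, joins between branches additionally require phi nodes, which is where the "a little tricky" part mentioned later in the thread comes in.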
Aug 25 2020
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Wednesday, 26 August 2020 at 04:37:06 UTC, ketmar wrote:
 H. S. Teoh wrote:

 I wonder if anyone in the D community has the expertise to 
 change
 modify or rewrite DMD's backend to be up to be at most 1.5-2x 
 slower
 at normal, non-SIMD tasks, up to a poor version of LuaJIT or 
 V8 while
 retaining the speed.
Supposedly Walter is one of the only people who understands the backend well enough to be able to make significant improvements to it.
that's why there is no reason to "improve" current DMD backend at all.
Perhaps we should not be that quick to downplay DMD just because it does not optimize as heavily as GDC and LDC at their max settings.

I may be too theoretical, but I think using only relatively basic optimizations for a release build might be preferable to always using the most aggressive settings. Why? Because a program usually spends almost all its time in a tiny fraction of itself. To get a performant program one has to profile where that is and do some hand-optimization anyway, regardless of compiler optimizations, enough to avoid hand-written assembly and things like `foreach(vector; cast(long[])intArray){...}` in the critical parts. But max-optimizing the whole program, for me, just seems to bloat binary size and compile times for relatively little benefit.

Also, one supposedly wants to benchmark the critical parts. With conservative optimization, the benchmarks are faster to compile and supposedly more reliable. There is less surface for compiler-caused performance regressions, and your code is more likely to stay fast if you decide you need to optimize for size instead.
Aug 26 2020
parent ketmar <ketmar ketmar.no-ip.org> writes:
Dukc wrote:

 On Wednesday, 26 August 2020 at 04:37:06 UTC, ketmar wrote:
 H. S. Teoh wrote:

 I wonder if anyone in the D community has the expertise to change
 modify or rewrite DMD's backend to be up to be at most 1.5-2x slower
 at normal, non-SIMD tasks, up to a poor version of LuaJIT or V8 while
 retaining the speed.
Supposedly Walter is one of the only people who understands the backend well enough to be able to make significant improvements to it.
that's why there is no reason to "improve" current DMD backend at all.
Perhaps we should not be that quick to downplay DMD just because it does not optimize as heavily as GDC and LDC at max settings.
it's not the reason, at least for me. the real reason is that the DMD backend is virtually impenetrable. it is a giant black box with the label "DO NOT ENTER IF YOUR NAME IS NOT WALTER" on its side.

an SSA backend is much easier to maintain, much easier to retarget, and optimisations over SSA can be nicely layered, from "nothing" to "a set of aggressive multipass optimisers". the best thing is that those optimisers are mostly independent of each other; they only need to maintain the SSA invariant. so you can write a lot of them, each doing one simple optimisation at a time, and run them as long as you want.
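The "layered, mostly independent passes" idea can be sketched on a toy IR of (dest, op, a, b) tuples, where each pass is a plain function from instruction list to instruction list. A hedged Python illustration (the IR, pass names and the add-only folding are invented for this sketch, not taken from any real backend):

```python
def const_fold(code):
    """One job only: evaluate ops whose operands are all constants."""
    env, out = {}, []  # env maps folded names to their constant values
    for dest, op, a, b in code:
        a, b = env.get(a, a), env.get(b, b)  # substitute known constants
        if op == "add" and isinstance(a, int) and isinstance(b, int):
            env[dest] = a + b  # folded away; no instruction emitted
        else:
            out.append((dest, op, a, b))
    return out

def dead_code(code, live_roots):
    """One job only: drop instructions whose result is never used."""
    live = set(live_roots)
    kept = []
    for dest, op, a, b in reversed(code):  # backward liveness sweep
        if dest in live:
            kept.append((dest, op, a, b))
            live |= {v for v in (a, b) if isinstance(v, str)}
    return kept[::-1]

code = [("t0", "add", 1, 2),
        ("t1", "add", "t0", "x"),   # x is a live external input
        ("t2", "add", "t0", 5)]     # result never used
code = const_fold(code)             # t0 and t2 fold to constants
code = dead_code(code, {"t1"})      # only t1 is demanded
print(code)
```

Because each pass only consumes and produces the same IR shape, they compose in any order, which is the maintainability argument being made here.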
Aug 26 2020
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 26 August 2020 at 04:37:06 UTC, ketmar wrote:
 also, the same backend can be used to jit ctfe code later.

 now we only need somebody to do it.
CTFE needs a different code path from the regular backend. You need to be able to hook many things which usually you wouldn't need to hook.
Aug 26 2020
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
Stefan Koch wrote:

 On Wednesday, 26 August 2020 at 04:37:06 UTC, ketmar wrote:
 also, the same backend can be used to jit ctfe code later.

 now we only need somebody to do it.
CTFE needs a different code path from the regular backend. You need to be able to hook many things which usually you wouldn't need to hook.
you're right... with the current backend. but with a universal SSA backend, once you have lowered the code to SSA, it doesn't matter anymore. for native code, the lowering engine can emit direct memory-manipulation SSA opcodes, and for CTFE it can emit function calls. the backend doesn't care; it will still produce machine code you can either write to disk or run directly. or don't even bother producing machine code at all, but run some SSA optimisers and execute the SSA code directly.
Aug 26 2020
parent reply drug <drug2004 bk.ru> writes:
On 8/26/20 4:02 PM, ketmar wrote:
 you're right... with the current backend. but with universal SSA 
 backend, once you lowered the code to SSA, it doesn't matter anymore. 
 for native code, lowering engine can emit direct memory manipulation SSA 
 opcodes, and for CTFE it can emit function calls. the backend doesn't 
 care, it will still produce machine code you can either write to disk, 
 or run directly. or don't even bother producing machine code at all, but 
 run some SSA optimisers and execute SSA code directly.
What are the disadvantages of an SSA-based backend?
Aug 26 2020
next sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
drug wrote:

 What are disadvantages of SSA based backend?
somebody has to write it.
Aug 26 2020
prev sibling next sibling parent Paolo Invernizzi <paolo.invernizzi gmail.com> writes:
On Wednesday, 26 August 2020 at 13:07:03 UTC, drug wrote:
 On 8/26/20 4:02 PM, ketmar wrote:
 you're right... with the current backend. but with universal 
 SSA backend, once you lowered the code to SSA, it doesn't 
 matter anymore. for native code, lowering engine can emit 
 direct memory manipulation SSA opcodes, and for CTFE it can 
 emit function calls. the backend doesn't care, it will still 
 produce machine code you can either write to disk, or run 
 directly. or don't even bother producing machine code at all, 
 but run some SSA optimisers and execute SSA code directly.
What are disadvantages of SSA based backend?
If I'm not wrong, from what I remember LLVM IR is SSA, so I guess there's a lot of literature around the pros and cons of the SSA approach...
Aug 26 2020
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 26 August 2020 at 13:07:03 UTC, drug wrote:
 On 8/26/20 4:02 PM, ketmar wrote:
 you're right... with the current backend. but with universal 
 SSA backend, once you lowered the code to SSA, it doesn't 
 matter anymore. for native code, lowering engine can emit 
 direct memory manipulation SSA opcodes, and for CTFE it can 
 emit function calls. the backend doesn't care, it will still 
 produce machine code you can either write to disk, or run 
 directly. or don't even bother producing machine code at all, 
 but run some SSA optimisers and execute SSA code directly.
What are disadvantages of SSA based backend?
Well-formed SSA is a little tricky to generate, and it does not map well onto hardware. Without a few dedicated rewrite and optimization passes, it produces code which is dog slow.
Aug 26 2020
parent ketmar <ketmar ketmar.no-ip.org> writes:
Stefan Koch wrote:

 Well formed SSA is a little tricky to generate.
 And does not map well on hardware.
that is not what SSA is used for. ;-)

also, well-formed SSA is dead easy to generate: just don't try to be smart, and don't write "locals reuse logic" at all. other passes will take care of eliminating redundant loads and locals. and then a simple linear scan register allocator will give you very surprising results. ;-)

The Great Secret of SSA is "don't be smart". each SSA pass should do only one thing, it should do it in the easiest possible way, and it should only care about repairing the SSA damage it has done. and then you can easily add more passes, and choose between generated code quality and time spent on optimising.
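A linear scan register allocator really is only a few lines: sort live intervals by start, expire finished ones, hand out free registers, and spill under pressure. A toy Python sketch under stated simplifications (it naively spills the newcomer, whereas the classic algorithm spills the active interval that ends last; everything here is illustrative, not production code):

```python
def linear_scan(intervals, num_regs):
    """intervals: {name: (start, end)} live ranges.
    Returns {name: register index or "spill"}."""
    order = sorted(intervals, key=lambda v: intervals[v][0])
    free = list(range(num_regs))
    active = []   # (end, name) pairs, kept sorted by end
    alloc = {}
    for name in order:
        start, end = intervals[name]
        while active and active[0][0] < start:   # expire finished ranges
            _, old = active.pop(0)
            free.append(alloc[old])              # their register is free again
        if free:
            alloc[name] = free.pop()
            active.append((end, name))
            active.sort()
        else:
            alloc[name] = "spill"  # simplification: spill the newcomer
    return alloc

# "b" dies before "c" starts, so "c" can reuse b's register:
print(linear_scan({"a": (0, 4), "b": (1, 2), "c": (3, 6)}, 2))
```

The whole thing is a single pass over the intervals, which is why it pairs well with a compiler that is optimizing for compile speed rather than peak code quality.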
Aug 26 2020
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
You could add Crystal [1] as well for completeness. It uses LLVM as its backend, so it might not be fast.

BTW, I see that you download DMD from dlang.org. The binaries at dlang.org are compiled with DMD itself. You should compile DMD yourself using LDC, with the appropriate flags. It will give you a boost, perhaps 30%.

[1] https://crystal-lang.org

-- 
/Jacob Carlborg
Aug 26 2020
parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 26 August 2020 at 11:49:06 UTC, Jacob Carlborg 
wrote:
 [snip]

 BTW, I see that you download DMD from dlang.org. The binaries 
 at dlang.org are compiled with DMD itself. You should compile 
 DMD yourself using LDC, with the appropriate flags. It will 
 give you a boost, perhaps 30%.

 [1] https://crystal-lang.org

 --
 /Jacob Carlborg
After [1] I was under the impression that the Linux versions were already compiled with LDC. Is it only the Windows release that is compiled with LDC?

[1] https://dlang.org/changelog/2.091.0.html#windows
Aug 26 2020
parent reply Jacob Carlborg <doob me.com> writes:
On Wednesday, 26 August 2020 at 12:40:28 UTC, jmh530 wrote:

 Is it only the Windows release that is compiled with LDC?
Yes, unfortunately. -- /Jacob Carlborg
Aug 27 2020
parent Atila Neves <atila.neves gmail.com> writes:
On Thursday, 27 August 2020 at 09:54:36 UTC, Jacob Carlborg wrote:
 On Wednesday, 26 August 2020 at 12:40:28 UTC, jmh530 wrote:

 Is it only the Windows release that is compiled with LDC?
Yes, unfortunately. -- /Jacob Carlborg
Didn't that change recently? In any case, on Arch Linux dmd is compiled with ldc.
Aug 27 2020
prev sibling next sibling parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
Hi, you may want to try benchmarking my compiler for the Vox language:
https://github.com/MrSmith33/tiny_jit

I designed it with high compilation speed in mind. I added a statically compiled Linux build to the CI tag. Or compile manually with:

source> ~/dlang/ldc-1.22.0/bin/ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off -enable-inlining -flto=full -i main.d -of=./tjc

So far it can only produce executables for win64, but that should be enough for your purpose.
Aug 26 2020
next sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 16:08:15 UTC, MrSmith wrote:
 Hi, you may want to try benchmarking my compiler of Vox 
 language:
 https://github.com/MrSmith33/tiny_jit
 I designed it with high compilation speed in mind.
Looks really interesting! Thanks. I'll try it out soon.
Aug 26 2020
prev sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 16:08:15 UTC, MrSmith wrote:
 https://github.com/MrSmith33/tiny_jit
What kinds of memory management are supported/planned and how does this interact with slices? I can't find any code examples in Vox. Are they represented as strings inside the .d-files?
Aug 26 2020
next sibling parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 16:34:39 UTC, Per Nordlöw wrote:
 On Wednesday, 26 August 2020 at 16:08:15 UTC, MrSmith wrote:
 https://github.com/MrSmith33/tiny_jit
 What kinds of memory management are supported/planned and how 
 does this interact with slices?
 
 I can't find any code examples in Vox. Are they represented as 
 strings inside the .d-files?
Currently only manual MM. Check source/tests for small examples and https://github.com/MrSmith33/rltut_2019 for a bigger project. Also see spec/index.md for some docs.
Aug 26 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 17:01:44 UTC, MrSmith wrote:
 Currently only manual MM. Check source/tests for small examples 
 and https://github.com/MrSmith33/rltut_2019 for bigger project. 
 Also see spec/index.md for some docs.
Can you elaborate on what you mean by manual MM? Refcounted? I'm very curious.
Aug 26 2020
parent MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 20:13:55 UTC, Per Nordlöw wrote:
 Can you elaborate on what you mean manual MM? Refcounted? I'm 
 very curious.
Currently you can either do static/stack allocation, or use host-provided functions or the OS API to dynamically allocate memory (alloc/free, mmap/VirtualAlloc/HeapAlloc etc). No GC or RC is planned. Probably I will add more first-class support for allocators, like in Jai/Zig/Odin.
Aug 26 2020
prev sibling parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 16:34:39 UTC, Per Nordlöw wrote:
 On Wednesday, 26 August 2020 at 16:08:15 UTC, MrSmith wrote:
 https://github.com/MrSmith33/tiny_jit
 What kinds of memory management are supported/planned and how 
 does this interact with slices?
 
 I can't find any code examples in Vox. Are they represented as 
 strings inside the .d-files?
I ran the benchmark myself and here is what the Vox code should look like, based on the D code:

i64 add_long_n0_h0[T](i64 x) {
    return x + 87734;
}
i64 add_long_n0[T](i64 x) {
    return x + add_long_n0_h0[i64](x) + 40209;
}
i64 main() {
    i64 long_sum = 0;
    long_sum += add_long_n0[i64](0);
    return long_sum;
}
Aug 26 2020
next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 17:27:48 UTC, MrSmith wrote:
 I run benchmark myself and here is what Vox code should look 
 like based on D code:
Ok, thanks.

How should I best build vox for maximum performance? These alternatives correctly produce a binary:

dmd -i main.d
ldmd2 -i main.d
ldmd2 -O -release -i main.d

but running the binary produced by

ldmd2 -O -release -i main.d

prints

Running 218 tests

but never completes...
Aug 26 2020
parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 19:26:17 UTC, Per Nordlöw wrote:
 How should I best build vox for maximum performance?
It needs the cli version passed:

ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off -enable-inlining -flto=full -i main.d -of=./tjc
Aug 26 2020
next sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 20:10:09 UTC, MrSmith wrote:
 It needs cli version passed:
 ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off 
 -enable-inlining -flto=full -i main.d -of=./tjc
Ahh, nice.
Aug 26 2020
prev sibling next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 20:10:09 UTC, MrSmith wrote:
 It needs cli version passed:
 ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off 
 -enable-inlining -flto=full -i main.d -of=./tjc
Can `vox` only output Windows binaries?
Aug 26 2020
parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 22:05:51 UTC, Per Nordlöw wrote:
 Can `vox` only output Windows binaries?
Yes. ELF and the System V ABI are WIP.
Aug 26 2020
parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:11:34 UTC, MrSmith wrote:
 Yes. ELF and SytemV ABI is WIP
What flags should I feed to the compiler? When I do

vox main.vox

my output binary is a main.exe Windows binary.
Aug 26 2020
prev sibling next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 20:10:09 UTC, MrSmith wrote:
 ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off 
 -enable-inlining -flto=full -i main.d -of=./tjc
I built that binary on my system and called it `vox`. It is indeed fast; about 2.5x faster than dmd:

| Lang-uage | Oper-ation | Temp-lated | Time [us/#fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Build | No | 16.4 | 2.4 [Vox] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Build | No | 188.1 | 27.6 [Vox] | 1.23.0 | `ldmd2` |
| D | Build | Yes | 30.8 | 4.5 [Vox] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Build | Yes | 204.2 | 29.9 [Vox] | 1.23.0 | `ldmd2` |
| Vox | Build | No | 6.8 | 1.0 [Vox] | master | `vox` |

I've added support for the untemplated Vox language to compiler-benchmark if `vox` is found in the path. I'll add the templated version now.

Great work!
Aug 26 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:21:46 UTC, Per Nordlöw wrote:
 I've added support for untemplated Vox language to 
 compiler-benchmark if `vox` is found in the path.
I've updated the docs as well, at
https://github.com/nordlow/compiler-benchmark
Aug 26 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:22:52 UTC, Per Nordlöw wrote:
 I've updated the docs aswell at
 https://github.com/nordlow/compiler-benchmark
I haven't updated the benchmarks yet, though.
Aug 26 2020
next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:27:39 UTC, Per Nordlöw wrote:
 I've haven't updated the benchmarks yet, though.
I've pushed support for Vox generics now as well.
Aug 26 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:56:54 UTC, Per Nordlöw wrote:
 I've pushed support for Vox-generics now aswell.
Hardly any difference in compile-time for the generic version. Impressive.
Aug 26 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 23:03:08 UTC, Per Nordlöw wrote:
 Hardly any difference in compile-time for the generic version. 
 Impressive.
Is Vox so fast because it doesn't (yet) support implicit function template instantiation (IFTI)?
Aug 27 2020
parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Thursday, 27 August 2020 at 09:47:38 UTC, Per Nordlöw wrote:
 Is Vox so fast because it doesn't (yet) support implicit 
 function template instantiation (IFTI)?
IFTI is supported, but only in simple cases. Relevant tests:
https://github.com/MrSmith33/tiny_jit/blob/master/source/tests/passing.d#L3342-L3434

Macros are not yet implemented. But I have variadic templates. (See the tests below the IFTI tests.)

Here is a fun one. Combining #foreach, a variadic template function and type functions to get writeln functionality. selectPrintFunc gets run via CTFE and returns an alias to the relevant function. $ functions work like traits. $type is $alias restricted to types only.

void printStr(u8[]);
void printInt(i64 i);

$alias selectPrintFunc($type T) {
    if ($isInteger(T))
        return printInt;
    if ($isSlice(T))
        return printStr;
    $compileError("Invalid type");
}

void write[Args...](Args... args) {
    #foreach(i, arg; args) {
        alias func = selectPrintFunc(Args[i]);
        func(arg);
    }
}

void run() {
    write("Hello", 42);
}
Aug 27 2020
next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Thursday, 27 August 2020 at 10:29:41 UTC, MrSmith wrote:
 IFTI is supported, but only in simple cases. Relevant tests: 
 https://github.com/MrSmith33/tiny_jit/blob/master/source/tests/passing.d#L3342-L3434
Nice. Are/will Vox's overloading rules be different from D's?

Is there a list of D features you don't want? And a list of non-D features you plan to add?
Aug 27 2020
next sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Thursday, 27 August 2020 at 14:52:06 UTC, Per Nordlöw wrote:
 Is there a list of D features you don't want?
 And a list of non-D features you plan?
I just read todo.txt. Do you plan to add qualifiers for pure and safe code?
Aug 27 2020
prev sibling parent MrSmith <mrsmith33 yandex.ru> writes:
On Thursday, 27 August 2020 at 14:52:06 UTC, Per Nordlöw wrote:
 Nice. Are/Will Vox's overloading rules be different from D's?

 Is there a list of D features you don't want?
 And a list of non-D features you plan?
Haven't given overloading a thought yet, but the D way seems totally ok. Up to this point the lack of overloading hasn't been a major pain point, though.

I tried to list the main differences from D in the readme. I want to have performant ways of introspection and code generation that utilize CTFE as much as possible. I may add some support for polymorphism, like signatures/traits.

I'm not a big fan of poisonous attributes, so I'm not adding them atm (but I have one now, #ctfe, for ctfe-only functions/structs). There is more basic stuff still missing from the language that I need to focus on.
Aug 27 2020
prev sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Thursday, 27 August 2020 at 10:29:41 UTC, MrSmith wrote:
 void write[Args...](Args... args)
What is the pro of this syntax with `Args... args` compared to D's `void write(Args...)(Args args)`?
Oct 05 2020
parent MrSmith <mrsmith33 yandex.ru> writes:
On Monday, 5 October 2020 at 14:19:36 UTC, Per Nordlöw wrote:
 What is the pro of this syntax with `Args... args` compared to 
 D's
 void write(Args...)(Args args)
 ?
I think it was due to a simpler implementation. This way you know that Args is variadic as early as parse time.
Oct 05 2020
prev sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 22:27:39 UTC, Per Nordlöw wrote:
 On Wednesday, 26 August 2020 at 22:22:52 UTC, Per Nordlöw wrote:
 I've updated the docs as well at
 https://github.com/nordlow/compiler-benchmark
I haven't updated the benchmarks yet, though.
Here are at least the numbers for

    ./benchmark --languages=D,Vox --function-count=200 --function-depth=450 --run-count=1

output in Markdown-table format:

| Lang-uage | Oper-ation | Temp-lated | Time [us/#fn] | Slowdown vs [Best] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: |
| D | Check | No | 6.9 | 1.0 [D] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Check | No | 7.5 | 1.1 [D] | 1.23.0 | `ldmd2` |
| D | Check | Yes | 17.4 | 2.5 [D] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Check | Yes | 18.8 | 2.7 [D] | 1.23.0 | `ldmd2` |
| D | Build | No | 16.8 | 2.4 [Vox] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Build | No | 192.8 | 27.3 [Vox] | 1.23.0 | `ldmd2` |
| D | Build | Yes | 29.7 | 4.2 [Vox] | v2.093.1-697-g537aa8eb1 | `dmd` |
| D | Build | Yes | 211.0 | 29.9 [Vox] | 1.23.0 | `ldmd2` |
| Vox | Build | No | 7.1 | 1.0 [Vox] | master | `vox` |
| Vox | Build | Yes | 7.9 | 1.1 [Vox] | master | `vox` |

Vox build equals dmd check in speed! I guess it's time to start running the binary as well to see if there are any speed differences.
Aug 26 2020
parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 26 August 2020 at 23:07:28 UTC, Per Nordlöw wrote:
 I guess it's time to start running the binary aswell to see if 
 there are any speed differences.
Vox uses SSA form + linear scan register allocation, but no other major optimizations are done yet. I would guess performance lands somewhere between the debug and release builds of other compilers. You may want to check the --print-mem and --print-time flags to get detailed stats.
Aug 26 2020
parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 23:14:57 UTC, MrSmith wrote:
 Vox ...
What does the macro syntax look like in Vox?
Aug 27 2020
prev sibling parent Jacob Carlborg <doob me.com> writes:
On Wednesday, 26 August 2020 at 20:10:09 UTC, MrSmith wrote:
 On Wednesday, 26 August 2020 at 19:26:17 UTC, Per Nordlöw wrote:
 How should I best build vox for maximum performance?
It needs the cli version passed:

    ldc2 -d-version=cli -m64 -O3 -release -boundscheck=off -enable-inlining -flto=full -i main.d -of=./tjc
Should add the following flags as well for best performance:

    --mcpu=native --defaultlib=libdruntime-ldc-lto,libphobos2-ldc-lto

The first enables extra features the current CPU supports, like SSE. The second links against druntime and Phobos compiled with LTO enabled, instead of the regular libraries.

--
/Jacob Carlborg
Aug 27 2020
prev sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 26 August 2020 at 17:27:48 UTC, MrSmith wrote:
 I run benchmark myself and here is what Vox code should look 
 like based on D code:
What about adding Vox support for performing semantic analysis only, similar to dmd's -o- flag?
Aug 26 2020
prev sibling next sibling parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?

 If so it most likely needs to use a backend other than LLVM.

 I believe Jai is supposed to do that but it hasn't been 
 released yet.
I just added support for Apple's Swift. It's massively slow on Linux. Check is 61 times slower than dmd and build is 42 times slower than dmd.

    ./benchmark --languages=D,Swift --function-count=200 --function-depth=450 --run-count=1

outputs (in Markdown)

| Lang-uage | Oper-ation | Temp-lated | Op Time [us/#fn] | Slowdown vs [Best] | Run Time [us/#fn] | Version | Exec |
| :---: | :---: | --- | :---: | :---: | :---: | :---: | :---: |
| D | Check | No | 7.5 | 1.0 [D] | N/A | v2.094.0-rc.1-75-ga0875a7e0 | `dmd` |
| D | Check | No | 8.5 | 1.1 [D] | N/A | 1.23.0 | `ldmd2` |
| D | Check | Yes | 19.8 | 2.6 [D] | N/A | v2.094.0-rc.1-75-ga0875a7e0 | `dmd` |
| D | Check | Yes | 22.9 | 3.0 [D] | N/A | 1.23.0 | `ldmd2` |
| D | Build | No | 27.1 | 1.0 [D] | 50 | v2.094.0-rc.1-75-ga0875a7e0 | `dmd` |
| D | Build | No | 205.6 | 7.6 [D] | 108 | 1.23.0 | `ldmd2` |
| D | Build | Yes | 38.2 | 1.4 [D] | 31 | v2.094.0-rc.1-75-ga0875a7e0 | `dmd` |
| D | Build | Yes | 214.7 | 7.9 [D] | 113 | 1.23.0 | `ldmd2` |
| Swift | Check | No | 461.4 | 61.3 [D] | N/A | 5.3 | `swiftc` |
| Swift | Build | No | 1133.4 | 41.8 [D] | 61 | 5.3 | `swiftc` |

I'll rerun all the benchmarks now.
Sep 25 2020
parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Friday, 25 September 2020 at 16:01:32 UTC, Per Nordlöw wrote:
 I just added support for Apple's Swift. It's massively slow on 
 Linux. Check is 61 times slower than dmd and build is 42 times 
 slower than dmd.
The update comments at [1] give some clues to the problems with "Big Agenda Languages" in general and Swift in particular. Jonathan Blow explains the meaning of the term "Big Agenda Languages" at [2]. This should be mentioned when people ask us why we work on Dlang.

[1] https://stackoverflow.com/questions/25537614/why-is-swift-compile-time-so-slow

[2] https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md
Sep 28 2020
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On Thursday, 20 August 2020 at 20:50:25 UTC, Per Nordlöw wrote:
 After having evaluated the compilation speed of D compared to 
 other languages at

     https://github.com/nordlow/compiler-benchmark

 I wonder; is there any language that compiles to native code 
 anywhere nearly as fast or faster than D, except C?
You should have a look at the self-hosted Zig compiler as well. I'm not sure if it's mature enough to benchmark (it currently only supports a small subset of Zig). You would probably need to compile it from source as well. Have a look at this post [1] I made, which mentions how the Zig compiler uses incremental compilation. Even a full (non-incremental) build seems really fast, although I don't know how well it scales.

[1] https://forum.dlang.org/post/ctelroirrkqpkrlupajp forum.dlang.org

--
/Jacob Carlborg
Sep 29 2020
parent reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Tuesday, 29 September 2020 at 08:03:07 UTC, Jacob Carlborg 
wrote:
 Have a look at this post [1] I made, that mentions how the Zig 
 compiler uses incremental compilation.

 https://forum.dlang.org/post/ctelroirrkqpkrlupajp forum.dlang.org
Thanks. Have you tried building and using the self-hosted version?
Sep 29 2020
parent Jacob Carlborg <doob me.com> writes:
On Tuesday, 29 September 2020 at 08:23:28 UTC, Per Nordlöw wrote:

 Thanks. Have you tried building and using the self-hosted 
 version?
No, I have not. I've looked at a couple of live-coding videos about Zig and the self-hosted compiler.

--
/Jacob Carlborg
Sep 29 2020