
digitalmars.D - Linking is the slowest part of D's compilation

reply James Lu <jamtlu gmail.com> writes:
Linking is the slowest part of D's compilation process. It is 
what makes its compilation speed uncompetitive with interpreted 
scripting languages like Python and JavaScript.

This project to make a faster linker is in alpha: 
https://github.com/rui314/mold

 Concretely speaking, I wanted to use the linker to link a 
 Chromium executable with full debug info (~2 GiB in size) just 
 in 1 second. LLVM's lld, the fastest open-source linker which I 
 originally created a few years ago, takes about 12 seconds to 
 link Chromium on my machine. So the goal is 12x performance 
 bump over lld. Compared to GNU gold, it's more than 50x.
 It looks like mold has achieved the goal. It can link Chromium 
 in 2 seconds with 8-cores/16-threads, and if I enable the 
 preloading feature (I'll explain it later), the latency of the 
 linker for an interactive use is less than 900 milliseconds. It 
 is actually faster than cat.
Feb 23
next sibling parent James Lu <jamtlu gmail.com> writes:
On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold
We would need to spend substantial engineering effort to make mold production-ready and to port it to macOS and Windows, but it fits the spirit of D to improve compilation speed.
Feb 23
prev sibling next sibling parent reply Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process. It is 
 what makes its compilation speed uncompetitive with interpreted 
 scripting languages like Python and JavaScript.

 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold

 [...]
 [...]
Have you looked at gold or lld?
Feb 24
parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 24 February 2021 at 20:05:17 UTC, Imperatorn wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process. It is 
 what makes its compilation speed uncompetitive with interpreted 
 scripting languages like Python and JavaScript.

 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold

 [...]
 [...]
Have you looked at gold or lld?
From the readme file:
 LLVM's lld, the fastest open-source linker which I originally 
 created a few years ago, takes about 12 seconds to link 
 Chromium on my machine. So the goal is 12x performance bump 
 over lld. Compared to GNU gold, it's more than 50x.
Feb 24
parent reply Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Wednesday, 24 February 2021 at 20:48:49 UTC, Petar Kirov 
[ZombineDev] wrote:
 On Wednesday, 24 February 2021 at 20:05:17 UTC, Imperatorn 
 wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process. It is 
 what makes its compilation speed uncompetitive with interpreted 
 scripting languages like Python and JavaScript.

 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold

 [...]
 [...]
Have you looked at gold or lld?
From the readme file:
 LLVM's lld, the fastest open-source linker which I originally 
 created a few years ago, takes about 12 seconds to link 
 Chromium on my machine. So the goal is 12x performance bump 
 over lld. Compared to GNU gold, it's more than 50x.
👍 We should indoctrinate this rui314 to use D! 😁
Feb 24
parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 24 February 2021 at 22:12:03 UTC, Imperatorn wrote:
 On Wednesday, 24 February 2021 at 20:48:49 UTC, Petar Kirov 
 [ZombineDev] wrote:
 On Wednesday, 24 February 2021 at 20:05:17 UTC, Imperatorn 
 wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 [...]
Have you looked at gold or lld?
From the readme file:
 LLVM's lld, the fastest open-source linker which I originally 
 created a few years ago, takes about 12 seconds to link 
 Chromium on my machine. So the goal is 12x performance bump 
 over lld. Compared to GNU gold, it's more than 50x.
👍 We should indoctrinate this rui314 to use D! 😁
😈
Feb 25
prev sibling next sibling parent Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process. It is 
 what makes its compilation speed uncompetitive with interpreted 
 scripting languages like Python and JavaScript.

 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold

 Concretely speaking, I wanted to use the linker to link a 
 Chromium executable with full debug info (~2 GiB in size) just 
 in 1 second. LLVM's lld, the fastest open-source linker which 
 I originally created a few years ago, takes about 12 seconds 
 to link Chromium on my machine. So the goal is 12x performance 
 bump over lld. Compared to GNU gold, it's more than 50x.
 It looks like mold has achieved the goal. It can link Chromium 
 in 2 seconds with 8-cores/16-threads, and if I enable the 
 preloading feature (I'll explain it later), the latency of the 
 linker for an interactive use is less than 900 milliseconds. 
 It is actually faster than cat.
This project is definitely worth watching, but it's worth saying that people continually try to make their compiler or their linker the fastest. Progress is always good, but these tools tend to slow down quite a bit as they mature.
Feb 24
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process.
Not necessarily:

/tmp % cat foo.d
import std.uni;
void main() {}
/tmp % time dmd -c -unittest foo.d
dmd -c -unittest foo.d  0.27s user 0.06s system 99% cpu 0.331 total
/tmp % time dmd foo.o
dmd foo.o  0.04s user 0.05s system 154% cpu 0.058 total

I'm using lld as the linker. The linker can be a bottleneck, yes, especially since it doesn't do work in parallel. But in my experience, if the linker takes a while, compiling took a lot longer still. Of course, any improvements in this area are welcome, and I hope mold is production-ready as soon as possible. I recently bought an NVMe drive just in the hope that it'd help with the "linker tax".
Feb 24
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 24 February 2021 at 22:12:46 UTC, Atila Neves wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 The linker can be a bottleneck, yes, especially since it 
 doesn't do work in parallel. But in my experience, if the 
 linker takes a while, compiling took a lot longer still. Of 
 course, any improvements in this area are welcome, and I hope 
 mold is production-ready as soon as possible.
This is true for a fresh build, but often not the case for incremental builds, which devs often have to go through. This is because the work you have to do for sources grows with the size of the changeset, while the work you have to do to link grows with the size of the project as a whole, changed or not. On large projects, it is very common that linking dominates incremental builds.

zld is another interesting project that tries to enable incremental linking: https://github.com/kubkon/zld

Just like mold, it is fairly new and probably not battle-tested enough for production yet.
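The asymmetry can be illustrated with a toy cost model (all numbers are invented, purely for illustration):

```python
# Toy model: compile cost scales with the changed files,
# link cost scales with every object file, changed or not.
COMPILE_COST = 1.0   # per changed source file (made-up unit)
LINK_COST = 0.1      # per object file in the whole project

def rebuild_time(total_files, changed_files):
    compile_time = COMPILE_COST * changed_files
    link_time = LINK_COST * total_files
    return compile_time, link_time

# Fresh build of a 1000-file project: compilation dominates.
fresh = rebuild_time(1000, 1000)   # (1000.0, 100.0)

# Incremental build touching 3 files: the link now dominates.
incr = rebuild_time(1000, 3)       # (3.0, 100.0)
```

However small the changeset, the link term stays constant, which is why linking ends up dominating incremental builds on large projects.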
Feb 24
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Feb 24, 2021 at 11:53:34PM +0000, deadalnix via Digitalmars-d wrote:
 On Wednesday, 24 February 2021 at 22:12:46 UTC, Atila Neves wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 The linker can be a bottleneck, yes, especially since it doesn't do
 work in parallel. But in my experience, if the linker takes a while,
 compiling took a lot longer still. Of course, any improvements in
 this area are welcome, and I hope mold is production-ready as soon
 as possible.
This is true for a fresh build, but often not the case for incremental builds, which devs often have to go through. This is because the work you have to do for sources grows with the size of the changeset, while the work you have to do to link grows with the size of the project as a whole, changed or not. On large projects, it is very common that linking dominates incremental builds.
[...]

This is very interesting. I wonder if there's a way to incrementally update the executable, instead of starting from scratch each time?

E.g., hypothetically, if the linker emitted not only the executable but also some kind of map file describing the various parts that compose the executable, together with some extra information about offsets/addresses that depend on each other between parts, then in theory, if we change n object files (where n is significantly less than the total number N of all object files), we ought to be able to regenerate the executable by copying most of its current data, moving a few sections around, and patching up some references.

If the executable format is flexible enough (I think ELF is, don't know about PE), we could also pad the executable with some extra unused space between sections to allow for growth of individual sections up to some limit. Then we might be able to patch in updated object files in-place, along with updating some references as needed, as long as said object files don't grow beyond the size of the extra space.

This could significantly speed up the code-compile-run cycle during development. For releases, of course, you'd want to compact the executable, but generally it's expected that release builds are OK to take longer.

T

-- 
The state pretends to pay us a salary, and we pretend to work.
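The padded-sections idea can be sketched in miniature. This is a toy, assuming a made-up fixed-slot layout (nothing like real ELF): each "object file" gets a padded slot in the image, so an updated object that still fits its slot can be patched in place without rewriting the rest of the file.

```python
import mmap
import os
import tempfile

SECTION_SIZE = 64  # padded slot per "object file" (invented layout)

def write_image(path, sections):
    # Full "link": lay out each section in a fixed-size padded slot.
    with open(path, "wb") as f:
        for data in sections:
            assert len(data) <= SECTION_SIZE
            f.write(data.ljust(SECTION_SIZE, b"\0"))

def patch_section(path, index, data):
    # "Incremental link": overwrite one slot in place via mmap;
    # every other byte of the image is left untouched.
    assert len(data) <= SECTION_SIZE
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        start = index * SECTION_SIZE
        m[start:start + len(data)] = data

path = os.path.join(tempfile.mkdtemp(), "app.bin")
write_image(path, [b"object-a-v1", b"object-b-v1"])
patch_section(path, 1, b"object-b-v2")  # only object b changed
with open(path, "rb") as f:
    image = f.read()
```

A real implementation would of course also need the map file and reference-patching described above; this only shows the in-place slot update.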
Feb 25
next sibling parent James Lu <jamtlu gmail.com> writes:
On Thursday, 25 February 2021 at 15:42:22 UTC, H. S. Teoh wrote:
 On Wed, Feb 24, 2021 at 11:53:34PM +0000, deadalnix via 
 Digitalmars-d wrote:
 [...]
[...] This is very interesting. I wonder if there's a way to incrementally update the executable, instead of starting from scratch each time? [...]
I'm more interested in a "JIT" that remembers object-file-to-in-memory function-pointer mappings and overwrites the functions with trampolines. You could use fork() to make a reloadable executable that way. Of course, you'd need a new function-loading system, which could be difficult...
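The indirection-table part of that idea can be illustrated with a toy (this is nothing like a real JIT; it just shows why routing calls through a table makes functions hot-swappable):

```python
# Callers never hold a direct function pointer; they go through a
# table (the "trampoline"), so replacing the table entry redirects
# every future call without touching the callers.
table = {}

def register(name, fn):
    table[name] = fn

def call(name, *args):
    return table[name](*args)

register("greet", lambda who: "hello, " + who)
old = call("greet", "world")   # "hello, world"

# "Reload": swap in a new implementation under the same name.
register("greet", lambda who: "hi, " + who)
new = call("greet", "world")   # "hi, world"
```

In native code the table entry would be a jump stub that gets overwritten to point at the freshly loaded function, which is where the hard parts (code loading, relocation) live.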
Feb 25
prev sibling next sibling parent MrSmith33 <mrsmith33 yandex.ru> writes:
On Thursday, 25 February 2021 at 15:42:22 UTC, H. S. Teoh wrote:
 This is very interesting.  I wonder if there's a way to 
 incrementally update the executable, instead of starting from 
 scratch each time?
Zig has that: https://kristoff.it/blog/zig-new-relationship-llvm/#in-place-binary-patching
Feb 25
prev sibling next sibling parent Paul Backus <snarwin gmail.com> writes:
On Thursday, 25 February 2021 at 15:42:22 UTC, H. S. Teoh wrote:
 This is very interesting.  I wonder if there's a way to 
 incrementally update the executable, instead of starting from 
 scratch each time?
The author of mold has a section on incremental linking in the readme under "Rejected Ideas": https://github.com/rui314/mold#rejected-ideas
Feb 25
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 25 February 2021 at 15:42:22 UTC, H. S. Teoh wrote:
 This is very interesting.  I wonder if there's a way to 
 incrementally update the executable, instead of starting from 
 scratch each time?

 E.g., hypothetically, if the linker emitted not only the 
 executable but also some kind of map file describing the 
 various parts that compose the executable, together with some 
 extra information about offsets/addresses that depend on each 
 other between parts, then in theory, if we change n object 
 files (where n is significantly less than the total number N of 
 all object files), we ought to be able to regenerate the 
 executable by copying most of its current data, move a few 
 sections around, and patch up some references.

 If the executable format is flexible enough (I think ELF is, 
 don't know about PE), we could also pad the executable with 
 some extra unused space between sections to allow for growth of 
 individual sections up to some limit. Then we might be able 
 patch in updated object files in-place, along with updating 
 some references as needed, as long as said object files don't 
 grow beyond the size of the extra space.

 This could significantly speed up the code-compile-run cycle 
 during development.  For releases, of course, you'd want to 
 compact the executable, but generally it's expected that 
 release builds are OK to take longer.


 T
https://github.com/kubkon/zld

It is still quite experimental. The author has written about the techniques they use. There are very interesting things they do both for speed (like preloading .o files in a daemon as soon as they finish compiling) and for incremental linking (this requires maintaining extra metadata about where things are).

See https://kristoff.it/blog/zig-new-relationship-llvm/#designing-machine-code-for-incremental-compilation for more details on how this works.
Feb 25
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/24/2021 3:53 PM, deadalnix wrote:
 This is true for a fresh build, but often not the case for incremental 
 builds, which devs often have to go through. This is because the work you 
 have to do for sources grows with the size of the changeset, while the 
 work you have to do to link grows with the size of the project as a whole, 
 changed or not. On large projects, it is very common that linking 
 dominates incremental builds.
 
 zld is another interesting project that tries to enable incremental 
 linking: https://github.com/kubkon/zld
 
 Just like mold, it is fairly new and probably not battle-tested enough 
 for production yet.
Optlink could do a full link faster than MS-Link could do an incremental link.
Feb 25
next sibling parent James Lu <jamtlu gmail.com> writes:
On Friday, 26 February 2021 at 01:52:04 UTC, Walter Bright wrote:
 Optlink could do a full link faster than MS-Link could do an 
 incremental link.
I wonder if you could explain to us how to port Optlink to ELF and Mach-O, if porting would be possible.
Feb 25
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 26 February 2021 at 01:52:04 UTC, Walter Bright wrote:
 On 2/24/2021 3:53 PM, deadalnix wrote:
 This is true for a fresh build, but often not the case for 
 incremental builds, which dev often have to go through. This 
 is because the work you have to do for sources grows with the 
 size of the changeset, while the work you have to do link 
 grows with the size of the project as a whole, changed or not. 
 On large projects, it is very common that linking dominates 
 incremental builds.
 
 zld is another interesting project that tries to enable 
 incremental linking: https://github.com/kubkon/zld
 
 Just like mold, it is fairly new and probably not 
 battle-tested enough for production yet.
Optlink could do a full link faster than MS-Link could do an incremental link.
That is also the position of the mold guy. He thinks that the extra work required for incremental linking offsets the gains, so he decided not to even try it. Hard to know who is right.
Feb 25
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/25/2021 6:26 PM, deadalnix wrote:
 On Friday, 26 February 2021 at 01:52:04 UTC, Walter Bright wrote:
 Optlink could do a full link faster than MS-Link could do an incremental link.
That is also the position of the mold guy. He thinks that the extra work required for incremental linking offsets the gains, so he decided not to even try it. Hard to know who is right.
Incremental linking also tends to suffer from all kinds of weird bugs. Enough so that one tends to give up and go full linking anyway.
Feb 25
next sibling parent deadalnix <deadalnix gmail.com> writes:
On Friday, 26 February 2021 at 07:35:16 UTC, Walter Bright wrote:
 On 2/25/2021 6:26 PM, deadalnix wrote:
 On Friday, 26 February 2021 at 01:52:04 UTC, Walter Bright 
 wrote:
 Optlink could do a full link faster than MS-Link could do an 
 incremental link.
That is also the position of the mold guy. He thinks that the extra work required for incremental linking offsets the gains, so he decided not to even try it. Hard to know who is right.
Incremental linking also tends to suffer from all kinds of weird bugs. Enough so that one tends to give up and go full linking anyway.
That is why they don't do it on just any type of code, but only on code that has been compiled by their compiler in the right way. So no incremental linking for release builds, for instance, only debug builds.
Feb 26
prev sibling parent Dukc <ajieskola gmail.com> writes:
On Friday, 26 February 2021 at 07:35:16 UTC, Walter Bright wrote:
 Incremental linking also tends to suffer from all kinds of 
 weird bugs. Enough so that one tends to give up and go full 
 linking anyway.
And for mega-sized codebases that could potentially benefit from incremental linking speed-wise, I'd think it's better to use application interfaces and/or BindBC-style dynamic linking as the top-level modularity mechanisms anyway. Am I correct?
Feb 26
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 24 February 2021 at 23:53:34 UTC, deadalnix wrote:
 On Wednesday, 24 February 2021 at 22:12:46 UTC, Atila Neves 
 wrote:
 On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 The linker can be a bottleneck, yes, especially since it 
 doesn't do work in parallel. But in my experience, if the 
 linker takes a while, compiling took a lot longer still. Of 
 course, any improvements in this area are welcome, and I hope 
 mold is production-ready as soon as possible.
This is true for a fresh build, but often not the case for incremental builds, which devs often have to go through.
I only really care about incremental builds. Fresh builds should be rare, and if they're not, I don't understand that workflow.
 This is because the work you have to do for sources grows with 
 the size of the changeset, while the work you have to do link 
 grows with the size of the project as a whole, changed or not. 
 On large projects, it is very common that linking dominates 
 incremental builds.
It can, yes, and any improvements there will be very welcome.
 zld is another interesting project that tries to do enable 
 incremental linking: https://github.com/kubkon/zld
Nice. I wrote a D program once that used the linker as a server and kept "sending" it object files that it kept on relinking, but unfortunately that didn't speed anything up.
Feb 26
parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 26 February 2021 at 13:23:06 UTC, Atila Neves wrote:
 Nice. I wrote a D program once that used the linker as a server 
 and kept "sending" it object files that it kept on relinking, 
 but unfortunately that didn't speed anything up.
I wouldn't expect this to improve performance unless the linker is coded to take advantage of it. Relinking everything many times won't help.
Feb 26
parent reply Max Haughton <maxhaton gmail.com> writes:
On Friday, 26 February 2021 at 17:04:26 UTC, deadalnix wrote:
 On Friday, 26 February 2021 at 13:23:06 UTC, Atila Neves wrote:
 Nice. I wrote a D program once that used the linker as a 
 server and kept "sending" it object files that it kept on 
 relinking, but unfortunately that didn't speed anything up.
I wouldn't expect this to improve performance unless the linker is coded to take advantage of this. Re linking everything many times won't help.
On the subject of re-linking, reducing pressure on the linker is probably the way to go from the perspective of things we can actually do. The issue is that these things end up being deeply buried in code or, worse, exhibit slightly chaotic behaviour. (For example, if you pull in dmd-as-a-library in dub, it rebuilds the entire frontend every time for no reason as far as I can tell, and it's hard to know why.)
Feb 26
parent reply Jacob Carlborg <doob me.com> writes:
On 2021-02-26 18:20, Max Haughton wrote:

 (For example, if you pull in dmd-as-a-library in dub it rebuilds the 
 entire frontend every time for no reason as far as I can tell, and 
 it's hard to know why)
It's a bug in Dub [1].

[1] https://github.com/dlang/dub/pull/1687

-- 
/Jacob Carlborg
Feb 26
parent Max Haughton <maxhaton gmail.com> writes:
On Friday, 26 February 2021 at 21:01:14 UTC, Jacob Carlborg wrote:
 On 2021-02-26 18:20, Max Haughton wrote:

 (For example, if you pull in dmd-as-a-library in dub it 
 rebuilds the entire frontend every time for no reason as far 
 as I can tell, and it's hard to know why)
It's a bug in Dub [1]. [1] https://github.com/dlang/dub/pull/1687
Oh joy...
Feb 26
prev sibling parent FeepingCreature <feepingcreature gmail.com> writes:
On Tuesday, 23 February 2021 at 14:41:59 UTC, James Lu wrote:
 Linking is the slowest part of D's compilation process. It is 
 what makes its compilation speed uncompetitive with interpreted 
 scripting languages like Python and JavaScript.

 This project to make a faster linker is in alpha: 
 https://github.com/rui314/mold

 Concretely speaking, I wanted to use the linker to link a 
 Chromium executable with full debug info (~2 GiB in size) just 
 in 1 second. LLVM's lld, the fastest open-source linker which 
 I originally created a few years ago, takes about 12 seconds 
 to link Chromium on my machine. So the goal is 12x performance 
 bump over lld. Compared to GNU gold, it's more than 50x.
 It looks like mold has achieved the goal. It can link Chromium 
 in 2 seconds with 8-cores/16-threads, and if I enable the 
 preloading feature (I'll explain it later), the latency of the 
 linker for an interactive use is less than 900 milliseconds. 
 It is actually faster than cat.
I've run a quick internal test with a 34 MB object:

gold: 0.82s
mold: 0.20s

That's pretty amazing. Of course, it took a lot of poking at linker flags to make it happen, and I had to change the hash style because mold's GNU-hash implementation is pretty slow right now (0.6s). But still.
Feb 25