digitalmars.D - Compilation is taking a ton of memory

Mario Silva <mario tripaneer.com> writes:
Hello,

Our code base has been growing steadily and it's currently at a 
point where my 16GB machine just freezes when we're compiling our 
code. This happens because DMD just consumes all my memory for a 
while.

Also, it's taking a long time to compile it. Less than a year
ago our project was taking around 17 seconds to compile - no libs
requiring compilation - and maybe around 50 seconds for full
compilation, and it now takes around 50 seconds for an
incremental compilation and around 1.5 minutes for a full one.

For you guys to have an idea of the size of our project, we have 
21151 lines of code and then 50933 in our libs. This is just our 
code without counting dependencies fetched by dub like vibe.d for 
example.

Are there any plans to work on compiler performance in terms of 
memory consumption and compilation time?

Any tips on how to code in a way that minimizes both compilation 
times and memory consumption when compiling?

Thanks,
Mario
Jun 27 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 27, 2018 at 04:00:37PM +0000, Mario Silva via Digitalmars-d wrote:
 Hello,
 
 Our code base has been growing steadily and it's currently at a point
 where my 16GB machine just freezes when we're compiling our code. This
 happens because DMD just consumes all my memory for a while.
How are you compiling the code? Dub? Makefile? Are you compiling everything all at once?

DMD is known to prefer compilation speed over memory efficiency, a choice I personally don't quite agree with, but it is what it is. That means past a certain size, you're going to have to split up your project into multiple separate compilations. The usual convention is to compile packages together, e.g., if you have the source tree structure:

	main/main.d
	main/common.d
	pkg1/package.d
	pkg1/mod1.d
	pkg1/mod2.d
	pkg2/package.d
	pkg2/mod1.d
	pkg2/mod2.d

then you'd compile main/* first, say into main.a (or main.lib), and compile pkg1/* into pkg1.a, and compile pkg2/* into pkg2.a, then finally link them all together with a final invocation of the compiler.

This should reduce the total amount of memory used by any single invocation of dmd, which should keep memory usage sane. It may also reduce compilation time if only one package needs to be recompiled (though template-heavy code may not save that much, depending on what you do with it).

[..]
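A minimal sketch of the per-package split described above, written as a small Python helper that only plans the commands and does not run them. The name `plan_build`, the output name `app`, and the `-I.` import path are assumptions for illustration; `-lib`, `-I` and `-of=` are ordinary dmd switches, but the exact set a real project needs will differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_build(sources, app_name="app"):
    """Group D sources by top-level directory and plan one dmd call per group.

    Returns a list of command lines (each a list of strings): one
    `dmd -lib` invocation per package, then a final link-only invocation.
    """
    groups = defaultdict(list)
    for src in sources:
        package = PurePosixPath(src).parts[0]  # "pkg1" for "pkg1/mod1.d"
        groups[package].append(src)

    commands = []
    libs = []
    for package in sorted(groups):
        lib = f"{package}.a"
        libs.append(lib)
        # -I. (assumed project root) lets this batch import modules
        # that belong to the other packages without compiling them.
        commands.append(
            ["dmd", "-lib", "-I.", f"-of={lib}", *sorted(groups[package])]
        )

    # Final invocation: no sources, only the static libraries to link.
    commands.append(["dmd", f"-of={app_name}", *libs])
    return commands


if __name__ == "__main__":
    tree = [
        "main/main.d", "main/common.d",
        "pkg1/package.d", "pkg1/mod1.d", "pkg1/mod2.d",
        "pkg2/package.d", "pkg2/mod1.d", "pkg2/mod2.d",
    ]
    for cmd in plan_build(tree):
        print(" ".join(cmd))
```

Each planned command sees only one package's sources, which is what bounds the memory used by a single dmd process. The libraries are linked in alphabetical order here; with some linkers the order of static libraries matters, so a real script may need to order them by dependency.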
 Are there any plans to work on compiler performance in terms of memory
 consumption and compilation time?
Not that I know of, though it'd be really nice if somebody worked on that. Given Walter's past choice of compilation speed over memory efficiency, I'm not sure how far such an effort would get. But I'd vote for it. There *is* the option of just turning on the GC, now that dmd is bootstrapping, but the last time I tried there were bugs that caused crashes, so it may not (yet) be a viable option.
 Any tips on how to code in a way that minimizes both compilation times
 and memory consumption when compiling?
[...]

In general: (1) avoid template bloat, and (2) use separate compilation in strategic ways.

(1) implies:

- Avoid using (or reduce usage of) libraries / modules that are heavy on templates, except where absolutely necessary. If you're using Phobos, std.format comes to mind (though in a sufficiently large project I'd expect it doesn't play that big of a role, but the principle holds). If you're using vibe.d, Diet templates can be a source of heavy template usage.

  More generally, avoid recursive templates, or template instantiation chains that nest too deeply. If not done carefully, you may end up with a quadratic or an exponential number of template instantiations, which will quickly eat up all your memory and CPU cycles (in addition to generating bloated executables).

- Avoid doing too much in CTFE, because the current CTFE engine is slow (and also very memory-inefficient). We're all waiting for newCTFE to land, but from what I can tell that's still a ways away. (Note: not to be confused with other compile-time features that are not directly related to CTFE. Using static if's and the like won't add much overhead unless you're using recursive templates, in which case see above.)

If template-heavy or CTFE-heavy code is essential to your project, consider splitting it into its own self-contained (sub)modules and compiling those separately, so that recompilations that don't touch those parts of the code will be reasonably fast.

In one of my vibe.d projects, I basically separated out the "business logic" of the program into a module outside the code that directly interfaces with vibe.d (Diet templates being the primary target). This has sped up recompilations when only the "business logic" parts of the code need to change.

Furthermore, because Diet templates tend to get very heavy, I also split those up into different compilation groups (e.g., new user / account management pages, error/miscellaneous pages, main pages) so that if I'm working on Diet templates in, say, the new user creation pages, I only have to recompile the Diet templates related to account management, not every other Diet template in the project. This also saves a lot of time waiting for compilation.

T

--
It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
Jun 27 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/27/18 12:00 PM, Mario Silva wrote:
 Hello,
 
 Our code base has been growing steadily and it's currently at a point 
 where my 16GB machine just freezes when we're compiling our code. This 
 happens because DMD just consumes all my memory for a while.
That is the path of D's compiler toolchain. The front end uses a simple "bump the pointer" memory allocation scheme, and NEVER frees memory. This is reasonable for many things in the compiler (e.g. you never want to forget about symbols you built), but the worst is if you have a lot of CTFE, as that consumes a lot of memory that could easily be reclaimed after it's done running.
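To make the "bump the pointer" idea concrete, here is a toy model in Python. It is only an illustration of the scheme, not dmd's actual allocator; the class name `BumpArena` and its methods are made up for this sketch.

```python
class BumpArena:
    """Toy bump-the-pointer allocator: allocation is one addition, nothing is freed."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)
        self.top = 0  # next free offset; it only ever moves forward

    def allocate(self, size):
        """Reserve `size` bytes and return the offset where they start."""
        if self.top + size > len(self.buffer):
            raise MemoryError("arena exhausted")
        offset = self.top
        self.top += size  # this addition is the entire cost of allocating
        return offset

    # Deliberately no free(): space is reclaimed only when the whole
    # arena (in a compiler's case, the process) goes away.


if __name__ == "__main__":
    arena = BumpArena(1024)
    for _ in range(10):
        arena.allocate(64)  # e.g. temporaries produced by one CTFE call
    print("bytes still held after the temporaries are dead:", arena.top)
```

The speed comes from allocation being a single addition; the cost is that short-lived data, such as intermediate CTFE values, keeps occupying the arena long after it is needed.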
 Also, it's taking a long time to compile it. Less than a year ago our
 project was taking around 17 seconds to compile - no libs requiring
 compilation - and maybe around 50 seconds for full compilation, and it
 now takes around 50 seconds for an incremental compilation and around
 1.5 minutes for a full one.
Yikes, I experience stuff like this, and I'm using vibe.d as well. But I haven't gotten big enough to make it horrible. I'm compiling my vibe.d project on a VM Linux image with only 2GB of RAM. It's definitely a problem for template-heavy or CTFE heavy builds.
 
 For you guys to have an idea of the size of our project, we have 21151 
 lines of code and then 50933 in our libs. This is just our code without 
 counting dependencies fetched by dub like vibe.d for example.
The libraries should build OK. It's really the templates or diet templates that take the longest. And that's not a large project in comparison to other projects.
 Are there any plans to work on compiler performance in terms of memory 
 consumption and compilation time?
This has been a thorn in many sides for a long time. I remember Weka.io's Liran talking about how they required an INSANE amount of time/memory to build their system, at DConf 2015 maybe? But things have gotten a bit better since then.

I think at some point there will be a reckoning where it has to be fixed. Fast compile time is cool, but infinite compile time (i.e. it never finishes because the OOM killer destroys your process) is not acceptable.
 Any tips on how to code in a way that minimizes both compilation times 
 and memory consumption when compiling?
The default for dub is to build "separately"; you can try different build modes with --build-mode=allAtOnce or --build-mode=singleFile. There is very little documentation as to what these actually do. I don't know if these would help at all, but it's worth a try.

Note that in some respects, memory usage and compilation times are related. If you want to keep memory usage low, you may have to forget some things, and then re-figure them out later, which slows you down. But if you consume too much memory, your system thrashes and then you experience a slowdown that way.

-Steve
Jun 27 2018
Shachar Shemesh <shachar weka.io> writes:
On 28/06/18 01:46, Steven Schveighoffer wrote:
 This has been a thorn in many sides for a long time. I remember 
 Weka.io's Liran talking about how they required an INSANE amount of 
 time/memory to build their system in dconf 2015 maybe? But things have 
 gotten a bit better since then. I think at some point there will be a 
 reckoning where it has to be fixed. Fast compile-time is cool, but 
 infinite compile time (i.e. it never finishes because the OOM killer
 destroys your process) is not acceptable.
No, we just use bigger build servers:

$ cat /proc/cpuinfo
...
processor       : 127
vendor_id       : GenuineIntel
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E7-8880 v3 @ 2.30GHz
stepping        : 4
microcode       : 0x11
cpu MHz         : 2300.072
cache size      : 46080 KB
physical id     : 3
siblings        : 32
core id         : 15
cpu cores       : 16
apicid          : 223
initial apicid  : 223
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm ida xsaveopt fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm
bogomips        : 4662.97
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

ubuntu@ip-172-31-14-32:~$ cat /proc/meminfo
MemTotal:       2014743200 kB

My laptop has 32GB of RAM. If I do single-threaded compilation and allocate enough swap, I can *barely* squeeze through a debug build compilation of the node.

Shachar
Jun 30 2018
sarn <sarn theartofmachinery.com> writes:
On Wednesday, 27 June 2018 at 16:00:37 UTC, Mario Silva wrote:
 Any tips on how to code in a way that minimizes both 
 compilation times and memory consumption when compiling?
Here are my tips. I'd love to hear more from others.

* Try to reduce imports. E.g., say you use a lot of stuff from std.algorithm in your implementations, but not in your external interface. In that case, instead of doing "import std.algorithm;" at the module top, do things like "import std.algorithm.searching : canFind;" inside function bodies. If you think about this when designing your external interfaces, you can let user code import what it needs from your modules without recursively importing half the rest of the D ecosystem with it.

* Templates are great, but not everything has to be one. Sometimes a simple delegate makes a perfectly generic interface. Templated code can be more expensive to compile, while non-templated code can be compiled once in a library and then reused.

* CTFE is awesome but currently a memory hog.

BTW, vibe.d isn't great with this stuff, TBH. Simple public API functions return types with "auto" that turn out to be several layers of templated wrappers all the way down, each wrapper coming from a different module. That's an example of what to avoid doing.
Jun 27 2018
crimaniak <crimaniak gmail.com> writes:
On Wednesday, 27 June 2018 at 16:00:37 UTC, Mario Silva wrote:
 Hello,

 Our code base has been growing steadily and it's currently at a 
 point where my 16GB machine just freezes when we're compiling 
 our code. This happens because DMD just consumes all my memory 
 for a while.

 Also, it's taking a long time to compile it. Less than a year
 ago our project was taking around 17 seconds to compile - no
 libs requiring compilation - and maybe around 50 seconds for
 full compilation, and it now takes around 50 seconds for an
 incremental compilation and around 1.5 minutes for a full one.
Same problem here: freezes because of active swap usage, and about 2 minutes to compile after every small change. In fact, when it comes to template code, DMD is neither particularly fast nor memory-efficient. Given that there is nothing like precompiled headers, the D experience on a large project is much worse than C++. The problem is aggravated by the fact that DUB compiles all the sources in one DMD launch.

I tried to solve this problem some time ago, and Sönke advised using reggae (https://code.dlang.org/packages/reggae). I'm still experimenting with different approaches and even wrote my own utility helper (https://code.dlang.org/packages/ifupdated). But at the moment I can state the following:

* DUB needs to incorporate the reggae (== classic make) approach to compiling sources.

* Like the 'debug' and 'release' modes of the compiler, the build system should also have two modes, for example 'deploy' and 'development': the first minimizes the time a full compilation takes, and the second minimizes the recompilation time after minor changes.

* Serious thought needs to be given to a 'compiler as a service' approach and to caching intermediate compilation results.
 Any tips on how to code in a way that minimizes both 
 compilation times and memory consumption when compiling?
For the moment, just try reggae. Separate compilation may expose some forgotten imports, but those are easy to fix.
Jun 28 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jun 28, 2018 at 04:11:57PM +0000, crimaniak via Digitalmars-d wrote:
[...]
 The problem is aggravated by the fact that DUB compiles all the
 sources in one DMD launch.
Doesn't dub have an option to compile packages (i.e. subdirs) separately? Or does that only apply to dub packages, not to subdirs within a single project?
 I tried to solve this problem some time ago, and Sönke advised using
 reggae (https://code.dlang.org/packages/reggae). I'm still
 experimenting with different approaches and even wrote my own utility
 helper (https://code.dlang.org/packages/ifupdated).
[...]

This is one of the reasons I was not impressed by dub (sorry, Sonke). I continue to use SCons for my D projects. For dub dependencies, I just create a fake empty dub project with declared dependencies and run that separately for refreshing dependencies, but the actual compiling and linking is handled by SCons.

Once a project gets beyond a certain size, you *need* more fine-grained control over exactly how the project should be built.

T

--
That's not a bug; that's a feature!
Jun 28 2018
Eugene Wissner <belka caraus.de> writes:
On Thursday, 28 June 2018 at 16:24:07 UTC, H. S. Teoh wrote:
 On Thu, Jun 28, 2018 at 04:11:57PM +0000, crimaniak via 
 Digitalmars-d wrote: [...]
 The problem is aggravated by the fact that DUB compiles all 
 the sources in one DMD launch.
 Doesn't dub have an option to compile packages (i.e. subdirs)
 separately? Or does that only apply to dub packages, not to subdirs
 within a single project?
As far as I can see, dub builds static libraries from dub dependencies. But it passes all source files from one dub project together to build the final static library/executable. You can actually use --verbose to see what commands are called.
 I tried to solve this problem some time ago, and Sönke advised 
 using reggae (https://code.dlang.org/packages/reggae). I'm 
 still experimenting with different approaches and even wrote 
 my own utility helper 
 (https://code.dlang.org/packages/ifupdated).
 [...] This is one of the reasons I was not impressed by dub (sorry,
 Sonke). I continue to use SCons for my D projects. For dub
 dependencies, I just create a fake empty dub project with declared
 dependencies and run that separately for refreshing dependencies, but
 the actual compiling and linking is handled by SCons. Once a project
 gets beyond a certain size, you *need* more fine-grained control over
 exactly how the project should be built. T
Some more alternatives: I have good experience with the "rake" family of build systems. I've seen Ruby's rake [1] itself being used for D projects. For a smaller project of mine I'm using shake [2]. You write your build scripts in Haskell, and since Haskell is a compiled language, the build system has to be compiled first, but it is worth it, since shake is mature and very fast (I'm pretty sure it is faster than Python, Ruby or JS build systems, and probably make too). You can use it to run tests as well.

I compile every source file to an object file, put the object files from each dependency into its own static library, and link everything together at the end.

But I should also admit that I'm not using dmd at all; I build everything with gdc/gcc. The only problem is handling dependencies and dependencies of dependencies if you have a lot of them.

[1] https://ruby.github.io/rake/
[2] https://shakebuild.com/
Jun 28 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jun 28, 2018 at 06:13:45PM +0000, Eugene Wissner via Digitalmars-d
wrote:
 On Thursday, 28 June 2018 at 16:24:07 UTC, H. S. Teoh wrote:
 On Thu, Jun 28, 2018 at 04:11:57PM +0000, crimaniak via Digitalmars-d
 wrote: [...]
 The problem is aggravated by the fact that DUB compiles all the
 sources in one DMD launch.
 Doesn't dub have an option to compile packages (i.e. subdirs)
 separately? Or does that only apply to dub packages, not to subdirs
 within a single project?
 As far as I can see, dub builds static libraries from dub dependencies.
 But it passes all source files from one dub project together to build
 the final static library/executable. You can actually use --verbose to
 see what commands are called.
Someone should file an enhancement ticket for this. [...]
 I compile every source file to an object file, put object files from
 dependencies to its own static libraries and link everything together
 at the end.
That's the traditional C/C++ approach, which is stable, if not the most suitable for D. I forget the details now, but I believe there are some savings in compiling template code if you compile multiple source files at once (assuming, of course, that they share some number of template instantiations). So personally I'd group source files by subdir and compile them in batches that way, then pass the resulting static libs to the parent dirs, and so on.
 But I should also admit, I'm not using dmd at all, but build
 everything with gdc/gcc. The only problem is handling dependencies and
 dependencies of dependencies if you have a lot of them.
[...] Unless you're doing imports using mixins, which is a questionable practice to begin with, it should be pretty easy to, say, grep for 'import' lines and build your dependency tree that way. Or use the tup approach of instrumenting the compiler with LD_LIBRARY_* so that any dependencies are automatically caught, and you don't have to worry about it (though that only accounts for dependencies on a specific run, it won't pick up imports across static-if branches, for example, though it's unlikely to be an actual problem in real-life code). T -- You have to expect the unexpected. -- RL
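The "grep for 'import' lines" idea can be sketched in a few lines of Python. This is a rough approximation under stated assumptions, not a D parser: the regular expression and the names `scan_imports` and `dependency_map` are made up for this sketch, and it deliberately ignores renamed imports, comma-separated import lists, and imports generated by mixins.

```python
import re

# Matches simple imports such as
#   import std.stdio;
#   public import pkg1.mod1;
#   import std.algorithm.searching : canFind;
# anywhere a line starts with them (module level or inside a function body).
IMPORT_RE = re.compile(
    r"^\s*(?:(?:public|private|static)\s+)*import\s+([\w.]+)\s*[;:]",
    re.MULTILINE,
)


def scan_imports(source_text):
    """Return the sorted, de-duplicated module names imported by one source file."""
    return sorted(set(IMPORT_RE.findall(source_text)))


def dependency_map(modules):
    """Map each project module to the project modules it imports.

    `modules` maps a module name (e.g. "pkg1.mod1") to its source text.
    Imports of modules outside the project (Phobos, dub packages) are dropped.
    """
    return {
        name: [m for m in scan_imports(text) if m in modules and m != name]
        for name, text in modules.items()
    }


if __name__ == "__main__":
    project = {
        "main.main": "import std.stdio;\nimport pkg1.mod1 : helper;\n",
        "pkg1.mod1": "module pkg1.mod1;\nimport std.range;\n",
    }
    print(dependency_map(project))
```

Because every textual import is picked up, imports inside static-if branches are counted even when the branch is not taken, so the result errs on the side of rebuilding too much rather than too little.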
Jun 28 2018
crimaniak <crimaniak gmail.com> writes:
On Thursday, 28 June 2018 at 16:24:07 UTC, H. S. Teoh wrote:
 On Thu, Jun 28, 2018 at 04:11:57PM +0000, crimaniak via 
 Digitalmars-d wrote: [...]
 The problem is aggravated by the fact that DUB compiles all 
 the sources in one DMD launch.
 Doesn't dub have an option to compile packages (i.e. subdirs)
 separately? Or does that only apply to dub packages, not to subdirs
 within a single project?
Yes, it's about dub packages, not about subdirs. And yes, it's possible to split a project into subpackages; that has its benefits and gives the code structure, but I hate the idea of being forced to do it just to fix compilation time. In addition, a typical large site has a large number of pages with Diet templates, and even if you separate the pages from the business logic, that part by itself is still very large and slow to compile. I had to give up Diet rendering on the server and switch to Vue+pug, which cut the development cycle for frontend changes from 2 minutes to near-instant.
 [...]

 This is one of the reasons I was not impressed by dub (sorry, 
 Sonke).  I continue to use SCons for my D projects...
I believe that this situation is bad for D as a whole. The problem with large projects can be solved by splitting into subpackages, or with SCons, or with reggae, or by writing a makefile by hand, etc., but a person who has just started writing in D does not know all this. They simply build their project and discover the problem as it grows. I think the modern programmer has the right to expect that the official build system will work well enough with a project of several thousand files.
Jun 28 2018
sarn <sarn theartofmachinery.com> writes:
On Thursday, 28 June 2018 at 16:24:07 UTC, H. S. Teoh wrote:
 I continue to use SCons for my D projects. For dub 
 dependencies, I just create a fake empty dub project with 
 declared dependencies and run that separately for refreshing 
 dependencies, but the actual compiling and linking is handled 
 by SCons.
Sounds like what I'm doing. I build the dependencies using dub on a dummy project, then use "dub describe" to find all the files I need and install them into my project directory. Then I can use whatever build system I want (currently tup).
Jun 28 2018
Nick Treleaven <nick geany.org> writes:
On Wednesday, 27 June 2018 at 16:00:37 UTC, Mario Silva wrote:
 Less than a year ago our project was taking around 17 seconds
 to compile - no libs requiring compilation - and maybe around
 50 seconds for full compilation, and it now takes around 50
 seconds for an incremental compilation and around 1.5 minutes
 for a full one.
If memory consumption has got worse in recent releases, it's possible it was my fault. I made a pull* to fix it, so it might be worth trying the 2.081.0 beta as it should have the fix merged. * https://github.com/dlang/dmd/pull/8281
Jul 01 2018
Jacob Carlborg <doob me.com> writes:
On 2018-06-27 18:00, Mario Silva wrote:
 Hello,
 
 Our code base has been growing steadily and it's currently at a point 
 where my 16GB machine just freezes when we're compiling our code. This 
 happens because DMD just consumes all my memory for a while.
 
 Also, it's taking a long time to compile it. Less than a year ago our
 project was taking around 17 seconds to compile - no libs requiring
 compilation - and maybe around 50 seconds for full compilation, and it
 now takes around 50 seconds for an incremental compilation and around
 1.5 minutes for a full one.
 
 For you guys to have an idea of the size of our project, we have 21151 
 lines of code and then 50933 in our libs. This is just our code without 
 counting dependencies fetched by dub like vibe.d for example.
 
 Are there any plans to work on compiler performance in terms of memory 
 consumption and compilation time?
 
 Any tips on how to code in a way that minimizes both compilation times 
 and memory consumption when compiling?
Unless this has already been stated: the issue is usually not the size of the project, but rather which compile-time features are used. For example, the DWT project has over 200k lines of code and compiles (to a library) in a couple of seconds. Try to reduce templates and CTFE code if possible, and prefer CTFE over templates.

--
/Jacob Carlborg
Jul 04 2018
"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 06/27/2018 12:00 PM, Mario Silva wrote:
 Hello,
 
 Our code base has been growing steadily and it's currently at a point 
 where my 16GB machine just freezes when we're compiling our code. This 
 happens because DMD just consumes all my memory for a while.
 
 Also, it's taking a long time to compile it. Less than a year ago our
 project was taking around 17 seconds to compile - no libs requiring
 compilation - and maybe around 50 seconds for full compilation, and it
 now takes around 50 seconds for an incremental compilation and around
 1.5 minutes for a full one.
 
Memory consumption is a known issue with the D frontend implementation. As others have said, CTFE never freeing memory is considered to be a big part of the issue here. The CTFE rewrite someone is doing should help with that when it's done.

Long compilation times are almost always the fault of Dub. Dub is pretty well-known for taking DMD's near-instant compile times and bloating them out to the ridiculous (by D standards) lengths you're experiencing.

I'd say try just writing a basic buildscript that runs DMD directly. Plus, that way you can customize it to your own project's needs: which files/packages get compiled together, and which get compiled separately.
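One piece such a build script needs is a way to decide which groups of files actually have to be recompiled. A minimal make-style sketch in Python, assuming each group is compiled into a single output file; the names `needs_rebuild` and `stale_outputs` are made up for this illustration.

```python
import os


def needs_rebuild(output, sources):
    """Return True if `output` is missing or older than any of `sources`."""
    if not os.path.exists(output):
        return True
    built = os.path.getmtime(output)
    return any(os.path.getmtime(src) > built for src in sources)


def stale_outputs(groups):
    """Given {output file: [source files]}, list the outputs that need recompiling."""
    return [out for out, srcs in sorted(groups.items()) if needs_rebuild(out, srcs)]
```

A script would call `stale_outputs` with its own grouping (for example one static library per package) and invoke dmd only for the returned entries. Note that this compares each output only against its own sources; a group that imports a module changed in another group would also need rebuilding, so a real script has to track imports as well.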
Jul 04 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 04, 2018 at 04:17:22PM -0400, Nick Sabalausky (Abscissa) via
Digitalmars-d wrote:
[...]
 Long compilation times are almost always the fault of Dub. Dub is
 pretty well-known for taking DMD's near-instant compile times and
 bloating them out to the ridiculous (by D standards) lengths you're
 experiencing. I'd say try just writing a basic buildscript that runs
 DMD directly. Plus, that way you can customize it to your own
 project's needs: Which files/packages get compiled together, which get
 compiled separately.
Yeah, I'm sorry to say that I have been quite disappointed with dub (as a build system), and currently have taken to avoiding it as much as possible. As a packaging system it's not bad, but it simply doesn't have the options I need to use it as a build tool.

But to be fair, it's not always dub's fault. Heavy template / CTFE usage is also known to bloat compilation times, sometimes by a lot. Earlier this year I discovered that the very act of importing std.format or std.regex causes compilation time to almost double. There were some issues related to compiling with -unittest that triggered this behaviour, though we found that the underlying issue was actually more tricky than it might first appear. I don't remember if this issue has been completely resolved yet.

T

--
You only live once.
Jul 05 2018