
digitalmars.D - On the performance of building D programs

reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
Recently I studied the performance of building a vibe.d example:
https://github.com/rejectedsoftware/vibe.d/issues/208

I wrote a few tools in the process; perhaps someone might find 
them useful as well.

However, I'd also like to discuss a related matter:

I noticed that compiling D programs in the usual manner (rdmd) is 
as much as 40% slower than it can be.

This is because before rdmd knows a full list of modules to be 
built, it must run dmd with -v -o-, and read its verbose output. 
Then, it feeds that output back to the compiler again, and passes 
all modules on the command line of the second run.

The problem with this approach is that DMD needs to parse, lex, 
run CTFE, instantiate templates, etc. etc. - everything except 
actual code generation / optimization / linking - twice. And code 
generation can actually be a small part of the total compilation 
time.
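The two-pass scheme can be sketched like this (Python here purely for illustration; rdmd itself is written in D, and the exact shape of dmd's -v import lines is an assumption):

```python
import re

# A dmd -v run prints one line per imported module, roughly:
#   import    app.util    (src/app/util.d)
# (the exact format is assumed here for illustration)
IMPORT_RE = re.compile(r"^import\s+([\w.]+)\s+\((.+)\)$")

def modules_from_verbose(output):
    """First pass: collect (module, file) pairs from dmd -v output,
    skipping modules already compiled into phobos/druntime."""
    mods = []
    for line in output.splitlines():
        m = IMPORT_RE.match(line)
        if not m:
            continue
        name, path = m.groups()
        if name == "object" or name.split(".")[0] in ("std", "core", "etc"):
            continue  # found in a library, not compiled again
        mods.append((name, path))
    return mods

sample = """\
import    object    (/usr/include/dmd/druntime/import/object.di)
import    std.stdio    (/usr/include/dmd/phobos/std/stdio.d)
import    app.util    (src/app/util.d)
"""
# The second dmd run would then get src/app/util.d on its command line.
print(modules_from_verbose(sample))
```

The point of the thread is that everything dmd did to produce that verbose output gets thrown away and redone in the second run.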

D code already compiles pretty quickly, but here's an opportunity 
to nearly halve that time (for some cases) - by moving some of 
rdmd's basic functionality into the compiler. DMD already knows 
which modules are used in the program, so it just needs two new 
options: one to enable this behavior (say, -r for recursive 
compilation), and one to specify an exclusion list, to indicate 
which modules are already compiled and will be found in a library 
(e.g. -rx). The default -rx settings can be placed in 
sc.ini/dmd.conf. I think we should seriously consider it.

Another appealing thing about the idea is that the compiler has 
access to information that would allow it to recompile programs 
more efficiently in the future. For example, it would be possible 
to get a hash of a module's public interface, so that a change in 
one function's code would not trigger a recompile of all modules 
that import it (assuming no CTFE).
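As a toy illustration of the interface-hash idea (Python; the "interface" here is just the source text with one level of function bodies stubbed out, standing in for the semantic public interface a real compiler would hash):

```python
import hashlib
import re

def interface_hash(source):
    """Toy 'public interface' hash: source text with one level of
    function bodies replaced by a placeholder, then hashed.  A real
    compiler would hash the semantic interface, not text."""
    stripped = re.sub(r"\{[^{}]*\}", "{...}", source)
    return hashlib.sha256(stripped.encode()).hexdigest()

v1 = "int square(int x) { return x * x; }"
v2 = "int square(int x) { return x*x; /* body tweaked */ }"
v3 = "long square(int x) { return x * x; }"

assert interface_hash(v1) == interface_hash(v2)  # body change: importers untouched
assert interface_hash(v1) != interface_hash(v3)  # signature change: recompile importers
```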
Apr 04 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 D code already compiles pretty quickly, but here's an 
 opportunity to nearly halve that time (for some cases) - by 
 moving some of rdmd's basic functionality into the compiler.

Making the D compiler search for its modules was one of the first (unwritten) enhancement requests. That's the right default for a handy compiler (plus a compiler switch to disable that behaviour). But for backwards compatibility I think the switch has to do the opposite: enable the recursive search. Bye, bearophile
Apr 04 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 00:39:49 UTC, bearophile wrote:
 Vladimir Panteleev:

 D code already compiles pretty quickly, but here's an 
 opportunity to nearly halve that time (for some cases) - by 
 moving some of rdmd's basic functionality into the compiler.

Making the D compiler search for its modules was one of the first (unwritten) enhancement requests. That's the right default for a handy compiler (plus a compiler switch to disable that behaviour). But for backwards compatibility I think the switch has to do the opposite: enable the recursive search.

Yes, I agree completely. D is the only programming language I know of that has both a module system and the archaic C/C++ compilation model. Even Pascal got this right! I think Rust got off on the right foot in this regard. The compiler ("rustc") is a low-level detail which most users don't interact with directly; instead, there's "rust", a tool which acts as the user-friendly interface for the compiler, package manager, documentation tool, etc. This is in contrast with D, where the tool in the spotlight (and with the shorter name) will throw cryptic linker errors at you if you don't specify all modules on its command line.
Apr 04 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 00:39:49 UTC, bearophile wrote:
 Vladimir Panteleev:

 D code already compiles pretty quickly, but here's an 
 opportunity to nearly halve that time (for some cases) - by 
 moving some of rdmd's basic functionality into the compiler.

Making the D compiler search for its modules was one of the first (unwritten) enhancement requests. That's the right default for a handy compiler (plus a compiler switch to disable that behaviour). But for backwards compatibility I think the switch has to do the opposite: enable the recursive search.

Actually, that switch is -c; the default behavior is "compile and link". Come to think of it, I can't think of any instance where making -r the default, when only one module is specified on the command line, would break anything.
Apr 04 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 05 Apr 2013 00:49:08 +0200
"Vladimir Panteleev" <vladimir thecybershadow.net> wrote:

 Recently I studied the performance of building a vibe.d example:
 https://github.com/rejectedsoftware/vibe.d/issues/208
 
 I wrote a few tools in the process, perhaps someone might find 
 them useful as well.
 
 However, I'd also like to discuss a related matter:
 
 I noticed that compiling D programs in the usual manner (rdmd) is 
 as much as 40% slower than it can be.
 
 This is because before rdmd knows a full list of modules to be 
 built, it must run dmd with -v -o-, and read its verbose output. 
 Then, it feeds that output back to the compiler again, and passes 
 all modules on the command line of the second run.
 
 The problem with this approach is that DMD needs to parse, lex, 
 run CTFE, instantiate templates, etc. etc. - everything except 
 actual code generation / optimization / linking - twice. And code 
 generation can actually be a small part of the total compilation 
 time.
 
 D code already compiles pretty quickly, but here's an opportunity 
 to nearly halve that time (for some cases) - by moving some of 
 rdmd's basic functionality into the compiler. DMD already knows 
 which modules are used in the program, so it just needs two new 
 options: one to enabled this behavior (say, -r for recursive 
 compilation), and one to specify an exclusion list, to indicate 
 which modules are already compiled and will be found in a library 
 (e.g. -rx). The default -rx settings can be placed in 
 sc.ini/dmd.conf. I think we should seriously consider it.
 
 Another appealing thing about the idea is that the compiler has 
 access to information that would allow it to recompile programs 
 more efficiently in the future. For example, it would be possible 
 to get a hash of a module's public interface, so that a change in 
 one function's code would not trigger a recompile of all modules 
 that import it (assuming no CTFE).

About a year or two ago, Andrei proposed a system that addressed that issue together with a substitute for a package manager. I was against it for various reasons, and for the pseudo-package-manager part of it I still am. But I've since become more open to the "make RDMD only call DMD once" part. *That* part *might* not be a bad idea, although I can't remember quite how it worked. Something about passing DMD a command-line tool to invoke whenever it finds a newly imported module? If it was like that, it could very well slow things down on Windows, as Windows is notoriously slow at launching processes. Or just fold RDMD functionality into DMD itself, like you said... which would probably be easier anyway.
Apr 04 2013
prev sibling next sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
I was very unpleasantly surprised to learn from that thread that 
dmd does everything but object file creation during rdmd's 
dependency tracking step. That feels very inefficient, and 
probably the only reason it hasn't caught any attention is dmd's 
generally fast compilation speed. This case begs for better 
integration between dmd and rdmd, or even moving some 
functionality of rdmd into the front-end.
Apr 05 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/5/13 3:29 AM, Dicebot wrote:
 I was very unpleasantly surprised to learn from that thread that dmd
 does everything but object file creation during rdmd's dependency
 tracking step. That feels very inefficient, and probably the only reason
 it hasn't caught any attention is dmd's generally fast compilation speed.
 This case begs for better integration between dmd and rdmd, or even
 moving some functionality of rdmd into the front-end.

Well, rdmd caches the result of the dependency collection step, and it won't rebuild dependencies unless necessary. Anyhow, I think it makes a lot of sense for dmd to automatically link in the .d files imported by the main program being compiled. A bunch of language features (e.g. no classes spread across modules) and D's approach to modularity (e.g. separate notions of .di and .d) make that natural. Andrei
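The caching boils down to a staleness check of the build product against the saved dependency list; a minimal sketch (Python; the file names are made up for the demo):

```python
import os
import tempfile
import time

def up_to_date(target, deps):
    """A build product is current if it exists and is at least as new
    as every dependency (illustrative mtime-based check)."""
    if not os.path.exists(target):
        return False
    t = os.path.getmtime(target)
    return all(os.path.exists(d) and os.path.getmtime(d) <= t for d in deps)

work = tempfile.mkdtemp()
src = os.path.join(work, "app.d")
exe = os.path.join(work, "app")

open(src, "w").close()
assert not up_to_date(exe, [src])    # never built: rebuild

open(exe, "w").close()
now = time.time()
os.utime(exe, (now + 1, now + 1))    # pretend the build just finished
assert up_to_date(exe, [src])        # cache hit: skip dmd entirely

os.utime(src, (now + 2, now + 2))    # source edited after the build
assert not up_to_date(exe, [src])    # stale: rebuild
```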
Apr 05 2013
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 it must run dmd with -v -o-, and read its verbose output.

IIRC xfbuild used to work around this by reading the verbose output and building simultaneously; in other words, it used 'dmd -deps' without the -o- switch. There's a function called compileAndTrackDeps which did this, but the entire codebase was very hacky, which led me to abandon the port. Note that using pipes instead of file I/O to read DMD's verbose output can speed up the entire build process (the new std.process comes in handy for this). However, I've come to the same conclusion as you: DMD could do all of this on its own in one process invocation. It wouldn't have to re-parse and re-process any dependency more than once, which could be a significant boost in speed. Plus, it would be a boon for people new to D who expect things to work in a modern way (no nasty linker errors). Preferably we could introduce another switch, -rxp, which is used to exclude entire packages, e.g.: dmd -r -rx=somemodule.d -rxp=some.package (the = were just added for readability)
Apr 05 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-05 15:45, Vladimir Panteleev wrote:

 -r should be disabled for such cases - thus, enabled only when there's
 one .d file and no .obj / .lib files. Although specifying library files
 on the compiler's command line is a valid use case, compatible with
 recursive compilation, I think we should promote the use of pragma(lib)
 instead.

Why not when there's multiple source files? -- /Jacob Carlborg
Apr 05 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-06 04:05, Vladimir Panteleev wrote:

 1) It will break tools like rdmd, for cases when the tool knows the
 exact set of modules that needs to be compiled
 2) When would that be useful?

When building libraries of course. -- /Jacob Carlborg
Apr 06 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-06 13:02, Vladimir Panteleev wrote:

 Why would you want recursive compilation plus specifying multiple modules?

If you're building a library which has a dependency on another library, which you build from source. Perhaps not that common.
 For a library, you generally know the exact set of modules to be built.
 If not in a makefile / build script, you could write one dummy module
 that imports all of the library's components, and use that as the root
 of incremental compilation.

Using a dummy module is ugly. -- /Jacob Carlborg
Apr 06 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-04-06 16:22, Andrej Mitrovic wrote:

 It doesn't necessarily have to be dummy. I like to use an 'all.d'
 module which imports all of my library modules. I've used this
 technique to run unittest via "rdmd --main mylib\all.d". But being
 able to recursively compile a static library would be pretty nice.

Actually, just specifying a directory and having the compiler (or some tool) compile all the files in it recursively would be really nice to have. -- /Jacob Carlborg
Apr 07 2013
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Apr-2013 18:12, Paulo Pinto writes:
 On Friday, 5 April 2013 at 12:51:58 UTC, Andrej Mitrovic wrote:
 On 4/5/13, Paulo Pinto <pjmlp progtools.org> wrote:
 By checking if foo.obj is outdated in regards to foo.d

The compiler doesn't know that foo.d was built into foo.obj, it has no way of knowing that.


Something as simple as adding an MD5 digest plus a mime-type of the source file as a "fingerprint" in a separate section would have worked. It wouldn't even need to change how object files work, but it would preserve a 1:1 mapping of sources to object files. We could have gone this way in D, but since C objects produced by other compilers won't have this ability, it would be quite limited (and conservative).
 Hence my remark of a modern linker.

Yup, UNIX is so UNIX. -- Dmitry Olshansky
Apr 05 2013
prev sibling parent reply Martin Nowak <code dawg.eu> writes:
On 04/05/2013 12:18 PM, Andrej Mitrovic wrote:
 On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 it must run dmd with -v -o-, and read its verbose output.

IIRC xfbuild used to work around this by reading the verbose output and building simultaneously; in other words, it used 'dmd -deps' without the -o- switch. There's a function called compileAndTrackDeps which did this, but the entire codebase was very hacky, which led me to abandon the port. Note that using pipes instead of file I/O to read DMD's verbose output can speed up the entire build process (the new std.process comes in handy for this).

It would be good to use IPC instead of an a priori -rb -rx argument list. The idea is that whenever dmd imports a module, it asks rdmd (or another driver) whether that module should be compiled. The driver could then check its cache for an existing module object to decide this. The driver cannot decide this a priori because its dependency information is outdated; it could only heuristically list available packages/modules.

1. The driver has no knowledge about foo.bar's dependencies:

dmd -c foo/bar -rb -rxstd.*
----
module foo.bar;
import std.stdio, foo.baz;
// ...
----

2. The driver assumes that foo.bar depends on foo.baz and has a cached object for foo.baz:

dmd -c foo/bar -rb -rxstd.* -rxfoo.baz
----
module foo.bar;
import std.stdio, something.else;
// ...
----
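A sketch of the per-module query the driver would answer (Python; the Driver class, its method names, and the protocol are all hypothetical):

```python
class Driver:
    """Hypothetical build driver answering the compiler's per-module
    'should this be compiled?' queries from its object cache."""

    def __init__(self, cached, excluded_prefixes):
        self.cached = set(cached)             # modules with a fresh object file
        self.excl = tuple(excluded_prefixes)  # prebuilt libraries, e.g. std./core.

    def should_compile(self, module):
        if module == "object" or module.startswith(self.excl):
            return False  # found in a prebuilt library: link only
        return module not in self.cached      # compile iff no cached object

drv = Driver(cached={"foo.baz"}, excluded_prefixes=("std.", "core."))
assert drv.should_compile("foo.bar")          # unknown module: compile it
assert not drv.should_compile("foo.baz")      # cached object: reuse it
assert not drv.should_compile("std.stdio")    # phobos: never recompiled
```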
Apr 07 2013
parent Martin Nowak <code dawg.eu> writes:
On 04/07/2013 07:14 PM, Andrej Mitrovic wrote:
 On 4/7/13, Martin Nowak <code dawg.eu> wrote:
 2.
 driver assumes that foo.bar depends on foo.baz and has a cached obj for
 foo.baz:
 dmd -c foo/bar -rb -rxstd.* -rxfoo.baz

How is this a problem? If foo.baz is not needed, the linker won't link it in. All the driver has to do is list the modules which shouldn't be recompiled if their modification time hasn't changed.

The point is to know in advance which modules you don't need to recompile. With the current approach you're first recompiling before you know the dependencies. It's also the reason why you hardcoded std.* and core.* into the exclude list. But this will miss other prebuilt libraries, e.g. vibe.
Apr 07 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 10:18:25 UTC, Andrej Mitrovic wrote:

(agreed with everything not quoted)

 dmd -r -rx=somemodule.d -rxp=some.package (The = were just 
 added for
 readability)

I think -r is redundant, and should be the default action if only one module is given on DMD's command line. I can't think of plausible situations where this could be a problem. Considering that you can't have a module with the same name as a package, the same syntax for excluding both can be used, e.g. "-rxcrc32 -rxstd".
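The exclusion matching could look something like this (a Python sketch; treating a bare -rx name as excluding both a module and a whole package of that name is the behavior proposed above, and the wildcard form follows the std.* suggestion elsewhere in the thread):

```python
from fnmatch import fnmatchcase

def excluded(module, patterns):
    """-rx-style exclusion matching.  A bare name such as 'std' excludes
    both a module and a package of that name; patterns may also use
    wildcards such as 'std.*'."""
    for p in patterns:
        if fnmatchcase(module, p) or module == p or module.startswith(p + "."):
            return True
    return False

assert excluded("std.stdio", ["std"])       # bare name excludes the package
assert excluded("crc32", ["crc32"])         # ...or a plain module
assert excluded("gtk.Main", ["gtk.*"])      # wildcard form
assert not excluded("app.main", ["std", "core"])
```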
Apr 05 2013
prev sibling next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 5 April 2013 at 00:51:38 UTC, Vladimir Panteleev wrote:
 On Friday, 5 April 2013 at 00:39:49 UTC, bearophile wrote:
 Vladimir Panteleev:

 D code already compiles pretty quickly, but here's an 
 opportunity to nearly halve that time (for some cases) - by 
 moving some of rdmd's basic functionality into the compiler.

Make the D compiler search for its modules was one of the first (unwritten) enhancement requests. That's the right default for a handy compiler (plus a compiler switch to disable that behavour). But for backwards compatibility I think that switch has to do the opposite, to enable the recursive search.

Yes, I agree completely. D is the only programming language I know that has both a module system, and the archaic C/C++ compilation model. Even Pascal got this right!

Yep, units exist since UCSD Pascal.
Apr 05 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 I think -r is redundant, and should be the default action if only
 one module is given on DMD's command line. I can't think of
 plausible situations where this could be a problem.

$ dmd main.d foo.obj

If main imports 'foo', how will DMD know whether to compile foo.d or just link with foo.obj? You could pass -rxfoo, but you'd still be breaking all existing build scripts which rely on non-recursive building.
 Considering that you can't have a module with the same name as a
 package, the same syntax for excluding both can be used, e.g.
 "-rxcrc32 -rxstd".

I guess it could work. But I'm hoping one day we'll be able to lift that restriction, it's quite a pain in the ass when porting C++ code that uses namespaces to D. So for future-compatibility I thought separating module and package switches might be nice. It's not a big deal though.
Apr 05 2013
prev sibling next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 5 April 2013 at 11:29:20 UTC, Andrej Mitrovic wrote:
 On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> 
 wrote:
 I think -r is redundant, and should be the default action if 
 only
 one module is given on DMD's command line. I can't think of
 plausible situations where this could be a problem.

$ dmd main.d foo.obj

If main imports 'foo', how will DMD know whether or not to compile foo.d or just link with foo.obj?

By checking if foo.obj is outdated with respect to foo.d. Turbo Pascal did this, so it shouldn't be too hard. On the other hand, it was using a modern linker, not one with UNIX-compatible semantics. -- Paulo
Apr 05 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Paulo Pinto <pjmlp progtools.org> wrote:
 By checking if foo.obj is outdated in regards to foo.d

The compiler doesn't know that foo.d was built into foo.obj, it has no way of knowing that.
Apr 05 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 11:29:20 UTC, Andrej Mitrovic wrote:
 On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> 
 wrote:
 I think -r is redundant, and should be the default action if 
 only
 one module is given on DMD's command line. I can't think of
 plausible situations where this could be a problem.

$ dmd main.d foo.obj

If main imports 'foo', how will DMD know whether or not to compile foo.d or just link with foo.obj?

-r should be disabled for such cases - thus, enabled only when there's one .d file and no .obj / .lib files. Although specifying library files on the compiler's command line is a valid use case, compatible with recursive compilation, I think we should promote the use of pragma(lib) instead.
 You could pass -rxfoo, but you'd still be breaking all existing 
 build
 scripts which rely on non-recursive building.

Yes, the goal is to not affect existing scripts.
 Considering that you can't have a module with the same name as 
 a
 package, the same syntax for excluding both can be used, e.g.
 "-rxcrc32 -rxstd".

I guess it could work. But I'm hoping one day we'll be able to lift that restriction, it's quite a pain in the ass when porting C++ code that uses namespaces to D. So for future-compatibility I thought separating module and package switches might be nice. It's not a big deal though.

Another option is wildcards (std.*).
Apr 05 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 Another option is wildcards (std.*).

Yep, I use this for unittesting and it works nicely. On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 -r should be disabled for such cases - thus, enabled only when there's one .d
file and no .obj / .lib files.

The .lib file could be an import library. On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 I think we should promote the use of pragma(lib) instead.

I remember other compiler writers saying pragma should not be used (I think it was GDC). Anyway, it's easier to reason about a feature if you only have to pass a single switch to enable it, rather than having to remember some odd set of rules.
Apr 05 2013
prev sibling next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 5 April 2013 at 12:51:58 UTC, Andrej Mitrovic wrote:
 On 4/5/13, Paulo Pinto <pjmlp progtools.org> wrote:
 By checking if foo.obj is outdated in regards to foo.d

The compiler doesn't know that foo.d was built into foo.obj, it has no way of knowing that.

Hence my remark of a modern linker.
Apr 05 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 05 Apr 2013 15:45:15 +0200
"Vladimir Panteleev" <vladimir thecybershadow.net> wrote:
 
 -r should be disabled for such cases - thus, enabled only when 
 there's one .d file and no .obj / .lib files.

No, including static libs while using recursive compilation is definitely a very real need.
Apr 05 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Paulo Pinto <pjmlp progtools.org> wrote:
 Hence my remark of a modern linker.

Yeah, but we're looking for something that works now. A modern linker for D is a distant dream at the moment, unfortunately...
Apr 05 2013
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 5 Apr 2013 16:02:22 +0200
schrieb Andrej Mitrovic <andrej.mitrovich gmail.com>:

 On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 Another option is wildcards (std.*).

Yep, I use this for unittesting and it works nice. On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 -r should be disabled for such cases - thus, enabled only when
 there's one .d file and no .obj / .lib files.

The .lib file could be an import library. On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 I think we should promote the use of pragma(lib) instead.

I remember other compiler writers saying pragma should not be used (I think GDC was it).

It's not that we don't want to support it; we can't(*) implement pragma(lib) in gdc due to gcc's architecture (the split into the compiler proper (cc1d) and the link driver (gdc); cc1d can't communicate with gdc). * With some effort (read: hacks) it could be possible, but it would likely prevent gdc from being merged into upstream gcc, so...
Apr 05 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 14:01:31 UTC, Jacob Carlborg wrote:
 On 2013-04-05 15:45, Vladimir Panteleev wrote:

 -r should be disabled for such cases - thus, enabled only when 
 there's
 one .d file and no .obj / .lib files. Although specifying 
 library files
 on the compiler's command line is a valid use case, compatible 
 with
 recursive compilation, I think we should promote the use of 
 pragma(lib)
 instead.

Why not when there's multiple source files?

1) It will break tools like rdmd, for cases when the tool knows the exact set of modules that needs to be compiled 2) When would that be useful?
Apr 05 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 5 April 2013 at 14:02:36 UTC, Andrej Mitrovic wrote:
 I remember other compiler writers saying pragma should not be 
 used (I
 think GDC was it).

I see, I didn't realize pragma(lib) wasn't implemented in GDC.
 Anyway, It's easier to reason about a feature if you only had 
 to pass
 a single switch to enable the feature rather than having to 
 remember
 some odd set of rules.

Certainly; but at the same time, I think this is an opportunity to make the tool do the right thing with the least keystrokes.
Apr 05 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/6/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 Certainly; but at the same time, I think this is an opportunity
 to make the tool do the right thing with the least keystrokes.

Hmm yeah. Well we should definitely give implementing it a try and see what the build times are like.
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 6 April 2013 at 10:56:19 UTC, Jacob Carlborg wrote:
 On 2013-04-06 04:05, Vladimir Panteleev wrote:

 1) It will break tools like rdmd, for cases when the tool 
 knows the
 exact set of modules that needs to be compiled
 2) When would that be useful?

When building libraries of course.

Why would you want recursive compilation plus specifying multiple modules? For a library, you generally know the exact set of modules to be built. If not in a makefile / build script, you could write one dummy module that imports all of the library's components, and use that as the root of incremental compilation.
Apr 06 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/6/13, Jacob Carlborg <doob me.com> wrote:
 For a library, you generally know the exact set of modules to be built.
 If not in a makefile / build script, you could write one dummy module
 that imports all of the library's components, and use that as the root
 of incremental compilation.

Using a dummy module is ugly.

It doesn't necessarily have to be dummy. I like to use an 'all.d' module which imports all of my library modules. I've used this technique to run unittest via "rdmd --main mylib\all.d". But being able to recursively compile a static library would be pretty nice.
Apr 06 2013
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/5/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 I noticed that compiling D programs in the usual manner (rdmd) is
 as much as 40% slower than it can be.

I've implemented the -rb and -rx switches (the -r switch itself is taken; it's an undocumented switch). These switches are used to enable recursive builds and to exclude modules/packages, respectively[1]. The -rx switch can take a module name or a wildcard pattern, e.g. -rxfoo.* to exclude all modules in the package "foo". I'm seeing about 30% faster clean builds when using DMD with this feature compared to using RDMD. But I've only tested this on smaller projects; I wonder what the impact is on larger projects. The std and core packages, and the object module, are implicitly excluded from building. [1]: https://github.com/AndrejMitrovic/dmd/tree/BuildRecurse
Apr 06 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/6/13 1:14 PM, Andrej Mitrovic wrote:
 On 4/5/13, Vladimir Panteleev<vladimir thecybershadow.net>  wrote:
 I noticed that compiling D programs in the usual manner (rdmd) is
 as much as 40% slower than it can be.

I've implemented the -rb and -rx switches (the -r switch itself is taken; it's an undocumented switch). These switches are used to enable recursive builds and to exclude modules/packages, respectively[1]. The -rx switch can take a module name or a wildcard pattern, e.g. -rxfoo.* to exclude all modules in the package "foo". I'm seeing about 30% faster clean builds when using DMD with this feature compared to using RDMD. But I've only tested this on smaller projects; I wonder what the impact is on larger projects. The std and core packages, and the object module, are implicitly excluded from building. [1]: https://github.com/AndrejMitrovic/dmd/tree/BuildRecurse

This is a dramatic improvement, and better build times are always well received. Do you plan to convert your work into a pull request? For larger projects I wonder if it's possible to restrict scanning only to specific directories. That way building becomes a modularity enforcer. Andrei
Apr 06 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/13 3:17 AM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 00:23:06 UTC, Andrei Alexandrescu wrote:
 For larger projects I wonder if it's possible to restrict scanning
 only to specific directories. That way building becomes a modularity
 enforcer.

I'm not sure what this means. Could you expand a little, please?

I mean, say we have a project consisting of business, graphics, and glue. We wouldn't want business and graphics to directly use each other, so they'd each be built with -I settings that make accidental dependencies impossible (in spite of the automatic dependency detection). Andrei
Apr 07 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/6/13 9:38 PM, Vladimir Panteleev wrote:
 On Saturday, 6 April 2013 at 17:52:40 UTC, Andrej Mitrovic wrote:
 It definitely helps when you're building something from scratch.

Great stuff!
 But RDMD can track changes to dependencies, which DMD still can't do.
 I want to try implementing this feature in DMD and see the speed
 difference for incremental builds.

Why not adapt rdmd so that it takes advantage of the new switches? Is there any benefit to having dependency change checks in dmd instead of rdmd?

Yah, I think it would be great to dedicate rdmd to the dependency/caching part and leave the build to the compiler. One possibility would be to run the build and the dependency saving in parallel (!). Andrei
Apr 06 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/6/13 10:12 PM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 02:04:54 UTC, Andrei Alexandrescu wrote:
 Yah, I think it would be great to dedicate rdmd to the
 dependency/caching part and leave the build to the compiler. One
 possibility would be to run the build and the dependency saving in
 parallel (!).

Why in parallel and not in one go? (-v with -rb and without -o-)

I'm not sure how rdmd would distinguish dmd's output from the output of the running program. Oh wait, the run will be a distinct step - awesome. Andrei
Apr 06 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/13 12:07 AM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 02:16:10 UTC, Andrei Alexandrescu wrote:
 Why in parallel and not in one go? (-v with -rb and without -o-)

 I'm not sure how rdmd would distinguish dmd's output from the output of the running program. Oh wait, the run will be a distinct step - awesome.

I'll have a go at the rdmd side of this. Any reason not to use the deps file as the build witness?

There was a reason, but darn I forgot it... Andrei
Apr 06 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/13 12:26 AM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 04:25:19 UTC, Andrei Alexandrescu wrote:
 On 4/7/13 12:07 AM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 02:16:10 UTC, Andrei Alexandrescu wrote:
 Why in parallel and not in one go? (-v with -rb and without -o-)

 I'm not sure how rdmd would distinguish dmd's output from the output of the running program. Oh wait, the run will be a distinct step - awesome.

I'll have a go at the rdmd side of this. Any reason not to use the deps file as the build witness?

There was a reason, but darn I forgot it...

I think I figured it out - --makedepend will update the deps file, but not the executable.

I think there's another one related to a bug I recently fixed. There needs to be a file touched after the executable has been built when people use -of to place the executable in a specific place. https://github.com/D-Programming-Language/tools/commit/e14e0378375d1bb586871f8e66cc501dac64a7e1 Andrei
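The witness-file scheme described here can be sketched roughly as follows (a hypothetical Python sketch of the idea, not rdmd's actual code; function and file names are made up):

```python
import os

def is_up_to_date(witness, sources):
    """A build is current if the witness file exists and is at least as
    new as every source file that took part in the build."""
    if not os.path.exists(witness):
        return False
    witness_time = os.path.getmtime(witness)
    return all(os.path.getmtime(src) <= witness_time for src in sources)

def touch_witness(witness):
    """Touch the witness after a successful build; this works even when
    -of places the executable somewhere the driver doesn't control."""
    with open(witness, "a"):
        pass
    os.utime(witness, None)
```

The point of the separate witness is exactly the case Andrei mentions: the executable's own timestamp may live at an arbitrary -of path, so the driver stamps a file it owns instead.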
Apr 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrej Mitrovic:

 But I've only tested this on smaller
 projects, I wonder what the impact is on larger projects.

I think the recursive scan is mostly meant for small projects. I think large projects will usually use some kind of build scripts. Bye, bearophile
Apr 06 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/6/13, bearophile <bearophileHUGS lycos.com> wrote:
 I think the recursive scan is mostly meant for small projects. I
 think large projects will usually use some kind of build scripts.

A gtkD benchmark:

$ C:\dev\projects\GtkD\demos\gtk>timeit rdmd --build-only --force -IC:\dev\projects\GtkD\src HelloWorld.d
 Done in 26_247_732 usecs.

$ C:\dev\projects\GtkD\demos\gtk>timeit dmd -rb -IC:\dev\projects\GtkD\src HelloWorld.d
 Done in 22_826_820 usecs.

4 seconds shaved off. Now let's try with a prebuilt static library:

$ timeit rdmd C:\dev\projects\GtkD\src\GtkD.lib --exclude=atk --exclude=cairo --exclude=gthread --exclude=gobject --exclude=glib --exclude=gio --exclude=gdk --exclude=gdkpixbuf --exclude=gtk --exclude=gtkc --exclude=gtkD --exclude=pango --build-only --force -IC:\dev\projects\GtkD\src HelloWorld.d
 Done in 3_100_329 usecs.

$ timeit dmd -rb C:\dev\projects\GtkD\src\GtkD.lib -rxatk.* -rxcairo.* -rxgthread.* -rxgobject.* -rxglib.* -rxgio.* -rxgdk.* -rxgdkpixbuf.* -rxgtk.* -rxgtkc.* -rxgtkD.* -rxpango.* -IC:\dev\projects\GtkD\src HelloWorld.d
 Done in 1_663_442 usecs.

It definitely helps when you're building something from scratch. But RDMD can track changes to dependencies, which DMD still can't do. I want to try implementing this feature in DMD and see the speed difference for incremental builds.
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 6 April 2013 at 17:52:40 UTC, Andrej Mitrovic wrote:
 It definitely helps when you're building something from scratch.

Great stuff!
 But RDMD can track changes to dependencies, which DMD still 
 can't do.
 I want to try implementing this feature in DMD and see the speed
 difference for incremental builds.

Why not adapt rdmd so that it takes advantage of the new script? Is there any benefit of having dependency change checks in dmd instead of rdmd?
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 02:04:54 UTC, Andrei Alexandrescu 
wrote:
 Yah, I think it would be great to dedicate rdmd to the 
 dependency/caching part and leave the build to the compiler. 
 One possibility would be to run the build and the dependency 
 saving in parallel (!).

Why in parallel and not in one go? (-v with -rb and without -o-)
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 02:16:10 UTC, Andrei Alexandrescu 
wrote:
 Why in parallel and not in one go? (-v with -rb and without 
 -o-)

I'm not sure how rdmd would distinguish dmd's output from the output of the running program. Oh wait, the run will be a distinct step - awesome.

I'll have a go at the rdmd side of this. Any reason not to use the deps file as the build witness?
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 04:25:19 UTC, Andrei Alexandrescu 
wrote:
 On 4/7/13 12:07 AM, Vladimir Panteleev wrote:
 On Sunday, 7 April 2013 at 02:16:10 UTC, Andrei Alexandrescu 
 wrote:
 Why in parallel and not in one go? (-v with -rb and without 
 -o-)

I'm not sure how rdmd would distinguish dmd's output from the output of the running program. Oh wait, the run will be a distinct step - awesome.

I'll have a go at the rdmd side of this. Any reason not to use the deps file as the build witness?

There was a reason, but darn I forgot it...

I think I figured it out - --makedepend will update the deps file, but not the executable.
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 04:07:55 UTC, Vladimir Panteleev wrote:
 I'll have a go at the rdmd side of this.

https://github.com/CyberShadow/tools/compare/BuildRecurse
Apr 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 00:23:06 UTC, Andrei Alexandrescu 
wrote:
 For larger projects I wonder if it's possible to restrict 
 scanning only to specific directories. That way building 
 becomes a modularity enforcer.

I'm not sure what this means. Could you expand a little, please?
Apr 07 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 6 April 2013 at 17:52:40 UTC, Andrej Mitrovic wrote:
 But RDMD can track changes to dependencies, which DMD still 
 can't do.
 I want to try implementing this feature in DMD and see the speed
 difference for incremental builds.

By "incremental builds", do you mean skipping compilation of certain modules, and re-using object files from previous builds?

Currently, rdmd deletes all generated object files after building, to save disk space, as they are not reused in subsequent builds. The only things rdmd caches are the full list of source files in the program (so it knows which files to check to see if they've been edited), and the executable.

Actual incremental compilation with just one compiler invocation would certainly be a big speedup for large projects. As I understand, there were some problems with how dmd places data in object files that prevented implementing this; however, that's been put into question recently:
http://forum.dlang.org/post/mailman.508.1365074861.4724.digitalmars-d puremagic.com

I wonder if it's somehow related to Dicebot's findings in another recent thread:
http://forum.dlang.org/post/gxscuhrvcpnuzvgyykyp forum.dlang.org

The compiler's verbose (-v) output, which is what rdmd currently uses, does not specify which modules depend on which - it simply lists all the files that took part in the compilation. There is a separate option, -deps, which lists actual inter-module dependencies - judging by the identifiers, this is what rdmd used at some point. However, -deps output does not specify things such as import()-ed files. eskimor on IRC attempted to rectify that, here:
https://github.com/D-Programming-Language/dmd/pull/1839

However, as we've discussed, I think improving -v output would be better than adding a new switch to the compiler.
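For illustration, here is a rough sketch of how a driver could turn per-module dependency lines into a graph (Python; the "a.b (a/b.d) : private : c.d (c/d.d)" line shape is an approximation for the sake of the example, not a specification of dmd's -deps output):

```python
import re
from collections import defaultdict

# Approximate shape of a dependency line: importing module, its file,
# a visibility word, then the imported module and its file.
DEP_LINE = re.compile(
    r"^(\S+)\s+\([^)]*\)\s*:\s*\S+\s*:\s*(\S+)\s+\([^)]*\)")

def parse_deps(lines):
    """Collect a module -> set-of-imported-modules graph from
    dependency-listing lines; unrecognized lines are skipped."""
    graph = defaultdict(set)
    for line in lines:
        m = DEP_LINE.match(line)
        if m:
            graph[m.group(1)].add(m.group(2))
    return graph
```

With a graph like this the driver knows exactly which importers a changed module can affect, which the flat -v file list cannot tell it.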
Apr 07 2013
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Do you plan to convert your work into a pull request?

Sure.

On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 By "incremental builds", do you mean skipping compilation of
 certain modules, and re-using object files from previous builds?

Yeah.

On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 Why not adapt rdmd so that it takes advantage of the new script?

Yeah, RDMD could take advantage of -rx to avoid building specific modules. It slipped my mind.
 Currently, rdmd deletes all generated modules after building, to
 save disk space, as those modules are not reused in subsequent
 builds.

Hmm.. ok, so we have to add an incremental option to RDMD. If I'm not mistaken, the build process for the following project would look like:

main.d -> foo.d -> bar.d -> doo.d

1. Invoke DMD to build recursively, and simultaneously fetch the dependencies to RDMD, which will store the dependencies to a file:

$ dmd -rb main.d -v

2. On the second build (let's assume foo.d changed), RDMD would read the dependencies, check the modification time of each file, and, knowing only foo.d has to be rebuilt, run DMD again with:

$ dmd -rb main.d -rxbar bar.obj -rxdoo doo.obj

This assumes it keeps the object files on disk and doesn't delete them. Would this work?

On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 As I understand, there were some problems with how dmd places data in object
files that prevented implementing this, however that's been put into question
recently:
 http://forum.dlang.org/post/mailman.508.1365074861.4724.digitalmars-d puremagic.com

Yeah, I can't reproduce this bug. h3r3tic also had some test case somewhere which I couldn't reproduce [1].

[1]: https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable
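The modification-time check in step 2 above could look roughly like this (a hypothetical Python sketch of the driver logic with made-up names; the -rx flags it would feed are the proposed dmd feature, not something this sketch invokes):

```python
import os

def modules_to_rebuild(deps, objdir):
    """Given a map of module name -> source path, return the modules
    whose source is newer than the cached object file from the previous
    build (or which have no cached object file yet)."""
    stale = []
    for module, source in deps.items():
        obj = os.path.join(objdir, module + ".obj")
        if (not os.path.exists(obj)
                or os.path.getmtime(source) > os.path.getmtime(obj)):
            stale.append(module)
    return stale
```

Every module *not* in the returned list would then be handed back to the compiler as an -rx exclusion plus its object file, exactly as in the `dmd -rb main.d -rxbar bar.obj -rxdoo doo.obj` example.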
Apr 07 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/13 7:45 AM, Andrej Mitrovic wrote:
 Hmm.. ok so we have to add an incremental option to RDMD. If I'm not
 mistaken the build process for the following project would look like:

 main.d ->  foo.d ->  bar.d ->  doo.d

 1. Invoke DMD to build recursively, and simultaneously fetch the
 dependencies to RDMD which will store the dependencies to a file:

 $ dmd -rb main.d -v

 2. On the second build (let's assume foo.d changed), RDMD would read
 the dependencies, check the modification time of each file, and
 knowing only foo.d has to be rebuilt it would run DMD again with:

 $ dmd -rb main.d -rxbar bar.obj -rxdoo doo.obj

 This assumes it keeps the object files on disk and doesn't delete them.

 Would this work?

Yah. One question would be how to store dependencies.

One possibility is to flatly store transitive dependencies for each file (rdmd currently does that, but only for the main file). This would take care of circular dependencies automatically. Another option is to store only direct dependencies for each file, and then use a graph walker to get work done.

(I seem to recall that Go has a simple, highly linearized organization of dependencies. If anyone knows more about that, please share.)

Andrei
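The graph walker for the second option is simple to sketch (Python; a minimal illustration of walking direct dependencies, where the visited set makes circular imports terminate):

```python
def transitive_deps(graph, root):
    """Collect everything reachable from root in a direct-dependency
    graph (module -> list of directly imported modules)."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

The trade-off is the classic one: flat transitive lists are cheap to read but redundant and must be recomputed wholesale, while direct-dependency storage is minimal on disk at the cost of a walk on every query.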
Apr 07 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-07 13:45, Andrej Mitrovic wrote:

 On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 As I understand, there were some problems with how dmd places data in object
files that prevented implementing this, however that's been put into question
recently:
 http://forum.dlang.org/post/mailman.508.1365074861.4724.digitalmars-d puremagic.com

Yeah I can't reproduce this bug. h3r3tic also had some test-case somewhere which I couldn't reproduce[1]. [1] : https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable

There's also the problem of DMD not outputting object files with their full module name. -- /Jacob Carlborg
Apr 07 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-08 12:44, Andrej Mitrovic wrote:

 I think LDC or GDC have the -oq flag for this. We could consider
 porting it to DMD.

Yes. -- /Jacob Carlborg
Apr 08 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-08 13:29, Andrej Mitrovic wrote:
 Ah I see now there were some pulls for this bug but it was hidden
 under the D1 section:
 http://d.puremagic.com/issues/show_bug.cgi?id=3541

 I might have a go at it unless someone beats me to it.

Right, forgot about that pull request. I never figured out how to run the unit tests for Windows. -- /Jacob Carlborg
Apr 08 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-04-08 15:41, Andrej Mitrovic wrote:

 No problem, I made a new pull.

 https://github.com/D-Programming-Language/dmd/pull/1871

Cool :) -- /Jacob Carlborg
Apr 08 2013
prev sibling next sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Sunday, 7 April 2013 at 07:29:26 UTC, Vladimir Panteleev wrote:
 I wonder if it's somehow related to Dicebot's findings in 
 another recent thread:
 http://forum.dlang.org/post/gxscuhrvcpnuzvgyykyp forum.dlang.org

I am pretty sure it is related. Current DMD logic for emitting template symbols is very naive and feels more like a hack, with consequences in a variety of cases. I am trying to create a more capable solution, but I have not done any dmd development before, so this may take an eternity. Would be really glad to see some experienced dmd dev comment on the topic.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 1. Invoke DMD to build recursively, and simultaneously fetch the
 dependencies to RDMD which will store the dependencies to a file:

 $ dmd -rb main.d -v


One thing I forgot to add were the -c and -op flags to make DMD actually generate separate object files, with paths to avoid conflicts.
Apr 07 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 11:52:54 UTC, Andrei Alexandrescu 
wrote:
 I think there's another one related to a bug I recently fixed. 
 There needs to be a file touched after the executable has been 
 built when people use -of to place the executable in a specific 
 place.

Well, yes, that was when the witness file was added. My question was whether there was a reason for a separate file to be used, as opposed to using the .deps file. --makedepend would be the answer, as I now see.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 This is a dramatic improvement, and better build times are always well
 received. Do you plan to convert your work into a pull request?

I've filed the enhancement:
http://d.puremagic.com/issues/show_bug.cgi?id=9896

Perhaps you could mark it as pre-approved? I'm making a pull soon.
Apr 07 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 7 April 2013 at 11:45:38 UTC, Andrej Mitrovic wrote:
 Would this work?

I think so; however, the main challenge is that rdmd doesn't track dependencies between individual modules. It just has a flat list of files that take part in the build. We'd need to improve either the -v output (to include inter-module dependencies) or the -deps output (to include import()-ed files) to provide that information to rdmd.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 We'd need to improve either the -v
 output (to include inter-module dependencies) or -deps output (to
 include import()-ed files) to provide that information to rdmd.

-v is a better choice, it will allow us to use pipes later.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Do you plan to convert your work into a pull request?

Pull made: https://github.com/D-Programming-Language/dmd/pull/1861
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 On 4/7/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 We'd need to improve either the -v
 output (to include inter-module dependencies) or -deps output (to
 include import()-ed files) to provide that information to rdmd.

-v is a better choice, it will allow us to use pipes later.

Even better is to make -deps without arguments just print to stdout. I'm making a pull for this soon.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Martin Nowak <code dawg.eu> wrote:
 2.
 driver assumes that foo.bar depends on foo.baz and has a cached obj for
 foo.baz:
 dmd -c foo/bar -rb -rxstd.* -rxfoo.baz

How is this a problem? If foo.baz is not needed, the linker won't link it in. All the driver has to do is list the modules which shouldn't be recompiled if their modification time hasn't changed.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Martin Nowak <code dawg.eu> wrote:
 This is a problem because 'something.else' might be a huge library you
 don't need to recompile.
 With the current approach you're first recompiling before you know the
 dependencies. It's also the reason why you hardcoded std.* and core.*
 into the exclude list. But this will miss other prebuild libraries, e.g.
 vibe.

So pass -rxvibe vibe.lib on the first invocation. I think we're in some kind of miscommunication...
Apr 07 2013
prev sibling next sibling parent "Martin Nowak" <code dawg.eu> writes:
 So pass -rxvibe vibe.lib on the first invocation. I think we're 
 in
 some kind of miscommunication..

You don't know that something uses vibe until you actually compile it. Passing every cached library to the compiler is not a feasible approach.

How do you pass the relevant linker flags to the compiler? How do you support parallelization? Using IPC to negotiate this could handle these problems. You could think of it as using the compiler as a library, if you want to.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/7/13, Martin Nowak <code dawg.eu> wrote:
 So pass -rxvibe vibe.lib on the first invocation. I think we're
 in
 some kind of miscommunication..

You don't know that something uses vibe until you actually compile it.

The user knows, he can pass the flag to avoid compiling vibe if he has the static library prebuilt.
Apr 07 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/8/13, Jacob Carlborg <doob me.com> wrote:
 There's also the problem of DMD not outputting object files with their
 full module name.

I think LDC or GDC have the -oq flag for this. We could consider porting it to DMD.
Apr 08 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/8/13, Jacob Carlborg <doob me.com> wrote:
 On 2013-04-08 12:44, Andrej Mitrovic wrote:

 I think LDC or GDC have the -oq flag for this. We could consider
 porting it to DMD.

Yes.

Ah, I see now there were some pulls for this bug, but it was hidden under the D1 section:
http://d.puremagic.com/issues/show_bug.cgi?id=3541

I might have a go at it unless someone beats me to it.
Apr 08 2013
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/8/13, Jacob Carlborg <doob me.com> wrote:
 Right, forgot about that pull request. I never figured out how to run
 the unit tests for Windows.

No problem, I made a new pull.

https://github.com/D-Programming-Language/dmd/pull/1871
Apr 08 2013