
digitalmars.D - What is the compilation model of D?

"David Piepgrass" <qwertie256 gmail.com> writes:
(Maybe this should be in D.learn but it's a somewhat advanced 
topic)

I would really like to understand how D compiles a program or 
library. I looked through TDPL and it doesn't seem to say 
anything about how compilation works.

- Does it compile all source files in a project at once?
- Does the compiler have to re-parse all Phobos templates (in 
modules used by the program) whenever it starts?
- Is there any concept of an incremental build?
- Obviously, one can set up circular dependencies in which the 
compile-time meaning of some code in module A depends on the 
meaning of some code in module B, which in turn depends on the 
meaning of some other code in module A. Sometimes the D compiler 
can resolve the ultimate meaning, other times it cannot. I was 
pleased that the compiler successfully understood this:

// Y.d
import X;
struct StructY {
	int a = StructX().c;
	auto b() { return StructX().d(); }
}

// X.d
import Y;
struct StructX {
	int c = 3;
	auto d()
	{
		static if (StructY().a == 3 && StructY().a.sizeof == 3)
			return 3;
		else
			return "C";
	}
}

But what procedure does the compiler use to resolve the semantics 
of the code? Is there a specification anywhere? Does it have some 
limitations, such that there is code with an unambiguous meaning 
that a human could resolve but the compiler cannot?

- In light of the above (that the meaning of D code can be 
interdependent with other D code, plus the presence of mixins and 
all that), what are the limitations of __traits(allMembers...) 
and other compile-time reflection operations, and what kind of 
problems might a user expect to encounter?
Jul 24 2012
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 25 Jul 2012 02:16:04 +0200
"David Piepgrass" <qwertie256 gmail.com> wrote:

 (Maybe this should be in D.learn but it's a somewhat advanced 
 topic)
 
 I would really like to understand how D compiles a program or 
 library. I looked through TDPL and it doesn't seem to say 
 anything about how compilation works.
 

The compilation model is very similar to C or C++, so that's a good starting point for understanding how D's works.

Whatever file *or files* you pass to DMD on the command line, *those* are the files it will compile and generate object files for. No more, no less. However, in the process, it will *also* parse and perform semantic analysis on any files that are directly or indirectly imported, but it won't actually generate any machine code or object files for them. (It finds these files via the -Ipath command line switch you pass to DMD; this -I switch is roughly D's equivalent of Java's classpaths.)

This does mean that, unlike what's typically done in C/C++, it's generally much faster to pass all your files into DMD at once, instead of the typical C/C++ route of making separate calls to the compiler for each source file.

After DMD generates the object files for all the source files you give it, it will automatically send them to the linker (OPTLINK on Windows, or gcc/ld on Posix) to be linked into an executable. That is, *unless* you give it either -c ("compile-only, do not link") or -lib ("generate library instead of object files"). That way, you can link manually if you wish.

So typically, you pass DMD all the .d files in your program, and it'll compile them all and pass them to the linker to be linked into an executable. But if you don't want it to link automatically, you don't have to let it. If you want to compile the files all separately, you can do so (though it'd be very slow - probably almost as slow as C++, but not quite).

But that's just the DMD compiler itself. Instead of using DMD directly, there's a better modern trick that's generally preferred: RDMD. If you use rdmd to compile (instead of dmd), you *just* give it your *one* main source file (typically the one with your "main()" function). 
This file must be the *last* parameter passed to rdmd:

$ rdmd --build-only (any other flags) main.d

Then, RDMD will figure out *all* of the source files needed (using the full compiler's frontend, so it never gets fooled into missing anything), and if any of them have changed, it will automatically pass them *all* into DMD for you. This way, you don't have to manually keep track of all your files and pass them all into DMD yourself. Just give RDMD your main file and that's it, you're golden.

Side note: Another little trick with RDMD: Omit the --build-only and it will compile AND then run your program:

$ cat simpleecho.d
import std.stdio;
void main(string[] args) { writeln(args[1]); }

$ rdmd simpleecho.d "Anything after the .d file is passed to your app"
{automatically compiles all sources if needed}
Anything after the .d file is passed to your app

$ wheee!!
command not found
 - Does it compile all source files in a project at once?

Answered this above. In short: It compiles whatever you give it (and processes, but doesn't compile, any needed imports). Unless you use RDMD in which case it automatically detects and compiles all your needed sources (unless none of them have changed).
 - Does the compiler have to re-parse all Phobos templates (in 
 modules used by the program) whenever it starts?

Yes. (Unless you never import anything from Phobos... I think.) But it's very, very fast to parse. Lightning-speed if you compare it to C++. And it shouldn't run full semantic analysis on templates that are never actually used. (Unless they're used in a piece of dead code.)
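[Editor's note: a minimal sketch of that last point; the template and names are invented for illustration, not taken from the thread:]

```d
import std.stdio;

// Merely importing a module containing this template costs only a parse;
// semantic analysis and code generation are deferred until an
// instantiation actually appears somewhere:
auto twice(T)(T x) { return x + x; }

void main()
{
    // This line instantiates twice!int, triggering full analysis
    // and codegen for that one instantiation only.
    writeln(twice(21));
}
```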
 - Is there any concept of an incremental build?

Yes, but there are a few "gotcha"s:

1. D compiles so damn fast that it's not nearly as much of an issue as it is with C++ (which is notoriously ultra-slow compared to... everything, hence the monumental importance of C++'s incremental builds).

2. Historically, there can be problems with templates when incrementally compiling. DMD has been known to get confused about which object file it put an instantiated template into, which can lead to occasional linker errors. These errors can be fixed by doing a full rebuild (which is WAAAY faster than it would be with C++). I don't know whether or not this has been fixed.

3. Incremental building typically involves compiling files one-at-a-time. But with D, you get a HUGE boost in compilation speed by not compiling one-at-a-time. So if you have a huge, slow-to-compile codebase (for example, 15 seconds or so), and you change a handful of files, it may actually be much *faster* to do a full rebuild (since you're not re-analyzing all the imports once per file). Of course, you could probably get around that issue by passing all the changed files (and only the changed files) into DMD at once (instead of one-at-a-time), but I don't know whether typical build tools (like make) can realistically handle that.
 - Obviously, one can set up circular dependencies in which the 
 compile-time meaning of some code in module A depends on the 
 meaning of some code in module B, which in turn depends on the 
 meaning of some other code in module A. Sometimes the D compiler 
 can resolve the ultimate meaning, other times it cannot. I was 
 pleased that the compiler successfully understood this:
 
 // Y.d
 import X;
 struct StructY {
 	int a = StructX().c;
 	auto b() { return StructX().d(); }
 }
 
 // X.d
 import Y;
 struct StructX {
 	int c = 3;
 	auto d()
 	{
 		static if (StructY().a == 3 && StructY().a.sizeof == 3)
 			return 3;
 		else
 			return "C";
 	}
 }
 
 But what procedure does the compiler use to resolve the semantics 
 of the code? Is there a specification anywhere? Does it have some 
 limitations, such that there is code with an unambiguous meaning 
 that a human could resolve but the compiler cannot?
 

It keeps diving deeper and deeper to find anything it can "start" with. Once it finds that, it'll just build everything back up in whatever order is necessary. If it *truly is* a circular definition, and there isn't any place it can actually start with, then it issues an error. (If there are any cases where it doesn't work this way, they should be filed as bugs in the compiler.)
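[Editor's note: a minimal sketch of the difference, with invented names; not code from the thread:]

```d
// The compiler can "start" with x (it depends on nothing),
// then build y and z back up in dependency order:
enum x = 1;
enum y = x + 1;
enum z = y + 1;   // resolved at compile time: z == 3

// A truly circular definition has no starting point, so the
// compiler reports an error instead of recursing forever:
// enum p = q + 1;
// enum q = p + 1;   // error: circular reference
```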
 - In light of the above (that the meaning of D code can be 
 interdependent with other D code, plus the presence of mixins and 
 all that), what are the limitations of __traits(allMembers...) 
 and other compile-time reflection operations, and what kind of 
 problems might a user expect to encounter?

Shouldn't really be an issue. Such things won't get evaluated until the types/identifiers involved are *fully* analyzed (or at least to the extent that they need to be analyzed). So the results of things like __traits(allMembers...) should *never* change during compilation, or when changing the order of files or imports (unless there's some compiler bug). Any situation that *would* result in such ambiguity gets flagged as an error in your code.

I would, however, recommend avoiding static constructors and module constructors whenever you reasonably can. If you have a circular import (i.e. module a imports b, which imports c, which imports a), then that's normally OK, *UNLESS* they all have static and/or module constructors. If they do, then the startup code D builds into your application won't know which needs to run first (it doesn't analyze the actual code, it just assumes there *could* be an order-of-execution dependency), so you'll get a circular dependency error when you run your program. The safest, easiest way to get rid of those errors is to eliminate one or more of the static/module constructors.
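[Editor's note: a minimal sketch of that failure mode; module names are made up for illustration:]

```d
// a.d
module a;
import b;
static this() { /* module constructor, runs at program startup */ }

// b.d
module b;
import a;
static this() { /* module constructor, runs at program startup */ }

// A program importing either module compiles fine, but aborts at
// startup with a cycle error along the lines of:
//   "Cycle detected between modules with ctors/dtors: a -> b -> a"
// Removing the static this() from either module resolves it.
```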
Jul 24 2012
Jacob Carlborg <doob me.com> writes:
On 2012-07-25 04:00, Nick Sabalausky wrote:

 But that's just the DMD compiler itself. Instead of using DMD
 directly, there's a better modern trick that's generally preferred:
 RDMD.

 If you use rdmd to compile (instead of dmd), you *just* give it
 your *one* main source file (typically the one with your "main()"
 function). This file must be the *last* parameter passed to rdmd:

 $rdmd --build-only (any other flags) main.d

 Then, RDMD will figure out *all* of the source files needed (using
 the full compiler's frontend, so it never gets fooled into missing
 anything), and if any of them have been changed, it will automatically
 pass them *all* into DMD for you. This way, you don't have to
 manually keep track of all your files and pass them all into
 DMD youself. Just give RDMD your main file and that's it, you're golden.

RDMD is mostly useful for executables, not so much for libraries. For libraries you would need to pass _all_ of your project files directly to DMD (or find some other tool). It's perfectly fine to have a library which consists of two files with no interaction between them; neither RDMD nor the compiler can track that. -- /Jacob Carlborg
Jul 25 2012
Jacob Carlborg <doob me.com> writes:
On 2012-07-25 17:35, David Piepgrass wrote:

 Plus, it isn't just build times that concern me. In C# I'm used to
 having an IDE that immediately understands what I have typed, giving me
 error messages and keeping metadata about the program up-to-date within
 2 seconds. I can edit a class definition in file A and get code
 completion for it in file B, 2 seconds later. I don't expect the IDE can
 ever do that if the compiler can't do a debug build in a similar timeframe.

That's not necessarily true. The C# and Java compilers in these IDEs are built to be able to handle incremental compilation at a very fine-grained level. We're not talking about recompiling just a single file, we're talking about recompiling just a part of a single file. DMD and other D compilers are just not built to handle this. They don't handle incremental builds at all. There are various reasons why it's more difficult to make an incremental build system for D. Most of the reasons are due to metaprogramming (templates, CTFE, mixins and other things). -- /Jacob Carlborg
Jul 26 2012
Jacob Carlborg <doob me.com> writes:
On 2012-07-25 21:54, David Piepgrass wrote:
 Thanks for the very good description, Nick! So if I understand
 correctly, if

 1. I use an "auto" return value or suchlike in a module Y.d
 2. module X.d calls this function
 3. I call "dmd -c X.d" and "dmd -c Y.d" as separate steps

 Then the compiler will have to fully parse Y twice and fully analyze the
 Y function twice, although it generates object code for the function
 only once. Right? I wonder how smart it is about not analyzing things it
 does not need to analyze (e.g. when Y is a big module but X only calls
 one function from it - the compiler has to parse Y fully but it should
 avoid most of the semantic analysis.)

Yes, I think that's correct. But if you give the compiler all the source code at once, it should need to parse a given module only once. D doesn't use textual includes like C/C++ does; it just refers to other modules' symbols (or something like that).
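[Editor's note: a tiny illustration of that contrast; a hedged sketch, not from the thread:]

```d
// With C/C++ #include, a header's text is pasted and re-parsed in
// every translation unit that includes it. In D, an import merely
// makes another module's symbols visible for lookup:
import std.stdio;   // no textual copy of std.stdio's source is made here

void main()
{
    writeln("hello");   // symbol resolved against the analyzed module
}
```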
 What about templates? In C++ it is a problem that the compiler will
 instantiate templates repeatedly, say if I use vector<string> in 20
 source files, the compiler will generate and store 20 copies of
 vector<string> (plus 20 copies of basic_string<char>, too) in object files.

 1. So in D, if I compile the 20 sources separately, does the same thing
 happen (same collection template instantiated 20 times with all 20
 copies stored)?

If you compile them separately I think so, yes. How would it otherwise work, store some info between compile runs?
 2. If I compile the 20 sources all together, I guess the template would
 be instantiated just once, but then which .obj file does the
 instantiated template go in?

I think it only needs to instantiate it once. Whether it actually does that or not, I don't know. As for which object file it goes in, that is probably unspecified. Although if you compile with the -lib flag it will output the templates to all object files. This is one of the problems making it hard to create an incremental build system for D.
 I figure as CTFE is used more, especially when it is used to decide
 which template overloads are valid or how a mixin will behave, this will
 slow down the compiler more and more, thus making incremental builds
 more important. A typical example would be a compile-time
 parser-generator, or compiled regexes.

I think that's correct. I did some simple benchmarking comparing different uses of string mixins in Derelict. It turns out that it's a lot better to have a few string mixins containing a lot of code than many string mixins containing very little code. I suspect other metaprogramming features (CTFE, templates, static if, mixins) could behave in a similar way.
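[Editor's note: a sketch of the two styles being compared; the helper names are invented, and this is illustrative rather than the Derelict code being benchmarked:]

```d
// Many small mixins: each mixin() is a separate CTFE run plus a
// separate parse of the generated snippet.
string declare(string name) { return "int " ~ name ~ ";"; }

struct ManySmall
{
    mixin(declare("a"));
    mixin(declare("b"));
    mixin(declare("c"));
}

// One big mixin: the same declarations generated and parsed in one go.
string declareAll(string[] names)
{
    string code;
    foreach (n; names)
        code ~= "int " ~ n ~ ";";
    return code;
}

struct OneBig
{
    mixin(declareAll(["a", "b", "c"]));
}
```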
 Plus, I've heard some people complaining that the compiler uses over 1
 GB RAM, and splitting up compilation into parts might help with that.

Yeah, I just ran into a compiler bug (I haven't been able to create a simple test case) where it consumed around 3.5 GB of memory and then just crashed after a while.
 BTW, I think I heard the compiler uses multithreading to speed up the
 build, is that right?

Yes, I'm pretty sure it reads in all (or many of) the files concurrently or in parallel. It could probably lex and parse in parallel as well; I don't know if it does that, though.
 Anyway, I can't even figure out how to enumerate the members of a module
 A; __traits(allMembers, A) causes "Error: import Y has no members".

Currently there's a bug which forces you to put the module in a package. Try: module foo.A; __traits(allMembers, foo.A); -- /Jacob Carlborg
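[Editor's note: a sketch of the suggested workaround; the file layout and members are assumed for illustration:]

```d
// foo/A.d
module foo.A;
int x;
void f() {}

// main.d
import foo.A;

// Prints the module's member names (as a tuple of strings) at
// compile time, e.g. including "x" and "f":
pragma(msg, __traits(allMembers, foo.A));
```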
Jul 26 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, July 24, 2012 22:00:56 Nick Sabalausky wrote:
 But with D, you get a HUGE boost in compilation speed by
 not compiling one-at-a-time. So if you have a huge, slow-to-compile
 codebase (for example, 15 seconds or so),

I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend. - Jonathan M Davis
Jul 24 2012
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Tue, 24 Jul 2012 20:35:27 -0700
Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Tuesday, July 24, 2012 22:00:56 Nick Sabalausky wrote:
 But with D, you get a HUGE boost in compilation speed by
 not compiling one-at-a-time. So if you have a huge, slow-to-compile
 codebase (for example, 15 seconds or so),

I find it shocking that anyone would consider 15 seconds slow to compile for a large program. Yes, D's builds are lightning fast in general, and 15 seconds is probably a longer build, but calling 15 seconds "slow-to-compile" just about blows my mind. 15 seconds for a large program is _fast_. If anyone complains about a large program taking 15 seconds to build, then they're just plain spoiled or naive. I've dealt with _Java_ apps which took in the realm of 10 minutes to compile, let alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.

I just meant that I haven't heard of much D stuff that took much longer than that, so it's somewhat on the long end as far as D stuff goes. But I may be off-base. 'Course, it depends a lot on the computer, too. I probably worded it weird.
Jul 24 2012
Russel Winder <russel winder.org.uk> writes:
On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote:
[…]
 I find it shocking that anyone would consider 15 seconds slow to compile
 for a large program. Yes, D's builds are lightning fast in general, and
 15 seconds is probably a longer build, but calling 15 seconds
 "slow-to-compile" just about blows my mind. 15 seconds for a large
 program is _fast_. If anyone complains about a large program taking 15
 seconds to build, then they're just plain spoiled or naive. I've dealt
 with _Java_ apps which took in the realm of 10 minutes to compile, let
 alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.

A company I did some Python training for (they used Python for their
integration and system testing, and a bit of unit testing) back in 2006
had a C++ product whose "from scratch" build time genuinely was 56 hours.

-- 
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Jul 25 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, July 25, 2012 08:54:24 Russel Winder wrote:
 On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote:
 […]
  I find it shocking that anyone would consider 15 seconds slow to compile
  for a large program. Yes, D's builds are lightning fast in general, and
  15 seconds is probably a longer build, but calling 15 seconds
  "slow-to-compile" just about blows my mind. 15 seconds for a large
  program is _fast_. If anyone complains about a large program taking 15
  seconds to build, then they're just plain spoiled or naive. I've dealt
  with _Java_ apps which took in the realm of 10 minutes to compile, let
  alone C++ apps which take _hours_ to compile. 15 seconds is a godsend.

 A company I did some Python training for (they used Python for their
 integration and system testing, and a bit of unit testing) back in 2006
 had a C++ product whose "from scratch" build time genuinely was 56 hours.

I've heard of overnight builds, and I've heard of _regression tests_ running
for over a week, but I've never heard of builds being over 2 days. Ouch.

It has got to have been possible to have a shorter build than that. Of course,
if their code was bad enough that the build was that long, it may have been
rather disgusting code to clean up. But then again, maybe they genuinely had a
legitimate reason for having the build take that long. I'd be very surprised,
though.

In any case, much as I like C++ (not as much as D, but I still like it quite a
bit), its build times are undeniably horrible.

- Jonathan M Davis
Jul 25 2012
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 25 Jul 2012 08:54:24 +0100
Russel Winder <russel winder.org.uk> wrote:

 On Tue, 2012-07-24 at 20:35 -0700, Jonathan M Davis wrote:
 [=E2=80=A6]
 I find it shocking that anyone would consider 15 seconds slow to
 compile for a large program. Yes, D's builds are lightning fast in
 general, and 15 seconds is probably a longer build, but calling 15
 seconds "slow-to-compile" just about blows my mind. 15 seconds for
 a large program is _fast_. If anyone complains about a large
 program taking 15 seconds to build, then they're just plain spoiled
 or naive. I've dealt with _Java_ apps which took in the realm of 10
 minutes to compile, let alone C++ apps which take _hours_ to
 compile. 15 seconds is a godsend.

A company I did some Python training for (they used Python for their integration and system testing, and a bit of unit testing) back in 2006 had a C++ product whose "from scratch" build time genuinely was 56 hours.

Yea, my understanding is that full-build times measured in days are (or used to be, don't know if they still are) also typical of high-budget C++-based videogames.
Jul 25 2012
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 7/25/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 I find it shocking that anyone would consider 15 seconds slow to compile for
 a large program.

It's not shocking if you're used to a fast edit-compile-run cycle which takes a few seconds and then starts to slow down considerably when you involve more and more templates. When I start working on a new D app it almost feels like programming in Python; the edit-compile-run cycle is really fast. But eventually the codebase grows, things slow down, and I lose that "Python" feeling when it starts taking a dozen seconds to compile. It just breaks my concentration having to wait for something to finish.

Hell, I can't believe how outdated the compiler technology is. I can play incredibly realistic and interactive 3D games in real-time with practically no input lag, but I have to wait a dozen seconds for a tool to convert lines of text into object code? From a syntax perspective D has moved forward, but from a compilation perspective it hasn't innovated at all.
Jul 25 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
 I find it shocking that anyone would consider 15 seconds slow 
 to compile for a
 large program. Yes, D's builds are lightning fast in general, 
 and 15 seconds
 is probably a longer build, but calling 15 seconds 
 "slow-to-compile" just
 about blows my mind. 15 seconds for a large program is _fast_. 
 If anyone
 complains about a large program taking 15 seconds to build, 
 then they're just
 plain spoiled or naive. I've dealt with _Java_ apps which took 
 in the realm of
 10 minutes to compile, let alone C++ apps which take _hours_ to 
 compile. 15
 seconds is a godsend.

I agree with Andrej, 15 seconds *is* slow for an edit-compile-run cycle, although it might be understandable when editing code that uses a lot of CTFE and static foreach and reinstantiates templates with a crapton of different arguments. I am neither spoiled nor naive to think it can be done in under 15 seconds. Fully rebuilding all my C# code takes less than 10 seconds (okay, not a big program, but several smaller programs).

Plus, it isn't just build times that concern me. In C# I'm used to having an IDE that immediately understands what I have typed, giving me error messages and keeping metadata about the program up-to-date within 2 seconds. I can edit a class definition in file A and get code completion for it in file B, 2 seconds later. I don't expect the IDE can ever do that if the compiler can't do a debug build in a similar timeframe.
Jul 25 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, July 25, 2012 14:57:23 Andrej Mitrovic wrote:
 Hell I can't believe how outdated the compiler technology is. I can
 play incredibly realistic and interactive 3D games in real-time with
 practically no input lag, but I have to wait a dozen seconds for a
 tool to convert lines of text into object code? From a syntax
 perspective D has moved forward but from a compilation perspective it
 hasn't innovated at all.

And dmc and dmd are lightning fast in comparison to most compilers. I think that a lot of it comes down to the fact that optimizing code is _expensive_, and doing a lot of operations on an AST isn't necessarily all that cheap either. dmd is actually _lightning_ fast at processing text. That's not what's slow. It's everything after that which is. And for most compilers, the speed of the resultant code matters a lot more than the speed of compilation.

Compare this to games, which need to maintain a certain number of FPS. They optimize _everything_ towards that goal, which is why they achieve it. There's also no compiler equivalent of parallelizing optimizations to the AST or asm the way games parallelize geometric computations and the like with GPUs. The priorities are completely different, what they're doing is very different, and what they have to work with is very different. As great as it would be if compilers were faster, it's an apples-to-oranges comparison.

- Jonathan M Davis
Jul 25 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, July 25, 2012 17:35:09 David Piepgrass wrote:
 I find it shocking that anyone would consider 15 seconds slow
 to compile for a
 large program. Yes, D's builds are lightning fast in general,
 and 15 seconds
 is probably a longer build, but calling 15 seconds
 "slow-to-compile" just
 about blows my mind. 15 seconds for a large program is _fast_.
 If anyone
 complains about a large program taking 15 seconds to build,
 then they're just
 plain spoiled or naive. I've dealt with _Java_ apps which took
 in the realm of
 10 minutes to compile, let alone C++ apps which take _hours_ to
 compile. 15
 seconds is a godsend.

I agree with Andrej, 15 seconds *is* slow for a edit-compile-run cycle, although it might be understandable when editing code that uses a lot of CTFE and static foreach and reinstantiates templates with a crapton of different arguments. I am neither spoiled nor naive to think it can be done in under 15 seconds. Fully rebuilding all my C# code takes less than 10 seconds (okay, not a big program, but several smaller programs).

Sure, smaller programs should build quickly, and having build times get slower as the program grows can definitely be a problem. I'm not about to argue with that. But having a _large_ application build in 15 seconds is arguably a luxury. Large applications just aren't the sort of thing that builds quickly. But that's the sort of project that's usually commercial (either that or a major open source one), and I don't think that D's been used in that domain a lot yet.

While D compiles far faster than C++, the kind of application which takes hours to compile in C++ and the one that takes 10+ seconds in D are on a completely different level in terms of amount of source code and level of complexity, even if D _would_ probably take only minutes on a similar project instead of hours.

- Jonathan M Davis
Jul 25 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
Thanks for the very good description, Nick! So if I understand 
correctly, if

1. I use an "auto" return value or suchlike in a module Y.d
2. module X.d calls this function
3. I call "dmd -c X.d" and "dmd -c Y.d" as separate steps

Then the compiler will have to fully parse Y twice and fully 
analyze the Y function twice, although it generates object code 
for the function only once. Right? I wonder how smart it is about 
not analyzing things it does not need to analyze (e.g. when Y is 
a big module but X only calls one function from it - the compiler 
has to parse Y fully but it should avoid most of the semantic 
analysis.)

What about templates? In C++ it is a problem that the compiler 
will instantiate templates repeatedly, say if I use 
vector<string> in 20 source files, the compiler will generate and 
store 20 copies of vector<string> (plus 20 copies of 
basic_string<char>, too) in object files.

1. So in D, if I compile the 20 sources separately, does the same 
thing happen (same collection template instantiated 20 times with 
all 20 copies stored)?
2. If I compile the 20 sources all together, I guess the template 
would be instantiated just once, but then which .obj file does 
the instantiated template go in?

 $rdmd --build-only (any other flags) main.d

 Then, RDMD will figure out *all* of the source files needed 
 (using
 the full compiler's frontend, so it never gets fooled into 
 missing
 anything), and if any of them have been changed, it will 
 automatically
 pass them *all* into DMD for you. This way, you don't have to
 manually keep track of all your files and pass them all into
 DMD youself. Just give RDMD your main file and that's it, 
 you're golden.

 Side note: Another little trick with RDMD: Omit the 
 --build-only and it will compile AND then run your program:

 Yes. (Unless you never import anything from in phobos...I 
 think.) But
 it's very, very fast to parse. Lightning-speed if you compare 
 it to C++.

I don't even want to legitimize C++ compiler speed by comparing it to any other language ;)
 - Is there any concept of an incremental build?

Yes, but there's a few "gotcha"s: 1. D compiles so damn fast that it's not nearly as much of an issue as it is with C++ (which is notoriously ultra-slow compared to...everything, hence the monumental importance of C++'s incremental builds).

I figure as CTFE is used more, especially when it is used to decide which template overloads are valid or how a mixin will behave, this will slow down the compiler more and more, thus making incremental builds more important. A typical example would be a compile-time parser-generator, or compiled regexes. Plus, I've heard some people complaining that the compiler uses over 1 GB RAM, and splitting up compilation into parts might help with that. BTW, I think I heard the compiler uses multithreading to speed up the build, is that right?
 It keeps diving deeper and deeper to find anything it can 
 "start" with.
 One it finds that, it'll just build everything back up in 
 whatever
 order is necessary.

I hope someone can give more details about this.
 - In light of the above (that the meaning of D code can be 
 interdependent with other D code, plus the presence of mixins 
 and all that), what are the limitations of 
 __traits(allMembers...) and other compile-time reflection 
 operations, and what kind of problems might a user expect to 
 encounter?

Shouldn't really be an issue. Such things won't get evaluated until the types/identifiers involved are *fully* analyzed (or at least to the extent that they need to be analyzed). So the results of things like __traits(allMembers...) should *never* change during compilation, or when changing the order of files or imports (unless there's some compiler bug). Any situation that *would* result in any such ambiguity will get flagged as an error in your code.

Hmm. Well, I couldn't find an obvious example... for example, you are right, this doesn't work, although the compiler annoyingly doesn't give a reason:

struct OhCrap {
	void a() {}
	// main.d(72): Error: error evaluating static if expression
	// (what error? syntax error? type error? c'mon...)
	static if ([ __traits(allMembers, OhCrap) ].length > 1) {
		auto b() { return 2; }
	}
	void c() {}
}

But won't this be a problem when it comes time to produce run-time reflection information? I mean, when module A asks to create run-time reflection information for all the functions and types in module A.... er, I naively thought the information would be created as a set of types and functions *in module A*, which would then change the set of allMembers of A. But, maybe it makes more sense to create that stuff in a different module (which A could then import??)

Anyway, I can't even figure out how to enumerate the members of a module A; __traits(allMembers, A) causes "Error: import Y has no members".

Aside: I first wrote the above code as follows:

// Shouldn't this be in Phobos somewhere?
bool contains(alias pred = "a == b", R, E)(R haystack, E needle)
	if (isInputRange!R &&
	    is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
	return !(find!(pred, R, E)(haystack, needle).empty);
}

struct OhCrap {
	void a() {}
	static if ([ __traits(allMembers, OhCrap) ].contains("a")) {
		auto b() { return 2; }
	}
	void c() {}
}

But it causes a series of 204 error messages that I don't understand.
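For contrast, the non-circular variant does compile: if the static if inspects a *different* type whose member list doesn't depend on the struct being defined, the compiler can finish analyzing that type first. A sketch (struct names are mine):

```d
struct Done
{
    void a() {}
    void c() {}
}

struct Fine
{
    // No circularity here: Done's member list doesn't depend on
    // Fine, so the compiler can fully analyze Done before it has
    // to evaluate this static if condition.
    static if ([ __traits(allMembers, Done) ].length > 1)
    {
        auto b() { return 2; }
    }
}
```

The OhCrap version fails precisely because the condition asks for OhCrap's member list while that list is still being determined by the condition itself.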
Jul 25 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Wednesday, 25 July 2012 at 19:54:31 UTC, David Piepgrass wrote:
 It keeps diving deeper and deeper to find anything it can "start" with.
 Once it finds that, it'll just build everything back up in whatever
 order is necessary.

I hope someone can give more details about this.

TDPL chapter 11 "Scaling Up".
Jul 25 2012
prev sibling next sibling parent "David Piepgrass" <qwertie256 gmail.com> writes:
 If you use rdmd to compile (instead of dmd), you *just* give it
 your *one* main source file (typically the one with your 
 "main()"
 function). This file must be the *last* parameter passed to 
 rdmd:

 $rdmd --build-only (any other flags) main.d

 Then, RDMD will figure out *all* of the source files needed 
 (using
 the full compiler's frontend, so it never gets fooled into 
 missing
 anything), and if any of them have been changed, it will 
 automatically
 pass them *all* into DMD for you. This way, you don't have to
 manually keep track of all your files and pass them all into
 DMD yourself. Just give RDMD your main file and that's it, 
 you're golden.

I meant to ask, why would it recompile *all* of the source files if only one changed? Seems like it only should recompile the changed ones (but still compile them together as a unit.) Is it because of bugs (e.g. the template problem you mentioned)?
Jul 25 2012
prev sibling next sibling parent "David Piepgrass" <qwertie256 gmail.com> writes:
 I hope someone can give more details about this.

TDPL chapter 11 "Scaling Up".

That's where I was looking. As I said already, TDPL does not explain how compilation works, especially not anything about the low-level semantic analysis which has me most curious.
Jul 25 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Wednesday, 25 July 2012 at 20:25:19 UTC, David Piepgrass wrote:
 I hope someone can give more details about this.

TDPL chapter 11 "Scaling Up".

That's where I was looking. As I said already, TDPL does not explain how compilation works, especially not anything about the low-level semantic analysis which has me most curious.

Strange, because it seems to me this chapter answers all your previous questions. What exact details are you interested in?
Jul 25 2012
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Wednesday, 25 July 2012 at 08:06:23 UTC, Nick Sabalausky wrote:
 Yea, my understanding is that full-build times measured in days 
 are (or
 used to be, don't know if they still are) also typical of 
 high-budget
 C++-based videogames.

You must be thinking of full data rebuilds, not code recompiles. There's no way a game could take over a day to compile and still produce an executable that would fit on a console. Several minutes is more typical. Maybe up to 30 minutes in bad cases.
Jul 25 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 25 Jul 2012 23:20:04 +0200
"Peter Alexander" <peter.alexander.au gmail.com> wrote:

 On Wednesday, 25 July 2012 at 08:06:23 UTC, Nick Sabalausky wrote:
 Yea, my understanding is that full-build times measured in days 
 are (or
 used to be, don't know if they still are) also typical of 
 high-budget
 C++-based videogames.

You must be thinking of full data rebuilds, not code recompiles. There's no way a game could take over a day to compile and still produce an executable that would fit on a console. Several minutes is more typical. Maybe up to 30 minutes in bad cases.

Yea, you're probably right. I meant "full project", which almost certainly involves going through gigabytes of assets.
Jul 25 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 25 Jul 2012 22:18:37 +0200
"David Piepgrass" <qwertie256 gmail.com> wrote:
 
 I meant to ask, why would it recompile *all* of the source files 
 if only one changed? Seems like it only should recompile the 
 changed ones (but still compile them together as a unit.) Is it 
 because of bugs (e.g. the template problem you mentioned)?

I'm not 100% certain, but, yes, I think it's a combination of that, and the fact that nobody's actually gone and tried to make that change to RDMD yet. AIUI, the original motivating purpose for RDMD was to be able to execute a D source file as if it were a script. So finding all relevant source files, passing them to DMD, etc, were all just necessary steps towards that end. Which turned out to also be useful in many cases for general project building.
Jul 25 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 25 Jul 2012 21:54:29 +0200
"David Piepgrass" <qwertie256 gmail.com> wrote:

 Thanks for the very good description, Nick! So if I understand 
 correctly, if
 
 1. I use an "auto" return value or suchlike in a module Y.d
 2. module X.d calls this function
 3. I call "dmd -c X.d" and "dmd -c Y.d" as separate steps
 

See, now you're getting into some details that I'm not entirely familiar with ;)...
 Then the compiler will have to fully parse Y twice and fully 
 analyze the Y function twice, although it generates object code 
 for the function only once. Right?

That's my understanding of it, yes.
 I wonder how smart it is about 
 not analyzing things it does not need to analyze (e.g. when Y is 
 a big module but X only calls one function from it - the compiler 
 has to parse Y fully but it should avoid most of the semantic 
 analysis.)

I don't know how smart it is about that. If you have a template that never gets instantiated by *anything*, then I do know that semantic analysis won't get run on it since D's templates, like C++ templates (and unlike C#'s generics) can *only* be evaluated once they're instantiated. If, OTOH, you have a plain old function that never gets called, I'm guessing semantics probably still get run on it. Anything else: I dunno. :/
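That instantiation-only analysis is easy to see: a template whose body could never pass semantic analysis is still accepted, as long as nothing instantiates it. A small sketch (the names are made up for illustration):

```d
// The body calls a method that doesn't exist anywhere, yet this
// module compiles fine: like C++ templates, the body is parsed
// but only semantically analyzed at instantiation time.
auto broken(T)(T x)
{
    return x.noSuchMethod();
}

// Uncommenting this instantiation is what would trigger the
// "no property noSuchMethod" error:
//auto y = broken(42);
```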
 
 What about templates? In C++ it is a problem that the compiler 
 will instantiate templates repeatedly, say if I use 
 vector<string> in 20 source files, the compiler will generate and 
 store 20 copies of vector<string> (plus 20 copies of 
 basic_string<char>, too) in object files.
 
 1. So in D, if I compile the 20 sources separately, does the same 
 thing happen (same collection template instantiated 20 times with 
 all 20 copies stored)?

Again, I'm not certain about this, other people would be able to answer better, but I *think* it works like this: If you pass all the files into DMD at once, then it'll only evaluate and generate code for vector<string> once. If you pass the files in as separate calls to DMD, then it'll do semantic analysis on vector<string> twenty times, and I have no idea whether code will get generated one time or twenty times.
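As a concrete sketch of the two modes being compared (file names are hypothetical; DMD names objects .o on Posix, .obj on Windows):

```
# All at once: a single invocation analyzes every module together,
# so a template shared by all of them is instantiated once.
dmd app.d x.d y.d

# Separately: each invocation re-parses and re-analyzes whatever
# it imports, and may re-instantiate the same templates.
dmd -c app.d
dmd -c x.d
dmd -c y.d
dmd app.o x.o y.o
```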
 2. If I compile the 20 sources all together, I guess the template 
 would be instantiated just once, but then which .obj file does 
 the instantiated template go in?
 

Unless things have been fixed since last I heard, this is actually the root of the problem with incremental compilation and templates. The compiler apparently makes some odd, or maybe inconsistent choices about what obj to stick the template into. I don't know the details of it though, just that in the past, people attempting to do incremental compilation have run into occasional linking issues that were traced back to problems in how DMD handles where to put instantiated templates.
 
 I don't even want to legitimize C++ compiler speed by comparing 
 it to any other language ;)
 

Fair enough :)
 - Is there any concept of an incremental build?

Yes, but there's a few "gotcha"s: 1. D compiles so damn fast that it's not nearly as much of an issue as it is with C++ (which is notoriously ultra-slow compared to...everything, hence the monumental importance of C++'s incremental builds).

I figure as CTFE is used more, especially when it is used to decide which template overloads are valid or how a mixin will behave, this will slow down the compiler more and more, thus making incremental builds more important. A typical example would be a compile-time parser-generator, or compiled regexes.

That's probably a fair assumption.
 Plus, I've heard some people complaining that the compiler uses 
 over 1 GB RAM, and splitting up compilation into parts might help 
 with that.
 

Yea, the problem is, DMD doesn't currently free any of the memory it takes, so mem usage just grows and grows. That's a known issue that needs to be taken care of at some point.
 BTW, I think I heard the compiler uses multithreading to speed up 
 the build, is that right?
 

Yes, it does. But someone else will have to explain how it actually uses multithreading, ie, what it multithreads, because I've got no clue ;) I think it's fairly coarse-grained, like on the module-level, but that's all I know.
 It keeps diving deeper and deeper to find anything it can "start" with.
 Once it finds that, it'll just build everything back up in whatever
 order is necessary.

I hope someone can give more details about this.

I hope so too :)
Jul 25 2012
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Wed, 2012-07-25 at 01:03 -0700, Jonathan M Davis wrote:
[…]
 I've heard of overnight builds, and I've heard of _regression tests_ running
 for over a week, but I've never heard of builds being over 2 days. Ouch.

Indeed the full test suite did take about a week to run. I think the core problem then was it was 2006, computers were slower, parallel compilation was not as well managed as multicore hadn't really taken hold, and they were doing the equivalent of trying -O2 and -O3 to see which space/time balance was best.

 It has got to have been possible to have a shorter build than that. Of course,
 if their code was bad enough that the build was that long, it may have been
 rather disgusting code to clean up. But then again, maybe they genuinely had a
 legitimate reason for having the build take that long. I'd be very surprised
 though.

These were smart people, so my suspicion is very much that there was a necessary complexity. I think there was also an element of they were in the middle of a global refactoring. I suspect they have now had time to get stuff into a better state, but I do not know.

 In any case, much as I like C++ (not as much as D, but I still like it quite a
 bit), its build times are undeniably horrible.

Indeed, especially with -O2 or -O3. This is an area where VM + JIT can actually make things a lot better. Optimization happens on actually running code and is therefore focused on the "hot spot" rather than trying to optimize the entire code base. Java is doing this quite successfully, as is PyPy.

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road     m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK    w: www.russel.org.uk  skype: russel_winder
Jul 26 2012
prev sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Thu, 26 Jul 2012 09:27:03 +0100
Russel Winder <russel winder.org.uk> wrote:

 On Wed, 2012-07-25 at 01:03 -0700, Jonathan M Davis wrote:
 
 In any case, much as I like C++ (not as much as D, but I still like
 it quite a bit), its build times are undeniably horrible.

Indeed, especially with -O2 or -O3. This is an area where VM + JIT can actually make things a lot better. Optimization happens on actually running code and is therefore focused on the "hot spot" rather than trying to optimize the entire code base. Java is doing this quite successfully, as is PyPy.

That's not something that actually necessitates a VM though. It's just that no native-compiled language (to my knowledge) has actually put something like that into its runtime yet.
Jul 26 2012