
digitalmars.D - Analyze a D file for imports

reply "Maaaks" <o ololo.im> writes:
I want to make a simple build utility that will rebuild only 
those files which changed since last build and those files that 
depend on them.

Which is the easiest and yet reliable way to parse a D source and 
find all imports in it (and file import()s as well)?
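
The rebuild rule described here (changed files plus everything that depends on them) can be sketched as a walk over the inverted import graph. This is only a minimal illustration in Python, not an existing tool; the function name `files_to_rebuild` and the sample file names are hypothetical, and the graph is assumed to have been collected already by some import scanner.

```python
from collections import defaultdict, deque

def files_to_rebuild(deps, changed):
    """deps maps each file to the set of files it imports;
    changed is the set of files modified since the last build."""
    # Invert the graph: for every file, record who imports it.
    dependents = defaultdict(set)
    for src, imported in deps.items():
        for dep in imported:
            dependents[dep].add(src)

    # Breadth-first walk from the changed files through their dependents.
    dirty = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents[current]:
            if user not in dirty:
                dirty.add(user)
                queue.append(user)
    return dirty

# Hypothetical project: app.d imports ui.d and db.d,
# and ui.d pulls in page.html through a string import.
deps = {
    "app.d": {"ui.d", "db.d"},
    "ui.d": {"page.html"},
    "db.d": set(),
}
print(sorted(files_to_rebuild(deps, {"page.html"})))
```

The result also contains the changed non-D file itself, so a real tool would filter the set down to compilable sources before invoking the compiler.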
Jun 29 2015
next sibling parent reply "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 04:02:00 UTC, Maaaks wrote:
 I want to make a simple build utility that will rebuild only 
 those files which changed since last build and those files that 
 depend on them.

 Which is the easiest and yet reliable way to parse a D source 
 and find all imports in it (and file import()s as well)?
dscanner has import analysis, it's likely tied to libdparse. That might be something worth investigating.
Jun 29 2015
parent "Maaaks" <o ololo.im> writes:
On Tuesday, 30 June 2015 at 04:08:48 UTC, rsw0x wrote:
 On Tuesday, 30 June 2015 at 04:02:00 UTC, Maaaks wrote:
 I want to make a simple build utility that will rebuild only 
 those files which changed since last build and those files 
 that depend on them.

 Which is the easiest and yet reliable way to parse a D source 
 and find all imports in it (and file import()s as well)?
dscanner has import analysis, it's likely tied to libdparse. That might be something worth investigating.
Yes, it seems to be enough for me. Module imports can be retrieved directly by `dscanner --imports`, and string imports can be found in `dscanner --ast`. Thank you!
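
A rough sketch of how a build script might collect module imports this way, in Python. It assumes that `dscanner --imports` prints one module name per line, which is an assumption about the output format and worth checking against the installed version; the helper names `parse_import_list` and `module_imports` are made up for illustration.

```python
import subprocess

def parse_import_list(output):
    """Turn one-module-per-line text into a sorted list of unique names.
    The one-per-line format is an assumption about dscanner's output."""
    modules = {line.strip() for line in output.splitlines() if line.strip()}
    return sorted(modules)

def module_imports(path, dscanner="dscanner"):
    """Run `dscanner --imports <path>` and return the module names it prints."""
    result = subprocess.run(
        [dscanner, "--imports", path],
        capture_output=True, text=True, check=True,
    )
    return parse_import_list(result.stdout)

# Hypothetical output, used here instead of a real dscanner run.
sample = "std.stdio\nstd.algorithm\nstd.stdio\n"
print(parse_import_list(sample))
```

String imports are not covered by this sketch; as noted above, they would have to be picked out of the `dscanner --ast` output separately.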
Jun 29 2015
prev sibling parent reply "anonymous" <anonymous example.com> writes:
On Tuesday, 30 June 2015 at 04:02:00 UTC, Maaaks wrote:
 I want to make a simple build utility that will rebuild only 
 those files which changed since last build and those files that 
 depend on them.

 Which is the easiest and yet reliable way to parse a D source 
 and find all imports in it (and file import()s as well)?
Here's a pull request to make rdmd do that:
https://github.com/D-Programming-Language/tools/pull/170

It uses regex over `dmd -deps`.

Be aware of the challenges: Compiling source files separately is slower than passing them all at once to the compiler. This is why for rdmd the idea is to run dmd once on all source files that need updating. Alas, dmd then behaves a little quirkily with regard to where template instantiations go [1]. This is currently blocking the PR.

[1] https://github.com/D-Programming-Language/tools/pull/170#issuecomment-112526734
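
To give an idea of what "regex over `dmd -deps`" could look like, here is a small Python sketch. It is not the code from the pull request. The line shape it matches (`module (path) : protection : module (path)`, optionally prefixed with `depsImport`) is an assumption about the compiler's output and may differ between dmd versions; `DEPS_LINE`, `parse_deps` and the sample text are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape (verify against your dmd version):
#   depsImport app (app.d) : private : std.stdio (/path/to/std/stdio.d)
DEPS_LINE = re.compile(
    r"^(?:depsImport\s+)?"                        # optional prefix
    r"(?P<src>\S+)\s+\((?P<src_path>[^)]*)\)"     # importing module and its file
    r"\s*:\s*\w+\s*:\s*"                          # protection, e.g. "private"
    r"(?P<dep>\S+)\s+\((?P<dep_path>[^)]*)\)"     # imported module and its file
)

def parse_deps(output):
    """Map each source file to the set of files it imports."""
    graph = defaultdict(set)
    for line in output.splitlines():
        match = DEPS_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed shape are skipped
            graph[match["src_path"]].add(match["dep_path"])
    return dict(graph)

# Hypothetical compiler output.
sample = """\
depsImport app (app.d) : private : std.stdio (/usr/include/dmd/phobos/std/stdio.d)
depsImport app (app.d) : private : ui (ui.d)
some unrelated compiler message
"""
print(parse_deps(sample))
```

Paths containing parentheses would break this pattern, and string imports would need their own line pattern.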
Jun 30 2015
next sibling parent reply "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 13:21:18 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 04:02:00 UTC, Maaaks wrote:
 I want to make a simple build utility that will rebuild only 
 those files which changed since last build and those files 
 that depend on them.

 Which is the easiest and yet reliable way to parse a D source 
 and find all imports in it (and file import()s as well)?
Here's a pull request to make rdmd do that:
https://github.com/D-Programming-Language/tools/pull/170

It uses regex over `dmd -deps`.

Be aware of the challenges: Compiling source files separately is slower than passing them all at once to the compiler.
this is only true for dmd
Jun 30 2015
parent reply "anonymous" <anonymous example.com> writes:
On Tuesday, 30 June 2015 at 13:22:10 UTC, rsw0x wrote:
 Be aware of the challenges:
 Compiling source files separately is slower than passing them 
 all at once to the compiler.
this is only true for dmd
As far as I understand, the slowdown comes from parsing common dependencies again and again. Any compiler that doesn't cache the ASTs should be affected, no?

A quick check with two files that both just import std.stdio suggests that ldc2 behaves similarly to dmd. gdc seems to do things differently, but it takes relatively long anyway, which may obscure things.

$ time (dmd -c test.d test2.d)
real 0m0.028s
user 0m0.020s
sys 0m0.009s

$ time (dmd -c test.d; dmd -c test2.d)
real 0m0.059s
user 0m0.046s
sys 0m0.013s

$ time (ldc2 -c test.d test2.d)
real 0m0.046s
user 0m0.020s
sys 0m0.025s

$ time (ldc2 -c test.d; ldc2 -c test2.d)
real 0m0.090s
user 0m0.064s
sys 0m0.025s

$ time (gdc -c test.d test2.d)
real 0m0.499s
user 0m0.358s
sys 0m0.138s

$ time (gdc -c test.d; gdc -c test2.d)
real 0m0.499s
user 0m0.342s
sys 0m0.155s
Jun 30 2015
parent reply "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 14:07:20 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 13:22:10 UTC, rsw0x wrote:
 [...]
As far as I understand, the slowdown comes from parsing common dependencies again and again. Any compiler that doesn't cache the ASTs should be affected, no? [...]
You're skipping the part where they can be run in parallel; dmd sees no benefit from this.
Jun 30 2015
parent reply "anonymous" <anonymous example.com> writes:
On Tuesday, 30 June 2015 at 14:18:20 UTC, rsw0x wrote:
 you're skipping the part where they can be run in parallel, dmd 
 sees no benefit from this.
Could you elaborate? Surely, one can run multiple instances of dmd in parallel, no?

In my (possibly flawed) understanding, to get the quickest compile one would then:
* Determine the optimal number of parallel processes.
* Split the source files into that many chunks.
* Run parallel instances of the compiler, one on each of those chunks.
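
The three steps above could be sketched roughly like this in Python. It is an illustration only: `split_into_chunks` and `compile_in_parallel` are hypothetical names, the CPU count stands in for the "optimal number of processes", and the round-robin split ignores how expensive each file is to compile.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(files, n):
    """Distribute files round-robin into at most n non-empty chunks."""
    if not files:
        return []
    n = max(1, min(n, len(files)))
    return [files[i::n] for i in range(n)]

def compile_in_parallel(files, compiler="dmd", jobs=None):
    """Run one compiler process per chunk and return the exit codes."""
    jobs = jobs or os.cpu_count() or 1  # stand-in for the optimal process count
    chunks = split_into_chunks(files, jobs)
    if not chunks:
        return []
    # Threads only wait on the child processes, so a thread pool is enough.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        futures = [
            pool.submit(subprocess.run, [compiler, "-c", *chunk])
            for chunk in chunks
        ]
        return [future.result().returncode for future in futures]

print(split_into_chunks(["a.d", "b.d", "c.d", "d.d", "e.d"], 2))
```

Calling `compile_in_parallel(["a.d", "b.d"], jobs=2)` would start two compiler processes, one per file; each chunk still re-parses the dependencies it shares with the other chunks.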
Jun 30 2015
next sibling parent reply "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 14:28:12 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 14:18:20 UTC, rsw0x wrote:
 you're skipping the part where they can be run in parallel, 
 dmd sees no benefit from this.
Could you elaborate? Surely, one can run multiple instances of dmd in parallel, no?

In my (possibly flawed) understanding, to get the quickest compile one would then:
* Determine the optimal number of parallel processes.
* Split the source files into that many chunks.
* Run parallel instances of the compiler, one on each of those chunks.
dmd scales extremely poorly across threads, to the poor where I got negative performance in parallel. LDC runs faster than dmd in parallel on my xeon machine.
Jun 30 2015
parent "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 14:29:25 UTC, rsw0x wrote:
 On Tuesday, 30 June 2015 at 14:28:12 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 14:18:20 UTC, rsw0x wrote:
 you're skipping the part where they can be run in parallel, 
 dmd sees no benefit from this.
Could you elaborate? Surely, one can run multiple instances of dmd in parallel, no?

In my (possibly flawed) understanding, to get the quickest compile one would then:
* Determine the optimal number of parallel processes.
* Split the source files into that many chunks.
* Run parallel instances of the compiler, one on each of those chunks.
dmd scales extremely poorly across threads, to the poor where
argh to the point* :)
Jun 30 2015
prev sibling parent reply "Maaaks" <o ololo.im> writes:
On Tuesday, 30 June 2015 at 14:28:12 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 14:18:20 UTC, rsw0x wrote:
 you're skipping the part where they can be run in parallel, 
 dmd sees no benefit from this.
Could you elaborate? Surely, one can run multiple instances of dmd in parallel, no?

In my (possibly flawed) understanding, to get the quickest compile one would then:
* Determine the optimal number of parallel processes.
* Split the source files into that many chunks.
* Run parallel instances of the compiler, one on each of those chunks.
I would try to do that, but the reason I am going to write a build system is that my project (which contains too many imported HTML templates) requires too much memory to compile. So I also need to think about how to keep the chunks from requiring too much memory.
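
One possible way to approach the memory side, sketched in Python: give every source file an estimated cost and pack files greedily into chunks that stay under a budget. Everything here is an assumption made for illustration, including the name `pack_by_budget`, the cost numbers, and the idea that a per-file estimate (for example source size plus the size of its string-imported templates) is a usable proxy for compiler memory.

```python
def pack_by_budget(costs, budget):
    """costs maps file -> estimated cost; returns chunks whose summed
    cost stays within budget. A single file above the budget still
    gets a chunk of its own."""
    chunks = []  # each entry: {"total": summed cost, "files": [names]}
    # Place the most expensive files first (first-fit decreasing).
    for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        for chunk in chunks:
            if chunk["total"] + cost <= budget:
                chunk["files"].append(name)
                chunk["total"] += cost
                break
        else:
            # No existing chunk has room, so open a new one.
            chunks.append({"total": cost, "files": [name]})
    return [chunk["files"] for chunk in chunks]

# Hypothetical cost estimates in arbitrary units.
costs = {"a.d": 700, "b.d": 400, "c.d": 300, "d.d": 200}
print(pack_by_budget(costs, 1000))
```

The heuristic does not find the minimum number of chunks in general, and the actual memory use of a compiler run would have to be measured to pick a sensible budget.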
Jun 30 2015
parent "Atila Neves" <atila.neves gmail.com> writes:
On Tuesday, 30 June 2015 at 14:31:26 UTC, Maaaks wrote:
 On Tuesday, 30 June 2015 at 14:28:12 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 14:18:20 UTC, rsw0x wrote:
 [...]
Could you elaborate? Surely, one can run multiple instances of dmd in parallel, no?

In my (possibly flawed) understanding, to get the quickest compile one would then:
* Determine the optimal number of parallel processes.
* Split the source files into that many chunks.
* Run parallel instances of the compiler, one on each of those chunks.

I would try to do that, but the reason I am going to write a build system is that my project (which contains too many imported HTML templates) requires too much memory to compile. So I also need to think about how to keep the chunks from requiring too much memory.
Build system, you say? https://github.com/atilaneves/reggae

Atila
Jun 30 2015
prev sibling parent reply "Atila Neves" <atila.neves gmail.com> writes:
On Tuesday, 30 June 2015 at 13:21:18 UTC, anonymous wrote:
 On Tuesday, 30 June 2015 at 04:02:00 UTC, Maaaks wrote:
 I want to make a simple build utility that will rebuild only 
 those files which changed since last build and those files 
 that depend on them.

 Which is the easiest and yet reliable way to parse a D source 
 and find all imports in it (and file import()s as well)?
Here's a pull request to make rdmd do that:
https://github.com/D-Programming-Language/tools/pull/170

It uses regex over `dmd -deps`.

Be aware of the challenges: Compiling source files separately is slower than passing them all at once to the compiler. This is why for rdmd the idea is to run dmd once on all source files that need updating. Alas, dmd then behaves a little quirkily with regard to where template instantiations go [1]. This is currently blocking the PR.

[1] https://github.com/D-Programming-Language/tools/pull/170#issuecomment-112526734
rdmd doesn't run on files that need updating; it always compiles everything. It has no concept of "needs updating". What it _does_ do is figure out what "everything" means.

Atila
Jun 30 2015
parent "anonymous" <anonymous example.com> writes:
On Tuesday, 30 June 2015 at 14:49:11 UTC, Atila Neves wrote:
 On Tuesday, 30 June 2015 at 13:21:18 UTC, anonymous wrote:
[...]
 Here's a pull request to make rdmd do that:
 https://github.com/D-Programming-Language/tools/pull/170
[...]
 rdmd doesn't run on files that need updating; it always 
 compiles everything, it has no concept of needs updating. What 
 it _does_ do is figure out what "everything" means.
The PR would change that.
Jun 30 2015