digitalmars.D.announce - Please try rdmd on large projects

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


I just submitted
(https://github.com/D-Programming-Language/tools/commit/c77b870fdc5674d7434b0d1767ba831eaac25b1)
a change to rdmd that runs one thread per stat when comparing file
dates, using David's excellent std.parallelism.
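
For readers who have not opened the commit, here is a minimal sketch of the idea of issuing the stat calls in parallel. It is not the code from the commit: the name anyNewerThan is borrowed from a snippet quoted later in this thread, but the signature and body are assumptions made for illustration.

```d
import core.atomic : atomicLoad, atomicStore;
import std.datetime : SysTime;
import std.file : timeLastModified;
import std.parallelism : parallel;

/// Returns true if any file in `sources` was modified after `target`.
/// A missing `target` counts as infinitely old, so any existing source is newer.
/// Illustrative only: signature and body are assumptions, not rdmd's code.
bool anyNewerThan(string[] sources, string target)
{
    // One stat for the target, done up front.
    const targetTime = timeLastModified(target, SysTime.min);

    shared bool found = false;
    // Work-unit size 1: each stat call becomes its own task on the pool.
    foreach (source; parallel(sources, 1))
    {
        // Skip the stat once another task has already found a newer file.
        if (atomicLoad(found)) continue;
        // A missing source is treated as newest, which forces a rebuild.
        if (timeLastModified(source, SysTime.max) > targetTime)
            atomicStore(found, true);
    }
    return atomicLoad(found);
}
```

On a single-core machine the task pool has no extra workers, so the loop should degrade to roughly the sequential cost, which fits the "no additional lag" observation.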

In my experiment the change introduces no additional lag on small 
projects and works 10-15% faster on moderate projects (couple dozen deps).

Could someone try rdmd against some larger projects and assess its 
behavior and speed?


Thanks,

Andrei
Feb 20 2012
Juan Manuel Cabo <juanmanuel.cabo gmail.com> writes:
GOOD!

Is the missing chmod problem fixable? So that the
binary has the same permissions as the D file?
If my D file is not readable or runnable by 'other',
the binary shouldn't be either. (the cached .deps should
have the same readability as the D file too perhaps).
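
One way the permission request above could be handled is sketched below. This is an assumption about a possible fix, not anything rdmd does: restrictedMode and matchPermissions are hypothetical names, and the code is POSIX-only because it treats the value returned by getAttributes as a Unix mode.

```d
import std.conv : octal;
import std.file : getAttributes, setAttributes;

// Standard POSIX permission bits, spelled in octal for readability.
enum uint groupRead = octal!40;
enum uint groupAll  = octal!70;
enum uint otherRead = octal!4;
enum uint otherAll  = octal!7;

/// Hypothetical helper: drops every group/other bit from the binary's mode
/// when the source is not readable by that class of user.
/// Owner bits are left alone.
uint restrictedMode(uint sourceMode, uint binaryMode)
{
    uint mode = binaryMode;
    if (!(sourceMode & groupRead)) mode &= ~groupAll;
    if (!(sourceMode & otherRead)) mode &= ~otherAll;
    return mode;
}

/// Hypothetical helper: applies restrictedMode to a freshly built binary.
void matchPermissions(string sourcePath, string binaryPath)
{
    immutable sourceMode = getAttributes(sourcePath);
    immutable binaryMode = getAttributes(binaryPath);
    // Keep only the permission bits; the file-type bits are not chmod's business.
    setAttributes(binaryPath, restrictedMode(sourceMode, binaryMode) & octal!7777);
}
```

Copying the source mode verbatim would not work, because a script is often not executable while the binary must be; masking the binary's own mode keeps the execute bit for whoever may still read the source.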


I think that this is the big timesaver:

     rdmd: cache dependency file to improve startup time

So: Big Thanks!! I was using a wrapper for rdmd that only
called rdmd if the file was modified (which worked great for
small one-file scripts; those 300ms to 1000ms startup
delays were unbearable).
With the .deps caching, rerun time went down to 20ms.
It was 300ms ~ 1000ms before (depending on how many imports).

I think that 20ms is still too slow (for certain applications,
it is just too much).

When rdmd asks dmd to generate the dependencies of my_file.d,
dmd goes further and parses phobos files, opening
every module file along the dependency chain. I think
that was the major slow part.

Is it possible to have an option to skip rechecking inside phobos
dependencies each time? That would be the thing that brings
it down to < 5ms.

--jm


Feb 20 2012
Juan Manuel Cabo <juanmanuel.cabo gmail.com> writes:
Doing:

  ltrace -e open dmd -deps=outdeps.txt example.d

and:

  ltrace -e read dmd -deps=outdeps.txt example.d

shows that dmd opens and reads a lot of phobos and druntime
to generate the dependencies of:

     import std.stdio;
     void main() { writeln("something");}

--jm


Feb 20 2012
Juan Manuel Cabo <juanmanuel.cabo gmail.com> writes:
I said:

> Is it possible to have an option to skip rechecking inside phobos
> dependencies each time? That would be the thing that brings
> it down to < 5ms.


but I ran:

  ltrace -e __xstat64 ./rdmd bla.d

and saw that rdmd doesn't recheck phobos, so SORRY, never mind what I
said! I also found the inALibrary() function later.

But even though rdmd no longer rechecks phobos, "dmd -deps" does.

By the way, the .deps caching has two problems as it stands right now
on GitHub:

1) BUG: the .deps file should be rebuilt if the root file changes.
   The check misses the isNewer(root, exe) part:

     // See if the deps file is still in good shape
     auto deps = readDepsFile();
     bool mustRebuildDeps = anyNewerThan(deps.keys, depsFilename);
     if (!mustRebuildDeps)

   So if I add an import to a dependency, the .deps file gets rebuilt,
   but if I add an import to the D file itself, it doesn't, and the
   only solution is to delete the .deps file. And if the .deps file
   moves to the /tmp dir, some users will not realize that they have
   to delete it and will get stuck.

2) The .deps file is written to the same directory as the D root file
   instead of /tmp. One might not have write access to the D script
   directory, just read access.

--jm
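
Both points can be sketched in a few lines. This is only an illustration under stated assumptions, not a patch against rdmd: mustRebuildDeps here is a free function taking the dependency list as a plain array, depsCachePath and the "rdmd-sketch-" file prefix are hypothetical, and hashing the absolute path with MD5 is just one way to get a per-script file name.

```d
import std.datetime : SysTime;
import std.digest : toHexString;
import std.digest.md : md5Of;
import std.file : tempDir, timeLastModified;
import std.path : absolutePath, buildPath;

/// Problem 1: the cache is stale if the root file OR any dependency is
/// newer than the cached deps file. Names and signature are assumptions.
bool mustRebuildDeps(string rootFile, string[] deps, string depsFile)
{
    // No cache yet: fall back to SysTime.min so everything counts as newer.
    const depsTime = timeLastModified(depsFile, SysTime.min);
    // The check described as missing: an edited root file invalidates the cache.
    if (timeLastModified(rootFile, SysTime.max) > depsTime) return true;
    foreach (dep; deps)
        if (timeLastModified(dep, SysTime.max) > depsTime) return true;
    return false;
}

/// Problem 2: keep the cache under the temp directory, keyed by the
/// absolute path of the root file, so a read-only script directory works.
string depsCachePath(string rootFile)
{
    auto hash = md5Of(absolutePath(rootFile));
    return buildPath(tempDir(), "rdmd-sketch-" ~ toHexString(hash[]) ~ ".deps");
}
```

With the root file included in the freshness check, a cache hidden away under the temp directory no longer needs to be deleted by hand after an import is added to the script.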
Feb 20 2012
"Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:jhugqd$26v0$1 digitalmars.com...
> Hello,
>
> I just submitted
> (https://github.com/D-Programming-Language/tools/commit/c77b870fdc5674d7434b0d1767ba831eaac25b1)
> a change to rdmd that runs one thread per stat when comparing file dates,
> using David's excellent std.parallelism.
>
> In my experiment the change introduces no additional lag on small projects
> and works 10-15% faster on moderate projects (couple dozen deps).
>
> Could someone try rdmd against some larger projects and assess its
> behavior and speed?

Finally got a chance to try this. The projects probably aren't as big as
what you had in mind, but I tried:

rdmd 0124c6b61a
VS
rdmd 75f292fffd

Compiling:
- All of Goldie's targets with DMD 2.058
- DDMD with DMD 2.053 (the build failed pretty quickly, but not until
  after RDMD would run DMD for the second time)

On:
- 32-bit single-core Linux, using an HDD
- 32-bit single-core Windows, using an HDD
- 64-bit dual-core Linux, using a USB flash drive

On all combinations I got no difference between the two versions of RDMD
(which is good in the case of the single-core machines, of course). The
one slight exception was that compiling Goldie's targets on the 64-bit
dual-core machine was about 1-2% faster with the newer RDMD (75f292fffd)
than with the older (0124c6b61a). Could just be noise, though.

There was one weird anomaly: when compiling the Goldie targets on the
64-bit dual-core machine, I compiled all of Goldie 8 times with each of
the two RDMDs. On *ONE* of the compilations with the newer RDMD, one of
the targets failed to build with a DMD ICE:

dmd: ../ztc/aa.c:423: void AArray::rehash_x(aaA*, aaA**, size_t): Assertion `0' failed.

I hadn't touched anything during or in between compilations.
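
For anyone who wants to repeat this kind of comparison, a small timing harness along the following lines may help separate a 1-2% difference from noise. It is a sketch, not the procedure used for the numbers above: timeRuns and median are hypothetical helpers, and the command line passed in is whatever rdmd invocation is being compared.

```d
import core.time : Duration;
import std.algorithm.sorting : sort;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.process : execute;

/// Runs `command` `runs` times and returns the wall-clock time of each run.
Duration[] timeRuns(string[] command, size_t runs)
{
    Duration[] timings;
    foreach (_; 0 .. runs)
    {
        auto sw = StopWatch(AutoStart.yes);
        execute(command);   // output and exit status are ignored in this sketch
        timings ~= sw.peek();
    }
    return timings;
}

/// The median is less sensitive than the mean to a single slow outlier run.
/// For an even number of samples this picks the upper of the two middle values.
Duration median(Duration[] timings)
{
    assert(timings.length > 0, "need at least one timing");
    auto sorted = timings.dup;
    sort(sorted);
    return sorted[$ / 2];
}
```

Calling timeRuns once per rdmd binary with the same arguments and comparing the medians gives a rough answer; file-system caching between runs still affects the result, so the first run of each series is best discarded.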
Mar 01 2012