
digitalmars.D.bugs - [Issue 9673] New: Add --incremental option to rdmd

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673

           Summary: Add --incremental option to rdmd
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: andrei erdani.com



PST ---
Currently rdmd builds a program as follows:

1. Fetch the main file (e.g. main.d) from the command line

2. Compute transitive dependencies for main.d and cache them in a main.deps
file in a private directory. This computation is done only when dependencies
change and the main.deps file gets out of date.

3. Build an executable passing main.d and all of its dependencies on the same
command line to dmd

This setup has a number of advantages and disadvantages. For large projects
built of relatively independent parts, an --incremental option should allow a
different approach to building:

1. Fetch the main file (e.g. main.d) from the command line

2. Compute transitive dependencies for main.d and cache them in a main.deps
file in a private directory

3. For each such discovered file compute its own transitive dependencies in a
worklist approach, until all dependencies of all files in the project are
computed and cached in one .deps file for each .d file in the project. This
computation shall be done only when dependencies change and some .deps files
get out of date.

4. Invoke dmd once per .d file, producing object files (only for object files
that are out of date). Invocations should be runnable in parallel, but this may
be left as a future enhancement.

5. Invoke dmd once with all object files to link the code.

The added feature should not interfere with the existing setup. Users should
be able to compare and contrast the two approaches just by adding or removing
--incremental on the rdmd command line.
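
As a rough illustration of steps 2 to 4 of the proposed process, here is a
minimal Python sketch of the worklist and the out-of-date check. It is not
rdmd's code. The `imports` mapping stands in for whatever the compiler reports
as direct imports, and all function names, the `mtime`/`obj_mtime` tables and
the file names are assumptions made up for the example.

```python
from collections import deque

def transitive_deps(start, imports):
    """Return every file reachable from `start` through `imports`.

    `start` itself only appears in the result if it is part of an
    import cycle.
    """
    seen = set()
    work = deque(imports.get(start, ()))
    while work:
        mod = work.popleft()
        if mod in seen:
            continue
        seen.add(mod)
        work.extend(imports.get(mod, ()))
    return seen

def project_deps(main, imports):
    """Worklist over the project: one dependency set per discovered file
    (the in-memory analogue of one .deps file per .d file)."""
    deps = {}
    work = deque([main])
    while work:
        mod = work.popleft()
        if mod in deps:
            continue
        deps[mod] = transitive_deps(mod, imports)
        work.extend(deps[mod])
    return deps

def out_of_date(module, deps, mtime, obj_mtime):
    """An object file is stale if it is missing, or older than its own
    source or any of that source's transitive dependencies."""
    built = obj_mtime.get(module)
    if built is None:
        return True
    return any(mtime[f] > built for f in {module} | deps[module])

# Hypothetical project: main.d imports a.d and b.d, a.d imports c.d.
imports = {"main.d": ["a.d", "b.d"], "a.d": ["c.d"], "b.d": [], "c.d": []}
deps = project_deps("main.d", imports)
mtime = {"main.d": 1, "a.d": 1, "b.d": 1, "c.d": 5}      # c.d was just edited
obj_mtime = {"main.d": 3, "a.d": 3, "b.d": 3, "c.d": 3}  # last build at t=3
stale = sorted(m for m in deps if out_of_date(m, deps, mtime, obj_mtime))
```

Under these assumptions, each entry in `stale` would get its own compiler
invocation producing an object file, followed by a single link step; b.d is
left alone because nothing it depends on has changed.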

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 09 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673


Vladimir Panteleev <thecybershadow gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cbkbbejeap mailinator.com



07:06:31 EET ---
*** Issue 4686 has been marked as a duplicate of this issue. ***

Mar 09 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673


Martin Nowak <code dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code dawg.eu




 4. Invoke dmd once per .d file, producing object files (only for object files
 that are out of date). Invocations should be runnable in parallel, but this may
 be left as a future enhancement.
 
It should cluster the source files by common dependencies so as to avoid the
parsing and semantic analysis overhead of the blunt parallel approach. I think
a simple k-means clustering would suffice for this; k would be the number of
parallel jobs.

Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673


Vladimir Panteleev <thecybershadow gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thecybershadow gmail.com



18:22:20 EET ---
How would it matter? With the current limitations, you still need to launch
the compiler once per source file.

Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673




You save time by invoking "dmd -c" k times, once per cluster.

Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673




18:35:29 EET ---
Martin, I think you're missing some information. Incremental compilation is
currently not reliably possible when more than one file is passed to the
compiler at a time. Please check the thread on the newsgroup for more
discussion on the topic.

Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673





We should fix Bug 9571 et al. rather than using them as design constraints.
Of course we'll have to do single invocation as a workaround.
All I want to contribute is an idea for how to optimize rebuilds.

Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673




19:14:49 EET ---


 We should fix Bug 9571 et al.

Issue 9571 describes a problem with compiling files one at a time.

 rather than using them as design constraints.
 Of course we'll have to do single invocation as a workaround.

Yes.

 All I want to contribute is an idea for how to optimize rebuilds.

I think sorting the file list (incl. path) is a crude but simple approximation
of your idea, assuming the project follows sensible conventions for package
structure.
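
The sorted-list approximation could look roughly like the Python sketch below.
It assumes that compiling several files per invocation has become reliable,
which the thread says is not yet the case. The function name, `k` and the file
paths are made up for illustration.

```python
def chunk_sorted(files, k):
    """Sort paths and cut the list into at most k contiguous chunks, so
    files from the same package tend to land in the same chunk."""
    ordered = sorted(files)
    size, extra = divmod(len(ordered), k)
    chunks, start = [], 0
    for i in range(k):
        # The first `extra` chunks take one more file each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(ordered[start:end])
        start = end
    return chunks

files = ["std/net/curl.d", "std/algorithm.d", "core/thread.d",
         "std/net/isemail.d", "core/time.d"]
chunks = chunk_sorted(files, 2)
# chunks[0] == ["core/thread.d", "core/time.d", "std/algorithm.d"]
# chunks[1] == ["std/net/curl.d", "std/net/isemail.d"]
```

The split only looks at path order, so it is as crude as described: a file can
end up grouped with a neighbouring package purely because of where the cut
falls.
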
Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673




PDT ---


 4. Invoke dmd once per .d file, producing object files (only for object files
 that are out of date). Invocations should be runnable in parallel, but this may
 be left as a future enhancement.
 
 It should cluster the source files by common dependencies so as to avoid the
 parsing and semantic analysis overhead of the blunt parallel approach. I think
 a simple k-means clustering would suffice for this; k would be the number of
 parallel jobs.

Great idea, although we'd need to amend things. First, the graph is directed
(not sure whether k-means clustering is directly applicable to directed graphs,
a cursory search suggests it isn't). Second, for each node we don't have the
edges, but instead all paths (that's what dmd -v generates). So we can take
advantage of that information. A simple thought is to cluster based on the
maximum symmetric difference between module dependency sets, i.e. separately
compile modules that have the most mutually disjoint dependency sets. Anyhow I
wouldn't want to get too bogged down in details at this point - first we need
to get the appropriate infrastructure off the ground.
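
To make the symmetric-difference idea concrete, here is a small Python sketch.
It only computes the pairwise measure and ranks module pairs by it; how that
ranking would drive the actual grouping is left open, as in the comment above.
The function names and the dependency sets are invented for the example.

```python
from itertools import combinations

def dep_distance(a, b):
    """Size of the symmetric difference: dependencies that only one of the
    two modules pulls in."""
    return len(a ^ b)

def most_disjoint_pairs(deps):
    """Rank module pairs from most to least disjoint dependency sets."""
    pairs = combinations(sorted(deps), 2)
    return sorted(((dep_distance(deps[x], deps[y]), x, y) for x, y in pairs),
                  reverse=True)

deps = {
    "a.d": {"std.stdio", "std.conv"},
    "b.d": {"std.stdio", "std.conv", "std.array"},
    "c.d": {"core.thread"},
}
ranking = most_disjoint_pairs(deps)
# ranking[0] == (4, "b.d", "c.d"): these two share nothing, so compiling
# them separately duplicates no parsing work.
# ranking[-1] == (1, "a.d", "b.d"): nearly identical dependencies, so they
# are good candidates for the same invocation.
```
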
Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673





 Great idea, although we'd need to amend things. First, the graph is directed
 (not sure whether k-means clustering is directly applicable to directed graphs,
 a cursory search suggests it isn't).
 
I hadn't thought about graph clustering.
 Second, for each node we don't have the edges, but instead all paths (that's
 what dmd -v generates). So we can take advantage of that information. A simple
 thought is to cluster based on the maximum symmetric difference between module
 dependency sets, i.e. separately compile modules that have the most mutually
 disjoint dependency sets.
 
That's more of what I had in mind. I'd use k-means to minimize the differences
between the dependency sets of each module and the module set of their
centroids.

 Anyhow I wouldn't want to get too bogged down in details at this point -
 first we need to get the appropriate infrastructure off the ground.

Right, but I'm happy to experiment with clustering once this is done.
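
One possible reading of the k-means idea, adapted to sets, is sketched below
in Python. It is not standard k-means: the "centroid" of a cluster is taken to
be the set of dependencies used by at least half of its members, and distance
is the size of the symmetric difference. Both choices, the seeding with the
first k modules, and all names are assumptions made for the sketch.

```python
def centroid(members, deps):
    """Dependencies that at least half of the cluster's members import."""
    counts = {}
    for m in members:
        for d in deps[m]:
            counts[d] = counts.get(d, 0) + 1
    return {d for d, c in counts.items() if 2 * c >= len(members)}

def cluster(deps, k, max_iter=20):
    """Lloyd-style loop: assign each module to the nearest centroid, then
    recompute centroids, until assignments stop changing."""
    modules = sorted(deps)
    centroids = [set(deps[m]) for m in modules[:k]]  # naive seeding
    assignment = None
    for _ in range(max_iter):
        new = {m: min(range(len(centroids)),
                      key=lambda i: len(deps[m] ^ centroids[i]))
               for m in modules}
        if new == assignment:
            break
        assignment = new
        for i in range(len(centroids)):
            members = [m for m in modules if assignment[m] == i]
            if members:
                centroids[i] = centroid(members, deps)
    groups = [[m for m in modules if assignment[m] == i]
              for i in range(len(centroids))]
    return [g for g in groups if g]

deps = {
    "a.d": {"x", "y"},
    "b.d": {"p", "q"},
    "c.d": {"x", "y", "z"},
    "d.d": {"p", "q", "r"},
}
groups = cluster(deps, 2)
# groups == [["a.d", "c.d"], ["b.d", "d.d"]]
```
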
Mar 11 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9673




Kind of works, but there are not many independent clusters in Phobos.
https://gist.github.com/dawgfoto/5747405

A better approach might be to optimize for even cluster sizes, e.g. trying to
split 100 KLOC into 4 independent clusters of 25 KLOC. Line counts here cover
sources plus imports. Assignment of source files to clusters could then be
optimized with simulated annealing or a similar method.
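
For a feel of the even-cluster-size objective, here is a Python sketch that
uses a plain greedy heuristic instead of simulated annealing; it could serve as
a starting assignment for an annealer. A cluster's cost is the total line
count of its sources plus everything they import, with shared imports counted
once. The function names, file names and line counts are hypothetical.

```python
def cluster_cost(members, deps, loc):
    """Lines the compiler has to read for one cluster: the sources plus
    the union of their imports."""
    files = set(members)
    for m in members:
        files |= deps[m]
    return sum(loc[f] for f in files)

def balance(deps, loc, k):
    """Greedy assignment: take the most expensive modules first and put
    each one where the resulting cluster cost is smallest."""
    clusters = [[] for _ in range(k)]
    order = sorted(deps, key=lambda m: (-cluster_cost([m], deps, loc), m))
    for m in order:
        best = min(range(k),
                   key=lambda i: (cluster_cost(clusters[i] + [m], deps, loc), i))
        clusters[best].append(m)
    return clusters

loc = {"a.d": 100, "b.d": 100, "c.d": 100, "d.d": 100,
       "big.d": 1000, "util.d": 50}
deps = {"a.d": {"big.d"}, "b.d": {"big.d"},
        "c.d": {"util.d"}, "d.d": {"util.d"}}
clusters = balance(deps, loc, 2)
costs = [cluster_cost(c, deps, loc) for c in clusters]
# clusters == [["a.d", "c.d"], ["b.d", "d.d"]], costs == [1250, 1250]
```

The example also shows the trade-off behind the observation above: both
clusters end up reading big.d and util.d, so even sizes come at the price of
duplicated import work when the modules are not truly independent.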

Jul 19 2013