digitalmars.D - Memoization in DMD

reply "Nordlöw" <per.nordlow gmail.com> writes:
Has anybody thought about implementing hashing and, in turn, 
caching (memoization) directly in DMD to give further 
compilation speedups? If so, at which phase of the 
compilation would this best be injected?
Jul 06 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2014 9:16 AM, "Nordlöw" wrote:
 Has anybody thought about implementing hashing and, in turn, caching
 (memoization) directly in DMD to give further compilation speedups? If so,
 at which phase of the compilation would this best be injected?
Caching and memoization of what?
Jul 06 2014
parent reply "Nordlöw" <per.nordlow gmail.com> writes:
On Sunday, 6 July 2014 at 20:11:12 UTC, Walter Bright wrote:
 Caching and memoization of what?
Generated machine code.
Jul 06 2014
next sibling parent reply "Nordlöw" <per.nordlow gmail.com> writes:
On Sunday, 6 July 2014 at 20:34:31 UTC, Nordlöw wrote:
 Caching and memoization of what?
Kind of what scons (shared) caching does but built into the compiler.
Jul 06 2014
parent "Nordlöw" <per.nordlow gmail.com> writes:
On Sunday, 6 July 2014 at 20:36:02 UTC, Nordlöw wrote:
 Kind of what scons (shared) caching does but built into the 
 compiler.
BTW: Does anybody have an example of a large D project which scons (with caching) builds faster than rdmd?
Jul 06 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2014 1:34 PM, "Nordlöw" wrote:
 On Sunday, 6 July 2014 at 20:11:12 UTC, Walter Bright wrote:
 Caching and memoization of what?
 Generated machine code.
The generated code is always different. I don't see the savings.
Jul 06 2014
parent reply "Nordlöw" <per.nordlow gmail.com> writes:
On Sunday, 6 July 2014 at 21:00:34 UTC, Walter Bright wrote:
 The generated code is always different.
What do you mean by always different? Because of memory offsets changing with every small change to input source files?
Jul 06 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2014 2:35 PM, "Nordlöw" wrote:
 On Sunday, 6 July 2014 at 21:00:34 UTC, Walter Bright wrote:
 The generated code is always different.
 What do you mean by always different? Because of memory offsets changing with every small change to input source files?
I'd turn that around and ask where you are seeing potential savings? Note that you can run obj2asm on the generated object files, please do so and point out where things can be cached.
Jul 06 2014
next sibling parent reply "Nordlöw" <per.nordlow gmail.com> writes:
On Sunday, 6 July 2014 at 21:38:28 UTC, Walter Bright wrote:

I'm thinking in terms of a caching build system like scons that 
memoizes at the object/lib level. Since dmd/rdmd effectively acts as a 
simple build system, I thought we could inject memoization 
somewhere at the object level and see how large projects would 
have to be to benefit from memoized compilation of source 
files into separate objects that are then linked.

 I'd turn that around and ask where you are seeing potential 
 savings? Note that you can run obj2asm on the generated object 
 files, please do so and point out where things can be cached.
Alternatively, wouldn't it be possible to memoize the individual writes to the resulting binary instead? This memoization could preferably be indexed by some message digest (as is done by scons) based on either the dmd parameters and the contents of the input source files that influence a specific code-gen, or perhaps even the parts of the ASTs that were its inputs. I realize that these dependencies may, of course, be difficult to distinguish from each other.

How are these separate writes to the resulting binary split up? By generated functions and static data initializations? If so, a good first test would be to just calculate these digests, compare them between two compilations with a minor change between them, and see what percentage of them remain unchanged.

Robert Schadek spoke at DConf 2013 about a "Distributed Caching Compiler" he was working on, so I thought the idea is not completely irrelevant, right? I was mainly curious whether the D leaders have discussed this further, as there are examples of D projects, such as DCD, that take more than a minute to build (on my laptop).
Jul 06 2014
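Editorial sketch: the digest-indexed memoization described in the post above can be illustrated at object-file granularity. This is only a minimal model under stated assumptions, not how dmd, rdmd, or scons actually work. The names build_digest, cached_compile, and compile_fn are hypothetical, SHA-256 is an arbitrary choice of digest, and the key covers only the flags and the listed files (a real key would also have to cover every imported module).

```python
import hashlib
import shutil
from pathlib import Path


def build_digest(flags, source_paths):
    """Digest over the compiler flags and the bytes of every input file."""
    h = hashlib.sha256()
    for flag in flags:
        h.update(flag.encode("utf-8") + b"\0")
    for path in source_paths:
        h.update(str(path).encode("utf-8") + b"\0")
        h.update(Path(path).read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def cached_compile(flags, source_paths, out_path, cache_dir, compile_fn):
    """Return True on a cache hit, False if compile_fn had to run.

    compile_fn(flags, source_paths, out_path) is a hypothetical callback
    that invokes the real compiler and writes the object file to out_path.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / build_digest(flags, source_paths)
    if entry.exists():
        # Same flags and same file contents: reuse the stored object file.
        shutil.copyfile(entry, out_path)
        return True
    compile_fn(flags, source_paths, out_path)
    shutil.copyfile(out_path, entry)
    return False
```

Counting hits and misses across two builds that differ by a small source change would give a rough version of the "what percentage remains unchanged" experiment suggested in the post, although only per object file rather than per function.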
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
I'm afraid you're proposing a very complex system. It's way beyond where we're at.
Jul 06 2014
prev sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 7/6/2014 6:12 PM, "Nordlöw" wrote:
 On Sunday, 6 July 2014 at 21:38:28 UTC, Walter Bright wrote:

 I'm thinking in terms of a caching build system like scons that memoizes
 at the object/lib level. Since dmd/rdmd effectively acts as a simple build
 system, I thought we could inject memoization somewhere at the object
 level and see how large projects would have to be to benefit from
 memoized compilation of source files into separate objects that are then
 linked.
Are you just talking about normal incremental compilation? I.e., only recompiling the sources that have changed since the last build?
Jul 06 2014
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 06/07/14 23:38, Walter Bright wrote:

 I'd turn that around and ask where you are seeing potential savings?
 Note that you can run obj2asm on the generated object files, please do
 so and point out where things can be cached.
Templates perhaps?

--
/Jacob Carlborg
Jul 07 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/7/2014 12:06 AM, Jacob Carlborg wrote:
 On 06/07/14 23:38, Walter Bright wrote:

 I'd turn that around and ask where you are seeing potential savings?
 Note that you can run obj2asm on the generated object files, please do
 so and point out where things can be cached.
 Templates perhaps?
They're already cached.
Jul 07 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 07/07/14 10:23, Walter Bright wrote:

 They're already cached.
Does that include templates instantiated with different types but which result in the same code?

--
/Jacob Carlborg
Jul 07 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/7/2014 4:20 AM, Jacob Carlborg wrote:
 On 07/07/14 10:23, Walter Bright wrote:

 They're already cached.
 Does that include templates instantiated with different types but which result in the same code?
No. That's doable, but hasn't been done yet.
Jul 07 2014
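Editorial sketch: merging template instances that differ in type but produce the same code amounts to grouping function bodies by their contents. The model below is an illustration only and makes no claim about dmd internals. The function fold_identical, the symbol names, and the byte strings in the usage note are all hypothetical placeholders, not real compiler output.

```python
import hashlib


def fold_identical(functions):
    """Map each symbol to the first symbol seen with byte-identical code.

    functions: dict of symbol name -> bytes of generated code.
    """
    canonical_by_digest = {}
    alias = {}
    for symbol, code in functions.items():
        digest = hashlib.sha256(code).hexdigest()
        # The first symbol with a given digest becomes the canonical copy.
        alias[symbol] = canonical_by_digest.setdefault(digest, symbol)
    return alias
```

For example, fold_identical({"swap!int": b"\x01\x02", "swap!uint": b"\x01\x02", "swap!long": b"\x03"}) maps "swap!uint" to "swap!int" and leaves "swap!long" on its own. A real implementation would presumably also need to compare the bytes themselves to rule out digest collisions and to account for relocations.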
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Sunday, 6 July 2014 at 20:34:31 UTC, Nordlöw wrote:
 Generated machine code.
You can detect similarities on the AST level, but that demands some kind of normalization process which may or may not work out for the language. It is more likely to work out for functional-style generic code than for optimized imperative code. For floating point it becomes especially problematic, since reordering changes the accuracy of the result.
Jul 06 2014
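Editorial sketch: the two points in the post above (normalizing an AST before comparing it, and floating-point reordering changing the result) can be shown with a toy expression tree. Everything here is an assumption made for illustration: the tuple-based tree, the names COMMUTATIVE, normalize, and ast_digest, and the choice of SHA-256. It is not how any D compiler represents or compares code.

```python
import hashlib

# Operators whose operands may be swapped without changing the result.
COMMUTATIVE = {"+", "*"}


def normalize(node):
    """Sort operands of commutative operators so a+b and b+a compare equal."""
    if isinstance(node, tuple):
        op, *args = node
        args = [normalize(a) for a in args]
        if op in COMMUTATIVE:
            args.sort(key=repr)
        return (op, *args)
    return node


def ast_digest(node):
    """Digest of the normalized tree; equal digests mean equal normal forms."""
    return hashlib.sha256(repr(normalize(node)).encode("utf-8")).hexdigest()


# Swapping operands is safe, so both spellings share one digest.
same = ast_digest(("+", "a", ("*", "b", "c"))) == ast_digest(("+", ("*", "c", "b"), "a"))

# Regrouping is not safe for floating point: the two sums differ.
left = (1e17 + -1e17) + 1.0   # 1.0
right = 1e17 + (-1e17 + 1.0)  # 0.0, because the 1.0 is lost to rounding
```

This normalization only swaps operands and never regroups them, which is why it stays valid for floating-point expressions. A normalizer that also reassociated sums would treat left and right above as the same code even though they evaluate differently.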
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Sun, Jul 06, 2014 at 08:34:29PM +0000, "Nordlöw" via Digitalmars-d wrote:
 On Sunday, 6 July 2014 at 20:11:12 UTC, Walter Bright wrote:
 Caching and memoization of what?
 Generated machine code.
Doesn't rdmd do that to some extent? (But in granularity of entire source files, rather than individual functions or assembly blocks.)

T

--
You are only young once, but you can stay immature indefinitely. -- azephrahel
Jul 06 2014
parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 7/7/2014 12:10 AM, H. S. Teoh via Digitalmars-d wrote:
 On Sun, Jul 06, 2014 at 08:34:29PM +0000, "Nordlöw" via Digitalmars-d wrote:
 On Sunday, 6 July 2014 at 20:11:12 UTC, Walter Bright wrote:
 Caching and memoization of what?
 Generated machine code.
 Doesn't rdmd do that to some extent? (But in granularity of entire source files, rather than individual functions or assembly blocks.)
Unless something's changed recently, RDMD doesn't do incremental compilation. It just passes *all* sources to DMD to be rebuilt on every invocation. It does skip recompiling if *none* of the dependencies have changed since the last build, but the granularity is "whole program", not "source file". IIRC, the old xfbuild supported incremental compilation (or tried to), but it ran into big problems with how DMD chose which object file to put which symbol into (may have been resolved since then?), and I'm not sure it ever really supported D2 anyway.
Jul 06 2014
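Editorial sketch: the difference between "whole program" and "source file" granularity described in the post above can be modelled with modification times. This is a simplified illustration, not rdmd's or xfbuild's actual logic. The function names and the plain-number timestamps are assumptions, and the per-file variant deliberately ignores imports and the symbol-placement problem mentioned in the post.

```python
def whole_program_rebuild(source_mtimes, binary_mtime):
    """Whole-program granularity: rebuild everything or nothing.

    source_mtimes: dict of source file -> modification time.
    binary_mtime: modification time of the built program, or None if absent.
    """
    if binary_mtime is None or any(t > binary_mtime for t in source_mtimes.values()):
        return sorted(source_mtimes)
    return []


def per_file_rebuild(source_mtimes, object_mtimes):
    """Source-file granularity: rebuild only sources newer than their object.

    object_mtimes: dict of source file -> modification time of its object file.
    """
    return sorted(
        src
        for src, t in source_mtimes.items()
        if object_mtimes.get(src) is None or t > object_mtimes[src]
    )
```

With sources {"a.d": 10, "b.d": 30} and a binary built at time 20, the whole-program check returns both files, while the per-file check with objects built at {"a.d": 15, "b.d": 25} returns only "b.d".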