www.digitalmars.com

digitalmars.D - Please integrate build framework into the compiler

reply davidl <davidl 126.com> writes:
1. The compiler knows in which situations a file needs to be recompiled.

Consider a file whose header file is unchanged: its object file is still
required for linking, but no other file that imports it needs
recompilation in this case. If a file's header file changes, and thus
its interface changes, every file that imports it should be
recompiled.
The compiler could emit build commands the way rebuild does.

I would enjoy:

dmd -buildingcommand abc.d  > responsefile

dmd  responsefile

I think we need to eliminate useless recompilation as much as we can,
given the growing size of D projects.

2. Maintaining the build without compiler support is costly.
Mar 21 2009
next sibling parent reply grauzone <none example.net> writes:
I don't really understand what you mean. But if you want the compiler to 
scan for dependencies, I fully agree.

I claim that we don't even need incremental compilation. It would be 
better if the compiler would scan for dependencies, and if a source file 
has changed, recompile the whole project in one go. This would be simple 
and efficient.

Here are some arguments that speak for this approach:

- A full compiler is the only piece of software that can build a 
correct/complete module dependency graph. This is because you need full 
semantic analysis to catch all import statements. For example, you can 
use a string mixin to generate import statements: mixin("import bla;"). 
No naive dependency scanner would be able to detect this import. You 
need CTFE capabilities, which require almost a full compiler. (Actually, 
dsss uses the dmd frontend for dependency scanning.)

- Speed. Incremental compilation is godawfully slow (10 times slower 
than compiling all files in one dmd invocation). You could pass all 
changed files to dmd at once, but this is broken and often causes linker 
errors (ask the dsss author for details lol). Recompiling the whole 
thing every time is faster.

- Long dependency chains. Unlike in C/C++, you can't separate a module 
into interface and implementation. Compared to C++, it's as if a change 
to one .c file triggers recompilation of a _lot_ of other .c files. This 
makes incremental compilation really look useless. Unless you move 
modules into libraries and use them through .di files.
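The first argument above, that only a full frontend can see mixin-generated imports, is easy to illustrate with a toy sketch. This is Python standing in for a D frontend; the module names and sources are invented, and the "semantic" scanner only fakes CTFE with a second textual pass:

```python
import re

# Hypothetical D sources: module name -> source text.
sources = {
    "app":  'import util;\nmixin("import bla;");\nvoid main() {}',
    "util": 'void helper() {}',
    "bla":  'void hidden() {}',
}

def naive_imports(src):
    """A naive textual scanner: only sees literal 'import x;' statements."""
    return set(re.findall(r'^\s*import\s+([\w.]+)\s*;', src, re.MULTILINE))

def semantic_imports(src):
    """Stand-in for a full frontend: also looks inside string mixins,
    which in real D would require CTFE."""
    deps = naive_imports(src)
    for mixed in re.findall(r'mixin\("([^"]*)"\);', src):
        deps |= naive_imports(mixed)
    return deps

print(naive_imports(sources["app"]))     # misses 'bla' entirely
print(semantic_imports(sources["app"]))  # sees both 'util' and 'bla'
```

A real frontend evaluates arbitrary CTFE code to produce the mixin string, which is why no regex-based tool can ever be complete here.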

I would even go so far as to say that dmd should automatically follow all 
imports and compile them in one go. This would be faster than having a 
separate responsefile step, because the source code needs to be analyzed 
only once. To prevent compilation of imported library headers, the 
compiler could provide a new include switch for library code. Modules 
inside "library" include paths wouldn't be compiled.

Hell, maybe I'll even manage to come up with a compiler patch, to turn 
this into reality.
Mar 21 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 I don't really understand what you mean. But if you want the compiler to 
 scan for dependencies, I fully agree.
 
 I claim that we don't even need incremental compilation. It would be 
 better if the compiler would scan for dependencies, and if a source file 
 has changed, recompile the whole project in one go. This would be simple 
 and efficient.

That's precisely what rdmd does. Andrei
Mar 21 2009
parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 I don't really understand what you mean. But if you want the compiler 
 to scan for dependencies, I fully agree.

 I claim that we don't even need incremental compilation. It would be 
 better if the compiler would scan for dependencies, and if a source 
 file has changed, recompile the whole project in one go. This would be 
 simple and efficient.

That's precisely what rdmd does.

This looks really good, but I couldn't get it to work. Am I doing something wrong?

--- o.d:
module o;
import tango.io.Stdout;
void k() { Stdout("foo").newline; }

--- u.d:
module u;
import o;
void main() { k(); }

$ rdmd u.d
/tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o:(.data+0x38): undefined reference to `_D1o12__ModuleInfoZ'
/tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o: In function `_Dmain':
u.d:(.text._Dmain+0x4): undefined reference to `_D1o1kFZv'
collect2: ld returned 1 exit status
--- errorlevel 1
rdmd: Couldn't compile or execute u.d.

$ dmd|grep Compiler
Digital Mars D Compiler v1.041
 Andrei

Mar 21 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 I don't really understand what you mean. But if you want the compiler 
 to scan for dependencies, I fully agree.

 I claim that we don't even need incremental compilation. It would be 
 better if the compiler would scan for dependencies, and if a source 
 file has changed, recompile the whole project in one go. This would 
 be simple and efficient.

That's precisely what rdmd does.

 This looks really good, but I couldn't get it to work. Am I doing something wrong?

 --- o.d:
 module o;
 import tango.io.Stdout;
 void k() { Stdout("foo").newline; }

 --- u.d:
 module u;
 import o;
 void main() { k(); }

 $ rdmd u.d
 /tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o:(.data+0x38): undefined reference to `_D1o12__ModuleInfoZ'
 /tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o: In function `_Dmain':
 u.d:(.text._Dmain+0x4): undefined reference to `_D1o1kFZv'
 collect2: ld returned 1 exit status
 --- errorlevel 1
 rdmd: Couldn't compile or execute u.d.

 $ dmd|grep Compiler
 Digital Mars D Compiler v1.041

Should work, but I tested only with D2. You may want to pass --chatty to rdmd and see what commands it invokes. Andrei
Mar 21 2009
parent reply grauzone <none example.net> writes:
My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 
contains an outdated, buggy version. Where can I find the up-to-date 
source code?

Another question, rdmd just calls dmd, right? How does it scan for 
dependencies, or is this step actually done by dmd itself?
Mar 21 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 
 contains an outdated, buggy version. Where can I find the up-to-date 
 source code?

Hold off on that for now.
 Another question, rdmd just calls dmd, right? How does it scan for 
 dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler. Andrei
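The header-diffing scheme described here can be sketched as a small decision procedure. This is a Python stand-in, not rdmd's actual code; the module names, data structures, and header strings are all made up, and it only handles one level of importers (a real tool would propagate the check transitively):

```python
import hashlib

def digest(text):
    # Fingerprint of a generated header (.di) file's contents.
    return hashlib.sha256(text.encode()).hexdigest()

def modules_to_recompile(changed, new_headers, old_digests, importers):
    """changed: modules whose .d source changed.
    new_headers: module -> freshly generated header text.
    old_digests: module -> digest of the previously cached header.
    importers: module -> set of modules that import it."""
    dirty = set(changed)                        # changed sources always rebuild
    for mod in changed:
        if digest(new_headers[mod]) != old_digests.get(mod):
            dirty |= importers.get(mod, set())  # interface changed: rebuild users
    return dirty

old = {"o": digest("void k();")}
importers = {"o": {"u"}}
# Body-only change: the header is identical, so 'u' is left alone.
print(modules_to_recompile({"o"}, {"o": "void k();"}, old, importers))
# Signature change: the header differs, so 'u' must be rebuilt too.
print(modules_to_recompile({"o"}, {"o": "void k(int);"}, old, importers))
```

The point of diffing headers rather than sources is exactly the body-only case: an implementation tweak never touches the generated interface, so dependents are skipped.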
Mar 21 2009
next sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
davidl wrote:
 On Sun, 22 Mar 2009 12:18:03 +0800, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 grauzone wrote:
 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 
 contains an outdated, buggy version. Where can I find the up-to-date 
 source code?

Hold off on that for now.
 Another question, rdmd just calls dmd, right? How does it scan for 
 dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler. Andrei

The bad news is that public imports ruin the simplicity of dependencies, though in most cases D projects use private imports. Maybe we can further restrict public imports.

Yes. They could give a compile-time error... always. ;-)
Mar 22 2009
prev sibling parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 
 contains an outdated, buggy version. Where can I find the up-to-date 
 source code?

Hold off on that for now.
 Another question, rdmd just calls dmd, right? How does it scan for 
 dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.

Is this just an "interesting idea", or are you actually considering implementing it? Anyway, maybe you could pressure Walter to fix that dmd bug that stops dsss from being efficient. I can't advertise this enough.
 Andrei

Mar 23 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 
 contains an outdated, buggy version. Where can I find the up-to-date 
 source code?

Hold off on that for now.
 Another question, rdmd just calls dmd, right? How does it scan for 
 dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.

Is this just an "interesting idea", or are you actually considering implementing it?

I would if there were a compelling case made in favor of it. Andrei
Mar 23 2009
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from grauzone (none example.net)'s article
 I claim that we don't even need incremental compilation. It would be
 better if the compiler would scan for dependencies, and if a source file
 has changed, recompile the whole project in one go. This would be simple
 and efficient.

I'm surprised that this could possibly be more efficient than incremental compilation, but I've never worked on a project large enough for compile times to be a major issue, so I've never really looked into this.

If incremental compilation were removed from the spec, meaning the compiler would always know about the whole program when compiling, I assume (correct me if I'm wrong) that the following restrictions could be removed:

1. std.traits could offer a way to get a tuple of all derived classes, essentially the opposite of BaseTypeType.
2. Since DMD would know about all derived classes when compiling the base class, it would be feasible to allow templates to add virtual functions to classes. IMHO, this would be an absolute godsend, as it is currently a _huge_ limitation of templates.
3. For the same reason, method calls to classes with no derived classes could be made directly instead of through the vtable.

Of course, these restrictions would still apply to libraries that use .di files. If incremental compilation is actually causing more problems than it solves anyhow, it would be great to get rid of it along with the annoying restrictions it creates.
Mar 21 2009
next sibling parent grauzone <none example.net> writes:
dsimcha wrote:
 == Quote from grauzone (none example.net)'s article
 I claim that we don't even need incremental compilation. It would be
 better if the compiler would scan for dependencies, and if a source file
 has changed, recompile the whole project in one go. This would be simple
 and efficient.

I'm surprised that this could possibly be more efficient than incremental compilation, but I've never worked on a project large enough for compile times to be a major issue, so I've never really looked into this.

Maybe incremental compilation could be faster, but dmd has a bug that forces tools like dsss/rebuild to use a slower method. Instead of invoking the compiler once to recompile all modules that depend on changed files, it has to start a new compiler process for each file.
 If incremental compilation were removed from the spec, meaning the compiler
would
 always know about the whole program when compiling, I assume (correct me if I'm
 wrong) that would mean the following restrictions could be removed:
 
 1.  std.traits could offer a way to get a tuple of all derived classes,
 essentially the opposite of BaseTypeType.
 2.  Since DMD would know about all derived classes when compiling the base
class,
 it would be feasible to allow templates to add virtual functions to classes.
 IMHO, this would be an absolute godsend, as it is currently a _huge_
limitation of
 templates.
 3.  For the same reason, methods calls to classes with no derived classes
could be
 made directly instead of through the vtable.

And you could do all kinds of interprocedural optimizations.
 Of course, these restrictions would still apply to libraries that use .di
files.
 If incremental compilation is actually causing more problems than it solves
 anyhow, it would be great to get rid of it along with the annoying
restrictions it
 creates.

It seems Microsoft thought the same: C# goes without incremental compilation. But for now, D's build model is too similar to C/C++'s for that ability to be removed completely.
Mar 21 2009
prev sibling parent Christopher Wright <dhasenan gmail.com> writes:
dsimcha wrote:
 1.  std.traits could offer a way to get a tuple of all derived classes,
 essentially the opposite of BaseTypeType.
 2.  Since DMD would know about all derived classes when compiling the base
class,
 it would be feasible to allow templates to add virtual functions to classes.
 IMHO, this would be an absolute godsend, as it is currently a _huge_
limitation of
 templates.
 3.  For the same reason, methods calls to classes with no derived classes
could be
 made directly instead of through the vtable.

This is only if there is no dynamic linking.
Mar 21 2009
prev sibling next sibling parent BCS <none anon.com> writes:
Hello grauzone,

 I would even go so far to say, that dmd should automatically follow
 all imports and compile them in one go. This would be faster than
 having a separate responsefile step, because the source code needs to
 be analyzed only once. To prevent compilation of imported library
 headers, the compiler could provide a new include switch for library
 code. Modules inside "library" include paths wouldn't be compiled.

Adding that without a way to turn it off would kill D in some cases. I have a project where DMD uses up >30% of the available address space compiling one module. If I were forced to compile all modules at once, it might not work, end of story. That said, for many cases, I don't see a problem with having that feature available.
Mar 21 2009
prev sibling next sibling parent Christopher Wright <dhasenan gmail.com> writes:
grauzone wrote:
 - Long dependency chains. Unlike in C/C++, you can't separate a module 
 into interface and implementation. Compared to C++, it's as if a change 
 to one .c file triggers recompilation of a _lot_ of other .c files. This 
 makes incremental compilation really look useless. Unless you move 
 modules into libraries and use them through .di files.

You can use interfaces for this, though that is not always possible.
Mar 21 2009
prev sibling next sibling parent reply davidl <davidl 126.com> writes:
On Sun, 22 Mar 2009 04:19:31 +0800, grauzone <none example.net> wrote:

 I don't really understand what you mean. But if you want the compiler to  
 scan for dependencies, I fully agree.

 I claim that we don't even need incremental compilation. It would be  
 better if the compiler would scan for dependencies, and if a source file  
 has changed, recompile the whole project in one go. This would be simple  
 and efficient.

This may not be true. Consider the dwt lib case: once you have tweaked a module very little (meaning you do not modify any interface that connects with outside modules, or code that could possibly affect modules in the same package), the optimal way is:

dmd -c your_tweaked_module
link all_obj

That's much faster than regenerating all other object files. Yes, feeding them all to DMD compiles really fast, but writing all the object files to disk costs much time.

And your impression of incremental compilation seems to be misguided by the rebuild and dsss system. Rebuild takes no advantage of di files, thus it has to recompile every time, even when the module's dependencies on all other di files are unchanged.

I posted several blocking header generation bugs in DMD, with fixes. With just those small changes, dmd can generate almost all header files correctly. I tested tango, dwt, and dwt-addons. Those projects are very big, and some make advanced use of templates. So the header generation building strategy is really not far away.

Little self-promotion here, and in case Walter misses some of them:
http://d.puremagic.com/issues/show_bug.cgi?id=2744
http://d.puremagic.com/issues/show_bug.cgi?id=2745
http://d.puremagic.com/issues/show_bug.cgi?id=2747
http://d.puremagic.com/issues/show_bug.cgi?id=2748
http://d.puremagic.com/issues/show_bug.cgi?id=2751

In C++, a sophisticated makefile carefully builds the .h dependencies of .c files; once .h files are updated, the .c files based on them need to be recompiled. The same detection can be made by comparing old .di files and new .di files for equality.
Mar 21 2009
parent grauzone <none example.net> writes:
 Little self-promotion here, and in case Walter misses some of them:
 http://d.puremagic.com/issues/show_bug.cgi?id=2744
 http://d.puremagic.com/issues/show_bug.cgi?id=2745
 http://d.puremagic.com/issues/show_bug.cgi?id=2747
 http://d.puremagic.com/issues/show_bug.cgi?id=2748
 http://d.puremagic.com/issues/show_bug.cgi?id=2751

If it's about bugs, it would (probably) be easier for Walter to fix that code generation bug that forces dsss/rebuild to invoke a new dmd process to recompile each outdated file separately. This would bring a critical speedup for incremental compilation (from absolutely useless to relatively useful), and all impatient D users with middle-sized source bases could be happy.
 In c++, a sophisticated makefile carefully build .h dependencies of .c 
 files. Thus, once .h files are updated, then .c files which are based on 
 them need to be recompile. This detection can be made by comparison of 
 old .di files and new .di files by testing their equality.

This sounds like a really nice idea, but it's also quite complex. For example, to guarantee correctness, the D compiler would _always_ have to read the .di file when importing a module (and not the .d file directly). If it didn't do that, it could "accidentally" use information that isn't included in the .di file (like function bodies when doing inlining). This means you'd have to generate the .di files first. When doing this, you'd also have to deal with circular dependencies, which will bring extra headaches. And of course, you'd need to fix all those .di generation bugs.

It's actually a bit scary that the compiler not only has to be able to parse D code, but also to output D source code again. And .di files are not even standardized. It's perhaps messy enough to deem it unrealistic. Still, a nice idea.
Mar 21 2009
prev sibling next sibling parent davidl <davidl 126.com> writes:
On Sun, 22 Mar 2009 12:18:03 +0800, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 grauzone wrote:
 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041  
 contains an outdated, buggy version. Where can I find the up-to-date  
 source code?

Hold off on that for now.
 Another question, rdmd just calls dmd, right? How does it scan for  
 dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler. Andrei

The bad news is that public imports ruin the simplicity of dependencies, though in most cases D projects use private imports. Maybe we can further restrict public imports.

I suggest we add a new module style of interfacing. Public imports are only allowed in those modules, and an interface module can only have public imports. Example:

all.d:
module(interface) all;
public import blah;
public import blah.foo;

An interface module cannot import another interface module, thus no public import chain can be created. The shortcoming of it is:

module(interface) subpack.all;
public import subpack.mod;

module(interface) all;
public import subpack.mod;  // duplication here.
public import subpack1.mod1;
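The way public imports complicate dependency tracking can be modeled with a small sketch (Python, invented module names): a public import re-exports the imported interface, so an interface change propagates through every chain of public imports, while a private import stops after one hop:

```python
def dependents(changed, imports, public_imports):
    """Modules needing recompilation when `changed`'s interface changes.
    imports: importer -> set of modules it imports (public or private).
    public_imports: importer -> subset of its imports that are re-exported."""
    # First find every module whose own interface "carries" the change,
    # i.e. the transitive closure over public-import edges.
    exposes = {changed}
    grew = True
    while grew:
        grew = False
        for mod, pubs in public_imports.items():
            if mod not in exposes and pubs & exposes:
                exposes.add(mod)
                grew = True
    # Anything importing one of those modules must be recompiled.
    return {mod for mod, imps in imports.items() if imps & exposes}

imports = {"all": {"blah"}, "app": {"all"}}
public_imports = {"all": {"blah"}}   # 'all' re-exports 'blah'
# 'app' never mentions 'blah', yet a change to blah's interface reaches it.
print(dependents("blah", imports, public_imports))
```

With only private imports the `exposes` set never grows, which is why the post above calls private-import dependencies "simple".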
Mar 21 2009
prev sibling next sibling parent reply "Kristian Kilpi" <kjkilpi gmail.com> writes:
On Sat, 21 Mar 2009 22:19:31 +0200, grauzone <none example.net> wrote:
 I don't really understand what you mean. But if you want the compiler to  
 scan for dependencies, I fully agree.

 I claim that we don't even need incremental compilation. It would be  
 better if the compiler would scan for dependencies, and if a source file  
 has changed, recompile the whole project in one go. This would be simple  
 and efficient.

Well, why not get rid of the imports altogether... Ok, that would not be feasible because of the way compilers (D, C++, etc) are built nowadays.

I find adding #includes/imports laborious. (Is this component already #included/imported? Where's that class defined? Did I forget something?) And when you modify or refactor the file, you have to update the #includes/imports accordingly... (In case of modification/refactoring) the easiest way is just to compile the file and see if there are errors... Of course, that approach will not help to remove the unnecessary #includes/imports.

So, sometimes (usually?) I give up, create one huge #include/import file that #includes/imports all the stuff, and use that instead. Efficient? Pretty? No. Easy? Simple? Yes.

#includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at a time.
Mar 22 2009
parent reply Christopher Wright <dhasenan gmail.com> writes:
Kristian Kilpi wrote:
 #includes/imports are redundant information: the source code of course 
 describes what's used in it. So, the compiler could be aware of the 
 whole project (and the libraries used) instead of one file at the time.

That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.
Mar 22 2009
next sibling parent dennis luehring <dl.soluz gmx.net> writes:
 Such things should of course be told to the compiler somehow. By using
 the project configuration, or by other means. (It's only a matter of
 definition.)

maybe like Delphi did it: there is a file called .dpr (Delphi project) which holds the absolute/relative paths of the imports used in the project. It could be seen as a Delphi source-based makefile.

test.dpr:
---
project test;

uses // like D's import
  unit1 in '\temp\unit1.pas',
  unit2 in '\bla\unit2.pas',
  unit3 in '\blub\unit3.pas',
  ...
---

unit1.pas:
---
uses unit2, unit3;
interface
...
implementation
...
---

and the source files (.pas) are compiled into a Delphi-compiler-specific "object file format" called .dcu (Delphi compiled unit), which holds all the intelligent data the compiler needs when a unit is used several times (if the compiler finds a .dcu it will use it, or compile the .pas to a .dcu if needed).

I think that the blazing fast parser (and the absence of generic programming features) makes Delphi the fastest compiler out there; the compile speed is comparable to sending a message through icq or saving a small file.

Does the dmd compiler have rich compile/link-time intermediate files?

And btw: if we do compile-time benchmarks, Delphi is the only hard-to-beat reference. But I still don't like Delphi :-)
Mar 22 2009
prev sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Kristian Kilpi wrote:
 On Sun, 22 Mar 2009 14:14:39 +0200, Christopher Wright 
 <dhasenan gmail.com> wrote:
 
 Kristian Kilpi wrote:
 #includes/imports are redundant information: the source code of 
 course describes what's used in it. So, the compiler could be aware 
 of the whole project (and the libraries used) instead of one file at 
 the time.

That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.

Such things should of course be told to the compiler somehow: by using the project configuration, or by other means. (It's only a matter of definition.) For example, if my project contains the Surface class, then 'Surface s;' should of course refer to it. If some library (used by the project) also has a Surface class, then one should use some other way to refer to it (e.g. sdl.Surface).

Then I want to deal with a library type with the same name as my builtin type. You can come up with a convention that does the right thing 90% of the time, but produces strange errors on occasion.
 But my point was that the compilers today do not have knowledge about 
 the projects as a whole. That makes this kind of 'scanning' too 
 expensive (in the current compiler implementations). But if the 
 compilers were built differently that wouldn't have to be true.

If you want a system that accepts plugins, you will never have access to the entire project. If you are writing a library, you will never have access to the entire project. So a compiler has to address those needs, too.
 If I were to create/design a compiler (which I am not ;) ), it would be 
 something like this:
 
 Every file is cached (why to read and parse files over and over again, 
 if not necessary). These cache files would contain all the information 
 (parse trees, interfaces, etc) needed during the compilation (of the 
 whole project). Also, they would contain the compilation results too 
 (i.e. assembly). So, these cache/database files would logically replace 
 the old object files.
 
 That is, there would be database for the whole project. When something 
 gets changed, the compiler knows what effect it has and what's required 
 to do.

All this is helpful for developers. It's not helpful if you are merely compiling everything once, but then, the overhead would only be experienced on occasion.
 And finally, I would also change the format of libraries. A library 
 would be one file only. No more header/.di -files; one compact file 
 containing all the needed information (in a binary formated database 
 that can be read very quickly).

Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures. Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.
Mar 22 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"Christopher Wright" <dhasenan gmail.com> wrote in message 
news:gq6lms$1815$1 digitalmars.com...
 And finally, I would also change the format of libraries. A library would 
 be one file only. No more header/.di -files; one compact file containing 
 all the needed information (in a binary formated database that can be 
 read very quickly).

Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures. Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.

"If your program can operate efficiently with a textual representation..."

I think that's the key right there. Most of the time, parsing a sensibly-designed text format is going to be a bit slower than reading in an equivalent sensibly-designed (as opposed to over-engineered [pet-peeve]ex: GOLD Parser Builder's .cgt format[/pet-peeve]) binary format. First off, there's simply more raw data to be read off the disk and processed; then you've got the actual tokenizing/syntax-parsing itself; and then anything that isn't supposed to be interpreted as a string (like ints and bools) needs to get converted to its proper internal representation. And then for saving, you go through all the same, but in reverse. (Also, mixed human/computer editing of a text file can sometimes be problematic.)

With a sensibly-designed binary format (and a sensible systems language like D, as opposed to C# or Java), all you really need to do is load a few chunks into memory and apply some structs over top of them. Toss in some trivial version checks and maybe some endian fixups and you're done. Very little processing and memory is needed.

I can certainly appreciate the other benefits of text formats, though, and certainly agree that there are cases where the performance of using a text format would be perfectly acceptable. But it can add up. And I often wonder how much faster and more memory-efficient things like Linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.
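The "structs over a memory chunk" idea can be sketched with Python's struct module standing in for D struct overlays; the 12-byte record layout, the LIBF magic value, and the field names are all invented for illustration:

```python
import struct

# A made-up fixed on-disk record: magic, version, flags, payload length.
# '<' pins the byte order (little-endian), which is the "endian fixup":
# the layout is the same no matter what machine reads it.
HEADER = struct.Struct("<4sHHI")

def write_record(payload, version=1):
    return HEADER.pack(b"LIBF", version, 0, len(payload)) + payload

def read_record(blob):
    # "Apply a struct over the chunk": one fixed-offset decode, no parsing.
    magic, version, _flags, length = HEADER.unpack_from(blob, 0)
    if magic != b"LIBF":
        raise ValueError("not a library file")
    if version != 1:                 # the trivial version check
        raise ValueError("unsupported format version")
    return blob[HEADER.size:HEADER.size + length]

blob = write_record(b"symbol table bytes")
print(read_record(blob))
```

Reading is a couple of fixed-size copies and comparisons; the textual equivalent would need tokenizing plus string-to-int conversions for every field, which is the asymmetry the post describes.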
Mar 24 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Nick Sabalausky:
 I often wonder how much faster and more 
 memory-efficient things like linux and the web could have been if they 
 weren't so big on sticking damn near everything into "convenient" text 
 formats.

Maybe not much, because today textual files can be compressed and decompressed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway. Bye, bearophile
Mar 24 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:gqbe2k$13al$1 digitalmars.com...
 Nick Sabalausky:
 I often wonder how much faster and more
 memory-efficient things like linux and the web could have been if they
 weren't so big on sticking damn near everything into "convenient" text
 formats.

Maybe not much, because today textual files can be compressed and decompressed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway.

I've become more and more wary of this "CPUs are now fast enough..." phrase that keeps getting tossed around these days. The problem is, that argument gets used SO much that on the fastest computer I've ever owned, I've actually experienced *basic text-entry boxes* (with no real bells or whistles or anything) that had *seconds* of delay. That never once happened to me on my "slow" Apple 2.

The unfortunate truth is that the speed and memory of modern systems are constantly getting used to rationalize shoddy bloatware practices, and we wind up with systems that are even *slower* than they were back on less-powerful hardware. It's pathetic, and drives me absolutely nuts.
Mar 24 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Nick Sabalausky:
 That never once happened to me on my "slow" Apple 2.<

See here too :-)
http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_Wins

Yet, what I have written is often true :-) Binary data can't be compressed as well as textual data, and lzop is I/O bound in most situations:
http://www.lzop.org/

Bye,
bearophile
Mar 24 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:gqbgma$189l$1 digitalmars.com...
 Nick Sabalausky:
 That never once happened to me on my "slow" Apple 2.<

See here too :-) http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_Wins

Excellent article :)
 Yet, what I have written is often true :-)
 Binary data can't be compressed as well as textual data,

Doesn't really matter, since binary data (assuming a format that isn't over-engineered) is already smaller than the same data in text form. Text compresses well *because* it contains so much more excess redundant data than binary data does. I could stick 10GB of zeros to the end of a 1MB binary file and suddenly it would compress far better than any typical text file.
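That intuition is easy to check with any general-purpose compressor. A minimal sketch using Python's zlib (the particular strings and sizes are just illustrative):

```python
import os
import zlib

# Highly redundant text, like most config files and markup.
text = b"the quick brown fox jumps over the lazy dog\n" * 200
# Data with no redundancy at all, standing in for a dense binary format.
rand = os.urandom(len(text))

print(len(text), len(zlib.compress(text)))  # text shrinks dramatically
print(len(rand), len(zlib.compress(rand)))  # random data barely changes size
```

The text shrinks to a tiny fraction of its size precisely because it was so redundant to begin with; the incompressible data stays roughly the same size, because there was no slack left to remove.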
 and lzop is I/O bound in most situations:
 http://www.lzop.org/

I'm not really sure why you're bringing up compression...? Do you mean that the actual disk access time of a text format can be brought down to the time of an equivalent binary format by storing the text file in a compressed form?
Mar 24 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Nick Sabalausky:
 Doesn't really matter, since binary data (assuming a format that isn't 
 over-engineered) is already smaller than the same data in text form.

If you take compression into account too, compressed text is sometimes smaller than both the equivalent binary file and that binary file compressed (because good compressors are often able to spot redundancy better in text files than in arbitrarily structured binary files).
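A quick sketch consistent with that claim, assuming low-precision numeric data (the data set and both encodings here are invented for illustration; the outcome depends heavily on the data):

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so the comparison is reproducible
vals = [round(random.uniform(0, 10), 2) for _ in range(1000)]

# Textual encoding: one decimal number per line, e.g. b"8.84\n7.58\n..."
as_text = ("\n".join(str(v) for v in vals)).encode()
# Binary encoding: a raw little-endian double (8 bytes) per value.
as_bin = b"".join(struct.pack("<d", v) for v in vals)

comp_text = zlib.compress(as_text, 9)
comp_bin = zlib.compress(as_bin, 9)

print(len(as_text), len(as_bin))      # raw: the binary file is smaller
print(len(comp_text), len(comp_bin))  # compressed: the text side wins here
```

Two-decimal values carry only a few bits of information each, and the digit text exposes that to the compressor; the doubles bury it in near-random mantissa bytes that deflate cannot squeeze.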
I'm not really sure why you're bringing up compression...?<

Because experiments have shown it solves, or greatly reduces, the problem you were talking about.
 Do you mean that 
 the actual disk access time of a text format can be brought down to the time 
 of an equivalent binary format by storing the text file in a compressed 
 form?

It's not always true, but it happens often enough, or the difference becomes small enough to be outweighed by the clarity advantages of the textual format (and sometimes the compressed text is actually faster to load, but this is less common). Bye, bearophile
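A back-of-envelope model of that trade-off; every throughput number below is an assumption chosen for illustration, not a measurement:

```python
# Is reading a compressed text file faster than reading it raw?
disk_mb_s = 100.0    # assumed sequential disk throughput
decomp_mb_s = 400.0  # assumed lzop-class decompression speed (output side)
ratio = 2.0          # assumed compression ratio for typical text
size_mb = 100.0      # uncompressed file size

t_raw = size_mb / disk_mb_s  # read everything from disk
# Read the smaller compressed file, then decompress it (modeled serially).
t_comp = (size_mb / ratio) / disk_mb_s + size_mb / decomp_mb_s

print(t_raw, t_comp)  # with these assumptions, the compressed path is faster
```

With these numbers the compressed path comes out ahead; with a slower decompressor or a faster disk the balance flips, which is exactly the "not always true, but often enough" situation described above.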
Mar 24 2009
prev sibling parent Christopher Wright <dhasenan gmail.com> writes:
Nick Sabalausky wrote:
 But it can add up. And I often wonder how much faster and more 
 memory-efficient things like linux and the web could have been if they 
 weren't so big on sticking damn near everything into "convenient" text 
 formats.

Most programs only need to load up text on startup. So the cost of parsing the config file is linear in the number of times you start the application, and linear in the size of the config file. If there were a binary database format in place of libraries, I would be fine with it, as long as there were a convenient way to get the textual version.
Mar 24 2009
prev sibling parent "Kristian Kilpi" <kjkilpi gmail.com> writes:
On Sun, 22 Mar 2009 14:14:39 +0200, Christopher Wright  
<dhasenan gmail.com> wrote:

 Kristian Kilpi wrote:
 #includes/imports are redundant information: the source code of course  
 describes what's used in it. So, the compiler could be aware of the  
 whole project (and the libraries used) instead of one file at the time.

That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.

Such things should of course be told to the compiler somehow, by using the project configuration or by other means. (It's only a matter of definition.) For example, if my project contains the Surface class, then 'Surface s;' should of course refer to it. If some library (used by the project) also has a Surface class, then one should use some other way to refer to it (e.g. sdl.Surface).

But my point was that the compilers today do not have knowledge about the project as a whole. That makes this kind of 'scanning' too expensive (in the current compiler implementations). But if the compilers were built differently, that wouldn't have to be true.

If I were to create/design a compiler (which I am not ;) ), it would be something like this: every file is cached (why read and parse files over and over again if it's not necessary?). These cache files would contain all the information (parse trees, interfaces, etc.) needed during the compilation (of the whole project). They would also contain the compilation results (i.e. assembly). So these cache/database files would logically replace the old object files. That is, there would be a database for the whole project. When something gets changed, the compiler knows what effect it has and what needs to be done.

And finally, I would also change the format of libraries. A library would be one file only. No more header/.di files; one compact file containing all the needed information (in a binary formatted database that can be read very quickly).
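The recompilation rule running through this thread — rebuild a module when its own source changes, or when the *interface* of something it imports changes, but not when only a dependency's body changes — can be modeled in a few lines. This is a hypothetical toy, not how dmd or any real build tool actually works:

```python
import hashlib

def digest(s):
    return hashlib.sha1(s.encode()).hexdigest()

# Hypothetical modules: source split into an exported interface and a body.
modules = {
    "surface": {"interface": "class Surface { void draw(); }",
                "body": "void draw() { }"},
    "app":     {"interface": "",
                "body": "import surface; void main() { }"},
}
imports = {"surface": [], "app": ["surface"]}

def build(cache):
    """Return the modules that need recompiling, updating the cache."""
    rebuilt = []
    for name, m in modules.items():
        own = digest(m["interface"] + m["body"])  # the module's own source
        # Only the *interfaces* of imported modules matter, not their bodies.
        deps = tuple(digest(modules[d]["interface"]) for d in imports[name])
        stamp = (own, deps)
        if cache.get(name) != stamp:
            rebuilt.append(name)
            cache[name] = stamp
    return rebuilt

cache = {}
build(cache)                     # first build: everything is compiled
modules["surface"]["body"] = "void draw() { /* new */ }"
build(cache)                     # only "surface" rebuilds; "app" is untouched
modules["surface"]["interface"] += " void blit();"
build(cache)                     # interface changed: both modules rebuild
```

A compiler-integrated build would get the `imports` graph for free from semantic analysis (catching even mixin-generated imports), which is exactly what external dependency scanners struggle with.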
Mar 22 2009
prev sibling parent "Unknown W. Brackets" <unknown simplemachines.org> writes:
Actually, dmd is so fast I never bother with these "build" utilities.  I 
just send it all the files and have it rebuild every time, deleting all 
the object files afterward.

This is very fast, even for larger projects.  It appears (to me) the 
static cost of calling dmd is much greater than the dynamic cost of 
compiling a file.  These toolkits always compile a, then b, then c, 
which takes like 2.5 times as long as compiling a, b, and c at once.

That said, if dmd were made to link into other programs, these toolkits 
could hook into it, and have the fixed cost only once (theoretically) - 
but still dynamically decide which files to compile.  This seems ideal.

-[Unknown]


davidl wrote:
 
 1. compiler know in what situation a file need to be recompiled
 
 Consider the file given the same header file, then the obj file of this 
 will be required for linking, all other files import this file shouldn't 
 require any recompilation in this case. If a file's header file changes, 
 thus the interface changes, all files import this file should be 
 recompiled.
 Compiler can emit building command like rebuild does.
 
 I would enjoy:
 
 dmd -buildingcommand abc.d  > responsefile
 
 dmd  responsefile
 
 I think we need to eliminate useless recompilation as much as we should 
 with consideration of the growing d project size.
 
 2. maintaining the build without compiler support costs
 

Mar 22 2009