
D - Compilation model

reply "Martin M. Pedersen" <mmp www.moeller-pedersen.dk> writes:
Currently, D is using the same compilation model as C/C++, but I'm not
convinced that it cannot be done better.

We have (at least):
- directories
- modules
- source files
- object files
- libraries
- binaries

The main reason for using the same compilation model as C/C++ is link
compatibility, as I see it. But there is room for changes while keeping link
compatibility. There is a one-to-one correspondence between directories and
modules. I suggest that we keep it this way. There is also a one-to-one
correspondence between source files and object files. I suggest that this be
changed.

How about letting the compiler compile a whole module/directory at once, and
emit a library. Object files within the library should be of much smaller
granularity than the source files, and there would be many more of
them. Ideally, one object file per method or public data symbol. But it would
not really be a concern using the compiler, because we - users of the
compiler - would not mess with the object files, only libraries. The
benefits would be:

- The compiler would be able to do module-wide optimizations.
- The linker (an existing, traditional linker) would be able to filter away
more unused code and data.
- When compiling a module, a source file needs only to be parsed once.
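The proposed granularity can be sketched with today's C toolchains (a hedged stand-in, since no D compiler works this way yet; gcc and GNU binutils are assumed, and all file names here are invented for illustration). `-ffunction-sections` gives each function its own section, which approximates "one object file per symbol" inside a library:

```shell
# A minimal sketch of the proposed granularity, using a C toolchain as a
# stand-in (gcc + GNU binutils assumed; file names are invented).
# -ffunction-sections puts each function in its own section, so the
# linker can pull in or discard functions individually, much like
# "one object file per symbol" inside a library.
cat > mod.c <<'EOF'
const char *used(void)   { return "used"; }
const char *unused(void) { return "unused"; }
EOF
cat > prog.c <<'EOF'
extern const char *used(void);
int main(void) { return used()[0] == 'u' ? 0 : 1; }
EOF
gcc -c -ffunction-sections -fdata-sections mod.c prog.c
ar rcs libmod.a mod.o                    # the "module" becomes a library
gcc -Wl,--gc-sections -o app prog.o libmod.a
nm --defined-only app | grep -w used     # kept: it is referenced
```

On this setup, `nm` should no longer list `unused` in the final binary, since its section is never referenced and `--gc-sections` discards it.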

In the traditional model, when a change is made outside a module, it might
happen that only a portion of the module needs to be recompiled. But in many
cases, perhaps most often, such changes mean that the whole
module needs to be recompiled. This leads me to think that we might as well
treat the module as the translation unit, and gain the benefits above.

Comments?

Regards,
Martin M. Pedersen
May 14 2003
next sibling parent "Walter" <walter digitalmars.com> writes:
D is designed so that an advanced compiler can compile the entire project in
one go. For the time being it follows the traditional C/C++ development
model because I have limited resources, and this takes maximum advantage of
existing tools.

"Martin M. Pedersen" <mmp www.moeller-pedersen.dk> wrote in message
news:b9uh9j$2fls$1 digitaldaemon.com...
May 14 2003
prev sibling next sibling parent "Sean L. Palmer" <palmer.sean verizon.net> writes:
I find no flaw in your arguments, and agree it would be an improvement.

Sean

"Martin M. Pedersen" <mmp www.moeller-pedersen.dk> wrote in message
news:b9uh9j$2fls$1 digitaldaemon.com...
May 14 2003
prev sibling next sibling parent reply Helmut Leitner <helmut.leitner chello.at> writes:
I agree, especially with the need to resolve the one-to-one relationship
between source file (module) and object file.

One way to do this could be manually, by using a 
  #pragma split
which could produce from a source file
  module.d:
  ...independent code, part 1
  #pragma split
  ...independent code, part 2
  #pragma split
  ...independent code, part 3
three separate object files (via a compiler "split" option) 
  module_001.obj
  module_002.obj
  module_003.obj
for inclusion into libraries.

This would make it possible to reduce the footprint of D applications:

- embedded people won't consider D with its current 60K
  minimum exe size (they think in the range 1-8K).

- needing Win32 structs, you will have to compile and link
  some windows.d file into your application. Its 500 init
  default structures add about 30K. This could be split
  into reasonable parts without creating 500 independent
  source files, to have a space-optimized interface.

- it would allow a more efficient reuse situation, because adding
  some code to a module would not automatically mean adding
  to the footprint of every application using the module.

Of course it would be more elegant if this effect could be
transparent to the programmer, so that the split parts were
produced as one granular object file that could be linked just
like typical monolithic object files. But I think linkers won't
support this.

"Martin M. Pedersen" wrote:
 
 Currently, D is using the same compilation model as C/C++, but I'm not
 convinced that it cannot be done better.
 
 We have (at least):
 - directories
 - modules
 - source files
 - object files
 - libraries
 - binaries
 
 The main reason for using the same compilation model as C/C++ is link
 compability, as I see it. But there is room for changes, while keeping link
 compability. There is a one-to-one correspondance between directories and
 modules. I suggest that we keep it this way. There is also a one-to-one
 correspondance between source files and object file. I suggest that is
 changed.
 
 How about letting the compiler compile a whole module/directory at once, and
 emit a library. Object files within the library should be of much smaller
 smaller granularity than the source files, and there would be many more of
 them. Idially one object file per method or public data symbol. But it would
 not really be a concern using the compiler, because we - users of the
 compiler - would not mess with the object files, only libraries. The
 benefits would be:
 
 - The compiler would be able to do module-wide optimizations.
 - The linker (an existing, traditional linker) would be able to filter away
 more unused code and data.
 - When compiling a module, a source file needs only to be parsed once.
 
 In the traditional model, when a change is made outside a module, it might
 happen that only a portion of the module needs to be recompiled. But in many
 cases, perhaps the most often cases, such changes means that the whole
 module needs to be recompiled. This leads me to think that we might as well
 treat the module as the translation unit, and gain the benefits above.
 
 Comments?
 
 Regards,
 Martin M. Pedersen

-- Helmut Leitner leitner hls.via.at Graz, Austria www.hls-software.com
May 14 2003
next sibling parent Mark T <Mark_member pathlink.com> writes:
In article <3EC32D89.2F4CD480 chello.at>, Helmut Leitner says...
I agree, esp. with the need to resolve the one-to-one relationship
between source file (module) and object file.

I think that Walter's current mapping is fine. It doesn't prevent a D build system from doing a global optimization of the whole "program" for release, which would be nice for real-time or number-crunching apps.
One way to do this could be manually, by using a 
  #pragma split
which could produce from a source file
  module.d:
  ...independent code, part 1
  #pragma split
  ...independent code, part 2
  #pragma split
  ...independent code, part 3
three separate object files (via a compiler "split" option) 
  module_001.obj
  module_002.obj
  module_003.obj
for inclusion into libraries.

yuk pragma stuff
This would allow to reduce the footprint of D applications:

- embedded people won't consider D with its current 60K
  minimum exe size (they think in the range 1-8K).

I do embedded; it comes in all sizes. The folks working in the above size range do NOT use a 32-bit (or bigger) processor, which was a requirement for D early on. Most everything I work on now is 32-bit, with some 16-bit processors. Reuse, project complexity (makefile structure), etc. are usually more important. The D module appears to support this.
May 16 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3EC32D89.2F4CD480 chello.at...
 I agree, esp. with the need to resolve the one-to-one relationship
 between source file (module) and object file.

 One way to do this could be manually, by using a
   #pragma split
 which could produce from a source file
   module.d:
   ...independent code, part 1
   #pragma split
   ...independent code, part 2
   #pragma split
   ...independent code, part 3
 three separate object files (via a compiler "split" option)
   module_001.obj
   module_002.obj
   module_003.obj
 for inclusion into libraries.

The D compiler automatically generates COMDATs for each function in a module, so they are individually linked in. It's equivalent to doing the above.
 This would allow to reduce the footprint of D applications:
 - embedded people won't consider D with its current 60K
   minimium exe size (they think in the range 1-8K).

The D footprint is about 24k larger than the equivalent C footprint.
 - needing Win32 structs you will need to compile and link
   some windows.d file into your application. Its 500 init
   default structures add about 30K. This could be split
   into reasonable parts without creating 500 independent
   source files to have a space-optimized interface.

Actually, I need to remove the need to link in the struct inits for zero-initialized structs.
May 16 2003
parent reply Helmut Leitner <helmut.leitner chello.at> writes:
Walter wrote:
 
 "Helmut Leitner" <helmut.leitner chello.at> wrote in message
 news:3EC32D89.2F4CD480 chello.at...
 I agree, esp. with the need to resolve the one-to-one relationship
 between source file (module) and object file.

 One way to do this could be manually, by using a
   #pragma split
 which could produce from a source file
   module.d:
   ...independent code, part 1
   #pragma split
   ...independent code, part 2
   #pragma split
   ...independent code, part 3
 three separate object files (via a compiler "split" option)
   module_001.obj
   module_002.obj
   module_003.obj
 for inclusion into libraries.

The D compiler automatically generates COMDATs for each function in a module, so they are individually linked in. It's equivalent to doing the above.

Hey, that's great! I tried to check it, and it seems to work. Is there a linker option that gives a more detailed list of the code (text) area?
 This would allow to reduce the footprint of D applications:
 - embedded people won't consider D with its current 60K
   minimium exe size (they think in the range 1-8K).

The D footprint is about 24k larger than the equivalent C footprint.

Both seem a bit large, given that the linker removes dead code.
 - needing Win32 structs you will need to compile and link
   some windows.d file into your application. Its 500 init
   default structures add about 30K. This could be split
   into reasonable parts without creating 500 independent
   source files to have a space-optimized interface.

Actually, I need to remove the need to link in the struct inits for zero-initialized structs.

That would be great for the Win32 API, because otherwise one has to work around this.

-- Helmut Leitner leitner hls.via.at Graz, Austria www.hls-software.com
May 19 2003
parent "Walter" <walter digitalmars.com> writes:
"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3EC938EB.5E35CB39 chello.at...
 The D compiler automatically generates COMDATs for each function in a
 module, so they are individually linked in. It's equivalent to doing the
 above.

Is there a linker option that gives a more detailed list of the code (text) area?

/MAP
 This would allow to reduce the footprint of D applications:
 - embedded people won't consider D with its current 60K
   minimium exe size (they think in the range 1-8K).



They're a bit large on Win32 systems because (unfortunately) there is a lot of code always linked in that deals with exceptions thrown from VC++ compiled DLLs. This is necessary so that DMC++ code can call DLLs built with VC++ and catch exceptions thrown by it. This code has no relevance for embedded systems, and so won't be there.
 - needing Win32 structs you will need to compile and link
   some windows.d file into your application. Its 500 init
   default structures add about 30K. This could be split
   into reasonable parts without creating 500 independent
   source files to have a space-optimized interface.

 Actually, I need to remove the need to link in the struct inits for zero-initialized structs.

 That would be great for the Win32 API, because otherwise one has to work around this.

I've got a lot of things like this that need to be done.
May 20 2003
prev sibling parent reply midiclub tiscali.de writes:
In article <b9uh9j$2fls$1 digitaldaemon.com>, Martin M. Pedersen says...
Currently, D is using the same compilation model as C/C++, but I'm not
convinced that it cannot be done better.

It is not really the same model as C. Compared with other languages, it resembles almost anything more than it does C.
We have (at least):
- directories

- modules

one module corresponds to one source file, which corresponds to one object file.
- source files
- object files
- libraries

Libraries are not part of the compilation model - they are used as an aggregate of compiled modules, and possibly their source (or, in the future, possibly parsed source).
- binaries

The main reason for using the same compilation model as C/C++ is link
compability, as I see it. But there is room for changes, while keeping link
compability. There is a one-to-one correspondance between directories and
modules. I suggest that we keep it this way. There is also a one-to-one
correspondance between source files and object file. I suggest that is
changed.

Where do you see a correspondence of directories to modules? You can keep many modules in one directory.
How about letting the compiler compile a whole module/directory at once, and
emit a library. Object files within the library should be of much smaller
smaller granularity than the source files, and there would be many more of
them. Idially one object file per method or public data symbol. But it would
not really be a concern using the compiler, because we - users of the
compiler - would not mess with the object files, only libraries. The
benefits would be:

- The compiler would be able to do module-wide optimizations.
- The linker (an existing, traditional linker) would be able to filter away
more unused code and data.
- When compiling a module, a source file needs only to be parsed once.

The compiler does inter-module optimisations, as far as possible. For that, the complete parsed source of the application and used libraries is contained in memory during the project compilation. This causes the whole library code to be re-parsed once per project compilation, but not more often than that. Maybe Walter could implement dumping of the parse trees to disk - this would be a great and major optimisation.
In the traditional model, when a change is made outside a module, it might
happen that only a portion of the module needs to be recompiled. But in many
cases, perhaps the most often cases, such changes means that the whole
module needs to be recompiled. This leads me to think that we might as well
treat the module as the translation unit, and gain the benefits above.

Modules are as long as a programmer can keep track of - a few dozen pages at most. The current D compiler doesn't take any significant time to compile such an amount. GCC does, and hence it could be of relevance there... Object files need not be of smaller granularity, since:

- unittest/start-up code is common to a module, and must be included anyway no matter how much of the module you use;
- splitting a module apart into the tiniest parts is a common misconception - it shouldn't help. Any decent linker is able to eliminate all functions which are not referenced anywhere. This means methods of a class and functions taking part in a unittest are never removed. All other unused functions are.
Comments?

Seems you're not exactly sure what you're talking about.

-i.
May 15 2003
next sibling parent reply "Martin M. Pedersen" <mmp www.moeller-pedersen.dk> writes:
Hi,

 Where do you see a correspondence of directories to modules? You can keep many
 modules in one directory.

Sorry, I have been away from D for a while. I was thinking more in line with Java's packages. In a larger D project, I would probably organize the modules such that there would be a main module per directory, the one you imported elsewhere, and a number of submodules. In such a case I was thinking of treating all the sources in the directory as one translation unit.
 Any decent linker is able to eliminate all functions which are
 not referenced anywhere.

Then I'm sorry that I don't have a decent linker, or don't know how to
operate one efficiently. Given these sources...

=== main.c
#include <stdio.h>
extern const char* foo();
int main() { puts(foo()); return 0; }

=== foobar.c
extern const char* baz();
const char* bar() { return baz(); }
const char* foo() { return "foo"; }

=== baz.c
const char* baz() { return "This should not be in the executable"; }

.. I'm able to find "This should not be in the executable" in the
executable. I have tried this with the linkers of MSVC6 and DMC.
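For what it's worth, the same experiment can be scripted with a GNU toolchain (gcc and binutils assumed; this is a stand-in, not the MSVC6 or DMC setup from the post), where function-level linking is spelled `-ffunction-sections` plus `--gc-sections`:

```shell
# Re-run the experiment above with and without function-level linking
# (gcc + GNU binutils assumed). The sources are the ones quoted in the
# post.
cat > main.c <<'EOF'
#include <stdio.h>
extern const char* foo();
int main() { puts(foo()); return 0; }
EOF
cat > foobar.c <<'EOF'
extern const char* baz();
const char* bar() { return baz(); }
const char* foo() { return "foo"; }
EOF
cat > baz.c <<'EOF'
const char* baz() { return "This should not be in the executable"; }
EOF
# Naive build: bar() drags in baz() and its string.
gcc -o plain main.c foobar.c baz.c
# Function-level build: the unreferenced bar()/baz() sections are dropped.
gcc -ffunction-sections -fdata-sections -Wl,--gc-sections \
    -o gc main.c foobar.c baz.c
nm --defined-only gc | grep -cw baz || echo "baz eliminated"
```

The telltale string should show up in `strings plain` but the `baz` symbol should be gone from the `gc` binary, matching what Walter describes `-Nc` doing later in the thread.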
 Seems you're not exactly sure what you're talking about.

If I were, I would not ask for comments :-)

Regards,
Martin M. Pedersen
May 15 2003
next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Hello.

Martin M. Pedersen wrote:
 Then I'm sorry that I don't have a decent linker, or don't know how
 to operate them efficiently. Given these sources...

 .. I'm able to find "This should not be in the executable" in the 
 executable. I have tried this with the linker of MSVC6 and DMC.

This doesn't show much, since "This should not be ..." is data, not code, and is stored separately. However, the code seems to be there too; I checked it. :( I'm yet to try some other linkers. Maybe Borland or Watcom?

But from the times I worked with Delphi, back in version 2 or 3, I can remember the linker stripping out functions even from debug executables, in the midst of actively used modules. This manifested itself in debugger complaints when using the real-time expression evaluator: "This function has been eliminated by linker." This sometimes gave me a clue to what was wrong with my code - that it contained some silly plug (a "bone") instead of the real code - a function call - which was intended.

Sorry for my misconception - this probably comes from me reading too much documentation containing hidden advertisements. :> But if such claims exist, there probably is some product which fulfills them...
 Seems you're not exactly sure what you're talking about

If I was, I would not ask for comments :-)

It appears that I'm also not.

-i.
May 15 2003
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Martin M. Pedersen" <mmp www.moeller-pedersen.dk> wrote in message
news:ba0irh$1g5e$1 digitaldaemon.com...
 .. I'm able to find "This should not be in the executable" in the
 executable. I have tried this with the linker of MSVC6 and DMC.

You need to compile with -Nc (function-level linking). It's off by default for C because some legacy C will fail with -Nc. The D compiler does this automatically.
May 20 2003
prev sibling next sibling parent Helmut Leitner <helmut.leitner chello.at> writes:
midiclub tiscali.de wrote:
 - splitting a module apart into tiniest parts is a common misconception - it
 shouldn't help. Any decent linker is able to eliminate all functions which are
 not referenced anywhere. This means, methods of a class and functions taking
 part in a unittest are never removed. All other unused functions are.

I have often heard this but never come upon a system that really did it. So please name a linker for Windows or Linux that shows this behaviour together with current D object modules.

-- Helmut Leitner leitner hls.via.at Graz, Austria www.hls-software.com
May 15 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
<midiclub tiscali.de> wrote in message
news:b9vvrj$r9r$1 digitaldaemon.com...
 In article <b9uh9j$2fls$1 digitaldaemon.com>, Martin M. Pedersen says...
 For that, the complete parsed source of the application and used libraries is
 contained in memory during the project compilation. This causes the whole
 library code to be re-parsed once per project compilation, but not more often
 than that. Maybe Walter could implement dumping of the parse trees to disk -
 this would be a great and major optimisation.

That was the original plan, but the compiler parses so fast it wasn't justifiable to make the effort.
May 16 2003
parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
D compiler is really really fast.  Gotta hand it to ya.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:ba312c$uav$1 digitaldaemon.com...
May 16 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:ba36nl$1430$1 digitaldaemon.com...
 D compiler is really really fast.  Gotta hand it to ya.

Thanks. I haven't even expended any effort tuning it for speed. I just structured the language so it would be fast to parse.
May 16 2003
next sibling parent reply Garen Parham <nospam garen.net> writes:
Walter wrote:

 Thanks. I haven't even expended any effort tuning it for speed. I just
 structured the language so it would be fast to parse.

The whole thing is fast. It's so fast I couldn't tell the difference between doing something like:

  $ export DMD=/path/to/blah

and running the compiler itself. DMC++ is also super fast. What would you attribute that to?
May 16 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Garen Parham" <nospam garen.net> wrote in message
news:ba4f4p$29kc$1 digitaldaemon.com...
 DMC++ is also super fast.  What would you attribute that to?

Profile, profile, profile!
May 18 2003
parent reply Garen Parham <nospam garen.net> writes:
Walter wrote:


 
 Profile, profile, profile!

That's it? I was thinking maybe you'd say you had some ingenious bottom-up strategies tightly knit with the target/languages or something. :)
May 18 2003
parent "Walter" <walter digitalmars.com> writes:
"Garen Parham" <nospam garen.net> wrote in message
news:ba8kn1$ugt$1 digitaldaemon.com...
 Walter wrote:
 Profile, profile, profile!

 That's it? I was thinking maybe you'd say you had some ingenious bottom-up strategies tightly knit with the target/languages or something. :)

I don't think there's anything ingenious in the code. I just have a lot of experience knowing what eats cycles and what doesn't <g>. I must have tried dozens of different ways to do symbol tables. I can tell you, though, if you want a slow compiler, use Lex and Yacc.
May 20 2003
prev sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 Thanks. I haven't even expended any effort tuning it for speed. I just
 structured the language so it would be fast to parse.

assert (Walter == GeniousWizard); AND LET TEH WORLD CRASH IF THIS ASSERT FAILS! :>

-i.
May 18 2003
parent "Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:ba8h98$qsu$1 digitaldaemon.com...
 Walter wrote:
 Thanks. I haven't even expended any effort tuning it for speed. I just
 structured the language so it would be fast to parse.

assert (Walter == GeniousWizard); AND LET TEH WORLD CRASH IF THIS ASSERT FAILS! :>

LOL. If you want to see the source to the lexer/parser, it's included with the download. It looks pretty straightforward, until you compare it with the source to other compilers.
May 20 2003