
digitalmars.D - Solutions to the TypeInfo dependency injection issue?

reply Nathan Petrelli <npetrelli klassmaster.com> writes:
I'm referring to the issue raised by the Tango developers about a TypeInfo for
char[][] inflating the .EXE size by pulling in unneeded modules.

The solution given by Walter to this issue was a careful (and painful)
examination of object file symbols to determine the correct link order.

Have other solutions been planned or considered?

I don't think this is a long-term solution for big projects, especially if an
IDE is being used (most of them don't even let you specify the compilation
order of files).

I think it would be possible to build a tool that analyzes object files and
determines the optimal order in most cases, but this seems like a hack on par
with the moc compiler of the Qt project. A hack that's only needed to paper
over a deficiency in the compiler.
Mar 08 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
Nathan Petrelli wrote:
 I'm referring to the issue raised by the Tango developers about a TypeInfo
 for char[][] inflating the .EXE size by pulling in unneeded modules.
 
 The solution given by Walter to this issue was a careful (and painful)
 examination of object file symbols to determine the correct link order.
 
 Have other solutions been planned or considered?
 
 I don't think this is a long-term solution for big projects, especially if
 an IDE is being used (most of them don't even let you specify the
 compilation order of files).
 
 I think it would be possible to build a tool that analyzes object files and
 determines the optimal order in most cases, but this seems like a hack on
 par with the moc compiler of the Qt project. A hack that's only needed to
 paper over a deficiency in the compiler.

This situation only crops up when you're passing all the modules at once to dmd and then putting the resulting object files into a library. Try compiling the modules independently when they are intended to be put in a library.
Mar 08 2007
parent reply kris <foo bar.com> writes:
Walter Bright wrote:

 This situation also only crops up when you're passing all the modules at 
 once to dmd, and then putting the resulting object files into a library. 
 Try compiling the modules independently when they are intended to be put 
 in a library.

Could you please be explicit about what the distinctions would be? And how it would affect template generation also? Please assume I am a complete idiot, and lead me through step-by-step:

(a) what the implications are for discrete versus "batch" compilation
(b) how the different compilation approaches lead to differing results
(c) how templates are affected at each step

I'm hoping this will lead to a "comprehensive" set of instructions to help others create useful libs for Win32 with DM tools. The longer and more detailed these instructions are, the better it will be for D
Mar 08 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 Walter Bright wrote:
 
 This situation also only crops up when you're passing all the modules 
 at once to dmd, and then putting the resulting object files into a 
 library. Try compiling the modules independently when they are 
 intended to be put in a library.

 Could you please be explicit about what the distinctions would be? And how
 it would affect template generation also? Please assume I am a complete
 idiot, and lead me through step-by-step:
 
 (a) what the implications are for discrete versus "batch" compilation
 (b) how the different compilation approaches lead to differing results
 (c) how templates are affected at each step
 
 I'm hoping this will lead to a "comprehensive" set of instructions to help
 others create useful libs for Win32 with DM tools. The longer and more
 detailed these instructions are, the better it will be for D

When you compile:
	dmd -c a b
then dmd is assuming that a.obj and b.obj will be linked together, so it does not matter which object file something is placed in. In other words, it does not generate things twice.

On the other hand:
	dmd -c a
	dmd -c b
then dmd doesn't know, when compiling a.obj, what will be in b.obj, so it assumes the worst and generates it.

In other words:
	dmd -c a b
	lib foo.lib a.obj b.obj
is not a good way to create a library; instead:
	dmd -c a
	dmd -c b
	lib foo.lib a.obj b.obj
Mar 08 2007
next sibling parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 This situation also only crops up when you're passing all the modules 
 at once to dmd, and then putting the resulting object files into a 
 library. Try compiling the modules independently when they are 
 intended to be put in a library.

 Could you please be explicit about what the distinctions would be? And how
 it would affect template generation also? Please assume I am a complete
 idiot, and lead me through step-by-step:
 
 (a) what the implications are for discrete versus "batch" compilation
 (b) how the different compilation approaches lead to differing results
 (c) how templates are affected at each step
 
 I'm hoping this will lead to a "comprehensive" set of instructions to help
 others create useful libs for Win32 with DM tools. The longer and more
 detailed these instructions are, the better it will be for D

 When you compile:
 	dmd -c a b
 then dmd is assuming that a.obj and b.obj will be linked together, so it
 does not matter which object file something is placed in. In other words,
 it does not generate things twice.
 
 On the other hand:
 	dmd -c a
 	dmd -c b
 then dmd doesn't know, when compiling a.obj, what will be in b.obj, so it
 assumes the worst and generates it.
 
 In other words:
 	dmd -c a b
 	lib foo.lib a.obj b.obj
 is not a good way to create a library; instead:
 	dmd -c a
 	dmd -c b
 	lib foo.lib a.obj b.obj

What about (c) how templates are affected at each step ?
Mar 08 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 What about (c) how templates are affected at each step ?

It's the same algorithm - nothing special about templates.
Mar 08 2007
parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:
 
 What about (c) how templates are affected at each step ?

It's the same algorithm - nothing special about templates.

Is it possible, do you think, to be just a little more forthcoming on this?

1) when you batch-compile code with multiple references to a template, there is just one instance generated.

2) when you compile the same code modules individually, there are presumably multiple template instances generated?

3) how does the linker resolve the multiple template instances to just one?
Mar 08 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 Walter Bright wrote:
 kris wrote:

 What about (c) how templates are affected at each step ?

It's the same algorithm - nothing special about templates.

Is it possible, do you think, to be just a little more forthcoming on this?

1) when you batch-compile code with multiple references to a template, there is just one instance generated.

Yes.
 2) when you compile the same code modules individually, there are 
 presumably multiple template instances generated?

Yes.
 3) how does the linker resolve the multiple template instances to just one?

The template instantiations are put into COMDAT sections, and the linker discards redundant ones.
Mar 08 2007
parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 kris wrote:

 What about (c) how templates are affected at each step ?

It's the same algorithm - nothing special about templates.

Is it possible, do you think, to be just a little more forthcoming on this?

1) when you batch-compile code with multiple references to a template, there is just one instance generated.

Yes.
 2) when you compile the same code modules individually, there are 
 presumably multiple template instances generated?

Yes.
 3) how does the linker resolve the multiple template instances to just 
 one?

The template instantiations are put into COMDAT sections, and the linker discards redundant ones.

Thank you;

4) all symbols required to represent typeinfo and templates are now duplicated in each object file?

5) the linker does not have to search beyond the current object file for instances of #4 (as suggested by larsivi)?

6) the result is a library with many more duplicate symbols than before, but arranged in such a manner that persuades the linker to do the "right thing"?

7) there is no possibility of the linker following a 'bad chain', and thus linking in unused or otherwise redundant code?
Mar 08 2007
parent reply Sean Kelly <sean f4.ca> writes:
kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 kris wrote:

 What about (c) how templates are affected at each step ?

It's the same algorithm - nothing special about templates.

Is it possible, do you think, to be just a little more forthcoming on this?

1) when you batch-compile code with multiple references to a template, there is just one instance generated.

Yes.
 2) when you compile the same code modules individually, there are 
 presumably multiple template instances generated?

Yes.
 3) how does the linker resolve the multiple template instances to 
 just one?

The template instantiations are put into COMDAT sections, and the linker discards redundant ones.

 Thank you;
 
 4) all symbols required to represent typeinfo and templates are now
 duplicated in each object file?

My guess is separate compilation generates all TypeInfo and templates used by that module into the module's object file. Which I believe is a "yes."
 5) The linker does not have to search beyond the current object file for 
 instances of #4 (as suggested by larsivi) ?

Correct.
 6) the result is a library with many more duplicate symbols than before, 
 but arranged in such a manner that persuades the linker to do the "right 
 thing" ?

Yes.
 7) there is no possibility of the linker following a 'bad chain', and 
 thus linking in unused or otherwise redundant code ?

It certainly seems that way. We get larger object files and libraries in exchange for smaller executables.

If any of the above is wrong, someone please correct me.


Sean
Mar 11 2007
parent Pragma <ericanderton yahoo.removeme.com> writes:
Sean Kelly wrote:
 kris wrote:
 6) the result is a library with many more duplicate symbols than 
 before, but arranged in such a manner that persuades the linker to do 
 the "right thing" ?

Yes.
 7) there is no possibility of the linker following a 'bad chain', and 
 thus linking in unused or otherwise redundant code ?

It certainly seems that way. We get larger object files and libraries in exchange for smaller executables. If any of the above is wrong, someone please correct me.

That agrees with my experience, although I'm not sure about the "smaller executables" part. I think the reason why we sometimes get larger executables is more incidental than deliberate, so it doesn't always work out that way. But if we opt for larger object files, then yes, we *always* get the smallest executable size as a result.

A nice thing to add to DMD for all this would be to emit "fat .obj files" when -c is supplied, no matter how many .d files are passed on the command line. That way, the optimizations Walter has added (non-duplication of templates and typeinfo) would still be useful for direct-to-link situations (w/o -c).

--
- EricAnderton at yahoo
Mar 12 2007
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Thu, 08 Mar 2007 12:36:24 -0800, Walter Bright wrote:

 Walter Bright wrote:
 
 This situation also only crops up when you're passing all the modules 
 at once to dmd, and then putting the resulting object files into a 
 library. Try compiling the modules independently when they are 
 intended to be put in a library.



 When you compile:
 	dmd -c a b
 then dmd is assuming that a.obj and b.obj will be linked together, so it 
 does not matter which object file something is placed in. In other 
 words, it does not generate things twice.
 
 On the other hand:
 	dmd -c a
 	dmd -c b
 then dmd doesn't know, when compiling a.obj what will be in b.obj, so it 
 assumes the worst and generates it.
 
 In other words:
 	dmd -c a b
 	lib foo.lib a.obj b.obj
 is not a good way to create a library, instead:
 	dmd -c a
 	dmd -c b
 	lib foo.lib a.obj b.obj

One of the things that greatly impressed me was DMD's ability to quickly compile multiple files in one pass, rather than the make-like process of doing one file per DMD run. So when I came to write Bud, I made a lot of effort to ensure that I could compile as many files as possible in one call to the compiler.

It now seems that you are warning us against this feature of DMD in the case of creating libraries. This is extremely disappointing.

I will add a new switch to Bud to force file-by-file compilation.

--
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
Mar 08 2007
parent kris <foo bar.com> writes:
Derek Parnell wrote:
 One of the things that greatly impressed me was DMD's ability to quickly
 compile multiple files in one pass, rather than the make-like process on
 doing one file per DMD run. So when I came to write Bud, I made a lot of
 effort to ensure that I could compile as many as possible files in one call
 to the compiler. 
 
 It now seems that you are warning us against this feature of DMD, in the
 case of creating libraries. This is extremely disappointing.
 
 I will add a new switch to Bud to force file-by-file compilation.

I don't see that as a hardship when building libs, since it perhaps doesn't happen as often as "regular" builds (assuming, of course, that this strategy actually resolves the underlying issue)?

Having said that, the new switch will be *greatly* appreciated. Means we can avoid having to create and maintain the damn make-files. Thanks, Derek!
Mar 08 2007