digitalmars.D - Lib change leads to larger executables

Walter Bright (22/25) Feb 21 2007 Let's say you have a template instance, TI. It is declared in two

kris (6/39) Feb 21 2007 This is definitely useful; thanks for this and for being so expedient

Walter Bright (3/6) Feb 21 2007 For a quick & dirty test, try reversing the order the object files are

kris (2/11) Feb 21 2007 there's a couple of hundred :-D

Walter Bright (3/8) Feb 21 2007 Do the ones that were giving the undefined reference before the changes

kris (8/21) Feb 21 2007 The lib itself is actually built via a single D module, which imports

kris (5/33) Feb 21 2007 I've been messing with the response file handed to the librarian (via

Walter Bright (5/9) Feb 21 2007 Then look at the .map file to see what was linked in to the larger file

kris (3/15) Feb 21 2007 That's exactly what I'm doing, and I agree there seems to be something

kris (61/80) Feb 21 2007 OK: narrowed it down to one obj file. The vast sea of data is coming

Pragma (6/97) Feb 21 2007 Sorry if I'm stating the obvious, but it seems to me that the linker is ...

kris (5/116) Feb 21 2007 Heya Eric

Frits van Bommel (5/20) Feb 21 2007 Did you try putting it at the front of the lib? You never know, maybe it...

Pragma (7/29) Feb 21 2007 I thought about that too, but that doesn't seem to be the case. As Kris ...

kris (5/39) Feb 21 2007 Perhaps a question is this: why the heck is that symbol exposed from

Walter Bright (3/7) Feb 21 2007 The standard TypeInfo's in Phobos cover only basic types and arrays of

kris (11/21) Feb 21 2007 Yep, I've seen that to be the case. However, the current strategy

Frits van Bommel (8/11) Feb 21 2007 The obvious solution would be to always generate typeinfo even if it can...

Walter Bright (11/22) Feb 21 2007 I wish to be precise - there is no "seems" or "confuse" with linking. It...

Derek Parnell (33/59) Feb 21 2007 Walter,

Walter Bright (22/52) Feb 21 2007 That's right. COMDATs make things slightly more complicated, as

Derek Parnell (9/12) Feb 21 2007 Has anyone tried successfully to use a Windows linker, other than OptLin...

kris (6/17) Feb 21 2007 I tried the 'modified' Watcom linker the other night, but it found some
John Reimer (6/14) Feb 21 2007 What does gcc have to do with windows dmd tools? And what good is it to

Carlos Smith (17/17) Feb 21 2007 "Walter Bright" wrote in message

Walter Bright (2/3) Feb 23 2007 It is for the Linux DMD.

John Reimer (8/20) Feb 22 2007 That's not a good argument. ld is pig slow? I'm sorry but I don't get

Anders F Björklund (8/15) Feb 22 2007 I've found OPTLINK to hang and crash a lot when linking the wxD programs

Bill Baxter (6/24) Feb 22 2007 I see hangs occasionally even for small programs. Even on single files

Walter Bright (4/8) Feb 23 2007 I've never seen optlink crash except on known cases where there's a

Walter Bright (4/13) Feb 23 2007 I'd forgotten, there is a problem with optlink running on multicore

Frits van Bommel (4/18) Feb 23 2007 Is it a threaded application? (That would explain why the error is spora...

Walter Bright (5/20) Feb 23 2007 Yes.

Sean Kelly (6/13) Feb 23 2007 Assuming you're talking about cores that don't share a cache (ie. "real"...

Bill Baxter (3/17) Feb 23 2007 Hmm, one of my machines is core duo, so that could maybe be it.

Bill Baxter (16/25) Feb 23 2007 Hmm, well it's not crashing just hanging. And it may not be optlink but...

David Gileadi (6/36) Feb 26 2007 I've seen it before compiling the wxD samples. I'm running a Pentium 4

Lionello Lunesu (3/7) Feb 23 2007 I got that too! Same numbers, 1:50. AMD dual core.

Sean Kelly (7/26) Feb 22 2007 Ideally, perhaps a linker could provide both options: link fast and

Frits van Bommel (6/11) Feb 22 2007 That might not be the case here: if a module's object file is pulled in,...

Sean Kelly (3/15) Feb 22 2007 Yuck. Good point.
Kristian Kilpi (5/16) Feb 22 2007 Hmm, yes, but how that's different from the today's situation? Currently...

Sean Kelly (4/23) Feb 22 2007 Because as long as the list of dependencies remains unchanged, the same

Kristian Kilpi (5/27) Feb 22 2007 Well yes, except there is no guarantees of that, in the specs I mean.

Walter Bright (2/4) Feb 23 2007 What is "link carefully"?

Sean Kelly (2/7) Feb 23 2007 Link at a segment level instead of a module level.

jcc7 (17/29) Feb 22 2007 I think your idea could work. It makes sense to me, but I'd like to go o...

Frits van Bommel (25/56) Feb 22 2007 Not all libraries may have a DllMain, IIRC it's completely optional.

jcc7 (26/77) Feb 22 2007 (By the way, this topic is mostly over-my-head, so I'll probably have to...

Frits van Bommel (36/77) Feb 22 2007 How static constructors could interfere:

Justin C Calvarese (25/88) Feb 22 2007 Oh, I thought the .obj file included mentions of things that are needed,...

Frits van Bommel (32/85) Feb 22 2007 Oh, you want the compiler to parse the .obj files to generate some extra...

jcc7 (10/30) Feb 23 2007 Doh! I forgot that the compiler doesn't read the .obj/.lib files, but ju...

Daniel Keep (23/23) Feb 22 2007 (I'm just going to interject here because WOW this thread is getting

kris (3/28) Feb 23 2007 On the face of it, that sounds like a reasonable solution. One would

Dave (2/32) Feb 23 2007 Great idea if it's feasible... Would it then make sense that the switch ...

jcc7 (7/26) Feb 23 2007 I don't know enough about how linkers work to know if OPTLINK can just i...

Daniel Keep (19/39) Feb 23 2007 I had a peek at the TypeInfos that are hard-coded into Phobos. They do

kris (2/27) Feb 21 2007 No change, Frits

Walter Bright (6/7) Feb 21 2007 TypeInfo's don't get the module prefix because it would cause

kris (2/12) Feb 21 2007 well, ok ... but it is responsible for what happened here? If not, what ...

Walter Bright (3/16) Feb 21 2007 From your description, the linker is looking to resolve a reference to

kris (6/29) Feb 21 2007 That's exactly what it looks like. Would you agree the results could be

Walter Bright (10/13) Feb 21 2007 I bet that's because that module was imported (directly or indirectly)

kris (16/38) Feb 21 2007 1) Tango takes this very seriously ... more so than Phobos, for example.

Walter Bright (39/66) Feb 21 2007 Sure, but in this particular case, it seems that "core" is being

kris (31/111) Feb 21 2007 This core module, and the entire locale package it resides in, is /not/

Justin C Calvarese (27/158) Feb 21 2007 I'm not trying to pick a fight with any of the people who have been

kris (115/143) Feb 21 2007 Well said, Justin. I'm personally feeling like there's either some vast

John Reimer (28/28) Feb 22 2007 < SNIP good post from Kris >

Walter Bright (50/55) Feb 23 2007 Linux's ld exhibits the same behavior. Try compiling the 3 files here

Sean Kelly (4/17) Feb 23 2007 In your example, no symbols at all from a is referenced in b or in test,...

Sean Kelly (2/20) Feb 23 2007 Forget I said that. It's the TypeInfo for char[][].

Walter Bright (4/25) Feb 23 2007 That's right, it picked the FIRST ONE in library, regardless of how many...

Sean Kelly (3/29) Feb 23 2007 That makes complete sense. It's irritating that such matches pull in

Frits van Bommel (7/9) Feb 23 2007 Of course, if matches didn't pull in the entire object then static

Sean Kelly (2/13) Feb 23 2007 Hm... so how does segment-level linking work at all?

Frits van Bommel (20/34) Feb 23 2007 Well, ld has a switch called --gc-sections, which basically... wait for

John Reimer (39/103) Feb 23 2007 True, you are correct about the same error being represented here. This

John Reimer (15/15) Feb 24 2007 I want to point out also that there /is/ a way to partially side-step th...

Jascha Wetzel (18/23) Feb 22 2007 just a thought:
Kristian Kilpi (21/39) Feb 22 2007 [snip]
janderson (6/9) Feb 22 2007 [snip]

Frits van Bommel (19/28) Feb 22 2007 Presumably this would leave static constructors/destructors intact? If

Sean Kelly (30/61) Feb 22 2007 This is the crux of the problem. In C/C++, problem areas can typically

Frits van Bommel (12/16) Feb 22 2007 Doesn't Build only link together the object files for modules that are

Walter Bright (34/40) Feb 23 2007 The librarian takes a list of .obj files, and concatenates them

Frits van Bommel (3/13) Feb 23 2007 When doing a lookup while linking, does it at least check other .obj

Walter Bright (15/17) Feb 23 2007 The linker first puts together all the explicitly listed object files.

Sean Kelly (4/8) Feb 23 2007 So how are TypeInfo definitions resolved? I'd think it would be pretty

Walter Bright (5/13) Feb 23 2007 if (name in library.dictionary)

Sean Kelly (2/18) Feb 23 2007 Oh, so TypeInfo are stored in COMDATs. Makes perfect sense. Thanks!

Walter Bright (10/21) Feb 23 2007 Then the typeinfo for char[][] is being generated by another module. I

kris (29/47) Mar 07 2007 After taking a much needed break from this, I'm having another bash at

kris (8/68) Mar 07 2007 In fact, it is so brittle and fragile that I now cannot reproduce what's...

Pragma (7/78) Mar 08 2007 I made a pass at trying to reproduce this the last time out, with no suc...

Sean Kelly (9/66) Mar 07 2007 It's a long-term proposition, but what about delaying the generation of

Carlos Santander (6/17) Mar 08 2007 Unless I'm missing something, I don't think a new linker would be requir...

Daniel Keep (9/25) Mar 08 2007 What about build utilities that compile each module separately (using

Pragma (6/73) Mar 08 2007 I was going to say: new linker == our problem. I'm in the same boat on ...

Don Clugston (5/79) Mar 08 2007 I've been wondering how far your work with DDL goes towards writing a

Pragma (12/94) Mar 08 2007 Well, the OMF loader needs some polish and some subtle refactoring (read...

kris (6/8) Feb 21 2007 Just to satisfy your stance I tried this; guess what? It has no effect

Walter Bright (3/11) Feb 23 2007 Then there's something else going on, i.e. another symbol is being
Walter Bright (3/11) Feb 23 2007 Did you verify (using obj2asm) that the separate module actually did

John Reimer (7/41) Feb 21 2007 Is build really a reliable means of testing this? I mean, it's produced

John Reimer (4/50) Feb 22 2007 I obviously misunderstood the whole issue here. After reading the

Sean Kelly (5/30) Feb 21 2007 For some reason I thought an optimizing linker worked at a segment

Walter Bright (2/6) Feb 21 2007 The linker works at the .obj file level.

Frits van Bommel (3/10) Feb 21 2007 GNU ld seems to be perfectly happy working at the section level (with

Walter Bright (3/5) Feb 21 2007 Yeah, well, try linking D programs with --gc-sections, and you'll get a

Lionello Lunesu (4/9) Feb 21 2007 Thomas has suggested some fixes for that in bugzilla #879.

Walter Bright (5/14) Feb 21 2007 Yes, I know, and I'll probably implement them. But they are a hack. A

Frits van Bommel (6/12) Feb 21 2007 Haven't had trouble with it so far, though I seem to recall reading

Kristian Kilpi (8/34) Feb 21 2007 Here's a quick thought. (It's probably too impractical/absurd. ;) ) Coul...

Pragma (11/53) Feb 21 2007 Nice idea, but I'd rather see the librarian to (optionally?) do this job...

kris (6/69) Feb 21 2007 Just to clarify the current situation: the ballooned exe file has

Walter Bright <newshound digitalmars.com> writes:

 It does, but increases the exe size of the first example from 180kb to 617kb!
 180kb is when compiled using build/rebuild/jake etc (no library) and the 617kb
 is when using dmd+lib only. Same flags in both cases: none at all

Let's say you have a template instance, TI. It is declared in two 
modules, M1 and M2:

-----------M1------------
TI
A
-----------M2------------
TI
B
-------------------------

M1 also declares A, and M2 also declares B. Now, the linker is looking 
to resolve TI, and the first one it finds is one in M1, and so links in 
M1. Later on, it needs to resolve B, and so links in M2. The redundant 
TI is discarded (because it's a COMDAT).

However, suppose the program never references A, and A is a chunk of 
code that pulls in lots of other bloat. This could make the executable 
much larger than if, in resolving TI, it had picked M2 instead.

You can control which module containing TI will be pulled in by the 
linker to resolve TI, by specifying that module first to lib.exe.

You can also put TI in a third module that has neither A nor B in it. 
When compiling M1 and M2, import that third module, so TI won't be 
generated for M1 or M2.
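
The effect of library order can be seen in a toy model. The Python sketch below is an illustration only, not how lib.exe or OPTLINK are implemented: it assumes the library dictionary records the first module defining each symbol, and that a module is always linked whole. The module and symbol names (M1, M2, BLOAT, TI, A, B, bloat) are the hypothetical ones from the example above.

```python
# Toy model of library linking (an assumption-laden sketch, not OPTLINK itself):
# - the dictionary maps each symbol to the FIRST module that defines it
# - a module is linked whole, so everything it references must be resolved too

def build_dictionary(library):
    """library: ordered list of (module_name, defined_symbols, referenced_symbols)."""
    dictionary = {}
    for name, defines, _refs in library:
        for sym in defines:
            dictionary.setdefault(sym, name)  # first definition wins
    return dictionary

def link(roots, library):
    """Return the modules pulled in, in order, to satisfy the root references."""
    dictionary = build_dictionary(library)
    defs_of = {name: defines for name, defines, _refs in library}
    refs_of = {name: refs for name, _defines, refs in library}
    linked, resolved, pending = [], set(), list(roots)
    while pending:
        sym = pending.pop(0)
        if sym in resolved:
            continue  # already defined by a module that was linked earlier
        module = dictionary.get(sym)
        if module is None:
            raise LookupError("undefined symbol: " + sym)
        linked.append(module)
        resolved.update(defs_of[module])   # the whole module comes in
        pending.extend(refs_of[module])    # ...and so do its dependencies
    return linked

M1 = ("M1", ["TI", "A"], ["bloat"])   # A drags in unrelated code
M2 = ("M2", ["TI", "B"], [])
BLOAT = ("BLOAT", ["bloat"], [])

print(link(["TI", "B"], [M1, M2, BLOAT]))  # M1 listed first
print(link(["TI", "B"], [M2, M1, BLOAT]))  # M2 listed first
```

With M1 first, resolving TI pulls in M1 and, through A's reference, BLOAT as well. With M2 first, M2 alone satisfies both TI and B.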

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 It does, but increases the exe size of the first example from 180kb to 
 617kb!
 180kb is when compiled using build/rebuild/jake etc (no library) and 
 the 617kb
 is when using dmd+lib only. Same flags in both cases: none at all

 Let's say you have a template instance, TI. It is declared in two 
 modules, M1 and M2:

 -----------M1------------
 TI
 A
 -----------M2------------
 TI
 B
 -------------------------

 M1 also declares A, and M2 also declares B. Now, the linker is looking 
 to resolve TI, and the first one it finds is one in M1, and so links in 
 M1. Later on, it needs to resolve B, and so links in M2. The redundant 
 TI is discarded (because it's a COMDAT).

 However, suppose the program never references A, and A is a chunk of 
 code that pulls in lots of other bloat. This could make the executable 
 much larger than if, in resolving TI, it had picked M2 instead.

 You can control which module containing TI will be pulled in by the 
 linker to resolve TI, by specifying that module first to lib.exe.

 You can also put TI in a third module that has neither A nor B in it. 
 When compiling M1 and M2, import that third module, so TI won't be 
 generated for M1 or M2.

This is definitely useful; thanks for this and for being so expedient 
with the lib change.

In this particular case I suspect something else is the cause, since (a) 
Tango is deliberately very granular (b) the map file for the huge exe is 
showing gobs of data that shouldn't exist <g>

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 In this particular case I suspect something else is the cause, since (a) 
 Tango is deliberately very granular (b) the map file for the huge exe is 
 showing gobs of data that shouldn't exist <g>

For a quick & dirty test, try reversing the order the object files are 
presented to lib.

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 In this particular case I suspect something else is the cause, since 
 (a) Tango is deliberately very granular (b) the map file for the huge 
 exe is showing gobs of data that shouldn't exist <g>

 
 
 For a quick & dirty test, try reversing the order the object files are 
 presented to lib.

there's a couple of hundred :-D

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 For a quick & dirty test, try reversing the order the object files are 
 presented to lib.

 
 there's a couple of hundred :-D

Do the ones that were giving the undefined reference before the changes 
to lib.

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 For a quick & dirty test, try reversing the order the object files 
 are presented to lib.


 there's a couple of hundred :-D

 
 
 Do the ones that were giving the undefined reference before the changes 
 to lib.

The lib itself is actually built via a single D module, which imports 
all others. This is then given to Build to construct the library. Thus I 
don't have direct control over the ordering. Having said that, it 
appears Build does something different when the modules are reordered, 
likely changing the order in which modules are presented to the lib.

By moving things around, I see a change in size on the target executable 
of between -4kb and +5kb

Feb 21 2007

kris <foo bar.com> writes:

kris wrote:
 Walter Bright wrote:
 
 kris wrote:

 Walter Bright wrote:

 For a quick & dirty test, try reversing the order the object files 
 are presented to lib.



 there's a couple of hundred :-D



 Do the ones that were giving the undefined reference before the 
 changes to lib.

 
 
 The lib itself is actually built via a single D module, which imports 
 all others. This is then given to Build to construct the library. Thus I 
 don't have direct control over the ordering. Having said that, it 
 appears Build does something different when the modules are reordered; 
 Likely changing the order in which modules are presented to the lib.
 
 By moving things around, I see a change in size on the target executable 
 between -4kb to +5kb
 

I've been messing with the response file handed to the librarian (via 
lib @foo); moving modules around here and there, reordering big chunks 
etc. Have yet to see a notable change in the resulting exe after 
relinking against each lib version.

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 I've been messing with the response file handed to the librarian (via 
 lib  foo); moving modules around here and there, reordering big chunks 
 etc. Have yet to see a notable change in the resulting exe after 
 relinking against each lib version.

Then look at the .map file to see what was linked in to the larger file 
that wasn't in the smaller. Remove that module from the library. Link 
again, and see what was unresolved. Rinse, repeat, and you'll see what 
was pulling it all in.
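
A minimal Python sketch of the first step, comparing the two .map files. It assumes only that public symbols appear in the map as whitespace-separated tokens beginning with an underscore; the exact OPTLINK map layout is not relied on, and the function names are hypothetical.

```python
# Sketch: list symbols present in the larger executable's .map but not in the
# smaller one. Assumption: symbols are whitespace-separated tokens that start
# with "_"; anything else on a line (addresses, headers) is ignored.

def symbols(map_text):
    """Collect every underscore-prefixed token found in the map text."""
    return {
        token
        for line in map_text.splitlines()
        for token in line.split()
        if token.startswith("_")
    }

def only_in_larger(small_map_text, large_map_text):
    """Symbols linked into the larger file that the smaller file lacks."""
    return sorted(symbols(large_map_text) - symbols(small_map_text))

if __name__ == "__main__":
    import sys
    # usage (hypothetical file names): python mapdiff.py small.map large.map
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as small, open(sys.argv[2]) as large:
            for name in only_in_larger(small.read(), large.read()):
                print(name)
```

The module owning the reported symbols is the candidate to remove from the library before relinking.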

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 I've been messing with the response file handed to the librarian (via 
 lib  foo); moving modules around here and there, reordering big chunks 
 etc. Have yet to see a notable change in the resulting exe after 
 relinking against each lib version.

 
 
 Then look at the .map file to see what was linked in to the larger file 
 that wasn't in the smaller. Remove that module from the library. Link 
 again, and see what was unresolved. Rinse, repeat, and you'll see what 
 was pulling it all in.

That's exactly what I'm doing, and I agree there seems to be something 
odd going on here. With ~200 modules, it's no slam-dunk to isolate it :)

Feb 21 2007

kris <foo bar.com> writes:

kris wrote:
 Walter Bright wrote:
 
 kris wrote:

 I've been messing with the response file handed to the librarian (via 
 lib  foo); moving modules around here and there, reordering big 
 chunks etc. Have yet to see a notable change in the resulting exe 
 after relinking against each lib version.



 Then look at the .map file to see what was linked in to the larger 
 file that wasn't in the smaller. Remove that module from the library. 
 Link again, and see what was unresolved. Rinse, repeat, and you'll see 
 what was pulling it all in.

 
 
 That's exactly what I'm doing, and I agree there seems to be something 
 odd going on here. With ~200 modules, it's no slam-dunk to isolate it :)


OK: narrowed it down to one obj file. The vast sea of data is coming 
from the 'locale' package, which is stuffed to the gills with I18N content.

There's a half dozen modules in 'locale', none of which are used or 
referenced by any other code in Tango, or by the example at hand. It is 
an entirely isolated package (for obvious reasons).

Yes, there's the /potential/ for symbolic collision between 'locale' and 
some other module/package. Let's consider that in a moment.

In the meantime, I whittled the dependency down to one single module 
listed right at the very end of the lib-response file (the very last one 
to be added to the lib). This module is called Core.

When Core is added to the lib, the linker emits "missing symbol" errors 
since the rest of the locale package is missing. When Core is removed, 
there are no link errors. This indicates some kind of symbolic 
collision; one that is triggered by the very last module added to the lib?

So, sifting through the obj2asm output for Core, I see all the publics 
are correctly prefixed by the package name. Thus, each symbol exposed 
appears to be unique across the entire library. Except for one that 
stands out. It is noted as an 'extern', yet is also listed amongst the 
full set of publics in the obj2asm output; and it looks a bit 
suspicious. Here's a small snippet from the ocean of public symbols in Core:

====================
_D5tango4text6locale4Core18DaylightSavingTime5_ctorMFS5tango4text6locale4Core8DateTimeS5tango4text6locale4Core8DateTimeS5tango4text6locale4Core8TimeSpanZC5tango4text6locale4Core18DaylightSavingTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core18DaylightSavingTime5startMFZS5tango4text6locale4Core8DateTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core18DaylightSavingTime3endMFZS5tango4text6locale4Core8DateTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core18DaylightSavingTime6changeMFZS5tango4text6locale4Core8TimeSpan    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone18getDaylightChangesMFiZC5tango4text6locale4Core18DaylightSavingTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone18getDaylightChangesMFiZC5tango4text6locale4Core18DaylightSavingTime9getSundayMFiiiiiiiiZS5tango4text6locale4Core8DateTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone12getLocalTimeMFS5tango4text6locale4Core8DateTimeZS5tango4text6locale4Core8DateTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone16getUniversalTimeMFS5tango4text6locale4Core8DateTimeZS5tango4text6locale4Core8DateTime    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone12getUtcOffsetMFS5tango4text6locale4Core8DateTimeZS5tango4text6locale4Core8TimeSpan    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone20isDaylightSavingTimeMFS5tango4text6locale4Core8DateTimeZb    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone7currentFZC5tango4text6locale4Core8TimeZone    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8TimeZone5_ctorMFZC5tango4text6locale4Core8TimeZone    COMDAT flags=x0 attr=x0 align=x0
_D47TypeInfo_C5tango4text6locale4Core12NumberFormat6__initZ    COMDAT flags=x0 attr=x10 align=x0
_D49TypeInfo_C5tango4text6locale4Core14DateTimeFormat6__initZ    COMDAT flags=x0 attr=x10 align=x0
_D5tango4text6locale4Core14__T7arrayOfTiZ7arrayOfFAiXAi    COMDAT flags=x0 attr=x10 align=x0
_D5tango4text6locale4Core15__T7arrayOfTAaZ7arrayOfFAAaXAAa    COMDAT flags=x0 attr=x10 align=x0
_D12TypeInfo_AAa6__initZ    COMDAT flags=x0 attr=x10 align=x0
__D5tango4text6locale4Core9__modctorFZv    COMDAT flags=x0 attr=x0 align=x0
__D5tango4text6locale4Core9__moddtorFZv    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core8__assertFiZv    COMDAT flags=x0 attr=x0 align=x0
_D5tango4text6locale4Core7__arrayZ    COMDAT flags=x0 attr=x0 align=x0

====================

You see the odd one out? That cursed _D12TypeInfo_AAa6__initZ again?
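
To make the difference concrete, here is a small Python sketch that takes such a symbol apart. It is not a full demangler: it only splits the length-prefixed name parts after the leading _D, and decodes a handful of type codes (A for a dynamic array, plus a few basic types). The function names are made up for the illustration.

```python
# Partial decoder for D mangled names (illustration only; covers just the
# pieces needed for the symbols discussed here).

BASIC_TYPES = {"a": "char", "i": "int", "b": "bool", "v": "void"}  # subset only

def split_names(symbol):
    """Split the length-prefixed identifiers that follow the leading '_D'."""
    if not symbol.startswith("_D"):
        raise ValueError("not a D symbol: " + symbol)
    pos, names = 2, []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])          # decimal length prefix
        names.append(symbol[end:end + length])
        pos = end + length
    return names

def decode_type(code):
    """Decode type codes built from 'A' (dynamic array of) and basic types."""
    if code.startswith("A"):
        return decode_type(code[1:]) + "[]"
    return BASIC_TYPES[code]  # KeyError for codes outside this small subset

print(split_names("_D12TypeInfo_AAa6__initZ"))
print(decode_type("AAa"))
print(split_names("_D5tango4text6locale4Core7__arrayZ"))
```

The Core symbols all begin with the tango, text, locale, Core parts, while the TypeInfo symbol begins directly with TypeInfo_AAa (an array of arrays of char), so nothing in its name ties it to one module.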

Feb 21 2007

Pragma <ericanderton yahoo.removeme.com> writes:

kris wrote:
 kris wrote:
 Walter Bright wrote:

 kris wrote:

 I've been messing with the response file handed to the librarian 
 (via lib  foo); moving modules around here and there, reordering big 
 chunks etc. Have yet to see a notable change in the resulting exe 
 after relinking against each lib version.



 Then look at the .map file to see what was linked in to the larger 
 file that wasn't in the smaller. Remove that module from the library. 
 Link again, and see what was unresolved. Rinse, repeat, and you'll 
 see what was pulling it all in.


 That's exactly what I'm doing, and I agree there seems to be something 
 odd going on here. With ~200 modules, it's no slam-dunk to isolate it :)

 
 
 OK: narrowed it down to one obj file. The vast sea of data is coming 
 from the 'locale' package, which is stuffed to the gills with I18N content.
 
 There's a half dozen modules in 'locale', none of which are used or 
 referenced by any other code in Tango, or by the example at hand. It is 
 an entirely isolated package (for obvious reason).
 
 Yes, there's the /potential/ for symbolic collision between 'locale' and 
 some other module/package. Let's consider that in a moment.
 
 In the meantime, I whittled the dependency down to one single module 
 listed right at the very end of the lib-response file (the very last one 
 to be added to the lib). This module is called Core.
 
 When Core is added to the lib, the linker emits "missing symbol" errors 
 since the rest of the locale package is missing. When Core is removed, 
 there are no link errors. This indicates some kind of symbolic 
 collision; one that is triggered by the very last module added to the lib?
 
 So, sifting through the obj2asm output for Core, I see all the publics 
 are correctly prefixed by the package name. Thus, each symbol exposed 
 appears to be unique across the entire library. Except for one that 
 stands out. It is noted as an 'extern', yet is also listed amongst the 
 full set of publics in the obj2asm output; and it looks a bit 
 suspicious. Here's a small snippet from the ocean of public symbols in 
 Core:
 
 [snip symbol listing]
 
 You see the odd one out? That cursed _D12TypeInfo_AAa6__initZ again?

Sorry if I'm stating the obvious, but it seems to me that the linker is 
finding this typeinfo COMDAT in Core first, rather than somewhere else, 
and is thereby forcing the inclusion of the rest of its containing module.

Does moving core.obj to the end of the .lib solve the problem?

-- 
- EricAnderton at yahoo

Feb 21 2007

kris <foo bar.com> writes:

Pragma wrote:
 kris wrote:
 
 kris wrote:

 Walter Bright wrote:

 kris wrote:

 I've been messing with the response file handed to the librarian 
 (via lib  foo); moving modules around here and there, reordering 
 big chunks etc. Have yet to see a notable change in the resulting 
 exe after relinking against each lib version.




 Then look at the .map file to see what was linked in to the larger 
 file that wasn't in the smaller. Remove that module from the 
 library. Link again, and see what was unresolved. Rinse, repeat, and 
 you'll see what was pulling it all in.



 That's exactly what I'm doing, and I agree there seems to be 
 something odd going on here. With ~200 modules, it's no slam-dunk to 
 isolate it :)



 OK: narrowed it down to one obj file. The vast sea of data is coming 
 from the 'locale' package, which is stuffed to the gills with I18N 
 content.

 There's a half dozen modules in 'locale', none of which are used or 
 referenced by any other code in Tango, or by the example at hand. It 
 is an entirely isolated package (for obvious reason).

 Yes, there's the /potential/ for symbolic collision between 'locale' 
 and some other module/package. Let's consider that in a moment.

 In the meantime, I whittled the dependency down to one single module 
 listed right at the very end of the lib-response file (the very last 
 one to be added to the lib). This module is called Core.

 When Core is added to the lib, the linker emits "missing symbol" 
 errors since the rest of the locale package is missing. When Core is 
 removed, there are no link errors. This indicates some kind of 
 symbolic collision; one that is triggered by the very last module 
 added to the lib?

 So, sifting through the obj2asm output for Core, I see all the publics 
 are correctly prefixed by the package name. Thus, each symbol exposed 
 appears to be unique across the entire library. Except for one that 
 stands out. It is noted as an 'extern', yet is also listed amongst the 
 full set of publics in the obj2asm output; and it looks a bit 
 suspicious. Here's a small snippet from the ocean of public symbols in 
 Core:

 [snip symbol listing]

 You see the odd one out? That cursed _D12TypeInfo_AAa6__initZ again?

 
 
 Sorry if I'm stating the obvious, but it seems to me that the linker is 
 finding this typeinfo COMDAT in Core first, rather than somewhere else, 
 and is thereby forcing the inclusion of the rest of its containing module.
 
 Does moving core.obj to the end of the .lib solve the problem?
 

Heya Eric

That's what it seems like and (as noted above) core.obj is already the 
very last one added to the lib ;)

The only way to resolve at this point is to remove core.obj entirely.

Feb 21 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

kris wrote:
 Pragma wrote:
 Sorry if I'm stating the obvious, but it seems to me that the linker 
 is finding this typeinfo COMDAT in Core first, rather than somewhere 
 else, and is thereby forcing the inclusion of the rest of its 
 containing module.

 Does moving core.obj to the end of the .lib solve the problem?

 
 Heya Eric
 
 That's what it seems like and (as noted above) core.obj is already the 
 very last one added to the lib ;)
 
 The only way to resolve at this point is to remove core.obj entirely.

Did you try putting it at the front of the lib? You never know, maybe it 
picks the last one instead of the first one.

Unless it just happens to be the only module to define 
_D12TypeInfo_AAa6__initZ ...

Feb 21 2007

Pragma <ericanderton yahoo.removeme.com> writes:

Frits van Bommel wrote:
 kris wrote:
 Pragma wrote:
 Sorry if I'm stating the obvious, but it seems to me that the linker 
 is finding this typeinfo COMDAT in Core first, rather than somewhere 
 else, and is thereby forcing the inclusion of the rest of its 
 containing module.

 Does moving core.obj to the end of the .lib solve the problem?

 Heya Eric

 That's what it seems like and (as noted above) core.obj is already the 
 very last one added to the lib ;)

 The only way to resolve at this point is to remove core.obj entirely.

 
 Did you try putting it at the front of the lib? You never know, maybe it 
 picks the last one instead of the first one.
 
 Unless it just happens to be the only module to define 
 _D12TypeInfo_AAa6__initZ ...

I thought about that too, but that doesn't seem to be the case. As Kris also
stated, the lib compiles when Core is removed.  That implies that the TypeInfo
mentioned lives elsewhere - and it very likely does, as it's a "char[][]".

Just a hunch: does the .lib's dictionary play a role in how OPTLINK finds
COMDAT symbol matches in a .lib file? Maybe there's some
non-.obj-order-dependent behavior going on between the two.
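
For readers following along, the symbol under discussion can be unpacked mechanically. The sketch below is only a minimal illustration, assuming the length-prefixed identifier part of D name mangling and a tiny subset of type codes (A for dynamic array, a for char, i for int, b for bool, v for void). The helper names split_qualified_name and decode_type are made up for this example and are not part of any D tool.

```python
# Subset of D type codes; anything else is deliberately unsupported here.
BASIC_TYPES = {"a": "char", "b": "bool", "i": "int", "v": "void"}


def split_qualified_name(symbol):
    """Split '_D' + length-prefixed identifiers; return (names, leftover)."""
    if not symbol.startswith("_D"):
        raise ValueError("not a D mangled name")
    pos, names = 2, []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        # Read the decimal length that precedes each identifier.
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        names.append(symbol[end:end + length])
        pos = end + length
    return names, symbol[pos:]


def decode_type(code):
    """Decode the small subset of type codes listed in BASIC_TYPES."""
    if code.startswith("A"):
        return decode_type(code[1:]) + "[]"
    return BASIC_TYPES[code]


names, rest = split_qualified_name("_D12TypeInfo_AAa6__initZ")
print(names, rest)
print(decode_type(names[0][len("TypeInfo_"):]))
```

Under those assumptions the name splits into TypeInfo_AAa and __init, and AAa decodes to char[][], i.e. an array of strings.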

-- 
- EricAnderton at yahoo

Feb 21 2007

kris <foo bar.com> writes:

Pragma wrote:
 Frits van Bommel wrote:
 
 kris wrote:

 Pragma wrote:

 Sorry if I'm stating the obvious, but it seems to me that the linker 
 is finding this typeinfo COMDAT in Core first, rather than somewhere 
 else, and is thereby forcing the inclusion of the rest of its 
 containing module.

 Does moving core.obj to the end of the .lib solve the problem?

 Heya Eric

 That's what it seems like and (as noted above) core.obj is already 
 the very last one added to the lib ;)

 The only way to resolve at this point is to remove core.obj entirely.


 Did you try putting it at the front of the lib? You never know, maybe 
 it picks the last one instead of the first one.

 Unless it just happens to be the only module to define 
 _D12TypeInfo_AAa6__initZ ...

 
 
 I thought about that too, but that doesn't seem to be the case. As Kris 
 also stated, the lib compiles when Core is removed.  That implies that 
 the TypeInfo mentioned lives elsewhere - and it very likely does, as 
 it's a "char[][]".
 

Perhaps a question is this: why the heck is that symbol exposed from 
Core, when it should instead be exposed via the TypeInfo class for 
char[][] instead ... linked via the TypeInfo classes in Object.d? A 
large number of those are present in every D executable.

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Perhaps a question is this: why the heck is that symbol exposed from 
 Core, when it should instead be exposed via the TypeInfo class for 
 char[][] instead ... linked via the TypeInfo classes in Object.d? A 
 large number of those are present in every D executable.

The standard TypeInfo's in Phobos cover only basic types and arrays of 
basic types. The rest are generated by the compiler.

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 Perhaps a question is this: why the heck is that symbol exposed from 
 Core, when it should instead be exposed via the TypeInfo class for 
 char[][] instead ... linked via the TypeInfo classes in Object.d? A 
 large number of those are present in every D executable.

 
 
 The standard TypeInfo's in Phobos cover only basic types and arrays of 
 basic types. The rest are generated by the compiler.

Yep, I've seen that to be the case. However, the current strategy 
clearly leads to a somewhat haphazard mechanism for resolving such 
symbols: in this case, the exe is dogged by a suite of code that it 
doesn't want or need ... all for the sake of a typedef init?

Sure, we don't want such things being replicated all over the place, but 
I think it has been shown that the current approach is unrealistic in 
practice.

Isn't there some way to isolate the typeinfo such that only a segment is 
linked, rather than the entire "hosting" module (the one that just 
happened to be found first in the lib) ?

Feb 21 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

kris wrote:
 Isn't there some way to isolate the typeinfo such that only a segment is 
 linked, rather than the entire "hosting" module (the one that just 
 happened to be found first in the lib) ?

The obvious solution would be to always generate typeinfo even if it can 
be determined imported modules will already supply it. The current 
approach seems to confuse the linker, causing it to link in unrelated 
objects that happen to supply the symbol even though the compiler 
"meant" for another object file to supply it.

Yes, that will "bloat" object files, but the current approach apparently 
bloats applications. Care to guess which are distributed most often? ;)

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Frits van Bommel wrote:
 kris wrote:
 Isn't there some way to isolate the typeinfo such that only a segment 
 is linked, rather than the entire "hosting" module (the one that just 
 happened to be found first in the lib) ?


No, the linker deals with .obj files as a unit.

 The obvious solution would be to always generate typeinfo even if it can 
 be determined imported modules will already supply it. The current 
 approach seems to confuse the linker, causing it to link in unrelated 
 objects that happen to supply the symbol even though the compiler 
 "meant" for another object file to supply it.

I wish to be precise - there is no "seems" or "confuse" with linking. It 
simply follows the algorithm I outlined previously - have an unresolved 
symbol, find the first .obj module in the library which resolves it. It 
does this in a loop until there are no further unresolved symbols.
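
The loop described above can be sketched in a few lines. This is a toy model, not OPTLINK's implementation: it ignores the library dictionary and COMDAT discarding, and the object names (app.obj, core.obj, calendar.obj, small.obj) and the symbols core_api and calendar_api are hypothetical.

```python
def link(root_objects, library):
    """Return the names of the objects that end up in the executable.

    root_objects are always linked; library is searched in order.
    Each object is a dict with 'name', 'defines' and 'refs'.
    """
    linked, defined, referenced = [], set(), set()

    def add(obj):
        linked.append(obj["name"])
        defined.update(obj["defines"])
        referenced.update(obj["refs"])

    for obj in root_objects:
        add(obj)

    while True:
        unresolved = sorted(referenced - defined)
        if not unresolved:
            return linked
        symbol = unresolved[0]
        # The first object in library order that defines the symbol wins.
        provider = next((o for o in library if symbol in o["defines"]), None)
        if provider is None:
            raise LookupError("undefined symbol: " + symbol)
        # The whole object comes in, together with its own references.
        add(provider)


TYPEINFO = "_D12TypeInfo_AAa6__initZ"
app = {"name": "app.obj", "defines": {"_Dmain"}, "refs": {TYPEINFO}}
core = {"name": "core.obj", "defines": {TYPEINFO, "core_api"}, "refs": {"calendar_api"}}
calendar = {"name": "calendar.obj", "defines": {"calendar_api"}, "refs": set()}
small = {"name": "small.obj", "defines": {TYPEINFO}, "refs": set()}

print(link([app], [core, calendar, small]))
print(link([app], [small, core, calendar]))
```

With core.obj first in the library, the first call links app.obj, core.obj and calendar.obj. With small.obj first, only app.obj and small.obj are linked, even though the application references the same single symbol in both cases.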

Most of the complexity in a linker stems from:

1) trying to make it fast
2) the over-complicated .obj file format

Conceptually, it is a very simple program.

 Yes, that will "bloat" object files, but the current approach apparently 
 bloats applications. Care to guess which are distributed most often? ;)

TypeInfo's are only going to grow, and this could create gigantic obj files.

Feb 21 2007

Derek Parnell <derek nomail.afraid.org> writes:

Walter,
do we (the developer community) have a problem here? 

If yes, will you be actively trying to find a satisfactory resolution in
the near future?

On Wed, 21 Feb 2007 13:22:09 -0800, Walter Bright wrote:

 Frits van Bommel wrote:
 kris wrote:
 Isn't there some way to isolate the typeinfo such that only a segment 
 is linked, rather than the entire "hosting" module (the one that just 
 happened to be found first in the lib) ?


 
 No, the linker deals with .obj files as a unit.

This has been pointed out a few times now; if any single item in an .OBJ
file is referenced in the program, the whole .OBJ file is linked into the
executable.

This implies that in order to make small executable files, we need to
ensure that .OBJ files are as atomic as possible and to minimize references
to other modules. Yes, these are in conflict with each other so a
compromise must be made somehow.

A better link editor would be able to only link in the portions of the .OBJ
file that are needed, but until someone writes a replacement for OptLink,
we are pretty well stuck with Walter's approach.
 
 The obvious solution would be to always generate typeinfo even if it can 
 be determined imported modules will already supply it. The current 
 approach seems to confuse the linker, causing it to link in unrelated 
 objects that happen to supply the symbol even though the compiler 
 "meant" for another object file to supply it.

 
 I wish to be precise - there is no "seems" or "confuse" with linking. It 
 simply follows the algorithm I outlined previously - have an unresolved 
 symbol, find the first .obj module in the library which resolves it. It 
 does this in a loop until there are no further unresolved symbols.

Walter, I know that you are not going to change OptLink, so this next
question is purely theoretical ... instead of finding the 'first' object
file that resolves it, is there a better algorithm ... maybe the smallest
object file that resolves it, or ... I don't know ... but it might be worth
thinking about.

 
 Most of the complexity in a linker stems from:
 
 1) trying to make it fast

How fast is fast enough?

 2) the over-complicated .obj file format

Can we improve the OBJ file format?

 Conceptually, it is a very simple program.

And that might be a part of the problem.
 
 Yes, that will "bloat" object files, but the current approach apparently 
 bloats applications. Care to guess which are distributed most often? ;)

 
 TypeInfo's are only going to grow, and this could create gigantic obj files.

So, have we got a problem or not?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
22/02/2007 10:01:57 AM

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Derek Parnell wrote:
 If yes, will you be actively trying to find a satisfactory resolution in
 the near future?

I posted some suggestions to Kris.

 This has been pointed out a few times now; if any single item in an .OBJ
 file is referenced in the program, the whole .OBJ file is linked into the
 executable.

That's right. COMDATs make things slightly more complicated, as 
unreferenced COMDATs get discarded by the linker.

 This implies that in order to make small executable files, we need to
 ensure that .OBJ files are as atomic as possible and to minimize references
 to other modules. Yes, these are in conflict with each other so a
 compromise must be made somehow.
 
 A better link editor would be able to only link in the portions of the .OBJ
 file that are needed, but until someone writes a replacement for OptLink,
 we are pretty well stuck with Walter's approach.

It's important to work with existing tools (like linkers and 
librarians), which (among other things) helps ensure that D programs can 
link with the output of other compilers (like gcc).

 Walter, I know that you are not going to change OptLink, so this next
 question is purely theoretical ... instead of finding the 'first' object
 file that resolves it, is there a better algorithm ... maybe the smallest
 object file that resolves it, or ... I don't know ... but it might be worth
 thinking about.

The 'smallest' doesn't do what you ask, either, because even the 
smallest obj file could contain a reference to something big.
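
A small worked example of that point, as a sketch only: the object names and byte sizes are invented, and references are assumed to be already resolved to object names. What matters is the size of everything an object drags in, not the size of the object itself.

```python
def closure_size(name, objects, seen=None):
    """Total bytes linked if `name` is pulled in, following its references."""
    seen = set() if seen is None else seen
    if name in seen:
        return 0  # already counted; also guards against reference cycles
    seen.add(name)
    obj = objects[name]
    return obj["size"] + sum(closure_size(dep, objects, seen) for dep in obj["needs"])


# Hypothetical library: tiny.obj and medium.obj both define the wanted symbol.
objects = {
    "tiny.obj": {"size": 200, "needs": ["huge.obj"]},
    "medium.obj": {"size": 4000, "needs": []},
    "huge.obj": {"size": 500000, "needs": []},
}

for candidate in ("tiny.obj", "medium.obj"):
    print(candidate, closure_size(candidate, objects))
```

Here tiny.obj is the smallest provider at 200 bytes but costs 500200 bytes once huge.obj follows it in, while medium.obj costs 4000 bytes in total.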

 Most of the complexity in a linker stems from:
 1) trying to make it fast

 How fast is fast enough?

It's never fast enough. I know a fellow who made his fortune just 
writing a faster linker than MS-LINK. (You can guess the name of that 
linker!) Borland based their whole company's existence on fast 
compile-link times. Currently, ld is pig slow, it's a big bottleneck on 
the edit-compile-link-debug cycle on Linux.


 2) the over-complicated .obj file format

 Can we improve the OBJ file format?

Only if we want to write a replacement for every tool out there that 
manipulates object files, and if we want to give up linking with the 
output of C compilers (or any other compilers).


 Conceptually, it is a very simple program.

 And that might be a part of the problem.

Might be, but there also shouldn't be any confusion or mystery about 
what it's doing. Understanding how it works makes it possible to build a 
professional quality library. You can't really escape understanding it - 
and there's no reason to, it *is* a simple program.


 Yes, that will "bloat" object files, but the current approach apparently 
 bloats applications. Care to guess which are distributed most often? ;)

 TypeInfo's are only going to grow, and this could create gigantic obj files.

 
 So, have we got a problem or not?

Given limited resources, we have to deal with what we have.

Feb 21 2007

Derek Parnell <derek nomail.afraid.org> writes:

On Wed, 21 Feb 2007 16:12:10 -0800, Walter Bright wrote:

 It's important to work with existing tools (like linkers and 
 librarians), which (among other things) helps ensure that D programs can 
 link with the output of other compilers (like gcc).

Has anyone tried successfully to use a Windows linker, other than OptLink,
to handle D .obj files? 

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
22/02/2007 12:02:31 PM

Feb 21 2007

kris <foo bar.com> writes:

Derek Parnell wrote:
 On Wed, 21 Feb 2007 16:12:10 -0800, Walter Bright wrote:
 
 
 It's important to work with existing tools (like linkers and 
 librarians), which (among other things) helps ensure that D programs can 
 link with the output of other compilers (like gcc).

 
 
 Has anyone tried successfully to use a Windows linker, other than OptLink,
 to handle D .obj files? 
 


I tried the 'modified' Watcom linker the other night, but it found some 
problems with snn.lib. That was a 2003 release, and the tools have been 
updated significantly since then. The recent release may well have 
better luck (the trunk supports OMF). Would be very interested if 
someone were to have a go at it.

Feb 21 2007

John Reimer <terminal.node gmail.com> writes:

On Thu, 22 Feb 2007 12:05:22 +1100, Derek Parnell wrote:

 On Wed, 21 Feb 2007 16:12:10 -0800, Walter Bright wrote:
 
 It's important to work with existing tools (like linkers and 
 librarians), which (among other things) helps ensure that D programs can 
 link with the output of other compilers (like gcc).

 
 Has anyone tried successfully to use a Windows linker, other than OptLink,
 to handle D .obj files? 


What does gcc have to do with windows dmd tools?  And what good is it to
use these tools if they aren't working properly?  That's why this whole
discussion started, right?

I don't get what Walter is saying.

-JJR

Feb 21 2007

"Carlos Smith" <carlos-smith sympatico.ca> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message

: It's important to work with existing tools (like linkers and
: librarians), which (among other things) helps ensure that D programs can
: link with the output of other compilers (like gcc).
:

What does that mean ?

D produces OMF obj files, right?
And OMF is not supported by any of the GNU tools working
on the win32 platform.

gcc is not the right example.

Is it a good idea to suggest that D produce coff obj ?
This will probably give you some (or a lot of) work
to add the necessary support to your tool set,

but, D is a modern compiler, using an obj format that
is not a standard anymore.

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Carlos Smith wrote:
 gcc is not the right example.

It is for the Linux DMD.

Feb 23 2007

John Reimer <terminal.node gmail.com> writes:

On Wed, 21 Feb 2007 16:12:10 -0800, Walter Bright wrote:

 Derek Parnell wrote:
 Most of the complexity in a linker stems from:
 1) trying to make it fast

 How fast is fast enough?

 
 It's never fast enough. I know a fellow who made his fortune just 
 writing a faster linker than MS-LINK. (You can guess the name of that 
 linker!) Borland based their whole company's existence on fast 
 compile-link times. Currently, ld is pig slow, it's a big bottleneck on 
 the edit-compile-link-debug cycle on Linux.
 
 


That's not a good argument. ld is pig slow?  I'm sorry but I don't get
that.  It works; it works as intended; and, strangely, I don't hear people
complain about its apparent lack of speed. 

So what if a linker is blitzingly fast. If it's outdated and broken,
there's not much to get excited about.  I'll choose the slow working one
any day.

-JJR

Feb 22 2007

Anders F Björklund <afb algonet.se> writes:

John Reimer wrote:

 That's not a good argument. ld is pig slow?  I'm sorry but I don't get
 that.  It works; it works as intended; and, strangely, I don't hear people
 complain about its apparent lack of speed. 
 
 So what if a linker is blitzingly fast. If it's outdated and broken,
 there's not much to get excited about.  I'll choose the slow working one
 any day.

I've found OPTLINK to hang and crash a lot when linking the wxD programs
on Windows XP. But every time I try to reproduce it, it goes away... :-(

So now I just run the "make" like three times in a row, and it usually
succeeds in building everything. And yeah, it's rather fast in doing so.

But I prefer the MinGW gdc/ld, since it works the first time but slower?
(well that and that I have problems getting DMC to work with SDL / GL)

--anders

Feb 22 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Anders F Björklund wrote:
 John Reimer wrote:
 
 That's not a good argument. ld is pig slow?  I'm sorry but I don't get
 that.  It works; it works as intended; and, strangely, I don't hear 
 people
 complain about its apparent lack of speed.
 So what if a linker is blitzingly fast. If it's outdated and broken,
 there's not much to get excited about.  I'll choose the slow working one
 any day.

 
 I've found OPTLINK to hang and crash a lot when linking the wxD programs
 on Windows XP. But every time I try to reproduce it, it goes away... :-(
 
 So now I just run the "make" like three times in a row, and it usually
 succeeds in building everything. And yeah, it's rather fast in doing so.
 
 But I prefer the MinGW gdc/ld, since it works the first time but slower?
 (well that and that I have problems getting DMC to work with SDL / GL)

I see hangs occasionally even for small programs.  Even on single files 
compiled with dmd -run.  Every time it happens if I Ctrl-C kill it and 
run the same command again, everything is fine.  Frequency is maybe like 
1 out of every 50 compiles.

--bb

Feb 22 2007

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single files 
 compiled with dmd -run.  Every time it happens if I Ctrl-C kill it and 
 run the same command again, everything is fine.  Frequency is maybe like 
 1 out of every 50 compiles.

I've never seen optlink crash except on known cases where there's a 
gigantic amount of static data. If you've got more conventional cases, 
please post a bug report with a reproducible case.

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C kill 
 it and run the same command again, everything is fine.  Frequency is 
 maybe like 1 out of every 50 compiles.

 
 I've never seen optlink crash except on known cases where there's a 
 gigantic amount of static data. If you've got more conventional cases, 
 please post a bug report with a reproducible case.

I'd forgotten, there is a problem with optlink running on multicore 
machines. There's supposed to be a way to tell Windows to run an exe 
using only one core, but I can't think of it at the moment.
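
One possible way to do this, offered as an untested sketch rather than a confirmed fix for the OPTLINK hangs: the Win32 call SetProcessAffinityMask restricts a process to a set of cores, and child processes inherit that restriction. The wrapper below pins itself and then launches the build command. The function names affinity_mask and run_pinned, and the example command line, are assumptions made for illustration.

```python
import subprocess
import sys


def affinity_mask(cores):
    """Build the bitmask Windows expects: bit n set means core n is allowed."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def run_pinned(command, cores=(0,)):
    """Restrict this process to `cores`, then run `command` as a child.

    The child (and anything it spawns, such as the linker) inherits
    the affinity of this process.
    """
    if sys.platform != "win32":
        raise OSError("SetProcessAffinityMask is a Windows API")
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
    ok = kernel32.SetProcessAffinityMask(
        kernel32.GetCurrentProcess(), affinity_mask(cores)
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return subprocess.call(command)


# Hypothetical usage on Windows: run_pinned(["dmd", "hello.d"])
```

Whether confining the linker to one core actually avoids the hang is not established anywhere in this thread.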

Feb 23 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Walter Bright wrote:
 Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C kill 
 it and run the same command again, everything is fine.  Frequency is 
 maybe like 1 out of every 50 compiles.

 I've never seen optlink crash except on known cases where there's a 
 gigantic amount of static data. If you've got more conventional cases, 
 please post a bug report with a reproducible case.

 
 I'd forgotten, there is a problem with optlink running on multicore 
 machines. There's supposed to be a way to tell Windows to run an exe 
 using only one core, but I can't think of it at the moment.

Is it a threaded application? (That would explain why the error is sporadic)
If so, the obvious workaround would be to tell it to only use one thread.
If not, why does multicore matter?

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

Frits van Bommel wrote:
 Walter Bright wrote:
 Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C 
 kill it and run the same command again, everything is fine.  
 Frequency is maybe like 1 out of every 50 compiles.


 I'd forgotten, there is a problem with optlink running on multicore 
 machines. There's supposed to be a way to tell Windows to run an exe 
 using only one core, but I can't think of it at the moment.

 
 Is it a threaded application?

Yes.

(That would explain why the error is sporadic)

Yes.

 If so, the obvious workaround would be to tell it to only use one thread.
 If not, why does multicore matter?

The way mutexes are done for multi core is different than for single 
core. I think optlink multithreads correctly only for single core machines.

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Frits van Bommel wrote:
 
 If so, the obvious workaround would be to tell it to only use one thread.
 If not, why does multicore matter?

 
 The way mutexes are done for multi core is different than for single 
 core. I think optlink multithreads correctly only for single core machines.

Assuming you're talking about cores that don't share a cache (ie. "real" 
SMP) then it's possible optlink is just doing its own synchronization by 
manipulating variables.  Fixing it might be as easy as sticking a lock 
prefix in the right locations (long shot, I know).


Sean

Feb 23 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C kill 
 it and run the same command again, everything is fine.  Frequency is 
 maybe like 1 out of every 50 compiles.

 I've never seen optlink crash except on known cases where there's a 
 gigantic amount of static data. If you've got more conventional cases, 
 please post a bug report with a reproducible case.

 
 I'd forgotten, there is a problem with optlink running on multicore 
 machines. There's supposed to be a way to tell Windows to run an exe 
 using only one core, but I can't think of it at the moment.

Hmm, one of my machines is core duo, so that could maybe be it.

--bb

Feb 23 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C kill 
 it and run the same command again, everything is fine.  Frequency is 
 maybe like 1 out of every 50 compiles.

 
 I've never seen optlink crash except on known cases where there's a 
 gigantic amount of static data. If you've got more conventional cases, 
 please post a bug report with a reproducible case.

Hmm, well it's not crashing just hanging.  And it may not be optlink but 
dmd.  I do all my work on an external usb/firewire drive so that could 
be an issue (like optlink waiting for a file lock to be released?). 
Also it doesn't happen very often.  Maybe 1 out of 50 is too high an 
estimate.  More like once every few days (and I recompile things a lot).

It always works if I Ctrl-C and try again, so a repro will be difficult.

Other possibilities -- I also started using a cmd wrapper called 
"Console" about the same time as getting into D, so maybe that could 
also be related I suppose. [http://sourceforge.net/projects/console/]. 
It is a bit flaky at times.

In short it's really not a big issue.  Just something I see 
occasionally.  I posted it mostly to see if anyone else was silently 
seeing the same sorts of things.  If no-one else has seen this then it's 
most likely due to something in my particular setup.

--bb

Feb 23 2007

David Gileadi <foo bar.com> writes:

Bill Baxter wrote:
 Walter Bright wrote:
 Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single 
 files compiled with dmd -run.  Every time it happens if I Ctrl-C kill 
 it and run the same command again, everything is fine.  Frequency is 
 maybe like 1 out of every 50 compiles.

 I've never seen optlink crash except on known cases where there's a 
 gigantic amount of static data. If you've got more conventional cases, 
 please post a bug report with a reproducible case.

 
 Hmm, well it's not crashing just hanging.  And it may not be optlink but 
 dmd.  I do all my work on an external usb/firewire drive so that could 
 be an issue (like optlink waiting for a file lock to be released?). Also 
 it doesn't happen very often.  Maybe 1 out of 50 is too high an 
 estimate.  More like once every few days (and I recompile things a lot).
 
 It always works if I Ctrl-C and try again, so a repro will be difficult.
 
 Other possibilities -- I also started using a cmd wrapper called 
 "Console" about the same time as getting into D, so maybe that could 
 also be related I suppose. [http://sourceforge.net/projects/console/]. 
 It is a bit flaky at times.
 
 In short it's really not a big issue.  Just something I see 
 occasionally.  I posted it mostly to see if anyone else was silently 
 seeing the same sorts of things.  If no-one else has seen this then it's 
 most likely due to something in my particular setup.
 
 --bb

I've seen it before compiling the wxD samples.  I'm running a Pentium 4 
hyperthreaded.  As you say, it's a hang, it's occasional, but I wouldn't 
say 1 in 50--it bites me just about every time I compile all the wxD 
samples, so more like 1 in 15 for me.

-Dave

Feb 26 2007

Lionello Lunesu <lio lunesu.remove.com> writes:

Bill Baxter wrote:
 I see hangs occasionally even for small programs.  Even on single files 
 compiled with dmd -run.  Every time it happens if I Ctrl-C kill it and 
 run the same command again, everything is fine.  Frequency is maybe like 
 1 out of every 50 compiles.

I got that too! Same numbers, 1:50. AMD dual core.

L.

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

John Reimer wrote:
 On Wed, 21 Feb 2007 16:12:10 -0800, Walter Bright wrote:
 
 Derek Parnell wrote:
 Most of the complexity in a linker stems from:
 1) trying to make it fast

 How fast is fast enough?

 It's never fast enough. I know a fellow who made his fortune just 
 writing a faster linker than MS-LINK. (You can guess the name of that 
 linker!) Borland based their whole company's existence on fast 
 compile-link times. Currently, ld is pig slow, it's a big bottleneck on 
 the edit-compile-link-debug cycle on Linux.

 
 That's not a good argument. ld is pig slow?  I'm sorry but I don't get
 that.  It works; it works as intended; and, strangely, I don't hear people
 complain about its apparent lack of speed. 
 
 So what if a linker is blitzingly fast. If it's outdated and broken,
 there's not much to get excited about.  I'll choose the slow working one
 any day.

Ideally, perhaps a linker could provide both options: link fast and 
potentially bloat the exe or link carefully (and slowly) for a lean exe. 
  I'd use the fast link for debugging and the slow link for releases. 
Assuming, of course, that the linker were reliable enough that there was 
no risk of changing app behavior between the two.


Sean

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and 
 potentially bloat the exe or link carefully (and slowly) for a lean exe. 
  I'd use the fast link for debugging and the slow link for releases. 
 Assuming, of course, that the linker were reliable enough that there was 
 no risk of changing app behavior between the two.

That might not be the case here: if a module's object file is pulled in, 
that module's static constructors and destructors are called at runtime, 
right? So if different modules are pulled in with those options, 
different static constructors/destructors get called.
(Same goes for unit tests, if enabled, by the way)

Feb 22 2007

Sean Kelly <sean f4.ca> writes:

Frits van Bommel wrote:
 Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and 
 potentially bloat the exe or link carefully (and slowly) for a lean 
 exe.  I'd use the fast link for debugging and the slow link for 
 releases. Assuming, of course, that the linker were reliable enough 
 that there was no risk of changing app behavior between the two.

 
 That might not be the case here: if a module's object file is pulled in, 
 that module's static constructors and destructors are called at runtime, 
 right? So if different modules are pulled in with those options, 
 different static constructors/destructors get called.
 (Same goes for unit tests, if enabled, by the way)

Yuck.  Good point.


Sean

Feb 22 2007

"Kristian Kilpi" <kjkilpi gmail.com> writes:

On Thu, 22 Feb 2007 18:32:18 +0200, Frits van Bommel  
<fvbommel REMwOVExCAPSs.nl> wrote:

 Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and  
 potentially bloat the exe or link carefully (and slowly) for a lean  
 exe.  I'd use the fast link for debugging and the slow link for  
 releases. Assuming, of course, that the linker were reliable enough  
 that there was no risk of changing app behavior between the two.

 That might not be the case here: if a module's object file is pulled in,  
 that module's static constructors and destructors are called at runtime,  
 right? So if different modules are pulled in with those options,  
 different static constructors/destructors get called.
 (Same goes for unit tests, if enabled, by the way)

Hmm, yes, but how is that different from today's situation? Currently  
the linker chooses *arbitrary* object modules that happen to contain the  
needed typeinfo.

Feb 22 2007

Sean Kelly <sean f4.ca> writes:

Kristian Kilpi wrote:
 On Thu, 22 Feb 2007 18:32:18 +0200, Frits van Bommel 
 <fvbommel REMwOVExCAPSs.nl> wrote:
 
 Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and 
 potentially bloat the exe or link carefully (and slowly) for a lean 
 exe.  I'd use the fast link for debugging and the slow link for 
 releases. Assuming, of course, that the linker were reliable enough 
 that there was no risk of changing app behavior between the two.

 That might not be the case here: if a module's object file is pulled 
 in, that module's static constructors and destructors are called at 
 runtime, right? So if different modules are pulled in with those 
 options, different static constructors/destructors get called.
 (Same goes for unit tests, if enabled, by the way)

 
 Hmm, yes, but how that's different from the today's situation? Currently 
 the linker chooses *arbitrary* object modules that happen to contain the 
 needed typeinfo.

Because as long as the list of dependencies remains unchanged, the same 
arbitrary choices should be made.


Sean

Feb 22 2007

"Kristian Kilpi" <kjkilpi gmail.com> writes:

On Thu, 22 Feb 2007 22:08:46 +0200, Sean Kelly <sean f4.ca> wrote:

 Kristian Kilpi wrote:
 On Thu, 22 Feb 2007 18:32:18 +0200, Frits van Bommel  
 <fvbommel REMwOVExCAPSs.nl> wrote:

 Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and  
 potentially bloat the exe or link carefully (and slowly) for a lean  
 exe.  I'd use the fast link for debugging and the slow link for  
 releases. Assuming, of course, that the linker were reliable enough  
 that there was no risk of changing app behavior between the two.

 That might not be the case here: if a module's object file is pulled  
 in, that module's static constructors and destructors are called at  
 runtime, right? So if different modules are pulled in with those  
 options, different static constructors/destructors get called.
 (Same goes for unit tests, if enabled, by the way)

  Hmm, yes, but how that's different from the today's situation?  
 Currently the linker chooses *arbitrary* object modules that happen to  
 contain the needed typeinfo.

 Because as long as the list of dependencies remains unchanged, the same  
 arbitrary choices should be made.


 Sean

Well yes, except there are no guarantees of that, in the specs I mean.
Another linker may (and likely will) produce a different result. And the
same can happen when a library is rebuilt. The order of object modules
affects how the linker will choose modules to be linked in.
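
A minimal sketch of that order dependence, in Python. It assumes a simplified linker that satisfies each undefined symbol from the first library member defining it and then pulls in that whole member; real linkers (OPTLINK, ld) differ in detail. All member and symbol names are placeholders.

```python
from collections import deque

def link(undefined, library):
    """library: ordered list of (name, defines, needs). Returns pulled member names."""
    pulled, defined = [], set()
    pending = deque(undefined)
    while pending:
        sym = pending.popleft()
        if sym in defined:
            continue
        for name, defines, needs in library:
            if sym in defines:
                # The whole member comes in, along with whatever it needs.
                pulled.append(name)
                defined |= defines
                pending.extend(sorted(needs - defined))
                break
        else:
            raise LookupError(f"Symbol Undefined: {sym}")
    return pulled

small = ("typeinfo_only", {"TypeInfo_X"}, set())
huge = ("huge_module", {"TypeInfo_X", "huge_api"}, {"more_stuff"})
more = ("more", {"more_stuff"}, set())

print(link(["TypeInfo_X"], [small, huge, more]))
print(link(["TypeInfo_X"], [huge, more, small]))
```

With the same three members and the same request, only the member order changes, yet the second link drags in `huge_module` and its dependency.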

Feb 22 2007

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and 
 potentially bloat the exe or link carefully (and slowly) for a lean exe. 

What is "link carefully"?

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Ideally, perhaps a linker could provide both options: link fast and 
 potentially bloat the exe or link carefully (and slowly) for a lean exe. 

 
 What is "link carefully"?

Link at a segment level instead of a module level.
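
To make the distinction concrete, here is a rough Python sketch comparing the two granularities for a single object file. The segment names, sizes, and references are made up; the sketch assumes a segment-level link keeps a requested segment plus whatever it transitively references, while a module-level link keeps the whole object as soon as any symbol in it is needed.

```python
# Hypothetical contents of one object file: segment -> size and references.
SEGMENTS = {
    "TypeInfo_X": {"size": 64, "refs": []},
    "huge_api":   {"size": 40_000, "refs": ["tables"]},
    "tables":     {"size": 250_000, "refs": []},
}

def module_level_size(segments, wanted):
    """Whole object is linked if any wanted symbol lives in it."""
    if any(sym in segments for sym in wanted):
        return sum(seg["size"] for seg in segments.values())
    return 0

def segment_level_size(segments, wanted):
    """Only wanted segments and the segments they reference are linked."""
    keep, stack = set(), list(wanted)
    while stack:
        sym = stack.pop()
        if sym in keep or sym not in segments:
            continue
        keep.add(sym)
        stack.extend(segments[sym]["refs"])
    return sum(segments[sym]["size"] for sym in keep)

print(module_level_size(SEGMENTS, ["TypeInfo_X"]))
print(segment_level_size(SEGMENTS, ["TypeInfo_X"]))
```

The extra bookkeeping in `segment_level_size` is the "slow" part of the trade-off: the linker has to track references per segment rather than per module.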

Feb 23 2007

jcc7 <technocrat7 gmail.com> writes:

== Quote from Frits van Bommel (fvbommel REMwOVExCAPSs.nl)'s article
kris wrote:
Isn't there some way to isolate the typeinfo such that only a
segment is linked, rather than the entire "hosting" module (the
one that just happened to be found first in the lib) ?

The obvious solution would be to always generate typeinfo even if it
can be determined imported modules will already supply it. The
current approach seems to confuse the linker, causing it to link in
unrelated objects that happen to supply the symbol even though the
compiler "meant" for another object file to supply it.

Yes, that will "bloat" object files, but the current approach
apparently bloats applications. Care to guess which are distributed
most often? ;)

I think your idea could work. It makes sense to me, but I'd like to go one
better:
Let's have DMD postpone creating TypeInfo until an .exe or .dll is being created
and only include them with the .obj for the "main" module (i.e. the module with
the main or DllMain function).

Surely, the compiler can figure out which TypeInfo's it needs at the point of
compiling an .exe or .dll. If not, even if we have to wait for the linker to
spit out a list of missing TypeInfo's and then generate the TypeInfo
(trial-and-error), I think that would be a small price to pay for eliminating
all of this bloat of unneeded modules that Kris has discovered.

There seems to be a lot more concern around here about .exe-size than there is
about the speed of compiling and linking. Let's fix what's broken -- even if we
have to give up a little compile/link speed.

My idea seems too obvious to be the solution. Does anyone else think this would
work?

By the way, Kris has a thorough description of the problem here:
http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=49257

jcc7

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

jcc7 wrote:
== Quote from Frits van Bommel (fvbommel REMwOVExCAPSs.nl)'s article
kris wrote:
Isn't there some way to isolate the typeinfo such that only a
segment is linked, rather than the entire "hosting" module (the
one that just happened to be found first in the lib) ?

Yes, that will "bloat" object files, but the current approach
apparently bloats applications. Care to guess which are distributed
most often? ;)

I think your idea could work. It makes sense to me, but I'd like to go one
better:
Let's have DMD postpone creating TypeInfo until an .exe or .dll is being
created and only include them with the .obj for the "main" module (i.e. the
module with the main or DllMain function).

Not all libraries may have a DllMain, IIRC it's completely optional.
On Windows it's required for D DLLs if you want to use the GC from
within the DLL, or have static constructors/destructors in the DLL --
but otherwise you may get by without. I think if you write C-style D you
may well get away without it.

Surely, the compiler can figure out which TypeInfo's it needs at the point of
compiling an .exe or .dll.

Not necessarily. Any modules that are linked in but not called by other
modules (e.g. code only reachable from static constructors and/or static
destructors) may not be seen when main/DllMain is compiled, if there
even is one of these (see above point about DllMain being optional).

If not, even if we have to wait for linker to spit out
a list of missing TypeInfo's and then generate the TypeInfo (trial-and-error),
I think that would be a small price to pay for eliminating all of this bloat of
unneeded module that Kris has discovered.

This would mean you can't "manually" link stuff together, using
optlink/ld/whatever directly. I don't know how many people want to do
this, but Walter has made it pretty clear he wants to be able to use a
generic linker[1] (i.e. one that doesn't require specialized knowledge
of D) and I agree with that.
Consider this: if every (or even more than one) language required a
special way of linking, that would mean you couldn't link together code
written in those languages without writing a linker (or perhaps wrapper)
that supports both...
Though arguably the situation with DMD/Windows is already worse when it
comes to that, since almost nobody else uses OMF anymore...

I agree.

My idea seems too obvious to be the solution. Does anyone else think this
would work?

For above-mentioned reasons, I don't think it will work for all
(corner)cases.

By the way, Kris has a thorough description of the problem here:
http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=49257

I've seen it.

Feb 22 2007

jcc7 <technocrat7 gmail.com> writes:

Frits van Bommel (fvbommel REMwOVExCAPSs.nl) wrote:
 jcc7 wrote:
 Frits van Bommel (fvbommel REMwOVExCAPSs.nl) wrote:
 kris wrote:
 Isn't there some way to isolate the typeinfo such that only a
 segment is linked, rather than the entire "hosting" module (the
 one that just happened to be found first in the lib) ?

 The obvious solution would be to always generate typeinfo even if
 it can be determined imported modules will already supply it. The
 current approach seems to confuse the linker, causing it to link
 in unrelated objects that happen to supply the symbol even though
 the compiler "meant" for another object file to supply it.

 Yes, that will "bloat" object files, but the current approach
 apparently bloats applications. Care to guess which are
 distributed most often? ;)

 I think your idea could work. It makes sense to me, but I'd like
 to go one better:


(By the way, this topic is mostly over my head, so I'll probably have to quit
offering ideas pretty soon lest I embarrass myself more than I already may
have.)

 Let's have DMD postpone creating TypeInfo until an .exe or .dll is
 being created and only include them with the .obj for the "main"
 module (i.e. the module with the main or DllMain function).

 Not all libraries may have a DllMain, IIRC it's completely optional.
 On Windows it's required for D DLLs if you want to use the GC from
 within the DLL, or have static constructors/destructors in the DLL
 but otherwise you may get by without. I think if you write C-style D
 you may well get away without it.

Well, I don't want to prevent anyone from playing by their own rules, so my
proposed TypeInfo-postponing compiler could have a switch to add the TypeInfo as
it's compiling any arbitrary code into an .obj file. But in usual circumstances,
I'd think that the TypeInfo would only be needed when producing an .exe or .dll.

 Surely, the compiler can figure out which TypeInfo's it needs at
 the point of compiling an .exe or .dll.

 Not necessarily. Any modules that are linked in but not called by
 other modules (e.g. code only reachable from static constructors
 and/or static destructors) may not be seen when main/DllMain is
 compiled, if there even is one of these (see above point about
 DllMain being optional).

I don't see how static constructors and/or destructors interfere with the
compiler detecting which TypeInfo's would be necessary, but I don't think such a
problem would be insurmountable. Perhaps, it'd be a question of "Is it worth the
effort?".

But then again, I don't know much about what the compiler and linker do "under
the hood". It's mostly a black box for me. But from reading Walter and Kris
discuss the issues involved, I'm convinced there has to be a less haphazard way
for DMD and optlink to interact.


 If not, even if we have to wait for linker to spit out
 a list of missing TypeInfo's and then generate the TypeInfo
 (trial-and-error), I think that would be a small price to pay for
 eliminating all of this bloat of unneeded module that Kris has
 discovered.

 This would mean you can't "manually" link stuff together, using
 optlink/ld/whatever directly. I don't know how many people want to
 do this, but Walter has made it pretty clear he wants to be able to
 use a generic linker[1] (i.e. one that doesn't require specialized
 knowledge of D) and I agree with that.

Isn't there still a question of whether anyone has found a "generic linker" for
OMF (other than OptLink) that can work with DMD anyway?


 Consider this: if every (or even more than one) language required a
 special way of linking, that would mean you couldn't link together
 code written in those languages without writing a linker (or perhaps
 wrapper) that supports both...

Yeah, that doesn't sound like fun.


 Though arguably the situation with DMD/Windows is already worse when
 it comes to that, since almost nobody else uses OMF anymore...

Right. We seem to be on our own when it comes to using OMF.

I think we're mostly trying to find a fix for the problem with the OMF files
generated by DMD right now. Apparently, GDC doesn't have these same problems (or
if GDC does have linker problems, Walter isn't the one responsible for fixing
them). So I think the problem is limited to using DMD's OMF files on Windows.
(Doesn't DMD on Linux use ELF? I think that's the case.)


 For above-mentioned reasons, I don't think it will work for all
 (corner)cases.

You might be right, but I haven't given up hope yet.


jcc7

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

jcc7 wrote:
 Frits van Bommel (fvbommel REMwOVExCAPSs.nl) wrote:
 jcc7 wrote:
 Surely, the compiler can figure out which TypeInfo's it needs at
 the point of compiling an .exe or .dll.

 Not necessarily. Any modules that are linked in but not called by
 other modules (e.g. code only reachable from static constructors
 and/or static destructors) may not be seen when main/DllMain is
 compiled, if there even is one of these (see above point about
 DllMain being optional).

 
 I don't see how static constructors and/or destructors interferes with the
 compiler detecting which TypeInfo's would be necessary, but I don't think such
 a problem would be insurmountable.

How static constructors could interfere:
---
module selfcontained;

static this() {
     // some code that requires TypeInfo not used in other modules
     // (including Phobos), perhaps for a type defined in this module.
}
---
(Change 'static this' to 'static ~this' or 'unittest' for similar problems)

If this module isn't imported (directly or indirectly) from the file 
defining main() the compiler can't possibly know what TypeInfo needs to 
be generated for it when compiling main(), simply because it doesn't 
parse that file when pointed at the file containing main().

Yes, this could be "fixed" by having the module containing main() import 
all such modules, but it shouldn't have to. We shouldn't need to work 
around toolchain limitations, especially if there's a way to make it 
Just Work(TM).
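
The reachability argument above can be sketched in Python. This is only a model: the import graph and the TypeInfo names are hypothetical, and it assumes the compiler, when pointed at the file containing main(), reads that file plus its transitive imports and nothing else.

```python
# Hypothetical import graph: 'selfcontained' is linked in but never imported.
IMPORTS = {
    "main": ["util"],
    "util": [],
    "selfcontained": [],
}
# Hypothetical TypeInfo each module's code requires.
USES_TYPEINFO = {
    "main": {"TypeInfo_A"},
    "util": {"TypeInfo_B"},
    "selfcontained": {"TypeInfo_Local"},
}

def parsed_when_compiling(root):
    """Modules the compiler reads when pointed at `root` (transitive imports)."""
    seen, stack = set(), [root]
    while stack:
        mod = stack.pop()
        if mod not in seen:
            seen.add(mod)
            stack.extend(IMPORTS[mod])
    return seen

def typeinfo_missed(root, linked):
    """TypeInfo needed by the linked modules that compiling `root` never sees."""
    seen = parsed_when_compiling(root)
    needed = set().union(*(USES_TYPEINFO[m] for m in linked))
    known = set().union(*(USES_TYPEINFO[m] for m in seen))
    return sorted(needed - known)

print(typeinfo_missed("main", ["main", "util", "selfcontained"]))
```

The TypeInfo used only inside the unimported module's static constructor is the one that goes unnoticed.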

 Perhaps, it'd be a question of "Is it worth the
 effort?".

It'll be worth the effort when one of _your_ projects fails to compile 
because of it :P.

 But then again, I don't know much about what the compiler and linker do "under
 the hood". It's mostly a black box for me. But from reading Walter and Kris
 discuss the issues involved, I'm convinced there has to be a less haphazard way
 for DMD and optlink to interact.

Like I've mentioned earlier: I'm pretty sure this problem would go away 
entirely if the compiler simply generated all TypeInfo used in the 
module. If that generates larger intermediate object files I'm okay with 
that. In fact, that was how I thought it worked until I started reading 
about this problem...
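
A small Python sketch of why that might help, under the assumption that a linker only searches libraries for symbols that are still undefined after the explicitly listed object files are read. The helper and the symbol names are hypothetical, and whether a given linker really behaves this way depends on its implementation.

```python
def left_for_library_search(uses, emits):
    """Symbols an object still needs from elsewhere after its own definitions."""
    return sorted(set(uses) - set(emits))

uses = ["TypeInfo_X", "some_library_function"]

# Current scheme (as described in this thread): TypeInfo is left out when an
# imported module is expected to supply it.
current = left_for_library_search(uses, emits=[])

# Proposed scheme: the object always carries the TypeInfo it uses.
always = left_for_library_search(uses, emits=["TypeInfo_X"])

print(current)
print(always)
```

Under the proposal the TypeInfo symbol never reaches the library search, so it cannot drag in an unrelated module; the cost is the duplicate definitions in every object file, which the linker would have to merge.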

 If not, even if we have to wait for linker to spit out
 a list of missing TypeInfo's and then generate the TypeInfo
 (trial-and-error), I think that would be a small price to pay for
 eliminating all of this bloat of unneeded module that Kris has
 discovered.

 This would mean you can't "manually" link stuff together, using
 optlink/ld/whatever directly. I don't know how many people want to
 do this, but Walter has made it pretty clear he wants to be able to
 use a generic linker[1] (i.e. one that doesn't require specialized
 knowledge of D) and I agree with that.

 
 Isn't there still a question of whether anyone has found a "generic linker" for
 OMF (other than OptLink) that can work with DMD anyway?

I believe I mentioned that a bit later ;).

[snip special linker discussion]
 Though arguably the situation with DMD/Windows is already worse when
 it comes to that, since almost nobody else uses OMF anymore...

 
 Right. We seem to be on our own when it comes to using OMF.

Well, it seems OpenWatcom supports it. From what I've read here the 
linker doesn't like DMD object files though. Walter claims it's buggy. I 
don't know enough about OMF to say one way or the other.

 I think we're mostly trying to find a fix for the problem with the OMF files
 generated by DMD right now. Apparently, GDC doesn't have these same problems
 (or if GDC does have linker problems, Walter isn't the one responsible for
 fixing them). So I think the problem is limited to using DMD's OMF files on
 Windows. (Doesn't DMD on Linux use ELF? I think that's the case.)

Yes, DMD/Linux uses ELF. It just calls ld (through gcc) to link instead 
of using optlink.

I'm not sure if ld (or the mingw port of it) can use ELF to create 
Windows executables, but if it can that may be an option: just switch to 
ELF entirely and trash optlink. (this paragraph wasn't entirely serious, 
in case you hadn't noticed :P)

Feb 22 2007

Justin C Calvarese <technocrat7 gmail.com> writes:

Frits van Bommel wrote:
 jcc7 wrote:
 Frits van Bommel (fvbommel REMwOVExCAPSs.nl) wrote:
 jcc7 wrote:
 Surely, the compiler can figure out which TypeInfo's it needs at
 the point of compiling an .exe or .dll.

 Not necessarily. Any modules that are linked in but not called by
 other modules (e.g. code only reachable from static constructors
 and/or static destructors) may not be seen when main/DllMain is
 compiled, if there even is one of these (see above point about
 DllMain being optional).

 I don't see how static constructors and/or destructors interferes with
 the compiler detecting which TypeInfo's would be necessary, but I don't
 think such a problem would be insurmountable.

 
 How static constructors could interfere:
 ---
 module selfcontained;
 
 static this() {
     // some code that requires TypeInfo not used in other modules
     // (including Phobos), perhaps for a type defined in this module.
 }
 ---
 (Change 'static this' to 'static ~this' or 'unittest' for similar problems)
 
 If this module isn't imported (directly or indirectly) from the file 
 defining main() the compiler can't possibly know what TypeInfo needs to 
 be generated for it when compiling main(), simply because it doesn't 
 parse that file when pointed at the file containing main().

Oh, I thought the .obj file included mentions of things that are needed, 
but not contained in a particular .obj. I thought that's why "Error 42: 
Symbol Undefined" will appear if I don't give the compiler enough source 
files.
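
For what it's worth, object files do record the external symbols they reference; it is the linker that reads those records and reports anything left unresolved. A rough Python model follows (the object contents are invented, and the message format only imitates the error text quoted above):

```python
def undefined_symbols(objects):
    """Symbols referenced by some object but defined by none of them."""
    defined = set().union(*(o["defines"] for o in objects))
    needed = set().union(*(o["needs"] for o in objects))
    return sorted(needed - defined)

# Hypothetical object files.
main_obj = {"name": "main.obj", "defines": {"main"},
            "needs": {"helper", "TypeInfo_X"}}
helper_obj = {"name": "helper.obj", "defines": {"helper", "TypeInfo_X"},
              "needs": set()}

# Linking main.obj alone leaves references dangling.
for sym in undefined_symbols([main_obj]):
    print("Error 42: Symbol Undefined", sym)

# Adding helper.obj resolves everything.
print(undefined_symbols([main_obj, helper_obj]))
```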

If that's not right, that would be a serious flaw in my proposal.


 Yes, this could be "fixed" by having the module containing main() import 
 all such modules, but it shouldn't have to. We shouldn't need to work 
 around toolchain limitations, especially if there's a way to make it 
 Just Work(TM).
 
 Perhaps, it'd be a question of "Is it worth the
 effort?".

 
 It'll be worth the effort when one of _your_ projects fail to compile 
 because of it :P.

Well, of course, my plan is contingent upon my projects successfully 
compiling. ;)

[snip my older thoughts]

 Like I've mentioned earlier: I'm pretty sure this problem would go away 
 entirely if the compiler simply generated all TypeInfo used in the 
 module. If that generates larger intermediate object files I'm okay with 
 that. In fact, that was how I thought it worked until I started reading 
 about this problem...

If that'd solve the problem, that'd be an improvement from the status quo.

But I had the understanding that there is a problem with the linker 
picking the TypeInfo from an arbitrary .obj (such as a large module that 
isn't needed for a particular program)? I'm afraid the linker might 
continue to choose an inappropriate TypeInfo. Or do you plan for all of 
the TypeInfo's to be unique, thus probably still bloating the .exe (but 
in a different way)?


 [snip special linker discussion]
 Though arguably the situation with DMD/Windows is already worse when
 it comes to that, since almost nobody else uses OMF anymore...

 Right. We seem to be on our own when it comes to using OMF.

 
 Well, it seems OpenWatcom supports it. From what I've read here the 
 linker doesn't like DMD object files though. Walter claims it's buggy. I 
 don't know enough about OMF to say one way or the other.

Well, it doesn't really matter to me if DMD continues to use OMF if the 
format doesn't cause a bunch of bloat or other broken features. But I 
still wonder why Walter needs to stay so close to the "official" format if 
DMC/DMD's OMF doesn't seem to be compatible with any other compiler.

[snip my older thoughts]

 Yes, DMD/Linux uses ELF. It just calls ld (through gcc) to link instead 
 of using optlink.
 
 I'm not sure if ld (or the mingw port of it) can use ELF to create 
 Windows executables, but if it can that may be an option: just switch to 
 ELF entirely and trash optlink. (this paragraph wasn't entirely serious, 
 in case you hadn't noticed :P)

I suspect the option of ELF output would be welcomed by OMF's harshest 
critics. Not that I know anything about ELF.

-- 
jcc7

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Justin C Calvarese wrote:
[snip]
 Oh, I thought the .obj file included mentions of things that are needed, 
 but not contained in a particular .obj. I thought that's why "Error 42: 
 Symbol Undefined" will appear if I don't give the compiler enough source 
 files.
 
 If that's not right, that would be a serious flaw in my proposal.

Oh, you want the compiler to parse the .obj files to generate some extra 
stuff before it tries to link?
So basically that would be Walter's "Put all the TypeInfo instantiations 
into one special module and import it everywhere you need TypeInfo", but 
automated by the compiler when it gets to the point of linking an 
executable?
I'm not sure if that's workable either since I'm pretty sure it'll need 
to parse the source files to generate the TypeInfo, at the very least 
for user-defined types. At that point it may not even have access to 
those source files, especially when someone is using a closed-source 
library (via .di header files). This is also a problem with what I 
thought you meant, by the way.

[snip]
 Perhaps, it'd be a question of "Is it worth the
 effort?".

 It'll be worth the effort when one of _your_ projects fail to compile 
 because of it :P.

 
 Well, of course, my plan is contingent upon my projects successfully 
 compiling. ;)

Thought so :).

 Like I've mentioned earlier: I'm pretty sure this problem would go 
 away entirely if the compiler simply generated all TypeInfo used in 
 the module. If that generates larger intermediate object files I'm 
 okay with that. In fact, that was how I thought it worked until I 
 started reading about this problem...

 
 If that'd solve the problem, that'd be an improvement from the status quo.
 
 But I had the understanding that there is a problem with the linker 
 picking the TypeInfo from an arbitrary .obj (such as a large module that 
 isn't needed for a particular program)? I'm afraid the linker might 
 continue to choose an inappropriate TypeInfo. Or do you plan for all of 
 the TypeInfo's to be unique, thus probably still bloating the .exe (but 
 in a different way)?

I was kind of hoping the linker wouldn't go looking for a new object 
file that includes a symbol it hasn't seen yet if it's present in the 
object file that needs it.
After seeing some more discussion on hash tables used in OMF linkers, 
I'm not sure if that's what would happen. It'd depend on how the linker 
is implemented, I guess.

 [snip special linker discussion]
 Though arguably the situation with DMD/Windows is already worse when
 it comes to that, since almost nobody else uses OMF anymore...

 Right. We seem to be on our own when it comes to using OMF.

 Well, it seems OpenWatcom supports it. From what I've read here the 
 linker doesn't like DMD object files though. Walter claims it's buggy. 
 I don't know enough about OMF to say one way or the other.

 
 Well, it doesn't really matter to me if DMD continues to use OMF if the 
 format doesn't cause a bunch of bloat or other broken features.

Well, obviously it doesn't really matter if it works (and works well) :P.
Unfortunately, that doesn't currently seem to be the case...

 But I
 still wonder Walter needs to stay so close to the "official" format if
 DMC/DMD's OMF doesn't seems to be compatible with any other compiler.

Those bugs in OW might be fixed someday, or someone might re-implement 
OMF for the GNU toolchain (IIRC it was removed). Or someone else might 
want to implement a better linker for the Digital Mars compilers.
It's usually better to stick to published standards where they exist.

 Yes, DMD/Linux uses ELF. It just calls ld (through gcc) to link 
 instead of using optlink.

 I'm not sure if ld (or the mingw port of it) can use ELF to create 
 Windows executables, but if it can that may be an option: just switch 
 to ELF entirely and trash optlink. (this paragraph wasn't entirely 
 serious, in case you hadn't noticed :P)

 
 I suspect the option of ELF output would be welcomed by OMF's harshest 
 critics. Not that I know anything about ELF.

Thinking about it, I seem to recall "PE operations on non-PE file" to be 
a common error when I was trying to link ELF files on Windows 
(cross-compiled). MinGW-ld didn't seem to like ELF object files.
That was when linking to ELF binaries though, not to Windows executables.

Feb 22 2007

jcc7 <technocrat7 gmail.com> writes:

== Quote from Frits van Bommel (fvbommel REMwOVExCAPSs.nl)'s article
 Justin C Calvarese wrote:
 [snip]
 Oh, I thought the .obj file included mentions of things that are
 needed, but not contained in a particular .obj. I thought that's
 why "Error 42: Symbol Undefined" will appear if I don't give the
 compiler enough source files.

 If that's not right, that would be a serious flaw in my proposal.

 Oh, you want the compiler to parse the .obj files to generate some
 extra stuff before it tries to link?
 So basically that would be Walters "Put all the TypeInfo
 instantiations into one special module and import it everywhere you
 need TypeInfo", but automated by the compiler when it gets to the
 point of linking an executable?
 I'm not sure if that's workable either since I'm pretty sure it'll
 need to parse the source files to generate the TypeInfo, at the very
 least for user-defined types. At that point it may not even have
 access to those source files, especially when someone is using a
 closed-source library (via .di header files). This is also a problem
 with what I thought you meant, by the way.

Doh! I forgot that the compiler doesn't read the .obj/.lib files, but just hands
them over to the linker.

Yeah, I'm seeing the big problems with my idea now. It's sounding like a lot
more work than we could expect Walter or anyone else to undertake. There might
still be an easy solution (e.g. having the compiler produce a text file that
lists the needed TypeInfo's for each .obj that it outputs), but every solution
seems to bring with it another risk (e.g. too many files floating around that
have to be kept track of).

jcc7

Feb 23 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

(I'm just going to interject here because WOW this thread is getting
long; I can't even read the subject line anymore it's gone so far
sideways...)

There's one suggestion I haven't seen yet, so I'll make it:

I assume from the discussion about segment-based linking that it's
possible to pull out one particular section from an object file, and
just link that into the executable.

So, why not make a small modification to OPTLINK such that if a switch
is thrown, and it encounters any missing symbol of the form
/_D11TypeInfo_.*/, then it will link only the segment that symbol is in.
 In other words, it does what it currently does in all cases except
where there's a TypeInfo involved; in which case it links *just* the
TypeInfo, not the whole object file.

This doesn't break compatibility with OMF or any other tool; it's simply
an optimisation for reducing executable bloat in D programs.  This way,
we don't need a new object format, or a whole new linker.
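
A sketch of the proposed rule in Python, purely as a thought experiment. The regular expression is the one given above; the `segments_to_link` function, the `typeinfo_switch` parameter, and the symbol names are hypothetical placeholders (not real mangled names), and nothing here reflects how OPTLINK is actually implemented.

```python
import re

# Pattern quoted in the post; everything else here is hypothetical.
TYPEINFO_SYMBOL = re.compile(r"_D11TypeInfo_.*")

def segments_to_link(symbol, obj_segments, typeinfo_switch):
    """Return the segments of one object pulled in to satisfy `symbol`."""
    if symbol not in obj_segments:
        return []
    if typeinfo_switch and TYPEINFO_SYMBOL.fullmatch(symbol):
        # Proposed behaviour: take just the TypeInfo's own segment.
        return [symbol]
    # Existing behaviour: the whole object file comes along.
    return sorted(obj_segments)

OBJ = ["_D11TypeInfo_placeholder", "_D4demo_code", "_D4demo_tables"]

print(segments_to_link("_D11TypeInfo_placeholder", OBJ, typeinfo_switch=False))
print(segments_to_link("_D11TypeInfo_placeholder", OBJ, typeinfo_switch=True))
print(segments_to_link("_D4demo_code", OBJ, typeinfo_switch=True))
```

Non-TypeInfo symbols are untouched by the switch, which is what keeps the change compatible with existing object files. The sketch ignores anything the TypeInfo segment itself references, which a real implementation would have to follow.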

Or have I just got it all wrong? :P

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Feb 22 2007

kris <foo bar.com> writes:

Daniel Keep wrote:
 (I'm just going to interject here because WOW this thread is getting
 long; I can't even read the subject line anymore it's gone so far
 sideways...)
 
 There's one suggestion I haven't seen yet, so I'll make it:
 
 I assume from the discussion about segment-based linking that it's
 possible to pull out one particular section from an object file, and
 just link that into the executable.
 
 So, why not make a small modification to OPTLINK such that if a switch
 is thrown, and it encounters any missing symbol of the form
 /_D11TypeInfo_.*/, then it will link only the segment that symbol is in.
  In other words, it does what it currently does in all cases except
 where there's a TypeInfo involved; in which case it links *just* the
 TypeInfo, not the whole object file.
 
 This doesn't break compatibility with OMF or any other tool; it's simply
 an optimisation for reducing executable bloat in D programs.  This way,
 we don't need a new object format, or a whole new linker.
 
 Or have I just got it all wrong? :P
 
 	-- Daniel
 


On the face of it, that sounds like a reasonable solution. One would 
assume it would be legit to pull typeinfo from a segment instead ...

Feb 23 2007

Dave <Dave_member pathlink.com> writes:

kris wrote:
 Daniel Keep wrote:
 (I'm just going to interject here because WOW this thread is getting
 long; I can't even read the subject line anymore it's gone so far
 sideways...)

 There's one suggestion I haven't seen yet, so I'll make it:

 I assume from the discussion about segment-based linking that it's
 possible to pull out one particular section from an object file, and
 just link that into the executable.

 So, why not make a small modification to OPTLINK such that if a switch
 is thrown, and it encounters any missing symbol of the form
 /_D11TypeInfo_.*/, then it will link only the segment that symbol is in.
  In other words, it does what it currently does in all cases except
 where there's a TypeInfo involved; in which case it links *just* the
 TypeInfo, not the whole object file.

 This doesn't break compatibility with OMF or any other tool; it's simply
 an optimisation for reducing executable bloat in D programs.  This way,
 we don't need a new object format, or a whole new linker.

 Or have I just got it all wrong? :P

     -- Daniel

 
 
 On the face of it, that sounds like a reaonable solution. One would 
 assume it would be legit to pull typeinfo from a segment instead ...

Great idea if it's feasible... Would it then make sense that the switch be
thrown by default for DMD?

Feb 23 2007

jcc7 <technocrat7 gmail.com> writes:

== Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 (I'm just going to interject here because WOW this thread is getting
 long; I can't even read the subject line anymore it's gone so far
 sideways...)

 There's one suggestion I haven't seen yet, so I'll make it:

 I assume from the discussion about segment-based linking that it's
 possible to pull out one particular section from an object file, and
 just link that into the executable.

 So, why not make a small modification to OPTLINK such that if a
 switch is thrown, and it encounters any missing symbol of the form
 /_D11TypeInfo_.*/, then it will link only the segment that symbol is
 in.  In other words, it does what it currently does in all cases
 except where there's a TypeInfo involved; in which case it links
 *just* the TypeInfo, not the whole object file.

 This doesn't break compatibility with OMF or any other tool; it's
 simply an optimisation for reducing executable bloat in D programs.  This
 way, we don't need a new object format, or a whole new linker.

 Or have I just got it all wrong? :P
 	-- Daniel

I don't know enough about how linkers work to know if OPTLINK can just include a
TypeInfo segment like that, but it sounds like a good idea to me. I think it's
more feasible than my idea was (since Frits helped me see some serious
challenges with my idea).

And if it works, it should directly address Kris's problem.

jcc7

Feb 23 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

jcc7 wrote:
 == Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 ...

 So, why not make a small modification to OPTLINK such that if a
 switch is thrown, and it encounters any missing symbol of the form
 /_D11TypeInfo_.*/, then it will link only the segment that symbol is
 in.  In other words, it does what it currently does in all cases
 except where there's a TypeInfo involved; in which case it links
 *just* the TypeInfo, not the whole object file.

 ...

 
 I don't know enough about how linkers work to know if OPTLINK can just include
 a TypeInfo segment like that, but it sounds like a good idea to me. I think it's
 more feasible than my idea was (since Frits helped me see some serious
 challenges with my idea).
 
 And if it works, it should directly address Kris's problem.
 
 jcc7

I had a peek at the TypeInfos that are hard-coded into Phobos.  They do
import other stuff, but that "other stuff" is all in Phobos.  Since
(ignoring Tango for the moment) all D programs include at least one
non-TypeInfo symbol (the *real* main(), if I'm not mistaken), then
Phobos will always be linked in.

I think it's also safe to assume that since all other TypeInfos are
generated by the compiler, it's not going to start inserting imports to
modules all over the shop; at worst, it will import some other modules
from Phobos.

So, assuming all my assumptions hold (starts plastering his argument
with duct tape), then it should be feasible.
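
As a rough illustration of the symbol test in the proposal (this is not OPTLINK code, and the helper name is hypothetical): the pattern /_D11TypeInfo_.*/ hard-codes a length prefix of 11, while the symbol discussed elsewhere in this thread, _D12TypeInfo_AAa6__initZ, carries a prefix of 12. A sketch of the check would therefore probably want to accept any length prefix:

```python
import re

# Literal pattern from the proposal: hard-codes the length prefix "11".
PROPOSED = re.compile(r"_D11TypeInfo_.*")

# Hypothetical generalisation: accept any decimal length prefix.
ANY_LENGTH = re.compile(r"_D\d+TypeInfo_.*")


def is_typeinfo_symbol(name, pattern=ANY_LENGTH):
    """Return True if the whole mangled name matches the pattern."""
    return pattern.fullmatch(name) is not None


symbol = "_D12TypeInfo_AAa6__initZ"  # the COMDAT quoted in this thread
print(is_typeinfo_symbol(symbol, PROPOSED))    # False
print(is_typeinfo_symbol(symbol, ANY_LENGTH))  # True
```

Whether a linker switch could act on such a match is a separate question; the sketch only covers recognising the names.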

Of course, when it comes down to it, Walter's the expert.  What say ye?

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Feb 23 2007

kris <foo bar.com> writes:

Frits van Bommel wrote:
 kris wrote:
 
 Pragma wrote:

 Sorry if I'm stating the obvious, but it seems to me that the linker 
 is finding this typeinfo COMDAT in Core first, rather than somewhere 
 else, and is thereby forcing the inclusion of the rest of its 
 containing module.

 Does moving core.obj to the end of the .lib solve the problem?

 Heya Eric

 That's what it seems like and (as noted above) core.obj is already the 
 very last one added to the lib ;)

 The only way to resolve at this point is to remove core.obj entirely.

 
 
 Did you try putting it at the front of the lib? You never know, maybe it 
 picks the last one instead of the first one.
 
 Unless it just happens to be the only module to define 
 _D12TypeInfo_AAa6__initZ ...

No change, Frits

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 _D12TypeInfo_AAa6__initZ    COMDAT flags=x0 attr=x10 align=x0

TypeInfo's don't get the module prefix because it would cause 
duplication of code (i.e. bloat) to have it that way. There is no 
difference between the TypeInfo for char[][] in one module, and the 
TypeInfo for char[][] in another, so the TypeInfo names should match 
exactly.
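
To make the "no module prefix" point concrete, here is a minimal sketch that splits the symbol above into its length-prefixed parts. It assumes only the simple "_D" plus <length><name> layout visible in that symbol, it is not a full D demangler, and the function name is made up for illustration:

```python
def split_qualified_name(mangled):
    """Split the length-prefixed components that follow the "_D" prefix.

    Only handles the simple <length><name> layout seen in the symbol
    above; whatever trails the last component is returned unparsed.
    """
    if not mangled.startswith("_D"):
        raise ValueError("not a D mangled name: " + mangled)
    pos = 2
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return parts, mangled[pos:]


parts, suffix = split_qualified_name("_D12TypeInfo_AAa6__initZ")
print(parts)   # ['TypeInfo_AAa', '__init']
print(suffix)  # Z
```

The first component is the TypeInfo name itself rather than a module name, so every module that emits this TypeInfo produces the identical symbol.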

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 _D12TypeInfo_AAa6__initZ    COMDAT flags=x0 attr=x10 align=x0

 
 
 TypeInfo's don't get the module prefix because it would cause 
 duplication of code (i.e. bloat) to have it that way. There is no 
 difference between the TypeInfo for char[][] in one module, and the 
 TypeInfo for char[][] in another, so the TypeInfo names should match 
 exactly.

well, ok ... but is it responsible for what happened here? If not, what is?

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 kris wrote:

 _D12TypeInfo_AAa6__initZ    COMDAT flags=x0 attr=x10 align=x0


 TypeInfo's don't get the module prefix because it would cause 
 duplication of code (i.e. bloat) to have it that way. There is no 
 difference between the TypeInfo for char[][] in one module, and the 
 TypeInfo for char[][] in another, so the TypeInfo names should match 
 exactly.

 
 well, ok ... but is it responsible for what happened here? If not, what is?

 From your description, the linker is looking to resolve a reference to 
the TypeInfo for char[][], and found that module first.

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 kris wrote:

 _D12TypeInfo_AAa6__initZ    COMDAT flags=x0 attr=x10 align=x0



 TypeInfo's don't get the module prefix because it would cause 
 duplication of code (i.e. bloat) to have it that way. There is no 
 difference between the TypeInfo for char[][] in one module, and the 
 TypeInfo for char[][] in another, so the TypeInfo names should match 
 exactly.


 well, ok ... but is it responsible for what happened here? If not, 
 what is?

 
 
  From your description, the linker is looking to resolve a reference to 
 the TypeInfo for char[][], and found that module first.


That's exactly what it looks like. Would you agree the results could be 
described as haphazard? The outcome here certainly has that feeling to it :)

It also finds that particular one whether the module is listed first or 
last in the lib response-file.

What is one supposed to do for production-quality libraries?

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 It also finds that particular one whether the module is listed first or 
 last in the lib response-file.

I bet that's because that module was imported (directly or indirectly) 
by every other module that used char[][], and so it was the only module 
that defines it.

 What is one supposed to do for production-quality libraries?

Some strategies:

1) minimize importing of modules that are never used

2) for modules with a lot of code in them, import them as a .di file 
rather than a .d

3) create a separate module that defines the relevant typeinfo's, and 
put that first in the library
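
Strategy 3 can be pictured with a toy model of library resolution. This is only a sketch under stated assumptions: the linker is assumed to take the first module in library order that defines a missing symbol and to pull that module in whole, and the module names other than "core" are invented.

```python
from collections import namedtuple

# name, symbols defined, symbols referenced (all illustrative)
Module = namedtuple("Module", "name defines needs")

TI = "_D12TypeInfo_AAa6__initZ"


def link(app_needs, library):
    """Pull in whole modules until no resolvable symbol is left undefined."""
    linked, defined, pending = [], set(), list(app_needs)
    while pending:
        symbol = pending.pop()
        if symbol in defined:
            continue
        # Assumption: the first module in library order that defines it wins.
        owner = next((m for m in library if symbol in m.defines), None)
        if owner is None:
            raise LookupError("undefined symbol: " + symbol)
        linked.append(owner.name)
        defined |= owner.defines
        pending.extend(owner.needs)
    return linked


core = Module("core", {TI, "locale_tables"}, {"calendars"})
calendars = Module("calendars", {"calendars"}, set())
typeinfos = Module("typeinfos", {TI}, set())

print(link([TI], [core, calendars]))             # ['core', 'calendars']
print(link([TI], [typeinfos, core, calendars]))  # ['typeinfos']
```

In this model, if only one module defines the symbol, its position in the library makes no difference, which would be consistent with the first-or-last observation quoted above.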

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 It also finds that particular one whether the module is listed first 
 or last in the lib response-file.

 
 
 I bet that's because that module was imported (directly or indirectly) 
 by every other module that used char[][], and so it was the only module 
 that defines it.
 
 What is one supposed to do for production-quality libraries?

 
 
 Some strategies:
 
 1) minimize importing of modules that are never used
 
 2) for modules with a lot of code in them, import them as a .di file 
 rather than a .d
 
 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library


1) Tango takes this very seriously ... more so than Phobos, for example.

2) That is something that could be used in certain scenarios, but is 
not a general or practical solution for widespread use of D.

3) Hack around an undocumented and poorly understood problem in 
developer-land. Great.


you might as well add:

4) have the user instantiate a pointless and magic char[][] in their own 
program, so that they can link with the Tango library?


None of this is gonna fly in practice, and you surely know that?

I get a subtle impression that you're being defensive about the problem 
rather than actively thinking about a practical solution? We're trying 
to help D get some traction here, yet it seems you're not particularly 
interested in removing some roadblocks? Or are you scheming a resolution 
in private?

"frustrated with D tools again"

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 Some strategies:

 1) minimize importing of modules that are never used

 2) for modules with a lot of code in them, import them as a .di file 
 rather than a .d

 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library

 
 
 1) Tango takes this very seriously ... more so than Phobos, for example.

Sure, but in this particular case, it seems that "core" is being 
imported without referencing code in it. The only reason the compiler 
doesn't generate the char[][] TypeInfo is because an import defines it. 
The compiler does work on the assumption that if a module is imported, 
then it will also be linked in.

 2) That is something that could be used in certain scenarios, but is 
 not a general or practical solution for widespread use of D.

The compiler can automatically generate .di files. You're probably going 
to want to do that anyway as part of polishing the library - it speeds 
compilation times, aids proper encapsulation, etc. That's why the gc 
does it, and I've been meaning to do it for other bulky libraries like 
std.regexp.

I wish to point out that the current scheme does *work*, it generates 
working executables. In the age of demand paged executable loading 
(which both Linux and Windows do), unused code in the executable never 
even gets loaded into memory. The downside to size is really in shipping 
code over a network (also in embedded systems).

So I disagree with your characterization of it as impractical.

For professional libraries, it is not unreasonable to expect some extra 
effort in tuning the libraries to minimize dependency. This is a normal 
process, it's been going on at least for the 25 years I've been doing 
it. Standard C runtime libraries, for example, have been *extensively* 
tweaked and tuned in this manner, and that's just boring old C. They are 
not just big lumps of code.

 3) Hack around an undocumented and poorly understood problem in 
 developer-land. Great.

I think you understand the problem now, and the solution. Every 
developer of professional libraries should understand this problem, it 
crops up with most every language. If a developer doesn't understand it, 
one winds up with something like Java where even the simplest hello 
world winds up pulling in the entire Java runtime library, because 
dependencies were not engineered properly.

 you might as well add:
 
 4) have the user instantiate a pointless and magic char[][] in their own 
 program, so that they can link with the Tango library?

I wouldn't add it, as I would expect the library developer to take care 
of such things by adding them to the Tango library as part of the 
routine process of optimizing executable size by minimizing dependencies.

 None of this is gonna fly in practice, and you surely know that?

For features like runtime type identification, etc., that are generated 
by the compiler (instead of explicitly by the programmer), then the 
dependencies they generate are a fact of life.

Optimizing the size of a generated program is a routine programming 
task. It isn't something new with D. I've been doing this for 25 years.

 I get a subtle impression that you're being defensive about the problem 
 rather than actively thinking about a practical solution? We're trying 
 to help D get some traction here, yet it seems you're not particularly 
 interested in removing some roadblocks? Or are you scheming a resolution 
 in private?

If you have any ideas that don't involve reinventing obj file formats or 
that don't preclude using standard linkers, please let me know.

Feb 21 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 Some strategies:

 1) minimize importing of modules that are never used

 2) for modules with a lot of code in them, import them as a .di file 
 rather than a .d

 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library



 1) Tango takes this very seriously ... more so than Phobos, for example.

 
 
 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the compiler 
 doesn't generate the char[][] TypeInfo is because an import defines it. 
 The compiler does work on the assumption that if a module is imported, 
 then it will also be linked in.

This core module, and the entire locale package it resides in, is /not/ 
imported by anything. I spelled that out clearly before. You're making 
an assumption it is, somehow ... well, it is not. You can deduce that 
from the fact that the link succeeds perfectly well without that package 
existing in the library.

 
 2) That is something that could be used in certain scenarios, but is 
 not a general or practical solution for widespread use of D.

 
 
 The compiler can automatically generate .di files. You're probably going 
 to want to do that anyway as part of polishing the library - it speeds 
 compilation times, aids proper encapsulation, etc. That's why the gc 
 does it, and I've been meaning to do it for other bulky libraries like 
 std.regexp.

You may remember that many of us find .di files to be something "less" 
than an effective approach to library interfacing? As to it making 
smaller, faster compilations -- try it on the Win32 header files ... it 
makes them bigger and noticeably slower to parse.

This is neither a valid nor a practical solution.


 
 I wish to point out that the current scheme does *work*, it generates 
 working executables. In the age of demand paged executable loading 
 (which both Linux and Windows do), unused code in the executable never 
 even gets loaded into memory. The downside to size is really in shipping 
 code over a network (also in embedded systems).
 
 So I disagree with your characterization of it as impractical.

Oh, ok. It all depends on what one expects from a toolset. Good point

 
 For professional libraries, it is not unreasonable to expect some extra 
 effort in tuning the libraries to minimize dependency. This is a normal 
 process, it's been going on at least for the 25 years I've been doing 
 it. Standard C runtime libraries, for example, have been *extensively* 
 tweaked and tuned in this manner, and that's just boring old C. They are 
 not just big lumps of code.
 
 3) Hack around an undocumented and poorly understood problem in 
 developer-land. Great.

 
 
 I think you understand the problem now, and the solution. Every 
 developer of professional libraries should understand this problem, it 
 crops up with most every language. If a developer doesn't understand it, 
 one winds up with something like Java where even the simplest hello 
 world winds up pulling in the entire Java runtime library, because 
 dependencies were not engineered properly.

This is a problem with the toolchain, Walter. Plain and simple. The 
linker picks up an arbitrary, yes arbitrary, module from the library 
because the D language-design is such that it highlights a deficiency in 
the existing toolchain. See below:

You can claim all you like that devs should learn to deal with it, but 
the fact remains that it took us more than a day to track down this 
obscure problem to being a char[][] decl. It will take just as long for 
the next one, and perhaps longer. Where does the cycle end?

The toolchain currently operates in a haphazard fashion, linking in 
/whatever/ module-chain happens to declare a typeinfo for char[][]. And 
it does this because of the way D generates the typeinfo. The process is 
broken, pure and simple. We should accept this and try to figure out how 
to resolve it instead.


 
 you might as well add:

 4) have the user instantiate a pointless and magic char[][] in their 
 own program, so that they can link with the Tango library?

 
 
 I wouldn't add it, as I would expect the library developer to take care 
 of such things by adding them to the Tango library as part of the 
 routine process of optimizing executable size by minimizing dependencies.
 


Minimizing dependencies? What are you talking about? Those deps are 
produced purely by the D compiler, and not the code design.



 None of this is gonna fly in practice, and you surely know that?

 
 
 For features like runtime type identification, etc., that are generated 
 by the compiler (instead of explicitly by the programmer), then the 
 dependencies they generate are a fact of life.
 
 Optimizing the size of a generated program is a routine programming 
 task. It isn't something new with D. I've been doing this for 25 years.

Entirely disingenuous. This is not about "optimization" at all ... it 
is about a broken toolchain. Nothing more.

I hope you'll find a way to progress this forward toward a resolution 
instead of labeling it something else.

Feb 21 2007

Justin C Calvarese <technocrat7 gmail.com> writes:

kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 Some strategies:

 1) minimize importing of modules that are never used

 2) for modules with a lot of code in them, import them as a .di file 
 rather than a .d

 3) create a separate module that defines the relevant typeinfo's, 
 and put that first in the library



 1) Tango takes this very seriously ... more so than Phobos, for example.


 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the compiler 
 doesn't generate the char[][] TypeInfo is because an import defines 
 it. The compiler does work on the assumption that if a module is 
 imported, then it will also be linked in.

 
 This core module, and the entire locale package it resides in, is /not/ 
 imported by anything. I spelled that out clearly before. You're making 
 an assumption it is, somehow ... well, it is not. You can deduce that 
 from the fact that the link succeeds perfectly well without that package 
 existing in the library.
 
 2) That is something that could be used in certain scenarios, but is 
 not a general or practical solution for widespread use of D.


 The compiler can automatically generate .di files. You're probably 
 going to want to do that anyway as part of polishing the library - it 
 speeds compilation times, aids proper encapsulation, etc. That's why 
 the gc does it, and I've been meaning to do it for other bulky 
 libraries like std.regexp.

 
 You may remember that many of us find .di files to be something "less" 
 than an effective approach to library interfacing? As to it making 
 smaller, faster compilations -- try it on the Win32 header files ... it 
 makes them bigger and noticeably slower to parse.
 
 This is neither a valid nor a practical solution.
 
 
 I wish to point out that the current scheme does *work*, it generates 
 working executables. In the age of demand paged executable loading 
 (which both Linux and Windows do), unused code in the executable never 
 even gets loaded into memory. The downside to size is really in 
 shipping code over a network (also in embedded systems).

 So I disagree with your characterization of it as impractical.

 
 Oh, ok. It all depends on what one expects from a toolset. Good point
 
 For professional libraries, it is not unreasonable to expect some 
 extra effort in tuning the libraries to minimize dependency. This is a 
 normal process, it's been going on at least for the 25 years I've been 
 doing it. Standard C runtime libraries, for example, have been 
 *extensively* tweaked and tuned in this manner, and that's just boring 
 old C. They are not just big lumps of code.

 3) Hack around an undocumented and poorly understood problem in 
 developer-land. Great.


 I think you understand the problem now, and the solution. Every 
 developer of professional libraries should understand this problem, it 
 crops up with most every language. If a developer doesn't understand 
 it, one winds up with something like Java where even the simplest 
 hello world winds up pulling in the entire Java runtime library, 
 because dependencies were not engineered properly.

 
 This is a problem with the toolchain, Walter. Plain and simple. The 
 linker picks up an arbitrary, yes arbitrary, module from the library 
 because the D language-design is such that it highlights a deficiency in 
 the existing toolchain. See below:
 
 You can claim all you like that devs should learn to deal with it, but 
 the fact remains that it took us more than a day to track down this 
 obscure problem to being a char[][] decl. It will take just as long for 
 the next one, and perhaps longer. Where does the cycle end?
 
 The toolchain currently operates in a haphazard fashion, linking in 
 /whatever/ module-chain happens to declare a typeinfo for char[][]. And 
 it does this because of the way D generates the typeinfo. The process is 
 broken, pure and simple. We should accept this and try to figure out how 
 to resolve it instead.
 
 
 you might as well add:

 4) have the user instantiate a pointless and magic char[][] in their 
 own program, so that they can link with the Tango library?


 I wouldn't add it, as I would expect the library developer to take 
 care of such things by adding them to the Tango library as part of the 
 routine process of optimizing executable size by minimizing dependencies.

 
 
 Minimizing dependencies? What are you talking about? Those deps are 
 produced purely by the D compiler, and not the code design.
 
 
 
 None of this is gonna fly in practice, and you surely know that?


 For features like runtime type identification, etc., that are 
 generated by the compiler (instead of explicitly by the programmer), 
 then the dependencies they generate are a fact of life.

 Optimizing the size of a generated program is a routine programming 
 task. It isn't something new with D. I've been doing this for 25 years.

 
 Entirely disingenuous. This is not about "optimization" at all ... it 
 is about a broken toolchain. Nothing more.
 
 I hope you'll find a way to progress this forward toward a resolution 
 instead of labeling it something else.

I'm not trying to pick a fight with any of the people who have been 
discussing this serious issue, but I have some thoughts I'd like to add. 
Feel free to take my words as the ramblings of an idiot...

My theory is that Walter and you (Kris and everyone else who is trying 
to talk some sense into Walter) are operating on different wavelengths. 
(I may be on yet another wavelength.) When I read Kris's complaint, I 
think "Wow, that sounds like a problem that needs fixing". When I read 
Walter's response, I think "Hmmm, that makes sense, too. What was Kris's 
problem again?". And that cycle repeats for me. Walter seems to still 
think he understands the problem, but perhaps we could benefit from a 
simple illustration of the problem. Or just a restatement of the problem 
situation. I'm sure that I don't understand what's going on.

It's something about the compiler generating the TypeInfo for 
char[][], and it's bringing in all of "Core" (but we don't need all of 
"Core"). And we especially don't need the "locale" package since it's 
bloated (and unneeded), but the whole package (including all of "Core" 
and "locale") is brought in because the compiler is generating TypeInfo 
for the char[][]. (But if the "locale" package is so bloated and 
unneeded, then why is it being compiled at all? Is "locale" part of 
"Core"?) Is any of that right? I'm so confused.

(Perhaps part of the problem is that Walter isn't that familiar with the 
Tango library and what it's all about. I suspect that I know more about 
Tango than Walter does -- and I'm afraid that I know barely anything 
about it -- so that could be part of the problem, too.)

-- 
jcc7

Feb 21 2007

kris <foo bar.com> writes:

Justin C Calvarese wrote:
 I'm not trying to pick a fight with any of the people who have been 
 discussing this serious issue, but I have some thoughts I'd like to add. 
 Feel free to take my words as the ramblings of an idiot...
 
 My theory is that Walter and you (Kris and everyone else who is trying 
 to talk some sense into Walter) are operating on different wavelengths. 
 (I may be on yet another wavelength.) When I read Kris's complaint, I 
 think "Wow, that sounds like a problem that needs fixing". When I read 
 Walter's response, I think "Hmmm, that makes sense, too. What was Kris's 
 problem again?". And that cycle repeats for me. Walter seems to still 
 think he understands the problem, but perhaps we could benefit from a 
 simple illustration of the problem. Or just a restatement of the problem 
 situation. I'm sure that I don't understand what's going on.
 
 It's something about the compiler generating the TypeInfo for 
 char[][], and it's bringing in all of "Core" (but we don't need all of 
 "Core"). And we especially don't need the "locale" package since it's 
 bloated (and unneeded), but the whole package (including all of "Core" 
 and "locale") is brought in because the compiler is generating TypeInfo 
 for the char[][]. (But if the "locale" package is so bloated and 
 unneeded, then why is it being compiled at all? Is "locale" part of 
 "Core"?) Is any of that right? I'm so confused.
 
 (Perhaps part of the problem is that Walter isn't that familiar with the 
 Tango library and what it's all about. I suspect that I know more about 
 Tango than Walter does -- and I'm afraid that I know barely anything 
 about it -- so that could be part of the problem, too.)
 

Well said, Justin. I'm personally feeling like there's either some vast 
misunderstanding or there's a lot of smoke billowing about. I'll try to 
recapture the issue and see where it goes. Let me know if I fail to 
explain something?

The problem space
-----------------

1) This is not about templates anymore. We're currently past that bridge 
and into different territory. A common territory that every developer 
using D will have to face in one way or another.

2) This is not specific to Tango at all. It is a generic problem and 
Tango just happens to trip it in an obvious manner.

3) In a nutshell, the linker is binding code from the library that has 
no business being attached to the executable. Let's call this the 
"redundant code"?

4) Given the last set of comments from Walter, he appears to think the 
redundant code is somehow imported; either by the example program or 
indirectly via some chain of imports within the library itself. This is 
where the disconnect lies, I suspect.

5) There is /no/ import chain explicitly manifested anywhere in the code 
in question. This should be obvious from the fact that the example links 
perfectly cleanly when said redundant code is deliberately removed from 
the library.

6) The dependency that /does/ cause the problem is one generated by the 
D compiler itself. It generates and injects false 'dependencies' across 
object modules. Specifically, that's effectively how the linker treats them.

7) These fake dependencies are responsible for, in this case, the entire 
"locale" package being bound to the example app, resulting in a 350% 
increase in size.

8) Fake dependencies are injected in the form of typeinfo. In this case, 
the typeinfo is for a char[][]. This is not part of the "prepackaged" 
set of typeinfo, so the compiler makes it up on the fly. Trouble is, 
this is "global" information -- it should be in one location only.

9) The fake dependencies cause the linker to pick up and bind whatever 
module happens to satisfy its need for the typeinfo resolution. In this 
case, the linker sees Core.obj with a char[][] decl exposed, so it says 
"hey, this is the place!" and binds it. Along with everything else that 
Core.obj actually requires.

10) The linker is entirely wrong, but you can't really blame it since 
the char[][] decl is scattered throughout the library modules. It thinks 
it gets the /right/ one, but in fact it could have chosen *any* of 
them. This is now getting to the heart of the problem.

11) If there was only one exposed decl for char[][], e.g. like int[], 
there would be no problem. In fact you can see all the prepackaged 
typeinfo bound to any D executable. There's lots of them. However, 
because the compiler injects this typeinfo into a variety of objects 
(apparently wherever char[][] is used), then the linker is boondoggled.

12) If the linker were smart, and could link segments instead of entire 
object modules, this would still be OK (a segment is an isolated part of 
the object module). But the linker is not smart. It was written to be 
fast, in pure assembler, decades ago.
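
The difference described in point 12 can be sketched with a toy model. Everything here is illustrative: the segment names (other than the TypeInfo symbol) and all sizes are invented, and real linkers are far more involved. It only contrasts "resolve a symbol by taking the whole object module" with "resolve it by taking just the segment that defines it".

```python
# Each object module maps a symbol to (size_kb, symbols it references).
# Names and sizes are invented for illustration.
LIBRARY = {
    "core": {
        "_D12TypeInfo_AAa6__initZ": (1, []),
        "culture_data": (300, ["calendar_impl"]),
    },
    "calendars": {
        "calendar_impl": (120, []),
    },
}


def owner_of(symbol):
    """Return the name of the module that defines the symbol."""
    for name, segments in LIBRARY.items():
        if symbol in segments:
            return name
    raise LookupError("undefined symbol: " + symbol)


def linked_size(roots, whole_module):
    """Total KB linked in, starting from the symbols the program needs."""
    seen, pending, total = set(), list(roots), 0
    while pending:
        symbol = pending.pop()
        if symbol in seen:
            continue
        segments = LIBRARY[owner_of(symbol)]
        # Whole-module linking drags in every segment of the owner.
        pulled = list(segments) if whole_module else [symbol]
        for name in pulled:
            if name not in seen:
                seen.add(name)
                size, refs = segments[name]
                total += size
                pending.extend(refs)
    return total


need = ["_D12TypeInfo_AAa6__initZ"]
print(linked_size(need, whole_module=True))   # 421
print(linked_size(need, whole_module=False))  # 1
```

With whole-module linking, the single TypeInfo reference drags in the unrelated data and, through it, a second module; with segment-level linking only the TypeInfo itself is taken.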


Why is this a problem now?
--------------------------

Well, it's always been a problem to an extent, over the years. The key 
here is that in the past, the problem was generated principally by the 
developer/coder by introducing duplicate symbols and so on. Because it 
was in the hand of the developer, it could be resolved reasonably well.

With D, that is still potentially the case. However, the /real/ problem 
is this: the compiler generates the duplicate symbols all by itself. So, 
the developer has no means to rectify the situation ... it is entirely 
out of their hands. What's worse is this: there are no useful messages 
involved ... all you get is some bizarre and arcane message from the 
linker that generally misguides you instead.

Case in point: you have to strip the library down by hand, and very very 
carefully sift through the symbols and literally hundreds of library 
builds until you finally get lucky enough to stumble over the problem.

Walter asserts that the linker can be tricked into doing the right 
thing. This seems to show a lack of understanding on his part about the 
problem and the manner in which the lib and linker operate.

The linker cannot be fooled or tricked in a dependable manner, since 
the combinations of redundant symbols for it to choose from are nigh 
impossible for a human to track on a regular basis, and the particular 
module resolved depends very much on where the linker currently is in 
its process. As you can imagine, in a large library with a large number 
of compiler-generated duplicates, that's potentially a very large 
explosion of combinations. The notion that a developer be responsible 
for tricking the linker, to cover up for these injected duplicates, is 
simply absurd :)

As was pointed out to me, OMF librarians actually use a two-level 
hashmap to index the library entries. This is used by the linker to 
locate missing symbols. I think it's clear that this is not a linear 
lookup mechanism as had been claimed, and that is borne out by experiments 
that show the process cannot be controlled, and the linker cannot be 
faked in a usable or dependable manner.

To hammer the overall issue home, consider this: in the example 
application I added a dummy /magical/ declaration of

char[][] huh = [];

When linked against the lib, my executable shrank from ~620kb to ~180kb. 
Where did I get this magic from? Well, it took an exceedingly long and 
tedious process to discover it. Rinse and repeat for the next related error.
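
For scale, a quick check of those two figures. Both sizes are approximate ("~"), so the result is only a ballpark; the variable names are made up for the example.

```python
before_kb = 620  # executable size without the dummy declaration
after_kb = 180   # executable size with it

ratio = before_kb / after_kb
increase_pct = (before_kb - after_kb) / after_kb * 100

print(round(ratio, 2))      # 3.44
print(round(increase_pct))  # 244
```

So the bloated executable is roughly 3.4 times the lean one, which is presumably what the "350%" figure in point 7 refers to; counted strictly as an increase over 180kb it comes to about 244%.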



A word about Tango
------------------

Contrary to various implications made recently, Tango is rather well 
organized with very limited inter-module dependencies. For example, 
interfaces and pure-abstract classes are deliberately used to decouple 
the implementation of one module from those of others. You won't see 
that kind of thing in many libs, and certainly not in Phobos. Tango is 
designed and built by people who actually care about such things, crazy 
as that may sound :)

The "bloat" injected into the example executable comes entirely from an 
isolated package. It is the "locale" package, which supports a truly 
extensive array of I18N tools and calendars. I think it has 7 or 8 
different calendar systems alone? It captures all the monetary, time and 
date preferences and idioms for every recognized locale in the world. 
In short, it is an exceptional piece of work (from David Chapman). The 
equivalent out there is perhaps a good chunk of the IBM ICU project. The 
minimum size for that is a 7MB DLL. Typically 10MB instead.

I feel it important to point out that this powerful I18N package is, in 
no way, at fault here. The D compiler simply injected the wrong symbol 
into the wrong module at the wrong time, in the wrong order, and the 
result is that this package gets linked when it's not used or imported 
in any fashion by the code design. Instead, the dependency is created 
entirely by the compiler. That's a problem. It is a big problem. And it 
is a problem every D developer will face, at some point, when using DM 
tools. But it is not a problem rooted in the Tango code.

Feb 21 2007

John Reimer <terminal.node gmail.com> writes:

< SNIP good post from Kris >

After reading this, it began to dawn on me what has become one of the huge
obstacles for D.

Tango represents one of the largest contributions to the D world.  It
may be considered the first large commercial grade library to make its
presence here.  It's an opportunity to prove "that D has the right stuff."

But there's a traitor in our midst, and that traitor resides in the D
tool chain, no less: the withered grip of optlink -- decades old,
astonishingly speedy, tried and true, vain and boastful -- stays the progress
of D in an absurd way. We're led to believe that we need nothing better for
a modern language that steadily pushes into new territory and tests new
ideas. And yet we are ultimately held hostage by this very tool. 

Does no one see the imbalance, the inconsistency of all this? A powerful
language like D has worked so hard to ease the pains of C++ users, to
separate us from the stench of C++ clutter, by implementing a more
palatable grammar. And yet despite all this, we are forsaken. Has the
imagination and concern for progress vanished? Isn't the toolchain also
one of the most critical aspects of acceptance of a language?  Why should
anyone adopt a powerful and clean language when the tools just drag it
deeper into the sewer?  What benefit is a new language with old tools?
Cutting corners for the sake of efficiency does D no good.  Make it all
good, language and tools, and the users will clamour for it.

optlink may just be the bane for D acceptance. And Tango gets the pitiful
opportunity of demonstrating why D is NOT ready for prime-time in the
commercial realm: the DM support tools it relies on are bogged down in
the past, reflecting D's lopsided existence on yet another level: a strong
language relying on a fragile, outdated, and poorly fit tool set.

-JJR

Feb 22 2007

Walter Bright <newshound digitalmars.com> writes:

John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a strong
 language relying on a fragile, outdated, and poorly fit tool set.

Linux's ld exhibits the same behavior. Try compiling the 3 files here 
according to the instructions below, and try different orderings adding 
the object files to the librarian (ar). The same behavior as lib/optlink 
results (using nm to see what symbols are placed into the resulting 
executable).

------------------- test.d --------------------
import b;       // a is not imported

void foo(...) { }

void test(char[][] s)
{
     bbb2();
     foo(s);
}

void main()
{
}
--------------------- a.d -------------------
void xxx(...) { }

void aaa(char[][] s)
{
     xxx(s);
}

void aaa2()             // never referenced nor imported 'bloat'
{
}
-------------------- b.d ---------------------
void yyy(...) { }

void bbb(char[][] s)
{
     yyy(s);
}

void bbb2()
{
}
-------------------- build 1 -------------------
dmd -c a.d test.d
dmd -c b.d
rm foo.a
ar -r foo.a a.o b.o     <= a comes before b
dmd test.o foo.a
nm test >log		<= aaa2() appears in executable
-------------------- build 2 -------------------
dmd -c a.d test.d
dmd -c b.d
rm foo.a
ar -r foo.a b.o a.o     <= b comes before a
dmd test.o foo.a
nm test >log		<= aaa2() does not appear in executable
---------------------------------------------

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a 
 strong
 language relying on a fragile, outdated, and poorly fit tool set.

 
 Linux's ld exhibits the same behavior. Try compiling the 3 files here 
 according to the instructions below, and try different orderings adding 
 the object files to the librarian (ar). The same behavior as lib/optlink 
 results (using nm to see what symbols are placed into the resulting 
 executable).

In your example, no symbols at all from a are referenced in b or in test, 
and yet it's linked anyway in the first case?


Sean

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Sean Kelly wrote:
 Walter Bright wrote:
 John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the 
 pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a 
 strong
 language relying on a fragile, outdated, and poorly fit tool set.

 Linux's ld exhibits the same behavior. Try compiling the 3 files here 
 according to the instructions below, and try different orderings 
 adding the object files to the librarian (ar). The same behavior as 
 lib/optlink results (using nm to see what symbols are placed into the 
 resulting executable).

 
 In your example, no symbols at all from a are referenced in b or in test, 
 and yet it's linked anyway in the first case?

Forget I said that.  It's the TypeInfo for char[][].

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the 
 pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a 
 strong
 language relying on a fragile, outdated, and poorly fit tool set.

 Linux's ld exhibits the same behavior. Try compiling the 3 files here 
 according to the instructions below, and try different orderings 
 adding the object files to the librarian (ar). The same behavior as 
 lib/optlink results (using nm to see what symbols are placed into the 
 resulting executable).

 In your example, no symbols at all from a are referenced in b or in 
 test, and yet it's linked anyway in the first case?

 
 Forget I said that.  It's the TypeInfo for char[][].

That's right, it picked the FIRST ONE in the library, regardless of how many 
other .o's defined it, and regardless of what else was in that .o, 
referenced or not.

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the 
 pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a 
 strong
 language relying on a fragile, outdated, and poorly fit tool set.

 Linux's ld exhibits the same behavior. Try compiling the 3 files 
 here according to the instructions below, and try different 
 orderings adding the object files to the librarian (ar). The same 
 behavior as lib/optlink results (using nm to see what symbols are 
 placed into the resulting executable).

 In your example, no symbols at all from a are referenced in b or in 
 test, and yet it's linked anyway in the first case?

 Forget I said that.  It's the TypeInfo for char[][].

 
 That's right, it picked the FIRST ONE in library, regardless of how many 
 other .o's defined it, and regardless of what else was in that .o, 
 referenced or not.

That makes complete sense.  It's irritating that such matches pull in 
the entire object, but at least the process is straightforward and logical.

Feb 23 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Sean Kelly wrote:
 That makes complete sense.  It's irritating that such matches pull in 
 the entire object, but at least the process is straightforward and logical.

Of course, if matches didn't pull in the entire object then static 
constructors/destructors, unittests and exceptions wouldn't work without 
some way to flag specific sections as "always pull this in if anything 
(or even better: if _certain things_) are pulled in" or "when this is 
pulled in, also pull in X and Y even though they're not referenced" or 
something similar...

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Frits van Bommel wrote:
 Sean Kelly wrote:
 That makes complete sense.  It's irritating that such matches pull in 
 the entire object, but at least the process is straightforward and 
 logical.

 
 Of course, if matches didn't pull in the entire object then static 
 constructors/destructors, unittests and exceptions wouldn't work without 
 some way to flag specific sections as "always pull this in if anything 
 (or even better: if _certain things_) are pulled in" or "when this is 
 pulled in, also pull in X and Y even though they're not referenced" or 
 something similar...

Hm... so how does segment-level linking work at all?

Feb 23 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Sean Kelly wrote:
 Frits van Bommel wrote:
 Sean Kelly wrote:
 That makes complete sense.  It's irritating that such matches pull in 
 the entire object, but at least the process is straightforward and 
 logical.

 Of course, if matches didn't pull in the entire object then static 
 constructors/destructors, unittests and exceptions wouldn't work 
 without some way to flag specific sections as "always pull this in if 
 anything (or even better: if _certain things_) are pulled in" or "when 
 this is pulled in, also pull in X and Y even though they're not 
 referenced" or something similar...

 
 Hm... so how does segment-level linking work at all?

Well, ld has a switch called --gc-sections, which basically... wait for 
it... garbage collects the sections. :)
 From what I understand, this currently breaks DMD exception handling on 
Linux, since nothing explicitly refers to the sections containing the data.
C++ exceptions work fine AFAIK, presumably because the default linker 
script[1] explicitly tells ld to keep .eh_frame sections even if 
unreferenced. It does the same for quite some other sections, including 
.ctors and .dtors (which DMD uses to set up the linked module list).

So presumably it first selects the objects to be linked in and then uses 
the program entry point and those sections as the "root pointers" for 
the --gc-sections switch if used.

This would mean that if a module gets pulled in, its .ctors section (if 
any) gets kept, and that references (indirectly) the static 
constructors, destructors and unit tests. So that makes me think ld may 
have the same issue as optlink, since Walter has shown that without 
--gc-sections ld also pulls in whole object files (even if the 
corresponding modules are not necessarily imported by the program). 
Unless ld somehow handles libraries in a smarter manner...


[1] use 'ld --verbose' to see it.
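
The root-set reasoning above can be sketched as a small reachability pass. This is only a model of the guess made in this post, not ld's actual implementation: the entry point and the always-kept sections act as roots, and everything reachable from them survives. The section names and the `keepSections` helper are made up for illustration, and the code is written in current D rather than the D of this thread.

```d
import std.stdio;

// Mark every section reachable from the roots.
// Anything left unmarked is what a section-level GC would discard.
// `refs` maps a section name to the sections it references (hypothetical data).
bool[string] keepSections(string[][string] refs, string[] roots)
{
    bool[string] kept;
    auto worklist = roots.dup;
    while (worklist.length)
    {
        auto s = worklist[$ - 1];
        worklist = worklist[0 .. $ - 1];
        if (s in kept)
            continue;                 // already visited
        kept[s] = true;
        if (auto targets = s in refs)
            worklist ~= *targets;     // follow outgoing references
    }
    return kept;
}

void main()
{
    // A pulled-in module's .ctors entry references its static constructor,
    // which in turn references more data. Nothing references "unused.fn".
    string[][string] refs = [
        "main":        ["helper"],
        ".ctors":      ["locale.ctor"],
        "locale.ctor": ["locale.tables"]
    ];

    // Roots: the entry point plus a section that is always kept.
    auto kept = keepSections(refs, ["main", ".ctors"]);

    writeln("locale.tables" in kept ? "kept" : "dropped"); // prints "kept"
    writeln("unused.fn" in kept ? "kept" : "dropped");     // prints "dropped"
}
```

In this model, once a module's .ctors entry is a root, its static constructors and everything they touch stay in, which is the cascade described above.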

Feb 23 2007

John Reimer <terminal.node gmail.com> writes:

On Fri, 23 Feb 2007 13:54:33 -0800, Walter Bright wrote:

 John Reimer wrote:
 optlink may just be the bane for D acceptance. And Tango gets the pitiful
 opportunity of demonstrating why D is NOT ready for prime-time in the
 commercial realm: the DM support tools it relies on are bogged down in
 the past, reflecting D's lopsided existence on yet another level: a strong
 language relying on a fragile, outdated, and poorly fit tool set.

 
 Linux's ld exhibits the same behavior. Try compiling the 3 files here 
 according to the instructions below, and try different orderings adding 
 the object files to the librarian (ar). The same behavior as lib/optlink 
 results (using nm to see what symbols are placed into the resulting 
 executable).
 
 ------------------- test.d --------------------
 import b;       // a is not imported
 
 void foo(...) { }
 
 void test(char[][] s)
 {
      bbb2();
      foo(s);
 }
 
 void main()
 {
 }
 --------------------- a.d -------------------
 void xxx(...) { }
 
 void aaa(char[][] s)
 {
      xxx(s);
 }
 
 void aaa2()             // never referenced nor imported 'bloat'
 {
 }
 -------------------- b.d ---------------------
 void yyy(...) { }
 
 void bbb(char[][] s)
 {
      yyy(s);
 }
 
 void bbb2()
 {
 }
 -------------------- build 1 -------------------
 dmd -c a.d test.d
 dmd -c b.d
 rm foo.a
 ar -r foo.a a.o b.o     <= a comes before b
 dmd test.o foo.a
 nm test >log		<= aaa2() appears in executable
 -------------------- build 2 -------------------
 dmd -c a.d test.d
 dmd -c b.d
 rm foo.a
 ar -r foo.a b.o a.o     <= b comes before a
 dmd test.o foo.a
 nm test >log		<= aaa2() does not appear in executable
 ---------------------------------------------


True, you are correct about the same error being represented here. This
proves only that ld/ar and optlink/lib are on equal footing in this
specific case. optlink/lib/OMF however have so many problems in other
aspects that my opinion stands.  (one simple example: think 64-bit... dmd
tools have no footing)

Concerning the problem at hand:

D is not C.  Resolving symbols in C era was part of the business and
pretty much cause/effect.  Not so with D and its hidden implementation
details and duplicate symbols resulting from implementation of language
features.

That's why D is different; that's why jerry-rigging old tools to a new
language and expecting developers to rely on these tools and fix
problems like they are C programmers is totally absurd. Even expert D users
are going to be confounded.  New users will be more lost than if they were
using C. That's bad publicity for D, no two ways about it.

What your example above actually points out is the specific weakness
in D's TypeInfo implementation such that linker/librarians are rendered
incapable of making sane choices.  Maybe it's not the linker's and
librarian's fault since they are too stupid to really figure a resolution?
If not, then it must be the implementation's fault.

Two options?  Create a new object file format, a new linker, and a new
librarian to suit D's evolving feature set.  Everyone will yell
"overkill", "that's a nasty amount of work", "I'm not gonna do it!", and
"How are we going to support C?"

If we /must/ go a jerry-rigging to old tools in the name of keeping
things simple (and keeping the faith with C), then perhaps the better
option is to rethink how TypeInfo is implemented and mold it to fit sanely
into the containers and tools available. Like I said, to expect
programmers to troubleshoot hidden operations -- how linker and librarian
operate with D symbols that are practically invisible from the programmer's
perspective -- is absolutely nasty and mean.  :)

This /really/ needs to be addressed.  Let's stop
saying that the programmer needs to learn to deal with this.  This is
really just a way of ignoring a show-stopping issue for D.  D cannot afford
to turn its back on it anymore (as it has for the last few years).  D has
to be progressive in more than just language features to make an impact in
this world.

-JJR

Feb 23 2007

John Reimer <terminal.node gmail.com> writes:

I want to point out also that there /is/ a way to partially side-step this
issue, believe it or not:

Use GDC and build Tango or Phobos as a shared library (on non-win32
systems naturally). Win32 dlls are painfully limited in this regard so
it's a no-go there, and dmd (I believe) still doesn't support shared libs
on linux.

Suddenly, the problem of fat binaries and phantom dependencies is
neatly abstracted away... well, in a manner of speaking only since the
issue still exists, I'm sure, in some sense. But seeing a Tango or Phobos
as a shared library tends to make us all have warm, fuzzy feelings about
the whole thing. :)

When all's said and done, though... probably the equivalent amount of
object code is loaded into memory (phantom objects included), except that
now the library is shared by D programs instead.

-JJR

Feb 24 2007

Jascha Wetzel <"[firstname]" mainia.de> writes:

kris wrote:
 9) The Fake dependencies cause the linker to pick up and bind whatever
 module happens to satisfy its need for the typeinfo resolution. In this
 case, the linker sees Core.obj with a char[][] decl exposed, so it says
 "hey, this is the place!" and binds it. Along with everything else that
 Core.obj actually requires.

just a thought:
assuming the linker is working at obj file level, isn't that a
set-cover-problem and therefore NP-complete then?
given several obj files (non-disjoint sets), find the minimum number of
obj files that cover all symbols needed by the program.

why i think that it's a set-cover:
i understand that the dependency for some typeinfo can only arise if one
of the modules that are imported needs it. hence, there is always a
module with that TI that provides at least one other needed symbol.
therefore it cannot happen, that the set-cover includes a module of
which only TIs are needed.

if that is correct, one actually wouldn't want the linker to solve this
correctly on object file level, rather than use a faster heuristic to
approach a solution (which is what it seems to be doing now).

maybe it helps the discussion a bit if one knows that solving the
problem in the toolchain necessarily involves making the linker work at
segment level.
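
a rough sketch of the kind of "faster heuristic" mentioned above: the textbook greedy approximation for set cover, applied to object files. this is not what optlink or ld actually do (they just follow the library dictionary); the `greedyCover` helper and the symbol names are invented for illustration, and the code is current D rather than the D of this thread.

```d
import std.algorithm : sort;
import std.stdio;

// Greedy set-cover heuristic: repeatedly pick the object file that
// defines the most symbols still missing. `objects` maps an object
// file name to the symbols it defines (hypothetical data).
string[] greedyCover(string[][string] objects, string[] needed)
{
    bool[string] missing;
    foreach (sym; needed)
        missing[sym] = true;

    string[] chosen;
    while (missing.length)
    {
        string best;
        size_t bestCount = 0;
        // Sorted keys make tie-breaking deterministic.
        foreach (name; sort(objects.keys))
        {
            size_t count = 0;
            foreach (sym; objects[name])
                if (sym in missing)
                    ++count;
            if (count > bestCount)
            {
                best = name;
                bestCount = count;
            }
        }
        if (bestCount == 0)
            throw new Exception("some symbols are not defined by any object file");
        chosen ~= best;
        foreach (sym; objects[best])
            missing.remove(sym);
    }
    return chosen;
}

void main()
{
    // Both files define the same typeinfo; only b.obj also defines bbb2.
    string[][string] objects = [
        "Core.obj": ["TI", "localeData"],
        "b.obj":    ["TI", "bbb2"]
    ];
    writeln(greedyCover(objects, ["TI", "bbb2"])); // prints ["b.obj"]
}
```

the greedy choice is only an approximation in general, but here it shows how a cover-based pick would skip Core.obj, since b.obj already supplies the typeinfo along with a symbol that is really needed.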

Feb 22 2007

"Kristian Kilpi" <kjkilpi gmail.com> writes:

On Thu, 22 Feb 2007 09:49:23 +0200, kris <foo bar.com> wrote:
[snip]
 9) The Fake dependencies cause the linker to pick up and bind whatever  
 module happens to satisfy its need for the typeinfo resolution. In this  
 case, the linker sees Core.obj with a char[][] decl exposed, so it says  
 "hey, this is the place!" and binds it. Along with everything else that  
 Core.obj actually requires.

 10) The linker is entirely wrong, but you can't really blame it since  
 the char[][] decl is scattered throughout the library modules. It thinks  
 it gets the /right/ one, but in fact it could have chosen *any* of  
 them. This is now getting to the heart of the problem.

 11) If there was only one exposed decl for char[][], e.g. like int[],  
 there would be no problem. In fact you can see all the prepackaged  
 typeinfo bound to any D executable. There's lots of them. However,  
 because the compiler injects this typeinfo into a variety of objects  
 (apparently wherever char[][] is used), then the linker is boondoggled.

 12) If the linker were smart, and could link segments instead of entire  
 object modules, this would still be OK (a segment is an isolated part of  
 the object module). But the linker is not smart. It was written to be  
 fast, in pure assembler, decades ago.

[snip]

As long as the linker will operate at the .obj file level, the linker will  
pull in some bloat to the executable, in practice. The question is how  
much the executable will be bloated.

And if the compiler generates false dependencies, the size of the bloat  
will enlarge.

So, a solution would be a new linker operating at the section level. (Not  
necessarily *the* solution, but *a* solution.)

Oh, the linker was written in assembly, how hardcore. :) I don't think  
there's much of a point in writing a new linker (that is, if someone will  
do that) in assembly... not that anyone was considering using assembly...  
<g> If the linking times were a bit (or two) slower (because of more  
complex algorithm), I think it would be okay for a lot of people (all of  
them?). (If linking times become an issue, for example when building debug  
executables, the old linker could be used for that.) Hmm, I'm wondering  
how much slower the current linker would be if it had been written in  
C/C++/D instead of assembly. I mean, today's processors are so much faster  
than a decade ago, and hard disks have not got any faster (well, not  
significantly). Usually the hard disk is the bottleneck, not the processor.

Feb 22 2007

janderson <askme me.com> writes:

kris wrote:
[snip]
 7) These fake dependencies are responsible for, in this case, the entire 
 "locale" package to be bound to the example app, resulting in a 350% 
 increase in size.

[snip]

Is it not possible to have a tool which strips the executable of dead 
code after it's finished, using an external tool?

-Joel

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

janderson wrote:
 kris wrote:
 [snip]
 7) These fake dependencies are responsible for, in this case, the 
 entire "locale" package to be bound to the example app, resulting in a 
 350% increase in size.

 [snip]
 
 Is it not possible to have a tool which strips the executable of dead 
 code after it's finished, using an external tool?

Presumably this would leave static constructors/destructors intact? If 
not, you'll have some problems with code that depends on them.

Now, assuming you do: tango.text.locale.Core contains 3 static 
constructors and a static destructor. The static destructor sets some 
variables to null, but the static constructors do a bit more. One of 
them directly allocates some class instances. That pulls in the vtable 
for that class, which in turn pulls in all virtual member functions. 
Those instantiate at least two other classes (not counting Exception)...

This kind of stuff can easily cascade into a lot of code, all because a 
single module got pulled in and its static constructors get called.

So all is not as simple as it seems, unfortunately. To get this 
right[1], you'd basically have to do the dependency analysis all over 
again. You might as well have the linker get it right the first time...
(IMHO, to get it to do that without requiring a special linker and/or 
object format, the compiler needs to somehow make sure the linker won't 
pick the wrong object file to link to)


[1]: i.e. don't include unnecessary code, but also include all code that 
_is_ necessary.

Feb 22 2007

Sean Kelly <sean f4.ca> writes:

kris wrote:
 
 6) The dependency that /does/ cause the problem is one generated by the 
 D compiler itself. It generates and injects false 'dependencies' across 
 object modules. Specifically, that's effectively how the linker treats 
 them.

...
 9) The Fake dependencies cause the linker to pick up and bind whatever 
 module happens to satisfy its need for the typeinfo resolution. In this 
 case, the linker sees Core.obj with a char[][] decl exposed, so it says 
 "hey, this is the place!" and binds it. Along with everything else that 
 Core.obj actually requires.

This is the crux of the problem.  In C/C++, problem areas can typically 
be identified and addressed.  In D however, the problem areas are 
related to "hidden" data and may manifest differently for different 
applications.  They can still be identified and addressed, but the 
process is far more brittle, as any code change can have cascading 
effects on application size.

Still, I don't entirely understand why this appears to not be an issue 
using Build, which has historically had bloat issues in some cases.  Was 
it just luck, or do things actually change when objects are stored in a 
library as opposed to not?

 Case in point: you have to strip the library down by hand, and very very 
 carefully sift through the symbols and literally hundreds of library 
 builds until you finally get lucky enough to stumble over the problem.
 
 Walter asserts that the linker can be tricked into doing the right 
 thing. This seems to show a lack of understanding on his part about the 
 problem and the manner in which the lib and linker operate.

So far, I see two options for using Win32 libraries: the old way, which 
created relatively lean EXEs but had link errors for template code, or 
the new way, which requires the laborious process outlined above.

Interestingly, eschewing libraries in favor of object-level linking has 
always worked and doesn't seem to exhibit either of the above problems 
(as I mentioned above).  As much as people have been pushing for working 
D libraries, given the above alternatives I'm somewhat inclined to stick 
with Build unless I'm integrating with a C application.

By the same token none of these problems appear to have ever existed on 
Linux, be it because of the ELF format, the 'ld' linker, or some other 
confluence of planetary alignment and sheer luck.  Can anyone confirm 
that this is indeed true?

 As was pointed out to me, OMF librarians actually use a two-level 
 hashmap to index the library entries. This is used by the linker to 
 locate missing symbols. I think it's clear that this is not a linear 
 lookup mechanism as had been claimed, and this is borne out by experiments 
 that show the process cannot be controlled, and the linker cannot be 
 faked in a usable or dependable manner.

This may not be true of optlink however.  I suspect the hashmap is 
probably more typical of linkers that do segment-level linking?  There 
seems little point in the complexity otherwise.

 I feel it important to point out that this powerful I18N package is, in 
 no way, at fault here. The D compiler simply injected the wrong symbol 
 into the wrong module at the wrong time, in the wrong order, and the 
 result is that this package gets linked when it's not used or imported 
 in any fashion by the code design. Instead, the dependency is created 
 entirely by the compiler. That's a problem. It is a big problem. And it 
 is a problem every D developer will face, at some point, when using DM 
 tools. But it is not a problem rooted in the Tango code.

Agreed.  Tango was merely one of the first to encounter it because it's 
one of the first "large" D libraries.


Sean

Feb 22 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Sean Kelly wrote:
 Still, I don't entirely understand why this appears to not be an issue 
 using Build, which has historically had bloat issues in some cases.  Was 
 it just luck, or do things actually change when objects are stored in a 
 library as opposed to not?

Doesn't Build only link together the object files for modules that are 
imported at any point in the application?
This problem is that the locale.Core module isn't used by the program 
and is still linked in.
But AFAIK Build wouldn't add Core.obj to the link command line if the 
module isn't imported from any module it compiled.
The linker can't pick the wrong object file to link in if it only 
considers "right" ones...

So the problem here is closely related to library usage, in particular to 
the fact that libraries can contain object files that aren't needed for 
a particular program.

Feb 22 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 As was pointed out to me, OMF librarians actually use a two-level 
 hashmap to index the library entries. This is used by the linker to 
 locate missing symbols. I think it's clear that this is not a linear 
 lookup mechanism as had been claimed, and this is borne out by experiments 
 that show the process cannot be controlled, and the linker cannot be 
 faked in a usable or dependable manner.

The librarian takes a list of .obj files, and concatenates them 
together. It appends a dictionary at the end. The dictionary is a 
(ridiculously complex, but that's irrelevant here) associative array 
that can be thought of as being equivalent to:

	ObjectModule[SymbolName] dictionary;

The librarian reads the .obj files in the order in which they are 
presented to the librarian. Each .obj file is parsed, and the public 
names in it are inserted into the dictionary like:

	dictionary[publicname] = objectmodule;

Note that there can be only a 1:1 correspondence between publicnames and 
objectmodules. If a publicname is already in the dictionary, lib issues 
an error and quits.

COMDAT names are also inserted into the dictionary *unless they are 
already in there*, in which case they are ignored.

Hence, only the first COMDAT name is inserted. The rest are ignored. The 
hashmap lookup algorithm has ZERO effect on which object module is 
pulled in, because there is (and can only be) a 1:1 mapping. There is no 
way it can arbitrarily pick a different object module.

The process can be controlled by setting the order in which object 
modules are presented to the library.
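
As a rough model of the insertion rules just described (written in current D, and not the actual lib source), the sketch below builds the dictionary in presentation order: a duplicate public name is an error, while a COMDAT name is recorded only the first time it is seen. The `ObjectModule` struct, the `buildDictionary` helper, and the `"TypeInfo(char[][])"` label are stand-ins invented for illustration.

```d
import std.stdio;

// Hypothetical model of one object file handed to the librarian.
struct ObjectModule
{
    string name;
    string[] publics;  // ordinary public names, must be unique in the library
    string[] comdats;  // COMDAT names, duplicates are tolerated
}

// Build the symbol -> module-name dictionary in presentation order.
string[string] buildDictionary(ObjectModule[] modules)
{
    string[string] dictionary;
    foreach (m; modules)
    {
        foreach (sym; m.publics)
        {
            if (sym in dictionary)
                throw new Exception("duplicate public symbol: " ~ sym);
            dictionary[sym] = m.name;
        }
        foreach (sym; m.comdats)
        {
            // Only the first module defining a COMDAT is recorded;
            // later definitions are silently ignored.
            if (sym !in dictionary)
                dictionary[sym] = m.name;
        }
    }
    return dictionary;
}

void main()
{
    auto a = ObjectModule("a", ["aaa", "aaa2"], ["TypeInfo(char[][])"]);
    auto b = ObjectModule("b", ["bbb", "bbb2"], ["TypeInfo(char[][])"]);

    writeln(buildDictionary([a, b])["TypeInfo(char[][])"]); // prints "a"
    writeln(buildDictionary([b, a])["TypeInfo(char[][])"]); // prints "b"
}
```

Swapping the order of the two modules changes which one owns the COMDAT entry, which is the only lever the presentation order provides.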

What cannot be controlled is the order in which the linker visits 
unresolved names trying to find them, i.e. if A and B are unresolved, it 
cannot be controlled whether A is looked up first, or B is looked up 
first. That means, if you have two modules M1 and M2, and COMDATs A and B:

---------M1------------
A
C
---------M2------------
A
B
-----------------------

then if B is looked up first, the resulting exe will have only M2 linked 
in. If A is looked up first, then both M1 and M2 will be in the executable.

Feb 23 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Walter Bright wrote:
 COMDAT names are also inserted into the dictionary *unless they are 
 already in there*, in which case they are ignored.
 
 Hence, only the first COMDAT name is inserted. The rest are ignored. The 
 hashmap lookup algorithm has ZERO effect on which object module is 
 pulled in, because there is (and can only be) a 1:1 mapping. There is no 
 way it can arbitrarily pick a different object module.
 
 The process can be controlled by setting the order in which object 
 modules are presented to the library.

When doing a lookup while linking, does it at least check other .obj 
files first before escalating to pulling in library .objs?

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

Frits van Bommel wrote:
 When doing a lookup while linking, does it at least check other .obj 
 files first before escalating to pulling in library .objs?

The linker first puts together all the explicitly listed object files.
Then,

	foreach (unresolved name)
	{
		if (name is in the library)
		{
			read in that object module from the library
			add it to the build
			use any pulled in publics/comdats to resolve
			any remaining unresolved symbols
			add any new unresolved names to unresolved name list
		}
		else
			issue error message
	}

It does just what you'd expect it to.
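
For anyone who wants to experiment, the loop above might be modelled in current D roughly as follows. This is a simplified sketch, not optlink's code: the `LibModule` struct and `resolve` function are hypothetical, and the dictionary is written out by hand as it would look for the M1/M2 example from earlier in the thread, where M1 was presented to the librarian first.

```d
import std.stdio;

// Hypothetical model of a module stored in the library.
struct LibModule
{
    string[] defines;  // publics and COMDATs the module provides
    string[] needs;    // names the module itself leaves unresolved
}

// Returns the library modules pulled in, in the order they were read.
string[] resolve(string[] unresolved, LibModule[string] library,
                 string[string] dictionary)
{
    bool[string] defined;
    string[] linked;
    auto worklist = unresolved.dup;

    // The worklist grows as pulled-in modules add their own needs.
    for (size_t i = 0; i < worklist.length; ++i)
    {
        auto name = worklist[i];
        if (name in defined)
            continue;  // already satisfied by a module read earlier
        auto owner = name in dictionary;
        if (owner is null)
            throw new Exception("undefined symbol: " ~ name);
        auto mod = library[*owner];
        linked ~= *owner;
        foreach (sym; mod.defines)
            defined[sym] = true;
        worklist ~= mod.needs;
    }
    return linked;
}

void main()
{
    LibModule[string] library = [
        "M1": LibModule(["A", "C"], null),
        "M2": LibModule(["A", "B"], null)
    ];
    // Only the first definition of COMDAT A made it into the dictionary.
    string[string] dictionary = ["A": "M1", "C": "M1", "B": "M2"];

    writeln(resolve(["B", "A"], library, dictionary)); // prints ["M2"]
    writeln(resolve(["A", "B"], library, dictionary)); // prints ["M1", "M2"]
}
```

The two calls differ only in which unresolved name is visited first, and that alone decides whether M1 ends up in the build.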

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 
 Note that there can be only a 1:1 correspondence between publicnames and 
 objectmodules. If a publicname is already in the dictionary, lib issues 
 an error and quits.

So how are TypeInfo definitions resolved?  I'd think it would be pretty 
common to have more than one of the same TypeInfo publicname per library.


Sean

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Walter Bright wrote:
 Note that there can be only a 1:1 correspondence between publicnames 
 and objectmodules. If a publicname is already in the dictionary, lib 
 issues an error and quits.

 
 So how are TypeInfo definitions resolved?  I'd think it would be pretty 
 common to have more than one of the same TypeInfo publicname per library.

if (name in library.dictionary)
	readin(library.dictionary[name])

As I said, only the FIRST ONE of the comdat names makes it into 
dictionary[], subsequent ones are not.

Feb 23 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 Note that there can be only a 1:1 correspondence between publicnames 
 and objectmodules. If a publicname is already in the dictionary, lib 
 issues an error and quits.

 So how are TypeInfo definitions resolved?  I'd think it would be 
 pretty common to have more than one of the same TypeInfo publicname 
 per library.

 
 if (name in library.dictionary)
     readin(library.dictionary[name])
 
 As I said, only the FIRST ONE of the comdat names makes it into 
 dictionary[], subsequent ones are not.

Oh, so TypeInfo are stored in COMDATs.  Makes perfect sense.  Thanks!

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the compiler 
 doesn't generate the char[][] TypeInfo is because an import defines 
 it. The compiler does work on the assumption that if a module is 
 imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is /not/ 
 imported by anything. I spelled that out clearly before. You're making 
 an assumption it is, somehow ... well, it is not. You can deduce that 
 from the fact that the link succeeds perfectly well without that package 
 existing in the library.

Then the typeinfo for char[][] is being generated by another module. I 
suggest it would be helpful to find that module. Grep is a handy tool 
for finding it.

What led me to assume that the typeinfo for char[][] was *only* in 
core.obj was your statement that core got linked in regardless of where 
in the lib it was. This doesn't make sense in the light of it being also 
in some other .obj file.

I suggest identifying which .obj files have the typeinfo marked as 
extern, and which have it defined as a COMDAT.

Feb 23 2007

kris <foo bar.com> writes:

Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the compiler 
 doesn't generate the char[][] TypeInfo is because an import defines 
 it. The compiler does work on the assumption that if a module is 
 imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. You're 
 making an assumption it is, somehow ... well, it is not. You can 
 deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.


After taking a much needed break from this, I'm having another bash at 
it. The locale package highlighted in the last bout has been through a 
number of changes, and thus the behaviour will now be somewhat different 
than before.

For a refresher on this issue, here's an overview: 
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

The upshot is that reordering the content as presented to the librarian 
now has an effect on the resultant binary size. This tells me two things:

1) there were more than just the one compiler-generated unresolved 
symbol in the first case, though I did not spot it after many 
painstaking hours of frustrating effort. In fact, there may well have 
been a number of vaguely intertwined dependencies spread throughout the 
library, due entirely to these compiler-generated symbols.

2) there is no feasible manner in which a developer can control how lib 
contents are linked while the compiler continues to generate duplicate 
symbols under the covers. The fact that it does this on a regular basis 
simply serves to highlight a critical problem.


 Then the typeinfo for char[][] is being generated by another module. I
 suggest it would be helpful to find that module. Grep is a handy tool
 for finding it.

What would be the point? These symbols are compiler generated, and exist 
in a variety of places because of that fact alone. Lest we forget, we're 
not even talking about just one symbolic name ~ the compiler generates 
duplicate symbols for any typeinfo that does not match the pre-packaged 
list (such as char[][]). The end result is a potential rats-nest of 
duplicate symbols, creating a maze of intricate and fragile dependencies 
across the lib itself.

To put things into perspective, I have absolutely no way of knowing what 
the real dependencies within this library actually are. As such, I'd 
have to describe the situation as being "out of control". This is wholly 
due to the duplicate compiler-generated symbols.

Mar 07 2007

kris <foo bar.com> writes:

kris wrote:
 Walter Bright wrote:

 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an import 
 defines it. The compiler does work on the assumption that if a 
 module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. You're 
 making an assumption it is, somehow ... well, it is not. You can 
 deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.

 After taking a much needed break from this, I'm having another bash at 
 it. The locale package highlighted in the last bout has been through a 
 number of changes, and thus the behaviour will now be somewhat different 
 than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the librarian 
 now has an effect on the resultant binary size. This tells me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well have 
 been a number of vaguely intertwined dependencies spread throughout the 
 library, due entirely to these compiler-generated symbols.

 2) there is no feasible manner in which a developer can control how lib 
 contents are linked while the compiler continues to generate duplicate 
 symbols under the covers. The fact that it does this on a regular basis 
 simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and exist 
 in a variety of places because of that fact alone. Lest we forget, we're 
 not even talking about just one symbolic name ~ the compiler generates 
 duplicate symbols for any typeinfo that does not match the pre-packaged 
 list (such as char[][]). The end result is a potential rats-nest of 
 duplicate symbols, creating a maze of intricate and fragile dependencies 
 across the lib itself.

 To put things into perspective, I have absolutely no way of knowing what 
 the real dependencies within this library actually are. As such, I'd 
 have to describe the situation as being "out of control". This is wholly 
 due to the duplicate compiler-generated symbols.

In fact, it is so brittle and fragile that I now cannot reproduce what's 
noted above. Back to square one where it does not matter if Core.obj is 
listed first or last in the lib "response file" -- it always gets 
linked, resulting in a wildly bloated binary.

With Core.obj removed from the lib, the resultant binary is still 60KB 
bigger than it should be, so the problem simply moves to a different bad 
link-chain instead.

Mar 07 2007

Pragma <ericanderton yahoo.removeme.com> writes:

kris wrote:
 kris wrote:
 Walter Bright wrote:

 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an 
 import defines it. The compiler does work on the assumption that if 
 a module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. 
 You're making an assumption it is, somehow ... well, it is not. You 
 can deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.

 After taking a much needed break from this, I'm having another bash at 
 it. The locale package highlighted in the last bout has been through a 
 number of changes, and thus the behaviour will now be somewhat 
 different than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the 
 librarian now has an effect on the resultant binary size. This tells 
 me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well have 
 been a number of vaguely intertwined dependencies spread throughout 
 the library, due entirely to these compiler-generated symbols.

 2) there is no feasible manner in which a developer can control how 
 lib contents are linked while the compiler continues to generate 
 duplicate symbols under the covers. The fact that it does this on a 
 regular basis simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and 
 exist in a variety of places because of that fact alone. Lest we 
 forget, we're not even talking about just one symbolic name ~ the 
 compiler generates duplicate symbols for any typeinfo that does not 
 match the pre-packaged list (such as char[][]). The end result is a 
 potential rats-nest of duplicate symbols, creating a maze of intricate 
 and fragile dependencies across the lib itself.

 To put things into perspective, I have absolutely no way of knowing 
 what the real dependencies within this library actually are. As such, 
 I'd have to describe the situation as being "out of control". This is 
 wholly due to the duplicate compiler-generated symbols.

 In fact, it is so brittle and fragile that I now cannot reproduce what's 
 noted above. Back to square one where it does not matter if Core.obj is 
 listed first or last in the lib "response file" -- it always gets 
 linked, resulting in a wildly bloated binary.

 With Core.obj removed from the lib, the resultant binary is still 60KB 
 bigger than it should be, so the problem simply moves to a different bad 
 link-chain instead.

I made a pass at trying to reproduce this the last time out, with no
success - that got swept under the rug as I quickly got caught up in a
handful of other things.

If only there was some way to diagnose what the linker was doing, as it
happened, we could easily map out what the suspect dependencies are.

-- 
- EricAnderton at yahoo

Mar 08 2007

Sean Kelly <sean f4.ca> writes:

kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an import 
 defines it. The compiler does work on the assumption that if a 
 module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. You're 
 making an assumption it is, somehow ... well, it is not. You can 
 deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.

 After taking a much needed break from this, I'm having another bash at 
 it. The locale package highlighted in the last bout has been through a 
 number of changes, and thus the behaviour will now be somewhat different 
 than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the librarian 
 now has an effect on the resultant binary size. This tells me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well have 
 been a number of vaguely intertwined dependencies spread throughout the 
 library, due entirely to these compiler-generated symbols.

 2) there is no feasible manner in which a developer can control how lib 
 contents are linked while the compiler continues to generate duplicate 
 symbols under the covers. The fact that it does this on a regular basis 
 simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and exist 
 in a variety of places because of that fact alone. Lest we forget, we're 
 not even talking about just one symbolic name ~ the compiler generates 
 duplicate symbols for any typeinfo that does not match the pre-packaged 
 list (such as char[][]). The end result is a potential rats-nest of 
 duplicate symbols, creating a maze of intricate and fragile dependencies 
 across the lib itself.

 To put things into perspective, I have absolutely no way of knowing what 
 the real dependencies within this library actually are. As such, I'd 
 have to describe the situation as being "out of control". This is wholly 
 due to the duplicate compiler-generated symbols.

It's a long-term proposition, but what about delaying the generation of 
TypeInfo until link-time?  The MS linker already optionally performs 
code generation to allow for more optimized executables, so I assume 
such a thing is definitely possible.  It would obviously require a new 
linker, but I don't see any other way to address these "silent 
dependency" issues, etc.  If I had the time I'd try it out, but as 
things stand there's no way that could happen before September.

Sean

Mar 07 2007

Carlos Santander <csantander619 gmail.com> writes:

Sean Kelly wrote:
 
 It's a long-term proposition, but what about delaying the generation of 
 TypeInfo until link-time?  The MS linker already optionally performs 
 code generation to allow for more optimized executables, so I assume 
 such a thing is definitely possible.  It would obviously require a new 
 linker, but I don't see any other way to address these "silent 
 dependency" issues, etc.  If I had the time I'd try it out, but as 
 things stand there's no way that could happen before September.
 
 
 Sean

Unless I'm missing something, I don't think a new linker would be required. The 
compiler could just check if -c is passed. If it is, don't generate those 
TypeInfos, otherwise, do.

-- 
Carlos Santander Bernal

Mar 08 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

Carlos Santander wrote:
 Sean Kelly wrote:
 It's a long-term proposition, but what about delaying the generation
 of TypeInfo until link-time?  The MS linker already optionally
 performs code generation to allow for more optimized executables, so I
 assume such a thing is definitely possible.  It would obviously
 require a new linker, but I don't see any other way to address these
 "silent dependency" issues, etc.  If I had the time I'd try it out,
 but as things stand there's no way that could happen before September.


 Sean

 
 Unless I'm missing something, I don't think a new linker would be
 required. The compiler could just check if -c is passed. If it is, don't
 generate those TypeInfos, otherwise, do.

What about build utilities that compile each module separately (using
-c), and then invoke the linker directly?

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Mar 08 2007

Pragma <ericanderton yahoo.removeme.com> writes:

Sean Kelly wrote:
 kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an 
 import defines it. The compiler does work on the assumption that if 
 a module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. 
 You're making an assumption it is, somehow ... well, it is not. You 
 can deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.

 After taking a much needed break from this, I'm having another bash at 
 it. The locale package highlighted in the last bout has been through a 
 number of changes, and thus the behaviour will now be somewhat 
 different than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the 
 librarian now has an effect on the resultant binary size. This tells 
 me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well have 
 been a number of vaguely intertwined dependencies spread throughout 
 the library, due entirely to these compiler-generated symbols.

 2) there is no feasible manner in which a developer can control how 
 lib contents are linked while the compiler continues to generate 
 duplicate symbols under the covers. The fact that it does this on a 
 regular basis simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and 
 exist in a variety of places because of that fact alone. Lest we 
 forget, we're not even talking about just one symbolic name ~ the 
 compiler generates duplicate symbols for any typeinfo that does not 
 match the pre-packaged list (such as char[][]). The end result is a 
 potential rats-nest of duplicate symbols, creating a maze of intricate 
 and fragile dependencies across the lib itself.

 To put things into perspective, I have absolutely no way of knowing 
 what the real dependencies within this library actually are. As such, 
 I'd have to describe the situation as being "out of control". This is 
 wholly due to the duplicate compiler-generated symbols.

 It's a long-term proposition, but what about delaying the generation of 
 TypeInfo until link-time?  The MS linker already optionally performs 
 code generation to allow for more optimized executables, so I assume 
 such a thing is definitely possible.  It would obviously require a new 
 linker, but I don't see any other way to address these "silent 
 dependency" issues, etc.  If I had the time I'd try it out, but as 
 things stand there's no way that could happen before September.

I was going to say: new linker == our problem.  I'm in the same boat on
the no time thing, but maybe we can get a group effort going later on.
If for no other reason, we could investigate what a native 64-bit
toolchain for D on Windows might look like.

-- 
- EricAnderton at yahoo

Mar 08 2007

Don Clugston <dac nospam.com.au> writes:

Pragma wrote:
 Sean Kelly wrote:
 kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an 
 import defines it. The compiler does work on the assumption that 
 if a module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. 
 You're making an assumption it is, somehow ... well, it is not. You 
 can deduce that from the fact that the link succeeds perfectly well 
 without that package existing in the library.

 After taking a much needed break from this, I'm having another bash 
 at it. The locale package highlighted in the last bout has been 
 through a number of changes, and thus the behaviour will now be 
 somewhat different than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the 
 librarian now has an effect on the resultant binary size. This tells 
 me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well have 
 been a number of vaguely intertwined dependencies spread throughout 
 the library, due entirely to these compiler-generated symbols.

 2) there is no feasible manner in which a developer can control how 
 lib contents are linked while the compiler continues to generate 
 duplicate symbols under the covers. The fact that it does this on a 
 regular basis simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and 
 exist in a variety of places because of that fact alone. Lest we 
 forget, we're not even talking about just one symbolic name ~ the 
 compiler generates duplicate symbols for any typeinfo that does not 
 match the pre-packaged list (such as char[][]). The end result is a 
 potential rats-nest of duplicate symbols, creating a maze of 
 intricate and fragile dependencies across the lib itself.

 To put things into perspective, I have absolutely no way of knowing 
 what the real dependencies within this library actually are. As such, 
 I'd have to describe the situation as being "out of control". This is 
 wholly due to the duplicate compiler-generated symbols.

 It's a long-term proposition, but what about delaying the generation 
 of TypeInfo until link-time?  The MS linker already optionally 
 performs code generation to allow for more optimized executables, so I 
 assume such a thing is definitely possible.  It would obviously 
 require a new linker, but I don't see any other way to address these 
 "silent dependency" issues, etc.  If I had the time I'd try it out, 
 but as things stand there's no way that could happen before September.

 I was going to say: new linker == our problem.  I'm in the same boat on 
 the no time thing, but maybe we can get a group effort going later on.  
 If for no other reason, we could investigate what a native 64-bit 
 toolchain for D on Windows might look like.

I've been wondering how far your work with DDL goes towards writing a 
linker? Certainly the work you've done with making sense of the OMF and 
ELF specs, and parsing the obj files, seems to be a huge chunk of the 
task. A lot of code would be common to both tasks, surely?

Mar 08 2007

Pragma <ericanderton yahoo.removeme.com> writes:

Don Clugston wrote:
 Pragma wrote:
 Sean Kelly wrote:
 kris wrote:
 Walter Bright wrote:
 kris wrote:

 Walter Bright wrote:

 Sure, but in this particular case, it seems that "core" is being 
 imported without referencing code in it. The only reason the 
 compiler doesn't generate the char[][] TypeInfo is because an 
 import defines it. The compiler does work on the assumption that 
 if a module is imported, then it will also be linked in.

 This core module, and the entire locale package it resides in, is 
 /not/ imported by anything. I spelled that out clearly before. 
 You're making an assumption it is, somehow ... well, it is not. 
 You can deduce that from the fact that the link succeeds perfectly 
 well without that package existing in the library.

 After taking a much needed break from this, I'm having another bash 
 at it. The locale package highlighted in the last bout has been 
 through a number of changes, and thus the behaviour will now be 
 somewhat different than before.

 For a refresher on this issue, here's an overview: 
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49257

 The upshot is that reordering the content as presented to the 
 librarian now has an effect on the resultant binary size. This tells 
 me two things:

 1) there were more than just the one compiler-generated unresolved 
 symbol in the first case, though I did not spot it after many 
 painstaking hours of frustrating effort. In fact, there may well 
 have been a number of vaguely intertwined dependencies spread 
 throughout the library, due entirely to these compiler-generated 
 symbols.

 2) there is no feasible manner in which a developer can control how 
 lib contents are linked while the compiler continues to generate 
 duplicate symbols under the covers. The fact that it does this on a 
 regular basis simply serves to highlight a critical problem.

  > Then the typeinfo for char[][] is being generated by another module. I
  > suggest it would be helpful to find that module. Grep is a handy tool
  > for finding it.

 What would be the point? These symbols are compiler generated, and 
 exist in a variety of places because of that fact alone. Lest we 
 forget, we're not even talking about just one symbolic name ~ the 
 compiler generates duplicate symbols for any typeinfo that does not 
 match the pre-packaged list (such as char[][]). The end result is a 
 potential rats-nest of duplicate symbols, creating a maze of 
 intricate and fragile dependencies across the lib itself.

 To put things into perspective, I have absolutely no way of knowing 
 what the real dependencies within this library actually are. As 
 such, I'd have to describe the situation as being "out of control". 
 This is wholly due to the duplicate compiler-generated symbols.

 It's a long-term proposition, but what about delaying the generation 
 of TypeInfo until link-time?  The MS linker already optionally 
 performs code generation to allow for more optimized executables, so 
 I assume such a thing is definitely possible.  It would obviously 
 require a new linker, but I don't see any other way to address these 
 "silent dependency" issues, etc.  If I had the time I'd try it out, 
 but as things stand there's no way that could happen before September.

 I was going to say: new linker == our problem.  I'm in the same boat 
 on the no time thing, but maybe we can get a group effort going later 
 on.  If for no other reason, we could investigate what a native 64-bit 
 toolchain for D on Windows might look like.

 I've been wondering how far your work with DDL goes towards writing a 
 linker? Certainly the work you've done with making sense of the OMF and 
 ELF specs, and parsing the obj files, seems to be a huge chunk of the 
 task. A lot of code would be common to both tasks, surely?

Well, the OMF loader needs some polish and some subtle refactoring
(read: Tango) but I have thought the same myself.  The big stumbling
block (for me at least) is understanding the "E" in "PE/COFF" for
emitting .exe and .dll files.  The intermediate part, matching
dependencies, is really kind of simple only up until you get into
various forms of optimization: culling unused segments & symbols, deep
dependency analysis, *fast* linking, minimize size, optimize for speed,
etc.  Typing "link /?" in your console will quickly cast a very large
shadow over this area.

ELF is another issue.  Flectioned is light-years ahead of DDL on that
front.  But combined with a (future) upgraded DDL, we'll pretty much
have "libtools for D".

Now if we had a library that allowed for reading *and* writing of COFF
files per 100% of the specification, then I'd imagine that this wouldn't
be too far out of reach.

-- 
- EricAnderton at yahoo

Mar 08 2007

kris <foo bar.com> writes:

Walter Bright wrote:

 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library

Just to satisfy your stance I tried this; guess what? It has no effect 
whatsoever, since you /cannot/ dictate the order in which the decls will 
be inspected in advance.

I hope this captures the sheer absurdity of trying to "outwit" the 
librarian/linker in the first place?

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 
 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library

 
 Just to satisfy your stance I tried this; guess what? It has no effect 
 whatsoever, since you /cannot/ dictate the order in which the decls will 
 be inspected in advance.

Then there's something else going on, i.e. another symbol is being 
referenced that is resolved by core.obj.

Feb 23 2007

Walter Bright <newshound digitalmars.com> writes:

kris wrote:
 Walter Bright wrote:
 
 3) create a separate module that defines the relevant typeinfo's, and 
 put that first in the library

 
 Just to satisfy your stance I tried this; guess what? It has no effect 
 whatsoever, since you /cannot/ dictate the order in which the decls will 
 be inspected in advance.

Did you verify (using obj2asm) that the separate module actually did 
define the typeinfo's?

Feb 23 2007

John Reimer <terminal.node gmail.com> writes:

On Wed, 21 Feb 2007 01:47:53 -0800, kris wrote:

 kris wrote:
 Walter Bright wrote:
 
 kris wrote:

 Walter Bright wrote:

 For a quick & dirty test, try reversing the order the object files 
 are presented to lib.



 there's a couple of hundred :-D



 Do the ones that were giving the undefined reference before the 
 changes to lib.

 
 
 The lib itself is actually built via a single D module, which imports 
 all others. This is then given to Build to construct the library. Thus I 
 don't have direct control over the ordering. Having said that, it 
 appears Build does something different when the modules are reordered; 
 Likely changing the order in which modules are presented to the lib.
 
 By moving things around, I see a change in size on the target executable 
 between -4kb to +5kb
 

 
 I've been messing with the response file handed to the librarian (via 
 lib  foo); moving modules around here and there, reordering big chunks 
 etc. Have yet to see a notable change in the resulting exe after 
 relinking against each lib version.


Is build really a reliable means of testing this?  I mean, it's produced
unusual differences in binary size in the past (granted not of that
magnitude).  Of course, this is a different case too, in which a library is
being created.

Just out of curiosity, does rebuild do the same thing?

-JJR

Feb 21 2007

John Reimer <terminal.node gmail.com> writes:

On Wed, 21 Feb 2007 15:56:31 +0000, John Reimer wrote:

 On Wed, 21 Feb 2007 01:47:53 -0800, kris wrote:
 
 kris wrote:
 Walter Bright wrote:
 
 kris wrote:

 Walter Bright wrote:

 For a quick & dirty test, try reversing the order the object files 
 are presented to lib.



 there's a couple of hundred :-D



 Do the ones that were giving the undefined reference before the 
 changes to lib.

 
 
 The lib itself is actually built via a single D module, which imports 
 all others. This is then given to Build to construct the library. Thus I 
 don't have direct control over the ordering. Having said that, it 
 appears Build does something different when the modules are reordered; 
 Likely changing the order in which modules are presented to the lib.
 
 By moving things around, I see a change in size on the target executable 
 between -4kb to +5kb
 

 
 I've been messing with the response file handed to the librarian (via 
 lib  foo); moving modules around here and there, reordering big chunks 
 etc. Have yet to see a notable change in the resulting exe after 
 relinking against each lib version.

 
 
 Is build really a reliable means of testing this?  I mean, it's produced
 unusual differences in binary size in the past (granted not of that
 magnitude).  Of course, this is a different case too, in which a library is
 being created.
 
 Just out of curiosity, does rebuild do the same thing?
 
 -JJR


I obviously misunderstood the whole issue here.  After reading the
responses, I begin to see the magnitude of the problem described.

-JJR

Feb 22 2007

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 It does, but increases the exe size of the first example from 180kb to 
 617kb!

  > 180kb is when compiled using build/rebuild/jake etc (no library) and
  > the 617kb is when using dmd+lib only. Same flags in both cases: none at all

 Let's say you have a template instance, TI. It is declared in two 
 modules, M1 and M2:

 -----------M1------------
 TI
 A
 -----------M2------------
 TI
 B
 -------------------------

 M1 also declares A, and M2 also declares B. Now, the linker is looking 
 to resolve TI, and the first one it finds is one in M1, and so links in 
 M1. Later on, it needs to resolve B, and so links in M2. The redundant 
 TI is discarded (because it's a COMDAT).

 However, suppose the program never references A, and A is a chunk of 
 code that pulls in lots of other bloat. This could make the executable 
 much larger than if, in resolving TI, it had picked M2 instead.

For some reason I thought an optimizing linker worked at a segment 
level, but I suppose that is not true for data in a library?  In other 
words, since libraries are indexed by module name, I suppose this means 
they are necessarily dealt with at module granularity instead?

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 For some reason I thought an optimizing linker worked at a segment 
 level, but I suppose that is not true for data in a library?  In other 
 words, since libraries are indexed by module name, I suppose this means 
 they are necessarily dealt with at module granularity instead?

The linker works at the .obj file level.
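
A minimal sketch of what .obj-level resolution implies, using the M1/M2/TI example from earlier in the thread. It is a toy model in Python, not OPTLINK's actual algorithm: the `link` function, the `BLOAT` module and the "earliest module in the library wins" rule are assumptions made for illustration.

```python
from collections import deque

def link(library, roots):
    """Return the names of the modules pulled in, in link order.

    library: ordered list of (module, defined_symbols, referenced_symbols).
    roots:   symbols the program itself leaves undefined.
    """
    first_definer = {}
    for module, defines, _ in library:
        for sym in defines:
            first_definer.setdefault(sym, module)  # earliest module wins
    contents = {module: (defines, refs) for module, defines, refs in library}

    linked, resolved, pending = [], set(), deque(roots)
    while pending:
        sym = pending.popleft()
        if sym in resolved:
            continue
        module = first_definer[sym]
        defines, refs = contents[module]
        linked.append(module)      # the whole .obj comes in, not just `sym`
        resolved.update(defines)   # a duplicate TI met later is simply discarded
        pending.extend(refs)       # everything the .obj references is now needed too
    return linked

# Hypothetical library: A drags in "bloat", and the program only needs TI and B.
M1 = ("M1", ["TI", "A"], ["bloat"])
M2 = ("M2", ["TI", "B"], [])
BLOAT = ("BLOAT", ["bloat"], [])

print(link([M1, M2, BLOAT], ["TI", "B"]))  # ['M1', 'M2', 'BLOAT']
print(link([M2, M1, BLOAT], ["TI", "B"]))  # ['M2']
```

In this model the program never uses A, yet presenting M1 first links A and everything A references; presenting M2 first avoids all of it.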

Feb 21 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 For some reason I thought an optimizing linker worked at a segment 
 level, but I suppose that is not true for data in a library?  In other 
 words, since libraries are indexed by module name, I suppose this 
 means they are necessarily dealt with at module granularity instead?

 
 The linker works at the .obj file level.

GNU ld seems to be perfectly happy working at the section level (with 
--gc-sections).

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Frits van Bommel wrote:
 GNU ld seems to be perfectly happy working at the section level (with 
 --gc-sections).

Yeah, well, try linking D programs with --gc-sections, and you'll get a 
crashing executable.

Feb 21 2007

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message 
news:erie2v$kad$2 digitalmars.com...
 Frits van Bommel wrote:
 GNU ld seems to be perfectly happy working at the section level 
 (with --gc-sections).

 Yeah, well, try linking D programs with --gc-sections, and you'll get a 
 crashing executable.



L.

Feb 21 2007

Walter Bright <newshound digitalmars.com> writes:

Lionello Lunesu wrote:
 "Walter Bright" <newshound digitalmars.com> wrote in message 
 news:erie2v$kad$2 digitalmars.com...
 Frits van Bommel wrote:
 GNU ld seems to be perfectly happy working at the section level 
 (with --gc-sections).

 Yeah, well, try linking D programs with --gc-sections, and you'll get a 
 crashing executable.

 


Yes, I know, and I'll probably implement them. But they are a hack. A 
much better solution would be a change to the ELF format to allow 
sections to be marked as "don't gc this section", but I doubt that'll 
happen.

Feb 21 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Walter Bright wrote:
 Frits van Bommel wrote:
 GNU ld seems to be perfectly happy working at the section level (with 
 --gc-sections).

 
 Yeah, well, try linking D programs with --gc-sections, and you'll get a 
 crashing executable.

Haven't had trouble with it so far, though I seem to recall reading about
some issues with exceptions. AFAIK that can be fixed by using an
appropriate linker script that KEEP()s the exception info, though I
haven't tried it since I haven't had that problem so far...
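
A rough sketch of why section-level garbage collection could drop something like exception info, and what KEEP() changes. This is a toy reachability sweep in Python, not GNU ld's implementation; the section names and the assumption that nothing refers to the exception tables by symbol are hypothetical.

```python
def gc_sections(references, entry, keep=()):
    """Return the sections that survive a reachability sweep.

    references: dict mapping each section to the sections it refers to.
    entry:      the section holding the entry point.
    keep:       sections treated as extra roots, in the spirit of KEEP().
    """
    live, stack = set(), [entry, *keep]
    while stack:
        section = stack.pop()
        if section not in live:
            live.add(section)
            stack.extend(references[section])
    return live

# Hypothetical layout: the tables are only found at run time, so no
# section refers to them and the sweep cannot see that they are needed.
SECTIONS = {
    ".text.main": [".text.helper"],
    ".text.helper": [],
    ".text.unused": [],
    ".eh_tables": [".text.helper"],
}

without_keep = gc_sections(SECTIONS, ".text.main")
with_keep = gc_sections(SECTIONS, ".text.main", keep=[".eh_tables"])
print(".eh_tables" in without_keep, ".eh_tables" in with_keep)  # False True
```

In both runs `.text.unused` is discarded, which is the size win; only the run with the extra root retains the tables.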

Is there anything else that breaks with --gc-sections?

Feb 21 2007

"Kristian Kilpi" <kjkilpi gmail.com> writes:

On Wed, 21 Feb 2007 11:00:44 +0200, Walter Bright  
<newshound digitalmars.com> wrote:

 It does, but increases the exe size of the first example from 180kb to  
 617kb!

  > 180kb is when compiled using build/rebuild/jake etc (no library) and
  > the 617kb is when using dmd+lib only. Same flags in both cases: none at all

 Let's say you have a template instance, TI. It is declared in two  
 modules, M1 and M2:

 -----------M1------------
 TI
 A
 -----------M2------------
 TI
 B
 -------------------------

 M1 also declares A, and M2 also declares B. Now, the linker is looking  
 to resolve TI, and the first one it finds is one in M1, and so links in  
 M1. Later on, it needs to resolve B, and so links in M2. The redundant  
 TI is discarded (because it's a COMDAT).

 However, suppose the program never references A, and A is a chunk of  
 code that pulls in lots of other bloat. This could make the executable  
 much larger than if, in resolving TI, it had picked M2 instead.

 You can control which module containing TI will be pulled in by the  
 linker to resolve TI, by specifying that module first to lib.exe.

 You can also put TI in a third module that has neither A nor B in it.  
 When compiling M1 and M2, import that third module, so TI won't be  
 generated for M1 or M2.

Here's a quick thought. (It's probably too impractical/absurd. ;) ) Could
template instances be put into their own separate modules? Then the
linker will find a module containing the template instance only, and no  
bloat will be pulled in with it. I don't know if this would require the  
compiler to generate separate, extra .obj files for template instances or  
something.
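
A minimal sketch of the effect this would have, assuming a toy linker that pulls in whole modules and resolves each symbol to the earliest module defining it. The model, the `TI_ONLY` and `BLOAT` module names and the `linked_modules` helper are illustrative assumptions, not how DMD or lib.exe actually work.

```python
from itertools import permutations

def linked_modules(library, roots):
    """Toy rule: whichever module first defines a symbol is linked whole."""
    first_definer = {}
    for module, defines, _ in library:
        for sym in defines:
            first_definer.setdefault(sym, module)
    defs_of = {module: defines for module, defines, _ in library}
    refs_of = {module: refs for module, _, refs in library}

    linked, resolved, pending = set(), set(), list(roots)
    while pending:
        sym = pending.pop()
        if sym in resolved:
            continue
        module = first_definer[sym]
        linked.add(module)
        resolved.update(defs_of[module])
        pending.extend(refs_of[module])  # a linked module's references are needed too
    return linked

# TI now lives only in its own module; M1 and M2 merely reference it.
LIBRARY = [
    ("M1", ["A"], ["TI", "bloat"]),
    ("M2", ["B"], ["TI"]),
    ("TI_ONLY", ["TI"], []),
    ("BLOAT", ["bloat"], []),
]

# Try every library ordering for a program that needs TI and B.
results = {
    frozenset(linked_modules(list(order), ["TI", "B"]))
    for order in permutations(LIBRARY)
}
print(sorted(map(sorted, results)))  # [['M2', 'TI_ONLY']]
```

With a single definition of TI, the outcome no longer depends on the order in which modules are handed to the librarian, and M1 with its bloat is never pulled in.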

Feb 21 2007

Pragma <ericanderton yahoo.removeme.com> writes:

Kristian Kilpi wrote:
 On Wed, 21 Feb 2007 11:00:44 +0200, Walter Bright 
 <newshound digitalmars.com> wrote:

 It does, but increases the exe size of the first example from 180kb 
 to 617kb!

  > 180kb is when compiled using build/rebuild/jake etc (no library) and
  > the 617kb is when using dmd+lib only. Same flags in both cases: none at all

 Let's say you have a template instance, TI. It is declared in two 
 modules, M1 and M2:

 -----------M1------------
 TI
 A
 -----------M2------------
 TI
 B
 -------------------------

 M1 also declares A, and M2 also declares B. Now, the linker is looking 
 to resolve TI, and the first one it finds is one in M1, and so links 
 in M1. Later on, it needs to resolve B, and so links in M2. The 
 redundant TI is discarded (because it's a COMDAT).

 However, suppose the program never references A, and A is a chunk of 
 code that pulls in lots of other bloat. This could make the executable 
 much larger than if, in resolving TI, it had picked M2 instead.

 You can control which module containing TI will be pulled in by the 
 linker to resolve TI, by specifying that module first to lib.exe.

 You can also put TI in a third module that has neither A nor B in it. 
 When compiling M1 and M2, import that third module, so TI won't be 
 generated for M1 or M2.

 Here's a quick thought. (It's probably too impractical/absurd. ;) )
 Could template instances be put into their own separate modules? Then
 the linker will find a module containing the template instance only, and 
 no bloat will be pulled in with it. I don't know if this would require 
 the compiler to generate separate, extra .obj files for template 
 instances or something.

Nice idea, but I'd rather see the librarian (optionally?) do this job instead.
It would avoid any complications for the existing toolchain by not introducing
any behavior that is radically different from other platforms (i.e. "foo.d"
==> "foo.obj" and "foo-t.obj").

Now if you're talking about breaking each and every COMDAT out into its own
.obj, then having the librarian do it is a must.  I can't imagine what my
workspace would look like otherwise.

Either way, all this involves the rather messy business of turning each COMDAT
fixup reference within an .obj file into an EXTERN.  I doubt that the DMD/DMC
backend would make this job easy (I could be wrong!), so again, putting the job
elsewhere (librarian) might be easier to maintain.

-- 
- EricAnderton at yahoo

Feb 21 2007

kris <foo bar.com> writes:

Pragma wrote:
 Kristian Kilpi wrote:

 On Wed, 21 Feb 2007 11:00:44 +0200, Walter Bright 
 <newshound digitalmars.com> wrote:

 It does, but increases the exe size of the first example from 180kb 
 to 617kb!

  > 180kb is when compiled using build/rebuild/jake etc (no library) and
  > the 617kb is when using dmd+lib only. Same flags in both cases: none at all

 Let's say you have a template instance, TI. It is declared in two 
 modules, M1 and M2:

 -----------M1------------
 TI
 A
 -----------M2------------
 TI
 B
 -------------------------

 M1 also declares A, and M2 also declares B. Now, the linker is 
 looking to resolve TI, and the first one it finds is one in M1, and 
 so links in M1. Later on, it needs to resolve B, and so links in M2. 
 The redundant TI is discarded (because it's a COMDAT).

 However, suppose the program never references A, and A is a chunk of 
 code that pulls in lots of other bloat. This could make the 
 executable much larger than if, in resolving TI, it had picked M2 
 instead.

 You can control which module containing TI will be pulled in by the 
 linker to resolve TI, by specifying that module first to lib.exe.

 You can also put TI in a third module that has neither A nor B in it. 
 When compiling M1 and M2, import that third module, so TI won't be 
 generated for M1 or M2.

 Here's a quick thought. (It's probably too impractical/absurd. ;) )
 Could template instances be put into their own separate modules?
 Then the linker will find a module containing the template instance 
 only, and no bloat will be pulled in with it. I don't know if this 
 would require the compiler to generate separate, extra .obj files for 
 template instances or something.

 Nice idea, but I'd rather see the librarian (optionally?) do this job
 instead.  It would avoid any complications for the existing toolchain by 
 not introducing any behavior that is radically different from other 
 platforms (i.e. "foo.d" ==> "foo.obj" and "foo-t.obj").

 Now if you're talking about breaking each and every COMDAT out into its
 own .obj, then having the librarian do it is a must.  I can't imagine 
 what my workspace would look like otherwise.

 Either way, all this involves the rather messy business of turning each 
 COMDAT fixup reference within an .obj file into an EXTERN.  I doubt that 
 the DMD/DMC backend would make this job easy (I could be wrong!), so 
 again, putting the job elsewhere (librarian) might be easier to maintain.

Just to clarify the current situation: the ballooned exe file has 
nothing to do with templates. There are no templates involved in that 
particular issue, and it appears the prior template demons have been 
driven under the bridge for the interim. There is some progress here, 
but it led to the uncovering of another problem ;)

Feb 21 2007
