
digitalmars.D - Stripping Data Symbols (Win64)

Benjamin Thaut <code benjamin-thaut.de> writes:
My current work on the D compiler led me to the following test 
case, which I put through an unmodified version of dmd 2.069.2:

import core.stdc.stdio;

struct UnusedStruct
{
	int i = 3;
	float f = 4.0f;
};

class UnusedClass
{
	int i = 2;
	float f = 5.0f;
};

void main(string[] args)
{
   printf("Hello World!");
}

When compiling this on Windows with dmd -m64 main.d -L/MAP
and then inspecting the map file, I noticed that the following 4 
data symbols end up in the final executable although they 
are never used.

  0003:00000a90       _D4main12UnusedStruct6__initZ  0000000140046a90     main.obj
  0003:00000ad0       _D4main11UnusedClass6__initZ   0000000140046ad0     main.obj
  0003:00000af0       _D4main11UnusedClass7__ClassZ  0000000140046af0     main.obj
  0003:00000ba0       _D4main11UnusedClass6__vtblZ   0000000140046ba0     main.obj

For the struct this is the initializer; for the class it's the 
initializer, class info and vtbl.

Is this behavior correct? Shouldn't UnusedStruct and UnusedClass 
be stripped completely from the binary? Is this somehow connected 
to the module info / object.factory?

I noticed by looking at some object file dumps that dmd puts each 
function into its own section, but data symbols, like 
initializers, are all merged into the same section. Could this be 
the root issue?
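
The Object.factory question above can be made concrete with a small sketch. Object.factory creates an instance from a fully qualified class name given as a string, which it can only do if the class info is reachable through the runtime's module records. Whether that reachability is what keeps the symbols in the executable is the open question of this post, not something the sketch proves. The helper name makeByName is hypothetical, and the module declaration is an assumption that mirrors the main.d test case.

```d
module main; // assumption: mirrors main.d, so the class is "main.UnusedClass"

import core.stdc.stdio;

class UnusedClass
{
    int i = 2;
    float f = 5.0f;
}

// Hypothetical helper: creates an instance from a fully qualified class
// name, or returns null when the runtime knows no class of that name.
Object makeByName(string qualifiedName)
{
    return Object.factory(qualifiedName);
}

void main()
{
    // No code refers to UnusedClass by symbol here; only the string does.
    Object o = makeByName("main.UnusedClass");
    printf("created via factory: %s\n", o is null ? "no".ptr : "yes".ptr);
}
```

If a linker removed the class info of a class that is only ever requested by name like this, the lookup would have nothing to find, which is one reason stripping class data is more delicate than stripping functions.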
Dec 28 2015
Rainer Schuetze <r.sagitario gmx.de> writes:
On 28.12.2015 13:05, Benjamin Thaut wrote:
 My current work on the D compiler led me to the following test case,
 which I put through an unmodified version of dmd 2.069.2:

 import core.stdc.stdio;

 struct UnusedStruct
 {
      int i = 3;
      float f = 4.0f;
 };

 class UnusedClass
 {
      int i = 2;
      float f = 5.0f;
 };

 void main(string[] args)
 {
    printf("Hello World!");
 }

 When compiling this on windows with dmd -m64 main.d -L/MAP
 and then inspecting the map file I noticed that the following 4 data
 symbols end up in the final executable although they shouldn't be used.

   0003:00000a90       _D4main12UnusedStruct6__initZ  0000000140046a90  main.obj
   0003:00000ad0       _D4main11UnusedClass6__initZ   0000000140046ad0  main.obj
   0003:00000af0       _D4main11UnusedClass7__ClassZ  0000000140046af0  main.obj
   0003:00000ba0       _D4main11UnusedClass6__vtblZ   0000000140046ba0  main.obj

 For the struct this is the initializer; for the class it's the
 initializer, class info and vtbl.

 Is this behavior correct? Shouldn't UnusedStruct and UnusedClass be
 stripped completely from the binary? Is this somehow connected to the
 module info / object.factory?

I noticed something similar recently when compiling a C file with /Gy, see
https://github.com/D-Programming-Language/druntime/pull/1446#issuecomment-160880021

The compiler puts all functions into COMDATs, but they are all still linked in if only a single symbol is referenced, even if linked with /OPT:REF.

So I suspect this is not an issue with dmd, but the Microsoft linker. I still wonder whether the approach to use "function level linking" works at all for Win64.

 I noticed by looking at some object file dumps that dmd puts each
 function into its own section, but data symbols, like initializers, are
 all merged into the same section. Could this be the root issue?

Having all data in a single section misses some possible optimizations, and it might be the reason for the behavior in your case (you can check this with "dumpbin /all objectfile"), but the C file in the issue linked above does not contain any data.
Dec 30 2015
Benjamin Thaut <code benjamin-thaut.de> writes:
On Wednesday, 30 December 2015 at 09:43:32 UTC, Rainer Schuetze 
wrote:
 I noticed something similar recently when compiling a C file 
 with /Gy, see 
 https://github.com/D-Programming-Language/druntime/pull/1446#issuecomment-160880021

 The compiler puts all functions into COMDATs, but they are all 
 still linked in if only a single symbol is referenced, even if 
 linked with /OPT:REF.

 So I suspect this is not an issue with dmd, but the Microsoft 
 linker. I still wonder whether the approach to use "function 
 level linking" works at all for Win64.

 I noticed by looking at some object file dumps that dmd puts each
 function into its own section, but data symbols, like initializers, are
 all merged into the same section. Could this be the root issue?

 Having all data in a single section misses some possible optimizations, and it might be the reason for the behavior in your case (you can check this with "dumpbin /all objectfile"), but the issue above does not contain any data.

So if I understand this correctly, the Microsoft linker only strips unused COMDATs; otherwise the entire object file always gets pulled in?

For me, stripping of individual data symbols not working is actually a good thing: if it doesn't work, I can't break it. ;-)
Dec 30 2015
Rainer Schuetze <r.sagitario gmx.de> writes:
On 30.12.2015 13:25, Benjamin Thaut wrote:
 On Wednesday, 30 December 2015 at 09:43:32 UTC, Rainer Schuetze
 wrote:
 I noticed something similar recently when compiling a C file with
 /Gy, see
 https://github.com/D-Programming-Language/druntime/pull/1446#issuecomment-160880021
 The compiler puts all functions into COMDATs, but they are all
 still linked in if only a single symbol is referenced, even if
 linked with /OPT:REF.

 So I suspect this is not an issue with dmd, but the Microsoft
 linker. I still wonder whether the approach to use "function level
 linking" works at all for Win64.

 I noticed by looking at some object file dumps that dmd puts each
 function into its own section, but data symbols, like initializers, are
 all merged into the same section. Could this be the root issue?

 Having all data in a single section misses some possible optimizations, and it might be the reason for the behavior in your case (you can check this with "dumpbin /all objectfile"), but the issue above does not contain any data.

 So if I understand this correctly the microsoft linker only strips unused comdats, otherwise always the entire object file gets pulled in?

I tried to reproduce the issue just now, but failed to do so (both with a C file compiled with /Gy and with a D file). Only referenced COMDATs were included in the link, not other COMDATs in the same object file. Maybe it was an issue with my build script back then.

 For me stripping of individual data symbols not working is actually a
  good thing, if it doesn't work, I can't break it. ;-)

Please note that building with -lib puts every function/declaration into its own object file inside the library, and unused class declarations are then no longer in the linked executable.
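
As a complement to reading the map file, a test case can also ask the runtime which classes it knows about. The following is only a sketch under assumptions: isClassLinked is a hypothetical helper, and the class is kept in the same file so that the example compiles on its own. In an actual -lib experiment the class would live in a separate module built into the library, and the result would depend on how that module was built and linked.

```d
module main; // assumption: mirrors the main.d test case from this thread

import core.stdc.stdio;

class UnusedClass
{
    int i = 2;
    float f = 5.0f;
}

// Hypothetical helper: reports whether any module linked into the
// executable registers a class with the given fully qualified name.
bool isClassLinked(string qualifiedName)
{
    foreach (m; ModuleInfo)            // iterates over all linked modules
    {
        if (m is null)
            continue;
        foreach (c; m.localClasses)    // class infos recorded for module m
        {
            if (c.name == qualifiedName)
                return true;
        }
    }
    return false;
}

void main()
{
    printf("main.UnusedClass registered: %d\n",
           cast(int) isClassLinked("main.UnusedClass"));
}
```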
Jan 01 2016
Benjamin Thaut <code benjamin-thaut.de> writes:
On Friday, 1 January 2016 at 13:57:01 UTC, Rainer Schuetze wrote:
 Please note that building with -lib puts every 
 function/declaration into its own object file inside the 
 library, and unused class declarations are no longer in the 
 linked executable.

Ok, that is very good information. I should be able to build a test case out of that.

Kind Regards
Benjamin Thaut
Jan 04 2016