
digitalmars.D - D Map Treemap viewer

reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
This tool attempts to answer the question "Why the $#%!$ % hell is my
binary so huge?" in an intuitive way. The area of each rectangle is
proportional to the size of the object it represents, relative to its
siblings and (disregarding padding and captions) to the entire layout.
The tool takes linker map files as input and attempts to arrange symbols
in a tree, using demangling and some ugly hacks.

URL: http://thecybershadow.net/d/mapview/
Example: http://thecybershadow.net/d/mapview/view.php?id=4ea9c5d230f18
Source: https://github.com/CyberShadow/DMapTreeMap
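
To make the tree idea concrete, here is a minimal sketch in D of grouping symbols by the components of their demangled names and summing sizes upward, so that each node's total could drive the area of its rectangle. The `Node` class, the `addSymbol` function, and the sample names and sizes are hypothetical; none of this is taken from the DMapTreeMap source.

```d
import std.array : split;
import std.stdio : writefln;

/// Hypothetical tree node: one name component plus the accumulated size.
class Node
{
    string name;
    ulong size;             // bytes used by this node and all descendants
    Node[string] children;  // child nodes keyed by name component

    this(string name) { this.name = name; }
}

/// Add one demangled, dot-separated symbol to the tree, adding its size
/// to every node on the path from the root.
void addSymbol(Node root, string qualifiedName, ulong size)
{
    auto node = root;
    node.size += size;
    foreach (part; qualifiedName.split("."))
    {
        if (part !in node.children)
            node.children[part] = new Node(part);
        node = node.children[part];
        node.size += size;
    }
}

void main()
{
    auto root = new Node("(root)");
    addSymbol(root, "std.stdio.File.open", 120);    // made-up sizes
    addSymbol(root, "std.stdio.writeln", 80);
    addSymbol(root, "core.memory.GC.collect", 50);

    // Print the top-level packages with their accumulated sizes.
    foreach (name, child; root.children)
        writefln("%s: %s bytes", name, child.size);
}
```

Because every ancestor accumulates the size of its descendants, sibling nodes can be compared directly when laying out rectangles.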

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net
Oct 27 2011
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
This is awesome and I was just about to request something like
this too. Great job!

What is not awesome is that DMD spits out map files with invalid code points. :(

DMapTreeMap>rdmd treemapgen.d test13_2056.map out.json
std.utf.UTFException std\utf.d(637): Invalid UTF-8 sequence (at index 15342)

I've checked it with an editor and there are tons of invalid code
points in the map file.
Oct 27 2011
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 28.10.2011 02:57, Vladimir Panteleev wrote:
 On Fri, 28 Oct 2011 03:45:11 +0300, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 This is awesome and I was just about to request something like
 this too. Great job!

 What is not awesome is that DMD spits out map files with invalid code
 points. :(

 DMapTreeMap>rdmd treemapgen.d test13_2056.map out.json
 std.utf.UTFException std\utf.d(637): Invalid UTF-8 sequence (at index
 15342)

 I've checked it with an editor and there are tons of invalid code
 points in the map file.

 Ah, it shouldn't fail on that... the original D1 code I recycled worked just fine on arbitrary encodings, but D2's splitLines can't handle bad UTF. Anyway, fixed.

Pretty cool tool :-)

The map file generated by optlink is not UTF-8; it uses compressed symbols, which can be expanded with demangle.decodeDmdString before demangling.

Please also note that the map file is often corrupt (http://d.puremagic.com/issues/show_bug.cgi?id=6673), which could lead to incorrectly computed sizes.
Oct 28 2011
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 28.10.2011 20:44, Vladimir Panteleev wrote:
 On Fri, 28 Oct 2011 20:15:21 +0300, Rainer Schuetze <r.sagitario gmx.de>
 wrote:

 The map file generated by optlink is not UTF8, it uses compressed
 symbols, that can be expanded with demangle.decodeDmdString before
 demangling.

 Yeah, the code does that.
 Please also note that the map file is often corrupt:
 http://d.puremagic.com/issues/show_bug.cgi?id=6673

 which could lead to bad computed sizes.

 Yeah, I noticed. I think it'll show up as bad symbol names in the worst case.

Do you use the address after the symbol or the segment address at the beginning of the line? The latter can easily be screwed up; see this single line from one of my map files:

0005:0 0005:00194A43 _D7visuald10completion13CompletionSet16OnCommitCompleteMWZi10__FUNCTION6__initZ 107 0005:00194A44 __D7visuald7comutil10DComObject13sCountCreatedi 107FCA44

The address after the symbol seems less likely to be wrong, but chances are you skip a symbol and get bad names.

Wondering why symbols seem quite large, I checked my definition of IID_IUnknown, which was reported as using more than 30kB, and that is what the map file shows. What's actually happening is that templates used for CTFE are compiled into the object file. The code is in separate COMDAT sections, which means it could be eliminated by the linker, but the data is written into the same section as the IID, without any labels. In my case these were either mixin strings or template string arguments.

So, dmd should separate the data into different sections as well. Even better, it should not generate code and data for imported functions and templates that were only used in CTFE. IIRC there are already bugzilla entries for this, but I could only find http://d.puremagic.com/issues/show_bug.cgi?id=4767 ATM.
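
As an illustration of relying on the address after the symbol, the following D sketch takes the last whitespace-separated field of a map line as a hexadecimal address and the field before it as the symbol name. `MapEntry` and `parseTail` are hypothetical names, and the assumption that the last two fields are "name, address" is inferred from the sample line above rather than from any map file specification.

```d
import std.array : split;
import std.conv : ConvException, to;
import std.stdio : writefln;
import std.typecons : Nullable;

/// Hypothetical record for one parsed map line.
struct MapEntry
{
    string name;
    ulong address;
}

/// Read the trailing "name address" pair of a map line.
/// Returns a null value when the last field is not plain hexadecimal.
Nullable!MapEntry parseTail(string line)
{
    auto fields = line.split(); // split on runs of whitespace
    if (fields.length < 2)
        return Nullable!MapEntry.init;
    try
    {
        auto address = fields[$ - 1].to!ulong(16);
        return Nullable!MapEntry(MapEntry(fields[$ - 2], address));
    }
    catch (ConvException)
    {
        return Nullable!MapEntry.init;
    }
}

void main()
{
    // The fused line quoted above: only the second symbol survives.
    auto line = "0005:0 0005:00194A43 "
        ~ "_D7visuald10completion13CompletionSet16OnCommitCompleteMWZi10__FUNCTION6__initZ 107 "
        ~ "0005:00194A44 __D7visuald7comutil10DComObject13sCountCreatedi 107FCA44";
    auto entry = parseTail(line);
    if (!entry.isNull)
        writefln("%s at 0x%X", entry.get.name, entry.get.address);
}
```

On the fused line this approach yields the second symbol with its full address and silently drops the first one, which is the "skip a symbol" case. A line cut off right after `107` would still parse, as address 0x107, so a plausibility check on the address would be needed on top of this.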
Oct 29 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Quick test case:

This one creates an OK map file, and I can create the tree:
module test;
void main()
{}

With this one, treemapgen fails:
module test;
import std.stdio;
void main()
{}

rdmd treemapgen.d test.map test.json
std.utf.UTFException std\utf.d(637): Invalid UTF-8 sequence (at index 57950)
Oct 27 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 28 Oct 2011 03:45:11 +0300, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 This is awesome and I was just about to request something like
 this too. Great job!

 What is not awesome is that DMD spits out map files with invalid code  
 points. :(

 DMapTreeMap>rdmd treemapgen.d test13_2056.map out.json
 std.utf.UTFException std\utf.d(637): Invalid UTF-8 sequence (at index  
 15342)

 I've checked it with an editor and there are tons of invalid code
 points in the map file.

Ah, it shouldn't fail on that... the original D1 code I recycled worked just fine on arbitrary encodings, but D2's splitLines can't handle bad UTF. Anyway, fixed.
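
One way to sidestep UTF validation when splitting lines is to treat the file as raw bytes. The D sketch below shows that general technique; it is not necessarily the fix that went into treemapgen.d, and `rawLines` is a hypothetical helper name.

```d
import std.file : read;
import std.stdio : writefln;

/// Split raw bytes into lines at '\n', dropping a trailing '\r'.
/// Nothing is decoded as UTF, so invalid sequences cannot throw.
const(ubyte)[][] rawLines(const(ubyte)[] data)
{
    const(ubyte)[][] lines;
    size_t start = 0;
    foreach (i, b; data)
    {
        if (b == '\n')
        {
            auto end = i;
            if (end > start && data[end - 1] == '\r')
                --end;
            lines ~= data[start .. end];
            start = i + 1;
        }
    }
    // Keep a final line that has no terminating newline.
    if (start < data.length)
        lines ~= data[start .. $];
    return lines;
}

void main(string[] args)
{
    if (args.length < 2)
        return;
    // std.file.read returns void[]; view it as bytes without validation.
    auto data = cast(const(ubyte)[]) read(args[1]);
    writefln("%s lines", rawLines(data).length);
}
```

Individual lines can then be decoded or expanded later, once it is known which parts are plain ASCII and which are compressed symbol names.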
Oct 27 2011
prev sibling next sibling parent Trass3r <un known.com> writes:
 This tool attempts to answer the question "Why the $#%!$ % hell is my  
 binary so huge?" in an intuitive way.

Awesome! Another great tool after DustMite.

Look at that: http://thecybershadow.net/d/mapview/view.php?id=4eaa05054b06f
Why the heck is that _D2rt3aaA2AA6__initZ shown as 2MB in size? The resulting exe is only 1MB. Guess it's thrown out by the linker.

Also why can't the demangler demangle all those init symbols?

And _D6opencl7wrapper8CLObject6__initZ - 6928 bytes?? It's an empty struct.

:D and what kind of a symbol is '0x24' ;) Seems like a bug.
Oct 27 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
OT: Interesting, JS has a .shift method for arrays which is basically:
{ auto v = arr.front; arr.popFront; return v; }

Do we have something equivalent in Phobos? It seems useful.
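
The one-liner above can be wrapped into a small generic helper. This is a sketch under the assumption that a user-defined function is acceptable; `shift` is a hypothetical name chosen to mirror the JavaScript method, not a Phobos function.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.stdio : writeln;

/// Remove and return the first element of a range,
/// similar to JavaScript's Array.prototype.shift.
ElementType!R shift(R)(ref R r)
if (isInputRange!R)
{
    assert(!r.empty, "shift called on an empty range");
    auto v = r.front;
    r.popFront();
    return v;
}

void main()
{
    int[] arr = [1, 2, 3];
    auto first = shift(arr);
    writeln(first, " ", arr);
}
```

The range is taken by `ref` so that the caller's slice is advanced; for an array this only moves the slice's start and does not copy or free any elements.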
Oct 27 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 28 Oct 2011 04:50:59 +0300, Trass3r <un known.com> wrote:

 This tool attempts to answer the question "Why the $#%!$ % hell is my  
 binary so huge?" in an intuitive way.

 Awesome! Another great tool after DustMite. Look at that: http://thecybershadow.net/d/mapview/view.php?id=4eaa05054b06f
 Why the heck is that _D2rt3aaA2AA6__initZ shown as 2MB in size? The resulting exe is only 1MB.

Not sure. I guess it's the last symbol in a segment, before a long gap.
 Also why can't the demangler demangle all those init symbols?

I think that's a question for Sean, who maintains the demangler.
 And _D6opencl7wrapper8CLObject6__initZ - 6928 bytes??
 It's an empty struct.

 :D and what kind of a symbol is '0x24' ;)
 Seems like a bug.

I guess ld map parsing could be better. I'll see what I can do.
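
Regarding the guess about the last symbol before a long gap: if sizes are derived from the distance to the next symbol's address (an assumption here, not a statement about how the tool is implemented), a symbol followed by a large unlabeled region inherits that whole region. A minimal D sketch with hypothetical names and made-up addresses:

```d
import std.algorithm : sort;
import std.stdio : writefln;

/// Hypothetical symbol record; size is filled in by sizesFromGaps.
struct Symbol
{
    string name;
    ulong address;
    ulong size;
}

/// Give each symbol the distance to the next higher address.
/// The last symbol has no successor, so it gets size 0 unless
/// the caller supplies the end address of the segment.
Symbol[] sizesFromGaps(Symbol[] syms, ulong segmentEnd = 0)
{
    sort!((a, b) => a.address < b.address)(syms);
    foreach (i, ref s; syms)
    {
        if (i + 1 < syms.length)
            s.size = syms[i + 1].address - s.address;
        else if (segmentEnd > s.address)
            s.size = segmentEnd - s.address;
        else
            s.size = 0;
    }
    return syms;
}

void main()
{
    auto syms = [
        Symbol("next", 0x200000),   // far away, e.g. in another region
        Symbol("small", 0x1000),
        Symbol("lastBeforeGap", 0x1010),
    ];
    foreach (s; sizesFromGaps(syms))
        writefln("%s: %s bytes", s.name, s.size);
}
```

Here `lastBeforeGap` is reported with roughly 2MB although nothing says it actually occupies that space, which matches the kind of oversized figure seen in the treemap.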
Oct 28 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 28 Oct 2011 04:50:59 +0300, Trass3r <un known.com> wrote:

 http://thecybershadow.net/d/mapview/view.php?id=4eaa05054b06f

It's still a hack, but should be better now.
Oct 28 2011
prev sibling next sibling parent reply Ary Manzana <ary esperanto.org.ar> writes:
On 10/27/11 9:09 PM, Vladimir Panteleev wrote:
 This tool attempts to answer the question "Why the $#%!$ % hell is my
 binary so huge?" in an intuitive way. Each rectangle is proportional in
 its area (relative to its siblings, and disregarding padding/captions -
 the entire layout) to the size of the object it represents. The tool
 takes linker map files as input, and attempts to arrange symbols in a
 tree, using demangling and some ugly hacks.

 URL: http://thecybershadow.net/d/mapview/
 Example: http://thecybershadow.net/d/mapview/view.php?id=4ea9c5d230f18
 Source: https://github.com/CyberShadow/DMapTreeMap

<div class="nojs">Needs JavaScript (sorry, Nick!)</div> LOL!!!! :-)
Oct 28 2011
parent "Nick Sabalausky" <a a.a> writes:
"Ary Manzana" <ary esperanto.org.ar> wrote in message 
news:j8edr5$mla$1 digitalmars.com...
 On 10/27/11 9:09 PM, Vladimir Panteleev wrote:
 This tool attempts to answer the question "Why the $#%!$ % hell is my
 binary so huge?" in an intuitive way. Each rectangle is proportional in
 its area (relative to its siblings, and disregarding padding/captions -
 the entire layout) to the size of the object it represents. The tool
 takes linker map files as input, and attempts to arrange symbols in a
 tree, using demangling and some ugly hacks.

 URL: http://thecybershadow.net/d/mapview/
 Example: http://thecybershadow.net/d/mapview/view.php?id=4ea9c5d230f18
 Source: https://github.com/CyberShadow/DMapTreeMap

<div class="nojs">Needs JavaScript (sorry, Nick!)</div> LOL!!!! :-)

Ditto that LOL :)
Oct 28 2011
prev sibling next sibling parent Trass3r <un known.com> writes:
 http://thecybershadow.net/d/mapview/view.php?id=4eaa05054b06f

 It's still a hack, but should be better now.

Nice. Still wondering about the sizes though.

opencl.wrapper.CLObject is an empty struct and it shows 6928 bytes for the .init.
opencl.commandqueue.CLCommandQueue is a struct wrapping a void* and its init is 12848 bytes in size?
opencl.c.cl.cl_command_type is a named enum : uint and it's 12348 bytes?

Also now it sort of demangles the init symbols but interprets that Z at the end like an identifier. I'd guess it rather represents the fact that this is an init symbol.
Oct 28 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 28 Oct 2011 17:39:26 +0300, Trass3r <un known.com> wrote:

 http://thecybershadow.net/d/mapview/view.php?id=4eaa05054b06f

 It's still a hack, but should be better now.

 Nice. Still wondering about the sizes though.
 opencl.wrapper.CLObject is an empty struct and it shows 6928 bytes for the .init
 opencl.commandqueue.CLCommandQueue is a struct wrapping a void* and its init is 12848 bytes in size?
 opencl.c.cl.cl_command_type is a named enum : uint and it's 12348 bytes?

That's what the map file says. You may want to take a look at the binary with a disassembler or hex editor. A bug somewhere can't be ruled out, similar to issue 2254 with DMD or OptLink.
 Also now it sort of demangles the init symbols but interprets that Z at  
 the end like an identifier. I'd guess it rather represents the fact that  
 this is an init symbol.

I added a simplistic "demangler" as a fallback for when core.demangle fails. Z is the leftover part of the string; I think it indicates the storage class (executable, if it means the same as in C++). I could hide it, I guess; it's just yet another special case.
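
For illustration, a fallback of this kind can be sketched in D by reading the length-prefixed identifiers that follow `_D` and returning whatever is left over. This is not the code from DMapTreeMap; `SplitName` and `splitMangled` are hypothetical names, and the sketch makes no attempt to interpret the remainder.

```d
import std.array : join;
import std.ascii : isDigit;
import std.stdio : writefln;

/// Hypothetical result of the simplistic split.
struct SplitName
{
    string[] parts; // decoded identifiers, outermost first
    string rest;    // whatever could not be interpreted
}

/// Split a D mangled name into its length-prefixed identifiers.
/// Accepts one or more leading underscores before the 'D'.
SplitName splitMangled(string sym)
{
    SplitName result;
    size_t p = 0;
    while (p < sym.length && sym[p] == '_')
        ++p;
    if (p == 0 || p == sym.length || sym[p] != 'D')
    {
        result.rest = sym; // not a D symbol: leave it untouched
        return result;
    }
    ++p; // skip 'D'

    while (p < sym.length && isDigit(sym[p]))
    {
        immutable start = p;
        size_t len = 0;
        while (p < sym.length && isDigit(sym[p]))
            len = len * 10 + (sym[p++] - '0');
        if (len == 0 || len > sym.length - p)
        {
            p = start; // implausible length: keep it in the remainder
            break;
        }
        result.parts ~= sym[p .. p + len];
        p += len;
    }
    result.rest = sym[p .. $];
    return result;
}

void main()
{
    auto s = splitMangled("_D6opencl7wrapper8CLObject6__initZ");
    writefln("%s (leftover: %s)", s.parts.join("."), s.rest);
}
```

For the symbol discussed in this thread the identifiers come out as opencl, wrapper, CLObject and __init, with `Z` as the uninterpreted remainder, which is the leftover being shown as if it were an identifier.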
Oct 28 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 28 Oct 2011 20:15:21 +0300, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 The map file generated by optlink is not UTF8, it uses compressed  
 symbols, that can be expanded with demangle.decodeDmdString before  
 demangling.

Yeah, the code does that.
 Please also note that the map file is often corrupt:
 http://d.puremagic.com/issues/show_bug.cgi?id=6673

 which could lead to bad computed sizes.

Yeah, I noticed. I think it'll show up as bad symbol names in the worst case.
Oct 28 2011
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/27/2011 5:09 PM, Vladimir Panteleev wrote:
 This tool attempts to answer the question "Why the $#%!$ % hell is my binary so
 huge?" in an intuitive way.

That is a neat idea and a neat project. I've thought of building another tool that will read the .obj files and be able to answer the question "why is this symbol being linked in?"
Oct 28 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 29 Oct 2011 14:43:28 +0300, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 Do you use the address after the symbol or the segment address at the  
 beginning of the line?

The address at the end.
 The address after the symbol seems more unlikely to be wrong, but  
 chances are you skip a symbol and get bad names.

Skipping symbols is inevitable when the symptom is truncated lines and a vital bit of information (the address) is at the line's end. While the tool could work around the corruption, I think it'd be a waste of effort better spent on fixing the source of the corruption (although it seems to lie in a closed-source proprietary program).
Oct 29 2011