www.digitalmars.com

digitalmars.D - compiled code file size

reply Duke Normandin <dukeofperl ml1.net> writes:
I'm re-visiting the D language. I've compared the file sizes of 2 
executables - 1 is compiled C code using gcc; the other is D code 
using dmd.

helloWorld.d => helloWorld.exe = 146,972 bytes
ex1hello.c => ex1-hello.exe = 5,661 bytes

Why such a huge difference???

Duke
Sep 20 2013
next sibling parent "Temtaime" <temtaime gmail.com> writes:
DMD likes size.
When compiling, the compiler may use GBs of RAM.
And the resulting executable gets no dead/unused code elimination.
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 16:35:40 UTC, Temtaime wrote:
 In resulting executable there is no dead/unused code 
 elimination.

Not true.
Sep 20 2013
prev sibling next sibling parent Justin Whear <justin economicmodeling.com> writes:
On Fri, 20 Sep 2013 18:35:39 +0200, Temtaime wrote:

 DMD likes the size.
 When compiling, compiler may use GBs of RAM.
 In resulting executable there is no dead/unused code elimination.

http://imgur.com/W5AMy0P
Sep 20 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
Why? I have a large project.
If I replace main with "void main() {}" the size is still 26 MB 
in a debug build.
Sep 20 2013
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin 
wrote:
 Why such a huge difference???

The D program carries its additional D runtime library code with it, whereas the C program only depends on libraries provided by the operating system, and thus it doesn't have to include it in the exe.
Sep 20 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/20/13 3:49 PM, Duke Normandin wrote:
 On Friday, 20-Sep-13 3:04 PM, Nick Sabalausky wrote:
 On Fri, 20 Sep 2013 21:45:48 +0200
 "Temtaime" <temtaime gmail.com> wrote:
 Software MUST
 run almost ANYWHERE and consume minimal resources.

 For example, I hate the 3ds Max developers when it uses
 several GB of RAM on my game's map and sometimes freezes, while Blender uses
 only 500 MB and runs fast. The only reason I use 3ds Max
 is its friendlier controls. But that is another story...

 Some users who don't have a "modern" PC will hate your app
 too, I think.
 One should optimize EVERYTHING one can.

I agree with what you're saying here, but the problem is we're looking at a difference of only a few hundred k. Heck, my primary PC was a 32-bit single-core right up until last year (and I still use it as a secondary system), and I didn't care one bit if a hello world was 1k or 1MB. How many real world programs are as trivial as a hello world? A few maybe, but not many. Certainly not enough to actually add up to anything significant, unless maybe you happen to be running on a 286 or such. If we were talking about real-world D programs taking tens/hundreds of MB more than they should, then that would be a problem. But they don't. We're just talking about a few hundred k for an *entire* program.

I should have been a bit more clear!! It's the _relative_ size difference that bothers me!! One is almost 26 times larger than the other. If I'm to expect that same variance in a large to huge project, then I think that I'd be in a world of bullshine!!

The point here is that the factor is not preserved as sizes grow. A 4-year-old is twice as old as a 2-year-old, but a 34-year-old is not twice as old as a 32-year-old.

Andrei
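
Andrei's point can be checked with the numbers from the original post. The sketch below assumes, as a simplification, that the whole 141,311-byte gap (146,972 - 5,661) is a fixed one-time overhead that does not grow with the program. The function `sizeRatio` and the list of program sizes are made up for illustration and appear nowhere in the thread.

```d
import std.stdio;

/// Ratio (program + overhead) / program for a fixed overhead in bytes.
double sizeRatio(ulong programBytes, ulong overheadBytes)
{
    return cast(double)(programBytes + overheadBytes) / programBytes;
}

void main()
{
    // Assumed fixed overhead, taken from the thread: 146,972 - 5,661 bytes.
    enum ulong overhead = 146_972 - 5_661;

    // Hypothetical program sizes, from hello world up to 10 MB of own code.
    ulong[] programSizes = [5_661, 100_000, 1_000_000, 10_000_000];

    foreach (programBytes; programSizes)
        writefln("%10d bytes -> ratio %.2f",
                 programBytes, sizeRatio(programBytes, overhead));
}
```

Under that assumption the ratio starts at almost 26 for hello world and falls toward 1 as the program's own code grows. Real programs will not follow this exactly, since template instantiations and pulled-in library code also grow with the program.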
Sep 20 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin 
wrote:
 I'm re-visiting the D language. I've compared the file sizes of 
 2 executables - 1 is compiled C code using gcc; the other is D 
 code using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

You are doing it wrong.

```
$ gcc hello.c; ls -lah a.out
-rwxr-xr-x 1 dicebot users 4.9K Sep 20 18:47 a.out
```

vs

```
$ gcc -static hello.c; ls -lah a.out
-rwxr-xr-x 1 dicebot users 717K Sep 20 18:48 a.out
```

(C standard library is dynamically linked by default)

So actual relative difference is about 2x - quite big but not as huge. It mostly comes from additional D runtime stuff.
Sep 20 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
C/C++ applications also carry their runtime along (mingwm10 or MSVC's 
redistributable, for example).
Compiled with the static runtime, MSVC's hello world application 
is about 40 KB.
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 16:44:30 UTC, Temtaime wrote:
 Why? I have a large project.
 If i replace main with "void main() {}" the size is still 26 MB 
 in debug.

Could be due to things like static module constructors or typeinfos. There *are* problems with things getting intertwined and not being considered dead by the linker, but it does try. The proof of it is hello world being 150 KB instead of the 6 MB or so it would be if it carried all of the dead code from phobos too.
Sep 20 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 20 September 2013 at 16:37:19 UTC, Adam D. Ruppe wrote:
 On Friday, 20 September 2013 at 16:35:40 UTC, Temtaime wrote:
 In resulting executable there is no dead/unused code 
 elimination.

Not true.

Well, it is _mostly_ true. There is an elimination of unused code inside the function bodies during the code gen, but no unused data/code symbol elimination - and it actually can't be done safely right now by language design.
Sep 20 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 20 September 2013 at 16:50:50 UTC, Adam D. Ruppe wrote:
 On Friday, 20 September 2013 at 16:44:30 UTC, Temtaime wrote:
 Why? I have a large project.
 If i replace main with "void main() {}" the size is still 26 
 MB in debug.

Could be due to things like static module constructors or typeinfos. There *are* problems with things getting intertwined and not being considered dead by the linker, but it does try. The proof of it is hello world being 150 KB instead of the 6 MB or so it would be if it carried all of the dead code from phobos too.

You are confusing linking against a static library with eliminating unused code from the executable binary itself.
Sep 20 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
I don't have any static constructors.
DMD eliminates nothing. It doesn't even eliminate unreferenced 
data, as I wrote in a bug report.
Sep 20 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 20 September 2013 at 17:00:22 UTC, Temtaime wrote:
 I haven't any static constructors.
 DMD eliminates nothing. It even doesn't eliminates unreferenced 
 data as i wrote on some bugreport.

It is not possible to eliminate it within current language spec because of shared libraries.
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 16:54:45 UTC, Dicebot wrote:
 You are confusing linking static library and eliminating unused 
 code from executable binary itself.

Well, damn, I just tested with a quick static array and you're right. Damn son.
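
Adam does not show the test he ran. A minimal sketch of what such a check might look like is a hello world plus one large module-level array that nothing references. The name `unusedBlob` and the 1 MiB size are hypothetical, and the non-zero initializer is an assumption meant to keep the array in initialized data rather than zero-filled storage, so that it can show up in the file size.

```d
import core.stdc.stdio;

// Hypothetical 1 MiB blob that no code refers to. The non-zero fill
// value is intended to keep it out of zero-initialized storage.
__gshared ubyte[1024 * 1024] unusedBlob = 1;

void main()
{
    // main never touches unusedBlob.
    printf("hello world\n");
}
```

Building this once as written and once with the array declaration removed, then comparing the two executable sizes, shows whether the unreferenced data was dropped. Adam's reply suggests that with the DMD of the time it was not.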
Sep 20 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
The linker knows what it links, I think. :)
So it can eliminate everything that is unreferenced from main, am I 
right?
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 10:45 AM, Adam D. Ruppe wrote:
 On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
 Why such a huge difference???

The D program carries its additional D runtime library code with it, whereas the C program only depends on libraries provided by the operating system, and thus it doesn't have to include it in the exe.

Now that I know _why_ , is there a way to shave tons off those executables? Any optimization possible?
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 10:50 AM, Temtaime wrote:
 C/C++ applications also carries on its runtime(mingwm10, msvc's
 redist, for example).
 If compiled with static runtime, msvc's hello world application uses
 about 40 KB.

+1
Sep 20 2013
prev sibling next sibling parent captaindet <2krnk gmx.net> writes:
On 2013-09-20 10:03, Duke Normandin wrote:
 I'm re-visiting the D language. I've compared the file sizes of 2 executables
- 1 is compiled C code using gcc; the other is D code using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

maybe somehow related:

i have a short program using GtkD. the exe is
~3MB if compiled using dmd and linked to pre-built GtkD.lib (16MB)
~2MB if compiled via bud/build following up on all imports directly, no linking to pre-built lib

all compiler flags the same (-debug for exe, prebuilt lib is not debug but -O -inline -release). on windows.

/det
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 10:49 AM, Dicebot wrote:
 On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
 I'm re-visiting the D language. I've compared the file sizes of 2
 executables - 1 is compiled C code using gcc; the other is D code
 using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

You are doing it wrong.

```
$ gcc hello.c; ls -lah a.out
-rwxr-xr-x 1 dicebot users 4.9K Sep 20 18:47 a.out
```

vs

```
$ gcc -static hello.c; ls -lah a.out
-rwxr-xr-x 1 dicebot users 717K Sep 20 18:48 a.out
```

(C standard library is dynamically linked by default)

So actual relative difference is about 2x - quite big but not as huge. It mostly comes from additional D runtime stuff.

I get the same executable size whether or not I use `-static' with cygwin/win7 ... Still tons smaller than the D executable though. Not good!! me young mucker!!!! :D
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 11:28 AM, captaindet wrote:
 On 2013-09-20 10:03, Duke Normandin wrote:
 I'm re-visiting the D language. I've compared the file sizes of 2
 executables - 1 is compiled C code using gcc; the other is D code
 using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

maybe somehow related:

i have a short program using GtkD. the exe is
~3MB if compiled using dmd and linked to pre-built GtkD.lib (16MB)
~2MB if compiled via bud/build following up on all imports directly, no linking to pre-built lib

all compiler flags the same (-debug for exe, prebuilt lib is not debug but -O -inline -release). on windows.

interesting ....
Sep 20 2013
prev sibling next sibling parent "JohnnyK" <johnnykinsey comcast.net> writes:
On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin 
wrote:
 I'm re-visiting the D language. I've compared the file sizes of 
 2 executables - 1 is compiled C code using gcc; the other is D 
 code using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

That 140KB is called the CYA document. It is there so that when you, the programmer, screw up, you don't look so bad in front of your boss.
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 17:26:29 UTC, Duke Normandin 
wrote:
 Now that I know _why_ , is there a way to shave tons off those 
 executables? Any optimization possible?

Yes, you can get D programs down very small - I've gone as low as 3 KB before on Linux (100% statically linked, doesn't even depend on the C runtime), where the executables are generally a little larger than on Windows.

BUT, the runtime code is there for a reason. Stripping it out means you lose D features, can't use most D libraries, and have to know druntime's implementation fairly well. So it isn't something you really want to do.

Why is size important to you though? 140 KB really isn't bad, and will probably shrink to a small percentage of the total once you write a real program that's more than just hello world.
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 17:27:56 UTC, captaindet wrote:
 i have a short program using GtkD. the exe is

gtkd's size is one reason why I started writing minigui.d. It isn't finished yet, but the resulting exes are about 220 KB instead of 2MB!
Sep 20 2013
prev sibling next sibling parent "JohnnyK" <johnnykinsey comcast.net> writes:
On Friday, 20 September 2013 at 17:27:30 UTC, Duke Normandin 
wrote:
 On Friday, 20-Sep-13 10:50 AM, Temtaime wrote:
 C/C++ applications also carries on its runtime(mingwm10, msvc's
 redist, for example).
 If compiled with static runtime, msvc's hello world 
 application uses
 about 40 KB.

+1

I don't think that static runtime comes with a garbage collector either, nor does it come with a lot of the other really nice features that come with D auto-magically.

I remember these same questions being brought up with C being compared to Assembly, Visual Basic vs GWBASIC, COBOL vs FORTRAN vs RPG vs C. This discussion comes up over and over again in the programming world. Those of us who have come to D did not come here because of the size of the executable compared to <your fav. language here>. We came because D is fast, robust, easy to maintain, easy to understand, and just an all-around practical language.

Duke, unless you're trying to build programs for embedded appliances, I don't think this question really matters much in this day and age, with terabyte hard drives and gigabytes of RAM on the modern computer. Back when C was built we only had 64KB to work with, so we could not have garbage collection, thread libraries, or even a string library built into the compiler.

IMHO, who cares, as long as it is reasonable and necessary to make time to market quicker and still produce a sound product? In the end, ask yourself what 100KB is worth if it makes your life as a programmer easier.
Sep 20 2013
prev sibling next sibling parent "Gary Willoughby" <dev nomad.so> writes:
On Friday, 20 September 2013 at 16:40:48 UTC, Justin Whear wrote:
 On Fri, 20 Sep 2013 18:35:39 +0200, Temtaime wrote:

 DMD likes the size.
 When compiling, compiler may use GBs of RAM.
 In resulting executable there is no dead/unused code 
 elimination.

http://imgur.com/W5AMy0P

Ha, awesome! Swiped for future use. ;)
Sep 20 2013
prev sibling next sibling parent "JohnnyK" <johnnykinsey comcast.net> writes:
On Friday, 20 September 2013 at 18:09:03 UTC, Adam D. Ruppe wrote:
 On Friday, 20 September 2013 at 17:27:56 UTC, captaindet wrote:
 i have a short program using GtkD. the exe is

gtkd's size is one reason why I started writing minigui.d. It isn't finished yet, but the resulting exes are about 220 KB instead of 2MB!

Please share minigui.d with us?
Sep 20 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 20 September 2013 at 18:37:36 UTC, JohnnyK wrote:
 Please share minigui.d with us?

it is on my misc. github: https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

you'll need color.d, simpledisplay.d, and minigui.d

There's still a lot of work that has to be done to finish minigui.d's widget classes, so it isn't really usable yet except for some very simple things.
Sep 20 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
The main question is how the topic starter got 150 KB; is it on 
Linux?

I get 137 KB with DMD when compiling

import core.stdc.stdio;
void main() { printf(`hello world`); }

And 673 KB when using writeln from std.stdio.

I'm using DMD 2.063.2 on windows.


You talk about terabyte HDDs and gigabytes of RAM..
That is not the right way to develop software. Software MUST 
run almost ANYWHERE and consume minimal resources.

For example, I hate the 3ds Max developers when it uses 
several GB of RAM on my game's map and sometimes freezes, while Blender uses 
only 500 MB and runs fast. The only reason I use 3ds Max 
is its friendlier controls. But that is another story...

Some users who don't have a "modern" PC will hate your app 
too, I think.
One should optimize EVERYTHING one can.

Staying on topic, I should say that it is possible to reduce 
executable size.
For example, as David says, GCC produces 700 KB for hello world. 
There is TCC (Tiny C Compiler). It produces a hello world executable of 2 
KB. It's a really nice C compiler; I respect its developers.

C++ also has threads and the other things you mentioned.
For example, the DMD compiler itself is about 1.6 MB. It is large and 
sophisticated. Nuff said.
Sep 20 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Sep 20, 2013 at 11:26:18AM -0600, Duke Normandin wrote:
 On Friday, 20-Sep-13 10:45 AM, Adam D. Ruppe wrote:
On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
Why such a huge difference???

The D program carries its additional D runtime library code with it, whereas the C program only depends on libraries provided by the operating system, and thus it doesn't have to include it in the exe.

Now that I know _why_ , is there a way to shave tons off those executables? Any optimization possible?

If you're on Linux:

	dmd -release -O myprogram.d
	strip myprogram
	upx myprogram

I've seen this reduce a 50MB executable down to about 400k. YMMV.

Keep in mind, though, that stripping basically deletes all debugging information from the executable (plus a bunch of other stuff -- you don't want to do this to an object file or a library, for example), so it's not something you want to do during development.

And upx turns your executable into something that probably violates the ELF spec in many different ways, but resembles it closely enough that the kernel will still run it. File type recognizers like 'file' may fail to recognize the result as an executable afterwards. But it will still work. (That's how cool upx is, in case you don't already know that.)


T

--
The right half of the brain controls the left half of the body. This means that only left-handed people are in their right mind. -- Manoj Srivastava
Sep 20 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 20 Sep 2013 21:45:48 +0200
"Temtaime" <temtaime gmail.com> wrote:
 
 Software MUST 
 run almost ANYWHERE and consume minimal resources.
 
 For example, I hate the 3ds Max developers when it uses 
 several GB of RAM on my game's map and sometimes freezes, while Blender uses 
 only 500 MB and runs fast. The only reason I use 3ds Max 
 is its friendlier controls. But that is another story...
 
 Some users who don't have a "modern" PC will hate your app 
 too, I think.
 One should optimize EVERYTHING one can.
 

I agree with what you're saying here, but the problem is we're looking at a difference of only a few hundred k. Heck, my primary PC was a 32-bit single-core right up until last year (and I still use it as a secondary system), and I didn't care one bit if a hello world was 1k or 1MB. How many real world programs are as trivial as a hello world? A few maybe, but not many. Certainly not enough to actually add up to anything significant, unless maybe you happen to be running on a 286 or such. If we were talking about real-world D programs taking tens/hundreds of MB more than they should, then that would be a problem. But they don't. We're just talking about a few hundred k for an *entire* program.
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 3:04 PM, Nick Sabalausky wrote:
 On Fri, 20 Sep 2013 21:45:48 +0200
 "Temtaime" <temtaime gmail.com> wrote:
 Software MUST
 run almost ANYWHERE and consume minimal resources.

 For example, I hate the 3ds Max developers when it uses
 several GB of RAM on my game's map and sometimes freezes, while Blender uses
 only 500 MB and runs fast. The only reason I use 3ds Max
 is its friendlier controls. But that is another story...

 Some users who don't have a "modern" PC will hate your app
 too, I think.
 One should optimize EVERYTHING one can.

I agree with what you're saying here, but the problem is we're looking at a difference of only a few hundred k. Heck, my primary PC was a 32-bit single-core right up until last year (and I still use it as a secondary system), and I didn't care one bit if a hello world was 1k or 1MB. How many real world programs are as trivial as a hello world? A few maybe, but not many. Certainly not enough to actually add up to anything significant, unless maybe you happen to be running on a 286 or such. If we were talking about real-world D programs taking tens/hundreds of MB more than they should, then that would be a problem. But they don't. We're just talking about a few hundred k for an *entire* program.

I should have been a bit more clear!! It's the _relative_ size difference that bothers me!! One is almost 26 times larger than the other. If I'm to expect that same variance in a large to huge project, then I think that I'd be in a world of bullshine!!
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 2:20 PM, H. S. Teoh wrote:
 On Fri, Sep 20, 2013 at 11:26:18AM -0600, Duke Normandin wrote:
 On Friday, 20-Sep-13 10:45 AM, Adam D. Ruppe wrote:
 On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
 Why such a huge difference???

The D program carries its additional D runtime library code with it, whereas the C program only depends on libraries provided by the operating system, and thus it doesn't have to include it in the exe.

Now that I know _why_ , is there a way to shave tons off those executables? Any optimization possible?

If you're on Linux:

	dmd -release -O myprogram.d
	strip myprogram
	upx myprogram

I've seen this reduce a 50MB executable down to about 400k. YMMV.

Keep in mind, though, that stripping basically deletes all debugging information from the executable (plus a bunch of other stuff -- you don't want to do this to an object file or a library, for example), so it's not something you want to do during development.

And upx turns your executable into something that probably violates the ELF spec in many different ways, but resembles it closely enough that the kernel will still run it. File type recognizers like 'file' may fail to recognize the result as an executable afterwards. But it will still work. (That's how cool upx is, in case you don't already know that.)

Thx! I'll have to do some experimenting ...
Sep 20 2013
prev sibling next sibling parent Duke Normandin <dukeofperl ml1.net> writes:
On Friday, 20-Sep-13 11:59 AM, JohnnyK wrote:
 On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
 I'm re-visiting the D language. I've compared the file sizes of 2
 executables - 1 is compiled C code using gcc; the other is D code
 using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

 Duke

That 140KB is called the CYA document. It is there so that when you, the programmer, screw up, you don't look so bad in front of your boss.

[quote]CYA document ...[/quote] sounds about right!!! :)
Sep 20 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Duke Normandin:

 I should have been a bit more clear!! It's the _relative_ size 
 difference that bothers me!! One is almost 26 times larger than 
 the other.

http://xkcd.com/605/

Bye,
bearophile
Sep 20 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Sep 20, 2013 at 05:04:23PM -0400, Nick Sabalausky wrote:
 On Fri, 20 Sep 2013 21:45:48 +0200
 "Temtaime" <temtaime gmail.com> wrote:
 
 Software MUST run almost ANYWHERE and consume minimal
 resources.
 
 For example, I hate the 3ds Max developers when it uses several GB
 of RAM on my game's map and sometimes freezes, while Blender uses only 500
 MB and runs fast. The only reason I use 3ds Max is its
 friendlier controls. But that is another story...
 
 Some users who don't have a "modern" PC will hate your app too, I
 think.  One should optimize EVERYTHING one can.
 

I agree with what you're saying here, but the problem is we're looking at a difference of only a few hundred k. Heck, my primary PC was a 32-bit single-core right up until last year (and I still use it as a secondary system), and I didn't care one bit if a hello world was 1k or 1MB.

I agree with the OP that dmd should improve dead-code culling, though. Recently Walter has started doing lazy template instantiation for imports, which begins to trim off some of the fat. But there's plenty of room for more improvements.

For example, after seeing Walter's recent pulls, I got inspired to write a simple utility that takes the output of objdump -d (the disassembly of an executable) and parses it to extract code symbols from the program along with references to other symbols. It then builds a graph of how symbols reference each other, and performs some trivial reachability analysis on it.

It revealed some startling results... like the fact that symbols from std.complex are included in a hello world program, even though complex numbers are never used! The ratio of total number of symbols to symbols transitively reachable from _Dmain is rather large, ranging from 5 (medium-sized, complex program) to about 30 (a hello world program).

Now I'm not 100% confident about the accuracy of these numbers, since some symbols may be indirectly referenced, and thus missed in the graph built from parsing the disassembly. But still, even when taken as ballpark figures, it shows that there's a *lot* of room for improvement. Certainly, some of the unreferenced symbols are druntime overhead (used by startup/exit functions, etc.), but a ratio of *5*? That's a 5x executable size bloat. Even if we discount half of that for druntime overhead and indirect references... I mean, how many indirect references can you have? I really can't convince myself that's "merely" druntime/phobos overhead. Especially when I see symbols from std.complex in a program that doesn't even use complex numbers. std.complex shouldn't be in there in the first place, before we even talk about template bloat.
 How many real world programs are as trivial as a hello world? A few
 maybe, but not many. Certainly not enough to actually add up to
 anything significant, unless maybe you happen to be running on a 286 or
 such.
 
 If we were talking about real-world D programs taking tens/hundreds of
 MB more than they should, then that would be a problem. But they
 don't. We're just talking about a few hundred k for an *entire* program.

My numbers show otherwise. :) Well, OK, I'm counting symbols rather than size, and the count may not be 100% accurate. But it does show that we could improve. By a lot.

A hello world program, according to my test, has a ratio of 30 between total symbols and symbols reachable from _Dmain, whereas a medium-sized complex program shows a ratio of around 5 (the symbol analyser program itself, which is significantly simpler than the complex program I tested, also shows a ratio of 5). So we can probably discount the hello world case, since most of the apparent bloat is probably just one-off overhead from druntime, etc..

But the ratio of 5 for non-trivial programs? No matter how I try to rationalize it, I'm forced to conclude that there is a lot of room for improvement here. Surely *some* significant subset of these unreferenced symbols must be actually unreachable and can be pruned from the executable.

I'll continue refining the analysis while Walter works on more lazy instantiations for imports. I'm expecting to see a lot of improvements in this area. :)


T

--
Take care of your clothes from when they are new, and of your health from when you are young. (Russian proverb)
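
The reachability step described in this post can be sketched roughly as follows. This is not H. S. Teoh's utility, which is not shown in the thread: the objdump parsing is left out, the graph is a hand-written toy, and `reachableFrom`, `allSymbols`, and every symbol name except `_Dmain` are hypothetical.

```d
import std.stdio;

/// Symbols transitively reachable from `root` in a reference graph
/// (symbol name -> names of the symbols it references).
bool[string] reachableFrom(string[][string] refs, string root)
{
    bool[string] seen;
    seen[root] = true;
    string[] queue = [root];

    // Breadth-first walk; `queue` grows while we step through it by index.
    for (size_t i = 0; i < queue.length; ++i)
    {
        if (auto targets = queue[i] in refs)
        {
            foreach (target; *targets)
            {
                if (target !in seen)
                {
                    seen[target] = true;
                    queue ~= target;
                }
            }
        }
    }
    return seen;
}

/// Every symbol that appears in the graph, as a source or as a target.
bool[string] allSymbols(string[][string] refs)
{
    bool[string] all;
    foreach (source, targets; refs)
    {
        all[source] = true;
        foreach (target; targets)
            all[target] = true;
    }
    return all;
}

void main()
{
    // Hypothetical toy graph, not real objdump output.
    string[][string] refs;
    refs["_Dmain"] = ["printLine", "helper"];
    refs["helper"] = ["printLine"];
    refs["unusedA"] = ["unusedB"];
    refs["unusedC"] = ["unusedA"];

    auto total = allSymbols(refs).length;
    auto live = reachableFrom(refs, "_Dmain").length;
    writefln("%d symbols, %d reachable from _Dmain, ratio %.2f",
             total, live, cast(double) total / live);
}
```

In the toy graph six symbols exist and three are reachable from `_Dmain`. As the post itself cautions, a graph built only from direct references misses symbols that are reached indirectly, so a ratio computed this way tends to overstate the real bloat.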
Sep 20 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 20 Sep 2013 16:49:58 -0600
Duke Normandin <dukeofperl ml1.net> wrote:
 
 I should have been a bit more clear!! It's the _relative_ size 
 difference that bothers me!! One is almost 26 times larger than the 
 other. If I'm to expect that same variance in a large to huge 
 project, then I think that I'd be in a world of bullshine!!

If you're to expect that same variance in a large, huge or even normal sized program then you're very, very mistaken.
Sep 20 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 21 September 2013 09:02, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Fri, Sep 20, 2013 at 05:04:23PM -0400, Nick Sabalausky wrote:
 On Fri, 20 Sep 2013 21:45:48 +0200
 "Temtaime" <temtaime gmail.com> wrote:
 Software MUST run almost ANYWHERE and consume minimal
 resources.

 For example, I hate the 3ds Max developers when it uses several GB
 of RAM on my game's map and sometimes freezes, while Blender uses only 500
 MB and runs fast. The only reason I use 3ds Max is its
 friendlier controls. But that is another story...

 Some users who don't have a "modern" PC will hate your app too, I
 think.  One should optimize EVERYTHING one can.

I agree with what you're saying here, but the problem is we're looking at a difference of only a few hundred k. Heck, my primary PC was a 32-bit single-core right up until last year (and I still use it as a secondary system), and I didn't care one bit if a hello world was 1k or 1MB.

I agree with the OP that dmd should improve dead-code culling, though. Recently Walter has started doing lazy template instantiation for imports, which begins to trim off some of the fat. But there's plenty of room for more improvements. For example, after seeing Walter's recent pulls, I got inspired to write a simple utility that takes the output of objdump -d (the disassembly of an executable) and parses it to extract code symbols from the program along with references to other symbols. It then builds of graph of how symbols reference each other, and performs some trivial reachability analysis on it. It revealed some startling results... like the fact that symbols from std.complex are included in a hello world program, even though complex numbers are never used! The ratio of total number of symbols to symbols transitively reachable from _Dmain is rather large, ranging from 5 (medium-sized, complex program) to about 30 (a hello world program). Now I'm not 100% confident about the accuracy of these numbers, since some symbols may be indirectly referenced, and thus missed in the graph built from parsing the disassembly. But still, even when taken as ballpark figures, it shows that there's a *lot* of room for improvement. Certainly, some of the unreferenced symbols are druntime overhead (used by startup/exit functions, etc.), but a ratio of *5*? That's a 5x executable size bloat. Even if we discount half of that for druntime overhead and indirect references... I mean, how many indirect references can you have? I really can't convince myself that's "merely" druntime/phobos overhead. Especially when I see symbols from std.complex in a program that doesn't even use complex numbers. std.complex shouldn't be in there in the first place, before we even talk about template bloat.
 How many real world programs are as trivial as a hello world? A few
 maybe, but not many. Certainly not enough to actually add up to
 anything significant, unless maybe you happen to be running on a 286 or
 such.

 If we were talking about real-world D programs taking tens/hundreds of
 MB more than they should, then that would be a problem. But they
 don't. We're just talking about a few hundred k for an *entire* program.

My numbers show otherwise. :) Well, OK, I'm counting symbols rather than size, and the count may not be 100% accurate. But it does show that we could improve. By a lot.

A hello world program, according to my test, has a ratio of 30 between total symbols and symbols reachable from _Dmain, whereas a medium-sized complex program shows a ratio of around 5 (the symbol analyser program itself, which is significantly simpler than the complex program I tested, also shows a ratio of 5). So we can probably discount the hello world case, since most of the apparent bloat is probably just one-off overhead from druntime, etc. But the ratio of 5 for non-trivial programs? No matter how I try to rationalize it, I'm forced to conclude that there is a lot of room for improvement here. Surely *some* significant subset of these unreferenced symbols must be actually unreachable and can be pruned from the executable.

I'll continue refining the analysis while Walter works on more lazy instantiations for imports. I'm expecting to see a lot of improvements in this area. :)

This is awesome.
What would be really awesome is if you integrated this into the D auto-builder, and hacked it to publish the results somewhere for the latest build.
It would be good to know when people write code that results in a significant increase in coverage (particularly when it doesn't need to).
It would also provide very useful information for hackers who just want to get in and do some work to try and trim it a bit.
Sep 20 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 20 September 2013 at 23:03:48 UTC, H. S. Teoh wrote:
 I'll continue refining the analysis while Walter works on more 
 lazy
 instantiations for imports. I'm expecting to see a lot of 
 improvements
 in this area. :)

I have been doing similar analysis for some time too, only mostly manually (I was curious what symbols actually get included for trivial programs), with pretty much the same conclusion. Right now I am pretty much convinced that we need some sort of whole program optimization, and a tweak to the language spec to allow it safely (i.e. force dynamically loaded symbols to be marked with export).

A lot of code bloat comes from stuff which is unnecessary in the big picture, but the compiler has no means to decide that during compilation. There is no real reason why

`[1, 2, 3].map!(a => a*2)().reduce!((a, b) => a + b)(0)`

can't be reduced to a single loop and inlined, leaving no traces of actual std.algorithm usage, other than that the compiler can't possibly be sure that you won't try to link to those generated instances somewhere (or pass them to a shared library). That feels like a language design issue to address.
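
To make "reduced to a single loop" concrete, here is a small sketch of the two shapes being compared. It is written in Python purely as an illustration of the equivalence (Python itself performs no such optimization), and the function names `pipeline` and `fused` are invented for this example. The first function mirrors the map/reduce chain from the post; the second is the loop an ideal optimizer could emit, with no trace of the generic algorithm instances left behind.

```python
from functools import reduce


def pipeline(values):
    """Generic form: a map stage feeding a reduce stage, seeded with 0."""
    doubled = map(lambda a: a * 2, values)
    return reduce(lambda a, b: a + b, doubled, 0)


def fused(values):
    """Hand-fused form: one loop, no intermediate stages or callbacks."""
    total = 0
    for a in values:
        total += a * 2
    return total


print(pipeline([1, 2, 3]), fused([1, 2, 3]))
```

Both calls compute the same sum. The point made in the post is that, in a compiled language, the generic version also leaves template instances in the object file that nothing refers to once the loop has been inlined.
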
Sep 21 2013
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 21 September 2013 at 10:29:35 UTC, Dicebot wrote:
 Lot of code bloat comes from stuff which is unnecessary in the
 big picture but compiler has no means to decide it during
 compilation. There is no real reason why

 `[1, 2, 3].map!(a => a*2)().reduce!((a, b) => a + b)(0)`

 can't be reduced to a single loop and inlined, leaving no traces
 of actual std.algorithm usage.

There's no theoretical reason, but plenty of practical reasons. bearophile linked to a talk by Chandler Carruth that explains the difficulties encountered by inlining optimisers.
Sep 21 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 September 2013 at 10:53:17 UTC, Peter Alexander 
wrote:
 On Saturday, 21 September 2013 at 10:29:35 UTC, Dicebot wrote:
 Lot of code bloat comes from stuff which is unnecessary in the
 big picture but compiler has no means to decide it during
 compilation. There is no real reason why

 `[1, 2, 3].map!(a => a*2)().reduce!((a, b) => a + b)(0)`

 can't be reduced to a single loop and inlined, leaving no traces
 of actual std.algorithm usage.

There's no theoretical reason, but plenty of practical reasons. bearophile linked to a talk by Chandler Carruth that explains the difficulties encountered by inlining optimisers.

I wasn't referring to actual inlining but to "remove everything unused that is left after inlining". Your point is solid, of course: there is nothing trivial about robust inline optimizations, but it is possible within the existing language design.
Sep 21 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 21 September 2013 at 10:53:17 UTC, Peter Alexander 
wrote:
 On Saturday, 21 September 2013 at 10:29:35 UTC, Dicebot wrote:
 Lot of code bloat comes from stuff which is unnecessary in the
 big picture but compiler has no means to decide it during
 compilation. There is no real reason why

 `[1, 2, 3].map!(a => a*2)().reduce!((a, b) => a + b)(0)`

 can't be reduced to a single loop and inlined, leaving no traces
 of actual std.algorithm usage.

There's no theoretical reason, but plenty of practical reasons. bearophile linked to a talk by Chandler Carruth that explains the difficulties encountered by inlining optimisers.

Either you are confusing bearophile with me ( http://forum.dlang.org/thread/mvbqiwajntrivndylelw forum.dlang.org?page=8#post-mqcwjbgxildixehsxt e:40forum.dlang.org ) or I missed that post by bearophile.

Also, Dicebot has some very good points. Functions can't be stripped from the executable if they are exported by default.
Sep 21 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
It's an executable, not a DLL. So any function can be stripped.
Can't it?
Sep 21 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 September 2013 at 11:17:46 UTC, Temtaime wrote:
 It's an executable, not a DLL. So any function can be stripped.
 Can't it?

Not any. You must preserve those symbols that are exposed to a DLL via callbacks or parameter types (functions are not the only symbols that cause bloat). Now, it may be possible for the compiler to detect those automatically, as passing a parameter implies a manual reference from code, but I am not sure about that (D never stops surprising me with the weird hacks it can do :P)
Sep 21 2013
prev sibling next sibling parent "Temtaime" <temtaime gmail.com> writes:
Are you talking about passing a function via pointer to WinAPI, for
example?
The logic is simple: if someone takes a function's address, then the
function cannot be stripped. That's the logic of all C++ compilers.
Sep 21 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 September 2013 at 11:34:10 UTC, Temtaime wrote:
 Are you talking about passing a function via pointer to WinAPI,
 for example?
 The logic is simple: if someone takes a function's address, then
 the function cannot be stripped. That's the logic of all C++ compilers.

More like passing an object instance to a plugin which knows it only via a .di import. The compiler can't possibly know which methods of that object (or functions indirectly accessible from it) will be available in the .di and/or called, and must act conservatively, preserving everything. It will also need to be aware of the fact that a function pointer retrieved via `dlsym` is actually some external function, and use that knowledge during optimization.

Also it is worth noting that naive preservation of all functions that have had their address taken may not work very well with the frequent use of lambdas for algorithms in D.

Same stuff with inheritance. It is just another side of the problem of why the compiler can't de-virtualize certain methods based on the whole program class graph.

I won't be as harsh as to say it is impossible, but this clearly requires defining some parts of the language that are currently vague.

P.S. C++ compilers are not much better in that regard, unless you are going to try some non-standard tweaks.
Sep 21 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 September 2013 at 11:46:13 UTC, Dicebot wrote:
 ...

P.S. A lot of those problems can be avoided even without Whole Program Optimization if an internal linkage attribute is introduced ;)
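
The argument about exported symbols and internal linkage can be sketched as a question of which symbols count as roots when deciding what is live. The following Python sketch is an illustration under stated assumptions, not a description of how dmd or any linker works: the call graph `CALLS` and every symbol name in it are hypothetical. If everything is exported by default, every symbol is a root and nothing can be stripped; if only explicitly exported symbols (plus the entry point) are roots, unreferenced template instances and lambdas become removable.

```python
def live_symbols(calls, roots):
    """Symbols reachable from any root in the call graph."""
    live = set()
    stack = list(roots)
    while stack:
        sym = stack.pop()
        if sym in live:
            continue
        live.add(sym)
        stack.extend(calls.get(sym, ()))
    return live


# Hypothetical call graph: who references whom.
CALLS = {
    "_Dmain": {"parse"},
    "parse": set(),
    "map_instance": {"lambda_1"},      # leftover template instance after inlining
    "lambda_1": set(),
    "plugin_hook": {"format_report"},  # meant to be called from a shared library
    "format_report": set(),
}

# Export by default: any symbol may be referenced from outside, so all are roots.
export_by_default = live_symbols(CALLS, CALLS.keys())

# Explicit export (or internal linkage for the rest): only marked symbols are roots.
explicit_export = live_symbols(CALLS, {"_Dmain", "plugin_hook"})

strippable = set(CALLS) - explicit_export
print(sorted(strippable))
```

With the first policy the live set equals the whole program. With the second, the leftover instance and its lambda drop out while the plugin-facing function and everything it calls are kept.
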
Sep 21 2013
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 21 September 2013 at 11:11:09 UTC, deadalnix wrote:
 On Saturday, 21 September 2013 at 10:53:17 UTC, Peter Alexander
 There's no theoretical reason, but plenty of practical 
 reasons. bearophile linked to a talk by Chandler Carruth that 
 explains the difficulties encountered by inlining optimisers.

Either you are confusing bearophile with me ( http://forum.dlang.org/thread/mvbqiwajntrivndylelw forum.dlang.org?page=8#post-mqcwjbgxildixehsxt e:40forum.dlang.org ) or I missed that post by bearophile.

Sorry, I am just confused :-)
Sep 21 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 21 September 2013 21:34, Temtaime <temtaime gmail.com> wrote:

 Are you talking about passing a function via pointer to WinAPI, for example?
 The logic is simple: if someone takes a function's address, then the function
 cannot be stripped. That's the logic of all C++ compilers.

Totally OT, but every single time I read your name when you post, I can't help but start hearing lines from Terry Pratchett's Hogfather in my head...
http://www.youtube.com/watch?v=M0mU3393PGk

Although I suspect not many people would have seen it.
My brain is a strange place...
Sep 21 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin 
wrote:
 I'm re-visiting the D language. I've compared the file sizes of 
 2 executables - 1 is compiled C code using gcc; the other is D 
 code using dmd.

 helloWorld.d => helloWorld.exe = 146,972 bytes
 ex1hello.c => ex1-hello.exe = 5,661 bytes

 Why such a huge difference???

You can upload a .map file here, and see what's taking up all the space: http://thecybershadow.net/d/mapview/
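
For readers without a .map file at hand, a rough equivalent can be approximated from a symbol listing. The sketch below is not the map viewer linked above and does not read its input format; it swaps in a different technique and assumes output in the style of GNU `nm -S` (address, size in hex, type letter, name) from an ELF build. It also assumes the classic D mangling scheme, where names start with `_D` followed by length-prefixed identifiers. The sample listing and every mangled name in it are made up for illustration.

```python
from collections import Counter


def parse_nm(output):
    """Yield (size_in_bytes, name) from `nm -S` style lines: addr size type name."""
    for line in output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # undefined symbols, or symbols listed without a size
        _addr, size, _kind, name = parts
        try:
            yield int(size, 16), name
        except ValueError:
            continue  # not the four-column layout assumed here


def d_module(name, depth=2):
    """Best-effort package prefix of a D mangled name, e.g. _D3std7complex... -> std.complex."""
    if not name.startswith("_D"):
        return "(non-D)"
    pos, parts = 2, []
    while len(parts) < depth and pos < len(name) and name[pos].isdigit():
        end = pos
        while end < len(name) and name[end].isdigit():
            end += 1
        length = int(name[pos:end])
        parts.append(name[end:end + length])
        pos = end + length
    return ".".join(parts) or "(unknown)"


def size_by_module(output):
    """Total symbol size per module prefix."""
    totals = Counter()
    for size, name in parse_nm(output):
        totals[d_module(name)] += size
    return totals


# Made-up listing in the assumed `nm -S` layout; the names are hypothetical.
SAMPLE = """\
0000000000001000 0000000000000040 T _Dmain
0000000000001040 0000000000000100 T _D3std7complex4fakeFZv
0000000000001140 0000000000000020 T _D3std5stdio4fakeFZv
0000000000001160 0000000000000010 T _D3std5stdio5fake2FZv
                 U puts
"""

for module, size in size_by_module(SAMPLE).most_common():
    print(f"{size:6d}  {module}")
```

Grouping by the first two identifiers is a crude stand-in for the module name, since packages can be nested more deeply, so treat the per-module totals as a starting point for investigation rather than an exact breakdown.
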
Sep 21 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Sep 21, 2013 at 09:41:30PM +0200, Vladimir Panteleev wrote:
 On Friday, 20 September 2013 at 16:20:34 UTC, Duke Normandin wrote:
I'm re-visiting the D language. I've compared the file sizes of 2
executables - 1 is compiled C code using gcc; the other is D code
using dmd.

helloWorld.d => helloWorld.exe = 146,972 bytes
ex1hello.c => ex1-hello.exe = 5,661 bytes

Why such a huge difference???

You can upload a .map file here, and see what's taking up all the space: http://thecybershadow.net/d/mapview/

Ah, you beat me to it. :-) T -- All problems are easy in retrospect.
Sep 23 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Sep 21, 2013, at 8:49 AM, Manu <turkeyman gmail.com> wrote:

 On 21 September 2013 21:34, Temtaime <temtaime gmail.com> wrote:
 Are you saying about passing a function via pointer to winapi for example?
 The logic is simple: if someone gets function address, then function
 cannot be stripped. It's logic of all c++ compilers.

 Totally OT, but every single time I read your name when you post, I
 can't help but start hearing lines from Terry Pratchett's Hogfather in my
 head...

Same here.
Sep 23 2013
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On Sep 20, 2013 5:40 PM, "Temtaime" <temtaime gmail.com> wrote:
 DMD likes the size.
 When compiling, compiler may use GBs of RAM.
 In resulting executable there is no dead/unused code elimination.

Three random sentences that are not at all factual. :)

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Sep 24 2013