
digitalmars.D - Potential of a compiler that creates the executable at once

reply rempas <rempas tutanota.com> writes:
A couple of months ago, I found out about a language called 
[Vox](https://github.com/MrSmith33/vox) whose compiler uses a 
design I haven't seen in any other: instead of creating object 
files and then linking them together, it always creates the 
executable in one go. This means that every time we change 
something in our code, the whole thing has to be recompiled. 
Naturally, you would expect this to be a huge problem because we 
would have to wait a long time after every small change to our 
project, but here is the thing... With this design, compilation 
can become really, really fast (of course, the design of the 
compiler matters too)!

At some point about 3 months ago, the creator of the language 
said that Vox could by then compile 1.2M LoC/s, which is really, 
really fast. 99% of projects will never reach that size, so your 
project will always compile in less than a second no matter what! 
What is even more impressive is that Vox is single-threaded, so 
we could get a much bigger performance boost when parsing the 
files for symbols and errors if we had multithreading support!

Of course, not creating object files and then linking them means 
that we don't have to write out a lot of object files and then 
link them all into a big executable; instead we start building 
the executable right away and add everything to it as we go. You 
can see how this can save a lot of time! And CPUs are so fast 
these days that with multithreading we can compile millions of 
lines of code in less than a second, so even the very rare huge 
projects will compile very fast.
What's even more impressive is that Vox is not even the fastest 
compiler out there. TCC is even faster (about 4-5 times)! I have 
personally tried to see how fast TCC can compile on my CPU, a 
Ryzen 5 2400G. I was able to compile 4M LoC in 700ms! Yeah, the 
speeds are crazy! And my CPU is an average one; if you were to 
build a PC now, you would get something at least 20% faster with 
at least 2 more threads!

However, this was not the best test. It was only one-line 
functions with the same assembly code in them, without any 
preprocessing or linked libraries, so I don't know if that played 
any role; but it was 8 files using 8 threads, and the speed is 
just unreal! And TCC DOES create object files and then links 
them. How much faster could it be if it used the same design Vox 
uses (and how much slower would Vox be if it used the design 
regular compilers use)?
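For anyone who wants to try a (much smaller) version of this benchmark, here is a sketch of the kind of harness I mean. The generator and names are hypothetical illustrations; it just uses whatever `tcc` or `cc` it finds on the PATH, and skips the timing step if neither exists.

```python
import os
import shutil
import subprocess
import tempfile
import time

def gen_source(nfuncs, with_main=False):
    """Generate C source made of identical one-line functions."""
    lines = ["int fn_%d(void) { return %d; }" % (i, i) for i in range(nfuncs)]
    if with_main:
        lines.append("int main(void) { return fn_0(); }")
    return "\n".join(lines) + "\n"

def time_compile(nfuncs=1000):
    """Time a single source -> executable compilation, if a compiler exists."""
    cc = shutil.which("tcc") or shutil.which("cc")
    if cc is None:
        print("no C compiler found, skipping")
        return None
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "bench.c")
        exe = os.path.join(d, "bench")
        with open(src, "w") as f:
            f.write(gen_source(nfuncs, with_main=True))
        t0 = time.perf_counter()
        # Source straight to executable in one invocation
        subprocess.run([cc, src, "-o", exe], check=True)
        dt = time.perf_counter() - t0
        print("%s: %d lines in %.3fs" % (cc, nfuncs + 1, dt))
        return dt

if __name__ == "__main__":
    time_compile()
```

Bumping `nfuncs` into the millions reproduces the spirit of the 4M LoC test, though of course identical one-liners are the easiest possible input for a compiler.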

Of course, TCC doesn't produce optimized code, but still: even 
compared with GCC at "-O0", it compiles 4-7 times faster than 
GCC. So if TCC could optimize code as well as GCC and used the 
design Vox uses, I could see it being able to compile around 
1-1.5M LoC/s!

I am personally really interested in this design and inspired by 
it to make my own compiler. It also solves a lot of problems that 
we would have to take into account with the classic method. One 
thing I thought of is the ability to also export your project as 
a library (mostly shared/dynamic), so in case you have something 
really huge like 10+M LoC (Linux kernel, I'm talking to you!), 
you could split it into "sub-projects" that are libraries and 
then link them all together.

Another idea would be to check the type of the files that are 
passed to the compiler and, if they are source files, not create 
object files for them, as they would not be kept anyway. So the 
following would apply:

```
my_lang -c test3.lang // Compile mode! Outputs the object file
// "test3.o".

my_lang test1.lang test2.lang test3.o -o=TEST // Create an
// executable. "test1.lang" and "test2.lang" are source files, so we
// won't create object files for them but will go straight to
// creating a binary out of them. "test3.o" is an object file, so we
// will "copy-paste" its symbols into the final binary.
```
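The dispatch such a driver would do can be sketched like this (hypothetical `my_lang` front end; the `.lang` extension is from the example above):

```python
def plan(args):
    """Decide what to do with each input file by its extension."""
    sources = [a for a in args if a.endswith(".lang")]
    objects = [a for a in args if a.endswith(".o")]
    # Source files go straight to the in-memory code generator;
    # object files just get their symbols copied into the output.
    return {"compile_direct": sources, "link_in": objects}

print(plan(["test1.lang", "test2.lang", "test3.o"]))
# prints {'compile_direct': ['test1.lang', 'test2.lang'], 'link_in': ['test3.o']}
```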

This is probably the best of both worlds!

So I thought I'd share this and see what your thoughts are! How 
fast could DMD be using this design? Or even better, what if we 
created a new backend for DMD that would be faster than the 
current one? D could be very competitive!
Feb 10 2022
next sibling parent reply Araq <rumpf_a web.de> writes:
On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 This is probably the best of both worlds!
It's a very bad idea, it's in fact so bad that I wouldn't call it a "design":

- Since everything is recompiled all the time regardless, there is no incentive for "modularity" in the language design. Nor is there any incentive to keep the compiler's internals clean. Soon everything in the compiler operates on an enormous mutable graph internally, encouraging many, many bugs.
- You'll likely run into memory management problems too, as you cannot free memory when everything is connected to everything else. Even if you are willing to use a GC, the GC cannot help you much as your liveset simply keeps growing.
- Every compiler bugfix tends to add code to a compiler, so it'll get slower over time.
- The same is true for the memory consumption, it'll get worse over time.
- Every optimization you add to the compiler must not destroy your lovely compile times. So everything in the compiler is speed-critical and has to be optimized. Almost anything you do ends up being on the critical path.
- This does not only affect optimizations (which can depend on algorithms that are O(n^3), btw) but also all sorts of linting phases. And static analysis gets more important over time too.

In summary: people expect optimizers and static analysis to get better too and demand more of their tools. Your "design" doesn't allow for this. And in an IDE setting you might be able to skip all the expensive optimization steps, but not the static analyser steps.
Feb 10 2022
parent rempas <rempas tutanota.com> writes:
On Thursday, 10 February 2022 at 10:38:05 UTC, Araq wrote:
 It's a very bad idea, it's in fact so bad that I wouldn't call 
 it a "design":

 - Since everything is recompiled all the time regardless, there 
 is no incentive for "modularity" in the language design. Nor is 
 there any incentive to keep the compiler's internals clean. 
 Soon everything in the compiler operates on an enormous mutable 
 graph internally, encouraging many, many bugs.
 - You'll likely run into memory management problems too as you 
 cannot free memory  as everything is connected to everything 
 else. Even if you are willing to use a GC the GC cannot help 
 you much as your liveset simply keeps growing.
 - Every compiler bugfix tends to add code to a compiler, so 
 it'll get slower over time.
 - The same is true for the memory consumption, it'll get worse 
 over time.
 - Every optimization you add to the compiler must not destroy 
 your lovely compile-times. So everything in the compiler is 
 speed-critical and has to be optimized. Almost anything you do 
 ends up being on the critical path.
 - This does not only affect optimizations (which can depend on 
 algorithms that are O(n^3) btw) but also all sorts of linting 
 phases. And static analysis gets more important over time too.

 In summary: People expect optimizers and static analysis to get 
 better too and demand more of their tools. Your "design" 
 doesn't allow for this. And in an IDE setting you might be able 
 to skip all the expensive optimization steps, but not the 
 static analyser steps.
Thank you for your reply! I suppose you are right and I'm glad I asked people with more experience than me. It would be fun to hear more negative thoughts to see all the things that I'm missing.
Feb 10 2022
prev sibling next sibling parent reply bauss <jj_1337 live.dk> writes:
On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 At some point about 3 months ago, the creator of the language 
 said that at that point, Vox can compile 1.2M LoC/S which is 
 really really fast and this is a point that 99% of the projects 
 will not reach so your project will always compiler in less 
 than a second no matter what! What is even more impressive is 
 that Vox is single thread so when parsing the files for symbols 
 and errors, why could get a much bigger performance boost if we 
 had multithread support!
You see, there's a large misconception here.

Typically slow compile times aren't due to the LoC a project has, but rather what happens during the compilation. E.g. template instantiation, functions executed at CTFE, preprocessing, optimization etc.

I've seen projects with only a couple thousand lines of code compile slower than projects with hundreds of thousands of lines of code.

Generally most compilers can read large source files and parse their tokens really fast; it's usually what happens afterwards that is the bottleneck.

Say you have a project that compiles very slowly: usually you won't start out by cutting the number of lines you have, because that's often not easy or even possible, but rather you profile where the compiler is spending most of its time and then attempt to resolve it, e.g. perhaps you're running unnecessary nested loops at compile time, and so on.
Feb 10 2022
next sibling parent Mark <smarksc gmail.com> writes:
On Thursday, 10 February 2022 at 11:54:59 UTC, bauss wrote:
 On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 At some point about 3 months ago, the creator of the language 
 said that at that point, Vox can compile 1.2M LoC/S which is 
 really really fast and this is a point that 99% of the 
 projects will not reach so your project will always compiler 
 in less than a second no matter what! What is even more 
 impressive is that Vox is single thread so when parsing the 
 files for symbols and errors, why could get a much bigger 
 performance boost if we had multithread support!
You see, there's a large misconception here. Typically slow compile times aren't due to the LoC a project has, but rather what happens during the compilation. Ex. template instantiation, functions executed at ctfe, preprocessing, optimization etc.
If you generate an executable directly (without going through compilation to object files and then linking), then you can save some compile time on these tasks, no?

For instance, you can maintain some sort of global cache so that repeated instantiations of the same template (in different compilation units) are detected during compilation; this then saves you time on compiling something that you have already compiled before. I assume that such repeated instantiations are very common when there is heavy usage of the standard library.

The same goes for identical CTFEs and any other compilation step that can potentially repeat in different compilation units. Assuming link-time optimization, the end result (the executable) should be the same, but the compile times will be different.
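The global-cache idea can be sketched roughly like this (hypothetical names; a real compiler would key on mangled symbols and hashed template arguments rather than plain strings):

```python
# Sketch of a whole-program instantiation cache: when the compiler sees
# the same template instantiated with the same arguments in another
# "compilation unit", it reuses the already-generated code instead of
# regenerating it.

instantiation_cache = {}
codegen_calls = 0

def instantiate(template_name, args):
    """Return (and cache) the generated code for template_name!(args)."""
    global codegen_calls
    key = (template_name, tuple(args))
    if key in instantiation_cache:
        return instantiation_cache[key]   # cache hit: no codegen work
    codegen_calls += 1                    # cache miss: do the work once
    code = "code for %s!(%s)" % (template_name, ", ".join(args))
    instantiation_cache[key] = code
    return code

# Two "modules" both using standard-library-style templates:
for unit in (["to", "format"], ["format", "map"]):
    for t in unit:
        instantiate(t, ["string"])

print(codegen_calls)  # prints 3: "format" is generated once, not twice
```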
Feb 10 2022
prev sibling parent rempas <rempas tutanota.com> writes:
On Thursday, 10 February 2022 at 11:54:59 UTC, bauss wrote:
 You see, there's a large misconception here.

 Typically slow compile times aren't due to the LoC a project 
 has, but rather what happens during the compilation.

 Ex. template instantiation, functions executed at ctfe, 
 preprocessing, optimization etc.

 I've seen projects with only a couple thousand lines of code 
 compile slower than projects with hundreds of thousands of 
 lines of code.
Yeah, of course! There is no misconception here; templates play a role. When talking about LoC/s I'm talking about clean lines, and this is why I made it clear that in my example with TCC I didn't use any preprocessor, hence the 4M LoC were exactly 4M.
 Generally most compiles can read large source files and parse 
 their tokens etc. really fast, it's usually what happens 
 afterwards that are the bottleneck.

 Say if you have a project that is compiling very slow, usually 
 you won't start out by cutting the amount of lines you have, 
 because that's often not as easy or even possible, but rather 
 you profile where the compiler is spending most of its time and 
 then you attempt to resolve it, ex. perhaps you're running 
 nested loops that are unnecessary etc. at compile-time and so 
 on.
Of course, the backend is what matters. TCC goes from source file to object file directly. GCC/D/Rust etc. go from source file to IR (maybe DMD doesn't, but LDC and GDC do), then to assembly, and then to the object file, so this takes many times longer than doing it directly. But even then, TCC/Vox are many times faster still, so there is something more to it. Idk...
Feb 11 2022
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
This is actually the reason behind why dmd will create a single object file when 
given multiple source files on the command line. It's also why dmd can create a 
library directly.

I've toyed with the idea of generating an executable directly many times.
Feb 10 2022
parent reply rempas <rempas tutanota.com> writes:
On Thursday, 10 February 2022 at 20:39:33 UTC, Walter Bright 
wrote:
 This is actually the reason behind why dmd will create a single 
 object file when given multiple source files on the command 
 line. It's also why dmd can create a library directly.

 I've toyed with the idea of generating an executable directly 
 many times.
That's nice to hear! However, does DMD generate object files directly, or "asm" files that are passed to a C compiler? If I remember correctly, people told me that LDC2 needs to pass its output to a C compiler, so what's the case with DMD? I tried to compile a C library (code converted to D to use with DMD, rather than using "ImportC") with GCC and DMD, and it turns out that DMD is about 70-80% faster than GCC, which is good, but I suppose it could be even better, given the design of D as a language, if DMD outputs object files directly. Do you think there are any very bad places in DMD's backend? Has anyone on the team thought about rewriting the backend (or parts of it) from scratch?
Feb 11 2022
next sibling parent reply max haughton <maxhaton gmail.com> writes:
On Friday, 11 February 2022 at 12:34:21 UTC, rempas wrote:
 On Thursday, 10 February 2022 at 20:39:33 UTC, Walter Bright 
 wrote:
 [...]
That's nice to hear! However, does DMD generates object files directly or "asm" files that are passed to a C compile? If I remember correctly, LDC2 needs to pass the output to a C compiler as people told me so what's the case from DMD? [...]
The object emission code in the backend is quite inefficient, it needs to be rewritten (it's horrible old code anyway)
Feb 11 2022
next sibling parent reply rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 14:52:09 UTC, max haughton wrote:
 The object emission code in the backend is quite inefficient, 
 it needs to be rewritten (it's horrible old code anyway)
I would love it if they did it, but I can't complain that they don't. OpenHub reports that DMD consists of 961K LoC!! I know that D is a huge language, so the frontend will be a good part of that, and code for some other stuff (including a lot of the backend) will probably not change. But this is still A LOT to work on! Maybe they can do it for D 3.0, along with removing the need for the GC to use Phobos (and the ability to turn that off in the compiler); then I can see D becoming as big as it was intended to be! But dreams are free...
Feb 11 2022
parent reply user1234 <user1234 12.de> writes:
On Friday, 11 February 2022 at 15:17:16 UTC, rempas wrote:
 On Friday, 11 February 2022 at 14:52:09 UTC, max haughton wrote:
 The object emission code in the backend is quite inefficient, 
 it needs to be rewritten (it's horrible old code anyway)
I would love if they would do it but I can't complain that they don't. Openhub reports that [DMD] consists of 961K LoC!!
OpenHub and their metrics are old trash. It's more like 170K according to D-Scanner.
Feb 11 2022
parent reply user1234 <user1234 12.de> writes:
On Friday, 11 February 2022 at 16:41:33 UTC, user1234 wrote:
 On Friday, 11 February 2022 at 15:17:16 UTC, rempas wrote:
 On Friday, 11 February 2022 at 14:52:09 UTC, max haughton 
 wrote:
 The object emission code in the backend is quite inefficient, 
 it needs to be rewritten (it's horrible old code anyway)
I would love if they would do it but I can't complain that they don't. Openhub reports that [DMD] consists of 961K LoC!!
Openhub and their metrics are old trash. It's more 170K according to D-Scanner.
wait... it's 175K. I hadn't pulled for 8 months or so. There's much new code that has been committed since, notably ImportC.
Feb 11 2022
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 11, 2022 at 04:47:46PM +0000, user1234 via Digitalmars-d wrote:
 On Friday, 11 February 2022 at 16:41:33 UTC, user1234 wrote:
 On Friday, 11 February 2022 at 15:17:16 UTC, rempas wrote:
 On Friday, 11 February 2022 at 14:52:09 UTC, max haughton wrote:
 
 The object emission code in the backend is quite inefficient, it
 needs to be rewritten (it's horrible old code anyway)
I would love if they would do it but I can't complain that they don't. Openhub reports that [DMD] consists of 961K LoC!!
Openhub and their metrics are old trash. It's more 170K according to D-Scanner.
wait... it's 175K. I had not pulled since 8 monthes or so. There's much new code that was commited since, with importC notably.
I pulled just this week, and running `wc` on *.d *.c *.h says there are 365K lines. I'm not sure what the *.h files are for, since DMD is now bootstrapping. Excluding *.h yields 347K lines. But a lot of those are actually blank lines and comments; excluding // comments, /**/ and /++/ block comments, and blank lines yields 175K. The 961K probably comes from the myriad test cases in the testsuite, where more lines is actually a *good* thing.

But really, LoC is an unreliable measure of code complexity. Token count would be more reflective of the actual complexity of the code, though even that is questionable. Writing `enum x = 1 + 1;` would be 7 tokens vs. `enum x = 2;` which is 5 tokens, for example, but the former may actually make code easier to read in certain cases (e.g., if the longer expression makes intent clearer than the shorter one).

Compressed size may be an even better approximation, because a high degree of complexity approaches Kolmogorov complexity in the limit, which is a measure of the information content of the data. Stripping comments and compressing (with the best compression algorithm you can find), for example, would give a good approximation to the actual complexity of the code.

Though of course, even that fails to measure the inherent level of complexity in language constructs. So you couldn't meaningfully compare compressed sizes across different languages, for example.

T

-- 
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn
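The compressed-size idea can be tried as a toy experiment (just a sketch using zlib as the compressor; comment stripping is left out, and a stronger compressor would approximate Kolmogorov complexity better):

```python
import zlib

def complexity(source):
    """Rough complexity proxy: compressed size of the source text."""
    return len(zlib.compress(source.encode(), 9))

# Two files with the same LoC but very different information content:
boilerplate = "int fn() { return 0; }\n" * 200          # 200 identical lines
varied = "".join("int f%d() { return %d * %d; }\n" % (i, i, i + 1)
                 for i in range(200))                    # 200 distinct lines

print(complexity(boilerplate) < complexity(varied))  # prints True
```

Both inputs are 200 lines, yet the repetitive one compresses to a fraction of the size, which is exactly the point: LoC counts them the same, compressed size does not.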
Feb 11 2022
next sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:

 I pulled just this week, and running `wc` on *.d *.c *.h says...
https://github.com/AlDanial/cloc would yield a more practical metric, at least as far as "practical metric" in terms of LoC goes.
Feb 11 2022
next sibling parent reply max haughton <maxhaton gmail.com> writes:
On Friday, 11 February 2022 at 17:44:45 UTC, Stanislav Blinov 
wrote:
 On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:

 I pulled just this week, and running `wc` on *.d *.c *.h 
 says...
https://github.com/AlDanial/cloc would yield a more practical metric, at least as far as "practical metric" in terms of LoC goes.
```
---------------------------------------------------------------
Language          files     blank   comment      code
---------------------------------------------------------------
D                  3867     75824     88426    431299
HTML                114     11405       967     61083
C/C++ Header         57      2729       992     23332
C                    93       830       797      3346
C++                  19       532       139      2249
---------------------------------------------------------------
```
This includes the test suite and other stuff that isn't technically the compiler-proper.
Feb 11 2022
next sibling parent reply rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 18:02:21 UTC, max haughton wrote:
 On Friday, 11 February 2022 at 17:44:45 UTC, Stanislav Blinov 
 wrote:
 On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:

 I pulled just this week, and running `wc` on *.d *.c *.h 
 says...
https://github.com/AlDanial/cloc would yield a more practical metric, at least as far as "practical metric" in terms of LoC goes.
``` --------------------------------------------------------------------------------------- Language files blank comment code --------------------------------------------------------------------------------------- D 3867 75824 88426 431299 HTML 114 11405 967 61083 C/C++ Header 57 2729 992 23332 C 93 830 797 3346 C++ 19 532 139 2249 ``` this includes the test suite and other stuff that isn't technically the compiler-proper.
Interesting! We could remove the "test-suite" directory, and we could tell it to only parse "D" language files, which would give us cleaner results. "cloc" is actually what I use, and for DragonFlyBSD it gave me the same number OpenHub gave, so I really wonder why other source code or languages get different results...
Feb 11 2022
parent user1234 <user1234 12.de> writes:
On Friday, 11 February 2022 at 20:19:16 UTC, rempas wrote:
 On Friday, 11 February 2022 at 18:02:21 UTC, max haughton wrote:
 On Friday, 11 February 2022 at 17:44:45 UTC, Stanislav Blinov 
 wrote:
 On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:

 I pulled just this week, and running `wc` on *.d *.c *.h 
 says...
https://github.com/AlDanial/cloc would yield a more practical metric, at least as far as "practical metric" in terms of LoC goes.
``` --------------------------------------------------------------------------------------- Language files blank comment code --------------------------------------------------------------------------------------- D 3867 75824 88426 431299 HTML 114 11405 967 61083 C/C++ Header 57 2729 992 23332 C 93 830 797 3346 C++ 19 532 139 2249 ``` this includes the test suite and other stuff that isn't technically the compiler-proper.
Interesting! We could remove the "test-suit" directory and we could tell it to only parse "D" language files which will give us more "clean" results.
This is the number I gave yesterday. D-Scanner counts sloc more cleverly than the other tools mentionned. The report in detail: dmd/src/build.d: 740 dmd/src/dmd/access.d: 181 dmd/src/dmd/aggregate.d: 362 dmd/src/dmd/aliasthis.d: 93 dmd/src/dmd/apply.d: 58 dmd/src/dmd/argtypes_aarch64.d: 96 dmd/src/dmd/argtypes_sysv_x64.d: 199 dmd/src/dmd/argtypes_x86.d: 190 dmd/src/dmd/arrayop.d: 176 dmd/src/dmd/arraytypes.d: 43 dmd/src/dmd/astbase.d: 2640 dmd/src/dmd/astcodegen.d: 83 dmd/src/dmd/astenums.d: 55 dmd/src/dmd/ast_node.d: 4 dmd/src/dmd/asttypename.d: 73 dmd/src/dmd/attrib.d: 484 dmd/src/dmd/backend/aarray.d: 244 dmd/src/dmd/backend/backconfig.d: 335 dmd/src/dmd/backend/backend.d: 7 dmd/src/dmd/backend/barray.d: 72 dmd/src/dmd/backend/bcomplex.d: 127 dmd/src/dmd/backend/blockopt.d: 1367 dmd/src/dmd/backend/cc.d: 534 dmd/src/dmd/backend/cdef.d: 223 dmd/src/dmd/backend/cg87.d: 2521 dmd/src/dmd/backend/cgcod.d: 1743 dmd/src/dmd/backend/cgcs.d: 445 dmd/src/dmd/backend/cgcse.d: 78 dmd/src/dmd/backend/cgcv.d: 58 dmd/src/dmd/backend/cg.d: 162 dmd/src/dmd/backend/cgelem.d: 3342 dmd/src/dmd/backend/cgen.d: 232 dmd/src/dmd/backend/cgobj.d: 1827 dmd/src/dmd/backend/cgreg.d: 539 dmd/src/dmd/backend/cgsched.d: 1595 dmd/src/dmd/backend/cgxmm.d: 1352 dmd/src/dmd/backend/cod1.d: 3447 dmd/src/dmd/backend/cod2.d: 3650 dmd/src/dmd/backend/cod3.d: 4719 dmd/src/dmd/backend/cod4.d: 3039 dmd/src/dmd/backend/cod5.d: 102 dmd/src/dmd/backend/codebuilder.d: 167 dmd/src/dmd/backend/code.d: 434 dmd/src/dmd/backend/code_x86.d: 114 dmd/src/dmd/backend/compress.d: 63 dmd/src/dmd/backend/cv4.d: 2 dmd/src/dmd/backend/cv8.d: 638 dmd/src/dmd/backend/dcgcv.d: 2196 dmd/src/dmd/backend/dcode.d: 52 dmd/src/dmd/backend/debugprint.d: 279 dmd/src/dmd/backend/disasm86.d: 3316 dmd/src/dmd/backend/divcoeff.d: 129 dmd/src/dmd/backend/dlist.d: 197 dmd/src/dmd/backend/drtlsym.d: 468 dmd/src/dmd/backend/dt.d: 316 dmd/src/dmd/backend/dtype.d: 892 dmd/src/dmd/backend/dvarstats.d: 230 dmd/src/dmd/backend/dvec.d: 287 
dmd/src/dmd/backend/dwarf2.d: 1 dmd/src/dmd/backend/dwarf.d: 20 dmd/src/dmd/backend/dwarfdbginf.d: 1721 dmd/src/dmd/backend/dwarfeh.d: 308 dmd/src/dmd/backend/ee.d: 59 dmd/src/dmd/backend/el.d: 112 dmd/src/dmd/backend/elem.d: 1649 dmd/src/dmd/backend/elfobj.d: 1699 dmd/src/dmd/backend/elpicpie.d: 483 dmd/src/dmd/backend/errors.di: 2 dmd/src/dmd/backend/evalu8.d: 1628 dmd/src/dmd/backend/exh.d: 28 dmd/src/dmd/backend/filespec.d: 158 dmd/src/dmd/backend/fp.d: 11 dmd/src/dmd/backend/gdag.d: 527 dmd/src/dmd/backend/gflow.d: 1034 dmd/src/dmd/backend/global.d: 343 dmd/src/dmd/backend/glocal.d: 419 dmd/src/dmd/backend/gloop.d: 2129 dmd/src/dmd/backend/go.d: 247 dmd/src/dmd/backend/goh.d: 58 dmd/src/dmd/backend/gother.d: 1136 dmd/src/dmd/backend/gsroa.d: 330 dmd/src/dmd/backend/iasm.d: 80 dmd/src/dmd/backend/mach.d: 137 dmd/src/dmd/backend/machobj.d: 1361 dmd/src/dmd/backend/md5.d: 152 dmd/src/dmd/backend/md5.di: 9 dmd/src/dmd/backend/melf.d: 317 dmd/src/dmd/backend/mem.d: 19 dmd/src/dmd/backend/mscoff.d: 82 dmd/src/dmd/backend/mscoffobj.d: 1002 dmd/src/dmd/backend/newman.d: 1065 dmd/src/dmd/backend/nteh.d: 445 dmd/src/dmd/backend/obj.d: 299 dmd/src/dmd/backend/oper.d: 444 dmd/src/dmd/backend/os.d: 409 dmd/src/dmd/backend/out.d: 989 dmd/src/dmd/backend/pdata.d: 112 dmd/src/dmd/backend/ph2.d: 63 dmd/src/dmd/backend/ptrntab.d: 986 dmd/src/dmd/backend/rtlsym.d: 4 dmd/src/dmd/backend/symbol.d: 1259 dmd/src/dmd/backend/symtab.d: 50 dmd/src/dmd/backend/ty.d: 50 dmd/src/dmd/backend/type.d: 94 dmd/src/dmd/backend/util2.d: 162 dmd/src/dmd/backend/var.d: 395 dmd/src/dmd/backend/xmm.d: 1 dmd/src/dmd/blockexit.d: 229 dmd/src/dmd/builtin.d: 263 dmd/src/dmd/canthrow.d: 141 dmd/src/dmd/chkformat.d: 801 dmd/src/dmd/cli.d: 94 dmd/src/dmd/clone.d: 858 dmd/src/dmd/common/file.d: 239 dmd/src/dmd/common/int128.d: 325 dmd/src/dmd/common/outbuffer.d: 358 dmd/src/dmd/common/string.d: 72 dmd/src/dmd/compiler.d: 195 dmd/src/dmd/cond.d: 427 dmd/src/dmd/console.d: 68 dmd/src/dmd/constfold.d: 1254 
dmd/src/dmd/cparse.d: 2374 dmd/src/dmd/cppmangle.d: 1264 dmd/src/dmd/cppmanglewin.d: 850 dmd/src/dmd/ctfeexpr.d: 1203 dmd/src/dmd/ctorflow.d: 82 dmd/src/dmd/dcast.d: 2122 dmd/src/dmd/dclass.d: 515 dmd/src/dmd/declaration.d: 956 dmd/src/dmd/delegatize.d: 111 dmd/src/dmd/denum.d: 116 dmd/src/dmd/dimport.d: 174 dmd/src/dmd/dinifile.d: 194 dmd/src/dmd/dinterpret.d: 4214 dmd/src/dmd/dmacro.d: 238 dmd/src/dmd/dmangle.d: 634 dmd/src/dmd/dmdparams.d: 12 dmd/src/dmd/dmodule.d: 721 dmd/src/dmd/dmsc.d: 86 dmd/src/dmd/doc.d: 2928 dmd/src/dmd/dscope.d: 388 dmd/src/dmd/dstruct.d: 281 dmd/src/dmd/dsymbol.d: 1015 dmd/src/dmd/dsymbolsem.d: 3557 dmd/src/dmd/dtemplate.d: 4167 dmd/src/dmd/dtoh.d: 1708 dmd/src/dmd/dversion.d: 83 dmd/src/dmd/e2ir.d: 3795 dmd/src/dmd/eh.d: 189 dmd/src/dmd/entity.d: 38 dmd/src/dmd/errors.d: 358 dmd/src/dmd/escape.d: 986 dmd/src/dmd/expression.d: 2608 dmd/src/dmd/expressionsem.d: 7109 dmd/src/dmd/file_manager.d: 140 dmd/src/dmd/foreachvar.d: 193 dmd/src/dmd/frontend.d: 215 dmd/src/dmd/func.d: 1650 dmd/src/dmd/globals.d: 262 dmd/src/dmd/glue.d: 941 dmd/src/dmd/gluelayer.d: 32 dmd/src/dmd/hdrgen.d: 2231 dmd/src/dmd/iasm.d: 20 dmd/src/dmd/iasmdmd.d: 2625 dmd/src/dmd/iasmgcc.d: 231 dmd/src/dmd/id.d: 20 dmd/src/dmd/identifier.d: 125 dmd/src/dmd/impcnvtab.d: 230 dmd/src/dmd/imphint.d: 9 dmd/src/dmd/importc.d: 115 dmd/src/dmd/init.d: 125 dmd/src/dmd/initsem.d: 772 dmd/src/dmd/inlinecost.d: 202 dmd/src/dmd/inline.d: 1067 dmd/src/dmd/intrange.d: 444 dmd/src/dmd/json.d: 621 dmd/src/dmd/lambdacomp.d: 239 dmd/src/dmd/lexer.d: 2303 dmd/src/dmd/lib.d: 54 dmd/src/dmd/libelf.d: 319 dmd/src/dmd/libmach.d: 318 dmd/src/dmd/libmscoff.d: 418 dmd/src/dmd/libomf.d: 311 dmd/src/dmd/link.d: 543 dmd/src/dmd/mars.d: 1759 dmd/src/dmd/mtype.d: 3396 dmd/src/dmd/nogc.d: 127 dmd/src/dmd/nspace.d: 60 dmd/src/dmd/ob.d: 1345 dmd/src/dmd/objc.d: 293 dmd/src/dmd/objc_glue.d: 629 dmd/src/dmd/opover.d: 1066 dmd/src/dmd/optimize.d: 757 dmd/src/dmd/parse.d: 5760 dmd/src/dmd/parsetimevisitor.d: 
226 dmd/src/dmd/permissivevisitor.d: 3 dmd/src/dmd/printast.d: 87 dmd/src/dmd/root/aav.d: 145 dmd/src/dmd/root/array.d: 499 dmd/src/dmd/root/bitarray.d: 89 dmd/src/dmd/root/complex.d: 35 dmd/src/dmd/root/ctfloat.d: 113 dmd/src/dmd/root/env.d: 29 dmd/src/dmd/root/file.d: 108 dmd/src/dmd/root/filename.d: 462 dmd/src/dmd/root/hash.d: 38 dmd/src/dmd/root/longdouble.d: 460 dmd/src/dmd/root/man.d: 48 dmd/src/dmd/root/optional.d: 27 dmd/src/dmd/root/port.d: 84 dmd/src/dmd/root/region.d: 56 dmd/src/dmd/root/response.d: 188 dmd/src/dmd/root/rmem.d: 134 dmd/src/dmd/root/rootobject.d: 9 dmd/src/dmd/root/speller.d: 132 dmd/src/dmd/root/string.d: 113 dmd/src/dmd/root/stringtable.d: 182 dmd/src/dmd/root/strtold.d: 284 dmd/src/dmd/root/utf.d: 136 dmd/src/dmd/s2ir.d: 878 dmd/src/dmd/safe.d: 91 dmd/src/dmd/sapply.d: 54 dmd/src/dmd/scanelf.d: 173 dmd/src/dmd/scanmach.d: 195 dmd/src/dmd/scanmscoff.d: 164 dmd/src/dmd/scanomf.d: 265 dmd/src/dmd/semantic2.d: 396 dmd/src/dmd/semantic3.d: 888 dmd/src/dmd/sideeffect.d: 178 dmd/src/dmd/statement.d: 660 dmd/src/dmd/statement_rewrite_walker.d: 71 dmd/src/dmd/statementsem.d: 2645 dmd/src/dmd/staticassert.d: 20 dmd/src/dmd/staticcond.d: 254 dmd/src/dmd/stmtstate.d: 73 dmd/src/dmd/strictvisitor.d: 222 dmd/src/dmd/target.d: 815 dmd/src/dmd/templateparamsem.d: 88 dmd/src/dmd/tocsym.d: 407 dmd/src/dmd/toctype.d: 141 dmd/src/dmd/tocvdebug.d: 689 dmd/src/dmd/todt.d: 842 dmd/src/dmd/toir.d: 529 dmd/src/dmd/tokens.d: 198 dmd/src/dmd/toobj.d: 747 dmd/src/dmd/traits.d: 1219 dmd/src/dmd/transitivevisitor.d: 481 dmd/src/dmd/typesem.d: 2798 dmd/src/dmd/typinf.d: 123 dmd/src/dmd/utils.d: 132 dmd/src/dmd/visitor.d: 117 dmd/src/dmd/vsoptions.d: 384 dmd/src/vcbuild/msvc-lib.d: 26 total: 174122
Feb 12 2022
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
None of the C or C++ code is part of dmd; it is there to interface with the C 
backends of gdc and ldc. dmd is 100% D.
Feb 11 2022
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 11, 2022 at 05:44:45PM +0000, Stanislav Blinov via Digitalmars-d wrote:
 On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:
 
 I pulled just this week, and running `wc` on *.d *.c *.h says...
https://github.com/AlDanial/cloc would yield a more practical metric, at least as far as "practical metric" in terms of LoC goes.
I'm skeptical of any LoC metric.

T

-- 
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??
Feb 11 2022
parent reply rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 18:13:34 UTC, H. S. Teoh wrote:
 I'm skeptical of any LoC metric.


 T
This reminds me of what Walter said before! It is actually so simple that I don't understand what's so hard about it!

```
int val = 200; // This is a line of code

// This is a comment

/* This is a comment
This counts as a comment too! */

int function_test() {
  int v = 10;
}
```

The above has:
Lines of code: 4
Empty lines: 3
Comments: 2

Don't we all agree that this is how we should count it?
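A sketch of that counting scheme (counting each `//` line and each `/* */` block as one comment, matching the totals above):

```python
def count_lines(source):
    """Count (code, empty, comment) where a comment is either one //
    line or one whole /* ... */ block, and everything else is code."""
    code = empty = comments = 0
    in_block = False
    for line in source.splitlines():
        s = line.strip()
        if in_block:                  # still inside a /* */ block
            if "*/" in s:
                in_block = False
        elif not s:
            empty += 1
        elif s.startswith("//"):
            comments += 1
        elif s.startswith("/*"):
            comments += 1
            if "*/" not in s:
                in_block = True
        else:
            code += 1                 # trailing // on a code line is still code
    return code, empty, comments

src = """int val = 200; // This is a line of code

// This is a comment

/* This is a comment
This counts as a comment too! */

int function_test() {
  int v = 10;
}"""
print(count_lines(src))  # prints (4, 3, 2)
```

Of course, as soon as comments nest, appear inside string literals, or start mid-line, this "simple" scheme needs a real lexer, which is part of why tools disagree.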
Feb 11 2022
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 11, 2022 at 08:23:10PM +0000, rempas via Digitalmars-d wrote:
 On Friday, 11 February 2022 at 18:13:34 UTC, H. S. Teoh wrote:
 I'm skeptical of any LoC metric.
[...]
 This reminds me of what Walter said before! It is actually so simple
 that I don't understand what's so hard about it!
[...] It's not that it's *hard*. It's pretty straightforward, and everybody knows what it means. The problem is the mostly-unfounded *interpretations* that people put on it.

In the bad ole days, LoC used to be a metric used by employers to measure their programmers' productivity. (I *hope* they don't do that anymore, but you never know...) Which is completely ridiculous, because the amount of code you write has very little correlation with the amount of effort you put into it. It's trivial to write 1000 lines of sloppy boilerplate code that accomplishes little; it's a lot harder to condense that into 50 lines of code that does the same thing 10x faster and with 10% of the memory requirements.

One of the hardest bug fixes I've done at my job involved a 1-line fix for a subtle race condition that took 3+ months to track down and identify. I guess they should fire me for non-productivity, because by the LoC metric I've done almost zero work in that time. Good luck with the race condition, though; adding another 1000 LoC to the code ain't getting rid of the race, it'd only obscure it even further and make it just about impossible to find and fix. And some of my best bug fixes involve *deleting* poorly-written redundant code and writing a much shorter replacement. I guess they should *really* fire me for that, because by the LoC metric I've not only been unproductive, but *counter*productive. :-P

By the above, it should be clear that the assumption that LoC is a good measure of complexity is an unfounded one. If project A has 10000 LoC and project B has 10000 LoC, does it mean they are of equal complexity? Hardly. Project A could be mostly boilerplate, copy-pasta, redundant code, and poorly-implemented, poorly-chosen O(n^2) algorithms, which has 10000 LoC simply because there's so much useless redundancy. Project B could be a collection of fine-tuned, hand-optimized professional algorithms that could do a LOT under the hood, and it has 10000 LoC because it actually has a large number of algorithms implemented, and was able to fit them all into 10000 LoC because each individual piece was written to be as concise as needed to express the algorithm and no more. In terms of actual complexity, project A might as well be kindergarten-level compared to project B's PhD sophistication. What does their respective LoC tell us about their complexity? Basically nothing.

And don't even get me started on code quality vs. LoC. An IOCCC entry can easily fit an entire flight simulator into a single page of code, for example. Don't expect anybody to be able to read it, though (not even the author :-D). A more properly-written flight simulator would occupy a lot more than a single page of code, but in terms of complexity, they'd be about the same, give or take. But by the LoC metric, the two ought to be so far apart they should be completely unrelated to each other. Again, the value of LoC as a metric here is practically nil.

--T
Feb 11 2022
next sibling parent reply rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 22:08:57 UTC, H. S. Teoh wrote:
 [It's not that it's *hard*... practically nil.]


 --T
I hear you loud and clear! It's very funny how "professionals" and their companies work worse than most hobbyist programmers. This is why I don't want to become a "professional" and work for a company, and why I FUCKING HATE when everyone talks about programming based on what's popular and what you should learn to get a "job". Fuck this shit! I remember someone telling me the same thing when we were discussing Qt and I said how bloated it is, and the guy said that this is probably due to this reason (as even though Qt offers free licenses, a company is behind it). I haven't written almost anything, but even with the few things that I tried, I would always see how many things I could do with so few lines of code, and I would always wonder how some projects take hundreds of thousands of lines of code or even millions! Like, wtf are they doing? Even software that is minimal (see suckless) still does about 80% of what the other "big and complete" software does with about 10% of the codebase, so bloatware is a thing no matter how you see it! You can't explain these numbers away!
Feb 11 2022
parent reply forkit <forkit gmail.com> writes:
On Saturday, 12 February 2022 at 06:29:37 UTC, rempas wrote:
 I haven't written almost anything ...
 ...
umm..reasoning that involves negation is extremely difficult. Walter will not be happy.
Feb 11 2022
parent rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 07:06:21 UTC, forkit wrote:
 umm..reasoning that involves negation is extremely difficult.
Of course, I was saying that to justify my idea about software being bloated: there will always be something that does 80% of it with just 10% of the code-base. Also, I'm just gonna try my first (and hopefully last) book about Compiler Design, and if it succeeds and I'm able to make a full compiler (including a linker), then I may even offer to make a backend for D in case someone wants to work on the backend. Or maybe Walter and the other folks will want to adopt it and make it the official backend of DMD. In any case, I would be glad to offer my help if that means improving D!
 Walter will not be happy.
Given the fact that Walter has to actually do real work and at the same time he's here answering every single crap question that we ask (I post the most crap, not gonna lie), I'm really impressed by his ability to stay calm. Makes me appreciate him more!
Feb 11 2022
prev sibling parent user1234 <user1234 12.de> writes:
On Friday, 11 February 2022 at 22:08:57 UTC, H. S. Teoh wrote:
 On Fri, Feb 11, 2022 at 08:23:10PM +0000, rempas via 
 Digitalmars-d wrote:
 [...]
[...]
 [...]
[...] In the bad ole days, LoC used to be a metric used by employers to measure their programmers' productivity. (I *hope* they don't do that anymore, but you never know...) [...]
That's why I said earlier that OpenHub is old trash. Their estimation [for DMD](https://www.openhub.net/p/dmd/estimated_cost) is based on a model from **the late 70's**.
Feb 12 2022
prev sibling parent user1234 <user1234 12.de> writes:
On Friday, 11 February 2022 at 17:36:37 UTC, H. S. Teoh wrote:
 On Fri, Feb 11, 2022 at 04:47:46PM +0000, user1234 via 
 Digitalmars-d wrote:
 On Friday, 11 February 2022 at 16:41:33 UTC, user1234 wrote:
 On Friday, 11 February 2022 at 15:17:16 UTC, rempas wrote:
 [...]
Openhub and their metrics are old trash. It's more like 170K according to D-Scanner.
wait... it's 175K. I hadn't pulled for 8 months or so. There's much new code that was committed since, notably with ImportC.
I pulled just this week, and running `wc` on *.d *.c *.h says there are 365K lines. I'm not sure what the *.h files are for,
Ah yes, the .h files... D-Scanner does not take them into account. They are still used by GDC, I believe.
Feb 11 2022
prev sibling next sibling parent reply rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 16:47:46 UTC, user1234 wrote:
 Openhub and their metrics are old trash. It's more 170K 
 according to D-Scanner.
wait... it's 175K. I hadn't pulled for 8 months or so. There's much new code that was committed since, notably with ImportC.
Thank you for the information! It seems pretty impressive to me that DMD only has 175K LoC in its code base, given how huge D is! Even without the recent commits (and how many could they be?), this seems too little to me. In that case, we can talk about re-writing it, but again, that's up to the developers to decide.
Feb 11 2022
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 11, 2022 at 08:00:14PM +0000, rempas via Digitalmars-d wrote:
[...]
 Thank you for the information! It seems pretty impressive to me that
 DMD only has 175K LoC in its code base, given how huge D is! Even
 without the recent commits (and how many could they be?), this seems
 too little to me. In that case, we can talk about re-writing it, but
 again, that's up to the developers to decide.
https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/ T -- "I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
Feb 11 2022
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
The backend is currently 127,748 lines of code, including the optimizer.
Feb 11 2022
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 6:52 AM, max haughton wrote:
 The object emission code in the backend is quite inefficient,
It's faster than any other compiler.
 it needs to be rewritten (it's horrible old code anyway)
I suppose that depends on what you're used to.

The basic design is pretty simple - there's a code gen function for each expression node type. The optimizer uses standard data flow analysis math. There's a separate pass for register allocation, and one for scheduling.

The design was originally written for the 8086. It survived extension to 32 bits, then 64 bits, then SIMD. The complexity comes from the complexity of the x86 instruction set, and the choice of instructions is very dependent on the shape of the expression trees. The only thing it has really failed at is the x87, which everyone wants to leave behind anyway.
Feb 11 2022
next sibling parent reply forkit <forkit gmail.com> writes:
On Saturday, 12 February 2022 at 07:13:15 UTC, Walter Bright 
wrote:
 On 2/11/2022 6:52 AM, max haughton wrote:
 The object emission code in the backend is quite inefficient,
It's faster than any other compiler.
It sure is! That's the primary reason I became interested in D - the speed of compilation using dmd, compared to all later versions of VS (including VS2022) - at least in my experience. I don't care how great a programming language is, slow compilation is a real turn off! Hooray for dmd!!
Feb 11 2022
next sibling parent reply rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 07:51:38 UTC, forkit wrote:
 I don't care how great a programming language is, slow 
 compilation is a real turn off!

 Hooray for dmd!!
Thank you! That's the reason I don't use Rust at all, for anything (even if it is so popular and has so much support). That's also the reason I am OBSESSED with TCC and was inspired to learn about compilers and make my own. Funny enough, there are people that don't care about compilation speed and are willing to have their project compile even twice as slow in exchange for 5% runtime performance. The same people, of course, don't have any problem using Python in other cases...
Feb 12 2022
parent reply forkit <forkit gmail.com> writes:
On Saturday, 12 February 2022 at 08:12:03 UTC, rempas wrote:
 Funny enough there are people that don't care about compilation 
 speed and are willing to have their project compile even twice 
 as fast for 5% runtime performance. The same people of course 
 don't have any problem using Python in other cases...
Yeah .. users... erh...

.. but compiler writers are a different breed altogether. (well, they used to be anyway)

It used to be that the golden rule of compiler writers was "performance is (almost) everything". i.e.

- Compile time performance -> how long it takes to generate code.
- Runtime performance -> how fast that code runs.

(Almost) nothing else used to matter (to compiler writers). Why almost? Cause in the end, you need accurate results more than you need speed. (ref: Expert C Programming - P van der Linden 1994)

I see the performance of (other) compilers these days, and I wonder.. whatever happened to that breed of compiler writers... from long ago...

Luckily, we still have one of them.
Feb 12 2022
next sibling parent rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 11:04:48 UTC, forkit wrote:
 Yeah .. users... erh...

 .. but compiler writers are a different breed all together.

 (well, they used to be anyway)

 It used to be, that the golden rule of compiler writers was 
 "performance is (almost) everything".

 i.e.

 - Compile time performance -> how long it takes to generate 
 code.

 - Runtime performance -> how fast that code runs.

 (Almost) nothing else used to matter (to compiler writers)

 Why almost? Cause in the end, you need accurate results more 
 than you need speed.

 (ref: Expert C Programming -  P van der Linden 1994)

 I see the performance of (other) compilers these days, and I 
 wonder.. whatever happened to that breed of compiler 
 writers... from long ago...
Makes total sense to me. It's the same way people choose Python (or C++ or Rust or JS or whatever) over C: because runtime performance is not the only thing that matters. Development speed matters too. Super fast compilation times will allow the dream of Gentoo and the *BSDs to come true, and everyone will be able to compile everything from source, with all the advantages this offers.

Another thing to mention is that I was also obsessed with the compiler that generates the code that "runs faster" in the past, but then I realized something. Runtime performance is a really, really, really complicated topic! First of all, runtime performance may not (and probably will not) be critical every time to begin with. But development time will always show! A compiler that builds my code fast and allows me to save-and-run as much as I want, a compiler that manages the memory for me because I will make mistakes as I'm human, a compiler that does immutability by default (because again, humans make mistakes), a compiler that will allow me to express myself the way I want and focus all my time on actually solving the problem rather than finding a way to bypass the language's limitations, etc. THIS IS what matters the most!

Even when runtime performance is important, the optimizations that the compiler does will mostly not offer you more than 20% runtime performance, so what you should do is either use faster algorithms and/or change the design of your program (and maybe remove some unnecessary features). I finally understand that now! I don't chase pure raw compiler-optimization runtime performance but good/smart program designs! Of course, I want my compiler to not generate unnecessary instructions, but again, MY design is what will make the program faster.

Unfortunately, we live in a generation where people are OBSESSED with numbers, ignoring their meaning and what's behind them! I don't want to throw around big words either! I have learned, and I'm still learning day by day, and I'm (hopefully) getting better!
 Luckily, we still have one of them.
We have a couple of people that think this way. Which one do you refer to?
Feb 12 2022
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Saturday, 12 February 2022 at 11:04:48 UTC, forkit wrote:
 ...

 I see the performance of (other) compilers these days, and I 
 wonder.. whatever happened to that breed of compiler 
 writers... from long ago...
They went on to create Eiffel, Delphi, .NET, Java, V8, GraalVM (née Maxine), OCaml, Go and Dart.
Feb 12 2022
parent reply rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 15:54:29 UTC, Paulo Pinto wrote:
 They went on to create Eiffel, Delphi, .NET, Java, V8, GraalVM 
 (née Maxine), OCaml, Go and Dart.
Half of the things you mentioned are interpreters. Bytecode, JITs, call them whatever you want; not true compilers/transpilers that produce a native binary. Dart doesn't compile fast (or why do I have this idea?) and Go is very fast but isn't extremely fast (and was a very annoying and limited language the last time I checked). Not to offend anything and anyone here, but these examples don't do it for me and probably for most people here.
Feb 12 2022
parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Saturday, 12 February 2022 at 19:06:55 UTC, rempas wrote:
 On Saturday, 12 February 2022 at 15:54:29 UTC, Paulo Pinto 
 wrote:
 They went on to create Eiffel, Delphi, .NET, Java, V8, GraalVM 
 (née Maxine), OCaml, Go and Dart.
Half of the things you mentioned are interpreters. Bytecode, JITs, call them whatever you want; not true compilers/transpilers that produce a native binary. Dart doesn't compile fast (or why do I have this idea?) and Go is very fast but isn't extremely fast (and was a very annoying and limited language the last time I checked). Not to offend anything and anyone here, but these examples don't do it for me and probably for most people here.
That only shows how little you know of them, and the available toolchains. If you want to actually educate yourself about them, there is plenty of material available.
Feb 12 2022
parent rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 19:18:12 UTC, Paulo Pinto wrote:
 That only shows how little you know of them, and the available 
 toolchains.

 If you want to actually educate yourself about them, there is 
 plenty of material available.
You are right in what you are saying, but what made you say that from my comment? I suppose that I was wrong about Dart, and Go has probably gotten better. But other than that, what was my mistake? I'm not saying that to be ironic, I really want to see your point of view. You understand that I can learn about 2-3 languages, but I cannot do research on every language you listed. Thank you!
Feb 12 2022
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 12, 2022 at 07:51:38AM +0000, forkit via Digitalmars-d wrote:
[...]
 That's the primary reason I became interested in D - the speed of
 compilation, using dmd.
[...]
 I don't care how great a programming language is, slow compilation is
 a real turn off!
 
 Hooray for dmd!!
I use dmd for the code-compile-test cycle because of the fast turnaround. For small programs dmd is so fast it's almost like programming in a scripting language(!). For larger programs it's less so, but still impressively fast for compile times.

Runtime performance of executables compiled by dmd, however, is a disappointment. I consistently get 20%-40% runtime performance improvement by compiling with ldc/gdc, esp. for CPU-intensive programs.

So my usual workflow is dmd for code-compile-test, ldc -O2 for release builds.

T -- Amateurs built the Ark; professionals built the Titanic.
Feb 12 2022
parent reply rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 16:13:55 UTC, H. S. Teoh wrote:
 I use dmd for the code-compile-test cycle because of the fast 
 turnaround. For small programs dmd is so fast it's almost like 
 programming in a scripting language(!). For larger programs 
 it's less so, but still impressively fast for compile times.

 Runtime performance of executables compiled by dmd, however, is 
 a disappointment.  I consistently get 20%-40% runtime 
 performance improvement by compiling with ldc/gdc, esp. for 
 CPU-intensive programs.

 So my usual workflow is dmd for code-compile-test, ldc -O2 for 
 release builds.


 T
If you get to a point where runtime becomes too slow for a specific task, then I don't think that 20%-40% will make such a big difference really. There may be cases where even the smallest performance boost will make the difference, but were there a lot of those in your experience? The funny stuff is that I may be stupid and talking about things I don't have experience with, but I'm just talking with logic in mind, so if I'm wrong then please make sure to properly correct me and tell me your experience on this topic. Thank you!
Feb 12 2022
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 12, 2022 at 07:31:28PM +0000, rempas via Digitalmars-d wrote:
[...]
 If you get to a point that runtime becomes too slow for a specific
 task then I don't think that 20%-40% will make such of a big
 difference really. There may be cases that even the smallest
 performance boost will make the difference but were that a lot in your
 experience?
[...] 20%-40% is a HUGE difference. Think about a 60fps 3D game where you have only 16ms to update the screen for the next frame. If your code takes ~13ms to update a frame when compiled with LDC -O2, then compiling with dmd will not even be an option, because it would not be able to meet the framerate and the game will be jerky and unplayable. If the difference is 2% or 3% then there may still be room for negotiation. 20%-40% is half an order of magnitude. There is no way you can compromise with that.

Also, for long-running CPU-intensive computations, which one would you rather have: your complex computation finishing in 2 days, which may just make the deadline, or ~4 days, which will definitely *not* meet the deadline? Again, if the difference is 2% or 3% then you may still be able to work with it. 20%-40% is unacceptable.

T -- Written on the window of a clothing store: No shirt, no shoes, no service.
Feb 12 2022
next sibling parent rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 20:22:44 UTC, H. S. Teoh wrote:
 20%-40% is a HUGE difference. Think about a 60fps 3D game where 
 you have only 16ms to update the screen for the next frame. If 
 your code takes ~13ms to update a frame when compiled with LDC 
 -O2, then compiling D will not even be an option because it 
 would not be able to meet the framerate and the game will be 
 jerky and unplayable.  If the difference is 2% or 3% then there 
 may still be room for negotiation. 20%-40% is half an order of 
 magnitude. There is no way you can compromise with that.

 Also, for long-running CPU-intensive computations, which one 
 would you rather have: your complex computation to finish in 2 
 days, which may just make the deadline, or ~4 days, which will 
 definitely *not* meet the deadline?  Again, if the difference 
 is 2% or 3% then you may still be able to work with it. 20%-40% 
 is unacceptable.


 T
Game dev was the thing I was sure about, and the first thing that comes to mind when we talk about runtime performance. The second example was a good one too! Thank you!
Feb 12 2022
prev sibling parent max haughton <maxhaton gmail.com> writes:
On Saturday, 12 February 2022 at 20:22:44 UTC, H. S. Teoh wrote:
 On Sat, Feb 12, 2022 at 07:31:28PM +0000, rempas via 
 Digitalmars-d wrote: [...]
 If you get to a point that runtime becomes too slow for a 
 specific task then I don't think that 20%-40% will make such 
 of a big difference really. There may be cases that even the 
 smallest performance boost will make the difference but were 
 that a lot in your experience?
[...] 20%-40% is a HUGE difference. [...] Again, if the difference is 2% or 3% then you may still be able to work with it. 20%-40% is unacceptable. T
The thing with dmd isn't just the performance; it's also quite buggy when it starts optimizing. Quite a few libraries have a gotcha that has to be worked around due to dmd (*especially* `-inline`) - the inliner can basically ignore language semantics, which can break NRVO, for example.
Feb 12 2022
prev sibling next sibling parent reply rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 07:13:15 UTC, Walter Bright 
wrote:
 The complexity comes from the complexity of the x86 instruction 
 set and the choice of instructions is very dependent on the 
 shape of the expression trees.
You could email the creator of Vox and ask him about the general structure of Vox and about tricks with x86_64-specific stuff, as this is what Vox targets, so he may be specialized in this ISA and know stuff that you don't (which may also improve the runtime performance of programs compiled with DMD). Of course, in case you didn't check, the source of Vox is written in D and it is 36K LoC (at least that's what the README.md says), so you could also have a look (I did, and it even looks readable to a n00b like me). If I knew assembly and machine language (and compiler design in general), I would do it myself to save you some time and then directly email you, but unfortunately I'm not able to do that now. But tbh, DMD is very fast as it is now, given the fact that it does optimizations (Vox and TCC don't do any, if I'm not mistaken). And my post was meant to start a discussion and see what others think about this topic, not to say that DMD is slow, cause that would be a lie ;)
 The only thing it has really failed at is the x87, which 
 everyone wants to leave behind anyway.
Why, what was bad about it? Can I get a little background on this one?
Feb 12 2022
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/12/2022 12:08 AM, rempas wrote:
 The only thing it has really failed at is the x87, which everyone wants to 
 leave behind anyway.
Why, what was bad about it? Can I get a little background on this one?
It doesn't assign variables to x87 registers. The reason it doesn't is because the x87 is a stack machine, meaning the registers all shift position. There is a way to fix that by using FXCH instructions, but I never got around to doing that.
Feb 12 2022
parent rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 09:00:18 UTC, Walter Bright 
wrote:
 It doesn't assign variables to x87 registers. The reason it 
 doesn't is because the x87 is a stack machine, meaning the 
 registers all shift position.

 There is a way to fix that by using FXCH instructions, but I 
 never got around to doing that.
Thanks for the info! Yeah, I agree with you! It seems we should all forget about the x87 then.
Feb 12 2022
prev sibling parent reply max haughton <maxhaton gmail.com> writes:
On Saturday, 12 February 2022 at 07:13:15 UTC, Walter Bright 
wrote:
 On 2/11/2022 6:52 AM, max haughton wrote:
 The object emission code in the backend is quite inefficient,
It's faster than any other compiler.
 it needs to be rewritten (it's horrible old code anyway)
I suppose that depends on what you're used to. The basic design is pretty simple - there's a code gen function for each expression node type. The optimizer uses standard data flow analysis math. There's a separate pass for register allocation, and one for scheduling. The design was originally written for the 8086. It survived extension to 32 bits, then 64 bits, then SIMD. The complexity comes from the complexity of the x86 instruction set and the choice of instructions is very dependent on the shape of the expression trees. The only thing it has really failed at is the x87, which everyone wants to leave behind anyway.
I'm specifically talking about the file that handles ELF files; it's very messy and uses some absolutely enormous structs, which are naturally very slow by virtue of their size.
Feb 12 2022
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/12/2022 9:20 AM, max haughton wrote:
 I'm specifically talking about the file that handles elf files, it's
 very messy and uses some absolutely enormous structs which are
 naturally very slow by virtue of their size.
The elf generator was written nearly 30 years ago, and has never been refactored properly to modernize it. It could sure use it, but I'm not so sure it would speed things up noticeably. If you want to take a crack at it, feel free!
Feb 12 2022
parent max haughton <maxhaton gmail.com> writes:
On Sunday, 13 February 2022 at 00:41:38 UTC, Walter Bright wrote:
 On 2/12/2022 9:20 AM, max haughton wrote:
 I'm specifically talking about the file that handles elf 
 files, it's very messy and uses some absolutely enormous 
 structs which are naturally very slow by virtue of their size.
The elf generator was written nearly 30 years ago, and has never been refactored properly to modernize it. It could sure use it, but I'm not so sure it would speed things up noticeably. If you want to take a crack at it, feel free!
It's on my list. The reason why it's slow is that the structs are very large compared to a cache line, so the CPU has to pull in (optimistically; the CPU might pull in several lines at once) 64 bytes but only uses about 10 of them in a given iteration. There is an O(n^2) algorithm in there, but I'm not sure it's a particularly big N in normal programs.
Feb 12 2022
prev sibling next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Friday, 11 February 2022 at 12:34:21 UTC, rempas wrote:
 That's nice to hear! However, does DMD generate object files 
 directly or "asm" files that are passed to a C compiler? If I 
 remember correctly, LDC2 needs to pass the output to a C 
 compiler, as people told me, so what's the case for DMD?
DMD goes from its own backend block tree to an object file, without writing assembly. In fact, only recently was the ability to output asm added for debugging purposes: https://dlang.org/blog/2022/01/24/the-binary-language-of-moisture-vaporators/ On Linux dmd invokes gcc by default to create an executable, but only to link the resulting object files, not to compile C/assembly code. LDC goes from LLVM IR to machine code, but it can output assembly with the `-output-s` flag. GDC does generate assembly text to the tmp folder and then invokes `gas` the GNU assembler, it can't directly write machine code.
Feb 11 2022
next sibling parent rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 16:40:42 UTC, Dennis wrote:
 DMD goes from its own backend block tree to an object file, 
 without writing assembly. In fact, only recently was the 
 ability to output asm added for debugging purposes:
 https://dlang.org/blog/2022/01/24/the-binary-language-of-moisture-vaporators/

 On Linux dmd invokes gcc by default to create an executable, 
 but only to link the resulting object files, not to compile 
 C/assembly code.

 LDC goes from LLVM IR to machine code, but it can output 
 assembly with the `-output-s` flag.

 GDC does generate assembly text to the tmp folder and then 
 invokes `gas` the GNU assembler, it can't directly write 
 machine code.
Thank you! This sums it up perfectly! Can you choose to pass it directly to the linker with DMD on Linux? Something like setting "ld" (or another linker of course) as the "C" compiler, idk...
Feb 11 2022
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 8:40 AM, Dennis wrote:
 In fact, only recently was the ability to output asm added for 
 debugging purposes:
 https://dlang.org/blog/2022/01/24/the-binary-language-of-moisture-vaporators/
I use that dumb feature most every day. It's the most productivity enhancing feature I've added in a long time. For example, I formerly wrote:

    import core.stdc.stdio;
    int main() {
       printf("%d\n", expression);
       return 0;
    }

    dmd test
    ./test

every time I wanted to see what `expression` evaluated to. Now I just do:

    int test() { return expression; }

    dmd -c test -vasm

Shazzam! Mucho less typitty-tippity-tip-typing!
Feb 11 2022
parent reply rempas <rempas tutanota.com> writes:
On Saturday, 12 February 2022 at 07:08:17 UTC, Walter Bright 
wrote:
 I use that dumb feature most every day. It's the most 
 productivity enhancing feature I've added in a long time.

 For example, I formerly wrote:

     import core.stdio;
     int main() {
        printf("%d\n", expression);
        return 0;
     }

     dmd test
     ./test

 every time I wanted to see what `expression` evaluated to. Now 
 I just do:

    int test() { return expression; }

    dmd -c test -vasm

 Shazzam! Mucho less typitty-tippity-tip-typing!
THANK YOU!!! Every compiler needs that, even the ones that can output binary formats directly (TCC, I'm talking to you!), because it is easier to see the assembly than to imagine the instructions in your head; humans are known to make mistakes. Thank you for adding this Walter!
Feb 11 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 11:44 PM, rempas wrote:
 THANK YOU!!! Every compiler needs that even if they can output binary formats directly (TCC I'm talking to you!) because it is easier to see the assembly rather than imagine the instructions in your head as humans are known to make mistakes. Thank you for adding this Walter!
You're quite welcome!
Feb 12 2022
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 4:34 AM, rempas wrote:
 That's nice to hear! However, does DMD generate object files directly
It generates object files directly. No "asm" step. The intermediate code is converted directly to machine code.
 Do you think that there are any very bad places in DMD's backend? Has anyone in the team thought about re-writing the backend (or parts of it) from the beginning?
It has evolved over time, but the basic design has held up very well. The main difficulty is the very complex nature of the x86 CPU, which leads to endless special cases.
Feb 11 2022
prev sibling next sibling parent reply Dave P. <dave287091 gmail.com> writes:
On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 [...]
I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places.

For example:

```D
extern(D) int some_func(int x);

pragma(inline, true)
private int foo(int x){
    return some_func(x);
}

pragma(inline, true)
private int bar(int x){
    return foo(x);
}

pragma(inline, true)
private int baz(int x){
    return bar(x);
}

pragma(inline, true)
private int qux(int x){
    return baz(x);
}

int main(){
    return qux(2);
}
```

When you go to compile it:

```sh
Undefined symbols for architecture arm64:
  "__D7example9some_funcFiZi", referenced from:
      __D7example3fooFiZi in example.o
      __D7example3barFiZi in example.o
      __D7example3bazFiZi in example.o
      __D7example3quxFiZi in example.o
      __Dmain in example.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1
```

The linker sees references to the extern function in places where I never wrote that in my source code. In a nontrivial project this can be quite confusing if you’re not used to this quirk of the linking process.

If the compiler is invoking the linker for you anyway, why can’t it read the object files and libraries and tell you exactly what is missing and where in your code you reference it?
Feb 10 2022
next sibling parent reply max haughton <maxhaton gmail.com> writes:
On Thursday, 10 February 2022 at 22:06:30 UTC, Dave P. wrote:
 On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 [...]
I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places. [...]
This goes away if you do a debug build, which most people (all the professionals I'm aware of) do. And why should the compiler do something the linker is going to do anyway? It would have to wait until after linking anyway, because you might want a symbol to be defined somewhere else.
Feb 10 2022
parent Dave P. <dave287091 gmail.com> writes:
On Thursday, 10 February 2022 at 22:11:13 UTC, max haughton wrote:
 On Thursday, 10 February 2022 at 22:06:30 UTC, Dave P. wrote:
 On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 [...]
I think it would be interesting to combine a compiler and a linker into a single executable. Not necessarily for speed reasons, but for better diagnostics and the possibility of type checking external symbols. Linker errors can sometimes be hard to understand in the presence of inlining and optimizations. The linker will report references to symbols not present in your code or present in completely different places. [...]
This goes away if you do a debug build, which most (all professionals I'm aware of) people do.
That *is* a debug build:

```sh
ldc2 example.d -O0
Undefined symbols for architecture arm64:
  "__D7example9some_funcFiZi", referenced from:
      __D7example3fooFiZi in example.o
      __D7example3barFiZi in example.o
      __D7example3bazFiZi in example.o
      __D7example3quxFiZi in example.o
      __Dmain in example.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1
```

I’m used to it at this point, but for people new to the C-style model of separate compilation it is extremely confusing. It’s made worse by the name mangling required to get C linkers to link code from more modern languages.
 And why should the compiler do something the linker is going to 
 do anyway? It would have to wait until after linking anyway 
 because you might want a symbol to be defined somewhere else.
You would still give the compiler libraries if you wanted them defined elsewhere and in my idea the compiler would also be the linker so there is no “the linker is going to do anyway”.
Feb 10 2022
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/10/2022 2:06 PM, Dave P. wrote:
 Undefined symbols for architecture arm64:
    "__D7example9some_funcFiZi", referenced from:
        __D7example3fooFiZi in example.o
        __D7example3barFiZi in example.o
        __D7example3bazFiZi in example.o
        __D7example3quxFiZi in example.o
        __Dmain in example.o
 ld: symbol(s) not found for architecture arm64
Things I have never been able to explain, even to long time professional programmers:

1. what "undefined symbol" means
2. what "multiply defined symbol" means
3. how linkers resolve symbols

Our own runtime library illustrates this bafflement. In druntime, there are these "hooks" where one can replace the default function that deals with assertion errors. Such hooks are entirely unnecessary. To override a symbol in a library, just write your own function with the same name and link it in before the library.

I have never been able to explain these to people. I wonder if it is because it is so simple, people think "that can't be right". With the hook thing, they'll ask me to re-explain it several times, then they'll say "are you sure?" and they still don't believe it.
Feb 10 2022
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 11/02/2022 11:52 AM, Walter Bright wrote:
 I have never been able to explain these to people. I wonder if it is 
 because it is so simple, people think "that can't be right". With the 
 hook thing, they'll ask me to re-explain it several times, then they'll 
 say "are you sure?" and they still don't believe it.
It does depend on a few factors: the compiler, linker, and build/package manager all playing along. Not to mention shared library support being good enough, with the common use cases clearly described. For me personally there are enough unknowns in the general case that I would avoid using it in production.
Feb 10 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/10/2022 4:03 PM, rikki cattermole wrote:
 On 11/02/2022 11:52 AM, Walter Bright wrote:
 I have never been able to explain these to people. I wonder if it is because 
 it is so simple, people think "that can't be right". With the hook thing, 
 they'll ask me to re-explain it several times, then they'll say "are you 
 sure?" and they still don't believe it.
It does depend on a few factors. Compiler, linker, build/package manager all playing along. Not to mention shared library support actually good enough with clear common use cases all described. For me personally there are a few unknowns for the general case that I would avoid using it in production.
All linkers work this way.
Feb 10 2022
prev sibling next sibling parent reply max haughton <maxhaton gmail.com> writes:
On Thursday, 10 February 2022 at 22:52:45 UTC, Walter Bright 
wrote:
 On 2/10/2022 2:06 PM, Dave P. wrote:
 Undefined symbols for architecture arm64:
    "__D7example9some_funcFiZi", referenced from:
        __D7example3fooFiZi in example.o
        __D7example3barFiZi in example.o
        __D7example3bazFiZi in example.o
        __D7example3quxFiZi in example.o
        __Dmain in example.o
 ld: symbol(s) not found for architecture arm64
Things I have never been able to explain, even to long time professional programmers: 1. what "undefined symbol" means 2. what "multiply defined symbol" means 3. how linkers resolve symbols Our own runtime library illustrates this bafflement. In druntime, there are these "hooks" where one can replace the default function that deals with assertion errors. Such hooks are entirely unnecessary. To override a symbol in a library, just write your own function with the same name and link it in before the library. I have never been able to explain these to people. I wonder if it is because it is so simple, people think "that can't be right". With the hook thing, they'll ask me to re-explain it several times, then they'll say "are you sure?" and they still don't believe it.
If by hook you mean a callback of sorts that can be overridden, then the problem solved is not strictly the same as a weakly defined function. If you have multiple libraries in the same playpen, it simply doesn't work to have them all trying to override the same symbols. If they can neatly hook and unhook things, that goes away.
Feb 10 2022
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/10/2022 7:45 PM, max haughton wrote:
 If by hook you mean a callback of sorts that can be overrided, then the problem solved is not strictly the same as a weakly defined function. If you have multiple library's in the same playpen then it simply doesn't work to have them all trying to override the same symbols. If they can neatly hook and unhook things that goes away.
That's not how multiple libraries work.

Suppose you have 3 libraries, A, B, and C. You have an object file X. The linker command is:

    link X.obj A.lib B.lib C.lib

X refers to "foo". All 4 define "foo". Which one gets picked?

    X.foo

That's it. There are no unresolved symbols to look for.

Now, suppose only B and C define "foo". Which one gets picked?

    B.foo

because it is not in X. Then, A is looked at, and it is not in A. Then, B is looked at, and it is in B. C is not looked at because foo is now resolved.

It has nothing to do with weak definitions. It's simple: "foo" is referenced, so a definition must be found. Look in the libraries in the order they are supplied to the linker. That's it.

Want to not use the library definition? Define it yourself in X. No need for hooking. No need for anything clever at all. Just define it in your .obj file.

----

Now suppose X.obj and Y.obj both define foo. Link with:

    link X.obj Y.obj A.lib B.lib C.lib

You get a message:

    Multiple definition of "foo", found in X.obj and Y.obj

because order does not matter for .obj files as far as symbols go. All the symbols in .obj files get added.
Feb 10 2022
next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
 Now suppose X.obj and Y.obj both define foo. Link with:

     link X.obj Y.obj A.lib B.lib C.lib

 You get a message:

     Multiple definition of "foo", found in X.obj and Y.obj
Unless your compiler places all functions in COMDATs of course. https://github.com/dlang/dmd/blob/a176f0359a07fa5a252518b512f3b085a43a77d8/src/dmd/backend/backconfig.d#L303 https://issues.dlang.org/show_bug.cgi?id=15342
Feb 11 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 1:42 AM, Dennis wrote:
 On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
 Now suppose X.obj and Y.obj both define foo. Link with:

     link X.obj Y.obj A.lib B.lib C.lib

 You get a message:

     Multiple definition of "foo", found in X.obj and Y.obj
Unless your compiler places all functions in COMDATs of course. https://github.com/dlang/dmd/blob/a176f0359a07fa5a252518b512f3b085a43a77d8/src/dmd/backend/backconfig.d#L303 https://issues.dlang.org/show_bug.cgi?id=15342
Yes, common blocks (of which COMDATs are a kind) are all treated as identical and one is selected, but only if they have already been added by the linker. If the linker finds a COMDAT that resolves an undefined symbol, it does not go looking for another one. COMDATs came about because C++ has a proclivity to spew identical functions into multiple object files. D does, too.
Feb 11 2022
prev sibling next sibling parent Dennis <dkorpel gmail.com> writes:
On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
 Now suppose X.obj and Y.obj both define foo. Link with:

     link X.obj Y.obj A.lib B.lib C.lib

 You get a message:

     Multiple definition of "foo", found in X.obj and Y.obj
Don't rely on this when using DMD though, since it likes to place all functions in COMDATs, meaning the linker will just pick one `foo` instead of raising an error. https://github.com/dlang/dmd/blob/a176f0359a07fa5a252518b512f3b085a43a77d8/src/dmd/backend/backconfig.d#L303 https://issues.dlang.org/show_bug.cgi?id=15342
Feb 11 2022
prev sibling next sibling parent sfp <sfp hush.ai> writes:
On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
 On 2/10/2022 7:45 PM, max haughton wrote:
 If by hook you mean a callback of sorts that can be overrided, 
 then the problem solved is not strictly the same as a weakly 
 defined function. If you have multiple library's in the same 
 playpen then it simply doesn't work to have them all trying to 
 override the same symbols. If they can neatly hook and unhook 
 things that goes away.
That's not how multiple libraries work. Suppose you have 3 libraries, A, B, and C. You have an object file X. The linker command is: link X.obj A.lib B.lib C.lib X refers to "foo". All 4 define "foo". Which one gets picked? X.foo That's it. There are no unresolved symbols to look for. Now, suppose only B and C define "foo". Which one gets picked? B.foo because it is not in X. Then, A is looked at, and it is not in A. Then, B is looked at, and it is in B. C is not looked at because it is now resolved. It has nothing to do with weak definitions. It's a simple "foo" is referenced. Got to find a definition. Look in the libraries in the order they are supplied to the linker. That's it. Want to not use the library definition? Define it yourself in X. No need for hooking. No need for anything clever at all. Just define it in your .obj file. ---- Now suppose X.obj and Y.obj both define foo. Link with: link X.obj Y.obj A.lib B.lib C.lib You get a message: Multiple definition of "foo", found in X.obj and Y.obj because order does not matter for .obj files as far as symbols go. All the symbols in .obj files get added.
You have now successfully explained this to at least one programmer! :-) Very good explanation, and very simple mechanism indeed. Had no idea it worked this way. Inspired by this, I did a little searching and found this blog post: http://www.samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/ One of these days I should get around to learning all the things the toolchain can actually do for me!
Feb 11 2022
prev sibling parent reply max haughton <maxhaton gmail.com> writes:
On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
 On 2/10/2022 7:45 PM, max haughton wrote:
 If by hook you mean a callback of sorts that can be overrided, 
 then the problem solved is not strictly the same as a weakly 
 defined function. If you have multiple library's in the same 
 playpen then it simply doesn't work to have them all trying to 
 override the same symbols. If they can neatly hook and unhook 
 things that goes away.
That's not how multiple libraries work. Suppose you have 3 libraries, A, B, and C. You have an object file X. The linker command is: link X.obj A.lib B.lib C.lib X refers to "foo". All 4 define "foo". Which one gets picked? X.foo That's it. There are no unresolved symbols to look for. Now, suppose only B and C define "foo". Which one gets picked? B.foo because it is not in X. Then, A is looked at, and it is not in A. Then, B is looked at, and it is in B. C is not looked at because it is now resolved. It has nothing to do with weak definitions. It's a simple "foo" is referenced. Got to find a definition. Look in the libraries in the order they are supplied to the linker. That's it. Want to not use the library definition? Define it yourself in X. No need for hooking. No need for anything clever at all. Just define it in your .obj file. ---- Now suppose X.obj and Y.obj both define foo. Link with: link X.obj Y.obj A.lib B.lib C.lib You get a message: Multiple definition of "foo", found in X.obj and Y.obj because order does not matter for .obj files as far as symbols go. All the symbols in .obj files get added.
If all the libraries rely on hooking something, you will silently break all but one, whereas the process of overriding a runtime hook can be made into an atomic operation that fails in a reasonable manner if wielded incorrectly.

Doing things based on the order at link-time is simply not good practice in the general case. It's OK if you control everything in the stack and want to (say) override malloc, but controlling what happens on an assertion is exactly the kind of thing that resolution at link-time can make into a real nightmare to do cleanly (and mutably: you might want to catch assertions differently when acting as a web server than when loading data).

Also, linking (especially around shared libraries) doesn't work in exactly the same way on all platforms, so minimizing the entropy of a given link (minimize possible outcomes, so minimal magic) can be a real win when it comes to making a program that builds and runs reliably on different platforms. At Symmetry we have had real issues with shared libraries, for reasons more complicated than mentioned here granted, so we actually cannot ship anything with dmd even if we wanted to.
Feb 11 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2022 9:20 AM, max haughton wrote:
 If all the libraries rely on hooking something you will silently break all but one, whereas the process of overriding a runtime hook can be made into an atomic operation that can fail in a reasonable manner if wielded incorrectly.
Sorry, I don't follow that. I don't know what atomic ops have to do with it.
 Doing things based on the order at link-time is simply not good practice in the general case. It's OK if you control all the things in the stack and want to (say) override malloc, but controlling what happens on an assertion is exactly the kind of thing that resolution at link-time can make into a real nightmare to do cleanly (and mutably, you might want to catch assertions differently when acting as a web server than when loading data).
All link operations conform to the ordering I described. I can't think of a way that is simpler, cleaner, or easier to understand. Hooking certainly ain't.
 Also linking (especially around shared libraries) doesn't work in exactly the 
 same way on all platforms, so basically maximizing the entropy of a given link 
 (minimize possible outcomes, so minimal magic) can be a real win when it comes 
 to making a program that builds and runs reliably on different platforms. At 
 Symmetry we have had real issues with shared libraries, for reasons more 
 complicated than mentioned here granted, so we actually cannot ship anything 
 with dmd even if we wanted to.
DLLs (shared libraries) are a different story because they are all-or-nothing. In fact, they aren't actually libraries at all in the programming sense. They aren't linked in, either, there's no linking involved when accessing a DLL.
Feb 11 2022
prev sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Thursday, 10 February 2022 at 22:52:45 UTC, Walter Bright 
wrote:
 On 2/10/2022 2:06 PM, Dave P. wrote:
 Undefined symbols for architecture arm64:
    "__D7example9some_funcFiZi", referenced from:
        __D7example3fooFiZi in example.o
        __D7example3barFiZi in example.o
        __D7example3bazFiZi in example.o
        __D7example3quxFiZi in example.o
        __Dmain in example.o
 ld: symbol(s) not found for architecture arm64
Things I have never been able to explain, even to long time professional programmers: 1. what "undefined symbol" means 2. what "multiply defined symbol" means 3. how linkers resolve symbols Our own runtime library illustrates this bafflement. In druntime, there are these "hooks" where one can replace the default function that deals with assertion errors. Such hooks are entirely unnecessary. To override a symbol in a library, just write your own function with the same name and link it in before the library. I have never been able to explain these to people. I wonder if it is because it is so simple, people think "that can't be right". With the hook thing, they'll ask me to re-explain it several times, then they'll say "are you sure?" and they still don't believe it.
I absolutely don’t want my executable defined by the order things happen to appear on the linker command line. I don’t want that incidentally and I don’t want to do it deliberately. The boat sailed on this long ago, I just want everything to be in the executable please with errors on duplicates, unless it’s dead code. Same goes for import paths btw. I don’t want imports selected based on the order of import paths, I want hard errors on any duplication of fully-qualified modules. D has amazing compile-time features for deciding what to compile or not, what to call and not. I want to use those, not rely on the details of how I cobble together my build (or how some automated tool does it for me).
Feb 12 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/12/2022 2:00 AM, John Colvin wrote:
 I absolutely don’t want my executable defined by the order things happen to appear on the linker command line. I don’t want that incidentally and I don’t want to do it deliberately. The boat sailed on this long ago, I just want everything to be in the executable please with errors on duplicates, unless it’s dead code.
For better or worse, that's how linkers work. Though you could write a tool to scan libraries for multiple definitions. Most of the work is already done for you in dmd's source code.
Feb 12 2022
prev sibling parent rempas <rempas tutanota.com> writes:
On Thursday, 10 February 2022 at 22:06:30 UTC, Dave P. wrote:
 I think it would be interesting to combine a compiler and a 
 linker into a single executable. Not necessarily for speed 
 reasons, but for better diagnostics and the possibility of type 
 checking external symbols. Linker errors can sometimes be hard 
 to understand in the presence of inlining and optimizations. 
 The linker will report references to symbols not present in 
 your code or present in completely different places.

 For example:

 ```D
 extern(D) int some_func(int x);

 pragma(inline, true)
 private int foo(int x){
     return some_func(x);
 }

 pragma(inline, true)
 private int bar(int x){
     return foo(x);
 }

 pragma(inline, true)
 private int baz(int x){
     return bar(x);
 }

 pragma(inline, true)
 private int qux(int x){
     return baz(x);
 }

 int main(){
     return qux(2);
 }

 ```

 When you go to compile it:

 ```sh
 Undefined symbols for architecture arm64:
   "__D7example9some_funcFiZi", referenced from:
       __D7example3fooFiZi in example.o
       __D7example3barFiZi in example.o
       __D7example3bazFiZi in example.o
       __D7example3quxFiZi in example.o
       __Dmain in example.o
 ld: symbol(s) not found for architecture arm64
 clang: error: linker command failed with exit code 1 (use -v to 
 see invocation)
 Error: /usr/bin/cc failed with status: 1
 ```

 The linker sees references to the extern function in places 
 where I never wrote that in my source code. In a nontrivial 
 project this can be quite confusing if you’re not used to this 
 quirk of the linking process.

 If the compiler is invoking the linker for you anyway, why 
 can’t it read the object files and libraries and tell you 
 exactly what is missing and where in your code you reference it?
Yeah, error messages could ALWAYS be better in any compiler (even rustc) at any time. This design would make it even easier to do like you explained. Thank you!
Feb 11 2022
prev sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 A couple of months ago, I found out about a language called 
 [Vox](https://github.com/MrSmith33/vox) which uses a design 
 that I haven't seen before by any other compiler which is to 
 not create object files and then link them together but 
 instead, always create an executable at once.
TCC (*Tiny C Compiler*) has done this for well over a decade. TCC grew out of Bellard's entry in an obfuscated-C programming contest and was then updated to be more complete. https://www.bellard.org/tcc/

I believe most of a compiler's code base involves optimization for various architectures and versions of CPUs, along with cross-compiling. GNU/GCC has tons of legacy code in the back that it still uses, I believe.

To note, back in 1996 or thereabouts I wrote an assembler that took x86 and could compile itself. It wasn't compatible with any other code and couldn't use object files or anything (*as it was all made from scratch when I was 12-14*), but it did compile directly to a COM file. I'll just say from experience: there are advantages, but they don't outweigh the disadvantages. That's my flat opinion going from here.
Feb 10 2022
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/10/2022 8:18 PM, Era Scarecrow wrote:
  To note, back in 1996 or about there i wrote an assembler that took x86 and could compiler itself. But wasn't compatible with any other code and couldn't use object files or anything (*as it was all made from scratch when i was 12-14*). However it did compiler directly to a COM file. I'll just say from experience, there are advantages but they don't outweigh the disadvantages. That's my flat opinion going from here.
Back in the olden days, creating a DOS executable was trivial. Things have gotten much more complicated.
Feb 10 2022
prev sibling next sibling parent reply max haughton <maxhaton gmail.com> writes:
On Friday, 11 February 2022 at 04:18:42 UTC, Era Scarecrow wrote:
 On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 A couple of months ago, I found out about a language called 
 [Vox](https://github.com/MrSmith33/vox) which uses a design 
 that I haven't seen before by any other compiler which is to 
 not create object files and then link them together but 
 instead, always create an executable at once.
TCC (*Tiny C Compiler*) does this like 10 years ago. TCC was originally made as part of the obfuscation programming challenge, and then got updated to be more complete. https://www.bellard.org/tcc/ I believe most of the compilers base is involving optimization for various architectures and versions of CPU's, along with cross-compiling. GNU/GCC has tons of legacy code in the back that it still uses i believe. To note, back in 1996 or about there i wrote an assembler that took x86 and could compiler itself. But wasn't compatible with any other code and couldn't use object files or anything (*as it was all made from scratch when i was 12-14*). However it did compiler directly to a COM file. I'll just say from experience, there are advantages but they don't outweigh the disadvantages. That's my flat opinion going from here.
Optimizations are slow, and optimizations that aren't a total mess when implemented require abstraction. Making those abstractions cheap is difficult, so you end up with LLVM and GCC being slower even on debug builds because they have more layers of abstraction (or rather take fewer shortcuts). It's probably very possible to equalise this performance with a more niche compiler, but it would also probably require a really immense effort, starting from scratch around a new concept (a la LLVM).

As for legacy code, there probably are branches being tested for old processors in places, but for the most part GCC's algorithms may look a bit crude because of their C heritage (i.e. some of GCC's development practices are very 1980s compared to LLVM, and will probably scare off new money and minds and kill the project in the long run), but they are still the benchmark to beat. The Itanium scheduler won't be running on an X86 target, to be clear.

I'm also not convinced the compiler assembling code itself is all that useful. It probably is marginally faster, but on a modern system I couldn't measure it as significant on basically any workload. It's basically performance theatre; the performance of the semantic analysis, or of moving bytes around prior to object code however it's emitted, is much more important.

The dmd backend gets a 6/10 for me when it comes to performance. The algorithms are very simple; it should really be faster than it is. The parts that actually emit the object code are particularly slow.
Feb 10 2022
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/10/2022 10:36 PM, max haughton wrote:
 The dmd backend gets a 6/10 for me when it comes to performance. The
algorithms 
 are very simple, it should really be faster than it is. The parts that
actually 
 emit the object code are particularly slow.
Much of that comes from supporting 4 very different object file formats.
Feb 11 2022
prev sibling next sibling parent rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 04:18:42 UTC, Era Scarecrow wrote:
  I believe most of the compilers base is involving optimization 
 for various architectures and versions of CPU's, along with 
 cross-compiling.
Yeah, but when I don't cross-compile, I only compile for one OS and one instruction set. Code for the other cases will not get executed, so I cannot see how this can play a role. TCC also supports a lot of architectures and operating systems (even Windows natively, if I'm not wrong). Unless I don't understand what you mean...
 GNU/GCC has tons of legacy code in the back that it still uses 
 i believe.
Yeah, that's a problem we will never be able to solve. New and better practices will always be invented, so to get the best possible performance we must always re-write stuff (or parts of it), and in the case of big compilers this will be a pain in the ass, and I understand it...
 To note, back in 1996 or thereabouts I wrote an assembler that 
 took x86 and could compile itself. But it wasn't compatible with 
 any other code and couldn't use object files or anything (*as 
 it was all made from scratch when I was 12-14*). However, it did 
 compile directly to a COM file. I'll just say from experience, 
 there are advantages but they don't outweigh the disadvantages. 
 That's my flat opinion going from here.
I wonder what we can do to keep the advantages and take away the disadvantages. The second idea I had is probably the answer but I would like someone to say something about it directly. Thank you for your time!
Feb 11 2022
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 11 February 2022 at 04:18:42 UTC, Era Scarecrow wrote:
 On Thursday, 10 February 2022 at 09:41:12 UTC, rempas wrote:
 A couple of months ago, I found out about a language called 
 [Vox](https://github.com/MrSmith33/vox) which uses a design 
 that I haven't seen before by any other compiler which is to 
 not create object files and then link them together but 
 instead, always create an executable at once.
TCC (*Tiny C Compiler*) has been doing this for ages. TCC originally started as Bellard's entry in the Obfuscated C Code Contest, and was then extended to be a more complete compiler. https://www.bellard.org/tcc/
If one wants to get really historic, it is also what Turbo Pascal did up to version 3.0. With Turbo Pascal 4.0 they went back to the more classic object file/linker model, and there is a good reason for that: separate compilation, and linking modules and libraries, are a thing. If you build the compiler for direct executable production, you still have to support normal object file/library handling, i.e. you put the functionality of the linker into your compiler.
Feb 11 2022
next sibling parent rempas <rempas tutanota.com> writes:
On Friday, 11 February 2022 at 17:36:03 UTC, Patrick Schluter 
wrote:
 If one wants to get really historic, it is also what Turbo 
 Pascal did up to version 3.0. With Turbo Pascal 4.0 they went 
 back to the more classic object file/linker model, and there is 
 a good reason for that: separate compilation, and linking 
 modules and libraries, are a thing. If you build the compiler 
 for direct executable production, you still have to support 
 normal object file/library handling, i.e. you put the 
 functionality of the linker into your compiler.
Yep, and that's what I love about it! You can have two ways to do the same thing and choose based on what's best for the case.

For example, if your project has 10M LoC, even if you can compile 1M LoC/s (which is a very big number), your project will need 10 seconds to build, which will make it very annoying. In that case, we use the classic method of creating object files for the files that were changed and then linking them together. However, if your project is 1M LoC or less, that is less than 1 second to build, which is not noticeable at all. The same goes when the end-user compiles the software from source and doesn't care about (and won't even keep) the object files, because he/she is not a developer. In that case it makes sense not to waste time creating the object files and to go straight to creating the executable/library.

If we are to make a new compiler (which I plan to), we should create a whole toolchain that consists of all the tools. Sounds complex, I know, but what's the point if we don't advance? Make another compiler that outputs assembly, so it will always have dependencies and be slow to compile (slow compared to outputting machine code directly)?
Feb 11 2022
prev sibling parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Friday, 11 February 2022 at 17:36:03 UTC, Patrick Schluter 
wrote:
 If one wants to get really historic, it is also what Turbo 
 Pascal did up to version 3.0. With Turbo Pascal 4.0 they went 
 back to the more classic object file/linker model
Mmm, hard to say on various compilers; I never had the money when I was younger to pay for said compilers/toolsets, and now most of them (*the currently popular ones*) are free (*I might have a couple of Turbo compilers with a C++ programming book, but I never touched them*). No doubt many earlier commercial compilers didn't have separate architectures and probably just did x86. But it's been a long time since the 16-bit MS-DOS age, when that was more common. Though if optimizations are dropped you can probably have a very lean toolset, maybe even enough to build an entire distro from sources on a CD. Though the last time I tried to build libc it took a very long time; not recommended.
Feb 11 2022