digitalmars.D - How hard would it be to create a new backend in D?
- rempas (13/13) Aug 05 2022 I was wondering how easy it would be to create a new backend in
- IGotD- (10/24) Aug 05 2022 Why would you want to create a new backend when you have LLVM and
- rempas (22/32) Aug 05 2022 Because LDC (llvm) is terribly slow (and GDC is worse in my
- user1234 (5/19) Aug 05 2022 I think you can try things without writing a backend. If I
- rempas (25/30) Aug 05 2022 LLVM is the slow part of LDC (D's frontend is actually faster than
- Walter Bright (5/13) Aug 06 2022 The dmd backend is already in D :-)
- welkam (6/10) Aug 06 2022 But where is inliner located? I read that several other languages
- Paul Backus (3/9) Aug 06 2022 DMD's inliner is being moved to the backend:
- welkam (2/11) Aug 06 2022 Awesome.
- Walter Bright (10/24) Aug 07 2022 DMD has two inliners, a front end one and a back end one. Here's the bac...
- rempas (3/7) Aug 06 2022 Thank you! It just happened that I missed this reply and wouldn't
I was wondering how easy it would be to create a new backend in D? From what I know, all three of DMD, LDC and GDC use the same frontend, which is the one from DMD. So I would suppose that there is a way to do that and that DMD has probably evolved over the years to make its frontend more portable. However, how straightforward is it? Is it legit to do it or will it require "hacking" (if you understand what I mean)? So it's not about working hard but about whether the work makes sense or whether I'll constantly find obstacles that no one would be able to help me with. I know some of you will ask, so: I want to use [mir](https://github.com/vnmakarov/mir) as a backend because of its fast compile times and its great runtime performance! So yeah, not a question purely out of curiosity!
Aug 05 2022
On Friday, 5 August 2022 at 20:37:19 UTC, rempas wrote:
> I was wondering how easy it would be to create a new backend in D? From what I know, all three of DMD, LDC and GDC use the same frontend, which is the one from DMD. So I would suppose that there is a way to do that and that DMD has probably evolved over the years to make its frontend more portable. However, how straightforward is it? Is it legit to do it or will it require "hacking" (if you understand what I mean)? So it's not about working hard but about whether the work makes sense or whether I'll constantly find obstacles that no one would be able to help me with. I know some of you will ask, so: I want to use [mir](https://github.com/vnmakarov/mir) as a backend because of its fast compile times and its great runtime performance! So yeah, not a question purely out of curiosity!

Why would you want to create a new backend when you have LLVM and GCC? Scrap the DMD backend. The only claimed benefit is that it is fast, which isn't much compared to the amount of time spent on maintenance. It's not the 90s anymore, when you only had an Intel Pentium target. Now you have many Intel optimization options, ARM (many models there as well), RISC-V, PowerPC etc., and more are coming. Let's focus on the language, because CPU support is out of scope for the D project.
Aug 05 2022
On Friday, 5 August 2022 at 21:54:52 UTC, IGotD- wrote:
> Why would you want to create a new backend when you have LLVM and GCC? Scrap the DMD backend. The only claimed benefit is that it is fast, which isn't much compared to the amount of time spent on maintenance.

Because LDC (LLVM) is terribly slow (and GDC is worse on my system)! I have a dream to create a cross-platform toolchain and package/project manager (no, DUB will not do) to create a new ecosystem where everyone will compile everything from source, with the many benefits this adds! But with a compiler that is so slow, there is no way that anyone (including me) will want to compile everything from source. So my goal is simply not achievable with the current compilers. Mir is about 4-5 times faster than `GCC -O2` while having about 83%-85% of its runtime performance (on average). It's still not at TCC level, but it's much, much better!

> It's not the 90s anymore, when you only had an Intel Pentium target. Now you have many Intel optimization options, ARM (many models there as well), RISC-V, PowerPC etc., and more are coming.

I don't understand what you're trying to say with this one. Do you mean that LLVM has support for a lot of CPU ISAs so it's a good backend? If yes, then mir has support for a couple of them as well.

> Let's focus on the language, because CPU support is out of scope for the D project.

Again, not sure what you mean with that...
Aug 05 2022
On Saturday, 6 August 2022 at 04:33:41 UTC, rempas wrote:
> a new ecosystem where everyone will compile everything from source, with the many benefits this adds!

So a Gentoo? Now since Gentoo has been mentioned, someone is required to mention Arch in the responses. The only benefit I would want from compiling everything myself instead of downloading precompiled binaries is that I could enable specific optimizations for my system. The only backends that have all those optimizations are GCC and LLVM. What other benefits do you see that are worth the hassle?

I think when talking about creating executables from source code it's better to use the word build instead of compile to describe that process. In order to build the program you need to compile and link. The last time I built a debug version of DMD on my system, around half of the time was spent linking, so even if you can get a significantly faster backend, the whole build time won't change significantly. You need to think about the whole pipeline if you want big changes.
Aug 06 2022
On Saturday, 6 August 2022 at 19:35:16 UTC, welkam wrote:
> So a Gentoo? Now since Gentoo has been mentioned, someone is required to mention Arch in the responses. The only benefit I would want from compiling everything myself instead of downloading precompiled binaries is that I could enable specific optimizations for my system. The only backends that have all those optimizations are GCC and LLVM. What other benefits do you see that are worth the hassle?

Having the ability to create custom builds! Most of the time this is not a problem, but there may be a case where you have to build your own version of a package (because you need the 'X' feature) and the built-in package manager may not have great support for mixing its packages with your local ones. Another one: cross-compatibility! No more wasting time on platform-specific bugs because this distro has this built in and that distro has this option enabled, yada yada yada. One package manager, all the systems! And as everything and everyone builds from source, we can have one cross-compiler and we can build a local version of the program. No more "upload your binary" which, yes, happened to me (even though we did find the solution without having to do that, but still...).

> I think when talking about creating executables from source code it's better to use the word build instead of compile to describe that process. In order to build the program you need to compile and link. The last time I built a debug version of DMD on my system, around half of the time was spent linking, so even if you can get a significantly faster backend, the whole build time won't change significantly. You need to think about the whole pipeline if you want big changes.

That's fair, but linkers have become faster and faster! See `lld` for example! In my experience, most projects (at least the C ones, as this is where we have tons of huge projects to test) spend around 90% of the whole build compiling when using `lld` as the linker (and no link-time optimization). I've seen C++ projects being different, as C++ probably outputs more complex symbols so the linker has to do more work, and I don't remember for D (even though I'm fairly sure that it falls in the C category where link times are faster). So in the end, I think that improving the compile time (outputting object files) does matter. Also, there is always the ability to create the final executable in one shot! This can be extremely useful in release builds when you don't care about the object files anyway. Is there any D backend that has been built to do that? Actually, the only compilers that I personally know to do that are [Vox](https://github.com/MrSmith33/vox) and [Vlang](https://github.com/vlang/v) when using its "native" backend.
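The two positions above can be compared with Amdahl's law applied to a build. As a rough sketch: let $p$ be the fraction of build time spent compiling (as opposed to linking) and $s$ the factor by which the compile stage gets faster. The value $s = 5$ is an assumption for illustration only, borrowed from the "4-5 times faster" figure quoted earlier in the thread, and applying it to the whole compile stage is optimistic because the frontend's share would not speed up.

```latex
% Overall build speedup when only the compile stage gets faster
S_{\text{build}} = \frac{1}{(1 - p) + \frac{p}{s}}

% Half of the build is linking: p = 0.5, s = 5
S_{\text{build}} = \frac{1}{0.5 + 0.1} \approx 1.67

% 90% of the build is compiling: p = 0.9, s = 5
S_{\text{build}} = \frac{1}{0.1 + 0.18} \approx 3.57
```

Under these assumptions, how much a faster backend pays off depends mostly on the link share of the project at hand, which is why measuring the whole pipeline first is worthwhile.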
Aug 06 2022
On Friday, 5 August 2022 at 20:37:19 UTC, rempas wrote:
> I was wondering how easy it would be to create a new backend in D? From what I know, all three of DMD, LDC and GDC use the same frontend, which is the one from DMD. So I would suppose that there is a way to do that and that DMD has probably evolved over the years to make its frontend more portable. However, how straightforward is it? Is it legit to do it or will it require "hacking" (if you understand what I mean)? So it's not about working hard but about whether the work makes sense or whether I'll constantly find obstacles that no one would be able to help me with. I know some of you will ask, so: I want to use [mir](https://github.com/vnmakarov/mir) as a backend because of its fast compile times and its great runtime performance! So yeah, not a question purely out of curiosity!

I think you can try things without writing a backend. If I believe the diagram of what currently works, it seems possible to use LDC to produce LLVM IR, and then MIR in theory could use that. But MIR does not produce native executables, so why?
Aug 05 2022
On Friday, 5 August 2022 at 22:09:44 UTC, user1234 wrote:
> I think you can try things without writing a backend. If I believe the diagram of what currently works, it seems possible to use LDC to produce LLVM IR, and then MIR in theory could use that, ...

LLVM is the slow part of LDC (D's frontend is actually faster than C's frontend for LLVM, as LDC compiles code faster than Clang), so this will not help...

> but MIR does not produce native executables, so why?

What do you mean by native? ELF (at least for Unix)? Mir has its own format, which is a binary format (including machine instructions). It then uses its own linker, so no problem! The format is also cross-platform, so in general there are no problems with it. It also uses JIT, so it can probably do some more optimizations (or at least there is room to make them; I don't know what happens at the moment). When it comes to its runtime performance, there is a directory called "c-benchmarks" where I have run the tests (compiling mir from the `bbv` branch): compared to `GCC -O2`, mir has about 83%-85% of its runtime performance while compiling code about 4-5 times faster on average! If you wonder why I care about compilation times so much, please check my other reply in this thread.
Aug 05 2022
On Saturday, 6 August 2022 at 04:41:24 UTC, rempas wrote:
> [...]

In what way do you wish to use MIR? A D frontend that generates MIR or some kind of LLVM-MIR pass? Could be an interesting project if not quite ambitious. I wouldn't let the previous user discourage you here. :)
Aug 05 2022
On Saturday, 6 August 2022 at 05:28:06 UTC, cmyka wrote:
> In what way do you wish to use MIR? A D frontend that generates MIR or some kind of LLVM-MIR pass? Could be an interesting project if not quite ambitious. I wouldn't let the previous user discourage you here. :)

Here's the thing... I don't know! That's the case. I wonder what D's frontend (which is DMD's frontend practically) generates. It has to generate some kind of global IR and then probably LDC takes that and turns it into LLVM IR and GDC takes it and turns it into GCC IR. So I will suppose that it has to be some kind of independent middle representation that DMD does. I don't think that there is another way that things can happen... So I wonder how I can get started and if the process is straightforward because if it is not, then I may also think about building a language that is based on D or something like that, idk...

As for getting discouraged, I wouldn't see it as if someone tried to discourage me. I think that the guys said their opinion nicely so in any way, I'm still thinking about it. But making my own frontend is still in the corner. The result of this thread will show!
Aug 05 2022
On Saturday, 6 August 2022 at 04:41:24 UTC, rempas wrote:
> On Friday, 5 August 2022 at 22:09:44 UTC, user1234 wrote:
>> I think you can try things without writing a backend. If I believe the diagram of what currently works, it seems possible to use LDC to produce LLVM IR, and then MIR in theory could use that, ...
>
> LLVM is the slow part of LDC (D's frontend is actually faster than C's frontend for LLVM, as LDC compiles code faster than Clang), so this will not help...

I suggested experimenting with MIR like that; it was not a proposal for the final design. Experimenting using LLVM IR could be useful to determine whether working seriously on the project is worth it. Anyway, if you want to put your hands on the hard stuff from the start, I think you have two options:

1. Create an AST visitor that generates the MIR format after DMDFE semantics.
2. Create the MIR representation after the part of the backend that generates DMD IR (s2ir, e2ir, etc.) has run.

The second option might be easier because the productions will most of the time map 1:1 to a MIR equivalent. The first option would IMO be harder because of forward references and imports, and even without that, it would require splitting the visiting into several passes (decls, aggregate members, function headers, function bodies).

About the "how hard": I think that compiler programming is not hard but it takes time. I estimate that this could take you from 1 month to 3 months to finish; however, you'd get results much earlier, e.g. if you handle just a few constructs.
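To illustrate why the first option tends to need several passes, here is a minimal sketch in D of a two-pass walk: one pass collects declarations, a second emits code for bodies, so a call to a function declared later still resolves. Everything in it is an assumption made for illustration: the `Func` struct, the helper names and the emitted pseudo-instructions are invented, and they are neither DMD's AST nor MIR's actual textual format.

```d
import std.conv : to;
import std.stdio : writeln;

// Toy AST node: a function that returns a constant or the result of a call.
struct Func
{
    string name;
    string callee; // empty: return `value`; otherwise: return callee()
    long value;
}

// Pass 1: record every declared function name before looking at any body.
bool[string] collectDecls(Func[] funcs)
{
    bool[string] known;
    foreach (f; funcs)
        known[f.name] = true;
    return known;
}

// Pass 2: emit made-up pseudo-instructions for each body.
string[] emitBodies(Func[] funcs, bool[string] known)
{
    string[] lines;
    foreach (f; funcs)
    {
        lines ~= "func " ~ f.name ~ ":";
        if (f.callee.length == 0)
        {
            lines ~= "  ret " ~ f.value.to!string;
        }
        else
        {
            // Resolves even if the callee appears later in the source.
            assert(f.callee in known, "unknown function " ~ f.callee);
            lines ~= "  t0 = call " ~ f.callee;
            lines ~= "  ret t0";
        }
    }
    return lines;
}

void main()
{
    // `first` calls `second`, which is declared after it (a forward reference).
    Func[] funcs = [Func("first", "second", 0), Func("second", "", 42)];
    foreach (line; emitBodies(funcs, collectDecls(funcs)))
        writeln(line);
}
```

A single pass would hit `first` before it knows `second` exists; splitting the walk is the simplest way around that, at the cost of visiting the tree more than once.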
Aug 06 2022
On Saturday, 6 August 2022 at 07:10:43 UTC, user1234 wrote:
> I suggested experimenting with MIR like that; it was not a proposal for the final design. Experimenting using LLVM IR could be useful to determine whether working seriously on the project is worth it. Anyway, if you want to put your hands on the hard stuff from the start, I think you have two options: 1. Create an AST visitor that generates the MIR format after DMDFE semantics. 2. Create the MIR representation after the part of the backend that generates DMD IR (s2ir, e2ir, etc.) has run. The second option might be easier because the productions will most of the time map 1:1 to a MIR equivalent. The first option would IMO be harder because of forward references and imports, and even without that, it would require splitting the visiting into several passes (decls, aggregate members, function headers, function bodies).

Thank you for the info! The thing is (and why I asked the question originally): how do I find info about how to get started? I don't even know how DMD works and how its IR works. Does DMD's frontend parse the text and then output something like LLVM IR (but for DMD) which we can take and then translate to the final backend that we need (in our case mir), or something else? That's what I want to know. So yeah, is there legit documentation or something, or do backend developers have to guess how things work and do "hacking"?

> About the "how hard": I think that compiler programming is not hard but it takes time. I estimate that this could take you from 1 month to 3 months to finish; however, you'd get results much earlier, e.g. if you handle just a few constructs.

That's actually pretty nice! I don't mind putting in the work, but I do mind whether the work is straightforward and makes sense. I would expect to see actual documentation and info about how things work in detail. If not, then I would probably spend the time to design and implement my own language.
Aug 06 2022
On Saturday, 6 August 2022 at 08:04:37 UTC, rempas wrote:
> On Saturday, 6 August 2022 at 07:10:43 UTC, user1234 wrote:
>> I suggested experimenting with MIR like that; it was not a proposal for the final design. Experimenting using LLVM IR could be useful to determine whether working seriously on the project is worth it. Anyway, if you want to put your hands on the hard stuff from the start, I think you have two options: 1. Create an AST visitor that generates the MIR format after DMDFE semantics. 2. Create the MIR representation after the part of the backend that generates DMD IR (s2ir, e2ir, etc.) has run. The second option might be easier because the productions will most of the time map 1:1 to a MIR equivalent. The first option would IMO be harder because of forward references and imports, and even without that, it would require splitting the visiting into several passes (decls, aggregate members, function headers, function bodies).
>
> Thank you for the info! The thing is (and why I asked the question originally): how do I find info about how to get started? I don't even know how DMD works and how its IR works. Does DMD's frontend parse the text and then output something like LLVM IR (but for DMD) which we can take and then translate to the final backend that we need (in our case mir), or something else? That's what I want to know. So yeah, is there legit documentation or something, or do backend developers have to guess how things work and do "hacking"?
>
> [...]

You'll have to read the DMD code to get familiar with its code base (another way in the past was fixing bugs; unfortunately there are not many easy ones anymore). Fortunately you won't have to understand the whole thing. At first I'd suggest that you follow the lifetime of one particular construct, and do that for each big family of nodes. Choose:

- a Type (maybe the one for `int`)
- a Statement (maybe the ReturnStatement)
- a Declaration (the FunctionDeclaration)
- an Expression (maybe the IntegerExp)

Try to follow what is happening during the different passes. That way you'll have a good idea of what the compiler does for

```d
int i(){return 0;}
```

and where you could generate MIR stuff.
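As a reading aid for the exercise suggested above, the following D sketch hand-builds a tree for `int i(){return 0;}` with one node from each family. These are toy classes that only borrow the names mentioned in the post (`FunctionDeclaration`, `ReturnStatement`, `IntegerExp`, plus a stand-in `Type`); they are an assumption for illustration and do not mirror the fields or methods of DMD's real classes.

```d
import std.conv : to;
import std.stdio : writeln;

// Type family: here only a name, e.g. "int".
class Type
{
    string name;
    this(string n) { name = n; }
}

// Expression family.
abstract class Expression { abstract string describe(); }

class IntegerExp : Expression
{
    long value;
    this(long v) { value = v; }
    override string describe() { return "IntegerExp(" ~ value.to!string ~ ")"; }
}

// Statement family.
abstract class Statement { abstract string describe(); }

class ReturnStatement : Statement
{
    Expression exp;
    this(Expression e) { exp = e; }
    override string describe() { return "ReturnStatement(" ~ exp.describe() ~ ")"; }
}

// Declaration family.
class FunctionDeclaration
{
    Type returnType;
    string name;
    Statement fbody;

    this(Type t, string n, Statement b)
    {
        returnType = t;
        name = n;
        fbody = b;
    }

    string describe()
    {
        return "FunctionDeclaration(" ~ returnType.name ~ " " ~ name
            ~ ", " ~ fbody.describe() ~ ")";
    }
}

void main()
{
    // Hand-built tree for: int i(){return 0;}
    auto fd = new FunctionDeclaration(new Type("int"), "i",
        new ReturnStatement(new IntegerExp(0)));
    writeln(fd.describe());
}
```

Seeing the nesting spelled out this way makes it easier to decide which node to trace first when stepping through the real compiler.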
Aug 06 2022
On Saturday, 6 August 2022 at 08:31:16 UTC, user1234 wrote:
> You'll have to read the DMD code to get familiar with its code base (another way in the past was fixing bugs; unfortunately there are not many easy ones anymore). Fortunately you won't have to understand the whole thing. At first I'd suggest that you follow the lifetime of one particular construct, and do that for each big family of nodes. Choose - a Type (maybe the one for `int`) - a Statement (maybe the ReturnStatement) - a Declaration (the FunctionDeclaration) - an Expression (maybe the IntegerExp). Try to follow what is happening during the different passes. That way you'll have a good idea of what the compiler does for `int i(){return 0;}` and where you could generate MIR stuff.

Thanks my friend! I'll try to read and understand the code, and if I end up being able to create anything, I'll share it here! Have a great day!
Aug 06 2022
On 8/5/2022 1:37 PM, rempas wrote:
> I was wondering how easy it would be to create a new backend in D? From what I know, all three of DMD, LDC and GDC use the same frontend, which is the one from DMD. So I would suppose that there is a way to do that and that DMD has probably evolved over the years to make its frontend more portable. However, how straightforward is it? Is it legit to do it or will it require "hacking" (if you understand what I mean)? So it's not about working hard but about whether the work makes sense or whether I'll constantly find obstacles that no one would be able to help me with.

The dmd backend is already in D :-) But since it's all Boost Licensed, anyone can use 0..100% of it for their own backend project. No asking is required. Have fun!
Aug 06 2022
On Saturday, 6 August 2022 at 07:39:36 UTC, Walter Bright wrote:
> The dmd backend is already in D :-) But since it's all Boost Licensed, anyone can use 0..100% of it for their own backend project. No asking is required. Have fun!

But where is the inliner located? I read that several other languages (Jai, Zig) are trying to make their own x86_64 backends because GCC and LLVM are too slow. If the DMD backend had its IR well documented and the inliner implemented outside the frontend, I could see a future where it could be used by other languages.
Aug 06 2022
On Saturday, 6 August 2022 at 18:02:34 UTC, welkam wrote:
> But where is the inliner located? I read that several other languages (Jai, Zig) are trying to make their own x86_64 backends because GCC and LLVM are too slow. If the DMD backend had its IR well documented and the inliner implemented outside the frontend, I could see a future where it could be used by other languages.

DMD's inliner is being moved to the backend: https://github.com/dlang/dmd/pull/14194
Aug 06 2022
On Saturday, 6 August 2022 at 18:35:18 UTC, Paul Backus wrote:
> On Saturday, 6 August 2022 at 18:02:34 UTC, welkam wrote:
>> But where is the inliner located? I read that several other languages (Jai, Zig) are trying to make their own x86_64 backends because GCC and LLVM are too slow. If the DMD backend had its IR well documented and the inliner implemented outside the frontend, I could see a future where it could be used by other languages.
>
> DMD's inliner is being moved to the backend: https://github.com/dlang/dmd/pull/14194

Awesome.
Aug 06 2022
On 8/6/2022 11:02 AM, welkam wrote:
> On Saturday, 6 August 2022 at 07:39:36 UTC, Walter Bright wrote:
>> The dmd backend is already in D :-) But since it's all Boost Licensed, anyone can use 0..100% of it for their own backend project. No asking is required. Have fun!
>
> But where is the inliner located?

DMD has two inliners, a front end one and a back end one. Here's the back end one: https://github.com/dlang/dmd/blob/master/compiler/src/dmd/backend/inliner.d

> I read that several other languages (Jai, Zig) are trying to make their own x86_64 backends because GCC and LLVM are too slow. If the DMD backend had its IR well documented and the inliner implemented outside the frontend, I could see a future where it could be used by other languages.

The IR is very simple: https://github.com/dlang/dmd/blob/master/compiler/src/dmd/backend/el.d#L70 It's a binary tree. Not clever at all. It's about 50 lines of declaration. To see it in action, use the --b --f switches:

`./dmd test.d -c --b --f`

which will pretty print the IR before and after optimization. There are some other switches, like --r, which will show the register allocator at work.
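For readers who have not worked with a tree-shaped IR, here is a small D sketch of the general idea: each node holds an operator and up to two children, and an optimization such as constant folding rewrites the tree bottom-up. This is a toy written for this thread, not DMD's actual `elem` declaration; the `Op` set, the `Node` class and the helper functions are all hypothetical, and the printed format is not what the `--b --f` switches produce.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical operator set; a real backend has many more.
enum Op { constant, var, add, mul }

// Toy binary-tree IR node (NOT DMD's `elem`).
final class Node
{
    Op op;
    long value;       // meaningful when op == Op.constant
    string name;      // meaningful when op == Op.var
    Node left, right; // children; null for leaves

    this(Op op, long value, string name, Node left, Node right)
    {
        this.op = op;
        this.value = value;
        this.name = name;
        this.left = left;
        this.right = right;
    }
}

Node mkConst(long v) { return new Node(Op.constant, v, null, null, null); }
Node mkVar(string n) { return new Node(Op.var, 0, n, null, null); }
Node mkAdd(Node l, Node r) { return new Node(Op.add, 0, null, l, r); }
Node mkMul(Node l, Node r) { return new Node(Op.mul, 0, null, l, r); }

// Pretty-print the tree as a parenthesized expression.
string show(Node n)
{
    final switch (n.op)
    {
        case Op.constant: return n.value.to!string;
        case Op.var:      return n.name;
        case Op.add:      return "(" ~ show(n.left) ~ " + " ~ show(n.right) ~ ")";
        case Op.mul:      return "(" ~ show(n.left) ~ " * " ~ show(n.right) ~ ")";
    }
}

// Constant folding: evaluate subtrees whose operands are both constants.
Node fold(Node n)
{
    if (n.op == Op.constant || n.op == Op.var)
        return n;
    Node l = fold(n.left);
    Node r = fold(n.right);
    if (l.op == Op.constant && r.op == Op.constant)
        return mkConst(n.op == Op.add ? l.value + r.value : l.value * r.value);
    return new Node(n.op, 0, null, l, r);
}

void main()
{
    Node tree = mkAdd(mkVar("x"), mkMul(mkConst(2), mkConst(3)));
    writeln("before: ", show(tree));       // (x + (2 * 3))
    writeln("after:  ", show(fold(tree))); // (x + 6)
}
```

The appeal of such a representation is that most passes become short recursive functions over two children, which fits the description of the IR as simple rather than clever.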
Aug 07 2022
On Saturday, 6 August 2022 at 07:39:36 UTC, Walter Bright wrote:
> The dmd backend is already in D :-) But since it's all Boost Licensed, anyone can use 0..100% of it for their own backend project. No asking is required. Have fun!

Thank you! It just happened that I missed this reply and wouldn't even see it if it wasn't for another reply that quotes it...
Aug 06 2022