digitalmars.dip.ideas - Create a full C++ parser
- rempas (68/68) Jun 14 **Idea/Problem**
- rempas (17/18) Jun 14 Oh and something I forgot to say. A small parser would of course
- Sergey (22/40) Jun 14 Most of the IT world agreed that even though C++ is quite popular
- Richard (Rikki) Andrew Cattermole (33/33) Jun 14 If you look back in the archives you can see posts from Walter during
- monkyyy (5/14) Jun 14 What are you doing? This isn't happening, why are you giving out a
- Richard (Rikki) Andrew Cattermole (19/36) Jun 14 If someone was willing to shepherd this, they should understand:
- Lance Bachmeier (4/15) Jun 14 I think the best way forward would be taking advantage of the
- rempas (5/8) Jun 15 That seems interesting! In general, it would be nice if we lived
- Dejan Lekic (3/7) Jun 16 In the case of C, C++ and D (and many others) that is exactly
- Andrey Zherikov (5/12) Jun 17 I'm not an expert in compilers, but why can't languages be
- evilrat (16/29) Jun 17 The problem is type-safety, iirc LLVM IR is just about type
- user1234 (10/23) Jun 18 IR is already too late to put things in common. Type information
- rempas (4/6) Jun 25 Yeah, I mean also a way to compile files from multiple languages
- xoxo (6/6) Jun 15 DLang doesn't have a good parser for tooling, nor does it have a
- Johan (4/15) Jun 17 It's been done already:
- rempas (2/5) Jun 25 Ehhhmmmm... Last commit was 5 years ago?!
**Idea/Problem**

OK, I know this will probably not happen, but I like D, so I'm going to give it a shot in the hope that I can convince you why it should be done. Let me start by saying that I do understand how hard it is to write a parser for a programming language, let alone C++. However, I will try to explain why the trouble is more than worth it: it would solve the two biggest problems and complaints that keep people from using D.

First, I think the D language is the best language we currently have. While not without its flaws and things that could be massively improved, it's still better than the second best, which is C++. However, D has one very big disadvantage: library support. A lot of big and powerful libraries are written in C++, and while it is possible to create bindings by hand, doing that for big projects and keeping up with every update (especially major versions) is a real pain.

You might tell me that none of this is new and that you already knew it. What I'm arguing is that the only thing stopping D from being not just more popular than it is, but the most used language out there, is full compatibility with C++! Yeah, I know it sounds like a bold theory, and you may decide that the risk isn't worth the trouble, but let me ask you one thing. What's the biggest reason D hasn't caught up with the big languages? What is the number one complaint that people make about D (the garbage collector is number 2)? Yes, libraries!

Now, D has "importC", and lots of C++ libraries have C bindings. The problem is that C doesn't have classes, so you need to do manual work to create a "D way" of using the library. Also, not every C++ library has C bindings. For example, [Louvre](https://github.com/CuarzoSoftware/Louvre) does not! If I want to use it from D, I have to either:

* Wait for it to get C bindings (and hope they are maintained in the future)
* Manually port the library myself, which also includes porting other C++ libraries, because I would start with the [weston-example](https://github.com/CuarzoSoftware/Louvre/tree/main/src/examples/ ouvre-weston-clone) that they showcase in their repo.

**Improvements**

Implementing a C++ parser would have the following advantages:

* C++ libraries could be used natively in D. This includes "macros" and templates that are not instantiated in the library itself and would need an *additional* instantiation from the project that uses them (making the binding process even more tedious, slow and overall annoying).
* C++ and D code could be combined, letting any C++ project transition to D more easily and smoothly. That would bring even more popularity and trust to the language.
* C++ has smart pointers, which means we could use C++'s standard library for performance-sensitive projects. That would address D's number 2 complaint (the garbage collector) and give people even more reason to see D as a real competitor that can take the place of C++.

**Implementation**

First, such a project will require great knowledge of the D compiler and great skill at writing an efficient parser. That's why, if anyone is to do it, it had better be the DMD contributors.

Second, this is something that will take time, so I believe the best approach is to try parsing some real libraries and keep improving the parser little by little. Implement, test, fix bugs, test, and repeat! Let's start with the STL (so we can use the smart pointers as soon as possible) and then move on to more common and bigger libraries. With steady work, in 2–3 years we will hopefully have a fully working C++ parser without any bugs. Or at least one that can parse the biggest and most important C++ libraries.
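To make the template point under Improvements concrete, here is a minimal C++ sketch. It assumes a hypothetical header-only `RingBuffer` template (the name and members are invented for illustration and come from no real library). The idea it shows: a template produces no linkable code until something instantiates it, so a binding author currently has to add a C++ source file that forces the instantiations the D side wants.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library header: a template the library itself never instantiates.
template <typename T>
struct RingBuffer {
    T items[8] = {};
    std::size_t count = 0;

    // Overwrites the oldest slot once eight values have been stored.
    void push(const T& value) { items[count++ % 8] = value; }
    std::size_t size() const { return count < 8 ? count : 8; }
};

// Nothing for RingBuffer<int> exists in any object file until it is requested.
// This explicit instantiation is the "additional" step done in the consuming project.
template struct RingBuffer<int>;

// A plain function that uses the instantiation, so there is something concrete to call.
std::size_t ring_demo() {
    RingBuffer<int> rb;
    rb.push(1);
    rb.push(2);
    return rb.size();
}
```

Calling `ring_demo()` exercises the forced instantiation. A compiler-integrated C++ front end would, in principle, be able to instantiate `RingBuffer<T>` on demand instead.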
Jun 14
On Saturday, 14 June 2025 at 11:39:01 UTC, rempas wrote:
> [**Idea/Problem** .. most important C++ libraries]

Oh, and something I forgot to say. A small parser would of course mean a preprocessor. This gives one more advantage: not requiring an external C/C++ compiler (for both C and C++). It would also let us avoid extra files and pull the headers directly into the D files. Something like the following:

```d
importC <gtk/gtk.h>;   // Header, as a C file (including C11 features)
importCXX <iostream>;  // Header, as a C++ file
importC cpp_module;    // C++ module (no reason for the two "XX" in the end as, C has no modules!)

void main() {
  std.cout << "Hello from C++'s print function!\n";
}
```
Jun 14
On Saturday, 14 June 2025 at 11:47:38 UTC, rempas wrote:
> On Saturday, 14 June 2025 at 11:39:01 UTC, rempas wrote:
>> [**Idea/Problem** .. most important C++ libraries]
>
> Oh, and something I forgot to say. A small parser would of course mean a preprocessor. This gives one more advantage: not requiring an external C/C++ compiler (for both C and C++). It would also let us avoid extra files and pull the headers directly into the D files. Something like the following:
>
> ```d
> importC <gtk/gtk.h>;   // Header, as a C file (including C11 features)
> importCXX <iostream>;  // Header, as a C++ file
> importC cpp_module;    // C++ module (no reason for the two "XX" in the end as, C has no modules!)
>
> void main() {
>   std.cout << "Hello from C++'s print function!\n";
> }
> ```

Most of the IT world agrees that even though C++ is quite a popular language, it is also very badly designed, and people want to move to something else. Some fields are moving to Go, others to Rust. Companies are also spending a lot of effort on solutions that simplify this transition: automatic C++-to-Rust transpilers from DARPA and others, Carbon from Google, and Apple presented C++ interop to simplify the transition to Swift.

Having C++ interop would be cool if it were ready now; at this moment it would be a huge benefit for the language. But starting to develop it now would, I think, be a waste of resources. Moreover, D doesn't have those resources even for crucial parts, and certainly not for such experimental things.

There are also other approaches: several languages have very nice C++ interop. There are projects in Python, R, Julia, Swift and Rust, and they take different approaches, from automatic binding generators (cbindgen) to seamless integration (Rcpp).

So if you really want a compiled language with C++ interop, I would suggest checking out Swift.
Jun 14
If you look back in the archives you can see posts from Walter during the mid-2000s saying that he did not want C++ binding in D. Too complex, too much effort for very little gain.

Right now our AST supports a subset of C++, and does not have the capability to handle C++ templates. It works for C + COM and not much more than that.

An example of a C++ feature which we do not support is multiple inheritance. Our AST, semantic analysis and codegen do not support it. We cannot bind to it. And this is before we get into compiler-specific stuff like type information. Dmd is even missing MSVC's Windows 64-bit exception handling.

Now consider C: we support most of what C does. What we are missing is some semantics of macros, some types and of course typedef, but overall the compiler is fully capable of handling it. ImportC is just a parser with automatic calling out to the macro preprocessor, sure, but that is because all the AST and semantic analysis is in place and mature.

Adding C++ support isn't a 2-3 year project, even if it was just a parser (it's wayyyy more complex than you are thinking it is). It's all this other stuff.

If somebody wants to take this on, here is a list of projects that you can prove yourself on:

1. Add Win64 exception support to dmd (approved by Walter)
2. Add typedef to dmd (requires a DIP)
3. Implement a macro processor and figure out how to define the predefined macros for each target and then make system headers work out of the box.
4. Implement a 16-bit float type (requires a DIP)

These four things are not controversial, at least not compared to stuff like multiple inheritance. If you can implement them, then you might have a chance to succeed with an ImportC++ feature.

I suspect everyone would love to have ImportC++; the question is how much work it would take, and right now it's well beyond the benefits.
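As a rough illustration of why multiple inheritance is awkward to bind to, here is a small C++ sketch with hypothetical types (`Drawable`, `Clickable` and `Button` are invented for this example). It shows that one object with two non-empty bases contains two base subobjects at different addresses, so converting between base and derived pointers means adjusting the pointer value. D classes have a single base class plus interfaces, so the D side has no direct counterpart for this layout.

```cpp
#include <cassert>

// Two unrelated polymorphic bases, each carrying its own data.
struct Drawable  { int layer  = 0; virtual ~Drawable()  = default; };
struct Clickable { int clicks = 0; virtual ~Clickable() = default; };

// One object containing both base subobjects.
struct Button : Drawable, Clickable {};

// The two base subobjects cannot overlap, so the two base pointers to the
// same Button hold different addresses.
bool bases_at_different_addresses() {
    Button b;
    Drawable*  d = &b;   // implicit derived-to-base conversion
    Clickable* c = &b;   // implicit conversion that may shift the pointer
    return static_cast<void*>(d) != static_cast<void*>(c);
}

// Going back from a base pointer to the derived object undoes the shift.
bool round_trip_ok() {
    Button b;
    Clickable* c = &b;
    return static_cast<Button*>(c) == &b;
}
```

The C++ compiler inserts these adjustments silently at every conversion and virtual call. A binding layer would have to reproduce them per ABI, which is part of the "all this other stuff" beyond parsing.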
Jun 14
On Saturday, 14 June 2025 at 12:19:47 UTC, Richard (Rikki) Andrew Cattermole wrote:
> If somebody wants to take this on, here is a list of projects that you can prove yourself on:
>
> 1. Add Win64 exception support to dmd (approved by Walter)
> 2. Add typedef to dmd (requires a DIP)
> 3. Implement a macro processor and figure out how to define the predefined macros for each target and then make system headers work out of the box.
> 4. Implement a 16-bit float type (requires a DIP)

What are you doing? This isn't happening; why are you giving out a todo list? Toxic optimism.
Jun 14
On 15/06/2025 12:44 AM, monkyyy wrote:
> On Saturday, 14 June 2025 at 12:19:47 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> If somebody wants to take this on, here is a list of projects that you can prove yourself on:
>>
>> 1. Add Win64 exception support to dmd (approved by Walter)
>> 2. Add typedef to dmd (requires a DIP)
>> 3. Implement a macro processor and figure out how to define the predefined macros for each target and then make system headers work out of the box.
>> 4. Implement a 16-bit float type (requires a DIP)
>
> What are you doing? This isn't happening; why are you giving out a todo list? Toxic optimism.

If someone was willing to shepherd this, they should understand:

1. This isn't a small amount of work.
2. There is a path forward, if they are willing.

I would much rather explain the knee-jerk reaction to the concept of ImportC++, and have it understood why it isn't planned for D, than have us tell someone 'no' without them understanding why or being able to make progress. Who knows? They might be able to.

I will give an example where this policy of mine is a positive. Yesterday I got approval to have my DFA engine merged, as long as it's self-contained and remains behind a preview switch in terms of scope, so that it can be experimented with.

The reason I started work on it? Because I understood that Walter can't put more time into this aspect of the compiler, and it's a requirement for us to ever have RC in the language. It would have cut many months from the lead-up to the start of implementation if I had known that I had to take charge of the implementation, not just the design.

As a community we are very bad at helping to onboard people into shepherding new features, and I want to see that fixed.
Jun 14
On Saturday, 14 June 2025 at 11:39:01 UTC, rempas wrote:
> Now, D has "importC", and lots of C++ libraries have C bindings. The problem is that C doesn't have classes, so you need to do manual work to create a "D way" of using the library. Also, not every C++ library has C bindings. For example, [Louvre](https://github.com/CuarzoSoftware/Louvre) does not! If I want to use it from D, I have to either:
>
> * Wait for it to get C bindings (and hope they are maintained in the future)
> * Manually port the library myself, which also includes porting other C++ libraries, because I would start with the [weston-example](https://github.com/CuarzoSoftware/Louvre/tree/main/src/examples/ ouvre-weston-clone) that they showcase in their repo.

I think the best way forward would be taking advantage of the recent work on SWIG. Their last major release added experimental support for C: https://swig.org/Doc4.3/C.html#C
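For readers unfamiliar with what "C bindings for a C++ library" involve, the following is a hand-written sketch of the usual flattening pattern: an opaque handle plus free functions with C linkage. The `Greeter` class and the `greeter_*` functions are hypothetical, and this is not SWIG's actual output, only the kind of wrapper such a generator is meant to automate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical C++ library class that a D program cannot import directly.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::size_t greeting_length() const { return ("Hello, " + name_).size(); }
private:
    std::string name_;
};

extern "C" {
    // Opaque handle: C (and D via importC) only ever sees a pointer to it.
    typedef struct GreeterHandle GreeterHandle;

    // Constructor flattened into a free function with C linkage.
    GreeterHandle* greeter_create(const char* name) {
        return reinterpret_cast<GreeterHandle*>(new Greeter(name));
    }

    // Method flattened into a function taking the object as its first argument.
    unsigned long greeter_greeting_length(const GreeterHandle* h) {
        const Greeter* g = reinterpret_cast<const Greeter*>(h);
        return static_cast<unsigned long>(g->greeting_length());
    }

    // Destructor flattened the same way; the caller owns the lifetime.
    void greeter_destroy(GreeterHandle* h) {
        delete reinterpret_cast<Greeter*>(h);
    }
}
```

Every class, method and overload needs a wrapper like this, and the result still has to be re-wrapped on the D side to feel like a class again. That is the manual work the quoted post describes.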
Jun 14
On Saturday, 14 June 2025 at 14:37:19 UTC, Lance Bachmeier wrote:
> I think the best way forward would be taking advantage of the recent work on SWIG. Their last major release added experimental support for C: https://swig.org/Doc4.3/C.html#C

That seems interesting! In general, it would be nice if we lived in a world with a common IR and backend that languages would just target, allowing you to use any symbol from any language.
Jun 15
On Sunday, 15 June 2025 at 12:46:09 UTC, rempas wrote:
> That seems interesting! In general, it would be nice if we lived in a world where there would be a common IR and backend and languages would just target that, allowing you to use any symbol from any language.

In the case of C, C++ and D (and many others) that is exactly what is happening. They share the same backend.
Jun 16
On Monday, 16 June 2025 at 16:49:51 UTC, Dejan Lekic wrote:
> On Sunday, 15 June 2025 at 12:46:09 UTC, rempas wrote:
>> That seems interesting! In general, it would be nice if we lived in a world where there would be a common IR and backend and languages would just target that, allowing you to use any symbol from any language.
>
> In the case of C, C++ and D (and many others) that is exactly what is happening. They share the same backend.

I'm not an expert in compilers, but why can't languages be "married" at the IR level? I mean, the compiler translates D and C++ source code to IR independently, and these representations are then "linked" together.
Jun 17
On Wednesday, 18 June 2025 at 01:39:00 UTC, Andrey Zherikov wrote:
> On Monday, 16 June 2025 at 16:49:51 UTC, Dejan Lekic wrote:
>> On Sunday, 15 June 2025 at 12:46:09 UTC, rempas wrote:
>>> That seems interesting! In general, it would be nice if we lived in a world where there would be a common IR and backend and languages would just target that, allowing you to use any symbol from any language.
>>
>> In the case of C, C++ and D (and many others) that is exactly what is happening. They share the same backend.
>
> I'm not an expert in compilers, but why can't languages be "married" at the IR level? I mean, the compiler translates D and C++ source code to IR independently, and these representations are then "linked" together.

The problem is type safety. IIRC LLVM IR is just about type width, but C++ and D also have structs, OOP and templates. But you are right, it can do this right now; LDC has options to output IR/bitcode.

However, without this rich type information you MUST always write correct code, because the compiler is unable to tell you when you have the wrong types, and your program will end up malformed, doing nonsensical calculations on nonsensical inputs.

OK, in reality you still need type information in the form of manual declarations to please the type system.

By the way, check out my gentool. It partially translates C++ to D at the AST level and nicely matches D's linker-level interop feature. I built it originally to help with my gamedev needs, but there hasn't been much interest since then, even in the D community, so it is now basically in maintenance mode and I have switched to Godot.
Jun 17
On Wednesday, 18 June 2025 at 01:39:00 UTC, Andrey Zherikov wrote:
> On Monday, 16 June 2025 at 16:49:51 UTC, Dejan Lekic wrote:
>> On Sunday, 15 June 2025 at 12:46:09 UTC, rempas wrote:
>>> That seems interesting! In general, it would be nice if we lived in a world where there would be a common IR and backend and languages would just target that, allowing you to use any symbol from any language.
>>
>> In the case of C, C++ and D (and many others) that is exactly what is happening. They share the same backend.
>
> I'm not an expert in compilers, but why can't languages be "married" at the IR level? I mean, the compiler translates D and C++ source code to IR independently, and these representations are then "linked" together.

IR is already too late to put things in common. Type information is already lost: for example, nowadays LLVM IR does not make any distinction between `int*` and `int**`; it's up to the front end to check that kind of thing. For example: https://godbolt.org/z/v6cG9Y65c. Only the front end knows the valid input types.

You also have the problem of the ABI. It's just delusional to think you can call a foreign function just because you have its LLVM IR. To some extent that will work, but it's not sane.
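The loss of type information described in the last two replies can be imitated in plain C++, as a loose analogy rather than actual LLVM IR. In this sketch `const void*` stands in for the IR's untyped pointer, and `load_u64` and `misuse` are hypothetical names. It assumes IEEE 754 doubles, which the `static_assert` checks. The point is that once the callee only knows "a pointer", a caller passing the wrong pointee type is not rejected and simply produces a meaningless number.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 doubles");
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes 64-bit doubles");

// Stand-in for a function known only by a lowered signature: the parameter is
// "some pointer", with no record of what it is supposed to point at.
std::uint64_t load_u64(const void* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);  // well-defined byte copy, no aliasing issues
    return v;
}

// A caller that believes the function wants a double. Nothing stops the call.
std::uint64_t misuse() {
    double d = 1.0;
    return load_u64(&d);  // accepted: any object pointer converts to const void*
}
```

Here the "integer" that comes back is just the bit pattern of the double. With a typed declaration such as `std::uint64_t load_u64(const std::uint64_t*)`, the C++ front end would reject the call, which is the check that only exists above the IR level.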
Jun 18
On Monday, 16 June 2025 at 16:49:51 UTC, Dejan Lekic wrote:
> In the case of C, C++ and D (and many others) that is exactly what is happening. They share the same backend.

Yeah, I also mean a way to compile files from multiple languages, have them read files from other languages, and be able to call their symbols.
Jun 25
DLang doesn't have a good parser for tooling, nor does it have a good LSP for D's features, and you want it to have a built-in C++ parser? I suggest we first ensure that D has good tooling for D; then eventually somebody can begin work on a C++ parser (even though it's IMO a waste of time, but whatever).
Jun 15
On Saturday, 14 June 2025 at 11:39:01 UTC, rempas wrote:
> **Improvements**
>
> Implementing a C++ parser would have the following advantages:
>
> * C++ libraries could be used natively in D. This includes "macros" and templates that are not instantiated in the library itself and would need an *additional* instantiation from the project that uses them (making the binding process even more tedious, slow and overall annoying).
> * C++ and D code could be combined, letting any C++ project transition to D more easily and smoothly. That would bring even more popularity and trust to the language.

It's been done already: https://github.com/Syniurge/Calypso

-Johan
Jun 17
On Wednesday, 18 June 2025 at 06:33:00 UTC, Johan wrote:
> It's been done already: https://github.com/Syniurge/Calypso
>
> -Johan

Ehhhmmmm... Last commit was 5 years ago?!
Jun 25