digitalmars.D - Automated source translation of C++ to D

reply "Joakim" <dlang joakim.airpost.net> writes:
C++ support keeps coming up these days, with Andrei continually 
stressing it as something to work on.  How hard would it be to 
write a C++->D translator, to allow people to translate C++ 
libraries to D?

I've been using tools like DStep and looking at libdparse, which 
seem to work very well.  I just translated a C sample app from 
the Android NDK to D, fairly simple stuff like turning -> into ., 
adding a default in a switch statement, rewriting casts from 
C-style to D-style casts, removing the struct label, nothing that 
couldn't be automated.

I'm sure there's stuff that'd need to be done by hand, but if you 
can automate 97%, that's good enough.  Could this be a viable 
option for many cases?
Aug 20 2014
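
To make the mechanical rewrites listed in the post above concrete, here is a minimal sketch in C-style C++. The struct and function names are hypothetical and not taken from any NDK sample; the comments note the D rewrite a translator would presumably apply to each construct.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C-style code; each comment names the mechanical D rewrite.
struct engine {
    int state;
    double scale;
};

// C: "struct engine *e"   D: "engine* e" (the struct label is dropped)
int next_state(struct engine *e, int cmd) {
    assert(e != NULL);
    // C: (int)e->scale     D: cast(int) e.scale  ("->" becomes ".")
    int step = (int)e->scale;
    switch (cmd) {
    case 0:
        e->state += step;
        break;
    case 1:
        e->state -= step;
        break;
    // D rejects a plain switch without a default, so a translator would add:
    //     default: break;
    }
    return e->state;
}

// Small self-check that can be called from main().
inline void c_style_demo() {
    struct engine e = { 10, 2.5 };
    assert(next_state(&e, 0) == 12);  // step is (int)2.5 == 2
    assert(next_state(&e, 1) == 10);
    assert(next_state(&e, 9) == 10);  // unknown command: state unchanged
}
```

Every rewrite here is local and syntactic, which is why C code of this shape lends itself to automation.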
next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 21 Aug 2014 06:35:53 +0000
Joakim via Digitalmars-d <digitalmars-d puremagic.com> wrote:

i believe that any code using STL/Boost can't be translated
(and that's the majority of C++ code, i think). and only very, very
primitive templates can be translated automatically.

i may be wrong, but i think automatic translation can handle
only about 3% of existing code. ;-)
Aug 20 2014
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 21/08/14 08:35, Joakim wrote:
 C++ support keeps coming up these days, with Andrei continually
 stressing it as something to work on.  How hard would it be to write
 a C++->D translator, to allow people to translate C++ libraries to D?

 I've been using tools like DStep and looking at libdparse, which seem to
 work very well.  I just translated a C sample app from the Android NDK
 to D, fairly simple stuff like turning -> into ., adding a default in a
 switch statement, rewriting casts from C-style to D-style casts,
 removing the struct label, nothing that couldn't be automated.

 I'm sure there's stuff that'd need to be done by hand, but if you can
 automate 97%, that's good enough.  Could this be a viable option for
 many cases?
I think it will be quite difficult. There are probably many cases of C++ code where there is no direct translation in D. libclang, which DStep uses, is probably not enough to do a complete source translation of C++. That means using the C++ Clang APIs, which are unstable and written in C++.

C is completely different. D was basically designed to make it easy to translate C to D. If C syntax compiles in D, it's supposed to have the same semantics as in C.

It might be possible. As a start you could help out with DStep, for example by starting to add support for C++.

-- 
/Jacob Carlborg
Aug 20 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/20/14, 11:35 PM, Joakim wrote:
 C++ support keeps coming up these days, with Andrei continually
 stressing it as something to work on.  How hard would it be to write
 a C++->D translator, to allow people to translate C++ libraries to D?

 I've been using tools like DStep and looking at libdparse, which seem to
 work very well.  I just translated a C sample app from the Android NDK
 to D, fairly simple stuff like turning -> into ., adding a default in a
 switch statement, rewriting casts from C-style to D-style casts,
 removing the struct label, nothing that couldn't be automated.

 I'm sure there's stuff that'd need to be done by hand, but if you can
 automate 97%, that's good enough.  Could this be a viable option for
 many cases?
I think the key here involves clang with hooks plus a config file per C++ file to translate, which instructs the translator how to handle corner cases (e.g. expand this macro but not this other one).

-- Andrei
Aug 21 2014
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Joakim"  wrote in message news:ysntkmioyndreuiiyqxx forum.dlang.org... 

 C++ support keeps coming up these days, with Andrei continually 
 stressing it as something to work on.  How hard would it be to 
 write a C++->D translator, to allow people to translate C++ 
 libraries to D?
You might want to look at DDMD, which is automatically converted.
Aug 21 2014
parent reply "Joakim" <dlang joakim.airpost.net> writes:
On Thursday, 21 August 2014 at 10:00:43 UTC, Daniel Murphy wrote:
 "Joakim"  wrote in message 
 news:ysntkmioyndreuiiyqxx forum.dlang.org...

 C++ support keeps coming up these days, with Andrei 
 continually stressing it as something to work on.  How hard 
 would it be to write a C++->D translator, to allow people 
 to translate C++ libraries to D?
 You might want to look at DDMD, which is automatically converted.
Yes, I'm aware of ddmd. You've mentioned many times that it only works because dmd is written using a very unC++-like style, to the point where github's source analyzer claims that dmd is written in 66.7% C, 28.4% D (presumably the tests right now), 4.4% C++, and 0.5% other. :)

Given tools like libclang, how hard do you think it'd be to translate most of actual C++ to D? If writing such a tool would mean that C++->D translation is the path of least effort for D users who want to integrate with C++, maybe that's the approach that should be taken instead.

I should note that I have no interest in any C++ libraries: I'm just throwing out this idea as an alternative to all the C++ interfacing that's being considered for D right now.
Aug 21 2014
next sibling parent "po" <yes no.com> writes:
Might be pretty hard. C++ has some features D doesn't, and I'm not 
sure how you would emulate them.

C++ has these, I don't think D does:
   move only types
   r-value references
   SFINAE
   ADL
   Multiple inheritance
Aug 21 2014
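
As an illustration of one item on the list above, here is a small SFINAE sketch. The function name `describe` is hypothetical. A translator would have to recognize this overload-removal idiom and re-express it, perhaps as a D template constraint such as `if (isIntegral!T)`; that mapping is an assumption about how one might do it, not something stated in the thread.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Participates in overload resolution only when T is an integral type.
// For other types the substitution fails and the overload is silently dropped.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) {
    return "integral";
}

// The complementary overload, enabled for every non-integral type.
template <typename T>
typename std::enable_if<!std::is_integral<T>::value, std::string>::type
describe(T) {
    return "other";
}

// Small self-check that can be called from main().
inline void sfinae_demo() {
    assert(describe(42) == "integral");
    assert(describe(3.14) == "other");
}
```

The selection logic lives in the return type rather than in any explicit condition, so a purely syntactic translation has nothing obvious to map.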
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 21 August 2014 at 17:57:13 UTC, Joakim wrote:
 Yes, I'm aware of ddmd.  You've mentioned many times that it 
 only works because dmd is written using a very unC++-like 
 style, to the point where github's source analyzer claims that 
 dmd is written in 66.7% C, 28.4% D (presumably the tests right 
 now), 4.4% C++, and 0.5% other. :)
That's just because the C++ DMD files have a .c extension. The joke about it being written in C+ is fitting but the source code is pretty easily identified as C++ (if only because it makes frequent use of classes).
Aug 21 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/21/2014 10:57 AM, Joakim wrote:
 Given tools like libclang, how hard do you think it'd be to translate most of
 actual C++ to D?
I'd say the possibility of that is about zero. Heck, we can't even do it 100% for C.

The trouble is, D is not a perfect superset of C++, not even close:

1. multiple inheritance
2. SFINAE
3. Koenig lookup
4. tail mutability
5. overloading rules
6. operator overloading rules
7. fwd reference issues
8. macros (it's depressing how much modern C++ practice still heavily depends on the preprocessor)

Does that really matter? In my not-so-humble experience, C++ programmers often, far too often, find some odd corner case in the language and build an entire store on it. I personally find this baffling, but it happens with depressing regularity.

(In contrast, the C++ style used in DMD is very conservative and tends to run right down the middle of the road of C++, avoiding anything clever and corners and weird emergent behavior. This is the only reason why DDMD has even a prayer of working.)
Aug 21 2014
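
For readers unfamiliar with item 3 in the list above, this is a minimal sketch of Koenig (argument-dependent) lookup. The namespace and function names are hypothetical. D has no equivalent lookup rule, so a translator would presumably have to resolve such calls itself and emit an explicit import or a qualified name.

```cpp
#include <cassert>

namespace geometry {
    struct Vec {
        int x;
        int y;
    };

    // Free function declared next to the type it operates on.
    inline int length_squared(const Vec& v) {
        return v.x * v.x + v.y * v.y;
    }
}

// Note: no using-directive and no "geometry::" qualifier on the call.
inline int adl_demo() {
    geometry::Vec v{3, 4};
    // The unqualified call is resolved through the namespace of the
    // argument's type (argument-dependent lookup).
    int result = length_squared(v);
    assert(result == 25);
    return result;
}
```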
next sibling parent "Joakim" <dlang joakim.airpost.net> writes:
On Thursday, 21 August 2014 at 21:06:09 UTC, Walter Bright wrote:
 On 8/21/2014 10:57 AM, Joakim wrote:
 Given tools like libclang, how hard do you think it'd be to 
 translate most of
 actual C++ to D?
OK, you would know better than anyone, thanks for the considered answer.
Aug 21 2014
prev sibling next sibling parent reply Andrej Mitrovic via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 8/21/14, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 The trouble is, D is not a perfect superset of C++, not even close
I don't think that's important for porting. To quote:

-----
Engineering teams at Mozilla and Epic ported the award-winning Unreal Engine 3 (UE3) to the Web in just four days using the powerful combination of asm.js and Emscripten, which enables developers to compile C++ code into JavaScript.[1]
-----

[1] : https://www.unrealengine.com/news/epic-games-releases-epic-citadel-on-the-web

I'm still amazed that this is even possible. But it makes me wonder: what does JS have over D that makes this possible? I have a feeling it comes down only to the number and capability of people working on such a project, which guarantees its success (JS is important, and the Unreal Engine is important).
Sep 06 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/6/2014 11:42 AM, Andrej Mitrovic via Digitalmars-d wrote:
 On 8/21/14, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com>
wrote:
 The trouble is, D is not a perfect superset of C++, not even close
I'd have to see the source code and the translated code to form any judgement about that.
Sep 06 2014
parent "Ola Fosheim Grøstad" writes:
On Saturday, 6 September 2014 at 20:06:51 UTC, Walter Bright 
wrote:
 I'd have to see the source code and the translated code to form 
 any judgement about that.
AFAIK Emscripten is not a source-to-source translator. It is an LLVM backend. asm.js code is very cryptic: you basically have everything in one big array, plus weird JavaScript expressions that are meant to ensure the JIT treats values as typed. You cannot reasonably expect to be able to modify the output by hand.
Sep 06 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 6 September 2014 at 18:42:43 UTC, Andrej Mitrovic 
via Digitalmars-d wrote:
 On 8/21/14, Walter Bright via Digitalmars-d 
 <digitalmars-d puremagic.com> wrote:
 The trouble is, D is not a perfect superset of C++, not even 
 close
asm.js is NOT javascript.

asm.js is a bytecode format for native code, and a very inefficient one for that matter, as it uses the same syntax as javascript. One of the claimed advantages of this format is that it is backward compatible with javascript. That means the bytecode can be interpreted as javascript and still works.

For some values of "works". Indeed, run as javascript, asm.js is an order of magnitude slower. You can consider that compatible for toy programs, but the reason to use asm.js to begin with is performance, so an order of magnitude slower is not acceptable. For the example of unity, that means unplayable games.

Once you understand that this backward-compatibility argument is bullshit, it is easily understood that asm.js is simply a variant of pNaCl that uses bloated bytecode to provide extra useless features and create fragmentation.

Brendan Eich being CTO of Mozilla when they dropped pNaCl in favor of asm.js is probably relevant.
Sep 06 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sat, 6 Sep 2014 20:42:32 +0200
Andrej Mitrovic via Digitalmars-d <digitalmars-d puremagic.com> wrote:

there are two kinds of "porting".

the first is just generation of code that will be compiled by the target
compiler. this code can be messy, unreadable and unmaintainable; nobody
cares as long as it compiles.

the second is "real porting", where the code is still human-readable and
easy to support, just written in another language.

it's not that hard, for example, to write a C++ -> D translator that
produces a working mess. let the C++ compiler instantiate all the
necessary templates and so on, then use D as a "high-level assembler".
the resulting blob will be unmaintainable, but working. but this makes
no sense, 'cause without the original C++ code the resulting D code
cannot be supported by any sane person.

UE "porting" is of the first kind.
Sep 06 2014
prev sibling next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Joakim"  wrote in message news:ynfwlptfuzfutksbnslc forum.dlang.org...

 Yes, I'm aware of ddmd.  You've mentioned many times that it only works 
 because dmd is written using a very unC++-like style, to the point where 
 github's source analyzer claims that dmd is written in 66.7% C, 28.4% D 
 (presumably the tests right now), 4.4% C++, and 0.5% other. :)
The style dmd is written in makes it a lot easier, but it would still be possible with other styles. As others have said the github numbers have nothing to do with the style of the code, only the naming of the files.
 Given tools like libclang, how hard do you think it'd be to translate most 
 of actual C++ to D?  If writing such a tool would mean that C++->D 
 translation is the path of least effort for D users who want to integrate 
 with C++, maybe that's the approach that should be taken instead.
A tool that can translate an arbitrary C++ program to D is not going to happen. A tool that can translate a specific C++ program to D is not particularly difficult to produce, as I've done with dmd.

E.g. multiple inheritance cannot be generally mapped to D code. But in many applications, the use of multiple inheritance _can_ be mapped because it corresponds to classes + interfaces. This type of application-specific knowledge can significantly reduce the complexity.

DDMD actually has a major additional complication that most translations would not have: it is only a partial translation (the glue layer stays in C++) and therefore needs ABI stability across the boundary. I initially did a non-ABI-stable translation of only the frontend, and it was rather easy in comparison.

So no, you can't magically upgrade a project from C++ to D. But it can be done, and is not prohibitively difficult for someone experienced with C++ and D.
Aug 22 2014
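
The "classes + interfaces" case described in the post above can be sketched as follows. All type names are hypothetical and not taken from dmd. The assumption is that only one base class carries data while the others are pure abstract, which is the shape that could map to a D class implementing interfaces.

```cpp
#include <cassert>
#include <string>

// Pure abstract bases: no data members, only pure virtual functions.
struct Printable {
    virtual ~Printable() {}
    virtual std::string print() const = 0;
};

struct Visitable {
    virtual ~Visitable() {}
    virtual int accept(int depth) const = 0;
};

// The single base class that carries state.
class Node {
public:
    explicit Node(int id) : id_(id) {}
    virtual ~Node() {}
    int id() const { return id_; }

private:
    int id_;
};

// Multiple inheritance, but only one base has data. This shape could map to
//     class Leaf : Node, Printable, Visitable { ... }
// in D, with Printable and Visitable declared as interfaces.
class Leaf : public Node, public Printable, public Visitable {
public:
    explicit Leaf(int id) : Node(id) {}
    std::string print() const override {
        return "leaf " + std::to_string(id());
    }
    int accept(int depth) const override { return depth + 1; }
};

// Small self-check that can be called from main().
inline void mi_demo() {
    Leaf leaf(7);
    const Printable& p = leaf;  // used through the "interface" bases
    const Visitable& v = leaf;
    assert(p.print() == "leaf 7");
    assert(v.accept(2) == 3);
}
```

If a second base class also held data members, this mapping would no longer apply, which is where project-specific knowledge comes in.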
next sibling parent reply "David Nadlinger" <code klickverbot.at> writes:
On Friday, 22 August 2014 at 07:48:30 UTC, Daniel Murphy wrote:
 So no, you can't magically upgrade a project from C++ to D.
Hence the name "magicport"? Sorry, could not resist. David
Aug 22 2014
next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"David Nadlinger"  wrote in message 
news:jumhdppapovcvfnwnxxq forum.dlang.org...

 On Friday, 22 August 2014 at 07:48:30 UTC, Daniel Murphy wrote:
 So no, you can't magically upgrade a project from C++ to D.
 Hence the name "magicport"? Sorry, could not resist. David
It's only the illusion of magic.
Aug 22 2014
prev sibling next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 22 Aug 2014 08:29:52 +0000
David Nadlinger via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Hence the name "magicport"?
magic is not easy, contrary to widespread beliefs. ;-)
Aug 22 2014
prev sibling parent David Gileadi <gileadis NSPMgmail.com> writes:
On 8/22/14 1:29 AM, David Nadlinger wrote:
 On Friday, 22 August 2014 at 07:48:30 UTC, Daniel Murphy wrote:
 So no, you can't magically upgrade a project from C++ to D.
 Hence the name "magicport"? Sorry, could not resist. David
It's pronounced "sufficiently advanced technology port" :)
Aug 22 2014
prev sibling parent reply Xavier Bigand <flamaros.xavier gmail.com> writes:
On 22/08/2014 09:48, Daniel Murphy wrote:
 "Joakim"  wrote in message news:ynfwlptfuzfutksbnslc forum.dlang.org...

 Yes, I'm aware of ddmd.  You've mentioned many times that it only
 works because dmd is written using a very unC++-like style, to the
 point where github's source analyzer claims that dmd is written in
 66.7% C, 28.4% D (presumably the tests right now), 4.4% C++, and 0.5%
 other. :)
 Given tools like libclang, how hard do you think it'd be to translate
 most of actual C++ to D?  If writing such a tool would mean that
 C++->D translation is the path of least effort for D users who want to
 integrate with C++, maybe that's the approach that should be taken
 instead.
As many have said, including Walter, it seems impossible to convert arbitrary C++ code to D, but a tool that clearly documents what it can convert without issue would certainly be interesting enough.

My point is, if some part of your C++ can't be converted, you can refactor it to fit something the conversion tool supports. The key point is that the tool must not produce imperfect conversions; it is better to exit with an error when something isn't well supported.

As Andrei said, there are certainly ways to make the tool configurable, to help it convert some parts.

Even if such a tool can't convert arbitrary C++ code, it might help to interface C++ with D by converting the main part (the code modified every day) of a C++ program. In this sense I am more interested in converting the C++ I write than that of third-party libraries. Maybe just like Facebook?

The other question is how the generated binary will perform.
Sep 03 2014
parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Xavier Bigand"  wrote in message news:lu81p2$2a54$1 digitalmars.com...

 As many have said, including Walter, it seems impossible to convert 
 arbitrary C++ code to D, but a tool that clearly documents what it can 
 convert without issue would certainly be interesting enough.

 My point is, if some part of your C++ can't be converted, you can 
 refactor it to fit something the conversion tool supports.
 The key point is that the tool must not produce imperfect conversions; 
 it is better to exit with an error when something isn't well supported.

 As Andrei said, there are certainly ways to make the tool configurable, 
 to help it convert some parts.
The main problem is that what exactly is supported depends on the conventions of the project being targeted. While you can certainly refactor your code to fit the style the converter expects, it's quite likely easier to patch the converter to understand the existing style better (assuming the project is internally consistent).

E.g. in the dmd frontend, there were several places where #define was used to alias static members: Type::tint expanded to something like Type::tarray[Tint]. I removed all of these from the source to allow conversion, since D's alias can't alias an expression. In a codebase where this type of thing was significantly more common, it would make more sense to have the converter detect this pattern and automatically generate wrapper functions or something.

Unfortunately every C++ project seems to be written in a different, incompatible subset. Project-custom converters are able to handle these subsets optimally. Luckily these tools are very simple to write and adapt, in my experience.
 Even if such a tool can't convert arbitrary C++ code, it might help to 
 interface C++ with D by converting the main part (the code modified every 
 day) of a C++ program. In this sense I am more interested in converting 
 the C++ I write than that of third-party libraries. Maybe just like Facebook?
If it can't convert the whole project, then you are going to have to do some parts manually or leave some parts in C++. Neither is particularly attractive. Most of the debugging work I've done with DDMD was caused by one of those: either bugs introduced during manual conversion, or nasty wrong-code or alignment bugs when calling across the language boundary.
 The other question is how the generated binary will perform.
This is completely up to you. It is quite straightforward to have the generated D code exactly match the C++ code, call for call. Or, when you don't care so much, C++'s new can be replaced with D's GC new. This is true for DDMD at least. Other projects that rely more heavily on C++'s struct semantics may not behave the same; it's hard to say without trying it.
Sep 04 2014
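
The #define aliasing pattern mentioned in the post above, and the wrapper-function alternative it suggests, might look roughly like this. This is a reconstruction from the description only, not dmd's actual source; the enum values and the accessor are assumptions made for illustration.

```cpp
#include <cassert>

enum TY { Tint, Tchar, TMAX };

struct Type {
    TY ty;
    static Type* tarray[TMAX];

    // Pattern described in the post (reconstructed, hypothetical):
    //     #define tint tarray[Tint]
    // so that Type::tint expands to Type::tarray[Tint].
    // A converter-friendly replacement is an accessor that returns a
    // reference to the same slot, so reads and writes both still work.
    static Type*& tint() { return tarray[Tint]; }
};

// Definition of the static table; all slots start out null.
Type* Type::tarray[TMAX] = { nullptr, nullptr };

// Small self-check that can be called from main().
inline void alias_demo() {
    static Type int_type = { Tint };
    Type::tint() = &int_type;  // writes through the returned reference
    assert(Type::tarray[Tint] == &int_type);
    assert(Type::tint()->ty == Tint);
}
```

Unlike the macro, the accessor is an ordinary declaration, so a converter can translate it like any other static member function.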
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 21 Aug 2014 17:57:11 +0000
"Joakim" <dlang joakim.airpost.net> wrote:

 On Thursday, 21 August 2014 at 10:00:43 UTC, Daniel Murphy wrote:
 You might want to look at DDMD, which is automatically 
 converted.
 Yes, I'm aware of ddmd. You've mentioned many times that it only 
 works because dmd is written using a very unC++-like style, to the 
 point where github's source analyzer claims that dmd is written in 
 66.7% C, 28.4% D (presumably the tests right now), 4.4% C++, and 
 0.5% other. :)
OT: That's because the "analyzer" analyzes the file name part after the last "." and dmd uses an atypical extension for C++.

-- 
Marco
Sep 06 2014