digitalmars.D - dmd Lexer and Parser in D

Zach Tollen <reachzachatgooglesmailservice dot.com> writes:
Greetings! I am a rather new programmer and while this is my first post 
I wanted to say that I did some work on the ddmd project at dsource.org, 
which was kind of a big hairy mess. My fork of this project is at:

https://github.com/zachthemystic/ddmd-clean

The point is, I cleaned out the crappiness but I chucked the entire 
semantic analysis and backend, so what's left is a port of the dmd lexer 
and parser to the D language. The README there has more to say.

I might well "announce" this on D.announce but I'm too new to have a 
feel for the significance of it all.

Thanks for reading,

Zach
Feb 03 2012
Trass3r <un known.com> writes:
Maybe there's some IDE that can make use of this.
"Unfortunately" VisualD already has its own ;)
Feb 04 2012
"F i L" <witte2008 gmail.com> writes:
On Saturday, 4 February 2012 at 05:24:45 UTC, Zach Tollen wrote:
 Greetings! I am a rather new programmer and while this is my 
 first post I wanted to say that I did some work on the ddmd 
 project at dsource.org, which was kind of a big hairy mess. My 
 fork of this project is at:

 https://github.com/zachthemystic/ddmd-clean

 [...]

Very cool. I was talking with someone on the IRC about the possibility/difficulties of making DMD's parser/lexer/AST stay open in memory with protocols designed for IDE code-completion communication. It would be ideal to have an IDE's intellisense automatically update with DMD semantically. Unfortunately the conclusion was that it would be too difficult an undertaking to be realistic, since DMD is designed to be run-and-done (also something about "Walter code" :-)). But maybe a rewrite/port of DMD, especially one written in D, might be able to be reworked with this goal in mind? How complete is DDMD?
Feb 04 2012
Zach Tollen <reachzachatgooglesmailservice dot.com> writes:
On 2/4/12 6:59 AM, F i L wrote:
 Very cool. I was talking with someone on the IRC about the
 possibility/difficulties of making DMD's parser/lexer/AST stay open in
 memory with protocols designed for IDE code-completion communication. It
 would be ideal to have an IDE's intellisense automatically update with
 DMD semantically.

This is my thinking too. One good thing about having cut the program down is that it's much lighter weight now, and I did it because I thought, well, maybe once it's pared down, I can actually steer it toward IDE functionality. For example, you could really cut out a lot of the members of the data structures which only point to backend functionality anyway. Even if the whole project fails I won't regret doing it because I learned a lot about D in the process.

What I'm really wondering is if you wanted a program which helped you edit the syntax tree directly and only produced a text file for saving and running, what kind of data structure would you like to have representing the syntax tree? Without knowing anything else, I guessed that it would be nice to have something resembling the official D parse-tree.
 Unfortunately the conclusion was that it would be too difficult an
 undertaking to be realistic, since DMD is designed to be run-and-done
 (also something about "Walter code" :-)).

I was wondering if you couldn't take a parse-tree data structure and deparse (disparse?) it back to formatted program code so that you could see what you were editing? As unrealistic as that sounds, I'm sufficiently attracted to the idea that I'm investigating it with an open mind.
 But maybe a rewrite/port of DMD, especially one written in D, might
 be able to be reworked with this
 goal in mind? How complete is DDMD?

This is exactly what I'm aiming at. My basic hope for its being possible rests on the comforting notion that the bulk of dmd is actually the stuff I threw out! The goal would be to construct the front end (of a front end) which was at least theoretically capable both of allowing code editing, and of translation to a more backend-friendly data structure. If that's not possible, then I'm stuck with this thought that you edit the tree, then the IDE reverses the parse back into ordinary code for saving and compiling.

If anybody can refer me to any examples and demonstrations of this type of code-editing, please do. As someone new to programming I'm really wondering why, if the program itself is understood by the computer as a tree, why do I have to edit a text file instead of a tree?

Zach
Feb 04 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 02/04/2012 06:39 PM, Zach Tollen wrote:
 [...]
 As someone new to programming I'm really wondering why, if the program
 itself is understood by the computer as a tree, why do I have to edit a
 text file instead of a tree?

You are: The source file can be seen as the representation of a tree structure, and if you read the source you group the characters in a tree-like way in order to understand what it is saying. Anyway, this is true for any language. Your post could be parsed into a tree structure too.

You might want to have a look at Lisp. Its syntax is a straightforward description of the parse tree.

http://en.wikipedia.org/wiki/Lisp_%28programming_language%29
Feb 04 2012
Zach Tollen <reachzachatgooglesmailservice dot.com> writes:
On 2/4/12 1:24 PM, Timon Gehr wrote:
 On 02/04/2012 06:39 PM, Zach Tollen wrote:
 If anybody can refer me to any examples and demonstrations of this type
 of code-editing, please do. As someone new to programming I'm really
 wondering why, if the program itself is understood by the computer as a
 tree, why do I have to edit a text file instead of a tree?

 Zach

 You are: The source file can be seen as the representation of a tree
 structure, and if you read the source you group the characters in a
 tree-like way in order to understand what it is saying. Anyway, this is
 true for any language. Your post could be parsed into a tree structure too.

I know what you mean, but what I mean is that it would be cool if my text editor knew that when I started a line with 'writeln(' that I had no intention of finishing the line without inserting a corresponding ');'. Instead, if I forget to add the ending parenthesis, the compiler thinks I meant never to end the function call and it gives a parse error when it gets to something it can't read according to its expectations. It gets a little worse when the structure in question gets larger or more complicated.

What I wish would happen is that I simply told the editor directly "I want to insert a complete statement here", and then it inserts the statement into its tree, and I can't get rid of it without giving a specific command to do so. So while I understand that the text file *represents* a syntax tree, I wish it were more controlled than that. That thought made it tempting to consider: how hard would it be to have the editor just hold the tree itself in memory, with all the editor's commands oriented toward adding, deleting, and changing the program itself instead of changing textual characters which merely represent the tree?

I see two reasons this might be a bad idea. First, even in an ideal world, where you had a fully implemented syntax tree editor, it might turn out that it's just worse than manually editing the files. But the other reason is one of tradition and infrastructure. If all the experience folks have is with text editing then they don't want to change, and all the infrastructure is already built to support text editing anyway. It's this second reason I'm scared of. It would seem like a shame if that were the only reason nobody wants to build a syntax-tree editor.

So I'm still interested in this idea. I'm going to try to research people's experiences with this kind of thing.

http://en.wikipedia.org/wiki/Structure_editor

Zach
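
To make the idea concrete, here is a minimal sketch of such a structure editor's core, written in C++ (the language of dmd's own source). The `Statement`, `CallStatement` and `Block` types are hypothetical and far simpler than dmd's AST. The only point is that when the editing commands insert and remove whole nodes, and the text is always generated from the tree, a call can never be left without its closing ');'.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node types for a toy language; these are NOT dmd's classes.
struct Statement {
    virtual ~Statement() = default;
    // Render this node as source text at the given indentation depth.
    virtual std::string render(std::size_t depth) const = 0;
};

struct CallStatement : Statement {
    std::string callee;
    std::vector<std::string> args;  // arguments kept as already-formed text
    CallStatement(std::string c, std::vector<std::string> a)
        : callee(std::move(c)), args(std::move(a)) {}
    std::string render(std::size_t depth) const override {
        std::string out(depth * 4, ' ');
        out += callee + "(";
        for (std::size_t i = 0; i < args.size(); ++i) {
            if (i > 0) out += ", ";
            out += args[i];
        }
        return out + ");\n";  // the closing ");" is produced by construction
    }
};

struct Block : Statement {
    std::vector<std::unique_ptr<Statement>> body;
    // The only editing commands: whole statements in, whole statements out.
    void insert(std::size_t index, std::unique_ptr<Statement> s) {
        assert(index <= body.size());
        body.insert(body.begin() + index, std::move(s));
    }
    void remove(std::size_t index) {
        assert(index < body.size());
        body.erase(body.begin() + index);
    }
    std::string render(std::size_t depth) const override {
        const std::string pad(depth * 4, ' ');
        std::string out = pad + "{\n";
        for (const auto& s : body) out += s->render(depth + 1);
        return out + pad + "}\n";
    }
};
```

Inserting `std::make_unique<CallStatement>("writeln", std::vector<std::string>{"\"hello\""})` at index 0 of an empty `Block` and calling `render(0)` gives a braced block containing the single line `writeln("hello");`. The "deparse" step is just `render`, which is also what would be written to disk for the compiler.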
Feb 04 2012
Jacob Carlborg <doob me.com> writes:
On 2012-02-04 18:39, Zach Tollen wrote:
 [...]
 The goal would be to construct the front end (of a front end) which was
 at least theoretically capable both of allowing code editing, and of
 translation to a more backend-friendly data structure.
 [...]

You could have a look at Clang. It's a frontend for LLVM that's designed to be used both as a compiler and as a library for building other tools, such as IDEs.

--
/Jacob Carlborg
Feb 04 2012
"Daniel Murphy" <yebblies nospamgmail.com> writes:
"Zach Tollen" <reachzachatgooglesmailservice dot.com> wrote in message 
news:jgjqfo$2puc$1 digitalmars.com...
 What I'm really wondering is if you wanted a program which helped you edit 
 the syntax tree directly and only produced a text file for saving and 
 running, what kind of data structure would you like to have representing 
 the syntax tree? Without knowing anything else, I guessed that it would be 
 nice to have something resembling the official D parse-tree.

The code inside dmd that does lowerings does something like this. Want to add printf to the end of a function? Sure!

    fbody = new CompoundStatement(loc, fbody,
        new ExpStatement(loc,
            new CallExp(loc,
                new VarExp(loc, new IdentifierExp(loc, Lexer::idPool("printf"))),
                new StringExp(loc, "Hello world!\\n"))));

It is a lot better to have the parser generate the syntax tree for you!

    fbody = new CompoundStatement(loc, fbody,
        new CompileStatement(loc, "printf(\"Hello world!\\n\");"));

(I know this isn't what you meant, but that's what the parse tree looks like)
Feb 04 2012
Trass3r <un known.com> writes:
 Unfortunately the conclusion was that it would be too difficult an
 undertaking to be realistic, since DMD is designed to be run-and-done  
 (also something about "Walter code" :-)). But maybe a rewrite/port of  
 DMD, especially one written in D, might be able to be reworked with this  
 goal in mind? How complete is DDMD?

It's stuck at 2.040. I doubt getting it up-to-date would be worth the effort. Also there are still plenty of unimplemented functions.
Feb 04 2012
"F i L" <witte2008 gmail.com> writes:
 It's stuck at 2.040. I doubt getting it up-to-date would be 
 worth the effort.
 Also there are still plenty of unimplemented functions.

Ah, oh well...
Feb 04 2012
"Daniel Murphy" <yebblies nospamgmail.com> writes:
On a related note, how much interest is there around here in having an 
official version of dmd written in D?

There are two ways I can imagine this actually happening:
1.
- Improve D's ability to link with C++
- Make D bindings out of the header files
- Port code to D incrementally

2.
- Dify the C++ source (no classes on the stack/embedded, no bitfields, etc)
- Fix all #ifdefs that break up expressions so they can be turned into 
versions
- Create a conversion program to turn it into D ('->' -> '.', (type) -> 
cast(type) etc)

Just something to think about for the distant future. 
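
The two textual rewrites named in option 2 ('->' to '.', and `(type)` to `cast(type)`) can be sketched roughly as follows. This is a naive, regex-based illustration and not a real converter: the function name `cppToD` and the `typeNames` parameter are made up for the example, the caller is assumed to supply every type name that may appear in a C-style cast, and string literals, comments and declarations get no special treatment.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: apply two purely textual C++ -> D rewrites.
// `typeNames` must list plain identifiers that name types (an assumption of
// this sketch; a real tool would get them from a parser or symbol table).
std::string cppToD(const std::string& source,
                   const std::vector<std::string>& typeNames) {
    // Rewrite 1: member access through a pointer, `a->b`, is spelled `a.b` in D.
    std::string out = std::regex_replace(source, std::regex("->"), ".");
    if (typeNames.empty()) return out;

    // Build the alternation `T1|T2|...` from the known type names.
    std::string alternatives;
    for (const auto& name : typeNames) {
        assert(!name.empty());  // an empty alternative would match "()"
        if (!alternatives.empty()) alternatives += "|";
        alternatives += name;
    }

    // Rewrite 2: `(T)expr` or `(T *)expr` becomes `cast(T)expr` / `cast(T *)expr`.
    // The lookahead requires an operand (identifier or parenthesis) after the
    // closing parenthesis, so `sizeof(T) * n` is left alone.
    const std::regex cStyleCast(
        "\\(\\s*((?:" + alternatives + ")\\s*\\**)\\s*\\)(?=\\s*[A-Za-z_(])");
    return std::regex_replace(out, cStyleCast, "cast($1)");
}
```

For instance, `cppToD("e = (Expression *)s->exp;", {"Expression"})` returns `e = cast(Expression *)s.exp;`. Anything beyond these two patterns (the #ifdef and bitfield items in the list, or declarations such as `void f(Type) const`, which this regex would mangle) needs a tool that actually parses the C++.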
Feb 04 2012
"Nick Sabalausky" <a a.a> writes:
"Daniel Murphy" <yebblies nospamgmail.com> wrote in message 
news:jgj9mu$1q60$1 digitalmars.com...
 On a related note, how much interest is there around here in having an 
 official version of dmd written in D?

I'm interested in a D *API* for taking in D sources and spitting out the user's choice of either the parser results, or an AST with all the semantics/CTFE/etc already run. I get the impression a lot of people are interested in this.

As far as the actual *implementation* behind the D interface, I don't particularly care if it's C, C++, or D.

I suspect having it in D might be a pain until a lot more issues get resolved. A bootstrapping compiler, I would imagine, would need a much more stable base than other types of software would need (though I don't have any experience with bootstrapping compilers, so I could be wrong).
Feb 04 2012
Armin Kazmi <armin.kazmi tu-dortmund.de> writes:
Well, I think it might be easier to change the dmd implementation to use 
C only and then write language bindings to that. We all know the 
binding situation with C++ won't change any time soon.

 "Daniel Murphy" <yebblies nospamgmail.com> wrote in message
 news:jgj9mu$1q60$1 digitalmars.com...
 On a related note, how much interest is there around here in having an
 official version of dmd written in D?

 I'm interested in a D *API* for taking in D sources and spitting out the
 user's choice of either the parser results, or an AST with all the
 semantics/CTFE/etc already run. I get the impression a lot of people are
 interested in this.

 As far as the actual *implementation* behind the D interface, I don't
 particularly care if it's C, C++, or D.

 I suspect having it D might be a pain until a lot more issues get
 resolved. A bootstrapping compiler, I would imagine, would need a much
 more stable base than other types of software would need (though I don't
 have any experience with bootstrapping compilers, so I could be wrong).
Feb 04 2012
"Daniel Murphy" <yebblies nospamgmail.com> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:jgjngv$2je9$1 digitalmars.com...
 I'm interested in a D *API* for taking in D sources and spitting out the 
 user's choice of either the parser results, or an AST with all the 
 semantics/CTFE/etc already run. I get the impression a lot of people are 
 interested in this.

This is not that far off. I've got a branch of dmd, with a .di file for every .h file, that is able to link to itself. There are still some issues with vtables and static variables but hopefully I will sort them out in the near future.

What would the D API look like? If D can link to C++ well enough to call into the dmd source, building an API on top of that wouldn't be that bad.
 I suspect having it D might be a pain until a lot more issues get 
 resolved. A bootstrapping compiler, I would imagine, would need a much 
 more stable base than other types of software would need (though I don't 
 have any experience with bootstrapping compilers, so I could be wrong).

Yeah.
Feb 04 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 04, 2012 23:52:46 Daniel Murphy wrote:
 On a related note, how much interest is there around here in having an
 official version of dmd written in D?
 
 There are two ways I can imagine this actually happening:
 1.
 - Improve D's ability to link with C++
 - Make D bindings out of the header files
 - Port code to D incrementally
 
 2.
 - Dify the C++ source (no classes on the stack/embedded, no bitfields, etc)
 - Fix all #ifdefs that break up expressions so they can be turned into
 versions
 - Create a conversion program to turn it into D ('->' -> '.', (type) ->
 cast(type) etc)
 
 Just something to think about for the distant future.

The intention is to have a lexer and parser for D in Phobos at some point, but I don't know how much we gain by having the whole compiler in D. It's not a bad idea in the least, and it would be a great project for someone to tackle, but of all of the things that a contributor could be doing, I'm not sure that that's really all that high on the list as far as value goes.

- Jonathan M Davis
Feb 04 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 04, 2012 at 01:56:50PM -0800, Jonathan M Davis wrote:
[...]
 The intention is to have a lexer and parser for D in Phobos at some
 point, but I don't know how much we gain by having the whole compiler
 in D. It's not a bad idea in the least, and it would be a great
 project for someone to tackle, but of all of the things that a
 contributor could be doing, I'm not sure that that's really all that
 high on the list as far as value goes.

I'm actually thinking about writing a D pretty printer as a little exercise in D programming. I haven't decided whether or not to simply adapt the existing dmd frontend. One of Walter's stated advantages of D is that it's easily lexed and parsed, even if semantics are disregarded. I'm considering proving that statement by building a lexer/parser from the ground up, though of course lacking most of the complexity of the real compiler, since I only need to do just enough to be able to pretty-print D code. If it's done correctly, it might even be useful in automated conversions between different preferred indentation styles, etc.

T

--
This is a tpyo.
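
As a rough illustration of how little is needed for the "just enough" version, here is a hedged sketch in C++ of a tokenizer plus a crude re-indenter. All names (`Token`, `lex`, `prettyPrint`) are invented for the example. It deliberately ignores most of D's real lexical grammar: it assumes only double-quoted strings, and does not handle comments, character literals, or the other string forms. It also breaks lines at every ';', so `for (;;)` headers come out wrong.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Number, String, Punct };
struct Token { TokenKind kind; std::string text; };

static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Split source text into tokens, dropping whitespace.
std::vector<Token> lex(const std::string& src) {
    const std::string operatorChars = "+-*/%=<>!&|^~";
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        const std::size_t start = i;
        TokenKind kind = TokenKind::Punct;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // Digits, letters, '_' and a '.' that is followed by a digit (1.5, 0xFF).
            while (i < src.size() &&
                   (isIdentChar(src[i]) ||
                    (src[i] == '.' && i + 1 < src.size() &&
                     std::isdigit(static_cast<unsigned char>(src[i + 1]))))) ++i;
            kind = TokenKind::Number;
        } else if (isIdentChar(c)) {
            while (i < src.size() && isIdentChar(src[i])) ++i;
            kind = TokenKind::Identifier;
        } else if (c == '"') {
            ++i;  // opening quote
            while (i < src.size() && src[i] != '"') {
                if (src[i] == '\\' && i + 1 < src.size()) ++i;  // skip escaped char
                ++i;
            }
            if (i < src.size()) ++i;  // closing quote, if present
            kind = TokenKind::String;
        } else if (operatorChars.find(c) != std::string::npos) {
            // Keep adjacent operator characters together so "==" stays "==".
            while (i < src.size() && operatorChars.find(src[i]) != std::string::npos) ++i;
        } else {
            ++i;  // any other single character: ( ) { } ; , . [ ] ...
        }
        assert(i > start);  // every iteration consumes at least one character
        tokens.push_back({kind, src.substr(start, i - start)});
    }
    return tokens;
}

static bool needsSpace(const Token& prev, const Token& next) {
    if (prev.kind == TokenKind::Punct && (prev.text == "(" || prev.text == ".")) return false;
    if (next.kind == TokenKind::Punct &&
        (next.text == "(" || next.text == ")" || next.text == ";" ||
         next.text == "," || next.text == ".")) return false;
    return true;
}

// Re-emit the tokens with one statement per line and 4-space indentation.
std::string prettyPrint(const std::vector<Token>& tokens) {
    std::string out;
    std::size_t depth = 0;
    const Token* prev = nullptr;  // previous token on the current line, if any
    for (const Token& t : tokens) {
        const bool punct = t.kind == TokenKind::Punct;
        if (punct && t.text == "}" && depth > 0) --depth;
        if (prev == nullptr) out.append(depth * 4, ' ');
        else if (needsSpace(*prev, t)) out += ' ';
        out += t.text;
        prev = &t;
        if (punct && t.text == "{") ++depth;
        if (punct && (t.text == "{" || t.text == "}" || t.text == ";")) {
            out += '\n';
            prev = nullptr;
        }
    }
    if (prev != nullptr) out += '\n';
    return out;
}
```

Feeding it `void main(){writeln("hi");}` produces `void main() {`, an indented `writeln("hi");`, and a closing `}` on three lines. Switching indentation styles would then be a matter of changing the rules in `prettyPrint` without touching `lex`.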
Feb 04 2012