
digitalmars.D.learn - Re: Writing a Parser

Christoph Singewald <chrisrtoph singewald.at> writes:
Here you can find some parser generators:

http://www.prowiki.org/wiki4d/wiki.cgi?GrammarParsers

I use lemonde as the parser generator and re2c to generate the lexer. With
re2c, in most cases you only have to replace 'unsigned int' with 'uint' in the
generated code.
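
The replacement step could be scripted. Here is a minimal sketch in Python, assuming the type name is the only C-ism that needs fixing; the helper name c_to_d_types is made up for illustration. It is a plain textual substitution, so it would also touch string literals and comments.

```python
import re

# Assumption: 'unsigned int' is the only C type spelling that needs changing.
# \b keeps words such as 'interval' or 'myunsigned' from matching.
_UNSIGNED_INT = re.compile(r"\bunsigned\s+int\b")

def c_to_d_types(source: str) -> str:
    """Replace C's 'unsigned int' with D's 'uint' in generated lexer code.

    Hypothetical helper: a purely textual rewrite, not a real C parser.
    """
    return _UNSIGNED_INT.sub("uint", source)

if __name__ == "__main__":
    generated = "unsigned int yyaccept = 0;\nunsigned  int yych;"
    print(c_to_d_types(generated))
```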


regards
cs

Dan Wrote:

 
 I've been messing with how to write a parser, and so far I've played with
numerous patterns before eventually wanting to cry.
 
 At the moment, I'm trying recursive descent parsing.
 
 The problem is that I've realized I'm duplicating huge volumes of code to cope
with the tristate decision of { unexpected, allow, require } for any given
token.
 
 For example, to consume a for loop, you consume something similar to
 /for\s*\((.*?)\)\s*\{(.*?)\}/
 
 I have it doing that, but my soul feels heavy with the masses of looped
switches it's doing.  Is there any way to ease the pain?
 
 Regards,
 Dan
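
One common way to cut down the duplication Dan describes is to fold the tristate into two small helpers: accept() for "allow" and expect() for "require", with "unexpected" being whatever neither helper consumes. Below is a rough sketch in Python, not D, and not a complete parser: the class, the method names and the toy tokenizer are all assumptions made for illustration. Like the regex above, it stops at the first closing bracket and does not handle nesting.

```python
import re

class ParseError(Exception):
    """Raised when a required token is missing or a token is unexpected."""

class Parser:
    def __init__(self, text):
        # Toy tokenizer (assumption): words, or single punctuation characters.
        self.tokens = re.findall(r"\w+|[^\w\s]", text)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, token):
        """'allow': consume the token if it is next and report whether it was."""
        if self.peek() == token:
            self.pos += 1
            return True
        return False

    def expect(self, token):
        """'require': consume the token or fail with an error."""
        if not self.accept(token):
            raise ParseError(f"expected {token!r}, got {self.peek()!r}")

    def until(self, closer):
        """Collect raw tokens up to, but not including, the first closer."""
        start = self.pos
        while self.peek() is not None and self.peek() != closer:
            self.pos += 1
        return self.tokens[start:self.pos]

    def for_loop(self):
        """Mirror of /for\\s*\\((.*?)\\)\\s*\\{(.*?)\\}/ without nesting."""
        self.expect("for")
        self.expect("(")
        header = self.until(")")
        self.expect(")")
        self.expect("{")
        body = self.until("}")
        self.expect("}")
        return ("for", header, body)

    def statement(self):
        """Dispatch on the next token; anything unknown is 'unexpected'."""
        if self.peek() == "for":
            return self.for_loop()
        raise ParseError(f"unexpected {self.peek()!r}")

if __name__ == "__main__":
    print(Parser("for (i in xs) { f(i); }").statement())
```

Each grammar rule then reads as a straight sequence of expect() and accept() calls, and the switch over token states lives in one place instead of being repeated in every rule.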

Jan 10 2008
bearophile <bearophileHUGS lycos.com> writes:
Christoph Singewald:
 I use lemonde as the parser generator and re2c to generate the lexer. With
 re2c, in most cases you only have to replace 'unsigned int' with 'uint' in
 the generated code.

I have seen re2c and it looks nice. I think its source code could be modified without much effort to make it produce D code instead of C.

I think C isn't the right language for writing such a tool: its sources are about 200 KB of C code (plus some code generated by re2c itself), and they could probably be replaced by a Python module of 50 KB or less that generates the C/D code.

A tiny but really fast C compiler such as TinyCC (http://fabrice.bellard.free.fr/tcc/) could be used to compile the generated C code on the fly, in memory, and execute it. This might become the starting point for giving D run-time compiled regular expressions :-) There are probably other, smarter ways to do similar things.

Bye,
bearophile
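
To give a rough idea of the "Python module that generates C" suggestion, here is a toy sketch. It assumes the regular expression has already been turned into a DFA by hand. The function emit_c_matcher and its table format are invented for this example and have nothing to do with re2c's real internals or output. It only builds the C source as a string; handing that string to a compiler such as TinyCC is left out.

```python
def emit_c_matcher(name, transitions, accepting, start=0):
    """Return C source for `int name(const char *s)` that runs a DFA over s.

    transitions: {state: {char: next_state}}, accepting: set of state numbers.
    The generated function returns 1 if the whole string is accepted, else 0.
    Assumption: every char is a plain character that needs no C escaping.
    """
    lines = [
        f"int {name}(const char *s) {{",
        f"    int state = {start};",
        "    for (; *s; ++s) {",
        "        switch (state) {",
    ]
    for state in sorted(transitions):
        lines.append(f"        case {state}:")
        lines.append("            switch (*s) {")
        for ch, nxt in sorted(transitions[state].items()):
            lines.append(f"            case '{ch}': state = {nxt}; break;")
        lines.append("            default: return 0;")
        lines.append("            }")
        lines.append("            break;")
    # States without outgoing transitions reject any further input.
    lines.append("        default: return 0;")
    lines.append("        }")
    lines.append("    }")
    accept_test = " || ".join(f"state == {s}" for s in sorted(accepting)) or "0"
    lines.append(f"    return {accept_test};")
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hand-built DFA for the regular expression ab*
    print(emit_c_matcher("match_ab", {0: {"a": 1}, 1: {"b": 1}}, {1}))
```

Emitting D instead of C would mostly mean changing the string templates, for example the function signature.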
Jan 10 2008