
digitalmars.D - Anyone interested on working on a D parser ?

"Leandro T. C. Melo via Digitalmars-d" <digitalmars-d puremagic.com> writes:
Hi D enthusiasts,

I'm developing a multi-language code modelling engine. The heart of
the project is a language-unifying AST, a generic pipeline of binding,
type checking, code completion, etc., and hooks that allow each
language to plug in its specific behavior where needed. Also, the
library is not tied to any particular IDE or text editor.
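
To make the "generic pipeline plus per-language hooks" idea concrete, here is a minimal sketch. It is not uaiso's actual code: `Node`, `LangHooks`, `GoHooks`, `DHooks`, and `exported_names` are all hypothetical names invented for illustration. One generic pass walks a shared AST and asks a language hook only for the rule that differs between languages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Language-neutral AST node (hypothetical shape, not uaiso's real AST)."""
    kind: str                      # e.g. "Module", "VarDecl"
    text: str = ""                 # declared name, when the node has one
    children: list = field(default_factory=list)


class LangHooks:
    """Default behavior used by the generic pass; languages override it."""

    def is_exported(self, name: str) -> bool:
        return True


class GoHooks(LangHooks):
    # In Go, an identifier is exported when it starts with an upper-case letter.
    def is_exported(self, name: str) -> bool:
        return name[:1].isupper()


class DHooks(LangHooks):
    # Simplification for illustration: D symbols are public by default,
    # and explicit visibility attributes are ignored here.
    pass


def exported_names(root: Node, hooks: LangHooks) -> list:
    """Generic pass: collect declared names visible outside the module."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.kind == "VarDecl" and hooks.is_exported(node.text):
            found.append(node.text)
        # Push children in reverse so they are visited in source order.
        stack.extend(reversed(node.children))
    return found
```

The pass itself never mentions a language; only the hook object changes, which is the property the design above relies on.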

One "issue" I have so far is the D parser.

Mostly for convenience, I prototyped it with Bison. Although it is
tricky to get such LR parsers working in an interactive environment,
it's still possible to recover from errors at the right spots and
provide a decent user experience; you can see some action in the
videos below, one for D and another for Go [1]. However, in the case
of D there's an additional challenge due to its grammar. Even though
I'm using a GLR parser (so ambiguities are handled), it's still
difficult to get everything in place.

Would anyone be interested in working on this parser or perhaps
building a recursive descent one? The parser is supposed to be
lightweight, not to perform symbol lookup (it can afford some
imprecision), and its result must be the special AST. Therefore,
simply taking the official dmd2 parser is not a solution, although
it could certainly serve as a reference.
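
As a rough illustration of what a hand-written alternative could look like, below is a minimal recursive descent sketch with panic-mode error recovery. It assumes a made-up, tiny D-like subset (`Type name;` and `Type name = number;`), and every name in it (`tokenize`, `Parser`, `ParseError`, the tuple-shaped nodes) is hypothetical; plain tuples merely stand in for the special AST. Note that it decides "this is a declaration" from syntax alone, without looking up whether the first identifier really names a type.

```python
import re

# One token per match: a number, an identifier, or any single other character.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


class ParseError(Exception):
    pass


def tokenize(src):
    """Split source into (kind, text) pairs: 'num', 'id', or 'punct'."""
    tokens = []
    for num, ident, punct in TOKEN.findall(src):
        if num:
            tokens.append(("num", num))
        elif ident:
            tokens.append(("id", ident))
        elif punct.strip():          # ignore stray whitespace matches
            tokens.append(("punct", punct))
    return tokens


class Parser:
    def __init__(self, src):
        self.toks = tokenize(src)
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else ("eof", "")

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise ParseError(
                f"expected {text or kind}, got {tok[1] or 'end of input'}")
        self.pos += 1
        return tok[1]

    def parse_module(self):
        decls = []
        while self.peek()[0] != "eof":
            try:
                decls.append(self.parse_var_decl())
            except ParseError as err:
                # Record the error, resynchronize, and keep parsing.
                self.errors.append(str(err))
                self.synchronize()
        return ("Module", decls)

    def parse_var_decl(self):
        # VarDecl := id id [ '=' num ] ';'   (no symbol lookup on the type)
        type_name = self.expect("id")
        name = self.expect("id")
        init = None
        if self.peek() == ("punct", "="):
            self.pos += 1
            init = ("Num", self.expect("num"))
        self.expect("punct", ";")
        return ("VarDecl", type_name, name, init)

    def synchronize(self):
        # Panic-mode recovery: skip tokens until just past the next ';'.
        while self.peek()[0] != "eof":
            tok = self.toks[self.pos]
            self.pos += 1
            if tok == ("punct", ";"):
                return
```

For example, parsing `int x = 1; Foo = ; MyType y;` with `Parser(...).parse_module()` keeps the first and third declarations and records one error for the malformed middle statement, which is the kind of behavior an interactive editor needs.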

An alternative would be an LL parser generator. I think ANTLR added a
C++ target, but I don't know how mature it is. There's also llgen, but
I've never tried it. I might experiment with one of them, possibly with Rust.

This is a project I work on in my free time, but I'm trying to keep it
moving. So if anyone is interested, please get in touch; I'd be glad to
take contributions: https://github.com/ltcmelo/uaiso

Leandro

[1] https://www.youtube.com/watch?v=ZwMQ_GB-Zv0 and
https://www.youtube.com/watch?v=nUpcVBAw0DM
Sep 16 2015
Adam D. Ruppe <destructionator gmail.com> writes:
Did you take a look at 
https://github.com/Hackerpilot/libdparse/tree/master already?
Sep 16 2015
deadalnix <deadalnix gmail.com> writes:
On Thursday, 17 September 2015 at 01:38:02 UTC, Adam D. Ruppe 
wrote:
 Did you take a look at 
 https://github.com/Hackerpilot/libdparse/tree/master already?
Yes. libdparse and/or SDC's parser seem like good places to start.
Sep 16 2015
thedeemon <dlang thedeemon.com> writes:
On Thursday, 17 September 2015 at 01:35:42 UTC, Leandro T. C. 
Melo wrote:

 An alternative would be a LL parser generator. I think ANTLR 
 added a C++ target, but I don't know how mature it is.
I used the C++ target of ANTLR about 13 years ago and it was fine. So I suppose it should be mature by now. ;)
Sep 16 2015
"Leandro T. C. Melo via Digitalmars-d" <digitalmars-d puremagic.com> writes:
Thanks for the suggestions. I'm aware of SDC and of Hackerpilot's
work, in particular an ANTLR grammar that used to live in the DGrammar
repo. Curiously, I noticed it was removed by a commit in January; I
suppose the intention is to progress with libdparse. Was this decision
a matter of taste, or was it due to some technical difficulty?

What I need to weigh now is whether 1) I continue work on the Bison
grammar I already have or 2) I evaluate such alternatives to rewrite
the parser. Given the amount of time I can invest in this project,
I'll probably stick with 1. But I'd be open to considering 2 if it's a
joint effort.

Leandro

PS: thedeemon, I guess you mean older versions of ANTLR. ANTLR 3 and 4
didn't feature a C++ target (at least originally, if I recall
correctly).



Sep 20 2015