www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - D parsing

reply "jgetner" <jgetner gmail.com> writes:
Is there a function in D for evaluating expressions before 
compile time?.
Oct 30 2013
next sibling parent "evilrat" <evilrat666 gmail.com> writes:
On Wednesday, 30 October 2013 at 22:39:13 UTC, jgetner wrote:
 Is there a function in D for evaluating expressions before 
 compile time?.
what? how it is supposed to work? O_o
Oct 30 2013
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
Do you want to evaluate an expression at CT or to parse an expression?

D has a powerful constant-folding step during compilation, where whole
functions can be evaluated, provided they contain only a subset of D
(almost all of D, except taking adresses/pointers and generating classes,
IIRC).

http://en.wikipedia.org/wiki/Compile_time_function_execution
http://dlang.org/function.html#interpretation


Now, if you want parsing specifically, what do you want to get? A parse
tree? In that case, you can for example use one of my projects, Pegged,
which generates CT-compatible parsers (that is, functions that can use CTFE
to parse a string at CT).
Parse trees can also be manipulated at compile-time, to modify the
expression, and then collapsed down again to a new expression, if that's
what is needed.

https://github.com/PhilippeSigaud/Pegged
Oct 30 2013
parent reply Martin Nowak <code dawg.eu> writes:
On 10/31/2013 07:10 AM, Philippe Sigaud wrote:
 Now, if you want parsing specifically, what do you want to get? A parse
 tree? In that case, you can for example use one of my projects, Pegged,
 which generates CT-compatible parsers (that is, functions that can use
 CTFE to parse a string at CT).
 Parse trees can also be manipulated at compile-time, to modify the
 expression, and then collapsed down again to a new expression, if that's
 what is needed.

 https://github.com/PhilippeSigaud/Pegged
Have any serious grammars been tested with this? Are there any shortcomings?
Nov 01 2013
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
The examples directory shows different grammars, from JSON to XML to C to D.

I used it with the D grammar, but the one on the website is woefully
inadequate (some mistakes, out of date compared to the compiler and written
in a somewhat convoluted style full of left-recursion). The shortcomings
are that the generated parser is quite slow compared to other D parsers.

That comes from my coding, of course: Pegged generates a simple recursive
descent parser. I guess I could push for a better engine, but I'm waiting
for CTFE to get a bit better.

The advantages are nice, though: full parse tree, semantic actions, and so
on.
Nov 01 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/1/13 1:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to XML to C to D.

 I used it with the D grammar, but the one on the website is woefully
 inadequate (some mistakes, out of date compared to the compiler and
 written in a somewhat convoluted style full of left-recursion). The
 shortcomings are that the generated parser is quite slow compared to
 other D parsers.

 That comes from my coding, of course: Pegged generates a simple
 recursive descent parser. I guess I could push for a better engine, but
 I'm waiting for CTFE to get a bit better.

 The advantages are nice, though: full parse tree, semantic actions, and
 so on.
I have long believed, and continue to believe, that this kind of work is strategic for D. There has been evidence, too - the code I pushed at work recently included a generated lexer replacing a hand-written lexer with a generated component, and people did notice. Compile-time and run-time improvements sealed the deal. To succeed against established languages, D must not only do well where others also do well. It's not even enough to do great where others do well. It must do well (or great!) where others don't even stand a chance. Regex and Pegged are in that category. Bugs stopping Pegged from going forward should receive high priority. I also encourage others to participate to that and similar work. Andrei
Nov 01 2013
next sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Andrei Alexandrescu wrote:

 Bugs stopping Pegged from going forward should receive high priority.
One prerequisite for every PEG-Parser is, that the language has to be designed to be without any ambiguity. This is not the case for D, because of its evolution based on recursive descent parsing including tricks. It is therefore not sufficient to eliminate bugs in Pegged. -manfred
Nov 01 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/1/13 9:06 PM, Manfred Nowak wrote:
 Andrei Alexandrescu wrote:

 Bugs stopping Pegged from going forward should receive high priority.
One prerequisite for every PEG-Parser is, that the language has to be designed to be without any ambiguity. This is not the case for D, because of its evolution based on recursive descent parsing including tricks. It is therefore not sufficient to eliminate bugs in Pegged. -manfred
That would mean Pegged should accommodate tricks. Very few languages don't have them. Andrei
Nov 01 2013
next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
Manfred:
One prerequisite for every PEG-Parser is, that
the language has to be designed to be without any ambiguity.

This is not the case for D, because of its evolution
 based on recursive descent parsing including tricks.

It is therefore not sufficient to eliminate bugs in Pegged.
That's right. I should test it with 2.064, though, as soon as it's out. I also have a fully recursive parser that manages ambiguity quite well (gives all possible parse trees), but that's not what I want for a programming language. What we need is a parser able to deal with left-recursive BNF grammar. That way, we can feed it the D grammar. Note that Pegged has no problem with the official C grammar or the officiel XML specification... Andrei:
 That would mean Pegged should accommodate tricks. Very few languages
don't have them. That was ma conclusion also, even if that means stepping outside of what a PEG is. I just fear losing some of their proprieties. I really like the composability of PEGs: you can inject / call another grammar inside another one: creating a grammar becomes a bit like structured programming, with functions and recursions. That means you can 'grow' your grammar by steps. That's really really nice to have. Also, discarding the 'scannerless' part of PEGs (to connect the parser to the token range resulting from DScanner, for example), or dealing with left-recursion means changing the internal engine. Of course, that can be done : the engine could be changed according to compile-time parameters. I 'just' need to code the parsers themsevels :) Bah, you're getting me all interested in it again! I wanted to write a D tutorial around a ray-tracer!
Nov 02 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/2/13 1:19 AM, Philippe Sigaud wrote:
 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!
If you want maximum impact, Pegged is it. (Not to say tutorials aren't important, but probably the skill to work on Pegged is narrower.) Andrei
Nov 02 2013
prev sibling parent reply "growler" <growlercab gmail.com> writes:
On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
wrote:

[snip]
 Bah, you're getting me all interested in it again! I wanted to 
 write a D
 tutorial around a ray-tracer!
I'd like to chime in and say I think Pegged is fantastic. I'm using it for parsing C and Markup in two different projects. I've also just started working on an Octave (well ok, Matlab) M-file parser. It will probably never see the light of day but with Pegged is great fun to play with. Please don't lose interest in it !!
Nov 02 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/2/13 1:56 PM, growler wrote:
 On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
 wrote:

 [snip]
 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!
I'd like to chime in and say I think Pegged is fantastic. I'm using it for parsing C and Markup in two different projects. I've also just started working on an Octave (well ok, Matlab) M-file parser. It will probably never see the light of day but with Pegged is great fun to play with. Please don't lose interest in it !!
Yes. Probably the most significant D project at this time. Andrei
Nov 02 2013
next sibling parent reply Timothee Cour <thelastmammoth gmail.com> writes:
1)
The main issue I see with pegged is PEG grammars don't support left
recursion, so for example will fail on foo[1].bar(2).fun().
Unless there's a plan to accomodate those, I sense a dead end.
One can eliminate left recursion but this has issues.

2)
There is some material on extending PEG to support those, eg "Left
Recursion in Parsing Expression Grammars", or code
https://github.com/orlandohill/peg-left-recursion but I don't know how well
they work in practice.

3)
Finally, a parser should be as fast as possible; I'm not sure how well
pegged performs compared to dmd frontend. Other promising options are using
lemon, a LALR(1) parser generator.

On Sat, Nov 2, 2013 at 2:17 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/2/13 1:56 PM, growler wrote:

 On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
 wrote:

 [snip]

 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!
I'd like to chime in and say I think Pegged is fantastic. I'm using it for parsing C and Markup in two different projects. I've also just started working on an Octave (well ok, Matlab) M-file parser. It will probably never see the light of day but with Pegged is great fun to play with. Please don't lose interest in it !!
Yes. Probably the most significant D project at this time. Andrei
Nov 02 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/2/13 6:45 PM, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.
I see Pegged significant not only as a PEG-based generator, but also for exercising D's generative abilities and consequently inspire other parser generators. Andrei
Nov 02 2013
parent Timothee Cour <thelastmammoth gmail.com> writes:
On Sat, Nov 2, 2013 at 7:11 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/2/13 6:45 PM, Timothee Cour wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.
I see Pegged significant not only as a PEG-based generator, but also for exercising D's generative abilities and consequently inspire other parser generators.
That's fine and it works great for certain grammars, but for grammars like D? Either theres' a plan to handle those or it's time to look for alternatives. I've suggested lemon (improves upon yacc/bison), which I've used before (via a d wrapper) http://www.hwaci.com/sw/lemon/.
Nov 02 2013
prev sibling next sibling parent reply "Chad Joan" <chadjoan gmail.com> writes:
On Sunday, 3 November 2013 at 01:45:23 UTC, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support 
 left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.
Use the repetition operator(s) and turn the resulting array into whatever tree you like. In practice, I have never had a problem with this. I have used both Pegged and have written an SQL parser at work (not exhaustive, just what's needed) that uses C macros as PEG elements. Tangent: Using C macros for this worked surprisingly well and allowed me to avoid an extra step in the build process (I do not have access to D for that project). Pegged is still much more scalable, optimizable, and the grammars are a lot less ugly/finicky.
Nov 04 2013
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Nov 5, 2013 at 5:45 AM, Chad Joan <chadjoan gmail.com> wrote:

  Use the repetition operator(s) and turn the resulting array into
 whatever tree you like.  In practice, I have never had a problem with this.

 I have used both Pegged and have written an SQL parser at work (not
 exhaustive, just what's needed) that uses C macros as PEG elements.
  Tangent: Using C macros for this worked surprisingly well and allowed me
 to avoid an extra step in the build process (I do not have access to D for
 that project).  Pegged is still much more scalable, optimizable, and the
 grammars are a lot less ugly/finicky.
That made my day, thanks!
Nov 05 2013
parent "Chad Joan" <chadjoan gmail.com> writes:
On Tuesday, 5 November 2013 at 16:31:52 UTC, Philippe Sigaud 
wrote:
 On Tue, Nov 5, 2013 at 5:45 AM, Chad Joan <chadjoan gmail.com> 
 wrote:

  Use the repetition operator(s) and turn the resulting array 
 into
 whatever tree you like.  In practice, I have never had a 
 problem with this.

 I have used both Pegged and have written an SQL parser at 
 work (not
 exhaustive, just what's needed) that uses C macros as PEG 
 elements.
  Tangent: Using C macros for this worked surprisingly well 
 and allowed me
 to avoid an extra step in the build process (I do not have 
 access to D for
 that project).  Pegged is still much more scalable, 
 optimizable, and the
 grammars are a lot less ugly/finicky.
That made my day, thanks!
You're very welcome! Thank you for spending all that time to make a neat thing. Maybe I'll rub it in: Pegged taught me about PEGs and how to build reasonably powerful parsers with extremely limited tools and support. It's like being able to make shelter in the woods when no one gives you a tent. And my company's codebase is a jungle. So even when D isn't around, it is still /useful/. Thanks.
Nov 06 2013
prev sibling next sibling parent reply Martin Nowak <code dawg.eu> writes:
On 11/03/2013 02:45 AM, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.
Scala extended their PEG generator to handle left recursion. But then there is also a Scala GLL implementation. https://github.com/djspiewak/gll-combinators
Nov 05 2013
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Nov 5, 2013 at 6:45 PM, Martin Nowak <code dawg.eu> wrote:

 On 11/03/2013 02:45 AM, Timothee Cour wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.

  Scala extended their PEG generator to handle left recursion.
But then there is also a Scala GLL implementation. https://github.com/djspiewak/gll-combinators
I didn't know that. My 'programming in Scala' book is a bit dated. I'll have a look, thanks.
Nov 06 2013
prev sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Sunday, 3 November 2013 at 01:45:23 UTC, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support 
 left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg 
 "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't 
 know how well
 they work in practice.
Left-recursion for Pegged is in the works: https://github.com/PhilippeSigaud/Pegged/pull/164 :-)
Nov 14 2015
parent =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Sunday, 15 November 2015 at 01:17:47 UTC, Bastiaan Veelo wrote:
 Left-recursion for Pegged is in the works: 
 https://github.com/PhilippeSigaud/Pegged/pull/164

 :-)
And merged!: https://github.com/PhilippeSigaud/Pegged/pull/164 Would be nice to have an overview of which languages the, with this change, can be parsed by a Pegged grammar.
Nov 15 2015
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Nov 3, 2013 at 2:45 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.
Indeed. Eliminating left-recursion can be done, but it might disrupt the parse tree. There is no discarding the fact that PEG are intrinsically made to go with top-down recursive parsers.
 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.
I'll have a look, thanks. The idea seems similar to what I wanted to do, with a growing seed.
 3)
 Finally, a parser should be as fast as possible; I'm not sure how well
 pegged performs compared to dmd frontend.
Oh, for D it works (it's even the biggest grammar I know), but it's too slow to be of practical interest. But that just means the generated parser is not top-notch, which is reasonable: I'm not a parsing pro, just a guy doing this during his free time :)
 Other promising options are using lemon, a LALR(1) parser generator.
My current plan is to write different engines, and letting either the user select them at compile-time, or to have the parser decide which one to use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar (regular expressions) could be bone by using std.regex, for example. I guess I'll have to write a CT-compatible LALR(1) engine...
 Growler:
 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!
OK guys, I'm hearing you. Thanks for the nice words growler! I tried to have Pegged simple to use, compared to other generator I know and I'm pleased to see it seems to work. If you have new grammars, you can send them to me, I'll put them in the examples.
 Yes. Probably the most significant D project at this time.

 Andrei
That's nothing compared to vide.d! But I guess it indeed demonstrates what can be done with the generative capabilities of D. Thanks for the kind words.
Nov 03 2013
prev sibling next sibling parent Timothee Cour <thelastmammoth gmail.com> writes:
On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 On Sun, Nov 3, 2013 at 2:45 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.
Indeed. Eliminating left-recursion can be done, but it might disrupt the parse tree. There is no discarding the fact that PEG are intrinsically made to go with top-down recursive parsers.
 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.
I'll have a look, thanks. The idea seems similar to what I wanted to do, with a growing seed.
Great, please let us know how that works.
 3)
 Finally, a parser should be as fast as possible; I'm not sure how well
 pegged performs compared to dmd frontend.
Oh, for D it works (it's even the biggest grammar I know), but it's too slow to be of practical interest. But that just means the generated parser is not top-notch, which is reasonable: I'm not a parsing pro, just a guy doing this during his free time :)
 Other promising options are using lemon, a LALR(1) parser generator.
My current plan is to write different engines, and letting either the user select them at compile-time, or to have the parser decide which one to use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar (regular expressions) could be bone by using std.regex, for example.
even lexing can't be done with regex, eg nesting comments : /+ ... +/ Also, although it may seem cleaner at first to combine lexing and parsing in 1 big grammar (as done in pegged), it usually is faster do feed a (separate) lexer output into parser.
 I guess I'll have to write a CT-compatible LALR(1) engine...




  Growler:
 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!
OK guys, I'm hearing you. Thanks for the nice words growler! I tried to have Pegged simple to use, compared to other generator I know and I'm pleased to see it seems to work. If you have new grammars, you can send them to me, I'll put them in the examples.
 Yes. Probably the most significant D project at this time.

 Andrei
That's nothing compared to vide.d! But I guess it indeed demonstrates what can be done with the generative capabilities of D. Thanks for the kind words.
Nov 03 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 11/03/2013 09:13 AM, Philippe Sigaud wrote:
 Oh, for D it works (it's even the biggest grammar I know), but it's
 too slow to be of practical interest. But that just means the
 generated parser is not top-notch, which is reasonable: I'm not a
 parsing pro, just a guy doing this during his free time :)

  

     Other promising options are using lemon, a LALR(1) parser generator.


 My current plan is to write different engines, and letting either the
 user select them at compile-time, or to have the parser decide which
 one to use, depending on the grammar. I'm pretty sure the 'Type 3'
 parts of a grammar (regular expressions) could be bone by using
 std.regex, for example.

 I guess I'll have to write a CT-compatible LALR(1) engine...
  
D does not fit into LALR(1), you need glr. Another idea would be to make the engine a template argument, and than combine multiple parser!(engines). And even allow hand written stuff. This way you could use ll(1) for the ll(1) parts and the crazy hand written black magic for the hard parts.
Nov 03 2013
prev sibling next sibling parent Timothee Cour <thelastmammoth gmail.com> writes:
On Sun, Nov 3, 2013 at 12:46 PM, Robert Schadek <realburner gmx.de> wrote:

  On 11/03/2013 09:13 AM, Philippe Sigaud wrote:


  Oh, for D it works (it's even the biggest grammar I know), but it's too
 slow to be of practical interest. But that just means the generated parser
 is not top-notch, which is reasonable: I'm not a parsing pro, just a guy
 doing this during his free time :)



  Other promising options are using lemon, a LALR(1) parser generator.
My current plan is to write different engines, and letting either the user select them at compile-time, or to have the parser decide which one to use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar (regular expressions) could be bone by using std.regex, for example. I guess I'll have to write a CT-compatible LALR(1) engine... D does not fit into LALR(1), you need glr.
how about dparser (nothing to do with D btw): http://dparser.sourceforge.net/ the grammar for C looks quite compact and clean: http://dparser.sourceforge.net/d/tests/ansic.test.g
 Another idea would be to make the engine a template argument, and than
 combine multiple parser!(engines). And even allow hand written stuff. This
 way you could use ll(1) for the ll(1) parts and the crazy hand written
 black magic for the hard parts.
Sure, but that'd be a 2nd priority after having at least one (partially) automatically generated parser for D.
Nov 03 2013
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud <philippe.sigaud gmail.com
 wrote:
 My current plan is to write different engines, and letting either the
 user select them at compile-time, or to have the parser decide which one to
 use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a
 grammar (regular expressions) could be bone by using std.regex, for example.
even lexing can't be done with regex, eg nesting comments : /+ ... +/ Also, although it may seem cleaner at first to combine lexing and parsing in 1 big grammar (as done in pegged), it usually is faster do feed a (separate) lexer output into parser.
Lexing, yes. I was imprecise: even in a context-free grammar, some rules are regular and could use std.regex (the ct part) as the underlying engine, just for that rule.
Nov 03 2013
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 4, 2013 at 1:55 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 I guess I'll have to write a CT-compatible LALR(1) engine...
 D does not fit into LALR(1), you need glr.
Well, too bad. GLR is also on my plate, but I fear its cubic behavior (IIRC, it degrades gracefully, though). Do you know what part of the D grammar makes it non-LALR(1)?
Nov 03 2013
parent Martin Nowak <code dawg.eu> writes:
On 11/04/2013 06:52 AM, Philippe Sigaud wrote:
 Do you know what part of the D grammar makes it non-LALR(1)?
At least function and template function declarations are not even LR(1). You need to look for a second left parenthesis to distinguish both. It's fairly easy to solve this though. 1. Add a new token r")[ ]*(" 2. Add an alternate rule that never matches but forces the parser to look behind the closing paren of the function arguments. ...
Nov 05 2013
prev sibling next sibling parent reply Robert Schadek <realburner gmx.de> writes:
On 11/04/2013 06:48 AM, Philippe Sigaud wrote:
 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote:


         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... +/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser. 


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.
Lexing can not be done with regex. Think myArray[1. ] ! What is next a dot or a number.
Nov 04 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
04-Nov-2013 12:28, Robert Schadek пишет:
 On 11/04/2013 06:48 AM, Philippe Sigaud wrote:
 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote:


         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... +/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser.


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.
Lexing can not be done with regex. Think myArray[1. ] ! What is next a dot or a number.
You could use lookahead extension ;) In general I would not recommend ATM to use std.regex for performance-critical lexer if only because it wasn't tuned for such a use case yet. I have plans for extending std.regex capabilities in many directions, lexing being one important use case. -- Dmitry Olshansky
Nov 04 2013
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
for lexing there's already dscanner we could use (while we wait for perhaps
a autogenerated lexer);
so I think priority is on the autogenerated parser (dscanner has one but
hand designed), where it's still unknown what will work well.


On Mon, Nov 4, 2013 at 2:43 AM, Dmitry Olshansky <dmitry.olsh gmail.com>wro=
te:

 04-Nov-2013 12:28, Robert Schadek =D0=BF=D0=B8=D1=88=D0=B5=D1=82:

 On 11/04/2013 06:48 AM, Philippe Sigaud wrote:

 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote=
:
         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... =
+/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser.


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.
Lexing can not be done with regex. Think myArray[1. ] ! What is next a dot or a number.
You could use lookahead extension ;) In general I would not recommend ATM to use std.regex for performance-critical lexer if only because it wasn't tuned for such a use case yet. I have plans for extending std.regex capabilities in many directions, lexing being one important use case. -- Dmitry Olshansky
Nov 04 2013
parent reply "Brian Schott" <briancschott gmail.com> writes:
On Monday, 4 November 2013 at 13:20:01 UTC, Timothee Cour wrote:
 for lexing there's already dscanner we could use (while we wait 
 for perhaps
 a autogenerated lexer);
 so I think priority is on the autogenerated parser (dscanner 
 has one but
 hand designed), where it's still unknown what will work well.
Yes, that tool has two properties: 1) It works now. Not Soon(tm). You can download it, compile it, and use it to dump the AST of your D code in just a minute or two. 2) It wasn't built THE ONE TRUE WAY. But we should take a step back first. Before we try to implement a parser for D's grammar, we need to figure out what exactly D's grammar is. Seriously. We don't have a real grammar for D. We have the language spec on dlang.org, but it isn't complete, consistent, or updated when the language changes. Want examples? I have a tracker for them here: http://d.puremagic.com/issues/show_bug.cgi?id=10233 There's also my project here: https://github.com/Hackerpilot/DGrammar, but it's not official and I keep finding differences between it and the behavior of DMD. Why am I the only one who thinks this is a problem?
Nov 04 2013
next sibling parent "Brian Schott" <briancschott gmail.com> writes:
On Tuesday, 5 November 2013 at 07:27:10 UTC, Brian Schott wrote:
 Why am I the only one who thinks this is a problem?
Ignore this line if your name is "Kenji".
Nov 04 2013
prev sibling next sibling parent Manfred Nowak <svv1999 hotmail.com> writes:
Brian Schott wrote:

 Why am I the only one who thinks this is a problem?
meetoo. -manfred
Nov 05 2013
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Nov 5, 2013 at 8:27 AM, Brian Schott <briancschott gmail.com> wrote:

 But we should take a step back first. Before we try to implement a parser
 for D's grammar, we need to figure out what exactly D's grammar is.

 Seriously. We don't have a real grammar for D. We have the language spec
 on dlang.org, but it isn't complete, consistent, or updated when the
 language changes. Want examples? I have a tracker for them here:
 http://d.puremagic.com/issues/show_bug.cgi?id=10233

 There's also my project here: https://github.com/Hackerpilot/DGrammar,
 but it's not official and I keep finding differences between it and the
 behavior of DMD.

 Why am I the only one who thinks this is a problem?
You're not alone in this, but Walter's answer is that we must make pull requests on the website if we find errors in the online grammar. The problem is not knowing what the intended behavior is. When the reference compiler and the grammar disagree, who I am to say which is wrong? We could also see if some slight modification to the grammar could push it into LALR(1) territory.
Nov 05 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/13 11:27 PM, Brian Schott wrote:
 On Monday, 4 November 2013 at 13:20:01 UTC, Timothee Cour wrote:
 for lexing there's already dscanner we could use (while we wait for
 perhaps
 a autogenerated lexer);
 so I think priority is on the autogenerated parser (dscanner has one but
 hand designed), where it's still unknown what will work well.
Yes, that tool has two properties: 1) It works now. Not Soon(tm). You can download it, compile it, and use it to dump the AST of your D code in just a minute or two. 2) It wasn't built THE ONE TRUE WAY. But we should take a step back first. Before we try to implement a parser for D's grammar, we need to figure out what exactly D's grammar is. Seriously. We don't have a real grammar for D. We have the language spec on dlang.org, but it isn't complete, consistent, or updated when the language changes. Want examples? I have a tracker for them here: http://d.puremagic.com/issues/show_bug.cgi?id=10233 There's also my project here: https://github.com/Hackerpilot/DGrammar, but it's not official and I keep finding differences between it and the behavior of DMD. Why am I the only one who thinks this is a problem?
I agree it's a problem, in fact three problems in one. In decreasing difficulty order: 1. Semantic changes for working code (e.g. order of evaluation etc) are subtle enough to be very difficult to track and require sheer attention and careful manual verification and maintenance of documentation. 2. Semantic analysis changes (i.e. compiles/doesn't compile) are also difficult and require attention, but at least can be to a good extent verified automatically (by means of test suites and runnable examples). In TDPL I have two categories of examples - visible and invisible. The invisible ones do not occur in the printed text but are present in the book source and are used to check whether the claims made by the book are true. It would be really cool if we had something like that for the online documentation. We should be able to intersperse freely documentation text with invisible unittests that ensure the documentation is correct. 3. Grammar changes are the simplest ones and in a way the most embarrassing if they happen. The best solution I see to that is deriving the documentation and the actual parser from the same source. This is part of why I'm so keen on parser generators. Andrei P.S. I haven't forgotten about the lexer - it's still on the back burner but I will publish it as soon as I get a chance.
Nov 05 2013
parent "Chad Joan" <chadjoan gmail.com> writes:
On Tuesday, 5 November 2013 at 20:17:07 UTC, Andrei Alexandrescu 
wrote:
 3. Grammar changes are the simplest ones and in a way the most 
 embarrassing if they happen. The best solution I see to that is 
 deriving the documentation and the actual parser from the same 
 source. This is part of why I'm so keen on parser generators.
This. This is why I have a hard time accepting any "formalize a grammar first" arguments. I think any grammars for D will be pure speculation until they become executable and start to see heavy use in the tools that we use.
Nov 06 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/4/2013 11:27 PM, Brian Schott wrote:
 Seriously. We don't have a real grammar for D. We have the language spec on
 dlang.org, but it isn't complete, consistent, or updated when the language
 changes. Want examples? I have a tracker for them here:
 http://d.puremagic.com/issues/show_bug.cgi?id=10233
Thank you, Brian, for finding and reporting these issues.
Nov 12 2013
prev sibling parent Robert Schadek <realburner gmx.de> writes:
On 11/04/2013 06:52 AM, Philippe Sigaud wrote:
 On Mon, Nov 4, 2013 at 1:55 AM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


         I guess I'll have to write a CT-compatible LALR(1) engine...
          
D does not fit into LALR(1), you need glr. Well, too bad. GLR is also on my plate, but I fear its cubic behavior (IIRC, it degrades gracefully, though).
Thats one part, and even worse is that you need a decider function if more than one rule accepts.
 Do you know what part of the D grammar makes it non-LALR(1)?
I had big trouble with the IdentifierList rules.
Nov 04 2013
prev sibling next sibling parent Timothee Cour <thelastmammoth gmail.com> writes:
On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...
how do you handle left recursion in C ? eg: a[0][1] etc?
Nov 03 2013
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 4, 2013 at 1:58 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud <philippe.sigaud gmail.com
 wrote:
 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...
how do you handle left recursion in C ? eg: a[0][1] etc?
I don't remember. Maybe the grammar I use has no left-recursion? I don't remember whether I sued the official C grammar or one created for PEGs by someone else. I know I tested it on many different programs, with no problem.
Nov 03 2013
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
I *used* the grammar, obviously. Suing could be an option to disallow
left-recursion, but let's explore other ways for now.


On Mon, Nov 4, 2013 at 6:46 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 On Mon, Nov 4, 2013 at 1:58 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud <
 philippe.sigaud gmail.com> wrote:

 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...
how do you handle left recursion in C ? eg: a[0][1] etc?
I don't remember. Maybe the grammar I use has no left-recursion? I don't remember whether I sued the official C grammar or one created for PEGs by someone else. I know I tested it on many different programs, with no problem.
Nov 03 2013
prev sibling parent reply Martin Nowak <code dawg.eu> writes:
On 11/02/2013 05:06 AM, Manfred Nowak wrote:
 One prerequisite for every PEG-Parser is, that the language has to be
 designed to be without any ambiguity.
AFAIK Parser expression grammars handle ambiguity through prioritization. You'd still need to rewrite grammar rules that involve left-recursion or hidden left recursion.
Nov 05 2013
parent Manfred Nowak <svv1999 hotmail.com> writes:
Martin Nowak wrote:
 AFAIK Parser expression grammars handle ambiguity through prioritization.
My statement about ambiguity is taken from the first sentence of chapter 7 of "Parsing Expression Grammars: A Recognition-Based Syntactic Foundation" by Brian Ford: http://pdos.csail.mit.edu/papers/parsing%3Apopl04.pdf: | Parsing expression grammars provide a [...] foundation for [...] | languages that are designed to be unambiguous. -manfred
Nov 05 2013
prev sibling parent reply "Chad Joan" <chadjoan gmail.com> writes:
On Friday, 1 November 2013 at 21:22:46 UTC, Andrei Alexandrescu 
wrote:
 ...
 Bugs stopping Pegged from going forward should receive high 
 priority. I also encourage others to participate to that and 
 similar work.


 Andrei
Ack! I share Philippe's sense of timing. This would have been even nicer to hear a year ago when both of us were actively committing ;) I was going to close a project or two that I care deeply about and had started a long time ago, but I see that this might be a hard decision. Nonetheless, I am really glad that you are showing this interest! I like to hear stuff like this, since I too really like Pegged. Andrei and Philippe, I feel compelled to share some of my thoughts that I never had time to finish back then. I was working on a parser-engine-as-a-library that could be used to as optimized internals for Pegged, as well as any other tools that need to recognize these common patterns. The idea was to expose an API like this: string makeParser() { auto builder = new ParserBuilder!char; builder.pushSequence(); builder.literal('x'); builder.pushMaybe(); builder.literal('y'); builder.pop(); builder.pop(); return builder.toDCode("callMe"); } const foo = makeParser(); pragma(msg, foo); mixin(foo); That snippet would create a parser that recognizes the grammar 'x' ( 'y'? ). The current fledgling implementation creates this parser: http://pastebin.com/MgSqWXE2 Of course, no one would be expected to write grammars like that. It would be the job of tools like Pegged or std.regex to package it up in nice syntax that is easy to use. The code already takes a somewhat different strategy than Pegged's original strategy. Rather than generating a bunch of templates that dmd then has to instantiate to actualize the parser, it just emits a bunch of very primitive procedural D code. I suspect that this approach would mixin far faster with current dmd, because the deeply nested templates generated by Pegged seemed to be a bottleneck. I have to hand it to Philippe though for coming up with a very clever way to bootstrap the thing: once I saw how his templates assembled together, I realized just how convenient that was! (And I think my parser generator still has to be tought how to avoid making redundant branches in its output: there's some hash table action that belongs in there somewhere.) The small amount of code that I have for it is here: https://github.com/chadjoan/xdc/blob/master/src/xdc/parser_builder/parser_builder.d I wanted to eventually make it generic enough to recognize patterns in things besides strings. Being able to write grammars that recognize patterns in ASTs is /useful/. That leads into the whole xdc project: automate all of the tedious crud in semantic analysis, and thus make compiler writing, and possibly other AST manipulations in user code, become all much easier. The other thing I wanted to do was to optimize it. - I intended it to do no allocations unless the caller asks for it. - I intended to do a bunch of PEG/regex hybridization. I noticed some mathematical properties of PEGs and regular expressions that should allow you to mix the two as much as possible. All you have to do is tell it how to behave at the boundaries where they meet. And given that PEGs already define their own behavior pretty well, it would become possible to lower a lot of a PEG into regular expressions connected with a minimal set of PEG rules. This would be some awesome lowering: if you first do a pass that inlines as many rules as possible, and then do a second pass that converts PEG elements into regular elements whenever possible, then I feel like the thing will be damned near optimal. If you are wondering what these mathematical properties are, then I encourage you to look at this snippet where I define "unitary" and "non-unitary" expressions, for lack of prior terms: http://pastebin.com/iYBypRc5 Another fun thought: PEGs can have look-behind that includes any regular elements without any additional algorithmic complexity. Just take all of the look-behinds in the grammar, mash them together into one big regular-expression using regular alternation (|), and then have the resulting automaton consume in lock-step with the PEG parser. Whenever the PEG parser needs to do a lookbehind, it just checks to see if the companion automaton is in a matching state for the capture it needs. *sigh*, I feel like I could write a paper on this stuff if I were in grad school right now. Alas, I am stuck doing 50-60 hours a week of soul-sucking business programming. Well, then again, my understanding is that even though I can think of things that seem like they would make interesting topics for publishable papers, reality would have the profs conscript me to do completely different things that are possibly just as inane as the business programming. I worry that the greater threat to good AST manipulation tools in D is a lack of free time, and not the DMD bugs as much. I hope this is useful to someone!
Nov 04 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Nov-2013 10:36, Chad Joan пишет:
 On Friday, 1 November 2013 at 21:22:46 UTC, Andrei Alexandrescu wrote:
 ...
 Bugs stopping Pegged from going forward should receive high priority.
 I also encourage others to participate to that and similar work.


 Andrei
[snip]
 I was working on a parser-engine-as-a-library that could be used to as
 optimized internals for Pegged, as well as any other tools that need to
 recognize these common patterns.

 The idea was to expose an API like this:

      string makeParser()
      {
          auto builder = new ParserBuilder!char;
          builder.pushSequence();
              builder.literal('x');
              builder.pushMaybe();
                  builder.literal('y');
              builder.pop();
          builder.pop();
          return builder.toDCode("callMe");
      }
I was also toying with the idea of exposing Builder interface for std.regex. But push/pop IMHO are better be implicitly designed-out: auto re = atom('x').star(charClass(unicode.Letter),atom('y')).build(); ... and letting the nesting be explicit. Is the same as: auto re = regex(`x(?:\p{L}y)*`); Aimed for apps/libs that build regular expressions anyway and have no need in textual parser.
 That snippet would create a parser that recognizes the grammar 'x' (
 'y'? ).
 The current fledgling implementation creates this parser:
 http://pastebin.com/MgSqWXE2

 Of course, no one would be expected to write grammars like that. It
 would be the job of tools like Pegged or std.regex to package it up in
 nice syntax that is easy to use.
I thought to provide some building blocks for that with new std.uni. Not quite everything I wanted, but now at least there is one set of wheels less to reinvent. [snip]
 Another fun thought: PEGs can have look-behind that includes any regular
 elements without any additional algorithmic complexity. Just take all of
 the look-behinds in the grammar, mash them together into one big
 regular-expression using regular alternation (|), and then have the
 resulting automaton consume in lock-step with the PEG parser.  Whenever
 the PEG parser needs to do a lookbehind, it just checks to see if the
 companion automaton is in a matching state for the capture it needs.
Sounds quite slow to do it "just in case". Also complete DFAs tend to be mm quite big. What ANTLR does is similar technique - a regular lookahead to resolve ambiguity in the grammar (implicitly). A lot like LL(k) but with unlimited length (so called LL(*)). Of course, it generates LL(k) disambiguation where possible, then LL(*), failing that the usual backtracking.
 *sigh*, I feel like I could write a paper on this stuff if I were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a week of
 soul-sucking business programming.
I heard Sociomantic is hiring D programmers for coding some awesome stuff, you may as well apply :)
 Well, then again, my understanding
 is that even though I can think of things that seem like they would make
 interesting topics for publishable papers, reality would have the profs
 conscript me to do completely different things that are possibly just as
 inane as the business programming.
Speaking for my limited experience - at times it's like that.
 I worry that the greater threat to good AST manipulation tools in D is a
 lack of free time, and not the DMD bugs as much.
Good for you I guess, my developments in related area are blocked still :(
 I hope this is useful to someone!
-- Dmitry Olshansky
Nov 05 2013
next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.olsh gmail.com>wrote:

 I was also toying with the idea of exposing Builder interface for
 std.regex. But push/pop IMHO are better be implicitly designed-out:

 auto re = atom('x').star(charClass(unicode.Letter),atom('y')).build();

 ... and letting the nesting be explicit.

 Is the same as:
 auto re = regex(`x(?:\p{L}y)*`);

 Aimed for apps/libs that build regular expressions anyway and have no need
 in textual parser.
Another possible advantage is to reference external names inside your construction, thus naming other regexen or refencing external variables to deposit backreferences inside them. All in all, to get a regex construct that can interact with the external word.
 What ANTLR does is similar technique - a regular lookahead to resolve
 ambiguity in the grammar (implicitly). A lot like LL(k) but with unlimited
 length (so called LL(*)). Of course, it generates LL(k) disambiguation
 where possible, then LL(*), failing that the usual backtracking.
I liked that idea since the author added it to ANTLR, but I never used it since. I wonder whether that can be implemented inside another parser generator or if it uses some specific-to-ANTLR internal machinery.
 *sigh*, I feel like I could write a paper on this stuff if I were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a week of
 soul-sucking business programming.
I heard Sociomantic is hiring D programmers for coding some awesome stuff, you may as well apply :)
Well, I do 50-60 hours a week of fulfilling and challenging work, but the free time at the end is the same ;) I feel Chad's pain, though.
  Well, then again, my understanding
 is that even though I can think of things that seem like they would make
 interesting topics for publishable papers, reality would have the profs
 conscript me to do completely different things that are possibly just as
 inane as the business programming.

 Speaking for my limited experience - at times it's like that.
I worry that the greater threat to good AST manipulation tools in D is a
 lack of free time, and not the DMD bugs as much.
Good for you I guess, my developments in related area are blocked still :(
Walter is far from convinced that AST manipulation is a good thing. You would have to convince him first. His fear is that it will lead to unreadable code, and everyone using her own personnal version of D. AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have balkanization, but *not* due to AST manipulation).
Nov 05 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Nov-2013 20:55, Philippe Sigaud пишет:
 On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.olsh gmail.com
 <mailto:dmitry.olsh gmail.com>> wrote:


     I was also toying with the idea of exposing Builder interface for
     std.regex. But push/pop IMHO are better be implicitly designed-out:

     auto re =
     atom('x').star(charClass(__unicode.Letter),atom('y')).__build();

     ... and letting the nesting be explicit.

     Is the same as:
     auto re = regex(`x(?:\p{L}y)*`);

     Aimed for apps/libs that build regular expressions anyway and have
     no need in textual parser.

 Another possible advantage is to reference external names inside your
 construction, thus naming other regexen or refencing external variables
 to deposit backreferences inside them.
Actually it's a bad, bad idea. It has nice potential to destroy all optimization opportunities and performance guarantees of it (like being linear in time, and that only works today w/o funky extensions used). After all I'm in a curious position of having to do some work at R-T as well where you can't always just generate some D code ;) What would be real nice though is to let users register their own dictionary of 'tokens' from that. Then things like Ipv4 pattern or domain name pattern as simple as `\d` pieces they use today (say with \i{user-defined-name}).
 All in all, to get a regex
 construct that can interact with the external word.
Well, I think of some rather interesting ways to do it even w/o tying in some external stuff as building blocks. It's rather making std.regex itself less rigid and more lean (as in cheap to invoke). Then external modules may slice and dice its primitives as seen fit.
     What ANTLR does is similar technique - a regular lookahead to
     resolve ambiguity in the grammar (implicitly). A lot like LL(k) but
     with unlimited length (so called LL(*)). Of course, it generates
     LL(k) disambiguation where possible, then LL(*), failing that the
     usual backtracking.

 I liked that idea since the author added it to ANTLR, but I never used
 it since.
 I wonder whether that can be implemented inside another parser generator
 or if it uses some specific-to-ANTLR internal machinery.
I don't think there is much of specific in it. You would though have to accept it's no longer a PEG but rather some hybrid top-down EBNF parser that resolves ambiguities.
         I worry that the greater threat to good AST manipulation tools
         in D is a
         lack of free time, and not the DMD bugs as much.


     Good for you I guess, my developments in related area are blocked
     still :(

 Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first.
I thought it was about tools that work with D code like say lints, refactoring, etc. -- Dmitry Olshansky
Nov 05 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-05 17:55, Philippe Sigaud wrote:

 Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first. His fear is that it will lead to
 unreadable code, and everyone using her own personnal version of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have
 balkanization, but *not* due to AST manipulation).
You know, Walter did a talk about AST macros at the first D conference. The idea back then was to add AST macros, hence the "macro" keyword. -- /Jacob Carlborg
Nov 06 2013
next sibling parent reply "Chad Joan" <chadjoan gmail.com> writes:
On Wednesday, 6 November 2013 at 08:19:13 UTC, Jacob Carlborg 
wrote:
 On 2013-11-05 17:55, Philippe Sigaud wrote:

 Walter is far from convinced that AST manipulation is a good 
 thing. You
 would have to convince him first. His fear is that it will 
 lead to
 unreadable code, and everyone using her own personnal version 
 of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers 
 have
 balkanization, but *not* due to AST manipulation).
You know, Walter did a talk about AST macros at the first D conference. The idea back then was to add AST macros, hence the "macro" keyword.
Also, IIRC, it is believed that string mixins with CTFE are potentially more powerful. I am under the assumption that Walter is taking the long view and waiting for the community to furnish their own powerful AST manipulation tools using the existing spec. I suspect that he is opposed to baking AST manipulation into the /language spec/, but is perfectly accepting of the notion of using AST manipulation to generate code, reduce boilerplate, implement exotic features and DSLs, and so on. Just don't complicate the core language any more than it already is. Sorry if I misrepresented you Walter; I can only make educated guesses ;)
Nov 06 2013
next sibling parent reply "Chris Cain" <clcain uncg.edu> writes:
On Wednesday, 6 November 2013 at 08:34:32 UTC, Chad Joan wrote:
 Also, IIRC, it is believed that string mixins with CTFE are 
 potentially more powerful.
Maybe. But first, the point that Brian Schott brought up has to be addressed (language spec and DMD must be in sync). Second, CTFE has to get more efficient. It's _very_ easy to make it run out of memory and it's fairly slow at executing. But yeah, I agree that there could be a library AST manipulator and it'd be pretty nice. But it'll require that two sets of parsers in two different languages be kept in sync, which is likely to be a challenge. There's pros and cons to both of the approaches, really.
Nov 06 2013
parent "Chad Joan" <chadjoan gmail.com> writes:
On Wednesday, 6 November 2013 at 09:09:32 UTC, Chris Cain wrote:
 On Wednesday, 6 November 2013 at 08:34:32 UTC, Chad Joan wrote:
 Also, IIRC, it is believed that string mixins with CTFE are 
 potentially more powerful.
Maybe. But first, the point that Brian Schott brought up has to be addressed (language spec and DMD must be in sync). Second, CTFE has to get more efficient. It's _very_ easy to make it run out of memory and it's fairly slow at executing. But yeah, I agree that there could be a library AST manipulator and it'd be pretty nice. But it'll require that two sets of parsers in two different languages be kept in sync, which is likely to be a challenge. There's pros and cons to both of the approaches, really.
Right. My thoughts are, "be patient". The slow CTFE can be fixed without changes to the language spec, and it has progressed really well over the last couple years. The DMD vs spec thing is also fixable by building tools that operate on a formal grammar and forcing the grammar to be thoroughly explored. At the end of the day, I believe these will both be pleasantly fixed, and the spec won't end up with a bunch of complexity that was "useful 10 years ago". It all makes sense to me, at least.
Nov 06 2013
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Nov 6, 2013 at 9:34 AM, Chad Joan <chadjoan gmail.com> wrote:

 On Wednesday, 6 November 2013 at 08:19:13 UTC, Jacob Carlborg wrote:

 On 2013-11-05 17:55, Philippe Sigaud wrote:

  Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first. His fear is that it will lead to
 unreadable code, and everyone using her own personnal version of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have
 balkanization, but *not* due to AST manipulation).
You know, Walter did a talk about AST macros at the first D conference. The idea back then was to add AST macros, hence the "macro" keyword.
Also, IIRC, it is believed that string mixins with CTFE are potentially more powerful. I am under the assumption that Walter is taking the long view and waiting for the community to furnish their own powerful AST manipulation tools using the existing spec. I suspect that he is opposed to baking AST manipulation into the /language spec/, but is perfectly accepting of the notion of using AST manipulation to generate code, reduce boilerplate, implement exotic features and DSLs, and so on. Just don't complicate the core language any more than it already is. Sorry if I misrepresented you Walter; I can only make educated guesses ;)
Well, I remember him saying that he doesn't want everybody and their dog to change the D syntax. If we get macros, that's exactly what we can do. For example, telling the compiler (or another tool), to rewrite int a := 1; into immutable a = 1; But that means the original not-D code cannot be compiled by anyone with a D compiler anymore (obviously). And soon everybody is transforming the grammar according to what they think is cool. That's his PoV IIRC. I'm more interested in getting to ability to define my own lowerings: in the same way that scope statements or foreach over ranges are rewritten into lower-level D code by the comiler, I would like to be able to define my own syntactic constructs. Plus, everybody here know that AST macros can cure world hunger :) Wouldn't that be nice?
Nov 06 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-07 00:23, Philippe Sigaud wrote:

 Well, I remember him saying that he doesn't want everybody and their dog
 to change the D syntax. If we get macros, that's exactly what we can do.
 For example, telling the compiler (or another tool), to rewrite

 int a := 1;

 into

 immutable a = 1;
It depends on what the AST macros allows you to do. I guess everyone here has her/his own definition.
 But that means the original not-D code cannot be compiled by anyone with
 a D compiler anymore (obviously). And soon everybody is transforming the
 grammar according to what they think is cool.
 That's his PoV IIRC.

 I'm more interested in getting to ability to define my own lowerings: in
 the same way that scope statements or foreach over ranges are rewritten
 into lower-level D code by the comiler, I would like to be able to
 define my own syntactic constructs.
My idea of AST macros[1] is quite similar to those in Scala and a bit similar to what you want. The most important thing to remember is that AST macros cannot create new syntax. It only works at the semantic level of the language. That means the above example you wrote doesn't not work, since that isn't legal D syntax. What would work is doing something like this: auto a = Person.where(e => e.name == "John"); In this case "where" is a macro. Which translates the given AST into the following SQL query: select * from person where name = 'John' Another thing I would really like is to be able to pass a delegate to a function after argument list, if the function takes a delegate as its last parameter: void each (T) (T[] arr, void delegate (T) dg); [1, 2, 3].each(e) { writeln(e); } Of courses, the compiler needs to be able to inline the delegate.
 Plus, everybody here know that AST macros can cure world hunger :)
 Wouldn't that be nice?
Of course :) [1] https://dl.dropboxusercontent.com/u/18386187/ast_macros.html -- /Jacob Carlborg
Nov 07 2013
parent reply "Martin Nowak" <code dawg.eu> writes:
On Thursday, 7 November 2013 at 13:50:57 UTC, Jacob Carlborg 
wrote:
 https://dl.dropboxusercontent.com/u/18386187/ast_macros.html
Can you pour that into a DIP?
Nov 10 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?
Absolutely, I've been meaning to do that. -- /Jacob Carlborg
Nov 10 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 10 November 2013 at 17:58:35 UTC, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?
Absolutely, I've been meaning to do that.
It require a bit more precision. But as you know I really behind the idea.
Nov 10 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 9:58 AM, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?
Absolutely, I've been meaning to do that.
Just a thought - I think the best strategy for the time being is to explore gainful use of the features already present, instead of adding new features. Andrei
Nov 10 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 10 November 2013 at 18:31:32 UTC, Andrei Alexandrescu 
wrote:
 On 11/10/13 9:58 AM, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?
Absolutely, I've been meaning to do that.
Just a thought - I think the best strategy for the time being is to explore gainful use of the features already present, instead of adding new features. Andrei
It is also a given that many feature could have been avoided in the first place given that one. Still I'd rather see thing settle down a little before introducing this. Fix the existing things, then introduce new stuff seems like the proper way to go.
Nov 10 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-10 19:44, deadalnix wrote:

 It is also a given that many feature could have been avoided in the
 first place given that one.
I agree. It seems at least a couple of features could have been implemented using AST macros instead. -- /Jacob Carlborg
Nov 10 2013
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
On Sun, Nov 10, 2013 at 12:27 PM, Jacob Carlborg <doob me.com> wrote:

 On 2013-11-10 19:44, deadalnix wrote:

  It is also a given that many feature could have been avoided in the
 first place given that one.
I agree. It seems at least a couple of features could have been implemented using AST macros instead. -- /Jacob Carlborg
I agree; eg, just to name 2 features that I proposed that could be simply implemented via AST macros as Jacob said: * [proposal: a new string litteral to embed variables in a string<http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com> http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com ] * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__, __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com] It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST macros.
Nov 10 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-10 21:38, Timothee Cour wrote:

 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>
 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]
 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
Let see, what else. These could, most likely, have been implemented using AST macros: * scope * foreach * for * inlining delegates. I think the current solution to pass a delegate/lambda as an alias parameter is a bit weird One of my favorites that is not already in the language is implementing database queries: Person.where(e => e.name == "John"); Will be translate to the following SQL: select * from person where name = 'John' Another one is the property shortcut, see the example for attribute macros: http://wiki.dlang.org/DIP50#Attribute_macros -- /Jacob Carlborg
Nov 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 1:17 PM, Jacob Carlborg wrote:
 On 2013-11-10 21:38, Timothee Cour wrote:

 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>

 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]

 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
Let see, what else. These could, most likely, have been implemented using AST macros: * scope * foreach * for * inlining delegates. I think the current solution to pass a delegate/lambda as an alias parameter is a bit weird One of my favorites that is not already in the language is implementing database queries: Person.where(e => e.name == "John"); Will be translate to the following SQL: select * from person where name = 'John' Another one is the property shortcut, see the example for attribute macros: http://wiki.dlang.org/DIP50#Attribute_macros
The way I see this is, these preexisting features (which shan't be thrown away) diminish the motivation for adding another feature. Andrei
Nov 10 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-10 23:49, Andrei Alexandrescu wrote:

 The way I see this is, these preexisting features (which shan't be
 thrown away) diminish the motivation for adding another feature.
I think the opposite. It shows how powerful macros are and that they will most likely be able to implement other new features that haven't been proposed yet. Adding a bunch of new features to the language instead of taking a step back and thinking if something else can be added that implement these features and many more shows bad judgement. -- /Jacob Carlborg
Nov 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 11:55 PM, Jacob Carlborg wrote:
 On 2013-11-10 23:49, Andrei Alexandrescu wrote:

 The way I see this is, these preexisting features (which shan't be
 thrown away) diminish the motivation for adding another feature.
I think the opposite. It shows how powerful macros are and that they will most likely be able to implement other new features that haven't been proposed yet. Adding a bunch of new features to the language instead of taking a step back and thinking if something else can be added that implement these features and many more shows bad judgement.
No need to be snarky about it. The point here is that there is significant difficulty to remove features that already exist and are useful, which changes the game considerably. Andrei
Nov 11 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 12 November 2013 at 02:34:52 UTC, Andrei Alexandrescu 
wrote:
 No need to be snarky about it. The point here is that there is 
 significant difficulty to remove features that already exist 
 and are useful, which changes the game considerably.

 Andrei
That is true. However, please understand that this is extremely frustrating to have a lot of gadget features that solve a specific issue when much more generic solutions exists, and when these gadgets actually makes it more difficult to introduce the proper features. If we put aside the AST macro thing (granted we have a swiss army knife of gadgets for many of its typical uses cases), several other skeleton don't want to stay in the closet. If I has to pick 2: tail qualified type parameter are one. It is impossible to have a use defined type that behave as slices/pointers. That is showstopper for a proper collection library. I have only ugly solution to propose to that now. Lately, I was plying with thing along the lines of struct S[T] { auto foo() const[T] { ... } } I'm afraid that what already exist in that area makes things quite difficult. The lack of qualifier for ownership, which makes concurrency and interface between manual memory management and GC tedious. For that one, isolated seems the way to go. These same issue continue to pop up in many different forms (for instance, see Kenji's DIP, where he is trying to hack around some gadget to unique postblit work).
Nov 11 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/13 9:11 PM, deadalnix wrote:
 On Tuesday, 12 November 2013 at 02:34:52 UTC, Andrei Alexandrescu wrote:
 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.

 Andrei
That is true. However, please understand that this is extremely frustrating to have a lot of gadget features that solve a specific issue when much more generic solutions exists, and when these gadgets actually makes it more difficult to introduce the proper features.
Frustration is inappropriate. Probably "bummed" would be more fit for the situation. (Which I am not, but that's a different story; I personally don't think macros are all they're cracked to be.) We did not have a macro-based solution when D was being invented, and I don't think there's a lot of glory in seeing patterns for doing things differently after the language has been in use for a while. There's a whole water under the bridge between now and then.
 If we put aside the AST macro thing (granted we have a swiss army knife
 of gadgets for many of its typical uses cases), several other skeleton
 don't want to stay in the closet.

 If I has to pick 2:
 tail qualified type parameter are one. It is impossible to have a use
 defined type that behave as slices/pointers. That is showstopper for a
 proper collection library. I have only ugly solution to propose to that
 now. Lately, I was plying with thing along the lines of struct S[T] {
 auto foo() const[T] { ... } } I'm afraid that what already exist in that
 area makes things quite difficult.

 The lack of qualifier for ownership, which makes concurrency and
 interface between manual memory management and GC tedious. For that one,
 isolated seems the way to go.

 These same issue continue to pop up in many different forms (for
 instance, see Kenji's DIP, where he is trying to hack around some gadget
 to unique postblit work).
Our goals for the time being are to improve stability, close gaps in the type system, and make good use of features in code. To the extent we can't do what we want to do, the topics you mention are fit. Andrei
Nov 11 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 12 November 2013 at 05:58:02 UTC, Andrei Alexandrescu 
wrote:
 If I has to pick 2:
 tail qualified type parameter are one. It is impossible to 
 have a use
 defined type that behave as slices/pointers. That is 
 showstopper for a
 proper collection library. I have only ugly solution to 
 propose to that
 now. Lately, I was plying with thing along the lines of struct 
 S[T] {
 auto foo() const[T] { ... } } I'm afraid that what already 
 exist in that
 area makes things quite difficult.

 The lack of qualifier for ownership, which makes concurrency 
 and
 interface between manual memory management and GC tedious. For 
 that one,
 isolated seems the way to go.

 These same issue continue to pop up in many different forms 
 (for
 instance, see Kenji's DIP, where he is trying to hack around 
 some gadget
 to unique postblit work).
Our goals for the time being are to improve stability, close gaps in the type system, and make good use of features in code. To the extent we can't do what we want to do, the topics you mention are fit. Andrei
Yes, as much as I'd like to have macro, I don't think this is the right time now. Closing the gaps is more important.
Nov 11 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-12 03:34, Andrei Alexandrescu wrote:

 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.
I'm really trying to be as humble as I can but you're not making it easy. I have not said we should remove any of the existing features. I'm trying to show that several already existing features could have been implemented using macros if they existed early in the development. It should show that it is a powerful feature and can hopefully be used to solve new ides with library code instead changing the languages. -- /Jacob Carlborg
Nov 11 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/13 11:46 PM, Jacob Carlborg wrote:
 On 2013-11-12 03:34, Andrei Alexandrescu wrote:

 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.
I'm really trying to be as humble as I can but you're not making it easy. I have not said we should remove any of the existing features. I'm trying to show that several already existing features could have been implemented using macros if they existed early in the development. It should show that it is a powerful feature and can hopefully be used to solve new ides with library code instead changing the languages.
Fine, although a sense of futility is hard to shake seeing as we won't replace those existing features. I think a much stronger point would be made if the power of the feature were demonstrated on problems not accessible with the existing ones. About DIP 50: I will say "no" but please do not take it personally. It's great there is discussion about this, and I encourage it, but at this time I think we should make no plans to add macros to D. Andrei
Nov 11 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-12 08:52, Andrei Alexandrescu wrote:

 Fine, although a sense of futility is hard to shake seeing as we won't
 replace those existing features. I think a much stronger point would be
 made if the power of the feature were demonstrated on problems not
 accessible with the existing ones.
You just said we shouldn't replace existing features. "The point here is that there is significant difficulty to remove features that already exist" http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=9#post-l5s44b:242c36:241:40digitalmars.com
 About DIP 50: I will say "no" but please do not take it personally. It's
 great there is discussion about this, and I encourage it, but at this
 time I think we should make no plans to add macros to D.
I don't think we should add macros now either. This proposal is far from ready. If Martin hadn't suggested I should create a DIP, I wouldn't have, at least now at this time. BTW, just saying "no" doesn't help a bit. You could just have said "foo". That's extremely annoying. You're shooting down very many suggestions/proposal/ideas with just a "no", or the more elaborated answer "no, never going to happen". On the other hand when you have a proposal it should be consider pre-approved and is a more of a FYI. -- /Jacob Carlborg
Nov 12 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 12:02 AM, Jacob Carlborg wrote:
 On 2013-11-12 08:52, Andrei Alexandrescu wrote:

 Fine, although a sense of futility is hard to shake seeing as we won't
 replace those existing features. I think a much stronger point would be
 made if the power of the feature were demonstrated on problems not
 accessible with the existing ones.
You just said we shouldn't replace existing features. "The point here is that there is significant difficulty to remove features that already exist" http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=9#post-l5s44b:242c36:241:40digitalmars.com
Yes. So I said. I don't get why you'd provide a link - it's in my text that you quote. Indeed, we shouldn't replace existing features.
 About DIP 50: I will say "no" but please do not take it personally. It's
 great there is discussion about this, and I encourage it, but at this
 time I think we should make no plans to add macros to D.
I don't think we should add macros now either. This proposal is far from ready. If Martin hadn't suggested I should create a DIP, I wouldn't have, at least now at this time.
Fine.
 BTW, just saying "no" doesn't help a bit. You could just have said
 "foo". That's extremely annoying. You're shooting down very many
 suggestions/proposal/ideas with just a "no", or the more elaborated
 answer "no, never going to happen".

 On the other hand when you have a proposal it should be consider
 pre-approved and is a more of a FYI.
So how could we express a "no" that doesn't annoy you in the extreme? In case the answer would be "you haven't explained why", allow me to retort. I've mentioned the argument before: at this point we should focus on quality of implementation and making good use of the features we have. In fact I am repeating myself: http://goo.gl/1thq1j. As has been publicly known for a while now, our strategy has been to improve quality and to double down on the assets we have. People ask for a roadmap, and what's missing from a roadmap is as important as what's there. This is a strategy that Walter and I agree with, have been transparent about, and that may work or not, with various degrees of success. Reasonable people may disagree what the best step moving forward should be, but at some point some step must be made and we can't adopt your strategy, with which we disagree, as our strategy, just to be nice and not offend your sensibility. (I'm using "we" here because Walter and I discussed this at large.) There must be a way to say "no" that doesn't offend you. Please advise what that is. Andrei
Nov 12 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-12 09:13, Andrei Alexandrescu wrote:

 So how could we express a "no" that doesn't annoy you in the extreme? In
 case the answer would be "you haven't explained why", allow me to retort.

 I've mentioned the argument before: at this point we should focus on
 quality of implementation and making good use of the features we have.
 In fact I am repeating myself: http://goo.gl/1thq1j. As has been
 publicly known for a while now, our strategy has been to improve quality
 and to double down on the assets we have. People ask for a roadmap, and
 what's missing from a roadmap is as important as what's there.

 This is a strategy that Walter and I agree with, have been transparent
 about, and that may work or not, with various degrees of success.
 Reasonable people may disagree what the best step moving forward should
 be, but at some point some step must be made and we can't adopt your
 strategy, with which we disagree, as our strategy, just to be nice and
 not offend your sensibility. (I'm using "we" here because Walter and I
 discussed this at large.) There must be a way to say "no" that doesn't
 offend you. Please advise what that is.
Thank you for the explanation. I do understand your and Walter's position in this. Just giving a short reason, to accompany the "no" with helps a lot. It doesn't need to be as long as the explanation above, just a sentence, like: "no, at this time we don't want to make such a big change, we're trying to stabilize". This is not just for me. I'm hoping proposal from others also can get a fair reason to why the "no". -- /Jacob Carlborg
Nov 12 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 12:31 AM, Jacob Carlborg wrote:
 Just giving a short reason, to accompany the "no" with helps a lot. It
 doesn't need to be as long as the explanation above, just a sentence, like:

 "no, at this time we don't want to make such a big change, we're trying
 to stabilize".

 This is not just for me. I'm hoping proposal from others also can get a
 fair reason to why the "no".
OK, thanks. I'll do my best to improve on that in the future. Andrei
Nov 12 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-11-12 09:37, Andrei Alexandrescu wrote:

 OK, thanks. I'll do my best to improve on that in the future.
Thanks again. Sorry for being frustrated. -- /Jacob Carlborg
Nov 12 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 12:38 PM, Timothee Cour wrote:
 On Sun, Nov 10, 2013 at 12:27 PM, Jacob Carlborg <doob me.com
 <mailto:doob me.com>> wrote:

     On 2013-11-10 19:44, deadalnix wrote:

         It is also a given that many feature could have been avoided in the
         first place given that one.


     I agree. It seems at least a couple of features could have been
     implemented using AST macros instead.

     --
     /Jacob Carlborg


 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>
 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]
 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
This would be a very small advantage. The special variables hardly cause any trouble. Andrei
Nov 10 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
This would be a very small advantage. The special variables hardly cause any trouble. Andrei
The trouble is that there are too few of them to (efficiently) reflect all the information about the calling context one may be interested in. I don't see the point in adding them one after another in an unprincipled ad-hoc fashion instead of actually thinking up a good API coupled with some more expressive power.
Nov 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 3:21 PM, Timon Gehr wrote:
 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
This would be a very small advantage. The special variables hardly cause any trouble. Andrei
The trouble is that there are too few of them to (efficiently) reflect all the information about the calling context one may be interested in. I don't see the point in adding them one after another in an unprincipled ad-hoc fashion instead of actually thinking up a good API coupled with some more expressive power.
But the API would add them in exactly the same rote manner. I really don't see an advantage here. An argument for macros would have to do a lot more than "we don't need __FILE__ etc. anymore". Andrei
Nov 10 2013
next sibling parent reply Timothee Cour <thelastmammoth gmail.com> writes:
On Sun, Nov 10, 2013 at 3:37 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/10/13 3:21 PM, Timon Gehr wrote:

 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
This would be a very small advantage. The special variables hardly cause any trouble. Andrei
The trouble is that there are too few of them to (efficiently) reflect all the information about the calling context one may be interested in. I don't see the point in adding them one after another in an unprincipled ad-hoc fashion instead of actually thinking up a good API coupled with some more expressive power.
But the API would add them in exactly the same rote manner. I really don't see an advantage here.
All else being equal, a library solution is always preferable; and it's easily extensible unlike what we have.
 An argument for macros would have to do a lot more than "we don't need
 __FILE__ etc. anymore".
More use cases have already been provided in this thread and previously. The fact that *one* language feature enables all of those is a very good indication there'll be more use cases. Furthermore, writing AST macros instead of mixins is cleaner in terms of introspection, refactoring tools, etc.
 Andrei
Nov 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/13 5:03 PM, Timothee Cour wrote:
 On Sun, Nov 10, 2013 at 3:37 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 11/10/13 3:21 PM, Timon Gehr wrote:

         On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:


                 It seems we could even get rid of
                 __FILE__,__LINE__,__MODULE__ with AST
                 macros.


             This would be a very small advantage. The special variables
             hardly cause
             any trouble.

             Andrei


         The trouble is that there are too few of them to (efficiently)
         reflect
         all the information about the calling context one may be
         interested in.
         I don't see the point in adding them one after another in an
         unprincipled ad-hoc fashion instead of actually thinking up a
         good API
         coupled with some more expressive power.


     But the API would add them in exactly the same rote manner. I really
     don't see an advantage here.


 All else being equal, a library solution is always preferable; and it's
 easily extensible unlike what we have.
What do you have in mind? I find it difficult to grab e.g. the file and the line unless there's some plumbing in the language that allows you to.
     An argument for macros would have to do a lot more than "we don't
     need __FILE__ etc. anymore".


 More use cases have already been provided in this thread and previously.
 The fact that *one* language feature enables all of those is a very good
 indication there'll be more use cases. Furthermore, writing AST macros
 instead of mixins is cleaner in terms of introspection, refactoring
 tools, etc.
I am aware of the advantages. You, too, should be aware of the disadvantages. Andrei
Nov 10 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-11 02:54, Andrei Alexandrescu wrote:

 What do you have in mind? I find it difficult to grab e.g. the file and
 the line unless there's some plumbing in the language that allows you to.
The language just need to provide a way to introspect the AST. If we're talking about a specific API I can think of something like this: macro foo (Context context, Ast!(string) str) { auto line = context.caller.line; auto file = context.caller.file; } foo("asd"); -- /Jacob Carlborg
Nov 11 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/13 12:00 AM, Jacob Carlborg wrote:
 On 2013-11-11 02:54, Andrei Alexandrescu wrote:

 What do you have in mind? I find it difficult to grab e.g. the file and
 the line unless there's some plumbing in the language that allows you to.
The language just need to provide a way to introspect the AST. If we're talking about a specific API I can think of something like this: macro foo (Context context, Ast!(string) str) { auto line = context.caller.line; auto file = context.caller.file; } foo("asd");
So... there is rote addition to the context.caller structure. It's just spelled differently. No? Andrei
Nov 11 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-12 03:35, Andrei Alexandrescu wrote:

 So... there is rote addition to the context.caller structure. It's just
 spelled differently. No?
Have you seen the other posts? Have even read the proposal? If you think that I just want a different syntax for __LINE__ and __FILE__ you're way, way off. That's just an extremely small part of the proposal. __LINE__ and __FILE__ would not be removed. -- /Jacob Carlborg
Nov 11 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/13 11:50 PM, Jacob Carlborg wrote:
 On 2013-11-12 03:35, Andrei Alexandrescu wrote:

 So... there is rote addition to the context.caller structure. It's just
 spelled differently. No?
Have you seen the other posts?
Yes. Don't assume whoever disagrees with you is misinformed.
 Have even read the proposal?
Yes.
 If you think
 that I just want a different syntax for __LINE__ and __FILE__ you're
 way, way off. That's just an extremely small part of the proposal.
 __LINE__ and __FILE__ would not be removed.
This is putting words in my mouth. Andrei
Nov 12 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-12 09:00, Andrei Alexandrescu wrote:

 Yes. Don't assume whoever disagrees with you is misinformed.
I absolutely do not. But when I read this: "An argument for macros would have to do a lot more than 'we don't need __FILE__ etc. anymore'" I find it hard to believe anything else. http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=8#post-l5p5b8:241kju:241:40digitalmars.com -- /Jacob Carlborg
Nov 12 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 12:09 AM, Jacob Carlborg wrote:
 On 2013-11-12 09:00, Andrei Alexandrescu wrote:

 Yes. Don't assume whoever disagrees with you is misinformed.
I absolutely do not. But when I read this: "An argument for macros would have to do a lot more than 'we don't need __FILE__ etc. anymore'" I find it hard to believe anything else. http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=8#post-l5p5b8:241kju:241:40digitalmars.com
It's very simple. Timon got into that part as if it were important. I pointed out it's not important. Andrei
Nov 12 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-11-12 09:14, Andrei Alexandrescu wrote:

 It's very simple. Timon got into that part as if it were important. I
 pointed out it's not important.
Ok, fair enough. I'm sorry for wrongly interpreting it. -- /Jacob Carlborg
Nov 12 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 11/12/2013 09:14 AM, Andrei Alexandrescu wrote:
 On 11/12/13 12:09 AM, Jacob Carlborg wrote:
 ...

 "An argument for macros would have to do a
 lot more than 'we don't need __FILE__ etc. anymore'"

 ...
It's very simple. Timon got into that part as if it were important. I pointed out it's
not important.
 ..
This is a misrepresentation. I'll just answer that post.
Nov 14 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-11-11 00:37, Andrei Alexandrescu wrote:

 But the API would add them in exactly the same rote manner. I really
 don't see an advantage here. An argument for macros would have to do a
 lot more than "we don't need __FILE__ etc. anymore".
They do, and several examples have already been shown. I'll also add more as I come up with them. -- /Jacob Carlborg
Nov 10 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 11/11/2013 12:37 AM, Andrei Alexandrescu wrote:
 On 11/10/13 3:21 PM, Timon Gehr wrote:
 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.
This would be a very small advantage. The special variables hardly cause any trouble. Andrei
The trouble is that there are too few of them to (efficiently) reflect all the information about the calling context one may be interested in. I don't see the point in adding them one after another in an unprincipled ad-hoc fashion instead of actually thinking up a good API coupled with some more expressive power.
But the API would add them in exactly the same rote manner.
No, it would be (close to) exhaustive from the beginning. That is the point. I assume the problem is that there have not yet been very specific proposals.
 I really don't see an advantage here.
It's the difference between having an universal language, and a language that has a finite and fixed set of words (without any methods of combination).
 An argument for macros would have to do a
 lot more than "we don't need __FILE__ etc. anymore".
 ...
Obviously, but I was not making this argument, and as far as I can see nobody else was either.
Nov 14 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?
Done: http://forum.dlang.org/thread/l5otb1$1dhi$1 digitalmars.com -- /Jacob Carlborg
Nov 10 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-11-06 09:34, Chad Joan wrote:

 Also, IIRC, it is believed that string mixins with CTFE are potentially
 more powerful.  I am under the assumption that Walter is taking the long
 view and waiting for the community to furnish their own powerful AST
 manipulation tools using the existing spec.  I suspect that he is
 opposed to baking AST manipulation into the /language spec/, but is
 perfectly accepting of the notion of using AST manipulation to generate
 code, reduce boilerplate, implement exotic features and DSLs, and so
 on.  Just don't complicate the core language any more than it already
 is. Sorry if I misrepresented you Walter; I can only make educated
 guesses ;)
The problem with this is the ugly syntax, it requires a string mixin and the source code needs to be embedded in strings. You also need a complete front end that works at compile time. This is my vision of AST macros: https://dl.dropboxusercontent.com/u/18386187/ast_macros.html -- /Jacob Carlborg
Nov 07 2013
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Nov 6, 2013 at 9:19 AM, Jacob Carlborg <doob me.com> wrote:

 You know, Walter did a talk about AST macros at the first D conference.
 The idea back then was to add AST macros, hence the "macro" keyword.
I know. But since then, I guess he changed his mind, or thought about it a bit more :-)
Nov 06 2013
prev sibling parent "Chad Joan" <chadjoan gmail.com> writes:
On Tuesday, 5 November 2013 at 14:54:34 UTC, Dmitry Olshansky 
wrote:
 I was also toying with the idea of exposing Builder interface 
 for std.regex. But push/pop IMHO are better be implicitly 
 designed-out:

 auto re = 
 atom('x').star(charClass(unicode.Letter),atom('y')).build();

 ... and letting the nesting be explicit.

 Is the same as:
 auto re = regex(`x(?:\p{L}y)*`);

 Aimed for apps/libs that build regular expressions anyway and 
 have no need in textual parser.
Interesting. I like how it induces some amount of static verification, though I worry that it could harm procedural generation of grammars. It would be difficult, for instance, to use that API to do the equivalent of pushing an atom in one function and popping it in another. I wonder if we are at different levels of abstraction. The example you give seems like it requires the API to remember, in a structured way, all of the information presented by the call-chaining and call-nesting. I might implement something like that with a stateful "builder" object under the hood. However, the example you give seems like it is closer to what a regex engine would morph an expression into, thus making it a higher abstraction.
 That snippet would create a parser that recognizes the grammar 
 'x' (
 'y'? ).
 The current fledgling implementation creates this parser:
 http://pastebin.com/MgSqWXE2

 Of course, no one would be expected to write grammars like 
 that. It
 would be the job of tools like Pegged or std.regex to package 
 it up in
 nice syntax that is easy to use.
I thought to provide some building blocks for that with new std.uni. Not quite everything I wanted, but now at least there is one set of wheels less to reinvent.
I haven't looked at std.uni earnestly yet, but if it succeeds at making that unicode/utf jungle manageable, then I will be incredibly thankful.
 [snip]

 Another fun thought: PEGs can have look-behind that includes 
 any regular
 elements without any additional algorithmic complexity. Just 
 take all of
 the look-behinds in the grammar, mash them together into one 
 big
 regular-expression using regular alternation (|), and then 
 have the
 resulting automaton consume in lock-step with the PEG parser.  
 Whenever
 the PEG parser needs to do a lookbehind, it just checks to see 
 if the
 companion automaton is in a matching state for the capture it 
 needs.
Sounds quite slow to do it "just in case". Also complete DFAs tend to be mm quite big.
I was envisioning it being done lazily and strictly as-needed, if I even got around to doing it at all.
 What ANTLR does is similar technique - a regular lookahead to 
 resolve ambiguity in the grammar (implicitly). A lot like LL(k) 
 but with unlimited length (so called LL(*)). Of course, it 
 generates LL(k) disambiguation where possible, then LL(*), 
 failing that the usual backtracking.
Neat.
 *sigh*, I feel like I could write a paper on this stuff if I 
 were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a 
 week of
 soul-sucking business programming.
I heard Sociomantic is hiring D programmers for coding some awesome stuff, you may as well apply :)
Tempting. And it seems even Facebook is joining the market now, which is news to me.
 Well, then again, my understanding
 is that even though I can think of things that seem like they 
 would make
 interesting topics for publishable papers, reality would have 
 the profs
 conscript me to do completely different things that are 
 possibly just as
 inane as the business programming.
Speaking for my limited experience - at times it's like that.
Yay, someone's observations corroborate mine! ... Crap, someone observations corroborate mine :( ;)
 I worry that the greater threat to good AST manipulation tools 
 in D is a
 lack of free time, and not the DMD bugs as much.
Good for you I guess, my developments in related area are blocked still :(
Well, hopefully you've got the wind at your back now.
Nov 06 2013
prev sibling next sibling parent reply Martin Nowak <code dawg.eu> writes:
On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to XML to C to D.
Sounds promising. I think I found a way to implement a D REPL but are lacking a reliable way to classify code snippets (expressions, declarations, statements) and get introduced symbol names. Do you think it will be able to handle that (performance is not an issue)? If there is no parser I either have to use a crude heuristic or I could use machine learning to classify the code based on token histograms (after all we have at least 2 D lexers).
Nov 05 2013
next sibling parent "cal" <callumenator gmail.com> writes:
On Tuesday, 5 November 2013 at 17:23:23 UTC, Martin Nowak wrote:
 On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to 
 XML to C to D.
Sounds promising. I think I found a way to implement a D REPL but are lacking a reliable way to classify code snippets (expressions, declarations, statements) and get introduced symbol names. Do you think it will be able to handle that (performance is not an issue)?
I did similar using Pegged little while back (https://github.com/callumenator/dabble), I tuned the PEG grammar quite a lot, but it worked ok.
Nov 05 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-11-05 18:23, Martin Nowak wrote:

 Sounds promising.
 I think I found a way to implement a D REPL but are lacking a reliable
 way to classify code snippets (expressions, declarations, statements)
 and get introduced symbol names.
 Do you think it will be able to handle that (performance is not an issue)?

 If there is no parser I either have to use a crude heuristic
 or I could use machine learning to classify the code based on token
 histograms (after all we have at least 2 D lexers).
There's a parser available, DScanner: https://github.com/Hackerpilot/Dscanner. It's part of DCD, autocomplete daemon. -- /Jacob Carlborg
Nov 06 2013
parent "Brian Schott" <briancschott gmail.com> writes:
On Wednesday, 6 November 2013 at 08:21:57 UTC, Jacob Carlborg 
wrote:
 It's part of DCD, autocomplete daemon.
Side note: DCD 0.2.0 is in beta now. There's a thread on the .ide list. http://forum.dlang.org/thread/flxjzwuaadnkrixryzds forum.dlang.org
Nov 06 2013
prev sibling parent reply Martin Nowak <code dawg.eu> writes:
On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 I used it with the D grammar, but the one on the website is woefully
 inadequate (some mistakes, out of date compared to the compiler and
 written in a somewhat convoluted style full of left-recursion). The
 shortcomings are that the generated parser is quite slow compared to
 other D parsers.

 That comes from my coding, of course: Pegged generates a simple
 recursive descent parser. I guess I could push for a better engine, but
 I'm waiting for CTFE to get a bit better.

 The advantages are nice, though: full parse tree, semantic actions, and
 so on.
Like many others I'm hoping for a nice general parser generator for quite some time. So I'm also asking specifically about your insights on PEGs. From what I know they do have some memory issues due to the backtrace memoization (only for deep parse trees though, which aren't common in programming languages). Also it seems that the research community has some issues to formalize PEGs and what grammars they are capable to handle. Also why is the expr grammar example so complicated?
Nov 05 2013
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Nov 5, 2013 at 6:39 PM, Martin Nowak <code dawg.eu> wrote:

 Like many others I'm hoping for a nice general parser generator for quite
some time. So I'm also asking specifically about your insights on PEGs. From what I know they do have some memory issues due to the backtrace memoization (only for deep parse trees though, which aren't common in programming languages). Also it seems that the research community has some issues to formalize PEGs and what grammars they are capable to handle.
IIRC, they don't fit exactly inside the Chomsky hierarchy. But then, so do other grammars and parsing algorithms.
 Also why is the expr grammar example so complicated?
What do you mean?
Nov 06 2013
prev sibling parent Martin Nowak <code dawg.eu> writes:
On 10/30/2013 11:39 PM, jgetner wrote:
 Is there a function in D for evaluating expressions before compile time?.
I want to throw two more ideas in the parser generator topic which I carry around for quite some time. I think combined they'd enable AST generation from clean and simple grammars. Maybe someone find this interesting enough to give it a try, I hardly will ever get to this. There is an interesting paper [MTP] that describes how a slight modification of the ENBF grammar can be used to generate a usable AST from the grammar definition. Each AST node is named like it's production. This is possible because the modified grammar disallows to mix alternates and sequences which forces one to name all productions. The other paper is [GLL] parsing which is a new general parser generation technique. The main advantage over GLR, it's a top-down parser. This allows to modularize grammars, e.g. you can define a grammar that reuses/imports an existing grammar for expressions. The reason to choose generalized parsers instead of say LALR(1) is maintainability of the grammars. The needed modifications to make a context free grammar LR(1) or LALR(1)-compatible often make them extremely hard to read and change. Parse forests (many parse trees) are not a problem I think because most CFG only have local ambiguities so they resolve to a single parse tree. ### A motivating example (desiderata) grammar.d ``` import number; Identifier = "[_a-zA-Z][a-zA-Z]*"; // This still needs some syntax to annotate precedence and associativity MulExpr = Expr "*" Expr; DivExpr = Expr "/" Expr; AddExpr = Expr "+" Expr; SubExpr = Expr "-" Expr; PowExpr = Expr "^^" Expr; Expr = MulExpr | DivExpr | AddExpr | SubExpr | PowExpr | Number | Identifer; Arguments = Expr % ","; ExprStmt = Expr ";"; CallStmt = Identifier "(" Arguments ")"; Stmt = ExprStmt | CallStmt; Stmts = Stmt*; ``` ```d auto p = genParser(import("grammar.d")); auto stmts = p.parseStmts(input); autp expr = p.parseExpr(input); foreach (arg; p.parseArguments(input)) {} final switch (expr) { case MulExpr.tag: case DivExpr.tag: case AddExpr.tag: case SubExpr.tag: case PowExpr.tag: } static assert(typeof(MulExpr == Tuple!(Expr, "expr0", Expr, "expr1"))); static assert(typeof(Stmt == Variant!(ExprStmt, CallStmt))); static assert(typeof(Identifier) == string)); ``` [MTP]: http://babel.ls.fi.upm.es/research/mtp/ http://babel.ls.fi.upm.es/~angel/papers/finalcopy-2005prole.pdf [GLL]: http://www.rhul.ac.uk/computerscience/research/csle/gllparsers.aspx http://dotat.at/tmp/gll.pdf for implementers - Modelling GLL Parser Implementations (http://dx.doi.org/10.1007/978-3-642-19440-5_4)
Nov 05 2013