digitalmars.D - D parsing

jgetner (2/2) Oct 30 2013 Is there a function in D for evaluating expressions before

evilrat (2/4) Oct 30 2013 what? how it is supposed to work? O_o
Philippe Sigaud (15/15) Oct 30 2013 Do you want to evaluate an expression at CT or to parse an expression?

Martin Nowak (3/11) Nov 01 2013 Have any serious grammars been tested with this?

Philippe Sigaud (10/10) Nov 01 2013 The examples directory shows different grammars, from JSON to XML to C t...

Andrei Alexandrescu (13/24) Nov 01 2013 I have long believed, and continue to believe, that this kind of work is...

Manfred Nowak (7/8) Nov 01 2013 One prerequisite for every PEG-Parser is, that the language has to be

Andrei Alexandrescu (4/12) Nov 01 2013 That would mean Pegged should accommodate tricks. Very few languages

Philippe Sigaud (24/30) Nov 02 2013 That's right. I should test it with 2.064, though, as soon as it's out.

Andrei Alexandrescu (4/6) Nov 02 2013 If you want maximum impact, Pegged is it. (Not to say tutorials aren't
growler (10/13) Nov 02 2013 On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud

Andrei Alexandrescu (3/15) Nov 02 2013 Yes. Probably the most significant D project at this time.

Timothee Cour (16/38) Nov 02 2013 1)

Andrei Alexandrescu (5/10) Nov 02 2013 I see Pegged significant not only as a PEG-based generator, but also for...

Timothee Cour (7/17) Nov 02 2013 That's fine and it works great for certain grammars, but for grammars li...

Chad Joan (11/17) Nov 04 2013 Use the repetition operator(s) and turn the resulting array into

Philippe Sigaud (2/12) Nov 05 2013 That made my day, thanks!

Chad Joan (10/31) Nov 06 2013 You're very welcome!

Martin Nowak (4/14) Nov 05 2013 Scala extended their PEG generator to handle left recursion.

Philippe Sigaud (3/19) Nov 06 2013 I didn't know that. My 'programming in Scala' book is a bit dated.

Bastiaan Veelo (4/17) Nov 14 2015 Left-recursion for Pegged is in the works:

=?UTF-8?B?Tm9yZGzDtnc=?= (4/7) Nov 15 2015 And merged!: https://github.com/PhilippeSigaud/Pegged/pull/164

Philippe Sigaud (23/50) Nov 03 2013 Indeed. Eliminating left-recursion can be done, but it might disrupt the
Timothee Cour (7/67) Nov 03 2013 Great, please let us know how that works.
Robert Schadek (6/19) Nov 03 2013 D does not fit into LALR(1), you need glr.
Timothee Cour (7/24) Nov 03 2013 how about dparser (nothing to do with D btw):
Philippe Sigaud (4/16) Nov 03 2013 Lexing, yes. I was imprecise: even in a context-free grammar, some rules
Philippe Sigaud (4/9) Nov 03 2013 Well, too bad. GLR is also on my plate, but I fear its cubic behavior

Martin Nowak (8/9) Nov 05 2013 At least function and template function declarations are not even LR(1).

Robert Schadek (3/19) Nov 04 2013 Lexing can not be done with regex. Think myArray[1. ] ! What is next a

Dmitry Olshansky (9/36) Nov 04 2013 You could use lookahead extension ;)

Timothee Cour (8/48) Nov 04 2013 for lexing there's already dscanner we could use (while we wait for perh...

Brian Schott (17/23) Nov 04 2013 Yes, that tool has two properties:

Brian Schott (2/3) Nov 04 2013 Ignore this line if your name is "Kenji".
Manfred Nowak (3/4) Nov 05 2013 meetoo.
Philippe Sigaud (7/17) Nov 05 2013 You're not alone in this, but Walter's answer is that we must make pull
Andrei Alexandrescu (23/43) Nov 05 2013 I agree it's a problem, in fact three problems in one. In decreasing

Chad Joan (7/11) Nov 06 2013 This.

Walter Bright (2/6) Nov 12 2013 Thank you, Brian, for finding and reporting these issues.

Robert Schadek (4/12) Nov 04 2013 Thats one part, and even worse is that you need a decider function if

Timothee Cour (3/5) Nov 03 2013 how do you handle left recursion in C ? eg: a[0][1] etc?
Philippe Sigaud (5/12) Nov 03 2013 I don't remember. Maybe the grammar I use has no left-recursion? I don't
Philippe Sigaud (4/21) Nov 03 2013 I *used* the grammar, obviously. Suing could be an option to disallow

Martin Nowak (4/6) Nov 05 2013 AFAIK Parser expression grammars handle ambiguity through prioritization...

Manfred Nowak (7/8) Nov 05 2013 My statement about ambiguity is taken from the first sentence of chapter...

Chad Joan (96/101) Nov 04 2013 Ack! I share Philippe's sense of timing. This would have been

Dmitry Olshansky (27/74) Nov 05 2013 I was also toying with the idea of exposing Builder interface for

Philippe Sigaud (16/45) Nov 05 2013 Another possible advantage is to reference external names inside your

Dmitry Olshansky (21/53) Nov 05 2013 Actually it's a bad, bad idea. It has nice potential to destroy all
Jacob Carlborg (5/10) Nov 06 2013 You know, Walter did a talk about AST macros at the first D conference.

Chad Joan (13/26) Nov 06 2013 Also, IIRC, it is believed that string mixins with CTFE are

Chris Cain (10/12) Nov 06 2013 Maybe. But first, the point that Brian Schott brought up has to

Chad Joan (10/22) Nov 06 2013 Right.

Philippe Sigaud (17/39) Nov 06 2013 Well, I remember him saying that he doesn't want everybody and their dog...

Jacob Carlborg (25/41) Nov 07 2013 It depends on what the AST macros allows you to do. I guess everyone

Martin Nowak (3/4) Nov 10 2013 Can you pour that into a DIP?

Jacob Carlborg (4/5) Nov 10 2013 Absolutely, I've been meaning to do that.

deadalnix (3/6) Nov 10 2013 It require a bit more precision. But as you know I really behind
Andrei Alexandrescu (5/8) Nov 10 2013 Just a thought - I think the best strategy for the time being is to

deadalnix (7/17) Nov 10 2013 It is also a given that many feature could have been avoided in

Jacob Carlborg (5/7) Nov 10 2013 I agree. It seems at least a couple of features could have been

Timothee Cour (11/19) Nov 10 2013 I agree; eg, just to name 2 features that I proposed that could be simpl...

Jacob Carlborg (17/26) Nov 10 2013 Let see, what else. These could, most likely, have been implemented

Andrei Alexandrescu (4/31) Nov 10 2013 The way I see this is, these preexisting features (which shan't be

Jacob Carlborg (9/11) Nov 10 2013 I think the opposite. It shows how powerful macros are and that they

Andrei Alexandrescu (5/14) Nov 11 2013 No need to be snarky about it. The point here is that there is

deadalnix (24/28) Nov 11 2013 That is true. However, please understand that this is extremely

Andrei Alexandrescu (12/38) Nov 11 2013 Frustration is inappropriate. Probably "bummed" would be more fit for

deadalnix (4/33) Nov 11 2013 Yes, as much as I'd like to have macro, I don't think this is the

Jacob Carlborg (9/12) Nov 11 2013 I'm really trying to be as humble as I can but you're not making it easy...

Andrei Alexandrescu (9/19) Nov 11 2013 Fine, although a sense of futility is hard to shake seeing as we won't

Jacob Carlborg (16/23) Nov 12 2013 You just said we shouldn't replace existing features.

Andrei Alexandrescu (21/42) Nov 12 2013 Yes. So I said. I don't get why you'd provide a link - it's in my text

Jacob Carlborg (11/27) Nov 12 2013 Thank you for the explanation. I do understand your and Walter's

Andrei Alexandrescu (3/9) Nov 12 2013 OK, thanks. I'll do my best to improve on that in the future.

Jacob Carlborg (4/5) Nov 12 2013 Thanks again. Sorry for being frustrated.

Andrei Alexandrescu (4/22) Nov 10 2013 This would be a very small advantage. The special variables hardly cause...

Timon Gehr (6/12) Nov 10 2013 The trouble is that there are too few of them to (efficiently) reflect

Andrei Alexandrescu (5/19) Nov 10 2013 But the API would add them in exactly the same rote manner. I really

Timothee Cour (9/34) Nov 10 2013 All else being equal, a library solution is always preferable; and it's

Andrei Alexandrescu (6/37) Nov 10 2013 What do you have in mind? I find it difficult to grab e.g. the file and

Jacob Carlborg (11/13) Nov 11 2013 The language just need to provide a way to introspect the AST. If we're

Andrei Alexandrescu (4/15) Nov 11 2013 So... there is rote addition to the context.caller structure. It's just

Jacob Carlborg (7/9) Nov 11 2013 Have you seen the other posts? Have even read the proposal? If you think...

Andrei Alexandrescu (5/14) Nov 12 2013 Yes.

Jacob Carlborg (8/9) Nov 12 2013 I absolutely do not. But when I read this:

Andrei Alexandrescu (4/11) Nov 12 2013 It's very simple. Timon got into that part as if it were important. I

Jacob Carlborg (4/6) Nov 12 2013 Ok, fair enough. I'm sorry for wrongly interpreting it.
Timon Gehr (3/12) Nov 14 2013 This is a misrepresentation. I'll just answer that post.

Jacob Carlborg (5/8) Nov 10 2013 They do, and several examples have already been shown. I'll also add
Timon Gehr (9/30) Nov 14 2013 No, it would be (close to) exhaustive from the beginning. That is the

Jacob Carlborg (4/5) Nov 10 2013 Done: http://forum.dlang.org/thread/l5otb1$1dhi$1@digitalmars.com

Jacob Carlborg (8/18) Nov 07 2013 The problem with this is the ugly syntax, it requires a string mixin and...

Philippe Sigaud (3/5) Nov 06 2013 I know. But since then, I guess he changed his mind, or thought about it...

Chad Joan (29/98) Nov 06 2013 Interesting. I like how it induces some amount of static

Martin Nowak (9/10) Nov 05 2013 Sounds promising.

cal (4/13) Nov 05 2013 I did similar using Pegged little while back
Jacob Carlborg (6/14) Nov 06 2013 There's a parser available, DScanner:

Brian Schott (5/6) Nov 06 2013 Side note: DCD 0.2.0 is in beta now. There's a thread on the .ide

Martin Nowak (9/19) Nov 05 2013 Like many others I'm hoping for a nice general parser generator for

Philippe Sigaud (4/12) Nov 06 2013 IIRC, they don't fit exactly inside the Chomsky hierarchy. But then, so ...

Martin Nowak (62/63) Nov 05 2013 I want to throw two more ideas in the parser generator topic which

"jgetner" <jgetner gmail.com> writes:

Is there a function in D for evaluating expressions before 
compile time?.

Oct 30 2013

"evilrat" <evilrat666 gmail.com> writes:

On Wednesday, 30 October 2013 at 22:39:13 UTC, jgetner wrote:
 Is there a function in D for evaluating expressions before 
 compile time?.

what? how it is supposed to work? O_o

Oct 30 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

Do you want to evaluate an expression at CT or to parse an expression?

D has a powerful constant-folding step during compilation, where whole
functions can be evaluated, provided they contain only a subset of D
(almost all of D, except taking adresses/pointers and generating classes,
IIRC).

http://en.wikipedia.org/wiki/Compile_time_function_execution
http://dlang.org/function.html#interpretation


Now, if you want parsing specifically, what do you want to get? A parse
tree? In that case, you can for example use one of my projects, Pegged,
which generates CT-compatible parsers (that is, functions that can use CTFE
to parse a string at CT).
Parse trees can also be manipulated at compile-time, to modify the
expression, and then collapsed down again to a new expression, if that's
what is needed.

https://github.com/PhilippeSigaud/Pegged

Oct 30 2013

Martin Nowak <code dawg.eu> writes:

On 10/31/2013 07:10 AM, Philippe Sigaud wrote:
 Now, if you want parsing specifically, what do you want to get? A parse
 tree? In that case, you can for example use one of my projects, Pegged,
 which generates CT-compatible parsers (that is, functions that can use
 CTFE to parse a string at CT).
 Parse trees can also be manipulated at compile-time, to modify the
 expression, and then collapsed down again to a new expression, if that's
 what is needed.

 https://github.com/PhilippeSigaud/Pegged

Have any serious grammars been tested with this?
Are there any shortcomings?

Nov 01 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

The examples directory shows different grammars, from JSON to XML to C to D.

I used it with the D grammar, but the one on the website is woefully
inadequate (some mistakes, out of date compared to the compiler and written
in a somewhat convoluted style full of left-recursion). The shortcomings
are that the generated parser is quite slow compared to other D parsers.

That comes from my coding, of course: Pegged generates a simple recursive
descent parser. I guess I could push for a better engine, but I'm waiting
for CTFE to get a bit better.

The advantages are nice, though: full parse tree, semantic actions, and so
on.

Nov 01 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/1/13 1:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to XML to C to D.

 I used it with the D grammar, but the one on the website is woefully
 inadequate (some mistakes, out of date compared to the compiler and
 written in a somewhat convoluted style full of left-recursion). The
 shortcomings are that the generated parser is quite slow compared to
 other D parsers.

 That comes from my coding, of course: Pegged generates a simple
 recursive descent parser. I guess I could push for a better engine, but
 I'm waiting for CTFE to get a bit better.

 The advantages are nice, though: full parse tree, semantic actions, and
 so on.

I have long believed, and continue to believe, that this kind of work is 
strategic for D. There has been evidence, too - the code I pushed at 
work recently included a generated lexer replacing a hand-written lexer 
with a generated component, and people did notice. Compile-time and 
run-time improvements sealed the deal.

To succeed against established languages, D must not only do well where 
others also do well. It's not even enough to do great where others do 
well. It must do well (or great!) where others don't even stand a 
chance. Regex and Pegged are in that category.

Bugs stopping Pegged from going forward should receive high priority. I 
also encourage others to participate to that and similar work.


Andrei

Nov 01 2013

Manfred Nowak <svv1999 hotmail.com> writes:

Andrei Alexandrescu wrote:

 Bugs stopping Pegged from going forward should receive high priority.

One prerequisite for every PEG-Parser is, that the language has to be 
designed to be without any ambiguity.

This is not the case for D, because of its evolution based on recursive 
descent parsing including tricks.

It is therefore not sufficient to eliminate bugs in Pegged.

-manfred

Nov 01 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/1/13 9:06 PM, Manfred Nowak wrote:
 Andrei Alexandrescu wrote:

 Bugs stopping Pegged from going forward should receive high priority.

 One prerequisite for every PEG-Parser is, that the language has to be
 designed to be without any ambiguity.

 This is not the case for D, because of its evolution based on recursive
 descent parsing including tricks.

 It is therefore not sufficient to eliminate bugs in Pegged.

 -manfred

That would mean Pegged should accommodate tricks. Very few languages 
don't have them.

Andrei

Nov 01 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

Manfred:
One prerequisite for every PEG-Parser is, that
the language has to be designed to be without any ambiguity.

This is not the case for D, because of its evolution
 based on recursive descent parsing including tricks.

It is therefore not sufficient to eliminate bugs in Pegged.

That's right. I should test it with 2.064, though, as soon as it's out.
I also have a fully recursive parser that manages ambiguity quite well
(gives all possible parse trees), but that's not what I want for a
programming language.

What we need is a parser able to deal with left-recursive BNF grammar. That
way, we can feed it the D grammar.

Note that Pegged has no problem with the official C grammar or the officiel
XML specification...

Andrei:
 That would mean Pegged should accommodate tricks. Very few languages

don't have them.

That was ma conclusion also, even if that means stepping outside of what a
PEG is. I just fear losing some of their proprieties.
I really like the composability of PEGs: you can inject / call another
grammar inside another one: creating a grammar becomes a bit like
structured programming, with functions and recursions. That means you can
'grow' your grammar by steps. That's really really nice to have.

Also, discarding the 'scannerless' part of PEGs (to connect the parser to
the token range resulting from DScanner, for example), or dealing with
left-recursion means changing the internal engine.

Of course, that can be done : the engine could be changed according to
compile-time parameters. I 'just' need to code the parsers themsevels :)

Bah, you're getting me all interested in it again! I wanted to write a D
tutorial around a ray-tracer!

Nov 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/2/13 1:19 AM, Philippe Sigaud wrote:
 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!

If you want maximum impact, Pegged is it. (Not to say tutorials aren't 
important, but probably the skill to work on Pegged is narrower.)

Andrei

Nov 02 2013

"growler" <growlercab gmail.com> writes:

On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
wrote:

[snip]
 Bah, you're getting me all interested in it again! I wanted to 
 write a D
 tutorial around a ray-tracer!

I'd like to chime in and say I think Pegged is fantastic.

I'm using it for parsing C and Markup in two different projects.
I've also just started working on an Octave (well ok, Matlab) 
M-file
parser. It will probably never see the light of day but with
Pegged is great fun to play with.

Please don't lose interest in it !!

Nov 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/2/13 1:56 PM, growler wrote:
 On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
 wrote:

 [snip]
 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!

 I'd like to chime in and say I think Pegged is fantastic.

 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!

Yes. Probably the most significant D project at this time.

Andrei

Nov 02 2013

Timothee Cour <thelastmammoth gmail.com> writes:

1)
The main issue I see with pegged is PEG grammars don't support left
recursion, so for example will fail on foo[1].bar(2).fun().
Unless there's a plan to accomodate those, I sense a dead end.
One can eliminate left recursion but this has issues.

2)
There is some material on extending PEG to support those, eg "Left
Recursion in Parsing Expression Grammars", or code
https://github.com/orlandohill/peg-left-recursion but I don't know how well
they work in practice.

3)
Finally, a parser should be as fast as possible; I'm not sure how well
pegged performs compared to dmd frontend. Other promising options are using
lemon, a LALR(1) parser generator.

On Sat, Nov 2, 2013 at 2:17 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/2/13 1:56 PM, growler wrote:

 On Saturday, 2 November 2013 at 08:19:39 UTC, Philippe Sigaud
 wrote:

 [snip]

 Bah, you're getting me all interested in it again! I wanted to write a D
 tutorial around a ray-tracer!

 I'd like to chime in and say I think Pegged is fantastic.

 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!

 Yes. Probably the most significant D project at this time.

 Andrei

Nov 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/2/13 6:45 PM, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

I see Pegged significant not only as a PEG-based generator, but also for 
exercising D's generative abilities and consequently inspire other 
parser generators.

Andrei

Nov 02 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sat, Nov 2, 2013 at 7:11 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/2/13 6:45 PM, Timothee Cour wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 I see Pegged significant not only as a PEG-based generator, but also for
 exercising D's generative abilities and consequently inspire other parser
 generators.


That's fine and it works great for certain grammars, but for grammars like
D?
Either theres' a plan to handle those or it's time to look for
alternatives. I've suggested lemon (improves upon yacc/bison), which I've
used before (via a d wrapper) http://www.hwaci.com/sw/lemon/.

Nov 02 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Sunday, 3 November 2013 at 01:45:23 UTC, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support 
 left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

Use the repetition operator(s) and turn the resulting array into 
whatever tree you like.  In practice, I have never had a problem 
with this.

I have used both Pegged and have written an SQL parser at work 
(not exhaustive, just what's needed) that uses C macros as PEG 
elements.  Tangent: Using C macros for this worked surprisingly 
well and allowed me to avoid an extra step in the build process 
(I do not have access to D for that project).  Pegged is still 
much more scalable, optimizable, and the grammars are a lot less 
ugly/finicky.

Nov 04 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Nov 5, 2013 at 5:45 AM, Chad Joan <chadjoan gmail.com> wrote:

  Use the repetition operator(s) and turn the resulting array into
 whatever tree you like.  In practice, I have never had a problem with this.

 I have used both Pegged and have written an SQL parser at work (not
 exhaustive, just what's needed) that uses C macros as PEG elements.
  Tangent: Using C macros for this worked surprisingly well and allowed me
 to avoid an extra step in the build process (I do not have access to D for
 that project).  Pegged is still much more scalable, optimizable, and the
 grammars are a lot less ugly/finicky.


That made my day, thanks!

Nov 05 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Tuesday, 5 November 2013 at 16:31:52 UTC, Philippe Sigaud 
wrote:
 On Tue, Nov 5, 2013 at 5:45 AM, Chad Joan <chadjoan gmail.com> 
 wrote:

  Use the repetition operator(s) and turn the resulting array 
 into
 whatever tree you like.  In practice, I have never had a 
 problem with this.

 I have used both Pegged and have written an SQL parser at 
 work (not
 exhaustive, just what's needed) that uses C macros as PEG 
 elements.
  Tangent: Using C macros for this worked surprisingly well 
 and allowed me
 to avoid an extra step in the build process (I do not have 
 access to D for
 that project).  Pegged is still much more scalable, 
 optimizable, and the
 grammars are a lot less ugly/finicky.


 That made my day, thanks!

You're very welcome!

Thank you for spending all that time to make a neat thing.

Maybe I'll rub it in:
Pegged taught me about PEGs and how to build reasonably powerful 
parsers with extremely limited tools and support.  It's like 
being able to make shelter in the woods when no one gives you a 
tent.  And my company's codebase is a jungle.  So even when D 
isn't around, it is still /useful/.  Thanks.

Nov 06 2013

Martin Nowak <code dawg.eu> writes:

On 11/03/2013 02:45 AM, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.

Scala extended their PEG generator to handle left recursion.
But then there is also a Scala GLL implementation.
https://github.com/djspiewak/gll-combinators

Nov 05 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Nov 5, 2013 at 6:45 PM, Martin Nowak <code dawg.eu> wrote:

 On 11/03/2013 02:45 AM, Timothee Cour wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.

  Scala extended their PEG generator to handle left recursion.

 But then there is also a Scala GLL implementation.
 https://github.com/djspiewak/gll-combinators


I didn't know that. My 'programming in Scala' book is a bit dated.
I'll have a look, thanks.

Nov 06 2013

Bastiaan Veelo <Bastiaan Veelo.net> writes:

On Sunday, 3 November 2013 at 01:45:23 UTC, Timothee Cour wrote:
 1)
 The main issue I see with pegged is PEG grammars don't support 
 left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 2)
 There is some material on extending PEG to support those, eg 
 "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't 
 know how well
 they work in practice.

Left-recursion for Pegged is in the works: 
https://github.com/PhilippeSigaud/Pegged/pull/164

:-)

Nov 14 2015

=?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

On Sunday, 15 November 2015 at 01:17:47 UTC, Bastiaan Veelo wrote:
 Left-recursion for Pegged is in the works: 
 https://github.com/PhilippeSigaud/Pegged/pull/164

 :-)

And merged!: https://github.com/PhilippeSigaud/Pegged/pull/164

Would be nice to have an overview of which languages the, with 
this change, can be parsed by a Pegged grammar.

Nov 15 2015

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Sun, Nov 3, 2013 at 2:45 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

Indeed. Eliminating left-recursion can be done, but it might disrupt the
parse tree.
There is no discarding the fact that PEG are intrinsically made to go with
top-down recursive parsers.


 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.

I'll have a look, thanks. The idea seems similar to what I wanted to do,
with a growing seed.



 3)
 Finally, a parser should be as fast as possible; I'm not sure how well
 pegged performs compared to dmd frontend.


Oh, for D it works (it's even the biggest grammar I know), but it's too
slow to be of practical interest. But that just means the generated parser
is not top-notch, which is reasonable: I'm not a parsing pro, just a guy
doing this during his free time :)



 Other promising options are using lemon, a LALR(1) parser generator.

My current plan is to write different engines, and letting either the user
select them at compile-time, or to have the parser decide which one to use,
depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar
(regular expressions) could be bone by using std.regex, for example.

I guess I'll have to write a CT-compatible LALR(1) engine...




 Growler:
 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!



OK guys, I'm hearing you. Thanks for the nice words growler! I tried to
have Pegged simple to use, compared to other generator I know and I'm
pleased to see it seems to work. If you have new grammars, you can send
them to me, I'll put them in the examples.



 Yes. Probably the most significant D project at this time.

 Andrei


That's nothing compared to vide.d! But I guess it indeed demonstrates what
can be done with the generative capabilities of D.
Thanks for the kind words.

Nov 03 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 On Sun, Nov 3, 2013 at 2:45 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 1)
 The main issue I see with pegged is PEG grammars don't support left
 recursion, so for example will fail on foo[1].bar(2).fun().
 Unless there's a plan to accomodate those, I sense a dead end.
 One can eliminate left recursion but this has issues.

 Indeed. Eliminating left-recursion can be done, but it might disrupt the
 parse tree.
 There is no discarding the fact that PEG are intrinsically made to go with
 top-down recursive parsers.


 2)
 There is some material on extending PEG to support those, eg "Left
 Recursion in Parsing Expression Grammars", or code
 https://github.com/orlandohill/peg-left-recursion but I don't know how
 well they work in practice.

 I'll have a look, thanks. The idea seems similar to what I wanted to do,
 with a growing seed.

Great, please let us know how that works.


 3)
 Finally, a parser should be as fast as possible; I'm not sure how well
 pegged performs compared to dmd frontend.


 Oh, for D it works (it's even the biggest grammar I know), but it's too
 slow to be of practical interest. But that just means the generated parser
 is not top-notch, which is reasonable: I'm not a parsing pro, just a guy
 doing this during his free time :)



 Other promising options are using lemon, a LALR(1) parser generator.

 My current plan is to write different engines, and letting either the user
 select them at compile-time, or to have the parser decide which one to use,
 depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar
 (regular expressions) could be bone by using std.regex, for example.

even lexing can't be done with regex, eg nesting comments : /+ ... +/
Also, although it may seem cleaner at first to combine lexing and parsing
in 1 big grammar (as done in pegged), it usually is faster do feed a
(separate) lexer output into parser.


 I guess I'll have to write a CT-compatible LALR(1) engine...




  Growler:
 I'm using it for parsing C and Markup in two different projects.
 I've also just started working on an Octave (well ok, Matlab) M-file
 parser. It will probably never see the light of day but with
 Pegged is great fun to play with.

 Please don't lose interest in it !!



 OK guys, I'm hearing you. Thanks for the nice words growler! I tried to
 have Pegged simple to use, compared to other generator I know and I'm
 pleased to see it seems to work. If you have new grammars, you can send
 them to me, I'll put them in the examples.



 Yes. Probably the most significant D project at this time.

 Andrei


 That's nothing compared to vide.d! But I guess it indeed demonstrates what
 can be done with the generative capabilities of D.
 Thanks for the kind words.

Nov 03 2013

Robert Schadek <realburner gmx.de> writes:

On 11/03/2013 09:13 AM, Philippe Sigaud wrote:
 Oh, for D it works (it's even the biggest grammar I know), but it's
 too slow to be of practical interest. But that just means the
 generated parser is not top-notch, which is reasonable: I'm not a
 parsing pro, just a guy doing this during his free time :)

  

     Other promising options are using lemon, a LALR(1) parser generator.


 My current plan is to write different engines, and letting either the
 user select them at compile-time, or to have the parser decide which
 one to use, depending on the grammar. I'm pretty sure the 'Type 3'
 parts of a grammar (regular expressions) could be bone by using
 std.regex, for example.

 I guess I'll have to write a CT-compatible LALR(1) engine...
  

D does not fit into LALR(1), you need glr.

Another idea would be to make the engine a template argument, and than
combine multiple parser!(engines). And even allow hand written stuff.
This way you could use ll(1) for the ll(1) parts and the crazy hand
written black magic for the hard parts.

Nov 03 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sun, Nov 3, 2013 at 12:46 PM, Robert Schadek <realburner gmx.de> wrote:

  On 11/03/2013 09:13 AM, Philippe Sigaud wrote:


  Oh, for D it works (it's even the biggest grammar I know), but it's too
 slow to be of practical interest. But that just means the generated parser
 is not top-notch, which is reasonable: I'm not a parsing pro, just a guy
 doing this during his free time :)



  Other promising options are using lemon, a LALR(1) parser generator.

  My current plan is to write different engines, and letting either the
 user select them at compile-time, or to have the parser decide which one to
 use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a
 grammar (regular expressions) could be bone by using std.regex, for example.

  I guess I'll have to write a CT-compatible LALR(1) engine...


 D does not fit into LALR(1), you need glr.

how about dparser (nothing to do with D btw):
http://dparser.sourceforge.net/
the grammar for C looks quite compact and clean:
http://dparser.sourceforge.net/d/tests/ansic.test.g


 Another idea would be to make the engine a template argument, and than
 combine multiple parser!(engines). And even allow hand written stuff. This
 way you could use ll(1) for the ll(1) parts and the crazy hand written
 black magic for the hard parts.

Sure, but that'd be a 2nd priority after having at least one (partially)
automatically generated parser for D.

Nov 03 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud <philippe.sigaud gmail.com
 wrote:

 My current plan is to write different engines, and letting either the
 user select them at compile-time, or to have the parser decide which one to
 use, depending on the grammar. I'm pretty sure the 'Type 3' parts of a
 grammar (regular expressions) could be bone by using std.regex, for example.

 even lexing can't be done with regex, eg nesting comments : /+ ... +/
 Also, although it may seem cleaner at first to combine lexing and parsing
 in 1 big grammar (as done in pegged), it usually is faster do feed a
 (separate) lexer output into parser.

Lexing, yes. I was imprecise: even in a context-free grammar, some rules
are regular and could use std.regex (the ct part) as the underlying engine,
just for that rule.

Nov 03 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Mon, Nov 4, 2013 at 1:55 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 I guess I'll have to write a CT-compatible LALR(1) engine...
 D does not fit into LALR(1), you need glr.


Well, too bad. GLR is also on my plate, but I fear its cubic behavior
(IIRC, it degrades gracefully, though).
Do you know what part of the D grammar makes it non-LALR(1)?

Nov 03 2013

Martin Nowak <code dawg.eu> writes:

On 11/04/2013 06:52 AM, Philippe Sigaud wrote:
 Do you know what part of the D grammar makes it non-LALR(1)?

At least function and template function declarations are not even LR(1).
You need to look for a second left parenthesis to distinguish both.

It's fairly easy to solve this though.
1. Add a new token r")[ ]*("
2. Add an alternate rule that never matches but forces the parser to 
look behind the closing paren of the function arguments.
...

Nov 05 2013

Robert Schadek <realburner gmx.de> writes:

On 11/04/2013 06:48 AM, Philippe Sigaud wrote:
 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote:


         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... +/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser. 


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.

Lexing can not be done with regex. Think myArray[1. ] ! What is next a
dot or a number.

Nov 04 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

04-Nov-2013 12:28, Robert Schadek пишет:
 On 11/04/2013 06:48 AM, Philippe Sigaud wrote:
 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote:


         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... +/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser.


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.

 Lexing can not be done with regex. Think myArray[1. ] ! What is next a
 dot or a number.

You could use lookahead extension ;)

In general I would not recommend ATM to use std.regex for 
performance-critical lexer if only because it wasn't tuned for such a 
use case yet.

I have plans for extending std.regex capabilities in many directions, 
lexing being one important use case.

-- 
Dmitry Olshansky

Nov 04 2013

Timothee Cour <thelastmammoth gmail.com> writes:

for lexing there's already dscanner we could use (while we wait for perhaps
a autogenerated lexer);
so I think priority is on the autogenerated parser (dscanner has one but
hand designed), where it's still unknown what will work well.


On Mon, Nov 4, 2013 at 2:43 AM, Dmitry Olshansky <dmitry.olsh gmail.com>wro=
te:

 04-Nov-2013 12:28, Robert Schadek =D0=BF=D0=B8=D1=88=D0=B5=D1=82:

 On 11/04/2013 06:48 AM, Philippe Sigaud wrote:

 On Sun, Nov 3, 2013 at 7:08 PM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


     On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
     <philippe.sigaud gmail.com <mailto:philippe.sigaud gmail.com>>wrote=



:
         My current plan is to write different engines, and letting
         either the user select them at compile-time, or to have the
         parser decide which one to use, depending on the grammar. I'm
         pretty sure the 'Type 3' parts of a grammar (regular
         expressions) could be bone by using std.regex, for example.


     even lexing can't be done with regex, eg nesting comments : /+ ... =



+/
     Also, although it may seem cleaner at first to combine lexing and
     parsing in 1 big grammar (as done in pegged), it usually is faster
     do feed a (separate) lexer output into parser.


 Lexing, yes. I was imprecise: even in a context-free grammar, some
 rules are regular and could use std.regex (the ct part) as the
 underlying engine, just for that rule.

 Lexing can not be done with regex. Think myArray[1. ] ! What is next a
 dot or a number.

 You could use lookahead extension ;)

 In general I would not recommend ATM to use std.regex for
 performance-critical lexer if only because it wasn't tuned for such a use
 case yet.

 I have plans for extending std.regex capabilities in many directions,
 lexing being one important use case.

 --
 Dmitry Olshansky

Nov 04 2013

"Brian Schott" <briancschott gmail.com> writes:

On Monday, 4 November 2013 at 13:20:01 UTC, Timothee Cour wrote:
 for lexing there's already dscanner we could use (while we wait 
 for perhaps
 a autogenerated lexer);
 so I think priority is on the autogenerated parser (dscanner 
 has one but
 hand designed), where it's still unknown what will work well.

Yes, that tool has two properties:
1) It works now. Not Soon(tm). You can download it, compile it, 
and use it to dump the AST of your D code in just a minute or two.
2) It wasn't built THE ONE TRUE WAY.

But we should take a step back first. Before we try to implement 
a parser for D's grammar, we need to figure out what exactly D's 
grammar is.

Seriously. We don't have a real grammar for D. We have the 
language spec on dlang.org, but it isn't complete, consistent, or 
updated when the language changes. Want examples? I have a 
tracker for them here: 
http://d.puremagic.com/issues/show_bug.cgi?id=10233

There's also my project here: 
https://github.com/Hackerpilot/DGrammar, but it's not official 
and I keep finding differences between it and the behavior of DMD.

Why am I the only one who thinks this is a problem?

Nov 04 2013

"Brian Schott" <briancschott gmail.com> writes:

On Tuesday, 5 November 2013 at 07:27:10 UTC, Brian Schott wrote:
 Why am I the only one who thinks this is a problem?

Ignore this line if your name is "Kenji".

Nov 04 2013

Manfred Nowak <svv1999 hotmail.com> writes:

Brian Schott wrote:

 Why am I the only one who thinks this is a problem?

meetoo.

-manfred

Nov 05 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Nov 5, 2013 at 8:27 AM, Brian Schott <briancschott gmail.com> wrote:

 But we should take a step back first. Before we try to implement a parser
 for D's grammar, we need to figure out what exactly D's grammar is.

 Seriously. We don't have a real grammar for D. We have the language spec
 on dlang.org, but it isn't complete, consistent, or updated when the
 language changes. Want examples? I have a tracker for them here:
 http://d.puremagic.com/issues/show_bug.cgi?id=10233

 There's also my project here: https://github.com/Hackerpilot/DGrammar,
 but it's not official and I keep finding differences between it and the
 behavior of DMD.

 Why am I the only one who thinks this is a problem?

You're not alone in this, but Walter's answer is that we must make pull
requests on the website if we find errors in the online grammar. The
problem is not knowing what the intended behavior is. When the reference
compiler and the grammar disagree, who I am to say which is wrong?

We could also see if some slight modification to the grammar could push it
into LALR(1) territory.

Nov 05 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/4/13 11:27 PM, Brian Schott wrote:
 On Monday, 4 November 2013 at 13:20:01 UTC, Timothee Cour wrote:
 for lexing there's already dscanner we could use (while we wait for
 perhaps
 a autogenerated lexer);
 so I think priority is on the autogenerated parser (dscanner has one but
 hand designed), where it's still unknown what will work well.

 Yes, that tool has two properties:
 1) It works now. Not Soon(tm). You can download it, compile it, and use
 it to dump the AST of your D code in just a minute or two.
 2) It wasn't built THE ONE TRUE WAY.

 But we should take a step back first. Before we try to implement a
 parser for D's grammar, we need to figure out what exactly D's grammar is.

 Seriously. We don't have a real grammar for D. We have the language spec
 on dlang.org, but it isn't complete, consistent, or updated when the
 language changes. Want examples? I have a tracker for them here:
 http://d.puremagic.com/issues/show_bug.cgi?id=10233

 There's also my project here: https://github.com/Hackerpilot/DGrammar,
 but it's not official and I keep finding differences between it and the
 behavior of DMD.

 Why am I the only one who thinks this is a problem?

I agree it's a problem, in fact three problems in one. In decreasing 
difficulty order:

1. Semantic changes for working code (e.g. order of evaluation etc) are 
subtle enough to be very difficult to track and require sheer attention 
and careful manual verification and maintenance of documentation.

2. Semantic analysis changes (i.e. compiles/doesn't compile) are also 
difficult and require attention, but at least can be to a good extent 
verified automatically (by means of test suites and runnable examples). 
In TDPL I have two categories of examples - visible and invisible. The 
invisible ones do not occur in the printed text but are present in the 
book source and are used to check whether the claims made by the book 
are true. It would be really cool if we had something like that for the 
online documentation. We should be able to intersperse freely 
documentation text with invisible unittests that ensure the 
documentation is correct.

3. Grammar changes are the simplest ones and in a way the most 
embarrassing if they happen. The best solution I see to that is deriving 
the documentation and the actual parser from the same source. This is 
part of why I'm so keen on parser generators.



Andrei

P.S. I haven't forgotten about the lexer - it's still on the back burner 
but I will publish it as soon as I get a chance.

Nov 05 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Tuesday, 5 November 2013 at 20:17:07 UTC, Andrei Alexandrescu 
wrote:
 3. Grammar changes are the simplest ones and in a way the most 
 embarrassing if they happen. The best solution I see to that is 
 deriving the documentation and the actual parser from the same 
 source. This is part of why I'm so keen on parser generators.

This.

This is why I have a hard time accepting any "formalize a grammar 
first" arguments.  I think any grammars for D will be pure 
speculation until they become executable and start to see heavy 
use in the tools that we use.

Nov 06 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 11/4/2013 11:27 PM, Brian Schott wrote:
 Seriously. We don't have a real grammar for D. We have the language spec on
 dlang.org, but it isn't complete, consistent, or updated when the language
 changes. Want examples? I have a tracker for them here:
 http://d.puremagic.com/issues/show_bug.cgi?id=10233

Thank you, Brian, for finding and reporting these issues.

Nov 12 2013

Robert Schadek <realburner gmx.de> writes:

On 11/04/2013 06:52 AM, Philippe Sigaud wrote:
 On Mon, Nov 4, 2013 at 1:55 AM, Timothee Cour
 <thelastmammoth gmail.com <mailto:thelastmammoth gmail.com>> wrote:


         I guess I'll have to write a CT-compatible LALR(1) engine...
          

         D does not fit into LALR(1), you need glr.


 Well, too bad. GLR is also on my plate, but I fear its cubic behavior
 (IIRC, it degrades gracefully, though).

Thats one part, and even worse is that you need a decider function if
more than one rule accepts.
 Do you know what part of the D grammar makes it non-LALR(1)?

I had big trouble with the IdentifierList rules.

Nov 04 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...

how do you handle left recursion in C ? eg: a[0][1] etc?

Nov 03 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Mon, Nov 4, 2013 at 1:58 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud <philippe.sigaud gmail.com
 wrote:

 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...

 how do you handle left recursion in C ? eg: a[0][1] etc?

I don't remember. Maybe the grammar I use has no left-recursion? I don't
remember whether I sued the official C grammar or one created for PEGs by
someone else. I know I tested it on many different programs, with no
problem.

Nov 03 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

I *used* the grammar, obviously. Suing could be an option to disallow
left-recursion, but let's explore other ways for now.


On Mon, Nov 4, 2013 at 6:46 AM, Philippe Sigaud
<philippe.sigaud gmail.com>wrote:

 On Mon, Nov 4, 2013 at 1:58 AM, Timothee Cour <thelastmammoth gmail.com>wrote:

 On Sat, Nov 2, 2013 at 1:19 AM, Philippe Sigaud <
 philippe.sigaud gmail.com> wrote:

 Note that Pegged has no problem with the official C grammar or the
 officiel XML specification...

 how do you handle left recursion in C ? eg: a[0][1] etc?

 I don't remember. Maybe the grammar I use has no left-recursion? I don't
 remember whether I sued the official C grammar or one created for PEGs by
 someone else. I know I tested it on many different programs, with no
 problem.

Nov 03 2013

Martin Nowak <code dawg.eu> writes:

On 11/02/2013 05:06 AM, Manfred Nowak wrote:
 One prerequisite for every PEG-Parser is, that the language has to be
 designed to be without any ambiguity.

AFAIK Parser expression grammars handle ambiguity through prioritization.
You'd still need to rewrite grammar rules that involve left-recursion or 
hidden left recursion.

Nov 05 2013

Manfred Nowak <svv1999 hotmail.com> writes:

Martin Nowak wrote:
 AFAIK Parser expression grammars handle ambiguity through prioritization.

My statement about ambiguity is taken from the first sentence of chapter 7 
of "Parsing Expression Grammars: A Recognition-Based Syntactic Foundation"
by Brian Ford: http://pdos.csail.mit.edu/papers/parsing%3Apopl04.pdf:

| Parsing expression grammars provide a [...] foundation for [...]
| languages that are designed to be unambiguous.

-manfred

Nov 05 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Friday, 1 November 2013 at 21:22:46 UTC, Andrei Alexandrescu 
wrote:
 ...
 Bugs stopping Pegged from going forward should receive high 
 priority. I also encourage others to participate to that and 
 similar work.


 Andrei

Ack!  I share Philippe's sense of timing.  This would have been 
even nicer to hear a year ago when both of us were actively 
committing ;)

I was going to close a project or two that I care deeply about 
and had started a long time ago, but I see that this might be a 
hard decision.

Nonetheless, I am really glad that you are showing this interest! 
  I like to hear stuff like this, since I too really like Pegged.


Andrei and Philippe,

I feel compelled to share some of my thoughts that I never had 
time to finish back then.

I was working on a parser-engine-as-a-library that could be used 
to as optimized internals for Pegged, as well as any other tools 
that need to recognize these common patterns.

The idea was to expose an API like this:

     string makeParser()
     {
         auto builder = new ParserBuilder!char;
         builder.pushSequence();
             builder.literal('x');
             builder.pushMaybe();
                 builder.literal('y');
             builder.pop();
         builder.pop();
         return builder.toDCode("callMe");
     }

     const foo = makeParser();

     pragma(msg, foo);

     mixin(foo);

That snippet would create a parser that recognizes the grammar 
'x' ( 'y'? ).
The current fledgling implementation creates this parser:
http://pastebin.com/MgSqWXE2

Of course, no one would be expected to write grammars like that.  
It would be the job of tools like Pegged or std.regex to package 
it up in nice syntax that is easy to use.

The code already takes a somewhat different strategy than 
Pegged's original strategy.  Rather than generating a bunch of 
templates that dmd then has to instantiate to actualize the 
parser, it just emits a bunch of very primitive procedural D 
code.  I suspect that this approach would mixin far faster with 
current dmd, because the deeply nested templates generated by 
Pegged seemed to be a bottleneck.  I have to hand it to Philippe 
though for coming up with a very clever way to bootstrap the 
thing: once I saw how his templates assembled together, I 
realized just how convenient that was!

(And I think my parser generator still has to be tought how to 
avoid making redundant branches in its output: there's some hash 
table action that belongs in there somewhere.)

The small amount of code that I have for it is here:
https://github.com/chadjoan/xdc/blob/master/src/xdc/parser_builder/parser_builder.d

I wanted to eventually make it generic enough to recognize 
patterns in things besides strings.  Being able to write grammars 
that recognize patterns in ASTs is /useful/.  That leads into the 
whole xdc project: automate all of the tedious crud in semantic 
analysis, and thus make compiler writing, and possibly other AST 
manipulations in user code, become all much easier.

The other thing I wanted to do was to optimize it.
- I intended it to do no allocations unless the caller asks for 
it.
- I intended to do a bunch of PEG/regex hybridization.

I noticed some mathematical properties of PEGs and regular 
expressions that should allow you to mix the two as much as 
possible.  All you have to do is tell it how to behave at the 
boundaries where they meet.  And given that PEGs already define 
their own behavior pretty well, it would become possible to lower 
a lot of a PEG into regular expressions connected with a minimal 
set of PEG rules.  This would be some awesome lowering: if you 
first do a pass that inlines as many rules as possible, and then 
do a second pass that converts PEG elements into regular elements 
whenever possible, then I feel like the thing will be damned near 
optimal.  If you are wondering what these mathematical properties 
are, then I encourage you to look at this snippet where I define 
"unitary" and "non-unitary" expressions, for lack of prior terms:
http://pastebin.com/iYBypRc5

Another fun thought: PEGs can have look-behind that includes any 
regular elements without any additional algorithmic complexity.  
Just take all of the look-behinds in the grammar, mash them 
together into one big regular-expression using regular 
alternation (|), and then have the resulting automaton consume in 
lock-step with the PEG parser.  Whenever the PEG parser needs to 
do a lookbehind, it just checks to see if the companion automaton 
is in a matching state for the capture it needs.

*sigh*, I feel like I could write a paper on this stuff if I were 
in grad school right now.  Alas, I am stuck doing 50-60 hours a 
week of soul-sucking business programming.  Well, then again, my 
understanding is that even though I can think of things that seem 
like they would make interesting topics for publishable papers, 
reality would have the profs conscript me to do completely 
different things that are possibly just as inane as the business 
programming.

I worry that the greater threat to good AST manipulation tools in 
D is a lack of free time, and not the DMD bugs as much.

I hope this is useful to someone!

Nov 04 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

05-Nov-2013 10:36, Chad Joan пишет:
 On Friday, 1 November 2013 at 21:22:46 UTC, Andrei Alexandrescu wrote:
 ...
 Bugs stopping Pegged from going forward should receive high priority.
 I also encourage others to participate to that and similar work.


 Andrei


[snip]

 I was working on a parser-engine-as-a-library that could be used to as
 optimized internals for Pegged, as well as any other tools that need to
 recognize these common patterns.

 The idea was to expose an API like this:

      string makeParser()
      {
          auto builder = new ParserBuilder!char;
          builder.pushSequence();
              builder.literal('x');
              builder.pushMaybe();
                  builder.literal('y');
              builder.pop();
          builder.pop();
          return builder.toDCode("callMe");
      }

I was also toying with the idea of exposing Builder interface for 
std.regex. But push/pop IMHO are better be implicitly designed-out:

auto re = atom('x').star(charClass(unicode.Letter),atom('y')).build();

... and letting the nesting be explicit.

Is the same as:
auto re = regex(`x(?:\p{L}y)*`);

Aimed for apps/libs that build regular expressions anyway and have no 
need in textual parser.

 That snippet would create a parser that recognizes the grammar 'x' (
 'y'? ).
 The current fledgling implementation creates this parser:
 http://pastebin.com/MgSqWXE2

 Of course, no one would be expected to write grammars like that. It
 would be the job of tools like Pegged or std.regex to package it up in
 nice syntax that is easy to use.

I thought to provide some building blocks for that with new std.uni. Not 
quite everything I wanted, but now at least there is one set of wheels 
less to reinvent.

[snip]

 Another fun thought: PEGs can have look-behind that includes any regular
 elements without any additional algorithmic complexity. Just take all of
 the look-behinds in the grammar, mash them together into one big
 regular-expression using regular alternation (|), and then have the
 resulting automaton consume in lock-step with the PEG parser.  Whenever
 the PEG parser needs to do a lookbehind, it just checks to see if the
 companion automaton is in a matching state for the capture it needs.

Sounds quite slow to do it "just in case". Also complete DFAs tend to be 
mm quite big.

What ANTLR does is similar technique - a regular lookahead to resolve 
ambiguity in the grammar (implicitly). A lot like LL(k) but with 
unlimited length (so called LL(*)). Of course, it generates LL(k) 
disambiguation where possible, then LL(*), failing that the usual 
backtracking.

 *sigh*, I feel like I could write a paper on this stuff if I were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a week of
 soul-sucking business programming.

I heard Sociomantic is hiring D programmers for coding some awesome 
stuff, you may as well apply :)

 Well, then again, my understanding
 is that even though I can think of things that seem like they would make
 interesting topics for publishable papers, reality would have the profs
 conscript me to do completely different things that are possibly just as
 inane as the business programming.

Speaking for my limited experience - at times it's like that.

 I worry that the greater threat to good AST manipulation tools in D is a
 lack of free time, and not the DMD bugs as much.

Good for you I guess, my developments in related area are blocked still :(

 I hope this is useful to someone!


-- 
Dmitry Olshansky

Nov 05 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.olsh gmail.com>wrote:

 I was also toying with the idea of exposing Builder interface for
 std.regex. But push/pop IMHO are better be implicitly designed-out:

 auto re = atom('x').star(charClass(unicode.Letter),atom('y')).build();

 ... and letting the nesting be explicit.

 Is the same as:
 auto re = regex(`x(?:\p{L}y)*`);

 Aimed for apps/libs that build regular expressions anyway and have no need
 in textual parser.

Another possible advantage is to reference external names inside your
construction, thus naming other regexen or refencing external variables to
deposit backreferences inside them. All in all, to get a regex construct
that can interact with the external word.



 What ANTLR does is similar technique - a regular lookahead to resolve
 ambiguity in the grammar (implicitly). A lot like LL(k) but with unlimited
 length (so called LL(*)). Of course, it generates LL(k) disambiguation
 where possible, then LL(*), failing that the usual backtracking.

I liked that idea since the author added it to ANTLR, but I never used it
since.
I wonder whether that can be implemented inside another parser generator or
if it uses some specific-to-ANTLR internal machinery.



 *sigh*, I feel like I could write a paper on this stuff if I were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a week of
 soul-sucking business programming.

 I heard Sociomantic is hiring D programmers for coding some awesome stuff,
 you may as well apply :)




Well, I do 50-60 hours a week of fulfilling and challenging work, but the
free time at the end is the same ;) I feel Chad's pain, though.




  Well, then again, my understanding
 is that even though I can think of things that seem like they would make
 interesting topics for publishable papers, reality would have the profs
 conscript me to do completely different things that are possibly just as
 inane as the business programming.

 Speaking for my limited experience - at times it's like that.


 I worry that the greater threat to good AST manipulation tools in D is a
 lack of free time, and not the DMD bugs as much.

 Good for you I guess, my developments in related area are blocked still :(

Walter is far from convinced that AST manipulation is a good thing. You
would have to convince him first. His fear is that it will lead to
unreadable code, and everyone using her own personnal version of D.

AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have
balkanization, but *not* due to AST manipulation).

Nov 05 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

05-Nov-2013 20:55, Philippe Sigaud пишет:
 On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.olsh gmail.com
 <mailto:dmitry.olsh gmail.com>> wrote:


     I was also toying with the idea of exposing Builder interface for
     std.regex. But push/pop IMHO are better be implicitly designed-out:

     auto re =
     atom('x').star(charClass(__unicode.Letter),atom('y')).__build();

     ... and letting the nesting be explicit.

     Is the same as:
     auto re = regex(`x(?:\p{L}y)*`);

     Aimed for apps/libs that build regular expressions anyway and have
     no need in textual parser.

 Another possible advantage is to reference external names inside your
 construction, thus naming other regexen or refencing external variables
 to deposit backreferences inside them.

Actually it's a bad, bad idea. It has nice potential to destroy all 
optimization opportunities and performance guarantees of it (like being 
linear in time, and that only works today w/o funky extensions used).

After all I'm in a curious position of having to do some work at R-T as 
well where you can't always just generate some D code ;)

What would be real nice though is to let users register their own 
dictionary of 'tokens' from that. Then things like Ipv4 pattern or 
domain name pattern as simple as `\d` pieces they use today (say with 
\i{user-defined-name}).

 All in all, to get a regex
 construct that can interact with the external word.

Well, I think of some rather interesting ways to do it even w/o tying in 
some external stuff as building blocks. It's rather making std.regex 
itself less rigid and more lean (as in cheap to invoke). Then external 
modules may slice and dice its primitives as seen fit.

     What ANTLR does is similar technique - a regular lookahead to
     resolve ambiguity in the grammar (implicitly). A lot like LL(k) but
     with unlimited length (so called LL(*)). Of course, it generates
     LL(k) disambiguation where possible, then LL(*), failing that the
     usual backtracking.

 I liked that idea since the author added it to ANTLR, but I never used
 it since.
 I wonder whether that can be implemented inside another parser generator
 or if it uses some specific-to-ANTLR internal machinery.

I don't think there is much of specific in it. You would though have to 
accept it's no longer a PEG but rather some hybrid top-down EBNF parser 
that resolves ambiguities.

         I worry that the greater threat to good AST manipulation tools
         in D is a
         lack of free time, and not the DMD bugs as much.


     Good for you I guess, my developments in related area are blocked
     still :(

 Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first.

I thought it was about tools that work with D code like say lints, 
refactoring, etc.


-- 
Dmitry Olshansky

Nov 05 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-05 17:55, Philippe Sigaud wrote:

 Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first. His fear is that it will lead to
 unreadable code, and everyone using her own personnal version of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have
 balkanization, but *not* due to AST manipulation).

You know, Walter did a talk about AST macros at the first D conference. 
The idea back then was to add AST macros, hence the "macro" keyword.

-- 
/Jacob Carlborg

Nov 06 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Wednesday, 6 November 2013 at 08:19:13 UTC, Jacob Carlborg 
wrote:
 On 2013-11-05 17:55, Philippe Sigaud wrote:

 Walter is far from convinced that AST manipulation is a good 
 thing. You
 would have to convince him first. His fear is that it will 
 lead to
 unreadable code, and everyone using her own personnal version 
 of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers 
 have
 balkanization, but *not* due to AST manipulation).

 You know, Walter did a talk about AST macros at the first D 
 conference. The idea back then was to add AST macros, hence the 
 "macro" keyword.

Also, IIRC, it is believed that string mixins with CTFE are 
potentially more powerful.  I am under the assumption that Walter 
is taking the long view and waiting for the community to furnish 
their own powerful AST manipulation tools using the existing 
spec.  I suspect that he is opposed to baking AST manipulation 
into the /language spec/, but is perfectly accepting of the 
notion of using AST manipulation to generate code, reduce 
boilerplate, implement exotic features and DSLs, and so on.  Just 
don't complicate the core language any more than it already is.  
Sorry if I misrepresented you Walter; I can only make educated 
guesses ;)

Nov 06 2013

"Chris Cain" <clcain uncg.edu> writes:

On Wednesday, 6 November 2013 at 08:34:32 UTC, Chad Joan wrote:
 Also, IIRC, it is believed that string mixins with CTFE are 
 potentially more powerful.

Maybe. But first, the point that Brian Schott brought up has to 
be addressed (language spec and DMD must be in sync). Second, 
CTFE has to get more efficient. It's _very_ easy to make it run 
out of memory and it's fairly slow at executing. But yeah, I 
agree that there could be a library AST manipulator and it'd be 
pretty nice. But it'll require that two sets of parsers in two 
different languages be kept in sync, which is likely to be a 
challenge.

There's pros and cons to both of the approaches, really.

Nov 06 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Wednesday, 6 November 2013 at 09:09:32 UTC, Chris Cain wrote:
 On Wednesday, 6 November 2013 at 08:34:32 UTC, Chad Joan wrote:
 Also, IIRC, it is believed that string mixins with CTFE are 
 potentially more powerful.

 Maybe. But first, the point that Brian Schott brought up has to 
 be addressed (language spec and DMD must be in sync). Second, 
 CTFE has to get more efficient. It's _very_ easy to make it run 
 out of memory and it's fairly slow at executing. But yeah, I 
 agree that there could be a library AST manipulator and it'd be 
 pretty nice. But it'll require that two sets of parsers in two 
 different languages be kept in sync, which is likely to be a 
 challenge.

 There's pros and cons to both of the approaches, really.

Right.

My thoughts are, "be patient".  The slow CTFE can be fixed 
without changes to the language spec, and it has progressed 
really well over the last couple years.  The DMD vs spec thing is 
also fixable by building tools that operate on a formal grammar 
and forcing the grammar to be thoroughly explored.  At the end of 
the day, I believe these will both be pleasantly fixed, and the 
spec won't end up with a bunch of complexity that was "useful 10 
years ago".  It all makes sense to me, at least.

Nov 06 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Wed, Nov 6, 2013 at 9:34 AM, Chad Joan <chadjoan gmail.com> wrote:

 On Wednesday, 6 November 2013 at 08:19:13 UTC, Jacob Carlborg wrote:

 On 2013-11-05 17:55, Philippe Sigaud wrote:

  Walter is far from convinced that AST manipulation is a good thing. You
 would have to convince him first. His fear is that it will lead to
 unreadable code, and everyone using her own personnal version of D.
 AFAICT, nothing of the sort happened in Lisp (I mean, Lispers have
 balkanization, but *not* due to AST manipulation).

 You know, Walter did a talk about AST macros at the first D conference.
 The idea back then was to add AST macros, hence the "macro" keyword.

 Also, IIRC, it is believed that string mixins with CTFE are potentially
 more powerful.  I am under the assumption that Walter is taking the long
 view and waiting for the community to furnish their own powerful AST
 manipulation tools using the existing spec.  I suspect that he is opposed
 to baking AST manipulation into the /language spec/, but is perfectly
 accepting of the notion of using AST manipulation to generate code, reduce
 boilerplate, implement exotic features and DSLs, and so on.  Just don't
 complicate the core language any more than it already is.  Sorry if I
 misrepresented you Walter; I can only make educated guesses ;)

Well, I remember him saying that he doesn't want everybody and their dog to
change the D syntax. If we get macros, that's exactly what we can do. For
example, telling the compiler (or another tool), to rewrite

int a := 1;

into

immutable a = 1;

But that means the original not-D code cannot be compiled by anyone with a
D compiler anymore (obviously). And soon everybody is transforming the
grammar according to what they think is cool.
That's his PoV IIRC.

I'm more interested in getting to ability to define my own lowerings: in
the same way that scope statements or foreach over ranges are rewritten
into lower-level D code by the comiler, I would like to be able to define
my own syntactic constructs.

Plus, everybody here know that AST macros can cure world hunger :) Wouldn't
that be nice?

Nov 06 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-07 00:23, Philippe Sigaud wrote:

 Well, I remember him saying that he doesn't want everybody and their dog
 to change the D syntax. If we get macros, that's exactly what we can do.
 For example, telling the compiler (or another tool), to rewrite

 int a := 1;

 into

 immutable a = 1;

It depends on what the AST macros allows you to do. I guess everyone 
here has her/his own definition.

 But that means the original not-D code cannot be compiled by anyone with
 a D compiler anymore (obviously). And soon everybody is transforming the
 grammar according to what they think is cool.
 That's his PoV IIRC.

 I'm more interested in getting to ability to define my own lowerings: in
 the same way that scope statements or foreach over ranges are rewritten
 into lower-level D code by the comiler, I would like to be able to
 define my own syntactic constructs.

My idea of AST macros[1] is quite similar to those in Scala and a bit 
similar to what you want. The most important thing to remember is that 
AST macros cannot create new syntax. It only works at the semantic level 
of the language. That means the above example you wrote doesn't not 
work, since that isn't legal D syntax. What would work is doing 
something like this:

auto a = Person.where(e => e.name == "John");

In this case "where" is a macro. Which translates the given AST into the 
following SQL query:

select * from person where name = 'John'

Another thing I would really like is to be able to pass a delegate to a 
function after argument list, if the function takes a delegate as its 
last parameter:

void each (T) (T[] arr, void delegate (T) dg);

[1, 2, 3].each(e) {
     writeln(e);
}

Of courses, the compiler needs to be able to inline the delegate.

 Plus, everybody here know that AST macros can cure world hunger :)
 Wouldn't that be nice?

Of course :)

[1] https://dl.dropboxusercontent.com/u/18386187/ast_macros.html

-- 
/Jacob Carlborg

Nov 07 2013

"Martin Nowak" <code dawg.eu> writes:

On Thursday, 7 November 2013 at 13:50:57 UTC, Jacob Carlborg 
wrote:
 https://dl.dropboxusercontent.com/u/18386187/ast_macros.html

Can you pour that into a DIP?

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?

Absolutely, I've been meaning to do that.

-- 
/Jacob Carlborg

Nov 10 2013

"deadalnix" <deadalnix gmail.com> writes:

On Sunday, 10 November 2013 at 17:58:35 UTC, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?

 Absolutely, I've been meaning to do that.

It require a bit more precision. But as you know I really behind 
the idea.

Nov 10 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 9:58 AM, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?

 Absolutely, I've been meaning to do that.

Just a thought - I think the best strategy for the time being is to 
explore gainful use of the features already present, instead of adding 
new features.

Andrei

Nov 10 2013

"deadalnix" <deadalnix gmail.com> writes:

On Sunday, 10 November 2013 at 18:31:32 UTC, Andrei Alexandrescu 
wrote:
 On 11/10/13 9:58 AM, Jacob Carlborg wrote:
 On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?

 Absolutely, I've been meaning to do that.

 Just a thought - I think the best strategy for the time being 
 is to explore gainful use of the features already present, 
 instead of adding new features.

 Andrei

It is also a given that many feature could have been avoided in 
the first place given that one.

Still I'd rather see thing settle down a little before 
introducing this. Fix the existing things, then introduce new 
stuff seems like the proper way to go.

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-10 19:44, deadalnix wrote:

 It is also a given that many feature could have been avoided in the
 first place given that one.

I agree. It seems at least a couple of features could have been 
implemented using AST macros instead.

-- 
/Jacob Carlborg

Nov 10 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sun, Nov 10, 2013 at 12:27 PM, Jacob Carlborg <doob me.com> wrote:

 On 2013-11-10 19:44, deadalnix wrote:

  It is also a given that many feature could have been avoided in the
 first place given that one.

 I agree. It seems at least a couple of features could have been
 implemented using AST macros instead.

 --
 /Jacob Carlborg

I agree; eg, just to name 2 features that I proposed that could be simply
implemented via AST macros as Jacob said:
* [proposal: a new string litteral to embed variables in a
string<http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>

http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com
]
* [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
__FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
macros.

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-10 21:38, Timothee Cour wrote:

 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>
 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]
 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

Let see, what else. These could, most likely, have been implemented 
using AST macros:

* scope
* foreach
* for
* inlining delegates. I think the current solution to pass a 
delegate/lambda as an alias parameter is a bit weird

One of my favorites that is not already in the language is implementing 
database queries:

Person.where(e => e.name == "John");

Will be translate to the following SQL:

select * from person where name = 'John'

Another one is the property shortcut, see the example for attribute 
macros: http://wiki.dlang.org/DIP50#Attribute_macros

-- 
/Jacob Carlborg

Nov 10 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 1:17 PM, Jacob Carlborg wrote:
 On 2013-11-10 21:38, Timothee Cour wrote:

 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>

 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]

 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

 Let see, what else. These could, most likely, have been implemented
 using AST macros:

 * scope
 * foreach
 * for
 * inlining delegates. I think the current solution to pass a
 delegate/lambda as an alias parameter is a bit weird

 One of my favorites that is not already in the language is implementing
 database queries:

 Person.where(e => e.name == "John");

 Will be translate to the following SQL:

 select * from person where name = 'John'

 Another one is the property shortcut, see the example for attribute
 macros: http://wiki.dlang.org/DIP50#Attribute_macros

The way I see this is, these preexisting features (which shan't be 
thrown away) diminish the motivation for adding another feature.

Andrei

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-10 23:49, Andrei Alexandrescu wrote:

 The way I see this is, these preexisting features (which shan't be
 thrown away) diminish the motivation for adding another feature.

I think the opposite. It shows how powerful macros are and that they 
will most likely be able to implement other new features that haven't 
been proposed yet.

Adding a bunch of new features to the language instead of taking a step 
back and thinking if something else can be added that implement these 
features and many more shows bad judgement.

-- 
/Jacob Carlborg

Nov 10 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 11:55 PM, Jacob Carlborg wrote:
 On 2013-11-10 23:49, Andrei Alexandrescu wrote:

 The way I see this is, these preexisting features (which shan't be
 thrown away) diminish the motivation for adding another feature.

 I think the opposite. It shows how powerful macros are and that they
 will most likely be able to implement other new features that haven't
 been proposed yet.

 Adding a bunch of new features to the language instead of taking a step
 back and thinking if something else can be added that implement these
 features and many more shows bad judgement.

No need to be snarky about it. The point here is that there is 
significant difficulty to remove features that already exist and are 
useful, which changes the game considerably.

Andrei

Nov 11 2013

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 12 November 2013 at 02:34:52 UTC, Andrei Alexandrescu 
wrote:
 No need to be snarky about it. The point here is that there is 
 significant difficulty to remove features that already exist 
 and are useful, which changes the game considerably.

 Andrei

That is true. However, please understand that this is extremely 
frustrating to have a lot of gadget features that solve a 
specific issue when much more generic solutions exists, and when 
these gadgets actually makes it more difficult to introduce the 
proper features.

If we put aside the AST macro thing (granted we have a swiss army 
knife of gadgets for many of its typical uses cases), several 
other skeleton don't want to stay in the closet.

If I has to pick 2:
tail qualified type parameter are one. It is impossible to have a 
use defined type that behave as slices/pointers. That is 
showstopper for a proper collection library. I have only ugly 
solution to propose to that now. Lately, I was plying with thing 
along the lines of struct S[T] { auto foo() const[T] { ... } } 
I'm afraid that what already exist in that area makes things 
quite difficult.

The lack of qualifier for ownership, which makes concurrency and 
interface between manual memory management and GC tedious. For 
that one, isolated seems the way to go.

These same issue continue to pop up in many different forms (for 
instance, see Kenji's DIP, where he is trying to hack around some 
gadget to unique postblit work).

Nov 11 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/11/13 9:11 PM, deadalnix wrote:
 On Tuesday, 12 November 2013 at 02:34:52 UTC, Andrei Alexandrescu wrote:
 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.

 Andrei

 That is true. However, please understand that this is extremely
 frustrating to have a lot of gadget features that solve a specific issue
 when much more generic solutions exists, and when these gadgets actually
 makes it more difficult to introduce the proper features.

Frustration is inappropriate. Probably "bummed" would be more fit for 
the situation. (Which I am not, but that's a different story; I 
personally don't think macros are all they're cracked to be.)

We did not have a macro-based solution when D was being invented, and I 
don't think there's a lot of glory in seeing patterns for doing things 
differently after the language has been in use for a while. There's a 
whole water under the bridge between now and then.

 If we put aside the AST macro thing (granted we have a swiss army knife
 of gadgets for many of its typical uses cases), several other skeleton
 don't want to stay in the closet.

 If I has to pick 2:
 tail qualified type parameter are one. It is impossible to have a use
 defined type that behave as slices/pointers. That is showstopper for a
 proper collection library. I have only ugly solution to propose to that
 now. Lately, I was plying with thing along the lines of struct S[T] {
 auto foo() const[T] { ... } } I'm afraid that what already exist in that
 area makes things quite difficult.

 The lack of qualifier for ownership, which makes concurrency and
 interface between manual memory management and GC tedious. For that one,
 isolated seems the way to go.

 These same issue continue to pop up in many different forms (for
 instance, see Kenji's DIP, where he is trying to hack around some gadget
 to unique postblit work).

Our goals for the time being are to improve stability, close gaps in the 
type system, and make good use of features in code. To the extent we 
can't do what we want to do, the topics you mention are fit.


Andrei

Nov 11 2013

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 12 November 2013 at 05:58:02 UTC, Andrei Alexandrescu 
wrote:
 If I has to pick 2:
 tail qualified type parameter are one. It is impossible to 
 have a use
 defined type that behave as slices/pointers. That is 
 showstopper for a
 proper collection library. I have only ugly solution to 
 propose to that
 now. Lately, I was plying with thing along the lines of struct 
 S[T] {
 auto foo() const[T] { ... } } I'm afraid that what already 
 exist in that
 area makes things quite difficult.

 The lack of qualifier for ownership, which makes concurrency 
 and
 interface between manual memory management and GC tedious. For 
 that one,
 isolated seems the way to go.

 These same issue continue to pop up in many different forms 
 (for
 instance, see Kenji's DIP, where he is trying to hack around 
 some gadget
 to unique postblit work).

 Our goals for the time being are to improve stability, close 
 gaps in the type system, and make good use of features in code. 
 To the extent we can't do what we want to do, the topics you 
 mention are fit.


 Andrei

Yes, as much as I'd like to have macro, I don't think this is the 
right time now. Closing the gaps is more important.

Nov 11 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 03:34, Andrei Alexandrescu wrote:

 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.

I'm really trying to be as humble as I can but you're not making it easy.

I have not said we should remove any of the existing features. I'm 
trying to show that several already existing features could have been 
implemented using macros if they existed early in the development. It 
should show that it is a powerful feature and can hopefully be used to 
solve new ides with library code instead changing the languages.

-- 
/Jacob Carlborg

Nov 11 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/11/13 11:46 PM, Jacob Carlborg wrote:
 On 2013-11-12 03:34, Andrei Alexandrescu wrote:

 No need to be snarky about it. The point here is that there is
 significant difficulty to remove features that already exist and are
 useful, which changes the game considerably.

 I'm really trying to be as humble as I can but you're not making it easy.

 I have not said we should remove any of the existing features. I'm
 trying to show that several already existing features could have been
 implemented using macros if they existed early in the development. It
 should show that it is a powerful feature and can hopefully be used to
 solve new ides with library code instead changing the languages.

Fine, although a sense of futility is hard to shake seeing as we won't 
replace those existing features. I think a much stronger point would be 
made if the power of the feature were demonstrated on problems not 
accessible with the existing ones.

About DIP 50: I will say "no" but please do not take it personally. It's 
great there is discussion about this, and I encourage it, but at this 
time I think we should make no plans to add macros to D.


Andrei

Nov 11 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 08:52, Andrei Alexandrescu wrote:

 Fine, although a sense of futility is hard to shake seeing as we won't
 replace those existing features. I think a much stronger point would be
 made if the power of the feature were demonstrated on problems not
 accessible with the existing ones.

You just said we shouldn't replace existing features.

"The point here is that there is
significant difficulty to remove features that already exist"

http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=9#post-l5s44b:242c36:241:40digitalmars.com

 About DIP 50: I will say "no" but please do not take it personally. It's
 great there is discussion about this, and I encourage it, but at this
 time I think we should make no plans to add macros to D.

I don't think we should add macros now either. This proposal is far from 
ready. If Martin hadn't suggested I should create a DIP, I wouldn't 
have, at least now at this time.

BTW, just saying "no" doesn't help a bit. You could just have said 
"foo". That's extremely annoying. You're shooting down very many 
suggestions/proposal/ideas with just a "no", or the more elaborated 
answer "no, never going to happen".

On the other hand when you have a proposal it should be consider 
pre-approved and is a more of a FYI.

-- 
/Jacob Carlborg

Nov 12 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/12/13 12:02 AM, Jacob Carlborg wrote:
On 2013-11-12 08:52, Andrei Alexandrescu wrote:

Fine, although a sense of futility is hard to shake seeing as we won't
replace those existing features. I think a much stronger point would be
made if the power of the feature were demonstrated on problems not
accessible with the existing ones.

You just said we shouldn't replace existing features.

"The point here is that there is
significant difficulty to remove features that already exist"

http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=9#post-l5s44b:242c36:241:40digitalmars.com

Yes. So I said. I don't get why you'd provide a link - it's in my text
that you quote. Indeed, we shouldn't replace existing features.

About DIP 50: I will say "no" but please do not take it personally. It's
great there is discussion about this, and I encourage it, but at this
time I think we should make no plans to add macros to D.

I don't think we should add macros now either. This proposal is far from
ready. If Martin hadn't suggested I should create a DIP, I wouldn't
have, at least now at this time.

Fine.

BTW, just saying "no" doesn't help a bit. You could just have said
"foo". That's extremely annoying. You're shooting down very many
suggestions/proposal/ideas with just a "no", or the more elaborated
answer "no, never going to happen".

On the other hand when you have a proposal it should be consider
pre-approved and is a more of a FYI.

So how could we express a "no" that doesn't annoy you in the extreme? In
case the answer would be "you haven't explained why", allow me to retort.

I've mentioned the argument before: at this point we should focus on
quality of implementation and making good use of the features we have.
In fact I am repeating myself: http://goo.gl/1thq1j. As has been
publicly known for a while now, our strategy has been to improve quality
and to double down on the assets we have. People ask for a roadmap, and
what's missing from a roadmap is as important as what's there.

This is a strategy that Walter and I agree with, have been transparent
about, and that may work or not, with various degrees of success.
Reasonable people may disagree what the best step moving forward should
be, but at some point some step must be made and we can't adopt your
strategy, with which we disagree, as our strategy, just to be nice and
not offend your sensibility. (I'm using "we" here because Walter and I
discussed this at large.) There must be a way to say "no" that doesn't
offend you. Please advise what that is.

Andrei

Nov 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 09:13, Andrei Alexandrescu wrote:

 So how could we express a "no" that doesn't annoy you in the extreme? In
 case the answer would be "you haven't explained why", allow me to retort.

 I've mentioned the argument before: at this point we should focus on
 quality of implementation and making good use of the features we have.
 In fact I am repeating myself: http://goo.gl/1thq1j. As has been
 publicly known for a while now, our strategy has been to improve quality
 and to double down on the assets we have. People ask for a roadmap, and
 what's missing from a roadmap is as important as what's there.

 This is a strategy that Walter and I agree with, have been transparent
 about, and that may work or not, with various degrees of success.
 Reasonable people may disagree what the best step moving forward should
 be, but at some point some step must be made and we can't adopt your
 strategy, with which we disagree, as our strategy, just to be nice and
 not offend your sensibility. (I'm using "we" here because Walter and I
 discussed this at large.) There must be a way to say "no" that doesn't
 offend you. Please advise what that is.

Thank you for the explanation. I do understand your and Walter's 
position in this.

Just giving a short reason, to accompany the "no" with helps a lot. It 
doesn't need to be as long as the explanation above, just a sentence, like:

"no, at this time we don't want to make such a big change, we're trying 
to stabilize".

This is not just for me. I'm hoping proposal from others also can get a 
fair reason to why the "no".

-- 
/Jacob Carlborg

Nov 12 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/12/13 12:31 AM, Jacob Carlborg wrote:
 Just giving a short reason, to accompany the "no" with helps a lot. It
 doesn't need to be as long as the explanation above, just a sentence, like:

 "no, at this time we don't want to make such a big change, we're trying
 to stabilize".

 This is not just for me. I'm hoping proposal from others also can get a
 fair reason to why the "no".

OK, thanks. I'll do my best to improve on that in the future.

Andrei

Nov 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 09:37, Andrei Alexandrescu wrote:

 OK, thanks. I'll do my best to improve on that in the future.

Thanks again. Sorry for being frustrated.

-- 
/Jacob Carlborg

Nov 12 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 12:38 PM, Timothee Cour wrote:
 On Sun, Nov 10, 2013 at 12:27 PM, Jacob Carlborg <doob me.com
 <mailto:doob me.com>> wrote:

     On 2013-11-10 19:44, deadalnix wrote:

         It is also a given that many feature could have been avoided in the
         first place given that one.


     I agree. It seems at least a couple of features could have been
     implemented using AST macros instead.

     --
     /Jacob Carlborg


 I agree; eg, just to name 2 features that I proposed that could be
 simply implemented via AST macros as Jacob said:
 * [proposal: a new string litteral to embed variables in a string
 <http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com>
 http://forum.dlang.org/post/mailman.94.1383254681.9546.digitalmars-d puremagic.com]
 * [Re: feature request: __ARGS__ for logging (cf __FILE__, __LINE__,
 __FUNC___ http://forum.dlang.org/post/kegmp0$30dn$1 digitalmars.com]

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

This would be a very small advantage. The special variables hardly cause 
any trouble.

Andrei

Nov 10 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

 This would be a very small advantage. The special variables hardly cause
 any trouble.

 Andrei

The trouble is that there are too few of them to (efficiently) reflect 
all the information about the calling context one may be interested in. 
I don't see the point in adding them one after another in an 
unprincipled ad-hoc fashion instead of actually thinking up a good API 
coupled with some more expressive power.

Nov 10 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 3:21 PM, Timon Gehr wrote:
 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

 This would be a very small advantage. The special variables hardly cause
 any trouble.

 Andrei

 The trouble is that there are too few of them to (efficiently) reflect
 all the information about the calling context one may be interested in.
 I don't see the point in adding them one after another in an
 unprincipled ad-hoc fashion instead of actually thinking up a good API
 coupled with some more expressive power.

But the API would add them in exactly the same rote manner. I really 
don't see an advantage here. An argument for macros would have to do a 
lot more than "we don't need __FILE__ etc. anymore".

Andrei

Nov 10 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sun, Nov 10, 2013 at 3:37 PM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/10/13 3:21 PM, Timon Gehr wrote:

 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:

 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

 This would be a very small advantage. The special variables hardly cause
 any trouble.

 Andrei

 The trouble is that there are too few of them to (efficiently) reflect
 all the information about the calling context one may be interested in.
 I don't see the point in adding them one after another in an
 unprincipled ad-hoc fashion instead of actually thinking up a good API
 coupled with some more expressive power.

 But the API would add them in exactly the same rote manner. I really don't
 see an advantage here.


All else being equal, a library solution is always preferable; and it's
easily extensible unlike what we have.


 An argument for macros would have to do a lot more than "we don't need
 __FILE__ etc. anymore".


More use cases have already been provided in this thread and previously.
The fact that *one* language feature enables all of those is a very good
indication there'll be more use cases. Furthermore, writing AST macros
instead of mixins is cleaner in terms of introspection, refactoring tools,
etc.



 Andrei

Nov 10 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/10/13 5:03 PM, Timothee Cour wrote:
 On Sun, Nov 10, 2013 at 3:37 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 11/10/13 3:21 PM, Timon Gehr wrote:

         On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:


                 It seems we could even get rid of
                 __FILE__,__LINE__,__MODULE__ with AST
                 macros.


             This would be a very small advantage. The special variables
             hardly cause
             any trouble.

             Andrei


         The trouble is that there are too few of them to (efficiently)
         reflect
         all the information about the calling context one may be
         interested in.
         I don't see the point in adding them one after another in an
         unprincipled ad-hoc fashion instead of actually thinking up a
         good API
         coupled with some more expressive power.


     But the API would add them in exactly the same rote manner. I really
     don't see an advantage here.


 All else being equal, a library solution is always preferable; and it's
 easily extensible unlike what we have.

What do you have in mind? I find it difficult to grab e.g. the file and 
the line unless there's some plumbing in the language that allows you to.

     An argument for macros would have to do a lot more than "we don't
     need __FILE__ etc. anymore".


 More use cases have already been provided in this thread and previously.
 The fact that *one* language feature enables all of those is a very good
 indication there'll be more use cases. Furthermore, writing AST macros
 instead of mixins is cleaner in terms of introspection, refactoring
 tools, etc.

I am aware of the advantages. You, too, should be aware of the 
disadvantages.


Andrei

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-11 02:54, Andrei Alexandrescu wrote:

 What do you have in mind? I find it difficult to grab e.g. the file and
 the line unless there's some plumbing in the language that allows you to.

The language just need to provide a way to introspect the AST. If we're 
talking about a specific API I can think of something like this:

macro foo (Context context, Ast!(string) str)
{
     auto line = context.caller.line;
     auto file = context.caller.file;
}

foo("asd");

-- 
/Jacob Carlborg

Nov 11 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/11/13 12:00 AM, Jacob Carlborg wrote:
 On 2013-11-11 02:54, Andrei Alexandrescu wrote:

 What do you have in mind? I find it difficult to grab e.g. the file and
 the line unless there's some plumbing in the language that allows you to.

 The language just need to provide a way to introspect the AST. If we're
 talking about a specific API I can think of something like this:

 macro foo (Context context, Ast!(string) str)
 {
      auto line = context.caller.line;
      auto file = context.caller.file;
 }

 foo("asd");

So... there is rote addition to the context.caller structure. It's just 
spelled differently. No?

Andrei

Nov 11 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 03:35, Andrei Alexandrescu wrote:

 So... there is rote addition to the context.caller structure. It's just
 spelled differently. No?

Have you seen the other posts? Have even read the proposal? If you think 
that I just want a different syntax for __LINE__ and __FILE__ you're 
way, way off. That's just an extremely small part of the proposal. 
__LINE__ and __FILE__ would not be removed.

-- 
/Jacob Carlborg

Nov 11 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/11/13 11:50 PM, Jacob Carlborg wrote:
 On 2013-11-12 03:35, Andrei Alexandrescu wrote:

 So... there is rote addition to the context.caller structure. It's just
 spelled differently. No?

 Have you seen the other posts?

Yes. Don't assume whoever disagrees with you is misinformed.

 Have even read the proposal?

Yes.

 If you think
 that I just want a different syntax for __LINE__ and __FILE__ you're
 way, way off. That's just an extremely small part of the proposal.
 __LINE__ and __FILE__ would not be removed.

This is putting words in my mouth.


Andrei

Nov 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 09:00, Andrei Alexandrescu wrote:

 Yes. Don't assume whoever disagrees with you is misinformed.

I absolutely do not. But when I read this:

"An argument for macros would have to do a
lot more than 'we don't need __FILE__ etc. anymore'"

I find it hard to believe anything else.

http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=8#post-l5p5b8:241kju:241:40digitalmars.com

-- 
/Jacob Carlborg

Nov 12 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/12/13 12:09 AM, Jacob Carlborg wrote:
 On 2013-11-12 09:00, Andrei Alexandrescu wrote:

 Yes. Don't assume whoever disagrees with you is misinformed.

 I absolutely do not. But when I read this:

 "An argument for macros would have to do a
 lot more than 'we don't need __FILE__ etc. anymore'"

 I find it hard to believe anything else.

 http://forum.dlang.org/thread/bwsofbnigfbrxwouiobj forum.dlang.org?page=8#post-l5p5b8:241kju:241:40digitalmars.com

It's very simple. Timon got into that part as if it were important. I 
pointed out it's not important.

Andrei

Nov 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-12 09:14, Andrei Alexandrescu wrote:

 It's very simple. Timon got into that part as if it were important. I
 pointed out it's not important.

Ok, fair enough. I'm sorry for wrongly interpreting it.

-- 
/Jacob Carlborg

Nov 12 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/12/2013 09:14 AM, Andrei Alexandrescu wrote:
 On 11/12/13 12:09 AM, Jacob Carlborg wrote:
 ...

 "An argument for macros would have to do a
 lot more than 'we don't need __FILE__ etc. anymore'"

 ...

 It's very simple.  Timon got into that part as if it were important. I pointed
out it's 

not important.
 ..

This is a misrepresentation. I'll just answer that post.

Nov 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-11 00:37, Andrei Alexandrescu wrote:

 But the API would add them in exactly the same rote manner. I really
 don't see an advantage here. An argument for macros would have to do a
 lot more than "we don't need __FILE__ etc. anymore".

They do, and several examples have already been shown. I'll also add 
more as I come up with them.

-- 
/Jacob Carlborg

Nov 10 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/11/2013 12:37 AM, Andrei Alexandrescu wrote:
 On 11/10/13 3:21 PM, Timon Gehr wrote:
 On 11/10/2013 11:43 PM, Andrei Alexandrescu wrote:
 It seems we could even get rid of __FILE__,__LINE__,__MODULE__ with AST
 macros.

 This would be a very small advantage. The special variables hardly cause
 any trouble.

 Andrei

 The trouble is that there are too few of them to (efficiently) reflect
 all the information about the calling context one may be interested in.
 I don't see the point in adding them one after another in an
 unprincipled ad-hoc fashion instead of actually thinking up a good API
 coupled with some more expressive power.

 But the API would add them in exactly the same rote manner.

No, it would be (close to) exhaustive from the beginning. That is the 
point. I assume the problem is that there have not yet been very 
specific proposals.

 I really don't see an advantage here.

It's the difference between having an universal language, and a language 
that has a finite and fixed set of words (without any methods of 
combination).


 An argument for macros would have to do a
 lot more than "we don't need __FILE__ etc. anymore".
 ...

Obviously, but I was not making this argument, and as far as I can see 
nobody else was either.

Nov 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-10 17:37, Martin Nowak wrote:

 Can you pour that into a DIP?

Done: http://forum.dlang.org/thread/l5otb1$1dhi$1 digitalmars.com

-- 
/Jacob Carlborg

Nov 10 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-06 09:34, Chad Joan wrote:

 Also, IIRC, it is believed that string mixins with CTFE are potentially
 more powerful.  I am under the assumption that Walter is taking the long
 view and waiting for the community to furnish their own powerful AST
 manipulation tools using the existing spec.  I suspect that he is
 opposed to baking AST manipulation into the /language spec/, but is
 perfectly accepting of the notion of using AST manipulation to generate
 code, reduce boilerplate, implement exotic features and DSLs, and so
 on.  Just don't complicate the core language any more than it already
 is. Sorry if I misrepresented you Walter; I can only make educated
 guesses ;)

The problem with this is the ugly syntax, it requires a string mixin and 
the source code needs to be embedded in strings. You also need a 
complete front end that works at compile time.

This is my vision of AST macros: 
https://dl.dropboxusercontent.com/u/18386187/ast_macros.html

-- 
/Jacob Carlborg

Nov 07 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Wed, Nov 6, 2013 at 9:19 AM, Jacob Carlborg <doob me.com> wrote:

 You know, Walter did a talk about AST macros at the first D conference.
 The idea back then was to add AST macros, hence the "macro" keyword.


I know. But since then, I guess he changed his mind, or thought about it a
bit more :-)

Nov 06 2013

"Chad Joan" <chadjoan gmail.com> writes:

On Tuesday, 5 November 2013 at 14:54:34 UTC, Dmitry Olshansky 
wrote:
 I was also toying with the idea of exposing Builder interface 
 for std.regex. But push/pop IMHO are better be implicitly 
 designed-out:

 auto re = 
 atom('x').star(charClass(unicode.Letter),atom('y')).build();

 ... and letting the nesting be explicit.

 Is the same as:
 auto re = regex(`x(?:\p{L}y)*`);

 Aimed for apps/libs that build regular expressions anyway and 
 have no need in textual parser.

Interesting.  I like how it induces some amount of static 
verification, though I worry that it could harm procedural 
generation of grammars.  It would be difficult, for instance, to 
use that API to do the equivalent of pushing an atom in one 
function and popping it in another.

I wonder if we are at different levels of abstraction.  The 
example you give seems like it requires the API to remember, in a 
structured way, all of the information presented by the 
call-chaining and call-nesting.  I might implement something like 
that with a stateful "builder" object under the hood.  However, 
the example you give seems like it is closer to what a regex 
engine would morph an expression into, thus making it a higher 
abstraction.

 That snippet would create a parser that recognizes the grammar 
 'x' (
 'y'? ).
 The current fledgling implementation creates this parser:
 http://pastebin.com/MgSqWXE2

 Of course, no one would be expected to write grammars like 
 that. It
 would be the job of tools like Pegged or std.regex to package 
 it up in
 nice syntax that is easy to use.

 I thought to provide some building blocks for that with new 
 std.uni. Not quite everything I wanted, but now at least there 
 is one set of wheels less to reinvent.

I haven't looked at std.uni earnestly yet, but if it succeeds at 
making that unicode/utf jungle manageable, then I will be 
incredibly thankful.

 [snip]

 Another fun thought: PEGs can have look-behind that includes 
 any regular
 elements without any additional algorithmic complexity. Just 
 take all of
 the look-behinds in the grammar, mash them together into one 
 big
 regular-expression using regular alternation (|), and then 
 have the
 resulting automaton consume in lock-step with the PEG parser.  
 Whenever
 the PEG parser needs to do a lookbehind, it just checks to see 
 if the
 companion automaton is in a matching state for the capture it 
 needs.

 Sounds quite slow to do it "just in case". Also complete DFAs 
 tend to be mm quite big.

I was envisioning it being done lazily and strictly as-needed, if 
I even got around to doing it at all.

 What ANTLR does is similar technique - a regular lookahead to 
 resolve ambiguity in the grammar (implicitly). A lot like LL(k) 
 but with unlimited length (so called LL(*)). Of course, it 
 generates LL(k) disambiguation where possible, then LL(*), 
 failing that the usual backtracking.

Neat.

 *sigh*, I feel like I could write a paper on this stuff if I 
 were in
 grad school right now.  Alas, I am stuck doing 50-60 hours a 
 week of
 soul-sucking business programming.

 I heard Sociomantic is hiring D programmers for coding some 
 awesome stuff, you may as well apply :)

Tempting.

And it seems even Facebook is joining the market now, which is 
news to me.

 Well, then again, my understanding
 is that even though I can think of things that seem like they 
 would make
 interesting topics for publishable papers, reality would have 
 the profs
 conscript me to do completely different things that are 
 possibly just as
 inane as the business programming.

 Speaking for my limited experience - at times it's like that.

Yay, someone's observations corroborate mine!
...
Crap, someone observations corroborate mine :(

;)

 I worry that the greater threat to good AST manipulation tools 
 in D is a
 lack of free time, and not the DMD bugs as much.

 Good for you I guess, my developments in related area are 
 blocked still :(

Well, hopefully you've got the wind at your back now.

Nov 06 2013

Martin Nowak <code dawg.eu> writes:

On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to XML to C to D.

Sounds promising.
I think I found a way to implement a D REPL but are lacking a reliable 
way to classify code snippets (expressions, declarations, statements) 
and get introduced symbol names.
Do you think it will be able to handle that (performance is not an issue)?

If there is no parser I either have to use a crude heuristic
or I could use machine learning to classify the code based on token 
histograms (after all we have at least 2 D lexers).

Nov 05 2013

"cal" <callumenator gmail.com> writes:

On Tuesday, 5 November 2013 at 17:23:23 UTC, Martin Nowak wrote:
 On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 The examples directory shows different grammars, from JSON to 
 XML to C to D.

 Sounds promising.
 I think I found a way to implement a D REPL but are lacking a 
 reliable way to classify code snippets (expressions, 
 declarations, statements) and get introduced symbol names.
 Do you think it will be able to handle that (performance is not 
 an issue)?

I did similar using Pegged little while back
(https://github.com/callumenator/dabble), I tuned the PEG grammar
quite a lot, but it worked ok.

Nov 05 2013

Jacob Carlborg <doob me.com> writes:

On 2013-11-05 18:23, Martin Nowak wrote:

 Sounds promising.
 I think I found a way to implement a D REPL but are lacking a reliable
 way to classify code snippets (expressions, declarations, statements)
 and get introduced symbol names.
 Do you think it will be able to handle that (performance is not an issue)?

 If there is no parser I either have to use a crude heuristic
 or I could use machine learning to classify the code based on token
 histograms (after all we have at least 2 D lexers).

There's a parser available, DScanner: 
https://github.com/Hackerpilot/Dscanner. It's part of DCD, autocomplete 
daemon.

-- 
/Jacob Carlborg

Nov 06 2013

"Brian Schott" <briancschott gmail.com> writes:

On Wednesday, 6 November 2013 at 08:21:57 UTC, Jacob Carlborg 
wrote:
 It's part of DCD, autocomplete daemon.

Side note: DCD 0.2.0 is in beta now. There's a thread on the .ide 
list. 
http://forum.dlang.org/thread/flxjzwuaadnkrixryzds forum.dlang.org

Nov 06 2013

Martin Nowak <code dawg.eu> writes:

On 11/01/2013 09:59 PM, Philippe Sigaud wrote:
 I used it with the D grammar, but the one on the website is woefully
 inadequate (some mistakes, out of date compared to the compiler and
 written in a somewhat convoluted style full of left-recursion). The
 shortcomings are that the generated parser is quite slow compared to
 other D parsers.

 That comes from my coding, of course: Pegged generates a simple
 recursive descent parser. I guess I could push for a better engine, but
 I'm waiting for CTFE to get a bit better.

 The advantages are nice, though: full parse tree, semantic actions, and
 so on.

Like many others I'm hoping for a nice general parser generator for 
quite some time.
So I'm also asking specifically about your insights on PEGs.
 From what I know they do have some memory issues due to the backtrace 
memoization (only for deep parse trees though, which aren't common in 
programming languages). Also it seems that the research community has 
some issues to formalize PEGs and what grammars they are capable to handle.
Also why is the expr grammar example so complicated?

Nov 05 2013

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Nov 5, 2013 at 6:39 PM, Martin Nowak <code dawg.eu> wrote:

 Like many others I'm hoping for a nice general parser generator for quite

 some time.
 So I'm also asking specifically about your insights on PEGs.
 From what I know they do have some memory issues due to the backtrace
 memoization (only for deep parse trees though, which aren't common in
 programming languages). Also it seems that the research community has some
 issues to formalize PEGs and what grammars they are capable to handle.

IIRC, they don't fit exactly inside the Chomsky hierarchy. But then, so do
other grammars and parsing algorithms.



 Also why is the expr grammar example so complicated?

What do you mean?

Nov 06 2013

Martin Nowak <code dawg.eu> writes:

On 10/30/2013 11:39 PM, jgetner wrote:
 Is there a function in D for evaluating expressions before compile time?.

I want to throw two more ideas in the parser generator topic which
I carry around for quite some time. I think combined they'd enable AST 
generation from clean and simple grammars. Maybe someone find this 
interesting enough to give it a try, I hardly will ever get to this.

There is an interesting paper [MTP] that describes how a slight 
modification of the ENBF grammar can be used to generate a usable AST
from the grammar definition. Each AST node is named like it's 
production. This is possible because the modified grammar disallows to 
mix alternates and sequences which forces one to name all productions.

The other paper is [GLL] parsing which is a new general parser 
generation technique. The main advantage over GLR, it's a top-down 
parser. This allows to modularize grammars, e.g. you can define a 
grammar that reuses/imports an existing grammar for expressions.

The reason to choose generalized parsers instead of say LALR(1)
is maintainability of the grammars. The needed modifications to make
a context free grammar LR(1) or LALR(1)-compatible often make them 
extremely hard to read and change.
Parse forests (many parse trees) are not a problem I think because most 
CFG only have local ambiguities so they resolve to a single parse tree.



grammar.d
```
import number;

Identifier = "[_a-zA-Z][a-zA-Z]*";

// This still needs some syntax to annotate precedence and associativity
MulExpr = Expr "*" Expr;
DivExpr = Expr "/" Expr;
AddExpr = Expr "+" Expr;
SubExpr = Expr "-" Expr;
PowExpr = Expr "^^" Expr;

Expr = MulExpr | DivExpr | AddExpr | SubExpr | PowExpr | Number | Identifer;

Arguments = Expr % ",";

ExprStmt = Expr ";";
CallStmt = Identifier "(" Arguments ")";

Stmt = ExprStmt | CallStmt;
Stmts = Stmt*;
```
```d
auto p = genParser(import("grammar.d"));
auto stmts = p.parseStmts(input);
autp expr = p.parseExpr(input);
foreach (arg; p.parseArguments(input))
{}

final switch (expr)
{
case MulExpr.tag:
case DivExpr.tag:
case AddExpr.tag:
case SubExpr.tag:
case PowExpr.tag:
}

static assert(typeof(MulExpr == Tuple!(Expr, "expr0", Expr, "expr1")));
static assert(typeof(Stmt == Variant!(ExprStmt, CallStmt)));
static assert(typeof(Identifier) == string));
```

[MTP]: http://babel.ls.fi.upm.es/research/mtp/
        http://babel.ls.fi.upm.es/~angel/papers/finalcopy-2005prole.pdf

[GLL]: http://www.rhul.ac.uk/computerscience/research/csle/gllparsers.aspx
        http://dotat.at/tmp/gll.pdf
        for implementers - Modelling GLL Parser Implementations 
(http://dx.doi.org/10.1007/978-3-642-19440-5_4)

Nov 05 2013

D Programming

C/C++ Programming

Other

digitalmars.D - D parsing