
digitalmars.D - Official D Grammar

reply "Brian Schott" <briancschott gmail.com> writes:
I've pretty much finished up my work on the std.d.lexer module. I 
am waiting for the review queue to make some progress on the 
other (three?) modules being reviewed before starting a thread on 
it.

In the meantime I've started some work on an AST module for 
Phobos that contains the data types necessary to build up a 
parser module so that we can have a standard set of code build D 
dev tools off of. I decided to work directly from the standard on 
dlang.org for this to make sure that my module is correct and 
that the standard is actually correct.

I've seen several threads on this newsgroup complaining about the 
state of the standard and unfortunately this will be another one.

1) Grammar defined in terms of things that aren't tokens. Take, 
for example, PropertyDeclaration. It's defined as an "@" token 
followed by... what? "safe"? It's not a real token. It's an 
identifier. You can't parse this based on checking the token 
type. You have to check the type and the value.
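What "check the type and the value" amounts to can be sketched like this (a minimal illustration with hypothetical token and attribute names, not the actual std.d.lexer API):

```python
# Hypothetical token shape; names here are illustrative, not from std.d.lexer.
from collections import namedtuple

Token = namedtuple("Token", ["type", "text"])

# "@safe" lexes as an AT token followed by a plain identifier, so the parser
# cannot dispatch on token type alone: it must also compare the identifier text.
PROPERTY_IDENTIFIERS = {"safe", "trusted", "system", "property", "disable"}

def is_property_declaration(tokens, i):
    return (tokens[i].type == "at"
            and i + 1 < len(tokens)
            and tokens[i + 1].type == "identifier"
            and tokens[i + 1].text in PROPERTY_IDENTIFIERS)
```

With tokens `@ safe` this returns True, but with `@ foo` it returns False, which is exactly the type-plus-value check the grammar cannot currently express.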

2) Grammar references rules that don't exist. 
UserDefinedAttribute is defined in terms of CallExpression, but 
CallExpression doesn't exist elsewhere in the grammar. 
BaseInterfaceList is defined in terms of InterfaceClasses, but 
that rule is never defined.

3) Unnecessary rules. KeyExpression, ValueExpression, 
ScopeBlockStatement, DeclarationStatement, ThenStatement, 
ElseStatement, Test, Increment, Aggregate, LwrExpression, 
UprExpression, FirstExp, LastExp, StructAllocator, 
StructDeallocator, EnumTag, EnumBaseType, EmptyEnumBody, 
ConstraintExpression, MixinIdentifier, etc... are all defined in 
terms of only one other rule.

I think that we need to be able to create a grammar description 
that:
* Fits in to a single file, so that a tool implementer does not 
need to collect bits of the grammar from the various pages on 
dlang.org.
* Can be verified to be correct by an existing tool such as 
Bison, Goldie, JavaCC, <your favorite here> with a small number 
of changes.
* Is part of the dmd/dlang repositories on github and gets 
updated every time the language changes.

I'm willing to work on this if there's a good chance it will 
actually be implemented. Thoughts?
Apr 01 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/1/2013 4:18 PM, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am waiting
for
 the review queue to make some progress on the other (three?) modules being
 reviewed before starting a thread on it.

 In the meantime I've started some work on an AST module for Phobos that
contains
 the data types necessary to build up a parser module so that we can have a
 standard set of code build D dev tools off of. I decided to work directly from
 the standard on dlang.org for this to make sure that my module is correct and
 that the standard is actually correct.

 I've seen several threads on this newsgroup complaining about the state of the
 standard and unfortunately this will be another one.

 1) Grammar defined in terms of things that aren't tokens. Take, for example,
 PropertyDeclaration. It's defined as an "@" token followed by... what? "safe"?
 It's not a real token. It's an identifier. You can't parse this based on
 checking the token type. You have to check the type and the value.
True, do you have a suggestion?
 2) Grammar references rules that don't exist. UserDefinedAttribute is defined
in
 terms of CallExpression, but CallExpression doesn't exist elsewhere in the
 grammar. BaseInterfaceList is defined in terms of InterfaceClasses, but that
 rule is never defined.
Yes, this needs to be fixed.
 3) Unnecessary rules. KeyExpression, ValueExpression, ScopeBlockStatement,
 DeclarationStatement, ThenStatement, ElseStatement, Test, Increment, Aggregate,
 LwrExpression, UprExpression, FirstExp, LastExp, StructAllocator,
 StructDeallocator, EnumTag, EnumBaseType, EmptyEnumBody, ConstraintExpression,
 MixinIdentifier, etc... are all defined in terms of only one other rule.
Using these makes documentation easier, and I don't think it harms anything.
 I think that we need to be able to create a grammar description that:
 * Fits in to a single file, so that a tool implementer does not need to collect
 bits of the grammar from the various pages on dlang.org.
 * Can be verified to be correct by an existing tool such as Bison, Goldie,
 JavaCC, <your favorite here> with a small number of changes.
 * Is part of the dmd/dlang repositories on github and gets updated every time
 the language changes.

 I'm willing to work on this if there's a good chance it will actually be
 implemented. Thoughts?
I suggest doing this as a sequence of pull requests, not doing just one big one.
Apr 01 2013
next sibling parent reply Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 02/04/2013 03:13, Walter Bright wrote:
 1) Grammar defined in terms of things that aren't tokens. Take, for
 example,
 PropertyDeclaration. It's defined as an "@" token followed by... what?
 "safe"?
 It's not a real token. It's an identifier. You can't parse this based on
 checking the token type. You have to check the type and the value.
True, do you have a suggestion?
I don't think that kind of grammar issue is too annoying, since it's easy to understand what the intended behavior is (in this case at least). But to fix it, well, we can have just that: have the grammar say it should parse an identifier after the @, and then issue a semantic error of sorts if the value is not one of the expected special values (safe, etc.). Parsing an identifier here is in any case the best error recovery strategy anyway.

A bit more annoying is the case with the extern declaration, with the C++ parameter:

   extern(C++)

Here you have to look at a special identifier (the C, D, PASCAL part) and see if there is a ++ token ahead; it's a bit more special-casing in the parser. Here I think it would have been better to change the language itself and use "CPP" instead of "C++". A minor simplification.

-- 
Bruno Medeiros - Software Engineer
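The one-token lookahead being described can be sketched roughly like this (an illustrative token list and return convention, not DMD's actual parser):

```python
# Illustrative sketch of the extern(C++) special case: the linkage is a plain
# identifier plus one token of lookahead for "++". Token representation and
# return convention here are assumptions, not DMD's actual code.
def parse_linkage(tokens, i):
    """tokens[i] is the identifier right after 'extern(' (e.g. 'C', 'D')."""
    ident = tokens[i]
    if ident == "C" and i + 1 < len(tokens) and tokens[i + 1] == "++":
        return "C++", i + 2    # consume the extra '++' token
    return ident, i + 1
```

So `["C", "++", ")"]` yields the linkage "C++" while `["C", ")"]` yields plain "C"; with a "CPP" spelling the identifier alone would decide and the lookahead would disappear.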
Apr 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/9/2013 3:20 AM, Bruno Medeiros wrote:
 A bit more annoying is the case with the extern declaration, with the C++
 parameter:
    extern(C++)
 here you have to look at a special identifier (the C, D, PASCAL part) and see
if
 there is a ++ token ahead, it's a bit more of special-casing in the parser.
Here
 I think it would have been better to change the the language itself and use
 "CPP" instead of "C++". A minor simplification.
I think the special case is worth it.
Apr 20 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 04/20/2013 08:36 PM, Walter Bright wrote:
 On 4/9/2013 3:20 AM, Bruno Medeiros wrote:
 A bit more annoying is the case with the extern declaration, with the C++
 parameter:
    extern(C++)
 here you have to look at a special identifier (the C, D, PASCAL part)
 and see if
 there is a ++ token ahead, it's a bit more of special-casing in the
 parser. Here
 I think it would have been better to change the the language itself
 and use
 "CPP" instead of "C++". A minor simplification.
I think the special case is worth it.
Without doubt.
Apr 20 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-04-02 04:13, Walter Bright wrote:

 1) Grammar defined in terms of things that aren't tokens. Take, for
 example,
 PropertyDeclaration. It's defined as an "@" token followed by... what?
 "safe"?
 It's not a real token. It's an identifier. You can't parse this based on
 checking the token type. You have to check the type and the value.
True, do you have a suggestion?
Just define that @safe should be a token?

-- 
/Jacob Carlborg
Apr 09 2013
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 02/04/2013 03:13, Walter Bright wrote:
 On 4/1/2013 4:18 PM, Brian Schott wrote:
<snip>
 1) Grammar defined in terms of things that aren't tokens. Take,
 for example, PropertyDeclaration. It's defined as an "@" token
 followed by... what? "safe"? It's not a real token. It's an
 identifier. You can't parse this based on checking the token type.
 You have to check the type and the value.
True, do you have a suggestion?
<snip>
Er, do away with the meaningless @? Since nobody seems to have succeeded in explaining how the @-attributes differ from the rest, it seems the right way to go.

Stewart.

-- 
My email address is valid but not my primary mailbox and not checked regularly. Please keep replies on the 'group where everybody may benefit.
Apr 23 2013
next sibling parent reply "Diggory" <diggsey googlemail.com> writes:
If at some point custom attributes will be allowed, @ would be a 
nice syntax for that.

Things like @safe could be attributes implemented by the standard 
library - would cut down on the number of keywords.
Apr 23 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 24, 2013 00:34:54 Diggory wrote:
 If at some point custom attributes will be allowed, @ would be a
 nice syntax for that.
 
 Things like @safe could be attributes implemented by the standard
 library - would cut down on the number of keywords.
We already have custom attributes using @. They're called User Defined Attributes and were added a couple releases ago IIRC.

- Jonathan M Davis
Apr 23 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 23, 2013 21:51:54 Stewart Gordon wrote:
 Er, do away with the meaningless @? Since nobody seems to have
 succeeded in explaining how the @-attributes differ from the rest, it
 seems the right way to go.
That would mean creating more keywords, which would break code. By using @, we avoid having to create new keywords, which I believe was the whole point in the first place. Which attributes got @ on them was fairly arbitrary, but they do definitely serve a purpose. And any new attributes in the future will probably have @ on them for the same reason.

- Jonathan M Davis
Apr 23 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-24 03:40, Jonathan M Davis wrote:

 That would mean creating more keywords, which would break code. By using @, we
 avoid having to create new keywords, which I believe was the whole point in
 the first place. Which attributes got @ on them was fairly arbitrary, but they
 do definitely serve a purpose. And any new attributes in the future will
 probably have @ on them for the same reason.
You can't use the same reasoning anymore since we have UDA. Those will conflict with new keywords starting with an @.

-- 
/Jacob Carlborg
Apr 23 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 24, 2013 08:39:21 Jacob Carlborg wrote:
 On 2013-04-24 03:40, Jonathan M Davis wrote:
 That would mean creating more keywords, which would break code. By using
 @, we avoid having to create new keywords, which I believe was the whole
 point in the first place. Which attributes got @ on them was fairly
 arbitrary, but they do definitely serve a purpose. And any new attributes
 in the future will probably have @ on them for the same reason.
You can't use the same reasoning anymore since we have UDA. Those will conflict with new keywords starting with an @.
True. Adding @blah now risks conflicts where it didn't before, but it still risks fewer conflicts than adding a keyword, since a keyword could be used anywhere. Adding blah as a keyword would conflict with all uses of blah - including @blah - whereas adding @blah would just conflict with @blah.

There was also some discussion on how to make it so that @blah might not cause conflicts (via namespacing or somesuch), but I don't recall if anything came of that or how much of it applies to the actual UDA implementation.

So, it makes more sense to go with adding an attribute than a keyword if we have to add a new attribute, because it'll conflict with less, but it's definitely true that we no longer have a way to avoid creating conflicts when adding attributes without altering the grammar in a manner similar to when we made it possible to have attributes starting with @. So, it's that much more important that we only add new attributes when we really need them.

- Jonathan M Davis
Apr 25 2013
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 24 April 2013 at 01:40:52 UTC, Jonathan M Davis 
wrote:
 On Tuesday, April 23, 2013 21:51:54 Stewart Gordon wrote:
 Er, do away with the meaningless @? Since nobody seems to have
 succeeded in explaining how the @-attributes differ from the 
 rest, it
 seems the right way to go.
That would mean creating more keywords, which would break code. By using @, we avoid having to create new keywords, which I believe was the whole point in the first place. Which attributes got @ on them was fairly arbitrary, but they do definitely serve a purpose. And any new attributes in the future will probably have @ on them for the same reason.

- Jonathan M Davis
Please don't.

This was the approach taken by Java with @overload and friends, which leads to ugly code in my opinion.

Once I worked in a project which took a similar approach; we would have annotations as a way to extend the core grammar. Each team eventually grew its own set of annotations, to the point the amount of annotations became bigger than the core grammar itself, with overlapping meaning.

-- 
Paulo
Apr 24 2013
prev sibling next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 02/04/2013 00:18, Brian Schott wrote:
<snip>
 I think that we need to be able to create a grammar description that:
 * Fits in to a single file, so that a tool implementer does not need to
 collect bits of the grammar from the various pages on dlang.org.
 * Can be verified to be correct by an existing tool such as Bison,
 Goldie, JavaCC, <your favorite here> with a small number of changes.
 * Is part of the dmd/dlang repositories on github and gets updated every
 time the language changes.
<snip>
Indeed, the published grammar needs to be thoroughly checked against what DMD is actually doing, and any discrepancies fixed (or filed in Bugzilla to be fixed in due course). And then they need to be kept in sync.

Has the idea of using a parser generator to build D's parsing code been rejected in the past, or is hand-coding just the way Walter decided to do it? Is the code any more efficient than what a typical parser generator would generate?

And all disambiguation rules (such as "if it's parseable as a DeclarationStatement, it's a DeclarationStatement") need to be made explicit as part of the grammar. I suppose this is where using Bison or similar would help, as it would point out any ambiguities in the grammar that need rules to resolve them.

Stewart.
Apr 02 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-02 15:21, Stewart Gordon wrote:

 Indeed, the published grammar needs to be thoroughly checked against
 what DMD is actually doing, and any discrepancies fixed (or filed in
 Bugzilla to be fixed in due course).  And then they need to be kept in
 sync.

 Has the idea of using a parser generator to build D's parsing code been
 rejected in the past, or is hand-coding just the way Walter decided to
 do it?  Is the code any more efficient than what a typical parser
 generator would generate?

 And all disambiguation rules (such as "if it's parseable as a
 DeclarationStatement, it's a DeclarationStatement") need to be made
 explicit as part of the grammar.  I suppose this is where using Bison or
 similar would help, as it would point out any ambiguities in the grammar
 that need rules to resolve them.
I'm wondering if it's possible to mechanically check that what's in the grammar is how DMD behaves.

-- 
/Jacob Carlborg
Apr 02 2013
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
 I'm wondering if it's possible to mechanically check that 
 what's in the grammar is how DMD behaves.
Take the grammar and (randomly) generate strings with it and check if DMD does complain. You'd need a "parse only, don't check semantics" flag, though.

This will not check if the strings are parsed correctly by DMD, nor if invalid strings are rejected. But it would be a start.
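The generate-and-check idea can be sketched like this (a toy stand-in grammar, not the D grammar; production names are made up):

```python
import random

# Toy stand-in grammar (NOT the D grammar): nonterminals are dict keys,
# anything else is emitted as a terminal token.
grammar = {
    "Stmt": [["if", "(", "Expr", ")", "Stmt"], ["Expr", ";"]],
    "Expr": [["Expr", "+", "Expr"], ["id"], ["num"]],
}

def generate(symbol, depth=0):
    if symbol not in grammar:                 # terminal: emit as-is
        return [symbol]
    productions = grammar[symbol]
    # Past a depth limit, always pick the shortest production so that
    # generation is guaranteed to terminate.
    prod = random.choice(productions) if depth < 6 else min(productions, key=len)
    out = []
    for sym in prod:
        out.extend(generate(sym, depth + 1))
    return out

random.seed(0)
print(" ".join(generate("Stmt")))   # one random sentence to feed the compiler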
Apr 02 2013
parent reply "Christopher Bergqvist" <spambox0 digitalpoetry.se> writes:
On Tuesday, 2 April 2013 at 19:00:21 UTC, Tobias Pankrath wrote:
 I'm wondering if it's possibly to mechanically check that 
 what's in the grammar is how DMD behaves.
Take the grammar and (randomly) generate strings with it and check if DMD does complain. You'd need a parse only don't check semantics flag, though. This will not check if the strings are parsed correctly by DMD nor if invalid strings are rejected. But it would be a start.
An alternative idea for ensuring that documentation and implementation are in sync might be to list the full grammar definition as a data structure that can both be used as input for the parser and as input for a tool that generates the documentation.

Theoretically possible, :) just look at Philippe Sigaud's Pegged.
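A minimal sketch of that single-source-of-truth idea (rule names borrowed from the thread; the table shape and renderer are assumptions, not Pegged's API):

```python
# Sketch: one grammar table used both to drive a parser and to render the
# documentation, so the two cannot drift apart. The table layout and the
# renderer below are illustrative assumptions.
GRAMMAR = {
    "BaseInterfaceList": [[":", "InterfaceList"]],
    "InterfaceList": [["Interface"], ["Interface", ",", "InterfaceList"]],
}

def render_bnf(grammar):
    """Render the table in the dlang.org grammar-page style."""
    lines = []
    for rule, productions in grammar.items():
        lines.append(rule + ":")
        for prod in productions:
            lines.append("    " + " ".join(prod))
        lines.append("")
    return "\n".join(lines)

print(render_bnf(GRAMMAR))
```

Because the documentation is rendered from the same table the parser consumes, a rule like InterfaceClasses could never be referenced in the docs without being defined for the parser.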
Apr 08 2013
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Monday, 8 April 2013 at 21:50:12 UTC, Christopher Bergqvist 
wrote:
 On Tuesday, 2 April 2013 at 19:00:21 UTC, Tobias Pankrath wrote:
 I'm wondering if it's possible to mechanically check 
 what's in the grammar is how DMD behaves.
Take the grammar and (randomly) generate strings with it and check if DMD does complain. You'd need a parse only don't check semantics flag, though. This will not check if the strings are parsed correctly by DMD nor if invalid strings are rejected. But it would be a start.
An alternative idea for ensuring that documentation and implementation are in sync might be to list the full grammar definition as a data structure that can both be used as input for the parser and as input for a tool that generates the documentation. Theoretically possible, :) just look at Philippe Sigaud's Pegged.
I know, but the parser is currently hand-written and I think Walter will only accept an auto-generated parser if it is as fast as the current solution.

However, in an old discussion someone said that the D grammar isn't LALR(1) or LR(1), so I don't think that is possible with current D parser generators. Do we have a Pegged grammar for D?

Another thing is that for documentation purposes you might want to have a more readable but ambiguous grammar than your generator of choice will accept.
Apr 08 2013
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
 However in an old discussion someone said that the D grammar isn't LALR(1)
 or LR(1), so I don't think that is possible with current D parser
 generators. Do we have a pegged grammar for D?
Yes, it comes with the project. But it's still buggy (sometimes due to my own mistakes, sometimes due to plain errors in the online D grammar). And the generated parser is quite slow, alas.
Apr 08 2013
prev sibling next sibling parent Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.

 In the meantime I've started some work on an AST module for Phobos that
 contains the data types necessary to build up a parser module so that we
 can have a standard set of code build D dev tools off of. I decided to
 work directly from the standard on dlang.org for this to make sure that
 my module is correct and that the standard is actually correct.

 I've seen several threads on this newsgroup complaining about the state
 of the standard and unfortunately this will be another one.

 1) Grammar defined in terms of things that aren't tokens. Take, for
 example, PropertyDeclaration. It's defined as an "@" token followed
 by... what? "safe"? It's not a real token. It's an identifier. You can't
 parse this based on checking the token type. You have to check the type
 and the value.

 2) Grammar references rules that don't exist. UserDefinedAttribute is
 defined in terms of CallExpression, but CallExpression doesn't exist
 elsewhere in the grammar. BaseInterfaceList is defined in terms of
 InterfaceClasses, but that rule is never defined.

 3) Unnecessary rules. KeyExpression, ValueExpression,
 ScopeBlockStatement, DeclarationStatement, ThenStatement, ElseStatement,
 Test, Increment, Aggregate, LwrExpression, UprExpression, FirstExp,
 LastExp, StructAllocator, StructDeallocator, EnumTag, EnumBaseType,
 EmptyEnumBody, ConstraintExpression, MixinIdentifier, etc... are all
 defined in terms of only one other rule.

 I think that we need to be able to create a grammar description that:
 * Fits in to a single file, so that a tool implementer does not need to
 collect bits of the grammar from the various pages on dlang.org.
 * Can be verified to be correct by an existing tool such as Bison,
 Goldie, JavaCC, <your favorite here> with a small number of changes.
 * Is part of the dmd/dlang repositories on github and gets updated every
 time the language changes.

 I'm willing to work on this if there's a good chance it will actually be
 implemented. Thoughts?
Interesting thread.

I've been working on a hand-written D parser (in Java, for the DDT IDE) and I too have found a slew of grammar spec issues, some of them more serious than the ones you mentioned above. In some cases it's actually not clear, or downright wrong, what the grammar spec says. For example, here's one off of my notes:

void func(int foo() { } );

The spec says that is parsable (basically a function declaration in the parameter list), which makes no sense, and DMD doesn't accept it. Some cases are a bit trickier, since it's not clear if the syntax should be accepted or not (sometimes they might make sense but not be allowed).

These issues make things a bit harder for the development of tools that require D language parsers. But the whole grammar spec is so messy, I've been unsure whether it's worth filing bug reports or not (would they be addressed?). There is also the problem that even if those issues are fixed now, the spec could very easily fall out of date in the future, unless we have some system to test the spec.

Like you mentioned, ideally we would have a grammar spec for a grammar/PG tool so that correctness could more easily be verified. (It doesn't guarantee no spec bugs, but it makes it much harder for them to be there.)

-- 
Bruno Medeiros - Software Engineer
Apr 06 2013
prev sibling next sibling parent reply Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.
BTW, even in the lexer spec I've found an issue. How does this parse:

5.blah

According to the spec (maximal munch technique), it should be FLOAT then IDENTIFIER. But DMD parses it as INTEGER DOT IDENTIFIER. I'm assuming the latter is the correct behavior, so you can write stuff like 123.init, but that should be clarified.

-- 
Bruno Medeiros - Software Engineer
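The special case DMD actually implements can be sketched like this (heavily simplified: no underscores, exponents, or suffixes; not the actual DMD lexer):

```python
# Simplified sketch (not the actual DMD lexer): after the integer part, a '.'
# is only folded into a float literal when it is NOT followed by a second '.'
# or an identifier start, so "5.blah" stays INTEGER DOT IDENTIFIER.
def lex_number(src, i):
    j = i
    while j < len(src) and src[j].isdigit():
        j += 1
    starts_fraction = (j < len(src) and src[j] == "."
                       and j + 1 < len(src)
                       and not (src[j + 1].isalpha() or src[j + 1] in "._"))
    if starts_fraction:
        j += 1
        while j < len(src) and src[j].isdigit():
            j += 1
        return ("float", src[i:j]), j
    return ("integer", src[i:j]), j

assert lex_number("5.blah", 0) == (("integer", "5"), 1)   # 123.init works
assert lex_number("5.25", 0) == (("float", "5.25"), 4)
assert lex_number("1..2", 0) == (("integer", "1"), 1)     # slice-friendly
```

Note the exception to maximal munch lives in the lookahead condition, which is exactly the kind of rule the published grammar currently leaves implicit.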
Apr 06 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Saturday, April 06, 2013 16:21:12 Bruno Medeiros wrote:
 On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.
 BTW, even in the lexer spec I've found an issue. How does this parse: 5.blah According to the spec (maximal munch technique), it should be FLOAT then IDENTIFIER. But DMD parses it as INTEGER DOT IDENTIFIER. I'm assuming the latter is the correct behavior, so you can write stuff like 123.init, but that should be clarified.
It would definitely have to be INTEGER DOT IDENTIFIER due to UFCS, so it sounds like the spec wasn't updated like it should have been. - Jonathan M Davis
Apr 06 2013
prev sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 04/06/13 17:21, Bruno Medeiros wrote:
 On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.
 BTW, even in the lexer spec I've found an issue. How does this parse: 5.blah According to the spec (maximal munch technique), it should be FLOAT then IDENTIFIER. But DMD parses it as INTEGER DOT IDENTIFIER. I'm assuming the latter is the correct behavior, so you can write stuff like 123.init, but that should be clarified.
"1..2", "1.ident" and a float literal with '_' after the '.' are the DecimalFloat cases that I immediately ran into when doing a lexer based on the dlang grammar. It's obvious to a human how these should be handled, but code generators aren't that smart... But they are good at catching mistakes like these.

Actually, that last case is even more "interesting"; http://dlang.org/lex.html has "1_2_3_4_5_6_._5_6_7_8" as a valid example, which of course it's not ("_5_6_7_8" is a valid identifier), but there is no reason to disallow "1_2_3_4_5_6_.5_6_7_8".
 that should be clarified.
These are just grammar bugs that could easily be fixed. Then there are some things that can be less obvious, but shouldn't really be controversial, like allowing empty HexString literals.

Then there's the enhancement category. Looking through my comments, I think the only deliberate change from dlang.org that I have is in DelimitedString -- there is no reason to forbid q"/abc/def/"; there are no back-compat issues, as it couldn't have existed in legacy D code.

artur
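The underscore point can be checked mechanically; a small sketch (these regexes are simplified approximations of the Identifier and DecimalFloat rules, not the full spec):

```python
import re

# Why "1_2_3_4_5_6_._5_6_7_8" cannot be a single float literal: '_' may start
# an identifier, so maximal munch lexes "_5_6_7_8" as an identifier token.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Simplified approximation of DecimalFloat: the character right after '.' must
# be a digit (exponents and suffixes are omitted from this sketch).
DECIMAL_FLOAT = re.compile(r"[0-9][0-9_]*\.[0-9][0-9_]*")
```

Under these rules `1_2_3_4_5_6_.5_6_7_8` matches as one float while `1_2_3_4_5_6_._5_6_7_8` does not, since `_5_6_7_8` matches the identifier rule instead.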
Apr 06 2013
parent reply Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 06/04/2013 20:52, Artur Skawina wrote:
 On 04/06/13 17:21, Bruno Medeiros wrote:
 On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.
 BTW, even in the lexer spec I've found an issue. How does this parse: 5.blah According to the spec (maximal munch technique), it should be FLOAT then IDENTIFIER. But DMD parses it as INTEGER DOT IDENTIFIER. I'm assuming the latter is the correct behavior, so you can write stuff like 123.init, but that should be clarified.
"1..2", "1.ident" and a float literal with '_' after the '.' are the DecimalFloat cases that I immediately ran into when doing a lexer based on the dlang grammar. It's obvious to a human how these should be handled, but code generators aren't that smart... But they are good at catching mistakes like these.
The "1..2" is actually mentioned in the spec:

"An exception to this rule is that a .. embedded inside what looks like two floating point literals, as in 1..2, is interpreted as if the .. was separated by a space from the first integer."

so it's there, even if it can be missed. But unless I missed it, the spec is incorrect for the "1.ident" or "1_2_3_4_5_6_._5_6_7_8" cases, as there is no exception mentioned there... and it's not always 100% obvious to a human how these should be handled. Or maybe that's just me :)

-- 
Bruno Medeiros - Software Engineer
Apr 06 2013
parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 04/07/13 00:35, Bruno Medeiros wrote:
 On 06/04/2013 20:52, Artur Skawina wrote:
 On 04/06/13 17:21, Bruno Medeiros wrote:
 On 02/04/2013 00:18, Brian Schott wrote:
 I've pretty much finished up my work on the std.d.lexer module. I am
 waiting for the review queue to make some progress on the other (three?)
 modules being reviewed before starting a thread on it.
 BTW, even in the lexer spec I've found an issue. How does this parse: 5.blah According to the spec (maximal munch technique), it should be FLOAT then IDENTIFIER. But DMD parses it as INTEGER DOT IDENTIFIER. I'm assuming the latter is the correct behavior, so you can write stuff like 123.init, but that should be clarified.
"1..2", "1.ident" and a float literal with '_' after the '.' are the DecimalFloat cases that I immediately ran into when doing a lexer based on the dlang grammar. It's obvious to a human how these should be handled, but code generators aren't that smart... But they are good at catching mistakes like these.
The "1..2" is actually mentioned in the spec: "An exception to this rule is that a .. embedded inside what looks like two floating point literals, as in 1..2, is interpreted as if the .. was separated by a space from the first integer." so it's there, even if it can be missed.
I know, but documenting a (grammar) bug does not make it go away.
 But unless I missed it, the spec is incorrect for the "1.ident" or
"1_2_3_4_5_6_._5_6_7_8" cases as there is no exception mentioned there... and
it's not always 100% obvious to a human how these should be handled. Or maybe
that's just me :)
What does the "spec" currently say about ".001"?..

It's been a while since I did a D lexer based on the dlang grammar - it (the lexer) was supposed to be dmd-compatible. Took a closer look at the actual dlang.org rules today while writing this message...

Will try to find some time to clean up and convert a working D lexical grammar to PEG (what I have should be 1:1 translatable, except one rule, DelimitedString) and put it on the wiki. Maybe it will help someone avoid these issues.

artur
Apr 07 2013
parent reply Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 07/04/2013 16:14, Artur Skawina wrote:
 The "1..2" is actually mentioned in the spec:
"An exception to this rule is that a .. embedded inside what looks like two
floating point literals, as in 1..2, is interpreted as if the .. was separated
by a space from the first integer."
so it's there, even if it can be missed.
I know, but documenting a (grammar) bug does not make it go away.
Who says it's a bug? From my understanding, this exception is there on purpose, to make it easier to use the DOT_DOT operator in slice expressions:

foo[1..2] // It would be silly to have to put a space after the 1

At most you could make a case that "1." shouldn't ever parse as float, that the decimal part should be required if the dot is present.

-- 
Bruno Medeiros - Software Engineer
Apr 09 2013
parent Artur Skawina <art.08.09 gmail.com> writes:
On 04/09/13 12:24, Bruno Medeiros wrote:
 On 07/04/2013 16:14, Artur Skawina wrote:
 The "1..2" is actually mentioned in the spec:
"An exception to this rule is that a .. embedded inside what looks like two
floating point literals, as in 1..2, is interpreted as if the .. was separated
by a space from the first integer."
so it's there, even if it can be missed.
I know, but documenting a (grammar) bug does not make it go away.
Who says it's a bug? From my understanding, this exception is there on purpose, to make it easier to use the DOT_DOT operator in slice expressions: foo[1..2] // It would be silly to have to put a space after the 1 At most you could make a case that "1." shouldn't ever parse as float, that the decimal part should be required if the dot is present.
It's a bug, because the grammar does not correctly describe the rules. A parser/lexer based on the grammar alone will not work. Documenting the exception(s) helps the human, but doesn't make the grammar correct.

I've started the PEG conversion of my lexer rules and the relevant one looks like this:

DecimalFloat:
    (LeadingDecimal "." !"." !IdentifierStart DecimalDigitsNoSingleUS DecimalExponent)
    / (LeadingDecimal "." !"." !IdentifierStart DigitUS*)
    / ("." LeadingDecimal DecimalExponent?)

This works as-is, without any extra info - the working lexer is mechanically generated from this rule. (It differs from the dlang definition in at least four ways - "1..2", "1.ident", "1.000_1" and ".001".)

I'll try to finish the conversion and post the whole lexical grammar in a couple of days (have almost no D-time right now).

artur
Apr 09 2013
prev sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
I've moved my work on the grammar to the following location on 
Github:

https://github.com/Hackerpilot/DGrammar

This uses ANTLR, as the other parser generators can't handle D's 
grammar. Several rules from the official grammar were removed, 
and several others were added (such as an actual rule for a 
function declaration...). I also tried to fix any inaccuracies or 
omissions I came across in the online documentation.

Comments, issues, and pull requests welcome.
Apr 20 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
20-Apr-2013 12:31, Brian Schott wrote:
 I've moved my work on the grammar to the following location on Github:

 https://github.com/Hackerpilot/DGrammar

 This uses ANTLR, as the other parser generators can't handle D's
 grammar.
Great. IMHO ANTLR is one of the sanest.
 Several rules from the official grammar were removed, and
 several others were added (such as an actual rule for a function
 declaration...). I also tried to fix any inaccuracies or omissions I came
 across in the online documentation.

 Comments, issues, and pull requests welcome.
Bookmarked for now ;) -- Dmitry Olshansky
Apr 20 2013
prev sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
On Saturday, 20 April 2013 at 08:31:34 UTC, Brian Schott wrote:
 This uses ANTLR, as the other parser generators can't handle 
 D's grammar.
I'm beginning to think that ANTLR is not up to the task either. I've somehow managed to get the grammar to the point where it correctly parses several phobos modules but takes a half hour to do so.
Apr 26 2013
next sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
On Friday, 26 April 2013 at 20:44:12 UTC, Brian Schott wrote:
 I'm beginning to think that ANTLR is not up to the task either. 
 I've somehow managed to get the grammar to the point where it 
 correctly parses several phobos modules but takes a half hour 
 to do so.
I think I'm done with this grammar for now. If there's any interest, I can use it to create syntax diagrams for inclusion on dlang.org. I'll also be filing a series of bug reports over the next few days. (Two recent examples: 0.0L is not valid according to the lexical standard, and deprecated("string") is not documented anywhere)
Apr 28 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/28/13 8:16 PM, Brian Schott wrote:
 On Friday, 26 April 2013 at 20:44:12 UTC, Brian Schott wrote:
 I'm beginning to think that ANTLR is not up to the task either. I've
 somehow managed to get the grammar to the point where it correctly
 parses several phobos modules but takes a half hour to do so.
I think I'm done with this grammar for now. If there's any interest, I can use it to create syntax diagrams for inclusion on dlang.org.
Yes! This is awesome. Thanks for this work.
 I'll also be filing a series of bug reports over the next few days. (Two
 recent examples: 0.0L is not valid according to the lexical standard,
 and deprecated("string") is not documented anywhere)
Great. Andrei
Apr 28 2013
prev sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
I wrote a small program to take the grammar and generate input to 
the dot program from Graphviz. You can generate the diagrams with 
some scripts available on Github or view a prebuilt version here:

http://hackerpilot.org/d/dgrammar.zip

Please take a look at this and let me know if you find any 
mistakes.
May 12 2013
parent reply "Brian Schott" <briancschott gmail.com> writes:
As threatened at DConf, I've started filing bugs against the 
grammar specification. Anyone interested can track bug 10233[1], 
which I've marked as blocked by the various issues I've been 
finding.

As usual, my best guess at D's actual grammar[2] is located 
here[3].

Top secret project is hinted at here[4].

[1] http://d.puremagic.com/issues/show_bug.cgi?id=10233
[2] I disclaim all responsibility for injuries sustained while 
looking at the functionDefinition rule.
[3] https://github.com/Hackerpilot/DGrammar/blob/master/D.g4
[4] 
http://hackerpilot.github.io/experimental/std_lexer/phobos/parser.html
Jun 03 2013
parent reply "Brian Schott" <briancschott gmail.com> writes:
Status update:

My parser is at a point where it can parse all of phobos (well, 
excluding c-style array declarations, which I'm convinced are 
broken and should be removed from the language).

I've been updating the grammar definition as I worked on the 
parser. An HTML version is available for download here: 
https://raw.github.com/Hackerpilot/DGrammar/master/grammar.html

Lexer, parser, and AST code is available here: 
https://github.com/Hackerpilot/Dscanner/tree/range-based-lexer/std/d. 
Bug reports and pull requests are welcome.
Jul 19 2013
next sibling parent reply "Mr. Anonymous" <mailnew4ster gmail.com> writes:
On Friday, 19 July 2013 at 20:49:27 UTC, Brian Schott wrote:
 Status update:

 My parser is at a point where it can parse all of phobos (well, 
 excluding c-style array declarations, which I'm convinced are 
 broken and should be removed from the language).

 I've been updating the grammar definition as I worked on the 
 parser. An HTML version is available for download here: 
 https://raw.github.com/Hackerpilot/DGrammar/master/grammar.html

 Lexer, parser, and AST code is available here: 
 https://github.com/Hackerpilot/Dscanner/tree/range-based-lexer/std/d. 
 Bug reports and pull requests are welcome.
https://rawgithub.com/Hackerpilot/DGrammar/master/grammar.html ;)
Jul 19 2013
parent "Brian Schott" <briancschott gmail.com> writes:
On Friday, 19 July 2013 at 20:52:33 UTC, Mr. Anonymous wrote:
 https://rawgithub.com/Hackerpilot/DGrammar/master/grammar.html
 ;)
I learn something new every day.
Jul 19 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jul 19, 2013 at 10:49:25PM +0200, Brian Schott wrote:
 Status update:
 
 My parser is at a point where it can parse all of phobos (well,
 excluding c-style array declarations, which I'm convinced are broken
 and should be removed from the language).
Really? Those still exist in Phobos? Shouldn't they be rewritten to the "native" style?
 I've been updating the grammar definition as I worked on the parser.
 An HTML version is available for download here:
 https://raw.github.com/Hackerpilot/DGrammar/master/grammar.html
 
 Lexer, parser, and AST code is available here:
 https://github.com/Hackerpilot/Dscanner/tree/range-based-lexer/std/d.
 Bug reports and pull requests are welcome.
Wonderful! I'll have to look into this sometime. I've been dreaming about writing a D pretty-printer, and this may be just the thing I need to get it going. :)

(Well actually, I *wrote* a D pretty-printer, but I only got as far as lexing, with some token-matching rules for the pretty-printing. Needless to say, the results were rather underwhelming, since I didn't have an AST to work with.)

T -- Written on the window of a clothing store: No shirt, no shoes, no service.
Jul 19 2013
parent "Brian Schott" <briancschott gmail.com> writes:
On Friday, 19 July 2013 at 20:57:14 UTC, H. S. Teoh wrote:
 Really? Those still exist in Phobos? Shouldn't they be 
 rewritten to the
 "native" style?
Yes. https://github.com/D-Programming-Language/phobos/pull/1412/files
Jul 19 2013
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 19 Jul 2013 22:49:25 +0200
schrieb "Brian Schott" <briancschott gmail.com>:

 Status update:
 
 My parser is at a point where it can parse all of phobos (well, 
 excluding c-style array declarations, which I'm convinced are 
 broken and should be removed from the language).
 
 I've been updating the grammar definition as I worked on the 
 parser. An HTML version is available for download here: 
 https://raw.github.com/Hackerpilot/DGrammar/master/grammar.html
 
 Lexer, parser, and AST code is available here: 
 https://github.com/Hackerpilot/Dscanner/tree/range-based-lexer/std/d. 
 Bug reports and pull requests are welcome.
Nice work! I guess you already tried druntime, but did you also run your lexer and/or parser against the dmd test suite? The test suite probably has quite some old legacy stuff like c-style array declarations, so it might be a pita to run those tests, but it might show some real problems as well.

If you wanted to do more testing you could also run it on some well-maintained big D projects (GTKd, derelict3, ...):
http://wiki.dlang.org/Libraries_and_Frameworks
http://wiki.dlang.org/Open_Source_Projects

It's also nice that we may finally get a correct and complete grammar.

BTW: I think it might be a good idea to start a new thread for this update. This stuff is quite important for D/Phobos, and old threads tend to be ignored or get less attention.
Jul 19 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-07-19 22:49, Brian Schott wrote:
 Status update:

 My parser is at a point where it can parse all of phobos (well,
 excluding c-style array declarations, which I'm convinced are broken and
 should be removed from the language).

 I've been updating the grammar definition as I worked on the parser. An
 HTML version is available for download here:
 https://raw.github.com/Hackerpilot/DGrammar/master/grammar.html

 Lexer, parser, and AST code is available here:
 https://github.com/Hackerpilot/Dscanner/tree/range-based-lexer/std/d.
 Bug reports and pull requests are welcome.
Now for the million dollar question: does it work during compile time? -- /Jacob Carlborg
Jul 20 2013
prev sibling parent "Michael" <pr m1xa.com> writes:
On Friday, 19 July 2013 at 20:49:27 UTC, Brian Schott wrote:
 Status update:
 excluding c-style array declarations, which I'm convinced are 
 broken and should be removed from the language
+1. I've found for myself that "native"-style arrays are more natural.
Jul 20 2013
prev sibling next sibling parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 26 April 2013 at 20:44:12 UTC, Brian Schott wrote:
 On Saturday, 20 April 2013 at 08:31:34 UTC, Brian Schott wrote:
 This uses ANTLR, as the other parser generators can't handle 
 D's grammar.
I'm beginning to think that ANTLR is not up to the task either. I've somehow managed to get the grammar to the point where it correctly parses several phobos modules but takes a half hour to do so.
Do you know which parts of the grammar make the parser so slow? Do you use any suspicious features of ANTLR?
May 01 2013
parent Russel Winder <russel winder.org.uk> writes:
On Wed, 2013-05-01 at 17:34 +0200, Tobias Pankrath wrote:
 On Friday, 26 April 2013 at 20:44:12 UTC, Brian Schott wrote:
 On Saturday, 20 April 2013 at 08:31:34 UTC, Brian Schott wrote:
 This uses ANTLR, as the other parser generators can't handle 
 D's grammar.
Which other parser generators? cf. for example, http://en.wikipedia.org/wiki/Comparison_of_parser_generators
 I'm beginning to think that ANTLR is not up to the task either. 
 I've somehow managed to get the grammar to the point where it 
 correctly parses several phobos modules but takes a half hour 
 to do so.
LL(k) can sometimes do this. As can LALR(1) of course.
 Do you know which parts of the grammar make the parser so slow?
 Do you use any suspicious features of ANTLR?
There isn't just one ANTLR: ANTLR 2, ANTLR 3, and ANTLR 4 are very different beasties. Not many of the parser generators generate D code :-(

-- 
Russel.
Dr Russel Winder  t: +44 20 7585 2200  voip: sip:russel.winder ekiga.net
41 Buckmaster Road  m: +44 7770 465 077  xmpp: russel winder.org.uk
London SW11 1EN, UK  w: www.russel.org.uk  skype: russel_winder
May 01 2013
prev sibling parent reply Bruno Medeiros <brunodomedeiros+dng gmail.com> writes:
On 26/04/2013 21:44, Brian Schott wrote:
 On Saturday, 20 April 2013 at 08:31:34 UTC, Brian Schott wrote:
 This uses ANTLR, as the other parser generators can't handle D's grammar.
I'm beginning to think that ANTLR is not up to the task either. I've somehow managed to get the grammar to the point where it correctly parses several phobos modules but takes a half hour to do so.
Half hour??? That seems like a massive amount of time, even if the generated parser is very suboptimal and is doing a lot of unnecessary backtracking. I mean, even if the parser has quadratic performance over typical source files, it seems like a lot.

To be honest, that's one of the reasons that put me off working with ANTLR. It seems easy to create a parser with ANTLR, but to create an efficient, well-behaved parser it looks quite complicated, in the sense that you can't abstract yourself from what is happening under the hood... you have to read a lot of theory and documentation to learn the innards of ANTLR, and understand what kind of code it's actually generating, and how it processes input. (At moments it feels like you have to take a degree to learn how to use it effectively...)

-- 
Bruno Medeiros - Software Engineer
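For anyone wondering how backtracking drives the cost up this far: the classic fix is to memoize each (rule, position) result, which is what packrat/PEG parsers do. A toy Python sketch of that idea (deliberately simplified, and nothing to do with ANTLR's actual prediction machinery):

```python
from functools import lru_cache

# Toy illustration (not ANTLR's algorithm): a backtracking recognizer for a
# deliberately ambiguous grammar. Without memoization the same positions get
# re-parsed over and over; caching each (rule, position) result bounds the
# work, which is the idea behind packrat/PEG parsers.

def make_recognizer(text):
    @lru_cache(maxsize=None)
    def expr(i):
        # expr -> term '+' expr | term
        j = term(i)
        if j is not None and j < len(text) and text[j] == '+':
            k = expr(j + 1)
            if k is not None:
                return k
        return j  # fall back to the shorter alternative

    @lru_cache(maxsize=None)
    def term(i):
        # term -> a single digit
        if i < len(text) and text[i].isdigit():
            return i + 1
        return None

    # recognize iff expr consumes the whole input
    return lambda: expr(0) == len(text)
```

Each (rule, position) pair is computed at most once, so the recognizer stays linear in the input length even though the grammar is written with backtracking alternatives.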
May 02 2013
next sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Thu, 2013-05-02 at 17:44 +0100, Bruno Medeiros wrote:
[…]
 To be honest, that's one of the reasons that put me off working with 
 ANTLR. It seems easy to create a parser with ANTLR, but to create an 
 efficient, well-behaved parser it looks quite complicated, in the sense 
 that you can't abstract yourself from what is happening under the 
 hood... you have to read a lot of theory and documentation to learn the 
 innards of ANTLR, and understand what kind of code it's actually 
 generating, and how it processes input. (at moments it feels like you 
 have to take a degree to learn how to use it effectively...)
The Groovy parser is an ANTLR 2.7.7 grammar, it works well and quickly. I think the trick is to work with the LL(k) idioms and avoid letting LALR(1) thoughts creep in.

-- 
Russel.
May 02 2013
parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Thursday, 2 May 2013 at 17:13:46 UTC, Russel Winder wrote:
 On Thu, 2013-05-02 at 17:44 +0100, Bruno Medeiros wrote:
 […]
 To be honest, that's one of the reasons that put me off 
 working with ANTLR. It seems easy to create a parser with 
 ANTLR, but to create an efficient, well-behaved parser it 
 looks quite complicated, in the sense that you can't abstract 
 yourself from what is happening under the hood... you have to 
 read a lot of theory and documentation to learn the innards of 
 ANTLR, and understand what kind of code it's actually 
 generating, and how it processes input. (at moments it feels 
 like you have to take a degree to learn how to use it 
 effectively...)
The Groovy parser is an ANTLR 2.7.7 grammar, it works well and quickly. I think the trick is to work with the LL(k) idioms and avoid letting LALR(1) thoughts creep in.
LALR(1) is O(n) last time I looked. Both LL(k) and LALR(k) should never backtrack. Dunno what ANTLR is doing.
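For reference, the reason LL(k)/LALR(k) parsers stay linear is that a fixed amount of lookahead selects each production, so every token is consumed exactly once. A tiny hand-written LL(1) sketch in Python (illustrative grammar, not D's):

```python
# Minimal LL(1) sketch (illustrative grammar, not D's): the single token of
# lookahead returned by peek() selects each production, so the input is
# consumed strictly left to right with no backtracking -- O(n) overall.
# Grammar: list -> '[' items ']' ; items -> NUM (',' NUM)* | (empty)

def parse_list(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(t):
        nonlocal pos
        if peek() != t:
            raise SyntaxError(f"expected {t!r} at position {pos}")
        pos += 1

    expect('[')
    items = []
    if peek() == 'NUM':              # lookahead decides: items or empty
        expect('NUM')
        items.append('NUM')
        while peek() == ',':         # lookahead decides: another item or stop
            expect(',')
            expect('NUM')
            items.append('NUM')
    expect(']')
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return items
```

Every branch is decided by peek() alone, so no token is ever examined twice; that single-token-decision property is exactly what an ANTLR grammar loses when it falls back to deep prediction or backtracking.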
May 03 2013
prev sibling parent "Brian Schott" <briancschott gmail.com> writes:
On Thursday, 2 May 2013 at 16:44:44 UTC, Bruno Medeiros wrote:
 (at moments it feels like you have to take a degree to learn 
 how to use it effectively...)
That or buy the documentation. Or both...
May 02 2013