D - [Bug?] Current D Grammar not context free?

Manfred Nowak <svv1999 hotmail.com> writes:
In the current spec there are the following grammar rules for expressions:

        UnaryExpression:
		PostfixExpression
		( Type ) UnaryExpression
		( Type ) . Identifier
	PostfixExpression:
		PrimaryExpression
	PrimaryExpression:
		.Identifier

As one can see, for the code `(int).hello' there are two different
derivations from the grammar to the corresponding token sequence: one
directly from "UnaryExpression" and one going through "PostfixExpression" and
"PrimaryExpression".

To me this suggests that the current grammar is not context free.

In any case, I do not understand what the semantic difference between these
two derivations is.

So long!
Mar 25 2004
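
The two derivations claimed in the post above can be checked mechanically. What follows is only a minimal sketch in Python, not anything from the D compiler: it counts the parse trees that the quoted grammar fragment allows for a token sequence. The rule "Type: int" and the helper name count_derivations are assumptions added to keep the example self-contained, and `hello' is assumed to arrive from the lexer as the token Identifier. Note that more than one derivation for the same tokens makes a grammar ambiguous, which is a separate property from whether it is context free.

```python
from functools import lru_cache

# Grammar fragment quoted in the post. "Type: int" is an assumption added so
# the example is self-contained; the real Type rule is much larger.
GRAMMAR = {
    "UnaryExpression": [
        ["PostfixExpression"],
        ["(", "Type", ")", "UnaryExpression"],
        ["(", "Type", ")", ".", "Identifier"],
    ],
    "PostfixExpression": [["PrimaryExpression"]],
    "PrimaryExpression": [[".", "Identifier"]],
    "Type": [["int"]],
}


def count_derivations(symbol, tokens):
    """Count the distinct parse trees that derive `tokens` from `symbol`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # A terminal must match exactly one token.
        if sym not in GRAMMAR:
            return 1 if j - i == 1 and tokens[i] == sym else 0
        return sum(count_seq(tuple(rhs), i, j) for rhs in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` can cover tokens[i:j].
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        # Every symbol consumes at least one token (no empty rules here).
        return sum(
            count(head, i, k) * count_seq(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return count(symbol, 0, len(tokens))


# Token sequence for "(int).hello".
print(count_derivations("UnaryExpression", ["(", "int", ")", ".", "Identifier"]))  # 2
```

A count above 1 confirms that the fragment, as written, is ambiguous for this input.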
Andy Friesen <andy ikagames.com> writes:
Manfred Nowak wrote:
 In the current spec there are the following grammar rules for expressions:
 
         UnaryExpression:
 		PostfixExpression
 		( Type ) UnaryExpression
 		( Type ) . Identifier
 	PostfixExpression:
 		PrimaryExpression
 	PrimaryExpression:
 		.Identifier
 
 As one can see, for the code `(int).hello' there are two different
 derivations from the grammar to the corresponding token sequence: one
 directly from "UnaryExpression" and one going through "PostfixExpression" and
 "PrimaryExpression".
 
 To me this suggests that the current grammar is not context free.
 
 In any case, I do not understand what the semantic difference between these
 two derivations is.

Looks like the difference is that of casting a global scope symbol and
accessing a type property, i.e.

    (wchar[]).toString(i)

vs

    (int).sizeof

Dumping C-style cast syntax would solve the problem.

 -- andy
Mar 26 2004
Stewart Gordon <smjg_1998 yahoo.com> writes:
Andy Friesen wrote:

 Manfred Nowak wrote:
 
 In the current spec there are the following grammar rules for 
 expressions:

         UnaryExpression:
         PostfixExpression
         ( Type ) UnaryExpression
         ( Type ) . Identifier
     PostfixExpression:
         PrimaryExpression
     PrimaryExpression:
         .Identifier


The way that's written, there are quite a few ambiguities.  Like, is

    (Qwert) - 3

a cast or a binary subtraction?

But if you look down at the Cast Expressions subsection of that page, you'll
see that it just can't make up its mind whether the syntax is

    UnaryExpression ::= ( Type ) UnaryExpression

or

    UnaryExpression ::= cast ( Type ) UnaryExpression

 Dumping C-style cast syntax would solve the problem.

Only that little bit of it.  Even after it, here's another ambiguity:

    PostfixExpression ::= PostfixExpression . Identifier
    PrimaryExpression ::= Type . Identifier

Given

    Identifier . Identifier

two possible parse trees:

    PostfixExpression
        PrimaryExpression
            Type
                Identifier
            .
            Identifier

    PostfixExpression
        PostfixExpression
            PrimaryExpression
                Identifier
        .
        Identifier

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the
unfortunate victim of intensive mail-bombing at the moment.  Please keep
replies on the 'group where everyone may benefit.
Mar 26 2004
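
The second ambiguity in the post above (two parse trees for Identifier . Identifier) can be reproduced with a small tree enumerator. This is a rough Python sketch under stated assumptions, not compiler code: the rules "PostfixExpression ::= PrimaryExpression", "PrimaryExpression ::= Identifier" and "Type ::= Identifier" are not quoted in the post but are implied by its parse trees, and the function name parse_trees is made up for the example.

```python
# The two rules quoted in the post, plus the rules its parse trees imply
# (an assumption: they are filled in here, not copied from the spec).
GRAMMAR = {
    "PostfixExpression": [
        ["PrimaryExpression"],
        ["PostfixExpression", ".", "Identifier"],
    ],
    "PrimaryExpression": [
        ["Identifier"],
        ["Type", ".", "Identifier"],
    ],
    "Type": [["Identifier"]],
}


def parse_trees(symbol, tokens):
    """Return every parse tree for `tokens`, written as a bracketed string."""

    def trees(sym, i, j):
        # A terminal must match exactly one token.
        if sym not in GRAMMAR:
            return [sym] if j - i == 1 and tokens[i] == sym else []
        found = []
        for rhs in GRAMMAR[sym]:
            for children in sequences(rhs, i, j):
                found.append("(" + sym + " " + " ".join(children) + ")")
        return found

    def sequences(rhs, i, j):
        # All ways the symbol sequence `rhs` can cover tokens[i:j].
        if not rhs:
            return [[]] if i == j else []
        head, rest = rhs[0], rhs[1:]
        found = []
        # Every symbol consumes at least one token, so the left-recursive
        # rule always recurses on a strictly shorter span and terminates.
        for k in range(i + 1, j - len(rest) + 1):
            for first in trees(head, i, k):
                for others in sequences(rest, k, j):
                    found.append([first] + others)
        return found

    return trees(symbol, 0, len(tokens))


for tree in parse_trees("PostfixExpression", ["Identifier", ".", "Identifier"]):
    print(tree)
```

With these rules the enumerator finds two trees: one that reads the first Identifier as a Type, and one that reads it as a PrimaryExpression followed by a postfix member access.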
"Walter" <walter digitalmars.com> writes:
"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message
news:c41tnd$14d$1 digitaldaemon.com...
 Andy Friesen wrote:

 Manfred Nowak wrote:

 In the current spec there are the following grammar rules for
 expressions:

         UnaryExpression:
         PostfixExpression
         ( Type ) UnaryExpression
         ( Type ) . Identifier
     PostfixExpression:
         PrimaryExpression
     PrimaryExpression:
         .Identifier


 The way that's written, there are quite a few ambiguities.  Like, is

     (Qwert) - 3

 a cast or a binary subtraction?

 But look down at the Cast Expressions subsection of that page, you'll see
 that it just can't make up its mind whether the syntax is

     UnaryExpression ::= ( Type ) UnaryExpression

 or

     UnaryExpression ::= cast ( Type ) UnaryExpression

The way the ambiguity is dealt with is if the parentheses are pointless, then
it is treated as a (Type) rather than (Expression). I.e. in order for it to be
a (Type), it has to not be parseable as an expression. For example:

    (int)   => (Type)
    (T)     => (Expression)
    (T*)    => (Type)

Look on line 3746 of parse.c to see how it works.
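
The rule described here ("it is a Type only if it cannot be parsed as an expression") can be illustrated with a toy classifier. This Python sketch is not the logic in parse.c: the expression grammar (identifiers, numbers, unary - and *, binary + - * /), the BASIC_TYPES set, and the function names are all assumptions chosen only to reproduce the three examples.

```python
import re

# Assumed subset of built-in type keywords; these can never start an expression.
BASIC_TYPES = {"int", "char", "wchar", "float", "void"}
BINARY_OPS = {"+", "-", "*", "/"}


def tokenize(src):
    """Split source text into identifiers, numbers and single-character tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[()*+\-/.\[\]]", src)


def parses_as_expression(tokens):
    """True if `tokens` form one complete expression in the toy grammar."""
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            return False
        tok = tokens[pos]
        if tok in ("-", "*"):  # unary minus or dereference
            pos += 1
            return primary()
        if tok == "(":
            pos += 1
            if expression() and pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
            return False
        if tok in BASIC_TYPES:
            return False
        if re.fullmatch(r"[A-Za-z_]\w*|\d+", tok):
            pos += 1
            return True
        return False

    def expression():
        nonlocal pos
        if not primary():
            return False
        while pos < len(tokens) and tokens[pos] in BINARY_OPS:
            pos += 1
            if not primary():  # operator with no right operand
                return False
        return True

    return expression() and pos == len(tokens)


def classify_parenthesized(src):
    """Classify "( ... )" as Expression if the contents parse as one, else Type.

    The sketch never checks that a non-expression is a well-formed type.
    """
    tokens = tokenize(src)
    if len(tokens) < 2 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("expected a parenthesized form")
    return "Expression" if parses_as_expression(tokens[1:-1]) else "Type"


for form in ("(int)", "(T)", "(T*)"):
    print(form, "=>", classify_parenthesized(form))
```

Under these assumptions (T*) falls through to Type because the trailing * has no right operand, so the contents are not a complete expression, while (T) is a valid expression and is therefore never treated as a cast.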
 Dumping C-style cast syntax would solve the problem.


Yes, that is correct. That's probably the right way to go.
 Only that little bit of it.  Even after it, here's another ambiguity:

 PostfixExpression ::= PostfixExpression . Identifier
 PrimaryExpression ::= Type . Identifier

 Given

 Identifier . Identifier

 two possible parse trees:

 PostfixExpression
 PrimaryExpression
 Type
 Identifier
 .
 Identifier

 PostfixExpression
 PostfixExpression
 PrimaryExpression
 Identifier
 .
 Identifier

That should be done a bit better; the 'Type' really should be BasicType.
Apr 18 2004