www.digitalmars.com         C & C++   DMDScript  

D - D grammar?

reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
Is it LL(1)? Can it be converted to LL(1)?
Thanks
Aug 05 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Ivan Senji" <ivan.senji public.srce.hr> wrote in message
news:bgntfi$qhg$1 digitaldaemon.com...
 Is it LL(1)? Can it be converted to LL(1)?

No, it requires arbitrary lookahead.
Aug 05 2003
next sibling parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
I know I'm asking stupid questions:
I like D very much and I would like to write a parser for it (for fun).
Does arbitrary lookahead mean an LR(1) grammar/parser?
I'm writing a program that creates an LR(1) parser table from the grammar,
but I would like to know: is LR(1) the right parser?

"Walter" <walter digitalmars.com> wrote in message
news:bgon6j$1kfd$2 digitaldaemon.com...
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgntfi$qhg$1 digitaldaemon.com...
 Is it LL(1)? Can it be converted to LL(1)?

No, it requires arbitrary lookahead.

Aug 08 2003
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Ivan Senji" <ivan.senji public.srce.hr> wrote in message
news:bgvq53$28sd$1 digitaldaemon.com...
 I know I'm asking stupid questions:
 I like D very much and i would like to write a parser for it (for fun).
 Does arbitrary lookahead mean LR(1) grammar/parser?
 Im writing a program that creates a LR(1) parser table from the grammar
 but i would like to know is LR(1) the right parser?

I can never remember the difference between LL, LR, LALR, etc., probably because I never use parser generators. But it is not (1) because it requires arbitrary lookahead, mostly in trying to figure out if a sequence of tokens is a type or an expression, declaration or statement. The parser (parse.c) is set up to make lookahead easy.
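
A minimal sketch of what "set up to make lookahead easy" can look like in a hand-written parser. This is not the code in parse.c; the class name TokenStream and its methods are assumptions made up for illustration. The point is only that peeking n tokens ahead consumes nothing, so the parser can look as far as it needs before committing.

```python
class TokenStream:
    """Token buffer with arbitrary lookahead (hypothetical helper, not parse.c)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self, n=0):
        """Return the token n positions ahead without consuming anything."""
        i = self.pos + n
        return self.tokens[i] if i < len(self.tokens) else "EOF"

    def next(self):
        """Consume and return the current token."""
        tok = self.peek()
        if self.pos < len(self.tokens):
            self.pos += 1
        return tok


ts = TokenStream(["T", "*", "p", ";"])
far = ts.peek(2)    # look two tokens ahead; the position is unchanged
first = ts.next()   # the first token is still there to be consumed
```

A parser built on such a buffer can scan forward over an unbounded number of tokens to decide between a declaration and an expression, then parse from the original position.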
Aug 08 2003
next sibling parent reply Bill Cox <bill viasic.com> writes:
Walter wrote:
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgvq53$28sd$1 digitaldaemon.com...
 
I know I'm asking stupid questions:
I like D very much and i would like to write a parser for it (for fun).
Does arbitrary lookahead mean LR(1) grammar/parser?
Im writing a program that creates a LR(1) parser table from the grammar
but i would like to know is LR(1) the right parser?

I can never remember the difference between LL, LR, LALR, etc., probably because I never use parser generators. But it is not (1) because it requires arbitrary lookahead, mostly in trying to figure out if a sequence of tokens is a type or an expression, declaration or statement. The parser (parse.c) is set up to make lookahead easy.

Hi, Walter.

I believe LR(1) is what YACC does (maybe it's LL(1)). YACC has some abilities to look ahead. It pushes "unreduced" tokens on the token stack, and waits for a rule to be able to reduce them. I doubt D's type declarations are a problem.

Generally, grammars become non-YACC friendly when parsing depends on context information. For example, foo(bar) in an expression can be a function call, or a cast expression in C++. You can't build a reasonable data structure for it when you read it, and instead have to come back later and patch it.

From what I've seen of D, the grammar looks YACC friendly. I could be talked into verifying this, if you could create a text file of BNF rules for D.

Bill
Aug 08 2003
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
YACC (and Bison) is LALR(1).  LALR(1) cannot parse C, D, or other 
similar languages without some nasty hacks.

LALR(1) is Look-Ahead LR with 1 symbol of lookahead.  The LR part means
that it reads the tokens from Left to right and builds a Rightmost
derivation in reverse.  Basically, it means that it can only reduce the
topmost token(s) on the stack.  The 1 symbol of lookahead means that you
can read one additional symbol before deciding how to reduce the
topmost tokens.

As Burton said, the classic problem that YACC has is telling
expressions apart from type declarations.  A grammar to
parse a statement might include the following rules:

statement:
	  declaration ';'
	| expression ';'
;

declaration:
	  type IDENT
;

type:
	  IDENT
	| type '*'
;

expression:
	  IDENT
	| expression '*' expression
;

Now, as an LALR(1) parser, how do you parse this series of tokens:

IDENT '*' IDENT ';'

This could be either a declaration of a pointer, or an expression where 
two variables are multiplied together.

An LALR(1) parser MUST be able to parse the top tokens on the stack with 
only 1 symbol of lookahead.  The grammar requires that the parser make a 
reduction of the first IDENT token, either IDENT->type or 
IDENT->expression.  However, the parser cannot tell from the single 
token of lookahead which is valid - either could work!

Thus, the grammar given above CANNOT be parsed by an LALR(1) parser.
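
The conflict can be checked mechanically. The sketch below is a brute-force check written only for the four toy rules above (it is not a general parser, and the function names are made up for illustration). It lists every statement-level reading of a token sequence. Since one input has two readings, the toy grammar is ambiguous, and an ambiguous grammar cannot be LR(k) for any k.

```python
def is_type(toks):
    # type: IDENT | type '*'   => one IDENT followed by zero or more '*'
    return len(toks) >= 1 and toks[0] == "IDENT" and all(t == "*" for t in toks[1:])


def count_expr(toks):
    # expression: IDENT | expression '*' expression
    # Count the distinct parse trees for this token span.
    if toks == ["IDENT"]:
        return 1
    total = 0
    for i, t in enumerate(toks):
        if t == "*":
            total += count_expr(toks[:i]) * count_expr(toks[i + 1:])
    return total


def readings(toks):
    """Return the statement-level readings of toks under the toy grammar."""
    found = []
    if toks and toks[-1] == ";":
        body = toks[:-1]
        # declaration: type IDENT
        if len(body) >= 2 and body[-1] == "IDENT" and is_type(body[:-1]):
            found.append("declaration")
        # expression ';'
        if count_expr(body) > 0:
            found.append("expression")
    return found


print(readings(["IDENT", "*", "IDENT", ";"]))  # ['declaration', 'expression']
```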

You can try to hack around things, if you want:

statement:
	  IDENT stars IDENT ';'
;

stars:
	  '*'
	| stars '*'
;

This grammar is parsable by an LALR(1) parser.  But it isn't really 
readable!
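
What the hack buys is a parse that commits to nothing: the statement is recognised as a neutral shape and its meaning is decided afterwards. A minimal sketch of that idea follows. The function names and the type_names set are hypothetical, and the classification rule only covers the toy grammar (which has no unary '*', so more than one star can only be a declaration).

```python
def parse_statement(toks):
    """Recognise  IDENT stars IDENT ';'  and return (left, star_count, right)."""
    if len(toks) < 4 or toks[-1] != ";":
        raise SyntaxError("expected: IDENT stars IDENT ';'")
    left, stars, right = toks[0], toks[1:-2], toks[-2]
    if not (left.isidentifier() and right.isidentifier()):
        raise SyntaxError("expected identifiers around the stars")
    if not stars or any(t != "*" for t in stars):
        raise SyntaxError("expected one or more '*'")
    return (left, len(stars), right)


def classify(node, type_names):
    """Later pass: decide what the neutral node means.

    type_names is an assumed symbol table of known type identifiers.
    """
    left, star_count, _right = node
    if star_count > 1 or left in type_names:
        return "declaration"
    return "expression"


node = parse_statement(["T", "*", "p", ";"])
kind_if_type = classify(node, {"T"})    # T is a known type: pointer declaration
kind_if_var = classify(node, set())     # T is not a type: multiplication
```

The cost is the one described above: the grammar no longer says what a statement is, so the real structure has to be rebuilt in a second pass.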

Right when D first came out, I decided to write a parser for it.  I 
started with bison (GNU's yacc alternative).  I never had any success 
parsing D because of problems like this.  The grammar eventually gets 
too ugly to maintain, and even with all the ugliness, my grammar still 
had ambiguities.  In my mind, it's an open question whether it is even 
possible to devise an LALR(1) grammar that parses C or D.

So I wrote my own parser generator :)




Aug 08 2003
next sibling parent "Walter" <walter digitalmars.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:bh15g0$gq9$1 digitaldaemon.com...
 So I wrote my own parser generator :)

You're obviously a kindred spirit!
Aug 08 2003
prev sibling parent reply Bill Cox <bill viasic.com> writes:

Hi, Russ.

Can the latest bison parse your D grammar? I think an official machine-readable grammar is a good thing, even if no one uses it to create an actual parser. If you have a D grammar, could you post it?

Also, if I remember, you couldn't share your new parser generator due to issues at work. Is that right?

Bill
Aug 11 2003
parent Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Bill Cox wrote:
 Can the latest bison parse your D grammer?  I think an official machine 
 readable grammer is a good thing, even if no one uses it to create an 
 actual parser.  If you have a D grammer, could you post it?
 
 Also, if I remember, you couldn't share your new parser generator due to 
 issues at work.  Is that right?
 
 Bill

I haven't tried it since they put the GLR code in. It's a good idea, but I'd probably have to rebuild it from scratch.

I don't (anymore) have a "clean" bison grammar. I hacked up the first one to try to make it work (and failed), and the new, up-to-date grammar uses the syntax from my new parser. Sometime in the future I can try to port it back to Bison. Ick.

Anybody ever tried to write a parser for the for statement in Bison? There are 8 different possible alternatives you have to express. I like my parser because I could parse a for with a single expression.

But yeah, the company's still not allowing me to release it.
Aug 11 2003
prev sibling next sibling parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Walter" <walter digitalmars.com> wrote in message
news:bh0tjn$9cu$1 digitaldaemon.com...
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgvq53$28sd$1 digitaldaemon.com...
 I know I'm asking stupid questions:
 I like D very much and i would like to write a parser for it (for fun).
 Does arbitrary lookahead mean LR(1) grammar/parser?
 Im writing a program that creates a LR(1) parser table from the grammar
 but i would like to know is LR(1) the right parser?

I can never remember the difference between LL, LR, LALR, etc., probably because I never use parser generators. But it is not (1) because it requires arbitrary lookahead, mostly in trying to figure out if a sequence of tokens is a type or an expression, declaration or statement.

LR is a bottom-up (shift-reduce) parser; LALR is a lookahead LR.
LL parsers have problems with e ::= e + t; so you have to rewrite your grammars to avoid left recursion.
As far as I know (my dragon book claims so, anyway) yacc is an LALR parser, JavaCC was LL(n), and Antlr is either LL(n) or LALR.

Unlike C or C++, you do not know if an id is an id or a type, so you have to go quite a way until you know if you have a variable declaration or an assignment. Simple examples to try out:

    id[id] a;    // declare
    id[id] = s;  // assign

This could of course be id.id[id.id[id.id[id]][id]] etc., and func pointers I think cause similar issues:

    id(*id)(id, id )  // type, or a function call that returns a func ptr that is then called?

Given that, worse is the C cast:

    (id)(id)  // func call or cast?

but I think they should be outlawed.

I think you'll just have to try it out. A D parser generator would be a good tool anyway, and I'm sure there are tweaks that can be put in to resolve some issues; if you build a tree and then work out what it is, I think LALR should handle it initially.
It would be a good exercise, especially with all the random good ideas people have: it would allow someone to try out a given syntax and see if it would actually work or not.
Aug 08 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:bh13pd$f5f$1 digitaldaemon.com...
 I think you'll just have to try it out, a D parser generator would be a good
 tool anyway and I'm sure there are tweaks that can be put in to resolve some
 issues and if you build a tree then work out what it is I think LALR should
 handle it initially.
 it would be a good exercise, especially with all the random good ideas ppl
 have, would allow someone to try out a given syntax and see if it would
 actually work or not.

In my experience, I've been able to crank out a hand-built parser faster than someone using Bison could. So I never saw the point of using those tools <g>.

The biggest improvement the D grammar has over C/C++ is that it does not require semantic information in order to lex or to parse. This makes it a *lot* easier to build syntax sensitive editors and other types of code analysis tools.
Aug 08 2003
next sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Walter" <walter digitalmars.com> wrote in message
news:bh1kt5$uua$2 digitaldaemon.com...
 "Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
 news:bh13pd$f5f$1 digitaldaemon.com...
 I think you'll just have to try it out, a D parser generator would be a good
 tool anyway and I'm sure there are tweaks that can be put in to resolve some
 issues and if you build a tree then work out what it is I think LALR should
 handle it initially.
 it would be a good exercise, especially with all the random good ideas ppl
 have, would allow someone to try out a given syntax and see if it would
 actually work or not.

In my experience, I've been able to crank out a hand-built parser faster than someone using Bison could. So I never saw the point of using those tools <g>.


Your grammar does not have obscure conflicts.
I agree that a hand-crafted parser can be easier. Just little things like C comments are easy to deal with in a hand-written parser (once you've read '/*', ignore everything until you get '*/') but a pain to write in antlr, for example:

ML_COMMENT
    : "/*"
      ( { LA(2)!='/' }? '*'
      | '\n' { newline(); }
      | ~('*'|'\n')
      )*
      "*/"
      { $setType(Token.SKIP); }
    ;

But I don't see that that alone makes parser/lexer/treewalker generating tools redundant; they do offer a level of maintainability that a hand-written parser can lack. It is not always a case of who can write the parser first, but also whose is the most robust and can detect that the original grammar had flaws, and who could add new features without breaking too much.

Hand-written parsers do allow blending of LL and LR techniques, but I'm not 100% convinced that that is altogether a good idea. Quite a lot of LR grammars can be converted to LL(n) grammars, and if you have a grammar that cannot be written as LR(n) or LL(n), is it a "good" grammar?
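
For comparison with the antlr rule, the hand-written version of "once you've read '/*', ignore everything until you get '*/'" is only a few lines. This is a sketch; the function name and error handling are assumptions, and it covers only C-style block comments, which do not nest.

```python
def skip_block_comment(src, pos):
    """src[pos:pos+2] must be '/*'; return the index just past the closing '*/'."""
    if not src.startswith("/*", pos):
        raise ValueError("not at a block comment")
    # Start searching after the opener so that '/*/' is not treated as complete.
    end = src.find("*/", pos + 2)
    if end == -1:
        raise SyntaxError("unterminated block comment")
    return end + 2


text = "a /* x * y */ b"
after = skip_block_comment(text, 2)
rest = text[after:]   # " b"
```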
 The biggest improvement the D grammar has over C/C++ is that it does not
 require semantic information in order to lex or to parse. This makes it a
 *lot* easier to build syntax sensitive editors and other types of code
 analysis tools.

That would imply that the grammar should be easy to write. But you do have to perform 2 passes over the parsed tree due to this lack of semantic info: in C (or C++), at the time when you parse a type identifier you know what it is; in D you do not.
Aug 09 2003
prev sibling parent reply Helium <Helium_member pathlink.com> writes:
The biggest improvement the D grammar has over C/C++ is that it does not
require semantic information in order to lex or to parse. This makes it a
*lot* easier to build syntax sensitive editors and other types of code
analysis tools.

But are there any tools like this? e.g. a refactoring tool? I haven't seen one.
Aug 10 2003
parent "Walter" <walter digitalmars.com> writes:
"Helium" <Helium_member pathlink.com> wrote in message
news:bh5489$15nn$1 digitaldaemon.com...
The biggest improvement the D grammar has over C/C++ is that it does not
require semantic information in order to lex or to parse. This makes it a
*lot* easier to build syntax sensitive editors and other types of code
analysis tools.


No, not yet. But there will be.
Aug 11 2003
prev sibling parent "Fabian Giesen" <rygNO SPAMgmx.net> writes:
 I can never remember the difference between LL, LR, LALR, etc.,
 probably because I never use parser generators. But it is not (1)
 because it requires arbitrary lookahead, mostly in trying to figure
 out if a sequence of tokens is a type or an expression, declaration
 or statement.

Any LR(k) grammar can be converted to a LR(1) grammar that parses the exact same language, at the expense of the need for more states (and a more complicated source grammar). That's why most LR parser generators don't bother providing more than one token of lookahead. -fg
Aug 09 2003
prev sibling parent reply "Peter Hercek" <vvp no.post.spam.sk> writes:
Hi Ivan,

It would be great if you (or anybody) would create a D grammar in BNF
 or EBNF. I tried it myself a few months ago from the web documentation,
 but the grammar (on the web) had tons of trivial errors and was not
 complete. I tried to get it from the sources, but found out that the parser
 is hand-written. Uff, I gave up. I did not feel courageous
 enough to reverse engineer the source code to create a BNF grammar :o(
 Shame on me, but reverse engineering the source code was too much.

I attached the BNF grammar I was able to extract from the web. Anyway,
 it is a few months old. If you (or anybody) can update the file, I would
 like to play with it and generate a *nice* html grammar for D. (I should
 be able to generate an antlr grammar description from it too, but I do not
 have this implemented.) I created an XML schema for grammar description
 and an XSLT template to generate the html file; it was my fun project to
 learn what XSLT is. To get an idea of how the hyperlinked grammar would
 look, check out this C# grammar (there was no source code reverse
 engineering there :o) )

http://peter.hercek.sk/c-sharp-grammar.html

It contains both forward and backward links (ie "show me all the rules
 which use this symbol" can be achieved by clicking on the rule head).
 Tested only in IE.

So, if somebody wants to move the D grammar on, please respond to this
 message (to the group of course). But it requires a lot of source code
 studying unfortunately! I would say that the D grammar is protected through
 obscurity :o). Seriously now, I may be helpful with some tools to look
 up trivial problems etc.

Peter.


"Ivan Senji" <ivan.senji public.srce.hr> wrote in message
news:bgvq53$28sd$1 digitaldaemon.com...
 I know I'm asking stupid questions:
 I like D very much and i would like to write a parser for it (for fun).
 Does arbitrary lookahead mean LR(1) grammar/parser?
 Im writing a program that creates a LR(1) parser table from the grammar
 but i would like to know is LR(1) the right parser?

 "Walter" <walter digitalmars.com> wrote in message
 news:bgon6j$1kfd$2 digitaldaemon.com...
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgntfi$qhg$1 digitaldaemon.com...
 Is it LL(1)? Can it be converted to LL(1)?

No, it requires arbitrary lookahead.


Aug 10 2003
next sibling parent reply "Walter" <walter digitalmars.com> writes:
If you're interested in critiquing the grammar in the D documentation, I can
work on improving it.
Aug 11 2003
parent reply "Peter Hercek" <vvp no.post.spam.sk> writes:
I would like to do so, but not from the current state of the web site.
 I think the grammar must be hard to maintain ... it is distributed over
 a lot of pages. Do you generate them somehow? How do you keep it
 up to date?
I prefer to critique one file with the grammar, preferably in the format
 attached in my previous message. The file can have sections,
 like the C# one [http://peter.hercek.sk/c-sharp-grammar.html]; actually it
 already has, e.g. ``0Lexical` (the number zero is the nest level). We can
 include some remarks etc. For me this was a way to check the possibilities
 of XSLT. I can generate more files (maybe a file for each section) to
 include in different pages. But then cross-links between different
 pages would not work, or we need some good naming convention
 derived from section names or something similar.
It is a pain to extract the grammar from the web pages and transform it
 into something usable for automatic processing. I'm too lazy to do it
 manually and it is error prone too.
Check also my response to Bill Cox. If he finally decides to analyze your
 code, you can get a good grammar for free :) Then we can cut off
 sections for you to replace in your pages?
I believe we can figure it out somehow together.

"Walter" <walter digitalmars.com> wrote in message
news:bha72n$b1f$1 digitaldaemon.com...
 If you're interested in critiquing the grammar in the D documentation, I can
 work on improving it.

Aug 12 2003
parent "Walter" <walter digitalmars.com> writes:
All the web pages are html pages written by hand.

"Peter Hercek" <vvp no.post.spam.sk> wrote in message
news:bhce0l$2g97$1 digitaldaemon.com...
 I would like to do so, but not from the current state of the web site.
  I think the grammar must be hard to maintain ... it is distribueted to
  a lot of pages. Do you generate them somehow? How you keep it
  up to date?
 I prefer to critique one file with the grammar, preferably in the format
  attached in my previous message. The file can have sections,
  like the C# [http://peter.hercek.sk/c-sharp-grammar.html], actually it
  already has eg ``0Lexical` (the number zero is the nest level). We can
  include some remarks etc. For me this was a way to check posibilities
  of XSLT. I can generate more files (may be a file for each section) to
  include it into different pages. But then cross-links between different
  pages would not work, or we need some good naming convention
  derived from section names or something similar.
 It is a pain to extract grammar from the webpages and transform it
  into something usable for automatic processing. I'm lazy to do it
  manually and it is error prone too.
 Check also my response to Bil Cox. If he finaly decides to analyze your
  code, you can get a good grammar for free :) Then we can cut off
  sections for you to replace in your pages?
 I believe we can figure it out somehow together.

 "Walter" <walter digitalmars.com> wrote in message

 If you're interested in critiquing the grammar in the D documentation, I


 work on improving it.


Aug 15 2003
prev sibling parent reply Bill Cox <bill viasic.com> writes:
Peter Hercek wrote:
 So, if somebody wants to move D grammar on, please respond to this
  message (to group of course). But it requires a lot of source code
  studying unfortunately! I would tell that D grammar is protected through
  obcurity :o). Seriously now, I may be helpfull with some tools to look
  up trivial problems etc.

I'm willing to take it on. I've had a fair amount of experience with parser generators. With bison's latest upgrades, and some work, I bet I can get it to work.

Bill
Aug 12 2003
parent reply "Peter Hercek" <vvp no.post.spam.sk> writes:
Cool, I did it this way last time.
Pulled out the grammar from the web pages and created the text file
 (this was a pain, because I was not able to automate this completely!)
Generated an XML description of the grammar from the text file
 using antlr.
The XML file was transformed to HTML using XSLT.

If you would directly create a bison grammar description (this may
 be easier in the end), I should be able to process the bison file (I hope its
 grammar is not too complicated without the C code) and get the final
 HTML. Till now I only thought of generating an antlr grammar description
 to check the grammar for more complicated errors. Simple errors (like
 multiple definitions for a symbol, no definition for a symbol) can be
 detected during translation to HTML. I can get you these errors for
 the file I attached previously, if you think this can help you.

I guess bison/LALR(1) and antlr/LL(k) are not the same, but who cares
 provided that at least one of them is available.
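
The "simple errors" mentioned above (a symbol defined more than once, a symbol used but never defined) need no parser generator to find. A small sketch follows, assuming the grammar has already been read into (head, alternatives) pairs; the input format, the function name, and the convention that quoted or ALL-CAPS symbols are terminals are all assumptions for illustration, not the format of the attached file.

```python
from collections import Counter


def check_grammar(rules):
    """rules: list of (head, [alternative, ...]); each alternative is a list of symbols.

    Returns (heads defined more than once, nonterminals used but never defined).
    Assumed convention: symbols in single quotes or in ALL CAPS are terminals.
    """
    heads = Counter(head for head, _ in rules)
    duplicates = sorted(h for h, n in heads.items() if n > 1)

    def is_terminal(sym):
        return sym.startswith("'") or sym.isupper()

    used = {sym for _, alts in rules for alt in alts for sym in alt}
    undefined = sorted(s for s in used if not is_terminal(s) and s not in heads)
    return duplicates, undefined


rules = [
    ("statement", [["declaration", "';'"], ["expression", "';'"]]),
    ("declaration", [["type", "IDENT"]]),          # 'type' is never defined
    ("expression", [["IDENT"], ["expression", "'*'", "expression"]]),
    ("expression", [["IDENT"]]),                   # defined a second time
]
duplicates, undefined = check_grammar(rules)
```

Deeper problems such as conflicts or ambiguity still need a parser generator or a dedicated analysis; this only catches bookkeeping mistakes.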


"Bill Cox" <bill viasic.com> wrote in message
news:bhas3p$vtm$1 digitaldaemon.com...
 Peter Hercek wrote:
 So, if somebody wants to move D grammar on, please respond to this
  message (to group of course). But it requires a lot of source code
  studying unfortunately! I would tell that D grammar is protected through
  obcurity :o). Seriously now, I may be helpfull with some tools to look
  up trivial problems etc.

I'm willing to take it on. I've had a fair amount of experience with paser generators. With bison's latest upgrades, and some work, I bet I get it to work. Bill

Aug 12 2003
parent reply Bill Cox <bill viasic.com> writes:
Hi, Peter.

I'll give it a try.  I'll start with your grammar file.

Bill



Aug 13 2003
parent Bill Cox <Bill_member pathlink.com> writes:
In article <bhdv9c$vum$1 digitaldaemon.com>, Bill Cox says...
Hi, Peter.

I'll give it a try.  I'll start with your grammar file.

Bill

Hi.

I've gotten started. The grammar needs a lot of work. Here's my changed version. There are several undefined rules. I can make them up, but I could use some help from Walter on a couple...

I need some help with these
---------------------------------
Invariant - from module.html
Type - used all over
BasicType2 - from declaration.html
FunctionDeclarator - from declaration.html

I think I can figure these out...
---------------------------------
Constructor
Destructor
StaticConstructor
StaticDestructor
Unittest

I haven't looked at these yet, but they aren't defined in the grammar file...
---------------------------------
Declarator
VersionAttribute
NumericLiteral
LabeledStatement
DoWhileStatement
AssignmentExpression
Parameter
AsmInstruction
Aug 14 2003
prev sibling parent reply Bill Cox <bill viasic.com> writes:
Walter wrote:
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgntfi$qhg$1 digitaldaemon.com...
 
Is it LL(1)? Can it be converted to LL(1)?

No, it requires arbitrary lookahead.

Hi, Walter. What parts require arbitrary lookahead? Being able to YACC D seems a worthy goal. Bill
Aug 08 2003
next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Bill Cox wrote:
 Walter wrote:
 
 "Ivan Senji" <ivan.senji public.srce.hr> wrote in message
 news:bgntfi$qhg$1 digitaldaemon.com...

 Is it LL(1)? Can it be converted to LL(1)?

No, it requires arbitrary lookahead.

Hi, Walter. What parts require arbitrary lookahead? Being able to YACC D seems a worthy goal.

Distinguishing local declarations and C-style casts from expressions.
Aug 08 2003
prev sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
Bill Cox wrote:
 Being able to YACC D seems a worthy goal.

What for? YACC is past and even "considered harmful"! New Bison supports GLR. There are tons of other GLR and LL(inf) parser generators out there. -i.
Aug 08 2003
next sibling parent Bill Cox <bill viasic.com> writes:
Ilya Minkov wrote:
 Bill Cox wrote:
 
 Being able to YACC D seems a worthy goal.

What for? YACC is past and even "considered harmful"! New Bison supports GLR. There are tons of other GLR and LL(inf) parser generators out there. -i.

Hi, Ilya.

Support for GLR in Bison is quite new. It wasn't in the general release last year. Also, there's a performance penalty to use GLR.

I strongly encourage anyone making a new language format to create a machine verified version, whether it uses GLR or LR(1). Until the rules have been verified by a parser generator, you can bet the rules are wrong.

Bill
Aug 08 2003
prev sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bh0jct$30ml$1 digitaldaemon.com...
 Bill Cox wrote:
 Being able to YACC D seems a worthy goal.

What for? YACC is past and even "considered harmful"! New Bison supports GLR. There are tons of other GLR and LL(inf) parser generators out there.

believe it can rebuild itself (am looking for somewhere to get it; www.antlr.org seems to be missing!) so might be able to squeeze a D version out of it. I don't fancy trying to port Bison to D. Antlr also allows tree walkers to be defined, which is very useful (LALR and LL(n) only, I believe).
Aug 08 2003