digitalmars.D - D grammar

reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
Well, this is as far as I can get it at the moment (no more time available
to spend on this).

The only things (as far as I can tell) that make it not a real D grammar
are:
<STATEMENT> -> synchronized <STATEMENT>
<STATEMENT> -> synchronized ( <EXPRESSION> ) <STATEMENT>

I can't think of a way to write this unambiguously right now.
And I didn't solve the problem of type<->expression,
so it accepts a language in which you have to write "type" in front of a type
name.

like:
int something;   =>  type int something;

int func(char[] bla){}  =>  type int func(type char[] bla){}

Charlie? Sammy? Wanna try fixing this thing with types?
I think it could be done at a cost of a much uglier grammar.

Not saying that it isn't ugly now. Sorry for all those uppercase names
(those are the ones that I wrote or that were inspired by a Java LALR(1) grammar);
the ones in mixed case are borrowed from Walter's spec.

As it is, this grammar is LR(1) and its parser has a little over 5000 states.
It would probably even be LALR(1), with far fewer states.

Attached is the grammar file (in my stupid format),
and the file I was able to parse (I have no lexer, so it looks strange :)
Aug 21 2004
next sibling parent reply kinghajj <kinghajj_member pathlink.com> writes:
OK, I've looked at enough of the posts on the "D Grammar", so I have one question: WHAT THE HELL ARE YOU TALKING ABOUT? What do you mean by D Grammar, and what's a grammar file???
Aug 21 2004
next sibling parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
"kinghajj" <kinghajj_member pathlink.com> wrote in message
news:cg7r2u$2420$1 digitaldaemon.com...
 In article <cg7j4b$20cb$1 digitaldaemon.com>, Ivan Senji says...
Attached is the grammar file (in my stupid format),
and the file i was able to parse (no lexer i have so it looks strange :)

 OK, I've looked at enough of the posts on the "D Grammar", so I have one question: WHAT THE HELL ARE YOU TALKING ABOUT? What do you mean by D Grammar, and what's a grammar file???

A grammar is a formal description of the syntax of a language. The D grammar is a grammar that describes the syntax of the D language. A grammar file is a file containing grammar rules :)
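
To make that a bit more concrete, here is a minimal sketch in Python. It is not the format of the attached grammar file (which is not shown in this thread); the rule names and the tiny vocabulary are made up for illustration. It stores a few rules as data and checks whether a token sequence can be derived from them, using the "type"-prefixed declaration style from the first post.

```python
# Toy grammar: each nonterminal maps to a list of alternatives, and each
# alternative is a sequence of symbols. All names here are hypothetical.
GRAMMAR = {
    "Declaration": [["type", "BasicType", "Identifier", ";"]],
    "BasicType": [["int"], ["char"]],
    "Identifier": [["something"], ["bla"]],
}


def match(symbol, tokens, pos):
    """Yield every position reachable after matching `symbol` at tokens[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it must be equal to the next token.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        positions = [pos]
        for part in alternative:
            positions = [end for p in positions for end in match(part, tokens, p)]
        yield from positions


def accepts(start, text):
    """True if the whitespace-separated tokens of `text` derive from `start`."""
    tokens = text.split()
    return len(tokens) in match(start, tokens, 0)


print(accepts("Declaration", "type int something ;"))  # True
print(accepts("Declaration", "int something ;"))       # False
```

A parser generator reads rules like these and produces a parser from them; the naive matcher above only works for grammars without left recursion, which is one reason real tools build LR or LALR tables instead.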
Aug 21 2004
parent reply kinghajj <kinghajj_member pathlink.com> writes:
In article <cg8b6o$2cm7$1 digitaldaemon.com>, Ivan Senji says...
"kinghajj" <kinghajj_member pathlink.com> wrote in message
news:cg7r2u$2420$1 digitaldaemon.com...
 In article <cg7j4b$20cb$1 digitaldaemon.com>, Ivan Senji says...
Attached is the grammar file (in my stupid format),
and the file i was able to parse (no lexer i have so it looks strange :)

OK, I've looked at enough at the posts on the "D Grammar", so, I have one question: WHAT THE HELL ARE YOU TALKING ABOUT? What do you mean by D

 and what's a grammar file???

A gramamr is a formal description of a syntax of a language. D Grammar is a grammar that describes syntax of D language. A grammar file is a file containing grammar rules :)

OK, then how do you "generate" that? Is there some program that looks at sample D code and somehow comes up with a grammar, or do you *write* it yourself?
Aug 21 2004
parent reply Stephan Wienczny <Stephan Wienczny.de> writes:
kinghajj wrote:
 
 
 OK, then how do you "generate" that? Is there some program that looks at sample
 D code and somehow comes up with a grammar, or do you *write* it yourself?
 
 

generate something useful. Wienczny
Aug 21 2004
parent reply teqDruid <me teqdruid.com> writes:
What tool will interpret the syntax posted?  antlr won't take it (at least
not without some options I'm unaware of)

John

Aug 21 2004
parent "Ivan Senji" <ivan.senji public.srce.hr> writes:
"teqDruid" <me teqdruid.com> wrote in message
news:pan.2004.08.22.00.43.31.983992 teqdruid.com...
 What tool will interpret the syntax posted?  antlr won't take it (at least
 not without some options I'm unaware of)

Only my tool (probably), the way it is written, but I don't see a problem in rewriting it in antlr's or any other style. I can't do it because I have never used antlr, but it shouldn't be too difficult.
Aug 21 2004
prev sibling parent reply Johannes <getridofcrap.johannes.oberg gmail.com> writes:
 Where did the c:\d\ directory come from?

From one of the wiki4d "Evaluation Guide" pages. I tried just making \DMD and \DM first, and setting the path in different ways.
 Add dm\bin and dmd\bin to the path, and you're done.

Could this be some weird thing with Windows XP not using the same path system as real DOS?
Sep 11 2007
parent Johannes <getridofcrap.johannes.oberg gmail.com> writes:
Sorry, Web-News bugged me into posting in the wrong group.
Sep 11 2007
prev sibling parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
I forgot to mention that if you remove "type" from the basic_type grammar
rule you get an ambiguous LR(1) grammar, but it is still possible to
parse that language with a parser I used to call nondeterministic LR(1)
before I found out it is actually Generalized LR, or GLR.

The only sad thing is that this parser takes much more time
to parse because there are more parsers working in parallel.
My test file with an unambiguous grammar parses in 30ms,
while this one with the GLR parser takes 2s, although it would
probably be possible to optimize both the GLR parser and the grammar.
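
As a rough illustration of why a generalized parser is needed here (strictly speaking, an ambiguous grammar cannot be LR(1), so a deterministic table has conflicts), the Python sketch below counts how many distinct derivations a token sequence has. It is a simple span-based counter, not a GLR parser, and both toy grammars are assumptions made up for this example rather than rules from the attached grammar file. Tokens are assumed to be already lexed, with "id" standing for any identifier.

```python
from functools import lru_cache


def count_parses(grammar, start, tokens):
    """Count derivations of `tokens` from `start` (no empty rules, no unit cycles)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        if symbol not in grammar:
            # Terminal: must match exactly one token.
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(seq(tuple(alt), i, j) for alt in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        # Leave at least one token for every remaining symbol; this also
        # keeps left-recursive rules such as Type -> Type '*' terminating.
        last = j - len(rest)
        return sum(derive(head, i, k) * seq(rest, k, j) for k in range(i + 1, last + 1))

    return derive(start, 0, len(tokens))


# Without a marker, "id * id ;" is both a pointer declaration and a product.
AMBIGUOUS = {
    "Statement": [["Declaration"], ["Expression", ";"]],
    "Declaration": [["Type", "id", ";"]],
    "Type": [["BasicType"], ["Type", "*"]],
    "BasicType": [["id"], ["int"]],
    "Expression": [["id"], ["Expression", "*", "Expression"]],
}

# With a leading "type" keyword on every basic type, the two cases separate.
MARKED = dict(AMBIGUOUS, BasicType=[["type", "id"], ["type", "int"]])

print(count_parses(AMBIGUOUS, "Statement", "id * id ;".split()))    # 2
print(count_parses(MARKED, "Statement", "type id * id ;".split()))  # 1
print(count_parses(MARKED, "Statement", "id * id ;".split()))       # 1
```

A GLR parser effectively follows both derivations of the first input at once, which fits the slowdown described above.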

Aug 22 2004
parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Ivan Senji schrieb:

 I forgot to mention that if you remove "type" from basic_type grammar
 rule you get an ambiguous LR(1) grammar but it is still possible to
 parse that language with a parser I used to call nondeterministic LR(1)
 before I found out it is actually Generalized LR or GLR.

I wonder what exactly causes the ambiguity. Which rules exactly conflict? Can't the decision be shifted out? And finally, C also has to deal with it somehow. Did you base your D grammar on an existing YACC C grammar?

 The only sad thing is that this parser takes much more time
 to parse because there are more parsers working in parallel.
 My test file with an unambiguous grammar parses in 30ms
 while this one with the GLR parser takes 2s, although it would
 probably be possible to optimize both GLR parser and grammar.

Why don't you drop table-based parsing altogether? It's always slow in practice anyway! Even the GCC crew, seasoned at table-based parsing, thinks so. It is even "considered harmful for reengineering purposes".

-eye
Aug 23 2004
next sibling parent reply Andy Friesen <andy ikagames.com> writes:
I hit an issue like this just yesterday.

    a * b; // is it a local variable or a multiplication?

 -- andy
Aug 23 2004
next sibling parent Stephan Wienczny <Stephan Wienczny.de> writes:
Andy Friesen wrote:
 I hit an issue like this just yesterday.

     a * b; // is it a local variable or a multiplication?

Isn't that decided during semantic analysis? Stephan
Aug 23 2004
prev sibling next sibling parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Andy Friesen wrote:
 Ilya Minkov wrote:
 
 I wonder what exactly causes the ambiguity. Which rules exactly conflict? Can't the decision be shifted out? And finally, C also has to deal with it somehow. Did you base your D grammar on an existing YACC C grammar?


You can't do a YACC C grammar without a lot of hacking. The example below is just one reason why.
 I hit an issue like this just yesterday.
 
     a * b; // is it a local variable or a multiplication?

In D, the rule is (or at least, used to be) that if it looks like either a no-op expression or a type, then it must be a type. In the example above, it must be a type. WALTER: Has this changed since you implemented opMul()? What if opMul() has side effects?
Aug 23 2004
parent "Walter" <newshound digitalmars.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:cgdrmm$176$1 digitaldaemon.com...
 I hit an issue like this just yesterday.

     a * b; // is it a local variable or a multiplication?

In D, the rule is (or at least, used to be) that if it looks like either a no-op expression or a type, then it must be a type. In the example above, it must be a type. WALTER: Has this changed since you implemented opMul()?

No.
 What if opMul() has side effects?

Writing it as: (a * b); will cause it to be parsed as a multiply.
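
The rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how DMD's parse.c does it: the tokenizer, the helper names, and the very small notion of a declaration (a name, optional '*' or '[]' suffixes, a second name, then ';') are all invented for the example. Anything that does not fit that shape is simply reported as an expression without being checked further.

```python
import re

# Hypothetical tokenizer: identifiers, "[]", and a few single-character tokens.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\[\]|[*+/()=;-])")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def is_identifier(token):
    return IDENTIFIER.fullmatch(token) is not None


def parses_as_declaration(tokens):
    """Name, then any number of '*' or '[]' suffixes, then Name, then ';'."""
    if not tokens or not is_identifier(tokens[0]):
        return False
    i = 1
    while i < len(tokens) and tokens[i] in ("*", "[]"):
        i += 1
    return len(tokens) == i + 2 and is_identifier(tokens[i]) and tokens[i + 1] == ";"


def classify(statement):
    # "If it parses as a declaration, it is a declaration."
    return "declaration" if parses_as_declaration(tokenize(statement)) else "expression"


print(classify("a * b;"))    # declaration
print(classify("(a * b);"))  # expression
print(classify("a + b;"))    # expression
```

The parenthesized form falls through to the expression case because no declaration in this sketch can start with '('.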
Aug 28 2004
prev sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Andy Friesen" <andy ikagames.com> wrote in message
news:cgdqeq$ig$1 digitaldaemon.com...
 I hit an issue like this just yesterday.

      a * b; // is it a local variable or a multiplication?

That's resolved with the rule "if it parses as a declaration, it is a declaration."
Aug 28 2004
next sibling parent "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Walter" <newshound digitalmars.com> wrote in message
news:cgqo26$3m5$1 digitaldaemon.com...
 "Andy Friesen" <andy ikagames.com> wrote in message
 news:cgdqeq$ig$1 digitaldaemon.com...
 I hit an issue like this just yesterday.

      a * b; // is it a local variable or a multiplication?

That's resolved with the rule "if it parses as a declaration, it is a declaration."

I figured that out from parse.c but how do you write a grammar rule for that rule ;)
Aug 29 2004
prev sibling parent reply Roberto Mariottini <Roberto_member pathlink.com> writes:
In article <cgqo26$3m5$1 digitaldaemon.com>, Walter says...
"Andy Friesen" <andy ikagames.com> wrote in message
news:cgdqeq$ig$1 digitaldaemon.com...
 I hit an issue like this just yesterday.

      a * b; // is it a local variable or a multiplication?

That's resolved with the rule "if it parses as a declaration, it is a declaration."

Walter, what about supporting an optional ':' to use in declarations, like this:

    a* : b; // b is of type a*

This can also be useful for type lists:

    int*[] : a, b, c, d; // all of type int*[]

All this is for clarity only. I don't know if this can help to disambiguate or will only cause trouble for the parser.

Ciao
Aug 30 2004
next sibling parent Andy Friesen <andy ikagames.com> writes:
This is really easy to resolve with ANTLR:

    statement
        // if it could be a type, assume it is a type
        : (type) => type Identifier
        // else maybe it's an expression
        | expression
        | ...
        ;

 -- andy
Aug 30 2004
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Mon, 30 Aug 2004 07:32:30 +0000 (UTC), Roberto Mariottini wrote:

 Walter, what about supporting an optional ':' to use in declarations, like this:

     a* : b; // b is of type a*

 this can be useful also for type lists:

     int*[] : a, b, c, d; // all of type int*[]

 All this is for clarity only. I don't know if this can help to disambiguate or will only cause trouble to the parser.

I imagine that the current recommended style is ...

    alias a* ap;
    ap b; // b is of type a* (aka ap)

    alias int*[] ipa;
    ipa a, b, c, d; // all of type int*[] (aka ipa)

It's a pity that '*' is used for two totally different syntax purposes.

-- 
Derek
Melbourne, Australia
30/Aug/04 5:52:34 PM
Aug 30 2004
next sibling parent reply Id <Id_member pathlink.com> writes:
In article <cgumig$28n8$1 digitaldaemon.com>, Derek Parnell says...

Its a pity that '*' is used for two totally different syntax purposes.

For D 2.0 (unless total backwards source compatibility is needed...), in order to avoid confusion between the multiplication operator and the pointer, what if:

"*" continues being used for multiplication (e.g. a*=b; // multiplies a by b)
"º" indicates a pointer (e.g. longº foo; // pointer to a long)
Aug 30 2004
parent Ilya Minkov <minkov cs.tum.edu> writes:
Id schrieb:

 "*" continues being used for multiplications (eg: a*=b; //multiplies a*b)
 "º" indicates a pointer (eg: longº foo; //pointer to a long)

° is not ASCII.

-eye
Aug 30 2004
prev sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Mon, 30 Aug 2004 17:56:00 +1000, Derek Parnell <derek psych.ward> wrote:

 I imagine that the current recommended style is ...

   alias a* ap;
   ap b; // b is of type a* (aka ap)

Actually, the style guide says to "avoid pointless type aliases" and lists the following as pointless:

    alias void VOID;
    alias int INT;
    alias int* pint;

The last is suspiciously like the example you give above.

   alias int*[] ipa;
   ipa a, b, c, d; // all of type int*[] (aka ipa)

 It's a pity that '*' is used for two totally different syntax purposes.

Yeah.. if only pointers and multiplication weren't so useful we could remove one of them. Luckily the need for pointers has lessened in D, as compared to C/C++.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Aug 30 2004
parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Regan Heath" <regan netwin.co.nz> wrote in message
news:opsdkgbpiy5a2sq9 digitalmars.com...
 Its a pity that '*' is used for two totally different syntax purposes.

Yeah.. if only pointers and multiplication weren't so useful we could remove one of them. Luckily the need for pointers has lessened in D, as compared to C/C++.

But pointers are not the problem here, types are. Because in "a * b", a can
be either a type or an object. A simple solution would be to mark types
in some way, like:
type a* b; //declaration
a*b; //expression

But who would like to write "type" before every type in a C-style
language? :)
Aug 31 2004
next sibling parent reply Stephan Wienczny <Stephan Wienczny.de> writes:
Ivan Senji wrote:
 
 But pointers are not a problem here, types are. Because in "a * b" a can
 be both type and an object. A simple solution would be to mark types
 some way, like:
 type a* b; //declaration
 a*b; //expression
 
 But who would like to write "type" before every type in a Cstyle
 language? :)
 
 

I would suggest using "var", as it is one letter shorter ;-P
For me it would not be much of a problem to use such a style. It could make reading sources easier.

Stephan
Aug 31 2004
parent reply Id <Id_member pathlink.com> writes:
In article <ch21n2$qdh$1 digitaldaemon.com>, Stephan Wienczny says...
I would suggest to use "var" as this is one letter less ;-P For me it would not be that problem to use such a style. It could make reading sources easier. Stephan

Or, in any case, since the º character which I suggested for pointers is not in the 7-bit ASCII table (oops... ¬¬'), what about using " " for pointers? That is: cent foo; // foo is an overused name which manages to be a pointer to a cent ;P
Aug 31 2004
parent reply Sean Kelly <sean f4.ca> writes:
In article <ch21n2$qdh$1 digitaldaemon.com>, Stephan Wienczny says...
Ivan Senji wrote:
 
 But pointers are not a problem here, types are. Because in "a * b" a can
 be both type and an object. A simple solution would be to mark types
 some way, like:
 type a* b; //declaration
 a*b; //expression
 
 But who would like to write "type" before every type in a Cstyle
 language? :)


Why not:

a* b; // declaration
(a*b); // expression

Sean
Aug 31 2004
parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Sean Kelly" <sean f4.ca> wrote in message
news:ch2lcv$15l7$1 digitaldaemon.com...
 Why not:

 a* b; // declaration
 (a*b); // expression

It is the way it is now, but:
a/b;  //expression
a+b; //expression
a*b; //declaration

I don't like it.

Aug 31 2004
parent reply J C Calvarese <jcc7 cox.net> writes:
Ivan Senji wrote:
 "Sean Kelly" <sean f4.ca> wrote in message
 news:ch2lcv$15l7$1 digitaldaemon.com...
 
 It is the way it is now, but:
 a/b;  //expression
 a+b; //expression
 a*b; //declaration

 I don't like it.

Would you like this better?

cast(thisIsAnExpression) (a*b); //expression
type(thisIsADeclaration)  a*b;  //declaration

Perfectly clear.
Painful to type.
Hurts my eyes. Ow.

-- 
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
Aug 31 2004
next sibling parent Derek Parnell <derek psych.ward> writes:
On Tue, 31 Aug 2004 17:39:27 -0500, J C Calvarese wrote:

 Ivan Senji wrote:
 "Sean Kelly" <sean f4.ca> wrote in message
 news:ch2lcv$15l7$1 digitaldaemon.com...
 
 Why not:

 a* b; // declaration
 (a*b); // expression

I was thinking that '*' used as a pointer type is 'attached' to, or associated with, the left-hand expression, that is, the 'a'. So to make that obvious one could code ...

   (a*)b; // declaration

However, that is interpreted as a deprecated cast!
 It is the way it is now, but:
 a/b;  //expression
 a+b; //expression
 a*b; //declaration
 
 I don't like it.


Hmmm... when put like that it does seem a tad inconsistent.
 Would you like this better?
 
 cast(thisIsAnExpression) (a*b); //expression
 type(thisIsADeclaration)  a*b;  //declaration
 
 Perfectly clear.
 Painful to type.
 Hurts my eyes. Ow.

So back on the 'cast' sort of syntax...

   type(a*)b; // declaration

as it reads "I am declaring a variable called 'b' of type a*"

This seems to work in the general case ...

  declaration:: "type" "(" type_expression ")" newident [, newident ]... ";"
  newident:: identifier [ "=" initval ]
  initval:: ( literal | classinit )
  classinit:: "new" classname [ "(" parmlist ")" ]

  type(int) d;
  type(Foo) f;
  type(Bar[]*) b;
  type(XYZ*[]) x;

One would only need to use this 'type' syntax to remove ambiguity and still have consistent expression syntax.

-- 
Derek
Melbourne, Australia
1/Sep/04 9:28:59 AM
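
A quick way to try the proposed form is a small recognizer. The Python sketch below is only an assumption-laden illustration: the pattern and the function name are invented here, type_expression is narrowed to a name followed by '*' or '[]' suffixes, and the optional "= initval" part is left out to keep it short.

```python
import re

# Hypothetical recognizer for:
#   "type" "(" type_expression ")" newident [, newident]... ";"
DECLARATION = re.compile(
    r"""
    type \s* \( \s*
    (?P<type>[A-Za-z_]\w* (?: \s* (?: \* | \[\] ) )* )      # name plus pointer/array suffixes
    \s* \) \s*
    (?P<names>[A-Za-z_]\w* (?: \s* , \s* [A-Za-z_]\w* )* )  # one or more new identifiers
    \s* ;
    """,
    re.VERBOSE,
)


def parse_type_declaration(text):
    """Return (type_text, [names]) for a 'type(...)' declaration, else None."""
    m = DECLARATION.fullmatch(text.strip())
    if m is None:
        return None
    type_text = re.sub(r"\s+", "", m.group("type"))
    names = [name.strip() for name in m.group("names").split(",")]
    return type_text, names


print(parse_type_declaration("type(a*)b;"))        # ('a*', ['b'])
print(parse_type_declaration("type(Bar[]*) b;"))   # ('Bar[]*', ['b'])
print(parse_type_declaration("type(int) d, e;"))   # ('int', ['d', 'e'])
print(parse_type_declaration("a*b;"))              # None
```

Because every declaration in this form starts with the keyword, the recognizer never has to guess whether a leading name is a type, which is the point of the proposal.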
Aug 31 2004
prev sibling next sibling parent Id <Id_member pathlink.com> writes:
In article <ch2un1$1a8k$1 digitaldaemon.com>, J C Calvarese says...

Would you like this better?

cast(thisIsAnExpression) (a*b); //expression
type(thisIsADeclaration)  a*b;  //declaration

Perfectly clear.
Painful to type.
Hurts my eyes. Ow.

Painful? That's because you haven't seen this yet! ;) :

public static headache main() // returns a headache, for sure :P
{
#define <'*',!pointer, 0>; // code 0 assigned to pointer
#define <'*',!mult_operand, 1>; // code 1 assigned to mult_operand

foo*<0> a=0x0040_0000; // a is a pointer to a foo
long n1=5, n2=7, n3=9, n4=-12;
n1*<1>=n2; // multiplies n1*n2
n3=n1*<2>n4; // error, code 2 for '*' not specified
return null;
}

Ok, after this joke... you could have heard what I proposed before (which seemed to be ignored...) ¬_¬' . What about if the ' ' character was used for pointers? It's a simple, straightforward way! :

a*b; // expression - ol' school multiplication
foo b; // declaration - b is a pointer to a foo
Aug 31 2004
prev sibling parent Roberto Mariottini <Roberto_member pathlink.com> writes:
In article <ch2un1$1a8k$1 digitaldaemon.com>, J C Calvarese says...

Would you like this better?

cast(thisIsAnExpression) (a*b); //expression
type(thisIsADeclaration)  a*b;  //declaration

Perfectly clear.
Painful to type.
Hurts my eyes. Ow.

I like my proposal better:

    a * b; // expression
    a* : b; // declaration

I know that typing a ':' for every declaration is not for lazy people, so I'd leave it optional, used to make things clearer, or only when an ambiguity is found:

    a b; // no ambiguity: declaration
    a[] b; // ditto
    a : b; // OK, ':' is redundant
    a* : b; // ambiguous, must use ':' to be a declaration

And so on.

Ciao
Aug 31 2004
prev sibling parent Regan Heath <regan netwin.co.nz> writes:
On Tue, 31 Aug 2004 15:27:34 +0200, Ivan Senji <ivan.senji public.srce.hr> 
wrote:
 "Regan Heath" <regan netwin.co.nz> wrote in message
 news:opsdkgbpiy5a2sq9 digitalmars.com...
 On Mon, 30 Aug 2004 17:56:00 +1000, Derek Parnell <derek psych.ward>


<snip>
  alias int*[] ipa;
  ipa a, b, c, d; // all of type int*[] (aka ipa)

 Its a pity that '*' is used for two totally different syntax purposes.

Yeah.. if only pointers and multiplication weren't so useful we could remove one of them. Luckily the need for pointers has lessened in D, as compared to C/C++.

But pointers are not a problem here, types are.

Isn't a pointer a type?
 Because in "a * b" a can
 be both type and an object.

I realise that.

Question: If there was no such thing as a pointer, what does "a * b" mean?
Answer: multiplication

Question: If there was no such thing as multiplication, what does "a * b" mean?
Answer: a pointer to an 'a' called 'b'.
 A simple solution would be to mark types
 some way, like:
 type a* b; //declaration
 a*b; //expression

 But who would like to write "type" before every type in a Cstyle
 language? :)

Walter said "a * b" will be interpreted as a pointer declaration; to make it an expression you use parentheses, e.g. (a * b).

That seems to me to be an OK solution.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Aug 31 2004
prev sibling parent reply "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:cgdp33$30rl$1 digitaldaemon.com...
 Ivan Senji schrieb:

 I forgot to mention that if you remove "type" from basic_type grammar
 rule you get an ambiguous LR(1) grammar but it is still possible to
 parse that language with a parser I used to call nondeterministic LR(1)
 before I found out it is actually Generalized LR or GLR.

 I wonder what exactly causes the ambiguity. Which rules exactly conflict? Can't the decision be shifted out?

Andy Friesen's example describes the problem correctly:

    a * b;

Is "a" a type and I'm declaring a pointer to that type, or am I just multiplying a and b? DMD also has problems with this: "proba2.d(13): a is used as a type". The grammar rules could probably be rewritten to combine these two cases into one, but I haven't yet found the courage (or time) to try it.
 And finally, C also has to deal with it someway. Did you base your D
 grammar on an existing YACC C grammar?

Some parts are, but I will have to look into it to see how it solves problems of this type, although I suspect it uses some additional disambiguating rules because it has:

ifStatement -> if ( expression ) statement else statement
ifStatement -> if ( expression ) statement

and this isn't really LR(1).
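
Those two rules are the classic dangling else: for a nested if, the else can belong to either if, so the grammar is ambiguous. Yacc-style generators usually settle it by preferring to shift, which attaches the else to the nearest if. The Python sketch below shows the same choice in a hand-written recursive-descent form rather than a table; the token format (single-word conditions and statements, no parentheses) is a simplification assumed for the example.

```python
def parse_statement(tokens, pos=0):
    """Return (tree, next_pos). Conditions and plain statements are single tokens."""
    if tokens[pos] != "if":
        return tokens[pos], pos + 1
    condition = tokens[pos + 1]
    then_branch, pos = parse_statement(tokens, pos + 2)
    else_branch = None
    # Greedy choice: an 'else' is consumed by the innermost 'if' still open,
    # which mirrors preferring shift over reduce in an LR parser.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_statement(tokens, pos + 1)
    return ("if", condition, then_branch, else_branch), pos


tree, end = parse_statement("if c1 if c2 s1 else s2".split())
print(tree)  # ('if', 'c1', ('if', 'c2', 's1', 's2'), None)
```

The other reading, with the else attached to the outer if, is never produced by this sketch; that is the disambiguating rule doing its work.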
 The only sad thing is that this parser takes much more time
 to parse because there are more parsers working in parallel.
 My test file with an unambiguous grammar parses in 30ms
 while this one with the GLR parser takes 2s, although it would
 probably be possible to optimize both GLR parser and grammar.

 Why don't you drop table-based parsing altogether? It's always slow in practice anyway! Even the GCC crew, seasoned at table-based parsing, thinks so. It is even "considered harmful for reengineering purposes".

Well, I wrote a program to generate that table, and if I dropped it, it would mean that writing it was a waste of time :)

Maybe it is slow, but there are advantages: I have to write the table generator and simulator only once and can use them with any grammar. For example, I have used it so far to parse C, D, Java, our simple project programming language at the university, and a bunch of small grammars.

I use it as a toy and as a tool to help me write a grammar for a hypothetical programming language based on D. I make a syntax change and in a couple of minutes I can test it: I have a syntax tree to see if the results are good, and I have plans to create a recursive descent parser generator that would traverse this tree. I know it will be slow, but it is a lot of fun! :)

Can you please explain why it is "considered harmful for reengineering purposes"?
Aug 23 2004
next sibling parent reply Andy Friesen <andy ikagames.com> writes:
Ivan Senji wrote:
 "Ilya Minkov" <minkov cs.tum.edu> wrote in message
 news:cgdp33$30rl$1 digitaldaemon.com...
 
Ivan Senji schrieb:


I forgot to mention that if you remove "type" from basic_type grammar
rule you get an ambigous LR(1) grammar but it is still possible to
parse that language with a parser i used to call nondeterministic LR(1)
before i found out it is actually Generalized LR or GLR.

I wonder what exactly causes the ambiguity. Which rules exactly conflict? Can't the decision be shifted out?

Andy Friesen's example describes the problem correctly: a * b; Is "a" a type and i'm declaring a pointer to that type, or am i just multiplying a and b? DMD also has problems with this: "proba2.d(13): a is used as a type" The grammar rules could probably be rewritten to combine these two cases into one case, but i haven't yet found the courage (or time) to try it.

This is what I did:
 TypeOrExpression
     : Type IDENTIFIER // it's a local
     | Type IDENTIFIER '=' ConditionalExpression // initialized local
     | Type AssignOp ConditionalExpression // expression (this is the ugly bit)
     | AssignExpression // expression
     ;

I'll admit, though, that I haven't tested it as much as I ought to, and it certainly has its share of problems. (like allowing statements like "int = 8")

(I really should look into replacing it with an ANTLR grammar. It behaves in a much more intuitive manner, I think)

-- andy
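To see why "int = 8" slips through the rule above, it may help to trace the alternatives by hand. The following is only a rough Python sketch under heavy assumptions: Type is shrunk to a single name, ConditionalExpression to a single token, AssignOp to "=", and BUILTIN_TYPES, is_identifier and classify are hypothetical names that appear in no real grammar. The check that rejects a built-in type on the left of "=" is one possible post-parse filter, not something taken from the posted grammar:

```python
BUILTIN_TYPES = {"int", "char", "float"}  # assumed subset, for illustration


def is_identifier(tok):
    """A plain name that is not one of the built-in type keywords."""
    return tok.isidentifier() and tok not in BUILTIN_TYPES


def classify(tokens):
    """Name the alternative of the rule above that a short token list matches."""
    if tokens and tokens[0].isidentifier():  # anything that could be a Type
        rest = tokens[1:]
        if len(rest) == 1 and is_identifier(rest[0]):
            return "local"
        if len(rest) == 3 and is_identifier(rest[0]) and rest[1] == "=":
            return "initialized local"
        if len(rest) == 2 and rest[0] == "=":
            # The "ugly bit": a bare name was read as a Type, so
            # "x = 8" lands here. So does "int = 8", unless filtered.
            if tokens[0] in BUILTIN_TYPES:
                raise SyntaxError("cannot assign to type " + tokens[0])
            return "expression"
    return "expression"
```

With this filter, classify(["x", "=", "8"]) is still treated as an expression, while classify(["int", "=", "8"]) is rejected instead of being silently accepted.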
Aug 24 2004
parent "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Andy Friesen" <andy ikagames.com> wrote in message
news:cgep6c$gaj$1 digitaldaemon.com...
 Ivan Senji wrote:
 "Ilya Minkov" <minkov cs.tum.edu> wrote in message
 news:cgdp33$30rl$1 digitaldaemon.com...

Ivan Senji schrieb:


I forgot to mention that if you remove "type" from basic_type grammar
rule you get an ambigous LR(1) grammar but it is still possible to
parse that language with a parser i used to call nondeterministic LR(1)
before i found out it is actually Generalized LR or GLR.

I wonder what exactly causes the ambiguity. Which rules exactly conflict? Can't the decision be shifted out?

Andy Friesen's example describes the problem correctly: a * b; Is "a" a type and i'm declaring a pointer to that type, or am i just multiplying a and b? DMD also has problems with this: "proba2.d(13): a is used as a type" The grammar rules could probably be rewritten to combine these two cases into one case, but i haven't yet found the courage (or time) to try it.

This is what I did:
 TypeOrExpression
     : Type IDENTIFIER // it's a local
     | Type IDENTIFIER '=' ConditionalExpression // initialized local
     | Type AssignOp ConditionalExpression // expression (this is the ugly bit)
     | AssignExpression // expression
     ;


Where does this go? ( :) ) i have this part causing problems:
<ClosedStatement> -> <Declaration>
<ClosedStatement> -> <Expression> ;
Can i find your grammar somewhere, maybe combine them and create a better one?
 I'll admit, though, that I haven't tested it as much as I ought to, and
 it certainly has its share of problems. (like allowing statements like
 "int = 8")

 (I really should look into replacing it with an ANTLR grammar.  It
 behaves in a much more intuitive manner, I think)

   -- andy

Aug 24 2004
prev sibling next sibling parent reply "antiAlias" <fu bar.com> writes:
"Ivan Senji" <ivan.senji public.srce.hr> .
 "Ilya Minkov" <minkov cs.tum.edu> wrote in message
 Why don't you drop the table-based parsing altogether? It's always slow
 in practice anyway! Even the GCC crew seasoned at table-based parsing
 thinks so. Even, it is "considered harmful for reengineering purposes".

Well, i wrote a program to generate that table, and if i dropped it, it would mean that writing it was a waste of time :) Maybe it is slow, but there are advantages: i have to write the table generator and simulator only once and can use them with any grammar. For example, i have used it so far to parse C, D, Java, our simple project programming language at the university and a bunch of small grammars. I use it as a toy and a tool to help me write a grammar for a hypothetical programming language based on D. I make a syntax change and in a couple of minutes i can test it: i have a syntax tree to see if the results are good, and i have plans to create a recursive descent parser generator that would traverse this tree, and i know it will be slow but it is a lot of fun! :)

FWIW, I think what you're doing is highly commendable! Nice work, dude. Wish I could have hired someone like yourself last year for a somewhat related project.
Aug 24 2004
parent "Ivan Senji" <ivan.senji public.srce.hr> writes:
"antiAlias" <fu bar.com> wrote in message
news:cgepb9$geg$1 digitaldaemon.com...
 "Ivan Senji" <ivan.senji public.srce.hr> .
 "Ilya Minkov" <minkov cs.tum.edu> wrote in message
 Why don't you drop the table-based parsing altogether? It's always slow
 in practice anyway! Even the GCC crew seasoned at table-based parsing
 thinks so. Even, it is "considered harmful for reengineering purposes".
 Well i wrote a program to generate that table and if i dropped it
 it would mean that writing it was a waste of time :)
 Maybe it is slow but there are advantages: i have to write the table
 generator and simulator once and can use it with any grammar:
 For example i have used it so far to parse C,D,Java, our simple-project
 programming language at the university and a bunch of small grammars.

 I use it as a toy and a tool to help me write a grammar for a hypothetical
 programming language based on D. I make a syntax change and in a
 couple of minutes i can test it: i have a syntax tree to see if the results
 are good, and i have plans to create a recursive descent parser generator
 that would traverse this tree, and i know it will be slow but it is a lot
 of fun!
 :)

FWIW, I think what you're doing is highly commendable! Nice work, dude.

Not really, it is just a matter of implementing an algorithm, not very different from any other algorithm. :)
 Wish
 I could have hired someone like yourself last year for a somewhat related
 project.

Aug 24 2004
prev sibling parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Ivan Senji schrieb:

 Andy Friesen's example describes the problem correctly:
  a * b;
 Is "a" a type and i'm declaring a pointer to that type,
 or am i just multiplying a and b?

Everything that can be taken for a declaration is. On the other hand, YACC would want that also to work like that in the middle of the expression, right?
 DMD also has problems with this:
 "proba2.d(13): a is used as a type"

DMD lives from expectations. It knows not to expect a declaration within the expression, but what you are trying to parse is a statement.
 The grammar rules could probably be rewritten to combine these
 two cases into one case, but i haven't yet found the courage (or time)
 to try it.

I think *this* is the reason why it's "considered harmful". :)
 Well i wrote a program to generate that table and if i dropped it
 it would mean that writing it was a waste of time :)

If your target was to learn to write one, then why was it a waste of time? :)
 Maybe it is slow but there are advantages: i have to write the table
 generator and simulator once and can use it with any grammar:
 For example i have used it so far to parse C,D,Java, our simple-project
 programming language at the university and a bunch of small grammars.

Simulator?
 I use it as a toy and a tool to help me write a grammar for a hypothetical
 programming language based on D. I make a syntax change and in a
 couple of minutes i can test it: i have a syntax tree to see if the results
 are good, and i have plans to create a recursive descent parser generator
 that would traverse this tree, and  i know it will be slow but it is a lot
 of fun!
 :)

Why should it be slow?
 Can you please explain why it is: "considered harmful for reengineering
 purposes". ?

There was an article i ran across once; i didn't really like its tone but the points seem to be valid... I think it was on parsing COBOL, where many sorts of syntax extensions appeared over the years. The requirement for parsing was that syntax parts could be enabled and disabled, which is possible with recursive descent parsers (though i don't know whether any automatic generator has such an ability), but problematic with table-based systems. The other problem was that if the grammar needed to be slightly extended, it might provoke a disproportionate amount of editing to disambiguate it - apart from GLR of course, but what would be the advantage compared to recursive descent?
-eye
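The point about enabling and disabling syntax parts can be illustrated with a very small recursive descent sketch in Python. Everything here is hypothetical and chosen only for illustration: the StatementParser class, the enabled set, the one-token placeholder statement "s", and the use of "synchronized" as the switchable construct. It is not taken from the article mentioned above or from any existing parser generator:

```python
class StatementParser:
    """Toy recursive descent parser with optional syntax switched at run time."""

    def __init__(self, enabled=()):
        # Names of the optional constructs this parser instance accepts.
        self.enabled = set(enabled)

    def parse(self, tokens):
        """Parse exactly one statement and return it as a nested tuple."""
        tree, pos = self._statement(tokens, 0)
        if pos != len(tokens):
            raise SyntaxError("trailing tokens")
        return tree

    def _statement(self, tokens, pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        # The extension is a single guarded branch; no table is regenerated.
        if tok == "synchronized" and "synchronized" in self.enabled:
            body, pos = self._statement(tokens, pos + 1)
            return ("synchronized", body), pos
        if tok == "s":
            return "s", pos + 1
        raise SyntaxError("unexpected token " + repr(tok))
```

StatementParser(enabled={"synchronized"}).parse(["synchronized", "s"]) accepts the input, while StatementParser().parse(["synchronized", "s"]) raises SyntaxError. A table-based parser would typically need its tables rebuilt for each combination of switches.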
Aug 27 2004
parent "Ivan Senji" <ivan.senji public.srce.hr> writes:
"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:cgo4fg$207v$2 digitaldaemon.com...
 Ivan Senji schrieb:

 Andy Friesen's example describes the problem correctly:
  a * b;
 Is "a" a type and i'm declaring a pointer to that type,
 or am i just multiplying a and b?

Everything that can be taken for a declaration is. On the other hand, YACC would want that also to work like that in the middle of the expression, right?
 DMD also has problems with this:
 "proba2.d(13): a is used as a type"

DMD lives from expectations. It knows not to expect a declaration within the expression, but what you are trying to parse is a statement.
 The grammar rules could probably be rewritten to combine these
 two cases into one case, but i haven't yet found the courage (or time)
 to try it.

I think *this* is the reason why it's "considered harmful". :)

Then i wouldn't try :)
 Well i wrote a program to generate that table and if i dropped it
 it would mean that writing it was a waste of time :)

If your target was to learn to write one, then why was it a waste of time? :)

You are absolutely right, nothing you learn to do is a waste of time. (almost nothing)
 Maybe it is slow but there are advantages: i have to write the table
 generator and simulator once and can use it with any grammar:
 For example i have used it so far to parse C,D,Java, our simple-project
 programming language at the university and a bunch of small grammars.

Simulator?

Ah! It is my English! When i said "simulator" i meant the part of the program that takes an input string and a table and returns 0 or more parse trees as output.
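As a rough illustration of such a "simulator", here is a minimal table-driven shift/reduce driver in Python. The toy grammar, the hand-built ACTION and GOTO tables and the name run_table are assumptions made for this sketch and have nothing to do with the real D grammar or the real program. This driver is deterministic, so it returns an empty list or a list with one tree; a GLR driver would follow several table entries in parallel and could return more:

```python
# Toy grammar (hypothetical):
#   rule 1: E -> E + n
#   rule 2: E -> n
RULES = {1: ("E", 3), 2: ("E", 1)}  # rule number -> (head, right-hand-side length)

# Hand-built LR table for the toy grammar; "$" marks end of input.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def run_table(tokens):
    """Drive the table over tokens and return a list of parse trees."""
    states = [0]   # stack of parser states
    trees = []     # stack of subtrees, parallel to states[1:]
    stream = list(tokens) + ["$"]
    pos = 0
    while True:
        entry = ACTION.get((states[-1], stream[pos]))
        if entry is None:
            return []  # no table entry: syntax error, zero trees
        kind, arg = entry
        if kind == "shift":
            states.append(arg)
            trees.append(stream[pos])
            pos += 1
        elif kind == "reduce":
            head, size = RULES[arg]
            children = trees[-size:]
            del trees[-size:]
            del states[-size:]
            trees.append((head, children))
            states.append(GOTO[(states[-1], head)])
        else:  # accept
            return [trees[-1]]
```

Only the tables and RULES depend on the grammar; run_table itself stays the same for any grammar, which is the advantage described earlier in the thread.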
 I use it as a toy and a tool to help me write a grammar for a hypothetical
 programming language based on D. I make a syntax change and in a
 couple of minutes i can test it: i have a syntax tree to see if the results
 are good, and i have plans to create a recursive descent parser generator
 that would traverse this tree, and i know it will be slow but it is a lot
 of fun!
 :)
 :)

Why should it be slow?

At least slower than a normal recursive descent parser, because this one will first do the parsing part with a table.
 Can you please explain why it is: "considered harmful for reengineering
 purposes". ?

There was an article i ran across once; i didn't really like its tone but the points seem to be valid... I think it was on parsing COBOL, where many sorts of syntax extensions appeared over the years. The requirement for parsing was that syntax parts could be enabled and disabled, which is possible with recursive descent parsers (though i don't know whether any automatic generator has such an ability), but problematic with table-based systems. The other problem was that if the grammar needed to be slightly extended, it might provoke a disproportionate amount of editing to disambiguate it - apart from GLR of course, but what would be the advantage compared to recursive descent?

I see! Thanks
 -eye

Aug 29 2004