D - Project looks very promising - Lets get a compiler out fast :)

Michael Gaskins (4/4) Aug 16 2001 The language specifications look very good to me (I do mostly Java codin...

Robert W. Cunningham (29/33) Aug 16 2001 Of possibly greater importance is a language test suite, usable both for

Walter (20/53) Aug 18 2001 D is designed to be easy to parse. The semantic routines are a little

Axel Kittenberger (5/12) Aug 18 2001 How about:

Christophe de Dinechin (10/22) Aug 18 2001 Oh, yes, let's do the same mistake with angle brackets that w...

Axel Kittenberger (24/28) Aug 18 2001 x shifted left by v is greater than 3 cast to o.

Rajiv Bhagwat (8/36) Aug 18 2001 Hey guys,

Walter (3/6) Aug 18 2001 As one wag once said, you can write FORTRAN in any language.

John Fletcher (3/10) Aug 21 2001 One of my research students wrote a FORTRAN program which wrote FORTRAN.

Russell Bornschlegel (6/26) Aug 18 2001 This I could live with.

Walter (3/6) Aug 18 2001 Yes, because the grammar is too ambiguous. A goal of D is to make parsin...

Sean L. Palmer (7/15) Nov 03 2001 The only other problem with C++ style casts is that it requires making

Walter (11/29) Dec 15 2001 I'm still in a bit of a quandary about casts. Should casting be:

Pavel Minayev (5/12) Dec 16 2001 Personally, I've found myself using the cast() form.
Axel Kittenberger (23/30) Dec 16 2001 May I brainstorm a bit?

Pavel Minayev (12/41) Dec 16 2001 It's also hard to distinguish from normal expression containing

la7y6nvo shamko.com (47/58) Dec 16 2001 Please let D abandon the abominable syntax that C uses for casts.

Axel Kittenberger (11/18) Dec 16 2001 True, in fact there are 4 casts I know of, conversion cast, upcast,

Walter (11/21) Dec 16 2001 use

Walter (14/21) Dec 16 2001 There's a related issue for declarations, is:

Pavel Minayev (12/17) Dec 16 2001 One

Walter (8/27) Dec 17 2001 with

Sean L. Palmer (21/51) Dec 17 2001 I guess you can't have everything. Either that or you have the coder

Walter (19/34) Dec 17 2001 ever

Pavel Minayev (6/10) Dec 17 2001 functions

Roberto Mariottini (3/12) Dec 17 2001 Please, save me from this! ;-)

Walter (4/18) Dec 17 2001 Currently, D issues the following error message for the Apple*y:

Axel Kittenberger (6/21) Dec 17 2001 In my opinion 'int Apple' should better raise an error if Apple is defin...

Russ Lewis (7/28) Dec 17 2001 --

Russ Lewis (13/13) Dec 17 2001 Think, Russ! Think, then speak!

Walter (11/14) Dec 17 2001 (except for
Axel Kittenberger (12/23) Dec 18 2001 Not necessarily true, the lexer can do type lookups, and return different

Walter (4/15) Dec 18 2001 size

Pavel Minayev (6/11) Dec 17 2001 This might be true if Apple was a local class. But what if it

Roberto Mariottini (30/55) Dec 17 2001 former

Sean L. Palmer (40/71) Dec 17 2001 I just thought of something. Yes, typing var is kinda annoying. But ho...

Pavel Minayev (8/23) Dec 17 2001 Sounds good.
Walter (8/84) Dec 17 2001 It's an intriguing idea. The : version won't work, though, as you'd need

a (11/23) Dec 16 2001 If you ever allow any form of generic programming (templates, etc.)
Charles Hixson (14/23) Jan 02 2002 Casts are one of my least liked features of C and C++. Better

Charles Hixson (25/36) Jan 02 2002 On reading other posts I encountered the proposal to use the syntax:

Pavel Minayev (11/20) Jan 02 2002 I don't see any problem e.as(int) returns a value of type int.

Sean L. Palmer (10/16) Nov 03 2001 to

Axel Kittenberger (4/12) Nov 04 2001 Do you know any OpenSource LL(n) parser generators for C?

Sean L. Palmer (7/19) Nov 05 2001 Antlr is evidently Public Domain:

brucedickey micron.com (12/34) Jul 18 2002 FYI:

Walter (12/28) Dec 15 2001 I can never remember the definitions of those grammars. The current D

Axel Kittenberger (20/25) Dec 16 2001 Well the today active GNU projects are called "flex" and "bison". Bison

Walter (22/45) Dec 16 2001 descent.

Axel Kittenberger (4/12) Dec 16 2001 I had no problems whatever compiling and running bison with MSVC++ (back...

Russ Lewis (22/28) Jul 22 2002 I don't think that LL(k) is possible (without hacks). The problem is th...

Walter (3/7) Aug 18 2001 The compiler does exist, but it is too embarrassingly buggy to post right...

"Michael Gaskins" <mbgaski clemson.edu> writes:

The language specifications look very good to me (I do mostly Java coding
right now but I've also done a fair amount of C and just a tad of C++).
Let's get a compiler for this thing (alpha level, beta level, whatever) to
stave off all the 'vaporware' shouters that will surely come.

Aug 16 2001

"Robert W. Cunningham" <rwc_2001 yahoo.com> writes:

Michael Gaskins wrote:

 The language specifications look very good to me (I do mostly Java coding
 right now but I've also done a fair amount of C and just a tad of C++).
 Let's get a compiler for this thing (alpha level, beta level, whatever) to
 stave off all the 'vaporware' shouters that will surely come.

Of possibly greater importance is a language test suite, usable both for
regression testing of the real compiler, and for uncovering and explaining
thorny language issues by using *real* examples (rather than guesstimates of
what the D language spec *really* means).

And to make such a suite useful, we need a tool that will recognize
valid D programs, even if we can't compile them to machine code and execute
them.

Is D LALR?  If so, then let's use LEX/YACC (or Flex/Bison) to whip up a
quick parser that simply outputs state information (and possibly some crude
C equivalents instead of machine code).  At a minimum, the parser could be a
binary function that accepts text, where the return value would simply
indicate parse success (a valid program).  Semantic actions could be used to
handle many non-LALR language features, as well as documenting violations.
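The "binary function that accepts text" idea can be sketched as a small recursive-descent recognizer. This is only an illustration in C++: the toy grammar below is an assumption made up for the example and is not the D grammar, and the names `Recognizer` and `parses` are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical toy grammar (NOT the D grammar), used only to show the shape
// of a "does this text parse?" tool:
//   program := stmt*
//   stmt    := IDENT '=' expr ';'
//   expr    := term (('+' | '-') term)*
//   term    := IDENT | NUMBER | '(' expr ')'
class Recognizer {
public:
    explicit Recognizer(const std::string& text) : src(text), pos(0) {}

    // True only if the whole input is a sequence of valid statements.
    bool program() {
        skipSpace();
        while (pos < src.size()) {
            if (!stmt()) return false;
            skipSpace();
        }
        return true;
    }

private:
    const std::string& src;
    std::size_t pos;

    static bool isAlpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) || c == '_'; }
    static bool isDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

    void skipSpace() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }
    // Consume one expected punctuation character, if present.
    bool eat(char c) {
        skipSpace();
        if (pos < src.size() && src[pos] == c) { ++pos; return true; }
        return false;
    }
    bool ident() {
        skipSpace();
        if (pos >= src.size() || !isAlpha(src[pos])) return false;
        while (pos < src.size() && (isAlpha(src[pos]) || isDigit(src[pos]))) ++pos;
        return true;
    }
    bool number() {
        skipSpace();
        if (pos >= src.size() || !isDigit(src[pos])) return false;
        while (pos < src.size() && isDigit(src[pos])) ++pos;
        return true;
    }
    bool term() {
        if (eat('(')) return expr() && eat(')');
        return ident() || number();
    }
    bool expr() {
        if (!term()) return false;
        while (eat('+') || eat('-')) {
            if (!term()) return false;
        }
        return true;
    }
    bool stmt() { return ident() && eat('=') && expr() && eat(';'); }
};

// The "binary function": text in, parse success out.
bool parses(const std::string& text) { return Recognizer(text).program(); }
```

A regression suite could then be a list of source strings paired with the expected result of `parses`.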

Such a tool would allow everyone to learn and write D (and develop
regression suites) long before anything close to a full D compiler is
ready.  It may even help if code generation is postponed until after the
grammar and feature set has stabilized.

I remember writing a compiler for the VAX (yes, decades ago) for an obscure
proprietary language.  Our 4 person team focused on getting the grammar
parsing right first.  When we finally knew we were properly handling all the
key test cases, the team then split to implement code generation,
modularity, compiler options, proper error handling, and many related
items.  After spending two months as a team on the parser, I was then able
to write an optimized code generator on my own in a little over a month.
(Well, the VAX instruction set and machine architecture was a dream to work
with - the best of the CISC CPUs.)

With a known-good parser, everything else seems vastly easier.


-BobC

Aug 16 2001

"Walter" <walter digitalmars.com> writes:

D is designed to be easy to parse. The semantic routines are a little
trickier.

The only real trick in the parser is distinguishing a cast from a
parenthesized expression,
and a declaration from a statement.

I considered an alternate syntax for these to make them easier to parse, but
it just doesn't look right. I'm too used to C, I guess. For example:

    cast(int) expr

instead of:

    (int) expr

and:

    var int foo;

instead of:

    int foo;
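A rough sketch of the cast problem, in C++ over pre-split tokens. It assumes a hypothetical `classify` helper and a deliberately tiny set of built-in type names; it is not how the D parser works, only an illustration of which inputs can be decided from tokens alone.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

enum class Kind { Cast, Expression, NeedsSymbolTable };

static bool isIdentifier(const std::string& t) {
    return !t.empty() && (std::isalpha(static_cast<unsigned char>(t[0])) || t[0] == '_');
}

// Decide what the start of an expression is, using tokens only.
Kind classify(const std::vector<std::string>& t) {
    // Assumed, abbreviated list of built-in type keywords.
    static const std::set<std::string> builtin = {"int", "char", "long", "float", "double"};
    // Tokens that can be either a unary prefix or a binary/call operator.
    static const std::set<std::string> unaryOrBinary = {"-", "+", "*", "&", "("};

    // cast(type) expr: the keyword settles it immediately.
    if (!t.empty() && t[0] == "cast") return Kind::Cast;
    if (t.size() < 3 || t[0] != "(") return Kind::Expression;
    // (int) expr: a keyword type cannot be an expression.
    if (builtin.count(t[1]) != 0) return Kind::Cast;
    // Anything other than "( identifier )" is an ordinary parenthesized expression.
    if (!isIdentifier(t[1]) || t[2] != ")") return Kind::Expression;
    if (t.size() == 3) return Kind::Expression;

    const std::string& next = t[3];
    // (a) - b: cast of -b to type a, or a minus b? Tokens alone cannot say.
    if (unaryOrBinary.count(next) != 0) return Kind::NeedsSymbolTable;
    // (a) b: two operands in a row only make sense as a cast.
    if (isIdentifier(next) || std::isdigit(static_cast<unsigned char>(next[0]))) return Kind::Cast;
    return Kind::Expression;
}
```

The `NeedsSymbolTable` case is the one the `cast(int) expr` form removes.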

-Walter

Robert W. Cunningham wrote in message <3B7CB19C.D1C766B1 yahoo.com>...
Michael Gaskins wrote:

 The language specifications look very good to me (I do mostly Java coding
 right now but I've also done a fair amount of C and just a tad of C++).
 Let's get a compiler for this thing (alpha level, beta level, whatever) to
 stave off all the 'vaporware' shouters that will surely come.

Of possibly greater importance is a language test suite, usable both for
regression testing of the real compiler, and for uncovering and explaining
thorny language issues by using *real* examples (rather than guesstimates of
what the D language spec *really* means).

And to make such a suite useful, we need a tool that will recognize
valid D programs, even if we can't compile them to machine code and execute
them.

Is D LALR?  If so, then let's use LEX/YACC (or Flex/Bison) to whip up a
quick parser that simply outputs state information (and possibly some crude
C equivalents instead of machine code).  At a minimum, the parser could be a
binary function that accepts text, where the return value would simply
indicate parse success (a valid program).  Semantic actions could be used to
handle many non-LALR language features, as well as documenting violations.

Such a tool would allow everyone to learn and write D (and develop
regression suites) long before anything close to a full D compiler is
ready.  It may even help if code generation is postponed until after the
grammar and feature set has stabilized.

I remember writing a compiler for the VAX (yes, decades ago) for an obscure
proprietary language.  Our 4 person team focused on getting the grammar
parsing right first.  When we finally knew we were properly handling all the
key test cases, the team then split to implement code generation,
modularity, compiler options, proper error handling, and many related
items.  After spending two months as a team on the parser, I was then able
to write an optimized code generator on my own in a little over a month.
(Well, the VAX instruction set and machine architecture was a dream to work
with - the best of the CISC CPUs.)

With a known-good parser, everything else seems vastly easier.


-BobC

Aug 18 2001

Axel Kittenberger <axel dtone.org> writes:

 
     cast(int) expr
 
 instead of:
 
     (int) expr
 

How about:
 <int> expr

Less-than/greater-than brackets are widely accepted for surrounding types. This not only 
eases parsing for the compiler; to my eyes it also makes the 
source easier for humans to read.

Aug 18 2001

Christophe de Dinechin <descubes earthlink.net> writes:

Axel Kittenberger wrote:

     cast(int) expr

 instead of:

     (int) expr

 How about:
  <int> expr

 Less-than/greater-than brackets are widely accepted for surrounding types. This not only
 eases parsing for the compiler; to my eyes it also makes the
 source easier for humans to read.

<sarcastic>Oh, yes, let's do the same mistake with angle brackets that was
done in C++, angle bracket just look so great.</sarcastic>

Now, please parse for me:

    if (x < < v > <o> 3)

and compare it to

    if (x << v > <o> 3)

Oh, and what if o is a type? Is not a type?


Right, so readable :-)
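The two `if` lines differ only in whitespace, yet a longest-match lexer produces different token streams for them. A small C++ sketch, assuming a hypothetical `lex` function that knows only identifiers, numbers, `<<`, `>>` and single-character punctuation:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split text into tokens, always taking the longest operator that matches.
std::vector<std::string> lex(const std::string& s) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            // Identifier or number: consume the whole run.
            std::size_t j = i;
            while (j < s.size() &&
                   (std::isalnum(static_cast<unsigned char>(s[j])) || s[j] == '_')) ++j;
            out.push_back(s.substr(i, j - i));
            i = j;
        } else if ((c == '<' || c == '>') && i + 1 < s.size() && s[i + 1] == s[i]) {
            // Two adjacent angle brackets become one shift token.
            out.push_back(s.substr(i, 2));
            i += 2;
        } else {
            out.push_back(std::string(1, s[i]));
            ++i;
        }
    }
    return out;
}
```

With this rule `x < < v` yields two `<` tokens while `x << v` yields one `<<` token, so any `<type>` cast syntax has to cope with both.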


Christophe

Aug 18 2001

Axel Kittenberger <axel dtone.org> writes:

Sarcasm is not good; no need to dig that out so early.

     if (x < < v > <o> 3)

x is less than 3 cast to o and then cast to v.

 and compare it to
     if (x << v > <o> 3)

x shifted left by v is greater than 3 cast to o.

Leaving out brackets where they belong is no way to argue about 
languages. I could speak very weird English that still has valid 
grammar; that doesn't make English a bad language. 

We could also start to discuss what a = b+++c; yields, or whether a = b++++c is 
valid syntax, or the old dangling else problem.
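For reference, the `b+++c` case as it behaves in C++ (and C), where the lexer takes the longest token first. The function name is made up for the example.

```cpp
// b+++c is lexed as  b ++ + c , i.e. (b++) + c.
// Returns the value assigned to a; b is incremented as a side effect.
int plus_plus_plus(int& b, int c) {
    int a = b+++c;
    return a;
}

// By the same rule b++++c is lexed as  b ++ ++ c , which does not compile.
```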
-

     if (x < (<v><o> 3))  
is better.

and so is:
     if ((x << v) > (<o> 3))

I see this as at least as good as the traditional form:
     if (x < ((v)(o) 3))  
and
     if ((x << v) > ((o) 3))

And honestly, typecasts and comparisons are not something you'll encounter 
every day in the same line. Normally one casts objects and compares integer 
types. Yes, I know there are cases where you have to compare signed with 
unsigned integers, and there you need a typecast in the same line, but normally 
if you choose your types well this can be avoided.

 Oh, and what if o is a type? Is not a type?

I don't understand this.

- Axel

Aug 18 2001

"Rajiv Bhagwat" <dataflow vsnl.com> writes:

Hey guys,

As an aside, did you know that Walter once won an 'obfuscated c' contest?

We don't want 'D' to be a language for which such contests are held<g>!
-- Rajiv

Axel Kittenberger <axel dtone.org> wrote in message
news:9ll905$7vl$1 digitaldaemon.com...
 Sarcasm is not good; no need to dig that out so early.

     if (x < < v > <o> 3)

 x is less than 3 cast to o and then cast to v.

 and compare it to
     if (x << v > <o> 3)

 x shifted left by v is greater than 3 cast to o.

 Leaving out brackets where they belong is no way to argue about
 languages. I could speak very weird English that still has valid
 grammar; that doesn't make English a bad language.

 We could also start to discuss what a = b+++c; yields, or whether a = b++++c is
 valid syntax, or the old dangling else problem.
 -

      if (x < (<v><o> 3))
 is better.

 and so is:
      if ((x << v) > (<o> 3))

 I see this as at least as good as the traditional form:
      if (x < ((v)(o) 3))
 and
      if ((x << v) > ((o) 3))

 And honestly, typecasts and comparisons are not something you'll encounter
 every day in the same line. Normally one casts objects and compares integer
 types. Yes, I know there are cases where you have to compare signed with
 unsigned integers, and there you need a typecast in the same line, but normally
 if you choose your types well this can be avoided.

 Oh, and what if o is a type? Is not a type?

 I don't understand this.

 - Axel

Aug 18 2001

"Walter" <walter digitalmars.com> writes:

Rajiv Bhagwat wrote in message <9llau3$99d$1 digitaldaemon.com>...
Hey guys,

As an aside, did you know that Walter once won an 'obfuscated c' contest?


My dirty secret is out!


We don't want 'D' to be a language for which such contests are held<g>!


As one wag once said, you can write FORTRAN in any language.

Aug 18 2001

John Fletcher <J.P.Fletcher aston.ac.uk> writes:

Walter wrote:

 Rajiv Bhagwat wrote in message <9llau3$99d$1 digitaldaemon.com>...
Hey guys,

As an aside, did you know that Walter once won an 'obfuscated c' contest?

 My dirty secret is out!

We don't want 'D' to be a language for which such contests are held<g>!

 As one wag once said, you can write FORTRAN in any language.

One of my research students wrote a FORTRAN program which wrote FORTRAN.

John

Aug 21 2001

Russell Bornschlegel <kaleja estarcion.com> writes:

Walter wrote:
 The only real trick in the parser is distinguishing a cast from a
 parenthesized expression,
 and a declaration from a statement.
 
 I considered an alternate syntax for these to make them easier to parse, but
 it just doesn't look right. I'm too used to C, I guess. For example:
 
     cast(int) expr
 
 instead of:
 
     (int) expr
 

This I could live with. 

Is the C++ style cast of the form:

   int(expr)

difficult to parse?

 and:
 
     var int foo;
 
 instead of:
 
     int foo;

This one hurts my C-brain a bit more. :)

Aug 18 2001

"Walter" <walter digitalmars.com> writes:

Russell Bornschlegel wrote in message <3B7EC050.A74713DA estarcion.com>...
Is the C++ style cast of the form:

   int(expr)

difficult to parse?


Yes, because the grammar is too ambiguous. A goal of D is to make parsing
independent of the symbol table. Can't do that and support C++ style casts.
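A C++ sketch of that dependence: the expression `T(x)` below is written identically in both functions, and only the declaration of `T` decides whether it is a conversion or a call. The names `T`, `twice`, `apply_as_type` and `apply_as_function` are made up for the example.

```cpp
// Helper used as the "function" meaning of T below.
static int twice(double x) { return static_cast<int>(x * 2); }

int apply_as_type(double x) {
    typedef int T;              // here T names a type
    return T(x);                // so T(x) is a conversion (truncates)
}

int apply_as_function(double x) {
    int (*T)(double) = twice;   // here T names a function pointer
    return T(x);                // so T(x) is an ordinary call
}
```

A parser that must tell these apart needs to know what `T` was declared as, which is exactly the symbol-table lookup D is trying to avoid.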

Aug 18 2001

"Sean L. Palmer" <spalmer iname.com> writes:

The only other problem with C++ style casts is that they require making
typedefs; without them you can't cast to int*, for example.  Pascal had the
same problem.  Not a huge problem, admittedly.
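In C++ terms, the typedef requirement looks roughly like this. `IntPtr` and `read_through` are hypothetical names used only for illustration.

```cpp
// The function-style cast needs the type to be a single name.
typedef int* IntPtr;

int read_through(void* raw) {
    // Writing int*(raw) here is not accepted as a cast; the typedef is needed.
    IntPtr p = IntPtr(raw);
    return *p;
}
```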

Sean

"Walter" <walter digitalmars.com> wrote in message
news:9lml51$10vj$3 digitaldaemon.com...
 Russell Bornschlegel wrote in message <3B7EC050.A74713DA estarcion.com>...
Is the C++ style cast of the form:

   int(expr)

difficult to parse?


 Yes, because the grammar is too ambiguous. A goal of D is to make parsing
 independent of the symbol table. Can't do that and support C++ style casts.

Nov 03 2001

"Walter" <walter digitalmars.com> writes:

I'm still in a bit of a quandary about casts. Should casting be:

    (type)expression

or:

    cast(type)expression

? The latter is far easier to parse. I'm so used to the former I am
reluctant to give it up. A kludgy compromise might be allowing the former
only for types that start with a keyword, not a typedef.

"Sean L. Palmer" <spalmer iname.com> wrote in message
news:9s2cep$1ved$1 digitaldaemon.com...
 The only other problem with C++ style casts is that it requires making
 typedefs, without them you can't cast to int* for example.  Pascal had the
 same problem.  Not a huge problem, admittedly.

 Sean

 "Walter" <walter digitalmars.com> wrote in message
 news:9lml51$10vj$3 digitaldaemon.com...
 Russell Bornschlegel wrote in message <3B7EC050.A74713DA estarcion.com>...
Is the C++ style cast of the form:

   int(expr)

difficult to parse?


 Yes, because the grammar is too ambiguous. A goal of D is to make parsing
 independent of the symbol table. Can't do that and support C++ style casts.

Dec 15 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Walter" <walter digitalmars.com> wrote in message
news:9vhegn$1l01$1 digitaldaemon.com...

 I'm still in a bit of a quandary about casts. Should casting be:

     (type)expression

 or:

     cast(type)expression

 ? The latter is far easier to parse. I'm so used to the former I am
 reluctant to give it up. A kludgy compromise might be allowing the former
 only for types that start with a keyword, not a typedef.

Personally, I've found myself using the cast() form.
But the compromise you've mentioned seems fine as
well.

Dec 16 2001

Axel Kittenberger <axel dtone.org> writes:

Walter wrote:

 I'm still in a bit of a quandary about casts. Should casting be:
 
     (type)expression
 
 or:
 
     cast(type)expression

May I brainstorm a bit?

Personally I like the:
<type> expression

syntax, and it is easily parseable with an LALR grammar, as it requires only one 
token of lookahead. However, I've seen that people have strong feelings against it.

How about using the backquote? Is it already used for something 
else?
`type` expression

Viewed purely, though, I think type casting is best seen as 
a kind of function, isn't it? It takes one input value and returns an output 
value, sometimes different from the input. Then the above expression 
would better look like this:

cast(type, expression)

or the pascal form 
type(expression)

But types cannot normally be function parameters; they are something 
different. I think type casting should use the same syntax that generic 
programming in the same language uses. It's the same paradigm: calling 
a function whose implementation depends on the type you specify.
How about then:

cast<type>(expression)

?

Dec 16 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9vhomg$1re4$1 digitaldaemon.com...
 Walter wrote:

 I'm still in a bit of a quandary about casts. Should casting be:

     (type)expression

 or:

     cast(type)expression

 May I brainstorm a bit?

 Personally I like the:
 <type> expression

 syntax, and it is easily parseable with an LALR grammar, as it requires only one
 token of lookahead. However, I've seen that people have strong feelings against it.

It's also hard to distinguish from a normal expression containing
the < and > operators.

 How about using the backquote? Is it already used for something
 else?
 `type` expression

 Viewed purely, though, I think type casting is best seen as
 a kind of function, isn't it? It takes one input value and returns an output
 value, sometimes different from the input. Then the above expression
 would better look like this:

 cast(type, expression)

It doesn't seem "right" to me =)

 or the pascal form
 type(expression)

This is, IMHO, acceptable, since there are no temporaries in D,
so the syntax is unused.

 But types cannot normally be function parameters; they are something
 different. I think type casting should use the same syntax that generic
 programming in the same language uses. It's the same paradigm: calling
 a function whose implementation depends on the type you specify.
 How about then:

 cast<type>(expression)

Back to C++ days...
I just hate angle brackets!

And anyhow, what's the problem with the way it's done now?

Dec 16 2001

la7y6nvo shamko.com writes:

"Walter" <walter digitalmars.com> writes:

 I'm still in a bit of a quandary about casts. Should casting be:
 
     (type)expression
 
 or:
 
     cast(type)expression
 
 ? The latter is far easier to parse. I'm so used to the former I am
 reluctant to give it up. A kludgy compromise might be allowing the former
 only for types that start with a keyword, not a typedef.

Please let D abandon the abominable syntax that C uses for casts.
I know, I know, people are used to it, but this is one case where
compatibility should be given second place to forward progress.

On the compromise idea - If a new syntax is introduced then the old
syntax should be eliminated.  C++ made the mistake of adding a new
syntax for casting while also retaining the old, and that's been
a source of confusion.  Also, having two forms which are sometimes
but not always interchangeable is asking for trouble.  Again, I
know about the compatibility arguments, but let's try to make some
forward progress with D in this area.  

As for what syntax to use..  One thing about C that's always seemed
rather poorly thought out is the funny way that prefix operators and
postfix operators interact.  (Incidentally, C++ retains these problems
and makes them worse with a prefix 'new' operator.  But I digress.)
If one thinks of casting as a kind of operator - admittedly one that
doesn't usually execute any instructions - I think it makes more sense
to put the cast operator after the operand, as for example

    operand.as(type)

or if necessary

    (expression).as(type)

If the syntax for expressions were expanded to include a syntax for
types, with semantics allowing some run-time representation for values
of type Type (which seems like a good idea independent of casting),
then no special syntax is needed for casting.  Indeed if this were so
then the only difference between the casting 'as' operators and
regular (method) functions is that 'as' is overloaded on return type.

Note how nicely this functional form cascades:

    x[i].as(Window).redraw()

Compare that to:

    (cast(Window) x[i]).redraw()

Perhaps it's my many years of using object-oriented programming
languages, but the first form seems much easier to understand
than the second.
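The postfix form can be approximated in C++ with a member template, which gives a feel for how the chaining reads. This is only a sketch: `Widget`, `Window`, `as` and `redraw_both_ways` are hypothetical, and C++ spells the call `as<Window>()` rather than `as(Window)`.

```cpp
#include <cstddef>
#include <typeinfo>
#include <vector>

struct Widget {
    virtual ~Widget() {}

    // Hypothetical postfix cast: throws std::bad_cast if *this is not a T.
    template <typename T>
    T& as() { return dynamic_cast<T&>(*this); }
};

struct Window : Widget {
    int redraws = 0;
    void redraw() { ++redraws; }
};

void redraw_both_ways(std::vector<Widget*>& x, std::size_t i) {
    // Postfix form: reads left to right, like x[i].as(Window).redraw().
    x[i]->as<Window>().redraw();
    // Prefix form: needs the extra parentheses around the cast.
    (dynamic_cast<Window&>(*x[i])).redraw();
}
```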

Let me add a couple of disclaimers as anti-flame insurance :)

    (1) There is a fundamental and important difference between the notion
    of type and the notion of class.  The comments above gloss over that
    distinction.  It's important that some implementation along these
    lines not do that.

    (2) C uses the same syntax for "casting" (like changing one pointer
    type into another with no bit changes) and "conversion" (changing an
    integer value into a floating point value, bits definitely change).
    It seems obvious that these notions, although related, are different
    operations and should have distinct syntaxes.  Or at least different
    operator names.
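The difference in point (2) can be made concrete in C++. The sketch below assumes a 32-bit IEEE 754 `float`; `convert` and `repaint` are made-up names, and `memcpy` stands in for a pointer cast so the example stays well defined.

```cpp
#include <cstdint>
#include <cstring>

// Conversion: the value is preserved, the bits change.
float convert(std::int32_t i) { return static_cast<float>(i); }

// "Casting" in the no-bit-change sense: the bits are preserved, the value changes.
float repaint(std::int32_t i) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}
```

Under that assumption `convert(1)` is `1.0f`, while it takes the bit pattern `0x3f800000` for `repaint` to produce `1.0f`.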

Hopefully the comments and suggestions made above find some resonance
amongst the readers of the newsgroup and potential users of D.

Dec 16 2001

Axel Kittenberger <axel dtone.org> writes:

 
     (2) C uses the same syntax for "casting" (like changing one pointer
     type into another with no bit changes) and "conversion" (changing an
     integer value into a floating point value, bits definitely change).
     It seems obvious that these notions, although related, are different
     operations and should have distinct syntaxes.  Or at least different
     operator names.

True, in fact there are 4 casts I know of: conversion cast, upcast, 
downcast and reinterpret cast. C++ decides from the context which one to use, 
which can be very error-prone. A const cast is only a finer point, so I do 
not count it as a separate cast form.

class Borg b* = (Corg) a*;

This can produce different code depending on whether you included 
"Corg.h" and thereby defined the class. It once took me several days to find a 
bug where the code should have done an upcast (the pointer should be decremented by the cast), but 
the class definition was not included in that file, so the compiler did a reinterpret 
cast, ending up pointing at nonsense.
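The pointer adjustment behind that kind of bug can be shown with multiple inheritance in C++. This is a sketch under the usual-ABI assumption that the bases are laid out in declaration order; the struct and function names are made up.

```cpp
struct Base1 { int x; };
struct Base2 { int y; };
struct Derived : Base1, Base2 {};

// With the class definition visible, the cast adjusts the address
// so that it points at the Base2 part of the object.
bool adjusted_cast_moves_pointer(Derived* d) {
    Base2* p = static_cast<Base2*>(d);
    return static_cast<void*>(p) != static_cast<void*>(d);
}

// A reinterpreting cast keeps the address as is, so the result does not
// point at a real Base2 and must never be dereferenced.
bool reinterpreted_cast_moves_pointer(Derived* d) {
    Base2* p = reinterpret_cast<Base2*>(d);
    return static_cast<void*>(p) != static_cast<void*>(d);
}
```

A C-style cast between class pointers behaves like the second function when one of the classes is only forward declared, which matches the missing-include scenario described above.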

- Axel

Dec 16 2001

"Walter" <walter digitalmars.com> writes:

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9vj699$2l18$1 digitaldaemon.com...
 True, in fact there are 4 casts I know of: conversion cast, upcast,
 downcast and reinterpret cast. C++ decides from the context which one to use,
 which can be very error-prone. A const cast is only a finer point, so I do
 not count it as a separate cast form.

 class Borg b* = (Corg) a*;

 This can produce different code depending on whether you included
 "Corg.h" and thereby defined the class. It once took me several days to find a
 bug where the code should have done an upcast (the pointer should be decremented by the cast), but
 the class definition was not included in that file, so the compiler did a reinterpret
 cast, ending up pointing at nonsense.

D shouldn't suffer from that problem, as there shouldn't be any forward
referenced class names.

D will just use one cast form, not the 4 different ones. (There is no const
cast in D, because there is no const type modifier.) To do a type paint,
cast it to void* first and then cast the result.

Dec 16 2001

"Walter" <walter digitalmars.com> writes:

"Walter" <walter digitalmars.com> wrote in message
news:9vhegn$1l01$1 digitaldaemon.com...
 I'm still in a bit of a quandary about casts. Should casting be:

     (type)expression

 or:

     cast(type)expression

 ? The latter is far easier to parse. I'm so used to the former I am
 reluctant to give it up. A kludgy compromise might be allowing the former
 only for types that start with a keyword, not a typedef.

There's a related issue for declarations, is:

    a * b;

a declaration or an expression? Currently, the ambiguity is resolved with
the rule "if it will parse as a declaration, it is a declaration". I'm not
particularly thrilled with that, as it requires lookahead in the parser. One
solution is to require a "var" keyword in front of declarations. But to my
eyes, typing all those var's in is annoying:

    void func()
    {    var int i,j;
         var X* y;
        ....
    }
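The "if it will parse as a declaration, it is a declaration" rule can be sketched over pre-split tokens. This C++ example is an assumption-laden simplification (only `type *... name, name;` shapes, hypothetical function names), not the actual D parser, but it shows why the rule needs to scan ahead to the semicolon and why it never consults a symbol table.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

static bool isIdentifier(const std::string& t) {
    return !t.empty() && (std::isalpha(static_cast<unsigned char>(t[0])) || t[0] == '_');
}

// True if the tokens of one statement have the shape
//   Type '*'* Name (',' Name)* ';'
// Keyword types such as "int" count as identifiers in this sketch.
bool parsesAsDeclaration(const std::vector<std::string>& t) {
    std::size_t i = 0;
    if (i >= t.size() || !isIdentifier(t[i])) return false;   // the type
    ++i;
    while (i < t.size() && t[i] == "*") ++i;                  // pointer stars
    for (;;) {                                                // declared names
        if (i >= t.size() || !isIdentifier(t[i])) return false;
        ++i;
        if (i < t.size() && t[i] == ",") { ++i; continue; }
        break;
    }
    // Lookahead: the statement must end right here.
    return i + 1 == t.size() && t[i] == ";";
}
```

Under this rule `a * b;` is always taken as a declaration, while `a * b + c;` falls through to the expression parser.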

Dec 16 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Walter" <walter digitalmars.com> wrote in message
news:9vj9l0$2n1j$2 digitaldaemon.com...
 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One

If "a" is type in current scope, it's a declaration;
If "a" is something else, it's an expression, so:

    class Apple { }

    int main()
    {
        Apple * x;    // x is pointer to apple
        int Apple;
        Apple * y;    // multiply Apple by y
    }
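The scope-based rule in this example needs a symbol table that tracks shadowing. A minimal C++ sketch, with hypothetical names (`Scopes`, `Sym`, `starIsDeclaration`) and the simplifying assumption that unknown names are treated as variables:

```cpp
#include <map>
#include <string>
#include <vector>

enum class Sym { Type, Variable };

class Scopes {
public:
    Scopes() { push(); }                      // start with the global scope
    void push() { stack.emplace_back(); }     // enter a block
    void pop() { stack.pop_back(); }          // leave a block
    void declare(const std::string& name, Sym kind) { stack.back()[name] = kind; }

    // The innermost declaration wins; unknown names are not types.
    bool isType(const std::string& name) const {
        for (auto it = stack.rbegin(); it != stack.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second == Sym::Type;
        }
        return false;
    }

private:
    std::vector<std::map<std::string, Sym>> stack;
};

// "a * b;" is a declaration exactly when a currently names a type.
bool starIsDeclaration(const Scopes& s, const std::string& a) { return s.isType(a); }
```

The answer for `Apple * y;` flips once `int Apple;` is declared in the inner scope, so the parser cannot decide without this table.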

Dec 16 2001

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:9vk5gl$4hk$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:9vj9l0$2n1j$2 digitaldaemon.com...
 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One

 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;
         Apple * y;    // multiply Apple by y
     }

The trouble there is I am trying to separate the syntactic from the
semantic. Recognizing that an identifier is a type requires semantic
analysis (i.e. building a symbol table). The separation of the two functions
will make it easy to write things like source code formatters and analyzers.

Dec 17 2001

"Sean L. Palmer" <spalmer iname.com> writes:

I guess you can't have everything.  Either that or you have the coder
specify 'var' in front of all declarations.  I for one wouldn't mind that,
since it clears up so many other things.  I did just that in fact when
programming Pascal for all those years.  I was far more upset about having
to type 'begin' and 'end' and 'then' and 'do' all over the place than I ever
was about telling the compiler I'm about to do a variable declaration.

If you ask me, anyone writing a source code analyzer or formatter can easily
check for variable declarations and typedefs so it knows what symbols are
what at current scope.  It doesn't even really need to keep track of the
exact type, just that a variable or type has been declared there with that
name.  Source beautifiers built into the IDE are something I'd really love to
see, so if doing that means the IDE has to keep track of declarations, well,
that's nothing VB or VC hasn't been doing for many years now already.  Hell,
Instant Pascal way back on the Apple II did that.  So did Think Pascal on
the Mac.  Then again Pascal used the var keyword.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:9vk9nj$8eq$3 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> wrote in message
 news:9vk5gl$4hk$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:9vj9l0$2n1j$2 digitaldaemon.com...
 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One
 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;
         Apple * y;    // multiply Apple by y
     }

 The trouble there is I am trying to separate the syntactic from the
 semantic. Recognizing that an identifier is a type requires semantic
 analysis (i.e. building a symbol table). The separation of the two functions
 will make it easy to write things like source code formatters and analyzers.

Dec 17 2001

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <spalmer iname.com> wrote in message
news:9vke8m$amq$1 digitaldaemon.com...
 I guess you can't have everything.  Either that or you have the coder
 specify 'var' in front of all declarations.  I for one wouldn't mind that,
 since it clears up so many other things.  I did just that in fact when
 programming Pascal for all those years.  I was far more upset about having
 to type 'begin' and 'end' and 'then' and 'do' all over the place than I ever
 was about telling the compiler I'm about to do a variable declaration.

Having the "if it parses as a declaration, it is a declaration" rule does
seem to work.

 If you ask me, anyone writing a source code analyzer or formatter can easily
 check for variable declarations and typedefs so it knows what symbols are
 what at current scope.  It doesn't even really need to keep track of the
 exact type, just that a variable or type has been declared there with that
 name.  Source beautifiers built into the IDE are something I'd really love to
 see, so if doing that means the IDE has to keep track of declarations, well,
 that's nothing VB or VC hasn't been doing for many years now already.  Hell,
 Instant Pascal way back on the Apple II did that.  So did Think Pascal on
 the Mac.  Then again Pascal used the var keyword.

VC has had millions of man-hours of development in it - and with access to a
full blown compiler, you can tell if an identifier is a type or not. To do
so in D, you'd have to find all the imports, parse them all, etc., just like
in a full blown compiler to be able to tell with certainty if it's a type or
not. You can fake it and work 93% of the time, but getting that last 7%
right requires the whole thing. Most C beautifiers that aren't hooked into a
full blown compiler get it right about 93% of the time. Heck, many of them
don't even handle the \ line splice completely correctly.

With D, the idea is it can be gotten 100% correct with only a modest effort
by one person.
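
As a rough illustration of the point about needing no symbol table, the sketch below classifies a statement from its token shape alone, following the "if it parses as a declaration, it is a declaration" rule mentioned in this thread. It is written in Python for brevity and is not taken from the D compiler; the names `tokenize` and `classify` are hypothetical. It assumes that `identifier`, zero or more `*`, then another `identifier` followed by `;`, `=` or `,` is a declaration, and it ignores keywords, arrays and the rest of a real grammar.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def tokenize(stmt):
    # Identifiers, integer literals, or any single non-space character.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", stmt)

def classify(stmt):
    """Decide 'declaration' or 'expression' without any symbol table."""
    toks = tokenize(stmt)
    i = 0
    if i < len(toks) and IDENT.fullmatch(toks[i]):
        i += 1                                  # candidate type name
        while i < len(toks) and toks[i] == "*":
            i += 1                              # candidate pointer levels
        if i < len(toks) and IDENT.fullmatch(toks[i]):
            i += 1                              # candidate variable name
            if i < len(toks) and toks[i] in (";", "=", ","):
                return "declaration"
    return "expression"

if __name__ == "__main__":
    for s in ("Apple * y;", "a * b + c;", "x = y * 2;"):
        print(s, "->", classify(s))
```

Because the decision never consults what `Apple` was declared as, a formatter built this way needs no imports parsed; whether `Apple` really is a type is left to a later semantic pass.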

Dec 17 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Walter" <walter digitalmars.com> wrote in message
news:9vk9nj$8eq$3 digitaldaemon.com...

 The trouble there is I am trying to separate the syntactic from the
 semantic. Recognizing that an identifier is a type requires semantic
 analysis (i.e. building a symbol table). The separation of the two
 functions will make it easy to write things like source code formatters
 and analyzers.

Yes, I see. Personally, I wouldn't mind typing "var" here and
there; I got quite used to it in Pascal and then UnrealScript...

Dec 17 2001

"Roberto Mariottini" <rmariottini lycosmail.com> writes:

"Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
news:9vk5gl$4hk$1 digitaldaemon.com...
 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;
         Apple * y;    // multiply Apple by y
     }

Please, save me from this! ;-)

Dec 17 2001

"Walter" <walter digitalmars.com> writes:

"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:9vkbg9$9d5$1 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
 news:9vk5gl$4hk$1 digitaldaemon.com...
 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;
         Apple * y;    // multiply Apple by y
     }

 Please, save me from this! ;-)

Currently, D issues the following error message for the Apple*y:

symbol 'Apple' is not a type

Dec 17 2001

Axel Kittenberger <axel dtone.org> writes:

Walter wrote:

 
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:9vkbg9$9d5$1 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
 news:9vk5gl$4hk$1 digitaldaemon.com...
 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;



In my opinion 'int Apple' should raise an error if Apple is already defined
as a type. Shadowing is nice in theory, and looks cool in a compiler
implementation, but is bad in practice: it is error-prone for the
programmer, who easily mixes up a variable and a type when his understanding
of a name differs from its context.

Dec 17 2001

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Hear, hear.

Axel Kittenberger wrote:

 Walter wrote:

 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:9vkbg9$9d5$1 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
 news:9vk5gl$4hk$1 digitaldaemon.com...
 If "a" is type in current scope, it's a declaration;
 If "a" is something else, it's an expression, so:

     class Apple { }

     int main()
     {
         Apple * x;    // x is pointer to apple
         int Apple;



 In my opinion 'int Apple' should raise an error if Apple is already defined
 as a type. Shadowing is nice in theory, and looks cool in a compiler
 implementation, but is bad in practice: it is error-prone for the
 programmer, who easily mixes up a variable and a type when his understanding
 of a name differs from its context.

--
The Villagers are Online! villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]

Dec 17 2001

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Think, Russ!  Think, then speak!

I do want to say that I agree with Axel that "shadowing" is not a good
thing...maybe even a Bad Thing.  But now that I've thought about it, I've
realized that it still doesn't solve the parser problem...the parser has no way
of knowing if it's a valid statement or not.

For that, what about just making no-effect lines syntax errors (except for
null statements)?  Then anything that *could* be a declaration (regardless of
context) *must* be one.

--
The Villagers are Online! villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]

Dec 17 2001

"Walter" <walter digitalmars.com> writes:

"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C1E6A51.E60BF91A deming-os.org...
 For that, what about just making no-effect lines syntax errors (except for
 null statements)?  Then anything that *could* be a declaration (regardless
 of context) *must* be one.

Hmm. That is a good thought. The no effect error is an irritant in C++, as I
use macros that depend on the compiler optimizing away no effect
expressions, as in:

    #define printf 1 || printf

and:

    #define assert(e)    0

but in D these become irrelevant, so maybe no effect errors are practical.
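
A crude sketch of what a "no effect" check could look like is given below, in Python. It is only an assumption about how such a rule might be approximated, not how any D compiler does it: a statement counts as having an effect if it contains an assignment, an increment or decrement, or a `(` (taken here to mean a call, which also misclassifies plain parenthesized expressions). The names `tokenize`, `has_effect` and `check_statement` are hypothetical.

```python
import re

# Tokens that are assumed to give a statement an effect.
EFFECT_TOKENS = {"=", "+=", "-=", "*=", "/=", "++", "--", "("}

def tokenize(stmt):
    # Multi-character operators are listed first so '==' is not read as '=' '='.
    return re.findall(r"[A-Za-z_]\w*|\d+|\+\+|--|[=!<>]=|[-+*/]=|\S", stmt)

def has_effect(toks):
    return any(tok in EFFECT_TOKENS for tok in toks)

def check_statement(stmt):
    toks = tokenize(stmt)
    if toks == [";"]:
        return "ok"                      # the null statement is exempt
    if has_effect(toks):
        return "ok"
    return "error: statement has no effect"

if __name__ == "__main__":
    for s in ("x = a * b;", "a * b;", "i++;", ";"):
        print(repr(s), "->", check_statement(s))
```

Under such a rule `a * b;` can never be a legal expression statement, so reading it as a declaration loses nothing.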

Dec 17 2001

Axel Kittenberger <axel dtone.org> writes:

Russ Lewis wrote:

 Think, Russ!  Think, then speak!
 
 I do want to say that I agree with Axel that "shadowing" is not a good
 thing...maybe even a Bad Thing.  But now that I've thought about it, I've
 realized that it still doesn't solve the parser problem...the parser has
 no way of knowing if it's a valid statement or not.
 
 For that, what about just making no-effect lines syntax errors (except for
 null statements)?  Then anything that *could* be a declaration
 (regardless of context) *must* be one.

Not necessarily true: the lexer can do type lookups and return different
tokens, as mine did. That's how I solved the grammatical problem of how to
differentiate, e.g.:

asdf[2][2] a;
asdf[2][2] = a;

Is "asdf" meant to be the type (declaring a 2x2 array of it) or an
identifier accessing element [2][2] of it? Yes, it is technically
distinguishable, but not with 1 token of lookahead, or with any fixed
amount of lookahead. To distinguish whether "asdf" is a type or a variable
you need 3 * (number of array dimensions) tokens of lookahead, where the
number of dimensions is normally unlimited.

- Axel

Dec 18 2001

"Walter" <walter digitalmars.com> writes:

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9voc5a$2vrg$1 digitaldaemon.com...
 Not necessarily true: the lexer can do type lookups and return different
 tokens, as mine did. That's how I solved the grammatical problem of how to
 differentiate, e.g.:

 asdf[2][2] a;
 asdf[2][2] = a;

 Is "asdf" meant to be the type (declaring a 2x2 array of it) or an
 identifier accessing element [2][2] of it? Yes, it is technically
 distinguishable, but not with 1 token of lookahead, or with any fixed
 amount of lookahead. To distinguish whether "asdf" is a type or a variable
 you need 3 * (number of array dimensions) tokens of lookahead, where the
 number of dimensions is normally unlimited.

 - Axel

That's why I wound up implementing arbitrary lookahead. -Walter
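
The kind of unbounded lookahead discussed here can be sketched as follows, in Python rather than in the language of the actual D parser, whose code is not shown in this thread. The hypothetical `classify` function skips every `[ ... ]` group after the leading identifier and then looks at the single token that follows: an identifier means a declaration, anything else an expression. It also reports how many tokens had to be examined, which grows by three per dimension as Axel describes.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def tokenize(stmt):
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", stmt)

def classify(stmt):
    """Return (kind, tokens_examined) for 'name[..][..] ...' statements."""
    toks = tokenize(stmt)
    i = 1                                   # skip the leading identifier
    while i < len(toks) and toks[i] == "[":
        depth = 0
        while i < len(toks):                # skip one balanced [ ... ] group
            if toks[i] == "[":
                depth += 1
            elif toks[i] == "]":
                depth -= 1
                if depth == 0:
                    i += 1
                    break
            i += 1
    is_decl = i < len(toks) and IDENT.fullmatch(toks[i]) is not None
    return ("declaration" if is_decl else "expression", i + 1)

if __name__ == "__main__":
    for s in ("asdf[2][2] a;", "asdf[2][2] = a;", "asdf[1][2][3] a;"):
        print(s, "->", classify(s))
```

No fixed lookahead k covers every input, since each extra dimension pushes the deciding token three positions further away.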

Dec 18 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9vlm2o$1389$1 digitaldaemon.com...

 In my opinion 'int Apple' should raise an error if Apple is already defined
 as a type. Shadowing is nice in theory, and looks cool in a compiler
 implementation, but is bad in practice: it is error-prone for the
 programmer, who easily mixes up a variable and a type when his understanding
 of a name differs from its context.

This might be true if Apple was a local class. But what if it
resides in another module? In this case, disallowing shadowing
would result in breaking code in one module when another gets
a new type in it - definitely unacceptable...

Dec 17 2001

"Roberto Mariottini" <rmariottini lycosmail.com> writes:

"Walter" <walter digitalmars.com> ha scritto nel messaggio
news:9vj9l0$2n1j$2 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:9vhegn$1l01$1 digitaldaemon.com...
 I'm still in a bit of a quandary about casts. Should casting be:

     (type)expression

 or:

     cast(type)expression

 ? The latter is far easier to parse. I'm so used to the former I am
 reluctant to give it up. A kludgy compromise might be allowing the former
 only for types that start with a keyword, not a typedef.


I prefer the keyword way. It's always easier to find a cast in a
million-operand expression (like the ones one of my co-workers writes again
and again) if the editor highlights the keyword in the middle of it.
As for the syntax, I once thought about it for days without finding a good
solution. I think the type should be separated from the expression for
clarity. So (type)cast(expression) is better for me (though ugly...), or
cast[type](expression) or


 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One
 solution is to require a "var" keyword in front of declarations. But to my
 eyes, typing all those var's in is annoying:

     void func()
     {    var int i,j;
          var X* y;
         ....
     }

This is another thing I always thought of. Again, I found no better way.
The fact is that typing 'var' or something at every declaration is a big
waste of typing. But:

var {
    int i;
    X* j;
}                // <-- this should not limit the scope :-(

i : int;
j : X*;                // good ole Pascal! But
a : int = 6 * K;   // where's the initializer? after the type :-(



 An operator is not always better than a keyword :-(

Anything anyone?

Dec 17 2001

"Sean L. Palmer" <spalmer iname.com> writes:

I just thought of something.  Yes, typing var is kinda annoying.  But how
about if one could type the keyword 'var' fairly infrequently, such as:

var:
   int a;
   int b;
   Foo myfoo;

Or perhaps:

var
{
   int a;
   int b;
   Foo myfoo;
}

Or even perhaps this (I think I like this one best):

// can declare multiple vars of different types with one var keyword
var int a, int b, Foo myfoo;

or

// b is also an int, but myfoo and c are both Foo's
var int a, b, Foo myfoo, c;

Whaddya think?  Maybe we could substitute 'public' or 'private' for the 'var'
keyword instead?

public  // statements can't go inside public blocks.
{
  int a, b;
  Foo myfoo;
}

public:  // begin a public declara
  int i, j;
// This statement terminates the previous public section.  However, here we
// run into the original problem again.  Perhaps disallow the 'label:' style
// attribute format?
for (i=0; i<5; ++i)
{
   j += i;
}
public:
  int c;

Sean

 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One
 solution is to require a "var" keyword in front of declarations. But to my
 eyes, typing all those var's in is annoying:

     void func()
     {    var int i,j;
          var X* y;
         ....
     }

 This is another thing I always thought of. Again, I found no better way.
 The fact is that typing 'var' or something at every declaration is a big
 waste of typing. But:

 var {
     int i;
     X* j;
 }                // <-- this should not limit the scope :-(

 i : int;
 j : X*;                // good ole Pascal! But
 a : int = 6 * K;   // where's the initializer? after the type :-(



  An operator is not always better than a keyword :-(

 Anything anyone?

Dec 17 2001

"Pavel Minayev" <evilone omen.ru> writes:

"Sean L. Palmer" <spalmer iname.com> wrote in message
news:9vkeno$b0s$1 digitaldaemon.com...

 Or perhaps:

 var
 {
    int a;
    int b;
    Foo myfoo;
 }

Sounds good.

 Or even perhaps this (I think I like this one best):

 // can declare multiple vars of different types with one var keyword
 var int a, int b, Foo myfoo;

 or

 // b is also an int, but myfoo and c are both Foo's
 var int a, b, Foo myfoo, c;

Yeah... looks somewhat like VB.NET declarations, but I like
the idea. And should have no problem being parsed.
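
To check the claim that this form parses without trouble, here is a small sketch in Python. The grammar is an assumption reconstructed from Sean's two examples, not a specification: after `var`, each comma-separated item is either `Type name` (optionally with `*` after the type) or just `name`, which reuses the previous type. The function name `parse_var` is hypothetical.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def parse_var(stmt):
    """Parse 'var T a, b, U c;' into a list of (type, name) pairs."""
    toks = re.findall(r"[A-Za-z_]\w*|\S", stmt)
    if len(toks) < 2 or toks[0] != "var" or toks[-1] != ";":
        raise SyntaxError("expected 'var ... ;'")
    end = len(toks) - 1                     # index of the closing ';'
    decls = []
    current_type = None
    i = 1
    while i < end:
        # One token of lookahead: if the item does not end right after this
        # identifier, the identifier must be a new type name.
        if i + 1 < end and toks[i + 1] != ",":
            if not IDENT.fullmatch(toks[i]):
                raise SyntaxError("expected a type name")
            current_type = toks[i]
            i += 1
            while i < end and toks[i] == "*":   # pointer levels, as in 'X* y'
                current_type += "*"
                i += 1
        if current_type is None:
            raise SyntaxError("the first item needs a type")
        if i >= end or not IDENT.fullmatch(toks[i]):
            raise SyntaxError("expected a variable name")
        decls.append((current_type, toks[i]))
        i += 1
        if i < end:
            if toks[i] != ",":
                raise SyntaxError("expected ','")
            i += 1
    return decls

if __name__ == "__main__":
    print(parse_var("var int a, b, Foo myfoo, c;"))
```

Since the leading `var` already settles that this is a declaration, a single token of lookahead per item is enough in this sketch.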

 Whaddya think?  Maybe we could substitute 'public' or 'private' for the
 'var' keyword instead?

"local"?

Dec 17 2001

"Walter" <walter digitalmars.com> writes:

It's an intriguing idea. The : version won't work, though, as you'd need
some other keyword to turn it off when the statements begin.

"Sean L. Palmer" <spalmer iname.com> wrote in message
news:9vkeno$b0s$1 digitaldaemon.com...
 I just thought of something.  Yes, typing var is kinda annoying.  But how
 about if one could type the keyword 'var' fairly infrequently, such as:

 var:
    int a;
    int b;
    Foo myfoo;

 Or perhaps:

 var
 {
    int a;
    int b;
    Foo myfoo;
 }

 Or even perhaps this (I think I like this one best):

 // can declare multiple vars of different types with one var keyword
 var int a, int b, Foo myfoo;

 or

 // b is also an int, but myfoo and c are both Foo's
 var int a, b, Foo myfoo, c;

 Whaddya think?  Maybe we could substitute 'public' or 'private' for the
 'var' keyword instead?

 public  // statements can't go inside public blocks.
 {
   int a, b;
   Foo myfoo;
 }

 public:  // begin a public declara
   int i, j;
 // This statement terminates the previous public section.  However, here we
 // run into the original problem again.  Perhaps disallow the 'label:' style
 // attribute format?
 for (i=0; i<5; ++i)
 {
    j += i;
 }
 public:
   int c;

 Sean

 There's a related issue for declarations, is:

     a * b;

 a declaration or an expression? Currently, the ambiguity is resolved with
 the rule "if it will parse as a declaration, it is a declaration". I'm not
 particularly thrilled with that, as it requires lookahead in the parser. One
 solution is to require a "var" keyword in front of declarations. But to my
 eyes, typing all those var's in is annoying:

     void func()
     {    var int i,j;
          var X* y;
         ....
     }

 This is another thing I always thought of. Again, I found no better way.
 The fact is that typing 'var' or something at every declaration is a big
 waste of typing. But:

 var {
     int i;
     X* j;
 }                // <-- this should not limit the scope :-(

 i : int;
 j : X*;                // good ole Pascal! But
 a : int = 6 * K;   // where's the initializer? after the type :-(



  An operator is not always better than a keyword :-(

 Anything anyone?

Dec 17 2001

a <a b.c> writes:

Walter wrote:
 
 I'm still in a bit of a quandary about casts. Should casting be:
 
     (type)expression
 
 or:
 
     cast(type)expression
 
 ? The latter is far easier to parse. I'm so used to the former I am
 reluctant to give it up. A kludgy compromise might be allowing the former
 only for types that start with a keyword, not a typedef.

	If you ever allow any form of generic programming (templates, etc.)
this could come back and haunt you.  Also, it may cause unnecessary
breakage if someone switches code from using int to some FOO_t sort of
datatype.  You've gone to such extremes to do things one way (at least
in the first release) that it seems odd to have two ways for this.
	I don't like the way the second form looks like a function call
followed by an expression, but I can't give a good justification for the
identifier in parens followed by an expression either.  Go with the cast
keyword and keep it clean.

Dan

Dec 16 2001

Charles Hixson <charleshixsn earthlink.net> writes:

Walter wrote:

 I'm still in a bit of a quandary about casts. Should casting be:
 
     (type)expression
 
 or:
 
     cast(type)expression
 
...

Casts are one of my least liked features of C and C++.  Better 
is a language design that, to the greatest extent possible, 
eliminates the need for casts.  Still, given the occasional 
need, I prefer the form:
      cast(type, expression)
Or even:
      cast("type", expression)
though that seems a bit strange.  But so does the idea of 
passing a type as an argument (unless the language should be 
extended to make that a "normal" activity).  The thing that 
bothers me about it is that in this expression the type appears 
to be a parameter, but normally types aren't allowed to be 
parameters, and this feels ... unnatural.

Jan 02 2002

Charles Hixson <charleshixsn earthlink.net> writes:

Charles Hixson wrote:

 Walter wrote:
 
 I'm still in a bit of a quandary about casts. Should casting be:

     (type)expression

 or:

     cast(type)expression

 ...



On reading other posts I encountered the proposal to use the syntax:
expression.as(type)

This syntax would be quite clear, but seems to be problematic as 
it would require that a method be allowed to return a variable 
type, e.g.:
     (e.as(int) < e.as(Point))
clearly this is somewhat problematical.  OTOH, this same problem 
is exhibited when one examines an alternate syntax, e.g.:
     (cast (int, e) < cast (Point, e) )
but I suppose that cast may more clearly be considered a 
"special form".  I dislike special forms, but perhaps they are 
necessary unless type is a "first class" entity in the language.

Now clearly either of the examples that I used should cause an 
error (the types int and Point being incommensurate), and this 
would be true no matter what the syntax.  The inconsistency 
arises because one can usually depend on methods and functions 
to return one particular type (or at least a descendant of some 
particular type).  So it "feels wrong" to have the syntax used 
in a contrary manner.

The "cleanest way" around this that occurs to me is to eliminate 
the cast operation, and require the creation of a union for each 
dually accessed entity.  But for this to be acceptable the 
frequency with which a cast had to be used would need to be 
quite low.

Jan 02 2002

"Pavel Minayev" <evilone omen.ru> writes:

"Charles Hixson" <charleshixsn earthlink.net> wrote in message
news:3C335780.6070609 earthlink.net...

 This syntax would be quite clear, but seems to be problematic as
 it would require that a method be allowed to return a variable
 type, e.g.:
      (e.as(int) < e.as(Point))

I don't see any problem: e.as(int) returns a value of type int, and
e.as(Point) returns a value of type Point. Of course, as() is
a built-in method then.

 The "cleanest way" around this that occurs to me is to eliminate
 the cast operation, and require the creation of a union for each
 dually accessed entity.  But for this to be acceptable the
 frequency with which a cast had to be used would need to be
 quite low.

Man, have you ever tried to code straight WinAPI programs like that?
Besides, D (as I understand it) is not a "cleanest" language -
it's a _practical_ language. Which means that it tries to make
our lives easier rather than "cleaner" (the latter is important as
well, of course!). Forbidding casts is definitely not the right
step in this direction...

Jan 02 2002

"Sean L. Palmer" <spalmer iname.com> writes:

 Is D LALR?  If so, then let's use LEX/YACC (or Flex/Bison) to whip up a
 quick parser that simply outputs state information (and possibly some crude
 C equivalents instead of machine code).  At a minimum, the parser could be a
 binary function that accepts text, where the return value would simply
 indicate parse success (a valid program).  Semantic actions could be used to
 handle many non-LALR language features, as well as documenting violations.

If it's at all possible, I'd recommend making D have an LL(k) grammar for
some small k, as this will vastly simplify the parser needed and allow a lot
more tools to manipulate D code.  Please don't force people to use LEX/YACC
to parse D.  My script languages have always been LL(2) or LL(1) and I can
get away with coding the parser by hand.  I think Antlr generates LL(k)
parsers too.

Sean

Nov 03 2001

Axel Kittenberger <axel dtone.org> writes:

 If it's at all possible, I'd recommend making D have an LL(k) grammar for
 some small k, as this will vastly simplify the parser needed and allow a lot
 more tools to manipulate D code.  Please don't force people to use LEX/YACC
 to parse D.  My script languages have always been LL(2) or LL(1) and I can
 get away with coding the parser by hand.  I think Antlr generates LL(k)
 parsers too.

Do you know any OpenSource LL(n) parser generators for C?

- Axel
-- 
|D) http://www.dtone.org

Nov 04 2001

"Sean L. Palmer" <spalmer iname.com> writes:

Antlr is evidently Public Domain:

http://www.antlr.org/rights.html

Sean

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9s5d02$11j9$1 digitaldaemon.com...
 If it's at all possible, I'd recommend making D have an LL(k) grammar for
 some small k, as this will vastly simplify the parser needed and allow a lot
 more tools to manipulate D code.  Please don't force people to use LEX/YACC
 to parse D.  My script languages have always been LL(2) or LL(1) and I can
 get away with coding the parser by hand.  I think Antlr generates LL(k)
 parsers too.

 Do you know any OpenSource LL(n) parser generators for C?

 - Axel
 --
 |D) http://www.dtone.org

Nov 05 2001

brucedickey micron.com writes:

FYI:

My favorite parser-generator is ProGrammar from http://www.programmar.com/. It's
not open source or free, but it is inexpensive, LL(n), and can examine the
parse tree in productions, making it a linear bounded automaton! Alas, it does
not generate source, but a proprietary grammar file. You have to link to a
supplied .lib or use the ActiveX. Currently available for Windows, but I have an
alpha for Linux that works fine. This one builds a parse tree and provides an
API to walk it and extract data. It's for C/C++, VB, Delphi.

I've used an LALR parser-generator in the past. I would never consider using
Bison or anything that's not LL(n) again.

Bruce

In article <9s6s97$2bem$1 digitaldaemon.com>, Sean L. Palmer says...
Antlr is evidently Public Domain:

http://www.antlr.org/rights.html

Sean

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9s5d02$11j9$1 digitaldaemon.com...
 If it's at all possible, I'd recommend making D have an LL(k) grammar for
 some small k, as this will vastly simplify the parser needed and allow a lot
 more tools to manipulate D code.  Please don't force people to use LEX/YACC
 to parse D.  My script languages have always been LL(2) or LL(1) and I can
 get away with coding the parser by hand.  I think Antlr generates LL(k)
 parsers too.

 Do you know any OpenSource LL(n) parser generators for C?

 - Axel
 --
 |D) http://www.dtone.org

Jul 18 2002

"Walter" <walter digitalmars.com> writes:

I can never remember the definitions of those grammars. The current D
parser, however, is pretty simple and is implemented as recursive descent. I
have little use for LEX/YACC. The code output always seems to require a
little hand editing, and then automating the build process doesn't work.

Lexers are so simple anyway I can't see a reason to use LEX.


"Sean L. Palmer" <spalmer iname.com> wrote in message
news:9s2c77$1v6e$1 digitaldaemon.com...
 Is D LALR?  If so, then let's use LEX/YACC (or Flex/Bison) to whip up a
 quick parser that simply outputs state information (and possibly some crude
 C equivalents instead of machine code).  At a minimum, the parser could be a
 binary function that accepts text, where the return value would simply
 indicate parse success (a valid program).  Semantic actions could be used to
 handle many non-LALR language features, as well as documenting violations.

 If it's at all possible, I'd recommend making D have an LL(k) grammar for
 some small k, as this will vastly simplify the parser needed and allow a lot
 more tools to manipulate D code.  Please don't force people to use LEX/YACC
 to parse D.  My script languages have always been LL(2) or LL(1) and I can
 get away with coding the parser by hand.  I think Antlr generates LL(k)
 parsers too.

 Sean

Dec 15 2001

Axel Kittenberger <axel dtone.org> writes:

Walter wrote:

 I can never remember the definitions of those grammars. The current D
 parser, however, is pretty simple and is implemented as recursive descent.
 I have little use for LEX/YACC. The code output always seems to require a
 little hand editing, and then automating the build process doesn't work.

Well, the currently active GNU projects are called "flex" and "bison". Bison
requires hand editing? That's not true in my eyes. First, bison primarily
creates only the parser tables, not much more, not much less. If you want to
make changes in the parser, you can change the bison.simple template, once
and for all. There are some limits in bison, as it doesn't allow you to
specify how the actions are coded, which matters if you are doing non-C
parsers.

However, I agree that hand-writing a parser can have its advantages. It's
only too complicated for a small brain like mine :o) I personally never got
unary operators right this way; the mini languages for which I once wrote
the parser by hand quickly got out of my control, while bison parsers seem
to be pretty easy once you get it. However, sometimes figuring out how to
resolve some shift/reduce conflicts can be just as tedious.
 
 Lexers are so simple anyway I can't see a reason to use LEX.

I totally agree on that. However, it confuses me somewhat why there are two
different tools for spelling and for grammar; after all, in principle it is
the same, isn't it? Only in one case your tokens are the letters of the
alphabet, and in the other your tokens are the words output by the lexer
grammar.

- Axel

Dec 16 2001

"Walter" <walter digitalmars.com> writes:

"Axel Kittenberger" <axel dtone.org> wrote in message
news:9vho6a$1rdg$1 digitaldaemon.com...
 Walter wrote:
 I can never remember the definitions of those grammars. The current D
 parser, however, is pretty simple and is implemented as recursive descent.
 I have little use for LEX/YACC. The code output always seems to require a
 little hand editing, and then automating the build process doesn't work.

 Well, the currently active GNU projects are called "flex" and "bison". Bison
 requires hand editing? That's not true in my eyes. First, bison primarily
 creates only the parser tables, not much more, not much less. If you want to
 make changes in the parser, you can change the bison.simple template, once
 and for all. There are some limits in bison, as it doesn't allow you to
 specify how the actions are coded, which matters if you are doing non-C
 parsers.

My experience with GNU bison under Win32 is the output generated many
warnings from the compiler. The port of it to Win32 was incomplete. I didn't
want to rely on a decent port of bison existing on each platform I wanted to
port the product to. Even worse, in the end it didn't save any time.

Ok, so I'm not remotely a bison expert, and perhaps these are non-issues to
someone who has taken the time to thoroughly learn it.

 However, I agree that hand-writing a parser can have its advantages. It's
 only too complicated for a small brain like mine :o) I personally never got
 unary operators right this way; the mini languages for which I once wrote
 the parser by hand quickly got out of my control, while bison parsers seem
 to be pretty easy once you get it. However, sometimes figuring out how to
 resolve some shift/reduce conflicts can be just as tedious.

A hand-tuned parser can be very fast as well <g>.


 Lexers are so simple anyway I can't see a reason to use LEX.

 I totally agree on that. However, it confuses me somewhat why there are two
 different tools for spelling and for grammar; after all, in principle it is
 the same, isn't it? Only in one case your tokens are the letters of the
 alphabet, and in the other your tokens are the words output by the lexer
 grammar.

In principle they are the same, in practice not. You're not really after a
data graph being built by the lexer, just a token stream. Whereas with a
parser, a syntax graph is the desired result.

Also, despite what compiler textbooks imply, lexing and parsing are only
minor parts of a compiler. The real work is in the semantic analysis,
optimization, and code generation. For example, the lexer in D is 1400 lines
and the parser is 2600. Of course, the goal was to make them simple <g>.
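
To illustrate how little a lexer has to do, here is a minimal hand-written one in Python. It is a sketch only and bears no relation to the actual D lexer, which is not shown in this thread; the token classes and the name `lex` are assumptions made for the example. It produces a flat stream of (kind, text) pairs and builds no structure at all, which is the contrast with a parser drawn above.

```python
import re

# Hypothetical token classes for a tiny C-like language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=<>!]=?|[(){}\[\];,.]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def lex(src):
    """Yield (kind, text) pairs; whitespace is dropped."""
    pos = 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())
        pos = m.end()

if __name__ == "__main__":
    for token in lex("Apple * y;"):
        print(token)
```

A parser would consume this stream and build a tree from it; the lexer itself never needs to know how tokens nest.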

Dec 16 2001

Axel Kittenberger <axel dtone.org> writes:

Walter wrote:

 My experience with GNU bison under Win32 is the output generated many
 warnings from the compiler. The port of it to Win32 was incomplete. I
 didn't want to rely on a decent port of bison existing on each platform I
 wanted to port the product to. Even worse, in the end it didn't save any
 time.
 
 Ok, so I'm not remotely a bison expert, and perhaps these are non-issues
 to someone who has taken the time to thoroughly learn it.

I had no problems whatsoever compiling and running bison with MSVC++ (back in
the days when I still used it)

- Axel

Dec 16 2001

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

"Sean L. Palmer" wrote:

 If it's at all possible, I'd recommend making D have an LL(k) grammar for
 some small k, as this will vastly simplify the parser needed and allow alot
 more tools to manipulate D code.  Please don't force people to use LEX/YACC
 to parse D.  My script languages have always been LL(2) or LL(1) and I can
 get away with coding the parser by hand.  I think Antlr generates LL(k)
 parsers too.

I don't think that LL(k) is possible (without hacks).  The problem is the
following:

    <identifier> ****...*** <identifier>

Note that there could be any number of *'s.  At this point, a parser CANNOT
determine how to parse it.  Is it the beginning of a declaration, or is it an
expression?  If it is a type, then it should be parsed as:
    (((((...((<identifier> *) *) ... *) *) *) *) *) <identifier>
but if it is an expression, then the first * is a multiplication, and the rest
are dereferencing operators:
    ( <identifier> * (* (* (* ... (* (* <identifier> ))...))))

Obviously, it is legal for that string of *'s to be arbitrarily long.  Thus, in
order to write an LL(k) parser, you have to parse a rule that matches an
arbitrarily long string of *'s, and then use that either in an expression or a
declaration.  It's conceivable, but ugly.
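
The two readings can be made concrete with a small Python sketch. The tuple shapes and the names `as_declaration` and `as_expression` are hypothetical, chosen only to mirror the two bracketings above; both functions take the same token list and differ only in how they group the `*` tokens.

```python
def as_declaration(toks):
    # Every '*' wraps the type on its left in one more pointer level.
    ty = toks[0]
    for _ in toks[1:-1]:
        ty = ("pointer", ty)
    return ("declare", ty, toks[-1])

def as_expression(toks):
    # The first '*' multiplies; each remaining '*' dereferences the operand
    # on its right.
    operand = toks[-1]
    for _ in toks[2:-1]:
        operand = ("deref", operand)
    return ("multiply", toks[0], operand)

if __name__ == "__main__":
    toks = "a * * * b".split()
    print(as_declaration(toks))
    print(as_expression(toks))
```

The declaration tree nests to the left and the expression tree nests to the right, so a parser cannot start building either one until it knows which reading applies.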

I proposed an alternate declaration syntax (move the *'s to the left) that would
eliminate this problem.  I think that it might make D LALR(2)...

--
The Villagers are Online! villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]

Jul 22 2002

"Walter" <walter digitalmars.com> writes:

Michael Gaskins wrote in message <9li9i4$o6j$1 digitaldaemon.com>...
The language specifications look very good to me (I do mostly Java coding
right now but I've also done a fair amount of C and just a tad of C++).
Let's get a compiler for this thing (alpha level, beta level, whatever) to
stave off all the 'vaporware' shouters that will surely come.


The compiler does exist, but it is too embarrassingly buggy to post right now
<g>.

Aug 18 2001
