digitalmars.D - PROPOSAL: opSeq()

Russell Lewis <webmaster villagersonline.com> writes:

PROPOSAL: A way to handle sequences of expressions which otherwise would 
have been syntax errors


EXAMPLE CODE:
	my_for(i=0, i<10, i++) { <code> }


PARSER DETAILS:

Add a grammar rule that works as follows:
	expression:
		expression expression+

(I'm not sure exactly where in the precedence hierarchy this rule should 
  go.  Maybe at the level of an assign expression?)


GRAMMAR DETAILS:

Any time that we parse the above rule, the left-hand expression must be 
a "sequence handler."  A sequence handler is either a delegate, or a 
struct which implements the function "opSeq()".

The number and type of arguments of the handler determine how many, and 
what type, of expressions can follow the handler.

The return value from the handler can be void, or a value.

If the handler has fewer arguments than we have expressions in the 
sequence, then the return value from the first handler may be a second 
handler, and thus we can chain handlers.

If the types of the expressions don't match, or the sequence of 
expressions has too few elements, then we have a syntax error.

Handlers are always right-associative.  This means that if we have a 
series of expressions:
	handler1 expressionA handler2 expressionB

then this first becomes:
	handler1 expressionA handler2(expressionB)

Then, if handler1 has 2 arguments, it becomes
	handler1(expressionA, handler2(expressionB))

However, if handler1 has only 1 argument, then it must return a handler 
for the second expression:
	handler1(expressionA)(handler2(expressionB))
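
The two desugared forms above can be tried in today's D by writing the calls out by hand. This is only a rough sketch under assumptions: opSeq does not exist, so opCall stands in for it, and the names Handler1, Handler2 and Tail are hypothetical. Handler1 carries both overloads here purely so that both forms can be shown side by side.

```d
// Hypothetical stand-ins: opSeq is not part of D, so the desugared
// calls are spelled out with opCall.

// The handler returned by the one-argument form of Handler1.
struct Tail
{
    int first;

    // Consumes the second expression of the sequence.
    int opCall(int second) { return first * 100 + second; }
}

struct Handler1
{
    // Two-argument form: handler1(expressionA, handler2(expressionB))
    int opCall(int a, int b) { return a * 100 + b; }

    // One-argument form: returns a handler for the remaining expression,
    // giving handler1(expressionA)(handler2(expressionB)).
    Tail opCall(int a)
    {
        Tail t;
        t.first = a;
        return t;
    }
}

struct Handler2
{
    int opCall(int b) { return b + 1; }
}

unittest
{
    Handler1 handler1;
    Handler2 handler2;

    // Right-associative grouping: handler2 takes expressionB first.
    assert(handler1(7, handler2(4)) == 705);
    assert(handler1(7)(handler2(4)) == 705);
}
```

Compiling with -unittest -main runs the checks. Both spellings give the same result, which is the property the chaining rule relies on.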


IMPLEMENTATION EXAMPLE:

// my_for().
//
// Note that the "lazy void" overload of opSeq handles single-line
// bodies with no {} while the "void delegate()" overload handles
// bodies with {}.


MyFor my_for(lazy void init, lazy bool test, lazy void inc)
{
   MyFor ret;
     ret.init = init;
     ret.test = test;
     ret.inc  = inc;
   return ret;
}


struct MyFor
{
   void delegate() init;
   bool delegate() test;
   void delegate() inc;

   // Bare statement: wrap it in a delegate and forward to the
   // block overload.
   void opSeq(lazy void block)
   {
     opSeq({ block(); });
   }

   void opSeq(void delegate() block)
   {
     init();

     while(test())
     {
       block();
       inc();
     }
   }
}
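
For comparison, here is a rough sketch of how close current D gets without any grammar change. It assumes opCall in place of the proposed opSeq, and explicit delegates in place of the lazy parameters so that the three clauses can be stored in the struct. The names myFor, start, step and sumBelow are made up for the illustration.

```d
// Sketch only: the trailing body goes through opCall because D has no opSeq.
struct MyFor
{
    void delegate() start;
    bool delegate() test;
    void delegate() step;

    void opCall(void delegate() block)
    {
        start();
        while (test())
        {
            block();
            step();
        }
    }
}

MyFor myFor(void delegate() start, bool delegate() test, void delegate() step)
{
    MyFor ret;
    ret.start = start;
    ret.test = test;
    ret.step = step;
    return ret;
}

int sumBelow(int limit)
{
    int i, sum;
    // Proposed syntax: my_for(i=0, i<limit, i++) { sum += i; }
    myFor({ i = 0; }, () => i < limit, { i++; })({ sum += i; });
    return sum;
}

unittest
{
    assert(sumBelow(10) == 45);
    assert(sumBelow(0) == 0);
}
```

The extra parentheses around the body, and the delegate literals for each clause, are exactly the noise the proposal is trying to remove.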

Apr 07 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

Russell Lewis wrote:
 PROPOSAL: A way to handle sequences of expressions which otherwise would 
 have been syntax errors
 
 
 EXAMPLE CODE:
     my_for(i=0, i<10, i++) { <code> }

Are you familiar with the "trailing delegates" proposal?

Basically the idea there is that any {<code>} block following a function 
call would be treated as an extra argument to the function.

So if you write the function:
void my_for(lazy void init, lazy bool test, lazy void inc, void delegate())
{ ... }

then your EXAMPLE_CODE above would call that function.

Your proposal would have one benefit over that in that you could make 
"my_for" a varargs function if you wanted to.  Though, the trailing 
delegates idea could probably be fixed to handle that too, for example by 
making the trailing delegate the first argument instead of the last 
(kinda like what opIndexAssign does).

Overall I think trailing delegates sounds like a simpler, more elegant 
approach.  Can you point out any other benefits of your proposal that 
trailing delegate args would not have?

I believe Walter's response previously has been that we should just get 
used to looking at things like:

     my_for(i=0,i<10,i++,{<code>});

instead of adding complications to the grammar to support such things.
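
A minimal sketch of that explicit form, which compiles in D as it stands. The parameter names and the helper sumBelow are illustrative only; the body is just the last ordinary argument.

```d
// Sketch: no grammar change, the block is passed as a normal delegate.
void myFor(lazy void start, lazy bool test, lazy void step,
           void delegate() block)
{
    start();
    while (test())
    {
        block();
        step();
    }
}

int sumBelow(int limit)
{
    int i, sum;
    myFor(i = 0, i < limit, i++, { sum += i; });
    return sum;
}

unittest
{
    assert(sumBelow(10) == 45);
}
```

The lazy parameters let the three clauses be written as plain expressions; only the body needs the delegate literal braces.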

--bb

Apr 07 2008

downs <default_357-line yahoo.de> writes:

Bill Baxter wrote:
 I believe Walter's response previously has been that we should just get
 used to looking at things like:
 
     my_for(i=0,i<10,i++,{<code>});
 
 instead of adding complications to the grammar to support such things.
 
 --bb

FWIW and just FYI, the fewest closing brackets can be had with my_for(i=0,
i<10, i++) = {<code>}; using an overloaded opAssign.

To make it flexible, template opAssign and make it lazy to allow chaining; i.e.
my_for(...) = your_for(...) = {<code>};

For example, I use this in dglut:
const string LazyCall="
  static if (is(T==void)) t();
  else static if (is(T==void delegate())) t()();
  else static assert(false, T.stringof);
";

Of course, I'd still rather have trailing DGs or full infix support. ^^

 --downs

Apr 07 2008

Russell Lewis <webmaster villagersonline.com> writes:

Bill Baxter wrote:
 Russell Lewis wrote:
 PROPOSAL: A way to handle sequences of expressions which otherwise 
 would have been syntax errors


 EXAMPLE CODE:
     my_for(i=0, i<10, i++) { <code> }

 
 Are you familiar with the "trailing delegates" proposal?
 
 Basically the idea there is that any {<code>} block following a function 
 call would be treated as an extra argument to the function.
 
 So if you write the function:
 void my_for(lazy void init, lazy bool test, lazy void inc, void delegate())
 { ... }
 
 then your EXAMPLE_CODE above would call that function.

Yes, I am familiar with the concept.  My proposal is a generalization of 
that which is able to handle any type of expression, and also to handle 
multiple expressions.

OPEN QUESTION: What happens if an opSeq-type struct is *not* followed by 
anything?  Do we need syntax to indicate whether that is legal or not?

You asked how opSeq is better than trailing delegates, so here are some 
more examples of things that opSeq can do:



1) Bare statements.  Take a look at my implementation of the MyFor 
struct from the original post.  One of the overloads of opSeq takes 
"lazy void block", which means that this syntax is also legal:
	my_for(i=0, i<10, i++)
		a = a+1;

2) Suffixes.  People have suggested that the expression
	3 + 2i
be something that can be implemented entirely as a library.  If i was a 
variable and we supported "opSeqRev", then it would be easy!

3) Multiple arguments.  Trailing delegates can't implement complex 
syntaxes, such as do...while.  opSeq can.  At the bottom of this post, 
I'll post code that will handle all of the following:
	MyWhile(a != b) <bare statement>;
	MyWhile(a != b) { <block>}
	MyDo <bare statement> MyWhile(a != b);
	MyDo { <block> } MyWhile(a != b);

4) Generalized syntax.  The examples above indicate to me that a lot of 
D's syntax could be implemented in a library using opSeq.  Would that 
allow many of D's constructs to be first class entities?  Might that 
allow us to implement more functional-language type features?



Here's the example code I promised:

BEGIN CODE

struct While
{
   bool delegate() cond;

   void opSeq(lazy void bareStatement)
   {
     opSeq({ bareStatement(); });
   }
   void opSeq(void delegate() block)
   {
     if(cond())
     {
BEGIN_LOOP:  // so I don't have to use D's while!
       block();
       if(cond())
         goto BEGIN_LOOP;
     }
   }
}

While MyWhile(lazy bool cond)
{
   While ret;
     ret.cond = cond;
   return ret;
}


struct Do
{
   void opSeq(lazy void bareStatement, While the_while)
   {
     opSeq({ bareStatement(); }, the_while);
   }
   void opSeq(void delegate() block, While the_while)
   {
     block();
     the_while block;
   }
}

// this isn't a function, it's a variable.  that's because
// the use of MyDo doesn't use parens.
Do MyDo;

END CODE
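
As a rough cross-check of the semantics, the same two constructs can be approximated in current D. This sketch assumes opCall instead of opSeq and a plain delegate instead of the lazy condition; myWhile, myDo and countUp are hypothetical names, and the inner loop uses D's own while for brevity.

```d
struct While
{
    bool delegate() cond;

    void opCall(void delegate() block)
    {
        while (cond())
            block();
    }
}

While myWhile(bool delegate() cond)
{
    While ret;
    ret.cond = cond;
    return ret;
}

struct Do
{
    void opCall(void delegate() block, While theWhile)
    {
        block();          // the body always runs once
        theWhile(block);  // then repeats while the condition holds
    }
}

int countUp(int start, int limit)
{
    Do myDo;
    int n = start;
    // Proposed syntax: MyDo { n++; } MyWhile(n < limit);
    myDo({ n++; }, myWhile(() => n < limit));
    return n;
}

unittest
{
    assert(countUp(0, 3) == 3);    // loops until the condition fails
    assert(countUp(10, 3) == 11);  // body still runs once
}
```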

Apr 08 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

Ok.  Good examples.  Here's another that I suppose would be possible:

5) Cast-like syntaxes.  For instance the to! template in Phobos 2.x and 
Tango acts like a cast more or less, but you have to parenthesize the 
argument.  Currently:
     int x = 5;
     string y = to!(string)(x); // ok!
     string z = to!(string) x; // error!

But with your opSeq, I think the latter could be made legal, too.  IIUC.
I mention this because I keep forgetting to put those parentheses around 
to!'s argument because it just feels so darn much like a cast.
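
For reference, a small sketch of what does compile with Phobos's std.conv.to. The second spelling uses uniform function call syntax, a later D feature that is not part of the opSeq proposal, but it also avoids parentheses around the argument.

```d
import std.conv : to;

unittest
{
    int x = 5;
    string y = to!(string)(x); // parenthesized form from the example above
    string z = x.to!string;    // UFCS form: no parentheses around x
    assert(y == "5");
    assert(z == "5");
}
```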

It's an interesting idea.  Are you sure it doesn't kill 
the-ease-of-parsing requirement for the grammar?

--bb

Apr 08 2008

Russell Lewis <webmaster villagersonline.com> writes:

Bill Baxter wrote:
 It's an interesting idea.  Are you sure it doesn't kill 
 the-ease-of-parsing requirement for the grammar?

That is something that I have worried about, as well, and I haven't done 
a rock-solid analysis of it.  However, my hand-waving argument is that 
we parse the code without any knowledge of the types (we don't know 
which are opSeq handlers and which are not).  If our parsing shows us 
that we have a sequence of expressions without any sort of operator 
between them, then we interpret that using the opSeq parse rule:

	expression:
		expression expression ...

Then, in semantic analysis, we would decide whether that syntax is valid 
or not.  Since opSeq is right-associative, we start at the far-right of 
any chain of expressions, and see if the next-to-last expression is an 
opSeq handler; if so, it must take 1 argument, and the type must match 
the rightmost expression.  If not, then we work left, and so on.

Mechanically, I think I can argue that this doesn't make the parser any 
more complex.  What I don't know for sure, yet, is whether it introduces 
ambiguities into the grammar.  Those often require a tool to find. :(

Russ

Apr 09 2008

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Russell Lewis wrote:
 Bill Baxter wrote:
 It's an interesting idea.  Are you sure it doesn't kill 
 the-ease-of-parsing requirement for the grammar?

 
 That is something that I have worried about, as well, and I haven't done 
 a rock-solid analysis of it.  However, my hand-waving argument is that 
 we parse the code without any knowledge of the types (we don't know 
 which are opSeq handlers and which are not).  If our parsing shows us 
 that we have a sequence of expressions without any sort of operator 
 between them, then we interpret that using the opSeq parse rule:
 
     expression:
         expression expression ...
 
 Then, in semantic analysis, we would decide whether that syntax is valid 
 or not.  Since opSeq is right-associative, we start at the far-right of 
 any chain of expressions, and see if the next-to-last expression is an 
 opSeq handler; if so, it must take 1 argument, and the type must match 
 the rightmost expression.  If not, then we work left, and so on.
 
 Mechanically, I think I can argue that this doesn't make the parser any 
 more complex.  What I don't know for sure, yet, is whether it introduces 
 ambiguities into the grammar.  Those often require a tool to find. :(

Well, one ambiguity is stuff like: "foo 1 -2". Is this foo.opSeq(1 - 2) 
(i.e. foo.opSeq(-1)) or foo.opSeq(1, -2)?
Ditto for '~' (concatenation versus bitwise negation), '&' (bitwise-and 
versus address-of), '!' (template instantiation versus logical 
negation), '.' ("member of" versus "look up in the global scope"), '+' 
(addition versus numeric identity function), '*' (multiplication versus 
dereferencing).

If you only allow _unexpected_ expressions, as you suggest, that would 
mean always choosing the first alternative above. That would mean you'd 
have to disambiguate the unary versions of those operators by placing 
them in parentheses: "foo 1 (-2)" instead of the initial example.
But that leaves another ambiguity: what about "foo x (-2)"? That would 
translate to foo.opSeq(x(-2)). I don't think this one can be resolved; 
even placing parentheses around x doesn't work. For example, if x is a 
delegate, the expression would mean the same thing with or without 
parentheses around it, so there would be no way to call Foo.opSeq(void 
delegate(int), int) except explicitly.

Besides, if you're going to place parentheses around all the operands 
you might as well overload opCall and be done with it, without any 
syntax extensions or added ambiguity at all.
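
A short sketch of both points in current D. Foo and its two opCall overloads are hypothetical; they only report how many operands they were given.

```d
struct Foo
{
    // Hypothetical handler: returns the number of operands it received.
    int opCall(int a) { return 1; }
    int opCall(int a, int b) { return 2; }
}

unittest
{
    // Whitespace does not matter: these parse as binary operators today.
    static assert(1 -2 == -1);
    static assert(1 +2 == 3);
    static assert("a" ~"b" == "ab");

    Foo foo;
    // With opCall, the parentheses and comma make the boundaries explicit.
    assert(foo(1 -2) == 1);   // one operand: 1 - 2
    assert(foo(1, -2) == 2);  // two operands: 1 and -2
}
```

An opSeq rule would have to pick one of these readings for "foo 1 -2" with no comma to go on.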

Apr 09 2008

Russell Lewis <webmaster villagersonline.com> writes:

Frits van Bommel wrote:
 Well, one ambiguity is stuff like: "foo 1 -2". Is this foo.opSeq(1 - 2) 
 (i.e. foo.opSeq(-1)) or foo.opSeq(1, -2)?
 Ditto for '~' (concatenation versus bitwise negation), '&' (bitwise-and 
 versus address-of), '!' (template instantiation versus logical 
 negation), '.' ("member of" versus "look up in the global scope"), '+' 
 (addition versus numeric identity function), '*' (multiplication versus 
 dereferencing).
 
 If you only allow _unexpected_ expressions, as you suggest, that would 
 mean always choosing the first alternative above. That would mean you'd 
 have to disambiguate the unary versions of those operators by placing 
 them in parentheses: "foo 1 (-2)" instead of the initial example.
 But that leaves another ambiguity: what about "foo x (-2)"? That would 
 translate to foo.opSeq(x(-2)). I don't think this one can be resolved, 
 even placing parentheses around x doesn't work. For example, if x is a 
 delegate, the expression would mean the same thing with or without 
 parentheses around it, so there would be no way to call Foo.opSeq(void 
 delegate(int), int) except explicitly.
 
 Besides, if you're going to place parentheses around all the operands 
 you might as well overload opCall and be done with it, without any 
 syntax extensions or added ambiguity at all.

Good points.  I'll ponder 'em.

Apr 09 2008
