digitalmars.D - Request: (expression).stringof should return a textual AST

Don Clugston <dac nospam.com.au> writes:
Many uses of textual macros involve D expression syntax.
For example,

int x;
mixin(println("The value of x is $x and the next value is ${x+1}"));

If these embedded expressions are evaluated by manually parsing the 
string, we need to be able to identify D literals, such as
floating-point literals (-0x1.2_5p-38, 1e80, etc). Also, we need to 
enforce the precedence rules. This is quite a lot of code that is 
difficult to get right.

Almost all the difficulty could be avoided by standardizing the 
behaviour of (expression).stringof.

The behaviour of .stringof when presented with an expression has changed 
a couple of times already. According to the spec, (expression).stringof 
is not supposed to perform semantic analysis, but it currently does (bug 
#1142). It seems that it parses the expression, performs type checking, 
and returns a slightly modified string.

For use in metaprogramming, it would be extremely useful if instead, it 
parsed the string, without reference to types, removed unnecessary 
spaces and parentheses, and inserted parentheses to indicate precedence.

Under this proposal, with an expression, .stringof would return a value 
which was a standardised equivalent to the original string:

(1.2e+58+2*3).stringof --> (1.2e+58)+((2)*(3))
(func(var, var1*3.6)).stringof --> ((func)((var),((var1)*(3.6))))

This would allow code generators to accept D expressions embedded in 
strings, without needing to implement a lexer or precedence of 
operators; they only need to count the number of ( and ).

All terminal expressions would be wrapped in () (or alternatively, they 
could be terminated with a space -- doesn't matter as long as it is 
consistent).

If there is a mixin in the expression, it should be evaluated before 
.stringof is invoked. (It's not possible to know the precedence until 
you have the complete string).
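
A minimal sketch of the parenthesis-counting idea, assuming a current D2
compiler and the fully parenthesized output format proposed above.
topLevelParts is a hypothetical helper written for illustration, not part
of any library. It splits a string into its top-level groups and the
operator text between them without any lexer or precedence table:

```d
import std.stdio : writeln;

/// Splits a fully parenthesized expression string into its top-level
/// pieces: each balanced "(...)" group, plus any operator text between groups.
/// Hypothetical helper; assumes the proposed .stringof output format.
string[] topLevelParts(string s)
{
    string[] parts;
    size_t depth = 0;
    size_t start = 0;
    foreach (i, c; s)
    {
        if (c == '(')
        {
            if (depth == 0)
            {
                // Text between two groups is an operator such as "+".
                if (i > start) parts ~= s[start .. i];
                start = i;
            }
            ++depth;
        }
        else if (c == ')')
        {
            assert(depth > 0, "unbalanced parentheses");
            --depth;
            if (depth == 0)
            {
                // A complete top-level group has just closed.
                parts ~= s[start .. i + 1];
                start = i + 1;
            }
        }
    }
    assert(depth == 0, "unbalanced parentheses");
    if (start < s.length) parts ~= s[start .. $];
    return parts;
}

void main()
{
    // The proposed result of (1.2e+58+2*3).stringof, evaluated at compile time.
    enum parts = topLevelParts("(1.2e+58)+((2)*(3))");
    static assert(parts == ["(1.2e+58)", "+", "((2)*(3))"]);
    writeln(parts);
}
```

Applying the same function again to a group with its outer parentheses
removed walks down the expression tree one level at a time.
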
Apr 30 2007
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
And so with each passing day, D is becoming infix LISP. :D

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 03 2007
Julio César Carrascal <jcesar phreaker.net> writes:
Bruno Medeiros wrote:
 
 And so with each passing day, D is becoming infix LISP. :D
 

Yes, that's the black hole: http://blogs.sun.com/jag/entry/the_black_hole_theory_of
May 05 2007
Don Clugston <dac nospam.com.au> writes:
Don Clugston wrote:
 All terminal expressions would be wrapped in () (or alternatively, they 
 could be terminated with a space -- doesn't matter as long as it is 
 consistent).

After experimenting with this a bit more, wrapping everything in () is
not necessary or even desirable -- it does look pretty ugly.
It would be enough to ensure all terminal expressions are
space-terminated, and () is only required when the order of evaluation
should not be left-to-right, due to precedence or associativity. So my
examples would be:

(1.2e+58+2*3).stringof --> "1.2e+58 + ( 2 * 3 )"
(func(var, var1*3.6)).stringof --> "func ( var , var1 * 3.6 )"
and
(2*3+1.2e+58).stringof --> "2 * 3 + 1.2e+58"

Bug #1142 gets us most of the way there.
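
A short sketch of why space-terminated output is convenient, assuming a
current D2 compiler with Phobos and the space-separated format suggested
in this post. If every token is followed by whitespace, a plain
whitespace split recovers the token list, so the consumer needs no
knowledge of D literal syntax:

```d
import std.array : split;

void main()
{
    // Suggested .stringof outputs from the post; every token is
    // separated by whitespace, so no lexer is required.
    enum sum = "1.2e+58 + ( 2 * 3 )".split;
    static assert(sum == ["1.2e+58", "+", "(", "2", "*", "3", ")"]);

    enum call = "func ( var , var1 * 3.6 )".split;
    static assert(call == ["func", "(", "var", ",", "var1", "*", "3.6", ")"]);
}
```

Note that a literal such as 1.2e+58 stays in one piece even though it
contains a '+', which is exactly the case that is awkward to handle by
hand.
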
May 03 2007
Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:
Don Clugston wrote:
 Don Clugston wrote:
 Many uses of textual macros involve D expression syntax.


 Almost all the difficulty could be avoided by standardizing the
 behaviour of (expression).stringof.


 For use in metaprogramming, it would be extremely useful if instead, it
 parsed the string, without reference to types, removed unnecessary
 spaces and parentheses, and inserted parentheses to indicate precedence.


 After experimenting with this a bit more, wrapping everything in () is
 not necessary or even desirable -- it does look pretty ugly.
 It would be enough to ensure all terminal expressions are
 space-terminated, and () is only required when the order of evaluation
 should not be left-to-right, due to precedence or associativity. So my
 examples would be:
 
 (1.2e+58+2*3).stringof --> "1.2e+58 + ( 2 * 3 )"
 (func(var, var1*3.6)).stringof --> "func ( var , var1 * 3.6 )"
 and
 (2*3+1.2e+58).stringof --> "2 * 3 + 1.2e+58"

What is the single greatest value in the string representation? I mean
you're proposing that the compiler should basically possess the power to
construct these expressions (it already has, but it does not add the
parentheses atm) and also parse them afterwards (already possible).
Still, the library level code would have to reimplement a parser or at
least a lexer of its own.

Why not let the compiler do the heavy lifting, and concentrate on the
interesting parts? Nested tuples with S-exp properties could very well
handle this kind of expression passing and manipulation. They could be
flattened and .stringof'ed afterwards.
May 04 2007
Don Clugston <dac nospam.com.au> writes:
Jari-Matti Mäkelä wrote:
 What is the single greatest value in the string representation? I mean
 you're proposing that the compiler should basically possess the power to
 construct these expressions (it already has, but it does not add the
 parentheses atm) and also parse them afterwards (already possible).
 Still, the library level code would have to reimplement a parser or at
 least a lexer of its own.

True, but for the common case where a D expression is embedded, it's a
chunk of uninteresting code that needs to be maintained with compiler
changes. Especially this precedence stuff.

In answer to your question, the great value of the string representation
is that it is not required to be valid D code. Consider

println("the value of x is ${x} and next value is ${next(x)}")

-- you need to extract the parts which are expressions, before passing
it back to the compiler for parsing.
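
A rough sketch of that extraction step, assuming a current D2 compiler
and the ${...} placeholder syntax used in the println example.
embeddedExprs is a hypothetical helper written for illustration; it
finds each placeholder by counting braces and deliberately ignores
braces that might appear inside string literals within an expression:

```d
/// Returns the text of every ${...} placeholder in fmt, found by
/// counting braces. Hypothetical helper; does not handle braces that
/// occur inside string or character literals.
string[] embeddedExprs(string fmt)
{
    string[] exprs;
    size_t i = 0;
    while (i + 1 < fmt.length)
    {
        if (fmt[i] == '$' && fmt[i + 1] == '{')
        {
            size_t start = i + 2;
            size_t depth = 1;
            size_t j = start;
            // Scan to the matching '}' so nested braces stay inside
            // the extracted expression.
            while (j < fmt.length && depth > 0)
            {
                if (fmt[j] == '{') ++depth;
                else if (fmt[j] == '}') --depth;
                ++j;
            }
            assert(depth == 0, "unterminated ${...}");
            exprs ~= fmt[start .. j - 1];
            i = j;
        }
        else
        {
            ++i;
        }
    }
    return exprs;
}

void main()
{
    // The format string from the post, processed at compile time.
    enum exprs = embeddedExprs(
        "the value of x is ${x} and next value is ${next(x)}");
    static assert(exprs == ["x", "next(x)"]);
}
```

Each extracted string could then be handed back to the compiler, for
instance through a string mixin, which is where a standardised
.stringof would help.
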
 Why not let the compiler do the heavy lifting, and concentrate on the
 interesting parts? Nested tuples with S-exp properties could very well
 handle this kind of expression passing and manipulation. They could be
 flattened and .stringof'ed afterwards.

Perhaps. I think that the Lisp approach is not going to be optimal for
D, though. I think that most of the compiler's functionality could be
exposed to metaprogramming simply by adding properties to string
literals and tuples, including something to make a tuple out of the
elements of a string. I envisage moving back and forth between string
and tuple representations, since certain operations are much simpler in
one representation than the other.
May 05 2007
janderson <askme me.com> writes:
Crazy thought 2003

Why not have the ability to stringarise any syntax that is within scope:

module me;

void func()
{
    int X;
    ...
}

void bar()
{
    string funcStr = stringof(func());
    //funcStr = "void func()\n{\n int X\n...\n}"

    string funcStr2 = stringof((func()));
    //funcStr2 = "func()"

    int X = 10;
    int Y = 200;

    string funcStr3 = stringof((X)+Y);
    //funcStr3 = "10 + Y"

    string funcStr4 = stringof((X + 1) + Y);
    //funcStr4 = "(X + 1) + Y"

    string funcStr5 = stringof(((X) + 1) + Y);
    //funcStr5 = "11 + Y"

    string funcStr6 = stringof(me);
    //funcStr6 = "module me; void func(); void bar();"

    //Note I'm using brackets here to specify what is the
    //variable and what is a value. It probably should be
    //another syntax.
}

//I know you could turn everything into strings and use mixins,
//however it just doesn't look so nice.

-Joel
May 04 2007