
digitalmars.D.announce - Compile-Time Regular Expressions

reply pragma <pragma_member pathlink.com> writes:
(Special thanks go to Don Clugston, as this engine is based 100% on his
metaprogramming lib)

With so much talk about the possibility of Compile-time regex in D, I couldn't
resist trying to make a stab at it.  I'm pretty happy with the results.

http://trac.dsource.org/projects/ddl/browser/trunk/meta

regex.d has more information on what's working and what's not.  The
implementation isn't 100% complete, but it stands as a proof-of-concept as to
what's possible with D's templates.

Anyway, the end result is that it's really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s", exp("hello    world"));

Enjoy!

-----

Don Clugston approached me a while back and asked if he could get a
metaprogramming library going under DDL (for now).  Without hesitation, I set
him up and off he went creating a wonderful library of workarounds and utils,
which make template coding a breeze.  His code turned out to be so useful that
I wrote this (mostly complete) Compile-time Regular Expression engine.

This is my first parser in D templates, so please bear with me.   It really
could be optimized any number of different ways, but at least the code generated
from the regex templates is very efficient and even minimalistic to a point.
Much of the oddities in the code come from heavy use of "type traits" (a
technique borrowed from c++), which really cuts down on the verbosity of the
code and gets the most out of each template call.  

The rest of the strangeness in there comes from limitations in how template code
works (and doesn't work).  Things like static if() ignoring declarations in the
current scope, slicing not working the way it should, and templates not always
specializing on alias params caused the whole thing to bloat a bit.  As these
get worked out of DMD, I would expect a more refined engine to emerge.

In the end, I'm confident that doing the same thing in C++ would take at least
twice as much code, and would be about half as readable. From what I've seen of
Boost's implementation, that's pretty much the case.  Don's pioneering work in
template coding for DDL's symbol mangling code really set the tone here, and
laid the foundation for more ambitious projects like this.

------

The theory of operation here is fairly straightforward.  The parser, a set of
templates in regex.d, tears apart the query string and breaks it down into the
various tests that make up the query.  At each terminal/production, the parser
pulls in an instance of a test function template from testPredicate.d.  These
predicates are passed back along the parser call chain via a type trait (named
'fn' for consistency), which is then supplied to other test predicates that
and/or multiple tests together.  The top-level production is the function that
is returned via the original template instance; this becomes the regex handle
that you use to make your queries at runtime.

This means that, for now anyway, the implementation uses zero classes and zero
structs.  It's just instances of templated functions, and the minimal set needed
to support the query at that.  As a side-effect of this design, you can
'manually' compose the test predicates yourself by using the templates in
regexPredicate.d directly -- this made testing and prototyping the system very
easy.

Enjoy!


- EricAnderton at yahoo
Dec 17 2005
next sibling parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"pragma" <pragma_member pathlink.com> wrote in message 
news:do18bj$18ge$1 digitaldaemon.com...
 (Special thanks goes to Don Clugston, as this engine is based 100% on his
 metaprogramming lib)

 With so much talk about the possibility of Compile-time regex in D, I 
 couldn't
 resist trying to make a stab at it.  I'm pretty happy with the results.

 http://trac.dsource.org/projects/ddl/browser/trunk/meta

 regex.d has more information on what's working and what's not.  The
 implementation isn't 100% complete, but it stands as a proof-of-concept as 
 to
 what's possible with D's templates.

 Anyway, the end result is that its really easy to use:

 # auto exp = &regexMatch!(`[a-z]*\s*\w*`);
 # writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help thinking that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str, uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable, and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg, "Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win in writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D. Am I missing something?
Dec 17 2005
next sibling parent reply Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Ben Hinkle wrote:
 "pragma" <pragma_member pathlink.com> wrote in message 
 news:do18bj$18ge$1 digitaldaemon.com...
 
(Special thanks goes to Don Clugston, as this engine is based 100% on his
metaprogramming lib)

With so much talk about the possibility of Compile-time regex in D, I 
couldn't
resist trying to make a stab at it.  I'm pretty happy with the results.

http://trac.dsource.org/projects/ddl/browser/trunk/meta

regex.d has more information on what's working and what's not.  The
implementation isn't 100% complete, but it stands as a proof-of-concept as 
to
what's possible with D's templates.

Anyway, the end result is that its really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help think that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str,uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg,"Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win by writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D coding. Am I missing something?

Reading that article about Dimensional Analysis that someone posted, I thought
about something like that. First I was trying to understand why that (DA)
wouldn't be implementable in D (I didn't know what implicit function template
instantiation was then). After understanding the issue, I started thinking how
such a feature (IFTI) could be designed into D, preferably in a clean way,
since that C++ code was looking ghastly.

But then I just realized, one could just implement Dimensional Analysis with
normal structs & methods, and with a sufficiently smart compiler, it would be
able to inline and simplify away most of the calculations, since they are all
based on compile-time values and operations. And that was probably the best,
cleanest way.

So yes, we should probably keep in mind this idea, that templates, const
values and inlining are closely related, right? Could there be practical
ramifications in terms of language design? Wait, quite some ideas are already
starting to come to my head, hum...

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be...
unnatural."
Dec 17 2005
parent reply BCS <BCS_member pathlink.com> writes:
Bruno Medeiros wrote:
 Ben Hinkle wrote:
 
 Neat stuff, but as I glanced over the code I couldn't help think that 
 such a use of templates is really just "forced inlining".


 Instead of making the template language in D more complex, is it 
 possible to instead specify which parts of the "normal" D language are 
 inlinable and how to explicitly ask to inline them?


 The downside is having two languages in 
 D - the "run-time" D and the "compile-time" D coding. Am I missing 
 something?

Reading that article about Dimensional Analysis that someone posted, I thought
about something like that. First I was trying to understand why that (DA)
wouldn't be implementable in D (I didn't know what implicit function template
instantiation was then). After understanding the issue, I started thinking how
such a feature (IFTI) could be designed into D, preferably in a clean way,
since that C++ code was looking ghastly.

But then I just realized, one could just implement Dimensional Analysis with
normal structs & methods, and with a sufficiently smart compiler, it would be
able to inline and simplify away most of the calculations, since they are all
based on compile-time values and operations. And that was probably the best,
cleanest way.

So yes, we should probably keep in mind this idea, that templates, const
values and inlining are closely related, right? Could there be practical
ramifications in terms of language design? Wait, quite some ideas are already
starting to come to my head, hum...

I think that the primary advantage of using templates is that they can allow
for automated setup/verification of complex environments at compile time. How
about building a static binary search tree? It would be nice to be able to
initialize such a tree from an unsorted array and be sure that it gets put
together right. Something like this:

    auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

One advantage of the template solution is that you can trust that it is
evaluated at compile time. This would be important for the verification usage.
For the Dimensional Analysis case, using a templated version would ensure
that, if the program compiles, then no unit errors exist; if you do this with
structs/classes then the compiler might not inline everything, and as a result
might not check all of your math, so a unit error might still exist. The
important difference is that templates always inline, functions might not.

In a nutshell, I see a distinct advantage in code that is ensured to be
completely evaluated at compile time.
Dec 19 2005
parent reply "Kris" <fu bar.com> writes:
"BCS" <BCS_member pathlink.com> wrote
[snip]
 I think that the primary advantage of using templates is that they can 
 allow for automated setup/verification of complex environments at compile 
 time. How about building a static binary search tree? It would be nice to 
 be able to initialize such a tree from an unsorted array and be sure that 
 it gets put together right. something like this


 auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

 would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor? [snip]
 In a nut shell, I see a distinct advantage of code that is insured to be 
 completely evaluated at compile time.

As do I. Another perspective on this is to consider Templates as being extensions to the compiler proper. While such things can be very useful, I imagine it's not hard to get carried away building a language within a language. Still, being able to instruct the compiler to build a regex state-machine for you is pretty powerful, and likely useful for some ~ I'd use it. A full blown compiler-compiler is then not so far-fetched ... where does one draw the line?
Dec 19 2005
parent reply BCS <BCS_member pathlink.com> writes:
Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]
 
I think that the primary advantage of using templates is that they can 
allow for automated setup/verification of complex environments at compile 
time. How about building a static binary search tree? It would be nice to 
be able to initialize such a tree from an unsorted array and be sure that 
it gets put together right. something like this


auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

It could, and that might be good enough; however this would add to the
start-up cost each time the program is run. Why spend code and time setting up
that state of a program when that state can be set ahead of time by just
having preinitialized data space?
Dec 19 2005
parent reply "Kris" <fu bar.com> writes:
"BCS" <BCS_member pathlink.com> wrote ...
 Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]

I think that the primary advantage of using templates is that they can 
allow for automated setup/verification of complex environments at compile 
time. How about building a static binary search tree? It would be nice to 
be able to initialize such a tree from an unsorted array and be sure that 
it gets put together right. something like this


auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

good enough) however this would add to the start up cost each time the program is run. Why spend code and time setting up that state of a program when that state can be set ahead of time by just having preinitialized data space?

OK ~ I follow you. In the past, some folks would precompute "expensive" data
into a file and then load it up at start time. You're effectively noting that
this might be done with templates instead, embedding said data right into the
executable.
Dec 19 2005
parent reply Sean Kelly <sean f4.ca> writes:
Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote ...
 Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]

 I think that the primary advantage of using templates is that they can 
 allow for automated setup/verification of complex environments at compile 
 time. How about building a static binary search tree? It would be nice to 
 be able to initialize such a tree from an unsorted array and be sure that 
 it gets put together right. something like this


 auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

 would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

good enough) however this would add to the start up cost each time the program is run. Why spend code and time setting up that state of a program when that state can be set ahead of time by just having preinitialized data space?

OK ~ I follow you. In the past, some folk would precompute "expensive" data into a file and then load it up at start time. You're effectively noting that might be done with templates, embedding said data right into the executable.

Yup. I think the usefulness of templates here is that the data is integrated
with the runtime code, which eliminates the need for the awkward syntax and
error detection associated with storing precalculated data in files. So far,
this seems to have been most useful for science and mathematics, as
calculations can simply be done at compile time "whenever possible" without
the need for programmer involvement. And this can still be supplemented by the
file method if there is a larger class of data that can be precalculated but
for some reason cannot be done in code.

I think determining a tipping point is probably situation dependent. In C++,
for example, I find a lot of what's in Boost to be a fantastic example of
what's possible using C++ templates, but the implementation is often
(necessarily) so horrendous that I'm disinclined to actually use it.
Compilation time is another issue, as template code can become so complex that
an application might literally take days to compile. At some point the
run-time efficiency simply becomes not worth the coding or compile-time cost.

Sean
Dec 19 2005
next sibling parent BCS <BCS_member pathlink.com> writes:
Sean Kelly wrote:
 Kris wrote:
 
 OK ~ I follow you. In the past, some folk would precompute "expensive" 
 data into a file and then load it up at start time. You're effectively 
 noting that might be done with templates, embedding said data right 
 into the executable. 

Yup. I think the usefulness of templates is that the data is integrated with
the runtime code, which eliminates the need for the awkward syntax and error
detection associated with storing precalculated data in files. So far, this
seems to have been most useful for science and mathematics, as calculations
can simply be done at compile time "whenever possible" without the need for
programmer involvement. And this can still be supplemented by the file method
if there is a larger class of data that can be precalculated but for some
reason can not be done in code.

I think determining a tipping point is probably situation dependent. In C++,
for example, I find a lot of what's in Boost to be a fantastic example of
what's possible using C++ templates, but the implementation often
(necessarily) so horrendous that I'm disinclined to actually use it.
Compilation time is another issue, as template code can become so complex that
an application might literally take days to compile. At some point the
run-time efficiency simply becomes not worth the coding or compile-time cost.

Sean

The tipping point also depends on what the program is for. If you are writing
a CGI app that will need to start, run and generate output while a user is
waiting (and the program will run millions of times), compile time won't be
nearly so much of an issue as if you are writing a finite element analysis
program that will only be run a half dozen times for each time it is compiled.

One other thought: to keep the compile/run cycle manageable, maybe many of
these libraries should have a setting to change them from compile-time code to
run-time code. To use my last example:

    template BuildBTree(T, bool function(T,T) cmp, T[] ar)
    {
      static if(atCompile) // if at compile time, generate the tree now
        const BTree!(T) BuildBTree = ... // some template magic
      else                 // else insert code to call a run-time function
        BTree!(T) BuildBTree = buildBTreeAtRuntime!(T)(cmp, ar);
    }

I know that doesn't work but you get the idea.
Dec 19 2005
prev sibling parent "Walter Bright" <newshound digitalmars.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:do769q$1k87$1 digitaldaemon.com...
 I think determining a tipping point is probably situation dependent.  In 
 C++, for example, I find a lot of what's in Boost to be a fantastic 
 example of what's possible using C++ templates, but the implementation 
 often (necessarily) so horrendous that I'm disinclined to actually use it.

The problem with C++ template metaprogramming is it was discovered as a side effect of templates. The templates were never designed to do that. Hence, it is, as you say, horrendous. In D we have the benefit of seeing where we want to go and putting the road directly there, rather than just following the arbitrary lay of the land and seeing where we wound up.
Jan 03 2006
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
 "pragma" <pragma_member pathlink.com> wrote in message 
 news:do18bj$18ge$1 digitaldaemon.com...
Anyway, the end result is that its really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help think that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str,uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg,"Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win by writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D coding. Am I missing something?

It would in theory be possible to apply the D meaning of const to functions
and function parameters. For example, here's the metaprogramming "pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow = 1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}

Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are known at
compile time. (Maybe the "const" before real a and int b is unnecessary, as it
is implied by the const return?) Note that this is completely different to the
C++ meaning of const.

This example shows how similar compile-time and run-time D are, compared to
C++. It's not hard to take a compile-time program using template value
parameters and convert it to a run-time program by converting "static if" to
"if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template one
is nicer because:
(a) the optimisation would automatically apply to existing code.
(b) you'd be able to use templates! The function should apply for any type of
a, provided it supports opMul(). Ie, you should be able to write:

template (A)
const A pow(const A a, const int b)
{
 ...
}

(c) things like DDoc would work properly.
(d) it could conceivably work for types other than strings, ints, and reals.

It'd be pretty tough on the compiler, though.
Dec 19 2005
next sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
 It would in theory be possible to apply the D meaning of const to 
 functions and function parameters. For example, here's the metaprogramming 
 "pow!()":

 /** real pow!(real a, int b)
  * Fast integer powers
  */
 template pow(real a, int b)
 {
   static if (b==0) const real pow=1.0L;
   else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
   else static if (b==1) const real pow = a;
   else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
   else const real pow = .pow!(a*a, b>>1);
 }


 Ideally, this could be written as:

 const real pow(const real a, const int b)
 {
   if (b==0) return 1.0L;
   else if (b<0) return 1.0L/pow(a, -b);
   else if (b==1) return a;
   else if (b & 1) return a * pow(a*a, b>>1);
   else return pow(a*a, b>>1);
 }

 That is, provide an overload for the case in which all parameters are 
 known at compile time. (maybe the "const" before real a, real b are 
 unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure it's worth making
the language and overloading rules more complex. For example, given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0, 1)? The compiler could convert the
const double to const real and call the compile-time pow, or it could call the
run-time pow. I think it's better to have either 1) a separate function name
for compile-time pow if the algorithm for the compile-time version is
different than the run-time version, or 2) import a module called something
like std.inlinemath that has the compile-time pow and is tailored to let the
compiler inline as much as possible.
 Note that this is completely different to the C++ meaning of const.
 This example shows how similar the compile-time and run-time D are, 
 compared to C++. It's not hard to take a compile-time program using 
 template value parameters, and convert it to a run-time program by 
 converting "static if" to "if", removing the "const"s and the "template".

 Although the two functions above are remarkably similar, the non-template 
 one is nicer because:
 (a) the optimisation would automatically apply to existing code.

It would be automatic, except that the user had to explicitly ask for the
template version. I'm not sure what you mean by automatically, I guess.
 (b) you'd be able to use templates! The function should apply for any type 
 of a, provided it supports opMul(). Ie, you should be able to write:

 template (A)
 const A pow(const A a, const int b)
 {
  ...
 }

One can still templatize the type A and leave the rest as regular code:

    template pow(A)
    A pow(A a, int b)
    {
      if (b==0) return 1.0L;
      else if (b<0) return 1.0L/pow(a, -b);
      else if (b==1) return a;
      else if (b & 1) return a * pow(a*a, b>>1);
      else return pow(a*a, b>>1);
    }

The compiler can inline that just as easily as it can inline the
non-templatized pow.
 (c) things like DDoc would work properly.

I'm not sure what you mean here. Why can't DDoc work for a regular function?
 (d) it could conceivably work for types other than strings, ints, and 
 reals.

I'm not sure what you mean here, either.
 It'd be pretty tough on the compiler, though. 

Dec 19 2005
parent reply Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the metaprogramming 
"pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow=1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure its worth making
the language and overloading rules more complex. For example given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0,1)? The compiler could convert the
const double to const real and call the compile-time pow or it could call the
run-time pow.

It should definitely treat the constant as a const real.

 I think it's better to
 have either 1) a separate function name for compile-time pow if the 
 algorithm for the compile-time version is different than the run-time 
 version or 2) import a module called something like std.inlinemath that has 
 the compile-time pow and is tailored to let the compiler inline as much as 
 possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.
Note that this is completely different to the C++ meaning of const.
This example shows how similar the compile-time and run-time D are, 
compared to C++. It's not hard to take a compile-time program using 
template value parameters, and convert it to a run-time program by 
converting "static if" to "if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template 
one is nicer because:
(a) the optimisation would automatically apply to existing code.

automatically except that the user had to explicitly ask for the template version. I'm not sure what you mean by automatically I guess.

In this, and the rest of your comments, you have my meaning inverted. It is
the template version which is not automatic, which doesn't work with DDoc,
etc.

I also notice that you seem to be assuming that the only reason for performing
these functions at compile time is performance. Another important benefit is
the possibility of detecting errors at compile time. For example, a
compile-time regex library can detect syntax errors in the regular expression
during compilation. It's part of the general trend in modern C++ to push as
much as possible from run time into compile time.
Dec 19 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:do6m1f$oh8$1 digitaldaemon.com...
 Ben Hinkle wrote:
It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the 
metaprogramming "pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow=1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure its worth making
the language and overloading rules more complex. For example given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0,1)? The compiler could convert the
const double to const real and call the compile-time pow or it could call the
run-time pow.

It should definitely treat the constant as a const real.

It isn't obvious to me which one to choose. I would expect that it would follow D's "error on any multiple matches" rule.
 I think it's better to
 have either 1) a separate function name for compile-time pow if the 
 algorithm for the compile-time version is different than the run-time 
 version or 2) import a module called something like std.inlinemath that 
 has the compile-time pow and is tailored to let the compiler inline as 
 much as possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.

I'm not sure I'm communicating my point. You have written meta.math.pow!() and have suggested (or maybe I should say floated rather than suggested) a function pow(const real, const int). My point is that in a perfect world a "compile-time" pow shouldn't be a template and shouldn't use overloading on pow. Why invent lots of machinery when it might be avoidable?
Note that this is completely different to the C++ meaning of const.
This example shows how similar the compile-time and run-time D are, 
compared to C++. It's not hard to take a compile-time program using 
template value parameters, and convert it to a run-time program by 
converting "static if" to "if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template 
one is nicer because:
(a) the optimisation would automatically apply to existing code.

Automatically, except that the user had to explicitly ask for the template version. I'm not sure what you mean by "automatically", I guess.

In this, and the rest of your comments, you have my meaning inverted. It is the template version which is not automatic, which doesn't work with DDoc, etc.

You're right. I somehow missed the "non-template version" phrase.
 I also notice that you seem to be assuming that the only reason for 
 performing these functions at compile time is for performance. Another 
 important benefit is the possibility of detecting errors at compile time. 
 For example, a compile-time regex library can detect syntax errors in the 
 regular expression, during compilation.

I didn't mean to imply that performance was the only reason. It's just the most obvious one. I agree catching syntax errors is useful. Non-template code can catch syntax errors, too.
 It's part of the general trend in modern C++ to push as much as possible 
 from run time into compile time.

I'm not objecting to that either.
Dec 19 2005
parent Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
 "Don Clugston" <dac nospam.com.au> wrote in message 
 news:do6m1f$oh8$1 digitaldaemon.com...
 
Ben Hinkle wrote:

It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the 
metaprogramming "pow!()":

/** real pow!(real a, int b)
* Fast integer powers
*/
template pow(real a, int b)
{
 static if (b==0) const real pow=1.0L;
 else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
 else static if (b==1) const real pow = a;
 else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
 else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
 if (b==0) return 1.0L;
 else if (b<0) return 1.0L/pow(a, -b);
 else if (b==1) return a;
 else if (b & 1) return a * pow(a*a, b>>1);
 else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure it's worth making the language and overloading rules more complex. For example, given

  real pow(real a, int b);
  const real pow(const real a, const int b);

what would the compiler do with pow(1.0, 1)? The compiler could convert the const double to const real and call the compile-time pow, or it could call the run-time pow.

It should definitely treat the constant as a const real.

It isn't obvious to me which one to choose. I would expect that it would follow D's "error on any multiple matches" rule.
I think it's better to
have either 1) a separate function name for compile-time pow if the 
algorithm for the compile-time version is different than the run-time 
version or 2) import a module called something like std.inlinemath that 
has the compile-time pow and is tailored to let the compiler inline as 
much as possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.

I'm not sure I'm communicating my point. You have written meta.math.pow!() and have suggested (or maybe I should say floated rather than suggested) a function pow(const real, const int). My point is that in a perfect world a "compile-time" pow shouldn't be a template and shouldn't use overloading on pow. Why invent lots of machinery when it might be avoidable?

I agree with you. But I think that in practice, a compiler from that perfect world is extremely difficult to create. It seems that even after decades of work, highly optimising compilers remain very difficult to build. Existing compilers have problems even with the simple cases, like matrix operations. This seems to be the rationale behind much of the code at boost. It's possible that the reason compilers aren't good at it is simply that it has never been a priority; but I suspect it's an intrinsically difficult problem. Code which is optimised to be fast at run time is typically going to be very difficult for a compiler to understand, so it's much more effective to provide separate algorithms for the two cases. In practice, I think that compiler writers have too much work to do already.

Pragmatically, I think the best approach is to attempt to bring the run-time and compile-time languages together. In this respect, the use of "static if" instead of partial template specialisation is an enormous improvement over C++. Removing some of the current D template quirks will bring them even closer. I'm convinced that we'll be able to make significant progress after that, too, with only very minor language enhancements.
Dec 20 2005
prev sibling parent reply Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Don Clugston wrote:
 It would in theory be possible to apply the D meaning of const to 
 functions and function parameters. For example, here's the 
 metaprogramming "pow!()":
 
 /** real pow!(real a, int b)
  * Fast integer powers
  */
 template pow(real a, int b)
 {
   static if (b==0) const real pow=1.0L;
   else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
   else static if (b==1) const real pow = a;
   else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
   else const real pow = .pow!(a*a, b>>1);
 }
 
 
 Ideally, this could be written as:
 
 const real pow(const real a, const int b)
 {
   if (b==0) return 1.0L;
   else if (b<0) return 1.0L/pow(a, -b);
   else if (b==1) return a;
   else if (b & 1) return a * pow(a*a, b>>1);
   else return pow(a*a, b>>1);
 }
 
 That is, provide an overload for the case in which all parameters are 
 known at compile time. (maybe the "const" before real a, real b are 
 unnecessary, they are implied by the const return?).
 Note that this is completely different to the C++ meaning of const.
 This example shows how similar the compile-time and run-time D are, 
 compared to C++. It's not hard to take a compile-time program using 
 template value parameters, and convert it to a run-time program by 
 converting "static if" to "if", removing the "const"s and the "template".
 
 Although the two functions above are remarkably similar, the 
 non-template one is nicer because:
 (a) the optimisation would automatically apply to existing code.
 (b) you'd be able to use templates! The function should apply for any 
 type of a, provided it supports opMul(). Ie, you should be able to write:
 
 template (A)
 const A pow(const A a, const int b)
 {
  ...
 }
 (c) things like DDoc would work properly.
 (d) it could conceivably work for types other than strings, ints, and 
 reals.
 
 It'd be pretty tough on the compiler, though.

How about going all the way (down the road of orthogonality) and make this a full blown syntax shortcut for templated functions:

  TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
    return a + b;
  }
  ...
  // creates a templated function instance for TYPE=int :
  opAdd(int, 40, 2);

This ability to have templated functions (like we have for classes&structs, instead of just templated declaration blocks) might even help in the language design of implicit function template instantiation.

This is an experimental idea; I'm not suggesting it (not yet at least).

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
Dec 19 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
 How about going all the way (down the road of orthogonality ) and make 
 this a full blown syntax shortcut for templated functions:

   TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
 return a + b;
   }
   ...
   // creates a templated function instance for TYPE=int :
   opAdd(int, 40, 2);

I don't understand the advantage over the current style opAdd!(int)(40,2).
 This ability to have templated functions (like we have for 
 classes&structs, instead of just templated declaration blocks) might even 
 help in the language design of implicit function template instantiation.

 This is an experimental ideia, I'm not sugesting it (not yet at least).

 -- 
 Bruno Medeiros - CS/E student
 "Certain aspects of D are a pathway to many abilities some consider to 
 be... unnatural." 

Dec 19 2005
parent Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Ben Hinkle wrote:
How about going all the way (down the road of orthogonality ) and make 
this a full blown syntax shortcut for templated functions:

  TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
return a + b;
  }
  ...
  // creates a templated function instance for TYPE=int :
  opAdd(int, 40, 2);

I don't understand the advantage over the current style opAdd!(int)(40,2).

could bring some more tangible advantages.

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
Dec 21 2005
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
Awesome!!!!  Have you done any benchmarks?

-Craig 
Dec 21 2005
parent pragma <pragma_member pathlink.com> writes:
In article <doc040$2jt4$1 digitaldaemon.com>, Craig Black says...
Awesome!!!!  Have you done any benchmarks?

-Craig 

None yet, but I'd be lying if I said I wasn't curious myself. When I wrote this, I really just wanted a proof-of-concept; performance wasn't a concern. :)

But if I had to guess, I'd say it comes out ahead for largish queries, or for programs that rely heavily on regular expressions.

- EricAnderton at yahoo
Dec 21 2005