
digitalmars.D.announce - Compile-Time Regular Expressions

reply pragma <pragma_member pathlink.com> writes:
(Special thanks go to Don Clugston, as this engine is based 100% on his
metaprogramming lib)

With so much talk about the possibility of Compile-time regex in D, I couldn't
resist trying to make a stab at it.  I'm pretty happy with the results.

http://trac.dsource.org/projects/ddl/browser/trunk/meta

regex.d has more information on what's working and what's not.  The
implementation isn't 100% complete, but it stands as a proof-of-concept as to
what's possible with D's templates.

Anyway, the end result is that it's really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s", exp("hello    world"));

Enjoy!

-----

Don Clugston approached me a while back and asked if he could get a
metaprogramming library going under DDL (for now).  Without hesitation, I set
him up and off he went creating a wonderful library of workarounds and utils,
which make template coding a breeze.  His code turned out to be so useful that
I wrote this (mostly complete) Compile-time Regular Expression engine.

This is my first parser in D templates, so please bear with me.   It really
could be optimized any number of different ways, but at least the code generated
from the regex templates is very efficient and even minimalistic to a point.
Much of the oddities in the code come from heavy use of "type traits" (a
technique borrowed from c++), which really cuts down on the verbosity of the
code and gets the most out of each template call.  

The rest of the strangeness in there comes from limitations in how template code
works (and doesn't work).  Things like static if() ignoring declarations in the
current scope, slicing not working the way it should, and templates not always
specializing on alias params caused the whole thing to bloat a bit.  As these
get worked out of DMD, I would expect a more refined engine to emerge.

In the end, I'm confident that doing the same thing in C++ would take at least
twice as much code, and would be about half as readable. From what I've seen of
Boost's implementation, that's pretty much the case.  Don's pioneering work in
template coding for DDL's symbol mangling code really set the tone here, and
laid the foundation for more ambitious projects like this.

------

The theory of operation here is fairly straightforward.  The parser, a set of
templates in regex.d, tears apart the query string and breaks it down into the
various tests that make up the query.  At each terminal/production, the parser
pulls in an instance of a test function template from testPredicate.d.  These
predicates are passed back along the parser call chain via a type trait (named
'fn' for consistency), which is then supplied to other test predicates that
and/or multiple tests together.  The top-level production is the function that
is returned via the original template instance; this becomes the regex handle
that you use to make your queries at runtime.

This means that, for now anyway, the implementation uses zero classes and zero
structs.  It's just instances of templated functions, and the minimal set needed
to support the query at that.  As a side-effect of this design, you can
'manually' compose the test predicates yourself by using the templates in
regexPredicate.d directly -- this made testing and prototyping the system very
easy.

Enjoy!


- EricAnderton at yahoo
Dec 17 2005
next sibling parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"pragma" <pragma_member pathlink.com> wrote in message 
news:do18bj$18ge$1 digitaldaemon.com...
 (Special thanks goes to Don Clugston, as this engine is based 100% on his
 metaprogramming lib)

 With so much talk about the possibility of Compile-time regex in D, I 
 couldn't
 resist trying to make a stab at it.  I'm pretty happy with the results.

 http://trac.dsource.org/projects/ddl/browser/trunk/meta

 regex.d has more information on what's working and what's not.  The
 implementation isn't 100% complete, but it stands as a proof-of-concept as 
 to
 what's possible with D's templates.

 Anyway, the end result is that its really easy to use:

 # auto exp = &regexMatch!(`[a-z]*\s*\w*`);
 # writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help thinking that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str, uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable, and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg, "Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win in writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D. Am I missing something?
Dec 17 2005
next sibling parent reply Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Ben Hinkle wrote:
 "pragma" <pragma_member pathlink.com> wrote in message 
 news:do18bj$18ge$1 digitaldaemon.com...
 
(Special thanks goes to Don Clugston, as this engine is based 100% on his
metaprogramming lib)

With so much talk about the possibility of Compile-time regex in D, I 
couldn't
resist trying to make a stab at it.  I'm pretty happy with the results.

http://trac.dsource.org/projects/ddl/browser/trunk/meta

regex.d has more information on what's working and what's not.  The
implementation isn't 100% complete, but it stands as a proof-of-concept as 
to
what's possible with D's templates.

Anyway, the end result is that its really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help think that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str,uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg,"Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win by writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D coding. Am I missing something?

Reading that article about Dimensional Analysis that someone posted, I thought
about something like that. First I was trying to understand why that (DA)
wouldn't be implementable in D (I didn't know what implicit function template
instantiation was then). After understanding the issue, I started thinking how
such a feature (IFTI) could be designed into D, preferably in a clean way,
since that C++ code was looking ghastly.

But then I just realized, one could just implement Dimensional Analysis with
normal structs & methods, and with a sufficiently smart compiler, it would be
able to inline and simplify away most of the calculations, since they are all
based on compile-time values and operations. And that was probably the best,
cleanest way.

So yes, we should probably keep in mind this idea, that templates, const
values and inlining are closely related, right? Could there be practical
ramifications in terms of language design? Wait, quite some ideas are already
starting to come to my head, hum...

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be...
unnatural."
Dec 17 2005
parent reply BCS <BCS_member pathlink.com> writes:
Bruno Medeiros wrote:
 Ben Hinkle wrote:
 
 Neat stuff, but as I glanced over the code I couldn't help think that 
 such a use of templates is really just "forced inlining".


 Instead of making the template language in D more complex, is it 
 possible to instead specify which parts of the "normal" D language are 
 inlinable and how to explicitly ask to inline them?


 The downside is having two languages in 
 D - the "run-time" D and the "compile-time" D coding. Am I missing 
 something?

Reading that article about Dimensional Analysis that someone posted, I thought
about something like that. First I was trying to understand why that (DA)
wouldn't be implementable in D (I didn't know what implicit function template
instantiation was then). After understanding the issue, I started thinking how
such a feature (IFTI) could be designed into D, preferably in a clean way,
since that C++ code was looking ghastly.

But then I just realized, one could just implement Dimensional Analysis with
normal structs & methods, and with a sufficiently smart compiler, it would be
able to inline and simplify away most of the calculations, since they are all
based on compile-time values and operations. And that was probably the best,
cleanest way.

So yes, we should probably keep in mind this idea, that templates, const
values and inlining are closely related, right? Could there be practical
ramifications in terms of language design? Wait, quite some ideas are already
starting to come to my head, hum...

I think that the primary advantage of using templates is that they can allow
for automated setup/verification of complex environments at compile time. How
about building a static binary search tree? It would be nice to be able to
initialize such a tree from an unsorted array and be sure that it gets put
together right. Something like this:

    auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

One advantage of the template solution is that you can trust that it is
evaluated at compile time. This would be important for the verification usage.
For the Dimensional Analysis case, using a templated version would ensure
that, if the program compiles, then no unit errors exist; if you do this with
structs/classes then the compiler might not inline everything, and as a result
might not check all of your math, so a unit error might still exist. The
important difference is that templates always inline, functions might not.

In a nutshell, I see a distinct advantage in code that is ensured to be
completely evaluated at compile time.
Dec 19 2005
parent reply "Kris" <fu bar.com> writes:
"BCS" <BCS_member pathlink.com> wrote
[snip]
 I think that the primary advantage of using templates is that they can 
 allow for automated setup/verification of complex environments at compile 
 time. How about building a static binary search tree? It would be nice to 
 be able to initialize such a tree from an unsorted array and be sure that 
 it gets put together right. something like this


 auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

 would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor? [snip]
 In a nut shell, I see a distinct advantage of code that is insured to be 
 completely evaluated at compile time.

As do I. Another perspective on this is to consider Templates as being extensions to the compiler proper. While such things can be very useful, I imagine it's not hard to get carried away building a language within a language. Still, being able to instruct the compiler to build a regex state-machine for you is pretty powerful, and likely useful for some ~ I'd use it. A full blown compiler-compiler is then not so far-fetched ... where does one draw the line?
Dec 19 2005
parent reply BCS <BCS_member pathlink.com> writes:
Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]
 
I think that the primary advantage of using templates is that they can 
allow for automated setup/verification of complex environments at compile 
time. How about building a static binary search tree? It would be nice to 
be able to initialize such a tree from an unsorted array and be sure that 
it gets put together right. something like this


auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

It could, and that might be good enough; however this would add to the
start-up cost each time the program is run. Why spend code and time setting up
that state of a program when that state can be set ahead of time by just
having preinitialized data space?
Dec 19 2005
parent reply "Kris" <fu bar.com> writes:
"BCS" <BCS_member pathlink.com> wrote ...
 Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]

I think that the primary advantage of using templates is that they can 
allow for automated setup/verification of complex environments at compile 
time. How about building a static binary search tree? It would be nice to 
be able to initialize such a tree from an unsorted array and be sure that 
it gets put together right. something like this


auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

good enough) however this would add to the start up cost each time the program is run. Why spend code and time setting up that state of a program when that state can be set ahead of time by just having preinitialized data space?

OK ~ I follow you. In the past, some folks would precompute "expensive" data
into a file and then load it up at start time. You're effectively noting that
this might be done with templates instead, embedding said data right into the
executable.
Dec 19 2005
parent reply Sean Kelly <sean f4.ca> writes:
Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote ...
 Kris wrote:
 "BCS" <BCS_member pathlink.com> wrote
 [snip]

 I think that the primary advantage of using templates is that they can 
 allow for automated setup/verification of complex environments at compile 
 time. How about building a static binary search tree? It would be nice to 
 be able to initialize such a tree from an unsorted array and be sure that 
 it gets put together right. something like this


 auto BTreeRoot = BuildBTree!(int, CompFn, [1,3,7,9,2,5,8]);

 would build a balanced tree of ints, sorted with CompFn.

Nice, but surely this is something that can be done at runtime via a static ctor?

good enough) however this would add to the start up cost each time the program is run. Why spend code and time setting up that state of a program when that state can be set ahead of time by just having preinitialized data space?

OK ~ I follow you. In the past, some folk would precompute "expensive" data into a file and then load it up at start time. You're effectively noting that might be done with templates, embedding said data right into the executable.

Yup. I think the usefulness of templates here is that the data is integrated
with the runtime code, which eliminates the need for the awkward syntax and
error detection associated with storing precalculated data in files. So far,
this seems to have been most useful for science and mathematics, as
calculations can simply be done at compile time "whenever possible" without
the need for programmer involvement. And this can still be supplemented by the
file method if there is a larger class of data that can be precalculated but
for some reason cannot be done in code.

I think determining a tipping point is probably situation dependent. In C++,
for example, I find a lot of what's in Boost to be a fantastic example of
what's possible using C++ templates, but the implementation is often
(necessarily) so horrendous that I'm disinclined to actually use it.
Compilation time is another issue, as template code can become so complex that
an application might literally take days to compile. At some point the
run-time efficiency simply becomes not worth the coding or compile-time cost.

Sean
Dec 19 2005
next sibling parent BCS <BCS_member pathlink.com> writes:
Sean Kelly wrote:
 Kris wrote:
 
 OK ~ I follow you. In the past, some folk would precompute "expensive" 
 data into a file and then load it up at start time. You're effectively 
 noting that might be done with templates, embedding said data right 
 into the executable. 

Yup. I think the usefulness of templates is that the data is integrated with
the runtime code, which eliminates the need for the awkward syntax and error
detection associated with storing precalculated data in files. So far, this
seems to have been most useful for science and mathematics, as calculations
can simply be done at compile time "whenever possible" without the need for
programmer involvement. And this can still be supplemented by the file method
if there is a larger class of data that can be precalculated but for some
reason can not be done in code.

I think determining a tipping point is probably situation dependent. In C++,
for example, I find a lot of what's in Boost to be a fantastic example of
what's possible using C++ templates, but the implementation often
(necessarily) so horrendous that I'm disinclined to actually use it.
Compilation time is another issue, as template code can become so complex that
an application might literally take days to compile. At some point the
run-time efficiency simply becomes not worth the coding or compile-time cost.

Sean

The tipping point also depends on what the program is for. If you are writing
a CGI app that will need to start, run and generate output while a user is
waiting (and the program will run millions of times), compile time won't be
nearly so much of an issue as if you are writing a finite element analysis
program that will only be run a half dozen times for each time it is compiled.

One other thought: to keep the compile/run cycle manageable, maybe many of
these libraries should have a setting to change them from compile-time code to
run-time code. To use my last example:

    template BuildBTree(T, bool function(T,T) cmp, T[] ar)
    {
      static if(atCompile) // if at compile time, generate the tree now
        const BTree!(T) BuildBTree = ... // some template magic
      else                 // else insert code to call a run-time function
        BTree!(T) BuildBTree = buildBTreeAtRuntime!(T)(cmp, ar);
    }

I know that doesn't work but you get the idea.
Dec 19 2005
prev sibling parent "Walter Bright" <newshound digitalmars.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:do769q$1k87$1 digitaldaemon.com...
 I think determining a tipping point is probably situation dependent.  In 
 C++, for example, I find a lot of what's in Boost to be a fantastic 
 example of what's possible using C++ templates, but the implementation 
 often (necessarily) so horrendous that I'm disinclined to actually use it.

The problem with C++ template metaprogramming is it was discovered as a side effect of templates. The templates were never designed to do that. Hence, it is, as you say, horrendous. In D we have the benefit of seeing where we want to go and putting the road directly there, rather than just following the arbitrary lay of the land and seeing where we wound up.
Jan 03 2006
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
 "pragma" <pragma_member pathlink.com> wrote in message 
 news:do18bj$18ge$1 digitaldaemon.com...
Anyway, the end result is that its really easy to use:

# auto exp = &regexMatch!(`[a-z]*\s*\w*`);
# writefln("matches: %s",exp("hello    world"));

Neat stuff, but as I glanced over the code I couldn't help think that such a
use of templates is really just "forced inlining". For example the code

    template getAt(char[] str,uint index){ const char getAt = str[index]; }

seems to me to be the same as allowing the compiler to inline the regular
function

    char getAt(char[] str, uint index){ return str[index]; }

Instead of making the template language in D more complex, is it possible to
instead specify which parts of the "normal" D language are inlinable and how
to explicitly ask to inline them?

One difference between inlining regular code and template hacking is something
like

    static if(consumed == 0 || consumed < tokenLength){
      pragma(msg,"Error: expected expression in the format of {n,m}");
      static assert(false);
    }

The equivalent regular code would look something like

    if(consumed == 0 || consumed < tokenLength){
      throw new Exception("Error: expected expression in the format of {n,m}");
    }

so that the compile-time error of the templated version would become a run-time
error in the regular version. So aside from the pragma(msg,...) and static
asserts I don't see a big win by writing D-like code in template-speak. The
downside is having two languages in D - the "run-time" D and the "compile-time"
D coding. Am I missing something?

It would in theory be possible to apply the D meaning of const to functions
and function parameters. For example, here's the metaprogramming "pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow = 1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}

Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are known at
compile time. (Maybe the "const" before real a and int b is unnecessary, as it
is implied by the const return?) Note that this is completely different to the
C++ meaning of const.

This example shows how similar compile-time and run-time D are, compared to
C++. It's not hard to take a compile-time program using template value
parameters and convert it to a run-time program by converting "static if" to
"if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template one
is nicer because:
(a) the optimisation would automatically apply to existing code.
(b) you'd be able to use templates! The function should apply for any type of
a, provided it supports opMul(). Ie, you should be able to write:

template (A)
const A pow(const A a, const int b)
{
 ...
}

(c) things like DDoc would work properly.
(d) it could conceivably work for types other than strings, ints, and reals.

It'd be pretty tough on the compiler, though.
Dec 19 2005
next sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
 It would in theory be possible to apply the D meaning of const to 
 functions and function parameters. For example, here's the metaprogramming 
 "pow!()":

 /** real pow!(real a, int b)
  * Fast integer powers
  */
 template pow(real a, int b)
 {
   static if (b==0) const real pow=1.0L;
   else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
   else static if (b==1) const real pow = a;
   else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
   else const real pow = .pow!(a*a, b>>1);
 }


 Ideally, this could be written as:

 const real pow(const real a, const int b)
 {
   if (b==0) return 1.0L;
   else if (b<0) return 1.0L/pow(a, -b);
   else if (b==1) return a;
   else if (b & 1) return a * pow(a*a, b>>1);
   else return pow(a*a, b>>1);
 }

 That is, provide an overload for the case in which all parameters are 
 known at compile time. (maybe the "const" before real a, real b are 
 unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure it's worth making
the language and overloading rules more complex. For example, given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0, 1)? The compiler could convert the
const double to const real and call the compile-time pow, or it could call the
run-time pow. I think it's better to have either 1) a separate function name
for compile-time pow if the algorithm for the compile-time version is
different than the run-time version, or 2) import a module called something
like std.inlinemath that has the compile-time pow and is tailored to let the
compiler inline as much as possible.
 Note that this is completely different to the C++ meaning of const.
 This example shows how similar the compile-time and run-time D are, 
 compared to C++. It's not hard to take a compile-time program using 
 template value parameters, and convert it to a run-time program by 
 converting "static if" to "if", removing the "const"s and the "template".

 Although the two functions above are remarkably similar, the non-template 
 one is nicer because:
 (a) the optimisation would automatically apply to existing code.

It would be automatic, except that the user had to explicitly ask for the
template version. I'm not sure what you mean by automatically, I guess.
 (b) you'd be able to use templates! The function should apply for any type 
 of a, provided it supports opMul(). Ie, you should be able to write:

 template (A)
 const A pow(const A a, const int b)
 {
  ...
 }

One can still templatize the type A and leave the rest as regular code:

    template pow(A)
    A pow(A a, int b)
    {
      if (b==0) return 1.0L;
      else if (b<0) return 1.0L/pow(a, -b);
      else if (b==1) return a;
      else if (b & 1) return a * pow(a*a, b>>1);
      else return pow(a*a, b>>1);
    }

The compiler can inline that just as easily as it can inline the
non-templatized pow.
 (c) things like DDoc would work properly.

I'm not sure what you mean here. Why can't DDoc work for a regular function?
 (d) it could conceivably work for types other than strings, ints, and 
 reals.

I'm not sure what you mean here, either.
 It'd be pretty tough on the compiler, though. 

Dec 19 2005
parent reply Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the metaprogramming 
"pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow=1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure its worth making
the language and overloading rules more complex. For example given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0,1)? The compiler could convert the
const double to const real and call the compile-time pow or it could call the
run-time pow.

It should definitely treat the constant as a const real.

 I think it's better to
 have either 1) a separate function name for compile-time pow if the 
 algorithm for the compile-time version is different than the run-time 
 version or 2) import a module called something like std.inlinemath that has 
 the compile-time pow and is tailored to let the compiler inline as much as 
 possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.
Note that this is completely different to the C++ meaning of const.
This example shows how similar the compile-time and run-time D are, 
compared to C++. It's not hard to take a compile-time program using 
template value parameters, and convert it to a run-time program by 
converting "static if" to "if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template 
one is nicer because:
(a) the optimisation would automatically apply to existing code.

automatically except that the user had to explicitly ask for the template version. I'm not sure what you mean by automatically I guess.

In this, and the rest of your comments, you have my meaning inverted. It is
the template version which is not automatic, which doesn't work with DDoc,
etc.

I also notice that you seem to be assuming that the only reason for performing
these functions at compile time is performance. Another important benefit is
the possibility of detecting errors at compile time. For example, a
compile-time regex library can detect syntax errors in the regular expression
during compilation. It's part of the general trend in modern C++ to push as
much as possible from run time into compile time.
Dec 19 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:do6m1f$oh8$1 digitaldaemon.com...
 Ben Hinkle wrote:
It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the 
metaprogramming "pow!()":

/** real pow!(real a, int b)
 * Fast integer powers
 */
template pow(real a, int b)
{
  static if (b==0) const real pow=1.0L;
  else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
  else static if (b==1) const real pow = a;
  else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
  else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
  if (b==0) return 1.0L;
  else if (b<0) return 1.0L/pow(a, -b);
  else if (b==1) return a;
  else if (b & 1) return a * pow(a*a, b>>1);
  else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure its worth making
the language and overloading rules more complex. For example given

    real pow(real a, int b);
    const real pow(const real a, const int b);

what would the compiler do with pow(1.0,1)? The compiler could convert the
const double to const real and call the compile-time pow or it could call the
run-time pow.

It should definitely treat the constant as a const real.

It isn't obvious to me which one to choose. I would expect that it would follow D's "error on any multiple matches" rule.
 I think it's better to
 have either 1) a separate function name for compile-time pow if the 
 algorithm for the compile-time version is different than the run-time 
 version or 2) import a module called something like std.inlinemath that 
 has the compile-time pow and is tailored to let the compiler inline as 
 much as possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.

I'm not sure I'm communicating my point. You have written meta.math.pow!() and have suggested (or maybe I should say floated rather than suggested) a function pow(const real, const int). My point is that in a perfect world a "compile-time" pow shouldn't be a template and shouldn't use overloading on pow. Why invent lots of machinery when it might be avoidable?
Note that this is completely different to the C++ meaning of const.
This example shows how similar the compile-time and run-time D are, 
compared to C++. It's not hard to take a compile-time program using 
template value parameters, and convert it to a run-time program by 
converting "static if" to "if", removing the "const"s and the "template".

Although the two functions above are remarkably similar, the non-template 
one is nicer because:
(a) the optimisation would automatically apply to existing code.

Automatically, except that the user had to explicitly ask for the template version. I'm not sure what you mean by "automatically", I guess.

In this, and the rest of your comments, you have my meaning inverted. It is the template version which is not automatic, which doesn't work with DDoc, etc.

You're right. I somehow missed the "non-template version" phrase.
 I also notice that you seem to be assuming that the only reason for 
 performing these functions at compile time is for performance. Another 
 important benefit is the possibility of detecting errors at compile time. 
 For example, a compile-time regex library can detect syntax errors in the 
 regular expression, during compilation.

I didn't mean to imply that performance was the only reason. It's just the most obvious one. I agree catching syntax errors is useful. Non-template code can catch syntax errors, too.
 It's part of the general trend in modern C++ to push as much as possible 
 from run time into compile time.

I'm not objecting to that either.
Dec 19 2005
parent Don Clugston <dac nospam.com.au> writes:
Ben Hinkle wrote:
 "Don Clugston" <dac nospam.com.au> wrote in message 
 news:do6m1f$oh8$1 digitaldaemon.com...
 
Ben Hinkle wrote:

It would in theory be possible to apply the D meaning of const to 
functions and function parameters. For example, here's the 
metaprogramming "pow!()":

/** real pow!(real a, int b)
* Fast integer powers
*/
template pow(real a, int b)
{
 static if (b==0) const real pow=1.0L;
 else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
 else static if (b==1) const real pow = a;
 else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
 else const real pow = .pow!(a*a, b>>1);
}


Ideally, this could be written as:

const real pow(const real a, const int b)
{
 if (b==0) return 1.0L;
 else if (b<0) return 1.0L/pow(a, -b);
 else if (b==1) return a;
 else if (b & 1) return a * pow(a*a, b>>1);
 else return pow(a*a, b>>1);
}

That is, provide an overload for the case in which all parameters are 
known at compile time. (maybe the "const" before real a, real b are 
unnecessary, they are implied by the const return?).

That could be one approach, though personally I'm not sure it's worth making the language and overloading rules more complex. For example, given

  real pow(real a, int b);
  const real pow(const real a, const int b);

what would the compiler do with pow(1.0, 1)? The compiler could convert the const double to const real and call the compile-time pow, or it could call the run-time pow.

It should definitely treat the constant as a const real.

It isn't obvious to me which one to choose. I would expect that it would follow D's "error on any multiple matches" rule.
I think it's better to
have either 1) a separate function name for compile-time pow if the 
algorithm for the compile-time version is different than the run-time 
version or 2) import a module called something like std.inlinemath that 
has the compile-time pow and is tailored to let the compiler inline as 
much as possible.

This is what I've done, effectively -- meta.math includes pow!() which behaves exactly in that way.

I'm not sure I'm communicating my point. You have written meta.math.pow!() and have suggested (or maybe I should say floated rather than suggested) a function pow(const real, const int). My point is that in a perfect world a "compile-time" pow shouldn't be a template and shouldn't use overloading on pow. Why invent lots of machinery when it might be avoidable?

I agree with you. But I think that in practice, a compiler from that perfect world is extremely difficult to create. It seems that even after decades of work, highly optimising compilers remain very difficult to build. Existing compilers have problems even with the simple cases, like matrix operations. This seems to be the rationale behind much of the code at boost. It's possible that the reason compilers aren't good at it is simply that it has never been a priority; but I suspect it's an intrinsically difficult problem. Code which is optimised to be fast at run time is typically going to be very difficult for a compiler to understand, so it's much more effective to provide separate algorithms for the two cases. In practice, I think that compiler writers have too much work to do already.

Pragmatically, I think the best approach is to attempt to bring the run-time and compile-time languages together. In this respect, the use of "static if" instead of partial template specialisation is an enormous improvement over C++. Removing some of the current D template quirks will bring them even closer. I'm convinced that we'll be able to make significant progress after that, too, with only very minor language enhancements.
Dec 20 2005
prev sibling parent reply Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Don Clugston wrote:
 It would in theory be possible to apply the D meaning of const to 
 functions and function parameters. For example, here's the 
 metaprogramming "pow!()":
 
 /** real pow!(real a, int b)
  * Fast integer powers
  */
 template pow(real a, int b)
 {
   static if (b==0) const real pow=1.0L;
   else static if (b<0) const real pow = 1.0L/.pow!(a, -b);
   else static if (b==1) const real pow = a;
   else static if (b & 1) const real pow = a * .pow!(a*a, b>>1);
   else const real pow = .pow!(a*a, b>>1);
 }
 
 
 Ideally, this could be written as:
 
 const real pow(const real a, const int b)
 {
   if (b==0) return 1.0L;
   else if (b<0) return 1.0L/pow(a, -b);
   else if (b==1) return a;
   else if (b & 1) return a * pow(a*a, b>>1);
   else return pow(a*a, b>>1);
 }
 
 That is, provide an overload for the case in which all parameters are 
 known at compile time. (maybe the "const" before real a, real b are 
 unnecessary, they are implied by the const return?).
 Note that this is completely different to the C++ meaning of const.
 This example shows how similar the compile-time and run-time D are, 
 compared to C++. It's not hard to take a compile-time program using 
 template value parameters, and convert it to a run-time program by 
 converting "static if" to "if", removing the "const"s and the "template".
 
 Although the two functions above are remarkably similar, the 
 non-template one is nicer because:
 (a) the optimisation would automatically apply to existing code.
 (b) you'd be able to use templates! The function should apply for any 
 type of a, provided it supports opMul(). Ie, you should be able to write:
 
 template (A)
 const A pow(const A a, const int b)
 {
  ...
 }
 (c) things like DDoc would work properly.
 (d) it could conceivably work for types other than strings, ints, and 
 reals.
 
 It'd be pretty tough on the compiler, though.

How about going all the way (down the road of orthogonality) and make this a full blown syntax shortcut for templated functions:

  TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
    return a + b;
  }
  ...
  // creates a templated function instance for TYPE=int :
  opAdd(int, 40, 2);

This ability to have templated functions (like we have for classes&structs, instead of just templated declaration blocks) might even help in the language design of implicit function template instantiation.

This is an experimental idea; I'm not suggesting it (not yet at least).

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
Dec 19 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
 How about going all the way (down the road of orthogonality ) and make 
 this a full blown syntax shortcut for templated functions:

   TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
 return a + b;
   }
   ...
   // creates a templated function instance for TYPE=int :
   opAdd(int, 40, 2);

I don't understand the advantage over the current style opAdd!(int)(40,2).
 This ability to have templated functions (like we have for 
 classes&structs, instead of just templated declaration blocks) might even 
 help in the language design of implicit function template instantiation.

 This is an experimental ideia, I'm not sugesting it (not yet at least).

 -- 
 Bruno Medeiros - CS/E student
 "Certain aspects of D are a pathway to many abilities some consider to 
 be... unnatural." 

Dec 19 2005
parent Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Ben Hinkle wrote:
How about going all the way (down the road of orthogonality ) and make 
this a full blown syntax shortcut for templated functions:

  TYPE opAdd(const TYPE, TYPE a, TYPE b ) {
return a + b;
  }
  ...
  // creates a templated function instance for TYPE=int :
  opAdd(int, 40, 2);

I don't understand the advantage over the current style opAdd!(int)(40,2).

could bring some more tangible advantages.

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
Dec 21 2005
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
Awesome!!!!  Have you done any benchmarks?

-Craig 
Dec 21 2005
parent pragma <pragma_member pathlink.com> writes:
In article <doc040$2jt4$1 digitaldaemon.com>, Craig Black says...
Awesome!!!!  Have you done any benchmarks?

-Craig 

None yet, but I'd be lying if I said I wasn't curious myself. When I wrote this, I really just wanted a proof-of-concept; performance wasn't a concern. :)

But if I had to guess, I'd say it comes out ahead for largish queries, or for programs that rely heavily on regular expressions.

- EricAnderton at yahoo
Dec 21 2005