www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Compile-time optimization

reply "JS" <js.mdnq gmail.com> writes:
There seems to be a lot of improvement for programming languages 
to optimize compile time aspects that are not taken into account. 
With ctfe I think such optimizations are more and more relevant 
in D.

I'll give a simple example:

Take a standard string joining function:

string join(string[] s)
{
     string ss;
     foreach(sss; s) ss ~= sss;
     return ss
}

when this string function is called at run-time with literal 
arguments it is extremely inefficient:

join(["a", "b", "c"]);

because the result can obviously be computed at compile-time by 
the compiler.

using ctfe can solve this problem but only when used in const 
results.

But there should be no reason why join(["a", "b", "c"]) can't be 
optimized at compile time so that the function call is replaced 
with "abc".

Now, think of join as defined like

string join(T...)(T t);

and we call it like

join("a", s, "b", "c", ss);

where s is string that is compile time evaluable and ss is a 
run-time string array.

the optimal join code would be something like

r = "a"~s~"bc"
foreach(sss; ss) r ~= sss;

(and which would change depending on the argument types and if 
they are compile time evaluable)


To generate such "optimal" code(one that gets the compiler to do 
as much work as it can at compile time) is somewhat convoluted in 
D. We can use string mixins to solve the problem but not easily 
because we can't pass variadic arguments to templates.

e.g.,

join(T...)(T s)
{
     mixin(tJoinH!(s))
}

fails because s is not known at compile time... even if s is 
partially known.

It would be nice if we could pass the arguments to a template 
even if they are not completely known at compile time.

e.g., join("a", s),

tJoinH can receive "a" and s but s's value, if not compile-time 
known, is undefined.

Hence, tJoinH can optimize the arguments by attempting to join 
the compile-time arguments and insert optimal code for the 
run-time arguments.

That is, if we call a function with an argument list, we should 
be able to pass that to a template without any errors and be able 
to determine the value of each argument... if the argument is 
undefined, then that becomes it's value.

Another way to see it work is with sum:

sum(1,2,x);

int sum(T...)(T i)
{
     // return 1 + 2 + i[3]; for the call above, it is optimal and 
our goal to generate
     // return mixin(tSumH!(i)); // hypothetically generates 1 + 2 
+ i[3] for the call above
     // return mixin(tSumH2!(i)); // generates i[1] + i[2] + i[3]. 
Current possible way

     // naive approach(assumes all arguments are known only at 
run-time):
     int y;
     foreach(ii; i) y += ii;
     return y;
}


I'm able to achieve this partially by passing T, rather than i, 
to the template function and the variable name, i. This works but 
may not produce optimal results(referencing the stack instead of 
the value of a literal... which prevents it from being further 
optimized).

e.g., from above, instead of return 1 + 2 + i[3]; I would 
generate return i[1] + i[2] + i[3]; Not optimal because i[1] and 
i[2] are treated as run-time entities even though in the above 
case they are not.


If D had some model to create functions that can optimize the way 
they deal with their compile-time arguments and run-time 
arguments then it would be pretty easy to create such functions.

e.g.,

string join(T...)(T t)    // Note join has a variadic parameter
{
     ctfe {
          // called when join is used in ctfe, t can be used to 
get the values directly if they exist at compile time, else their 
value is undefined.
          // can call join here with run-time entities

     }

     // runtime version goes here
}

we might have something like this

string join(T...)(T t) // easy notation to constrain T on char, 
string, string[]
{
     ctfe
     {
         string code = "string s;";
         foreach(k, val; t)
         {
             if (isDefined!val)
                 code ~= `s ~= "`~join(val)~`";`; // calls join as 
a ctfe on val.
             else
             {
                 if (type == string[])
                     code ~= `foreach(ss; t[`~k~`]) s~= sss;`;
                 } else code ~= `s ~= t[`~k~`];`;
             }
         }
         code ~= "return s;";
         mixin code;
     }

     // concatenates all arguments of t as if they were run-time 
arguments.
     string s;
     foreach(k, ss; t)
         static if (T[k].type == string[])
             foreach(sss; ss)
                 s ~= sss;
         else
             s ~= ss;
     return s;
}


I'm not saying this works but the idea is that the ctfe block is 
executed at compile time WHEN there are non-compile time 
arguments(ok, maybe ctfe isn't the best keyword). The block then 
figures out how to best deal with the non-ct arguments.

In the above code note how join is called in the ctfe block to 
handle known compile time types. This is just standard ctfe 
execution. The real "magic" happens when it is dealing with 
non-ct arguments, in which case it builds up meta code which is 
used as a mixin(special statement at end of ctfe block) that is 
inserted.

So for join("a", s); the code would execute something like

at compile time(in ctfe block)
     code = `string s;`;
     "a" is defined so
     code ~= `s ~= "~join("a")~`";`;
         join("a") is called, since "a" is known, the non-ctfe 
block is called, so we actually have
     code ~= `s ~= "a";`;
  next for s(which is undefined at compile time)
     code ~= `s ~= t[1];`;
  end
     code ~= "return s;";
so the string mixin code looks like:

     string s;
     s ~= "a";
     s ~= t[1];
     return s;

or simplified,

    return "a"~t[1];

which is an optimal join code for the join call join("a", s);

(it would make more sense to call a template in the ctfe block so 
we don't have form code strings of code strings)


Note also that having such a ctfe block does not break existing 
code structures and pre existing functions can easily be upgraded 
for optimal behavior.


Having the compiler do as much optimization as possible should 
produce a significant speedup in most applications.

When a function is called with any known compile time entities, 
it is possible for an optimization to take place. I think if D 
incorporates some mechanism in it's language to do so it will go 
a long.

In fact, all this simply demonstrates that ctfe's are not as 
powerful as they can be. Calling a ctfe'able function with just 
one run-time variable will prevent it from being ctfe'ed and 
optimized completely... even though there is still some 
possibility for optimization.


Hopefully some will read this acutely and attempt to ponder its 
potential.  I expect the immediate naysayers who can't see past 
their own noses... and those that get hung up on non-essential 
details(I said that the block doesn't have to be called ctfe... 
oh hell, you do realize we don't even need to use a block?).
Jul 23 2013
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 07/23/2013 09:21 AM, JS wrote:

 string join(T...)(T t)    // Note join has a variadic parameter
 {
      ctfe {

Are you aware of version(ctfe)?
 Hopefully some will read this acutely

Not me. :(
 and attempt to ponder its potential.

I understood that much though...
 I expect the immediate naysayers who can't see past their own 

 those that get hung up on non-essential details(I said that the block
 doesn't have to be called ctfe... oh hell, you do realize we don't 

 to use a block?).

That's me! :) Ali
Jul 23 2013
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 07/24/2013 02:56 AM, monarch_dodra wrote:

 On Tuesday, 23 July 2013 at 17:41:01 UTC, Ali Çehreli wrote:

 Are you aware of version(ctfe)?

Hum... really? I don't think that exists.

You are right. I must be remembering old proposals about it. Ali
Jul 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 23, 2013 at 06:21:08PM +0200, JS wrote:
 There seems to be a lot of improvement for programming languages to
 optimize compile time aspects that are not taken into account. With
 ctfe I think such optimizations are more and more relevant in D.

 But there should be no reason why join(["a", "b", "c"]) can't be
 optimized at compile time so that the function call is replaced with
 "abc".

 To generate such "optimal" code(one that gets the compiler to do as
 much work as it can at compile time) is somewhat convoluted in D. We
 can use string mixins to solve the problem but not easily because we
 can't pass variadic arguments to templates.
 
 e.g.,
 
 join(T...)(T s)
 {
     mixin(tJoinH!(s))
 }
 
 fails because s is not known at compile time... even if s is
 partially known.

Your CTFE-fu isn't good enough. ;-) Check this out: import std.stdio; template tuple(args...) { alias tuple = args; } /** * Given a tuple of strings, returns a tuple in which all * adjacent compile-time readable strings are concatenated. */ template tupleReduce(args...) { static if (args.length > 1) { static if (is(typeof(args[0])==string) && __traits(compiles, { enum x = args[0]; })) { static if (is(typeof(args[1])==string) && __traits(compiles, { enum x = args[1]; })) { alias tupleReduce = tupleReduce!(args[0] ~ args[1], args[2..$]); } else { alias tupleReduce = tuple!(args[0], args[1], tupleReduce!(args[2..$])); } } else { alias tupleReduce = tuple!(args[0], tupleReduce!(args[1..$])); } } else { alias tupleReduce = args; } } void main() { string x = "runtime1"; string y = "runtime2"; auto arr = [ tupleReduce!("a", "b", x, "c", "d", y, "e", "f") ]; writeln(arr); } The output is: ["ab", "runtime1", "cd", "runtime2", "ef"] which proves that all adjacent compile-time readable strings have been concatenated at compile-time. Using tupleReduce, you can optimize calls to join by writing: join(tupleReduce!("a", "b", x, "c", "d")); and it will be compiled as though you've written: join("ab", x, "cd"); Yeah, so tupleReduce looks a bit on the ugly side. But the point is that it works. So don't be so quick to say something is not possible in D, 'cos even with its current warts, D is one powerful beast. :-) T -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
Jul 23 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 23 July 2013 at 16:21:10 UTC, JS wrote:
 Now, think of join as defined like

 string join(T...)(T t);

 and we call it like

 join("a", s, "b", "c", ss);

To summarize to the best of my knowledge, the request is to provide string join(alias T...)() {} With the call changed to: join!("a", s, "b", "c", ss); I could be wrong here. I think think this is a reasonable request, but haven't looked at any implications of adding it.
Jul 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 23, 2013 at 11:09:24AM -0700, H. S. Teoh wrote:
[...]
 Using tupleReduce, you can optimize calls to join by writing:
 
 	join(tupleReduce!("a", "b", x, "c", "d"));
 
 and it will be compiled as though you've written:
 
 	join("ab", x, "cd");

I've generalized tupleReduce to handle reduction of tuples by any CTFE-able binary function: import std.stdio; template tuple(args...) { alias tuple = args; } /** * Given a binary function and a tuple, returns a tuple in which all adjacent * compile-time readable items are recursively substituted with the return * value of the binary function applied to them. */ template tupleReduce(alias func, args...) { static if (args.length > 1) { static if (__traits(compiles, { enum x = args[0]; })) { static if (__traits(compiles, { enum x = args[1]; }) && is(typeof(func(args[0], args[1])))) { enum result = func(args[0], args[1]); alias tupleReduce = tupleReduce!(func, result, args[2..$]); } else { alias tupleReduce = tuple!(args[0], args[1], tupleReduce!(func, args[2..$])); } } else { alias tupleReduce = tuple!(args[0], tupleReduce!(func, args[1..$])); } } else { alias tupleReduce = args; } } /** * Given a tuple of strings, returns a tuple in which all adjacent compile-time * readable strings are concatenated. */ template tupleStringReduce(args...) { alias tupleStringReduce = tupleReduce!( (string x, string y) => x~y, args ); } void main() { string x = "runtime1"; string y = "runtime2"; auto arr = [ tupleStringReduce!("a", "b", x, "c", "d", y, "e", "f", "g", x) ]; writeln(arr); int i1=100, i2=200; auto brr = [ tupleReduce!((int x, int y) => x+y, 1, 2, i1, 3, 4, i2, 5, 6, 7) ]; writeln(brr); } The output is: ["ab", "runtime1", "cd", "runtime2", "efg", "runtime1"] [3, 100, 7, 200, 18] For the string case, to make user code more convenient, I've written an alias template tupleStringReduce() that basically supplies the binary function for concatenating adjacent strings. The second case shows how this can be explicitly specified via a function literal. The example here reduces adjacent compile-time readable ints via addition; you could substitute that with any binary function you want, say (x,y)=>x*y, or (x,y)=>x*2-y, or whatever. The only limit is that the function must be CTFE-able. With the help of the new tupleReduce, you can do all sorts of compile-time optimizations. D gives you power -- awesome power -- but you do have to learn how to tame the beast first. ;-) T -- Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Jul 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 23, 2013 at 12:42:18PM -0700, H. S. Teoh wrote:
[...]
 With the help of the new tupleReduce, you can do all sorts of
 compile-time optimizations.

Here's an example of how to implement compile-time optimization of join() with the help of tupleReduce: // Just for the sake of example, the following function uses "," // to join the strings in the arguments. string join(string[] args...) { string result; string delim; foreach (arg; args) { result ~= delim ~ arg; delim = ","; // just for example } return result; } // This is the wrapper that uses tupleReduce (see my previous // post) to reduce the arguments at compile-time. For // illustration purposes, I omitted the "," from the reduction // function, so that you can tell from the output what has been // reduced at compile-time, and what is reduced at runtime. template optimizedJoin(args...) { string optimizedJoin() { return join(tupleReduce!((string x, string y) => x~y, args)); } } void main() { writeln(optimizedJoin!(x, "a", "bc", y, "de", "f")); } The output is: runtime1,abc,runtime2,def As you can see, "a" and "bc" have been concatenated at compile-time, and so have "de" and "f". Since x and y can't be read at compile-time, they remain as variable arguments. The optimizedJoin template then collects the elements of the reduced tuple and hands it off to join() to do the rest of the work at runtime. Notice that join() uses "delim ~ arg" to do the concatenation at runtime; we could ostensibly replace this with the same function literal we pass to tupleReduce in optimizedJoin(). Then you'll have a full implementation of join() that performs compile-time optimization of its parameters. I rest my case. :-) T -- In order to understand recursion you must first understand recursion.
Jul 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 23, 2013 at 12:58:11PM -0700, H. S. Teoh wrote:
[...]
 	void main() {

Oops, forgot to include the definition of x and y in main(): string x = "runtime1"; string y = "runtime2";
 		writeln(optimizedJoin!(x, "a", "bc", y, "de", "f"));
 	}

T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
Jul 23 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 23 July 2013 at 18:10:54 UTC, H. S. Teoh wrote:
 	void main() {
 		string x = "runtime1";
 		string y = "runtime2";
 		auto arr = [ tupleReduce!("a", "b", x, "c", "d", y, "e", "f") 
 ];
 		writeln(arr);
 	}

Heh, missed this. Forgot about recursive aliasing too. My suggestion also is bogus since literals can't be aliased.
Jul 23 2013
prev sibling next sibling parent "JS" <js.mdnq gmail.com> writes:
On Tuesday, 23 July 2013 at 19:59:46 UTC, H. S. Teoh wrote:
 On Tue, Jul 23, 2013 at 12:42:18PM -0700, H. S. Teoh wrote:
 [...]
 With the help of the new tupleReduce, you can do all sorts of
 compile-time optimizations.

Here's an example of how to implement compile-time optimization of join() with the help of tupleReduce: // Just for the sake of example, the following function uses "," // to join the strings in the arguments. string join(string[] args...) { string result; string delim; foreach (arg; args) { result ~= delim ~ arg; delim = ","; // just for example } return result; } // This is the wrapper that uses tupleReduce (see my previous // post) to reduce the arguments at compile-time. For // illustration purposes, I omitted the "," from the reduction // function, so that you can tell from the output what has been // reduced at compile-time, and what is reduced at runtime. template optimizedJoin(args...) { string optimizedJoin() { return join(tupleReduce!((string x, string y) => x~y, args)); } } void main() { writeln(optimizedJoin!(x, "a", "bc", y, "de", "f")); } The output is: runtime1,abc,runtime2,def As you can see, "a" and "bc" have been concatenated at compile-time, and so have "de" and "f". Since x and y can't be read at compile-time, they remain as variable arguments. The optimizedJoin template then collects the elements of the reduced tuple and hands it off to join() to do the rest of the work at runtime. Notice that join() uses "delim ~ arg" to do the concatenation at runtime; we could ostensibly replace this with the same function literal we pass to tupleReduce in optimizedJoin(). Then you'll have a full implementation of join() that performs compile-time optimization of its parameters. I rest my case. :-) T

I must study your powerful ctfe-fu master! Great fu you have! This is great that D can do this and from just a quick glance it looks like you solve the problem. Your code doesn't exactly do what I wanted but probably because I wasn't clear... I think it is modifiable and solves the problem I was having(again, assuming your code is correct). Note though that something like string x = "asd" join(x,...); // x is compile time known and should be joined at compile time. Not sure if D is smart enough to get this one? When I get some time later I'll check your code out, what I want to do is get something like string x = join(a, b, "a", "b"); to produce the same code as string x = a~b~"ab"; (no looping using for each except for string arrays) Thanks!
Jul 23 2013
prev sibling next sibling parent "Yota" <yotaxp thatGoogleMailThing.com> writes:
On Tuesday, 23 July 2013 at 20:11:21 UTC, JS wrote:
 Your code doesn't exactly do what I wanted but probably because 
 I wasn't clear... I think it is modifiable and solves the 
 problem I was having(again, assuming your code is correct).

 Note though that something like

 string x = "asd"
 join(x,...); // x is compile time known and should be joined at 
 compile time. Not sure if D is smart enough to get this one?

 When I get some time later I'll check your code out, what I 
 want to do is get something like

 string x = join(a, b, "a", "b"); to produce the same code as
 string x = a~b~"ab";

 (no looping using for each except for string arrays)

 Thanks!

Don't forget that tuples can store more than just types! See this for instance. import std.stdio; typeof(Xs[0]) sum(Xs...)() { pragma(msg, Xs); // tuple(1, 2, 3) typeof(Xs[0]) y; foreach(x; Xs) y ~= x; return y; } int main() { auto n = sum!(1, 2, 3); enum m = sum!(1, 2, 3); pragma(msg, typeof(n)); // int writeln(n); // 6 writeln(m); // 6 return 0; } Works great. When I comment out the 'sum!("1", "2", "3")' and replace it with a literal '6', the compiled EXE is byte for byte exactly the same. You can exchange the '+=' for '~=' and it will concatenate strings instead. Of course, now all arguments MUST be known at compile time, but that's no surprise. PS: OK, just noticed that you were hoping to mix run time and compile time arguments. Oops.... Oh well, posting anyway because I think it's neat. lol As for a~"B"~"C" compiling into a~"BC", I think we just need to work on constant folding in the compiler. (If it doesn't already do this anyway.)
Jul 23 2013
prev sibling next sibling parent "Yota" <yotaxp thatGoogleMailThing.com> writes:
On Tuesday, 23 July 2013 at 22:38:35 UTC, Yota wrote:
   typeof(Xs[0]) sum(Xs...)()
   {
       pragma(msg, Xs); // tuple(1, 2, 3)
       typeof(Xs[0]) y;
       foreach(x; Xs) y ~= x;
       return y;
   }

*facepalm* That ~= in there is what I get for copying code after I experiment on it. Ment for it to be a +=. Phooey ~Yota
Jul 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 23, 2013 at 10:11:17PM +0200, JS wrote:
[...]
 Note though that something like
 
 string x = "asd"
 join(x,...); // x is compile time known and should be joined at
 compile time. Not sure if D is smart enough to get this one?

Well, if you want *that*, then you'll just have to submit a DMD pull request for the optimizer. ;-) The problem is that even though in this particular case it should be easy for the compiler to figure out, in the general case it's far from trivial. For example, you might write: string x = "asd"; x ~= "def"; join(x, ...); Arguably, the compiler should detect that the value of x is computable at compile-time, so it should apply the optimization. But the problem is, the code between the first line and the call to join() can be arbitrarily complex... for example: enum VeryLargeNumber = 2_391_382_391_483; string x = "asd"; if (canFactorize(VeryLargeNumber)) x ~= "x"; join(x, ...); Assuming canFactorize is CTFE-able, in theory the compiler should be able to figure out the value of x at compile-time. But integer factorization currently has no simple solution, which means your code will take 5 months to compile while DMD tries to solve integer factorization. Worse yet, you might write something like: string x = "asd"; void function(int x) fn = createArbitraryFunction(); if (canHalt(fn)) x ~= "x"; join(x, ...); Then your program will *never* finish compiling, because the halting problem is unsolvable, but nothing stops you from writing a CTFE-able solution that requires infinite time to find an answer.
 When I get some time later I'll check your code out, what I want to
 do is get something like
 
 string x = join(a, b, "a", "b"); to produce the same code as
 string x = a~b~"ab";
 
 (no looping using for each except for string arrays)

That's pretty much what my code already does. Just use GDC and it will unroll the loop for you. ;-) Alternatively, since tupleReduce returns a tuple, you *could* just foreach over it (loops involving tuples are automatically unrolled at compile-time) and generate the code. That is, instead of using optimizedJoin() as a wrapper for a function, you could do something like this instead: string optimizedJoin(args...)() { string result; foreach (arg; tupleReduce!( (string x, string y) => x~y, args)) { result ~= arg; } return result; } I haven't tested this code, though, so I'm not 100% sure if it will work. But if not, a little massaging should make it work. On another note, though, it makes me wonder why you want this level of control over the generated code. Are you sure you're not doing premature optimization? You should be aware that the more you use such arcane tricks to generate this "optimized" code, the less the compiler's built-in optimizer will be able to understand your intent, and the more likely that it may be generating more inferior code than if you had just written the code the "normal" way. Generally, this level of manual optimization should only be done after you've profiled your code and identified the hotspots. I know from experience that more often than not, what you *think* is a hotspot actually isn't. It's only after running a profiler that you find out where the *real* hotspots are. :) I used to be a hardcore optimization freak -- used to read assembly output from the compiler to identify inefficiencies -- until one day I actually ran a profiler on my code. Turns out that the real hotspot is completely NOT where I expected it. All those late nights spent "optimizing" my code, only to get negligible performance gain, when a 1-line change in the *real* hotspot improved performance by 30% instantly (and only took me 5 minutes instead of hours and hours). T -- My program has no bugs! Only undocumented features...
Jul 23 2013
prev sibling next sibling parent "JS" <js.mdnq gmail.com> writes:
On Tuesday, 23 July 2013 at 22:38:35 UTC, Yota wrote:
 On Tuesday, 23 July 2013 at 20:11:21 UTC, JS wrote:
 Your code doesn't exactly do what I wanted but probably 
 because I wasn't clear... I think it is modifiable and solves 
 the problem I was having(again, assuming your code is correct).

 Note though that something like

 string x = "asd"
 join(x,...); // x is compile time known and should be joined 
 at compile time. Not sure if D is smart enough to get this one?

 When I get some time later I'll check your code out, what I 
 want to do is get something like

 string x = join(a, b, "a", "b"); to produce the same code as
 string x = a~b~"ab";

 (no looping using for each except for string arrays)

 Thanks!

Don't forget that tuples can store more than just types! See this for instance. import std.stdio; typeof(Xs[0]) sum(Xs...)() { pragma(msg, Xs); // tuple(1, 2, 3) typeof(Xs[0]) y; foreach(x; Xs) y ~= x; return y; } int main() { auto n = sum!(1, 2, 3); enum m = sum!(1, 2, 3); pragma(msg, typeof(n)); // int writeln(n); // 6 writeln(m); // 6 return 0; } Works great. When I comment out the 'sum!("1", "2", "3")' and replace it with a literal '6', the compiled EXE is byte for byte exactly the same. You can exchange the '+=' for '~=' and it will concatenate strings instead. Of course, now all arguments MUST be known at compile time, but that's no surprise. PS: OK, just noticed that you were hoping to mix run time and compile time arguments. Oops.... Oh well, posting anyway because I think it's neat. lol As for a~"B"~"C" compiling into a~"BC", I think we just need to work on constant folding in the compiler. (If it doesn't already do this anyway.)

Well, it's not that simple. Constant folding is easy when it is obvious to the compiler... but something buried in foreach loop that uses runtime and compile time objects is not obvious. My three points out of the post is: 1. D needs some way to make doing this stuff easier. 2. It needs to be done! ;) 3. Goto 1. I didn't realize D could even potentially do it, and still not sure if it can(have to check out T's post again and work with the code). I think there is a lot of potential power behind such techniques with regard to optimization. Suppose you are writing a matrix library. Wouldn't it be nice to have certain matrix computations that can be done at compile time actually done? (I suppose what I'm trying to do is extend the compiler's "constant folding" to be able to deal with much more complex cases). Another way to potentially do it, I think, is for the compiler to automatically expand ctfe functions and replace runtime entities with "placeholder" code which is then expanded at runtime. e.g., for a simple join join(T...)(T t) { string s; foreach(tt; t) s ~= tt; return s; } the compiler can unroll the loop to get s ~= t[0]~t[1]~t[2]~...; replace the compile time entities, e.g., suppose t[1] and t[3] are known at compile time at the invocation, then s ~= t[0]~"abcd"~t[2]~"asdf"; then this code could be used inline for the join call instead of the actual code. The key here is there is no dependence on the for loop. If all arguments are compile-time evaluable, then s could be just a reference lookup to a static string. If it can't be done, then no big deal, it is optimized as far as it can go. Only if all arguments passed are runtime entities will the exact code be used. This seems like it would take a pretty smart compiler and possibly it would have issues in some many cases. I personally don't mind writing the code itself but it seems pretty complex, and having no way to debug the code generation easily(AFAIK) makes it difficult to do. Possibly using version(ctfe) can help... I'll work on it when I get a chance...
Jul 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 24, 2013 at 01:35:23AM +0200, JS wrote:
[...]
 My three points out of the post is:
 
 1. D needs some way to make doing this stuff easier.
 2. It needs to be done! ;)
 3. Goto 1.
 
 I didn't realize D could even potentially do it, and still not sure
 if it can(have to check out T's post again and work with the code).
 
 I think there is a lot of potential power behind such techniques
 with regard to optimization.
 
 Suppose you are writing a matrix library. Wouldn't it be nice to
 have certain matrix computations that can be done at compile time
 actually done? (I suppose what I'm trying to do is extend the
 compiler's "constant folding" to be able to deal with much more
 complex cases).

You're looking for expression templates. So far, nobody seems to have written expression templates in D yet (that I know of, anyway, I could be wrong). D should be more than well-equipped to handle expression templates, you just have to write them. (But having said that, they aren't exactly easy to write. :-P) I think the last time this topic came up, somebody mentioned the dreaded mythical AST macros, and the discussion got derailed. I've been meaning to write a D matrix library using expression templates, but I haven't gotten around to it yet. The basic idea behind expression templates is that overloaded operators (or function calls, if operator overloading isn't good enough for you) don't return concrete types, but a proxy wrapper type that gets unboxed into a real type when assigned to a concrete variable. The operators are overloaded to take these wrapper types as arguments, and basically construct a template tree paralleling the structure of the expression so that when you're finally assigning the result to a concrete variable, the assignment operator can traverse the expression tree, fold constants and do whatever other optimizations you wish, then generate optimized code for performing the operation represented by the tree. That's the theory anyway. In practice, expression templates are rather hard to write (and read!) if you're the unlucky library writer (it *is* really nice from the user's POV though!). They are also one of the sources of C++ error messages that span 15 pages per line, if you're ever unlucky enough to run into a problem. Last I was told, however, that the preferred approach in D is to use string mixins instead. You basically write a string containing an arbitrary expression (of arbtrary syntax, even!) that gets parsed by a CTFE lexer/parser that can perform arbitrary transformations on it, which generates D code as a string that you can use in a mixin. This is significantly more powerful than expression templates, because you're no longer constrained by the syntax of the host language, but your input string can be literally *anything*, like a DSL, or a math expression containing real, bona fide Unicode mathematical symbols for operators, sets, and what-have-you. You could even write your own language that way. All of it eventually gets compiled down into D code as a string mixin. Syntax-wise, there's the slight ugliness of having to write "mixin(MyDSL("A ± B / C → D"))", but that's more than outweighed by the total freedom of syntax within the input string itself. I saw an expression-template-based regex implementation in C++ once, and almost went blind. It overloaded C++ operators in rather ugly ways in order to work around the fact that C++ doesn't have native regex operators. By contrast, our own std.regex has a ctRegex function that essentially does the same thing while allowing you to use native regex syntax within a string literal. I think that's a far better approach for many cases. :) T -- It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
Jul 23 2013
prev sibling next sibling parent "JS" <js.mdnq gmail.com> writes:
On Wednesday, 24 July 2013 at 01:07:15 UTC, H. S. Teoh wrote:
 On Wed, Jul 24, 2013 at 01:35:23AM +0200, JS wrote:
 [...]
 My three points out of the post is:
 
 1. D needs some way to make doing this stuff easier.
 2. It needs to be done! ;)
 3. Goto 1.
 
 I didn't realize D could even potentially do it, and still not 
 sure
 if it can(have to check out T's post again and work with the 
 code).
 
 I think there is a lot of potential power behind such 
 techniques
 with regard to optimization.
 
 Suppose you are writing a matrix library. Wouldn't it be nice 
 to
 have certain matrix computations that can be done at compile 
 time
 actually done? (I suppose what I'm trying to do is extend the
 compiler's "constant folding" to be able to deal with much more
 complex cases).


Your tuple reduce seems to work! Debugging the code shows no loops and memory debugging looks like the adjacent constants are folded! magnificent! As far as over optimizing, I am trying to write some library code and want to "get it right" the first time... having it not only optimize the best it can but also make the functions easier to use... e.g., variargs instead of nested calls. As far as ctfe running time issues, I believe they exist already? In any case, the compiler at some point has to mark a variable that is compile time evaluable as not evaluable at some point(probably at a function call) What we do know is arguments to ctfe's return compile time evaluable results... So it is easy to know that a variable is compile time evaluable but not necessarily actually be able to compute it at compile time.... a simple time out count with error/warning could be used... if failed it could be marked as a run-time entity at that point and beyond(so no optimization beyond the point where it can't be computed at compile time. Not perfect but covers 99.99% of cases.
 You're looking for expression templates. So far, nobody seems 
 to have
 written expression templates in D yet (that I know of, anyway, 
 I could
 be wrong). D should be more than well-equipped to handle 
 expression
 templates, you just have to write them. (But having said that, 
 they
 aren't exactly easy to write. :-P)

I think I'll pass, I'm still struggling just to learn D. You, OTH, seem like someone that can pull it off rather easily!
 I think the last time this topic came up, somebody mentioned 
 the dreaded
 mythical AST macros, and the discussion got derailed. I've been 
 meaning
 to write a D matrix library using expression templates, but I 
 haven't
 gotten around to it yet.

 The basic idea behind expression templates is that overloaded 
 operators
 (or function calls, if operator overloading isn't good enough 
 for you)
 don't return concrete types, but a proxy wrapper type that gets 
 unboxed
 into a real type when assigned to a concrete variable. The 
 operators are
 overloaded to take these wrapper types as arguments, and 
 basically
 construct a template tree paralleling the structure of the 
 expression so
 that when you're finally assigning the result to a concrete 
 variable,
 the assignment operator can traverse the expression tree, fold 
 constants
 and do whatever other optimizations you wish, then generate 
 optimized
 code for performing the operation represented by the tree.

 That's the theory anyway. In practice, expression templates are 
 rather
 hard to write (and read!) if you're the unlucky library writer 
 (it *is*
 really nice from the user's POV though!). They are also one of 
 the
 sources of C++ error messages that span 15 pages per line, if 
 you're
 ever unlucky enough to run into a problem.

 Last I was told, however, that the preferred approach in D is 
 to use
 string mixins instead. You basically write a string containing 
 an
 arbitrary expression (of arbtrary syntax, even!) that gets 
 parsed by a
 CTFE lexer/parser that can perform arbitrary transformations on 
 it,
 which generates D code as a string that you can use in a mixin. 
 This is
 significantly more powerful than expression templates, because 
 you're no
 longer constrained by the syntax of the host language, but your 
 input
 string can be literally *anything*, like a DSL, or a math 
 expression
 containing real, bona fide Unicode mathematical symbols for 
 operators,
 sets, and what-have-you. You could even write your own language 
 that
 way. All of it eventually gets compiled down into D code as a 
 string
 mixin.

 Syntax-wise, there's the slight ugliness of having to write
 "mixin(MyDSL("A ± B / C → D"))", but that's more than 
 outweighed by the
 total freedom of syntax within the input string itself. I saw an
 expression-template-based regex implementation in C++ once, and 
 almost
 went blind. It overloaded C++ operators in rather ugly ways in 
 order to
 work around the fact that C++ doesn't have native regex 
 operators. By
 contrast, our own std.regex has a ctRegex function that 
 essentially does
 the same thing while allowing you to use native regex syntax 
 within a
 string literal. I think that's a far better approach for many 
 cases. :)

I was wondering if that was possible. I was using string mixins to essentially accomplish the constant folding was thinking that one could implement java script or some other language in the strings and have them transformed. It could be pretty useful to have. I've still got a lot to learn about D but it seems nearly unbounded for what it can do compared to other languages. Thanks for making this work! I'll spend some time studying your code to learn how it works then try to apply it to my library code. Trying to do stripLeft/Right, split, join, etc... for strings. Each function taking an arbitrary number of run-time or compile-time char or string arguments with optimal code generation.
Jul 24 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 23 July 2013 at 17:41:01 UTC, Ali Çehreli wrote:
 On 07/23/2013 09:21 AM, JS wrote:

 string join(T...)(T t)    // Note join has a variadic

 {
      ctfe {

Are you aware of version(ctfe)?

Hum... really? I don't think that exists. Either that, or I'm using it wrong. Could you provide an example? What you do have is "if (__ctfe)" though... //---- import std.stdio; int foo() { version (poop) return 1; else return 0; } void main(string[] args) { enum a = foo(); auto b = foo(); writeln(a, " ", b); } //---- Prints: 0 0 //---- import std.stdio; int foo() { if (__ctfe) return 1; else return 0; } void main(string[] args) { enum a = foo(); auto b = foo(); writeln(a, " ", b); } //---- Prints: 1 0
Jul 24 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 24 July 2013 at 09:56:41 UTC, monarch_dodra wrote:
     version (poop)

Goddamnit. There goes my credibility -_-' I was testing to see if I got a compile error with an inexistant version tag. Please replace that with "ctfe"
Jul 24 2013
prev sibling next sibling parent "JS" <js.mdnq gmail.com> writes:
On Wednesday, 24 July 2013 at 09:58:02 UTC, monarch_dodra wrote:
 On Wednesday, 24 July 2013 at 09:56:41 UTC, monarch_dodra wrote:
    version (poop)

Goddamnit. There goes my credibility -_-' I was testing to see if I got a compile error with an inexistant version tag. Please replace that with "ctfe"

lol... I was wondering were the hell that came from. I thought maybe it meant you wrote that part of the code on the toilet ;)
Jul 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 24, 2013 at 09:22:14AM -0700, Ali ehreli wrote:
 On 07/24/2013 02:56 AM, monarch_dodra wrote:
 
 On Tuesday, 23 July 2013 at 17:41:01 UTC, Ali ehreli wrote:

 Are you aware of version(ctfe)?

Hum... really? I don't think that exists.

You are right. I must be remembering old proposals about it.

There's if(__ctfe), but unfortunately (IIRC) it doesn't work with static if, because CTFE runs after static ifs have been processed. So it might not be enough if you have code that has parts that can't run under CTFE. Those parts will taint all parent functions up the call tree as non-CTFEable, so once you have them you're out of luck. T -- We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Jul 24 2013
prev sibling next sibling parent "JS" <js.mdnq gmail.com> writes:
T,

I've looked over your code and understand what you are doing but 
I don't understand the expression

      is(typeof(func(args[0], args[1]))))

you are checking if func(a,b)'s return type is a type?
Jul 24 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 24 July 2013 at 17:10:01 UTC, H. S. Teoh wrote:
 On Wed, Jul 24, 2013 at 09:22:14AM -0700, Ali Çehreli wrote:
 On 07/24/2013 02:56 AM, monarch_dodra wrote:
 
 On Tuesday, 23 July 2013 at 17:41:01 UTC, Ali Çehreli wrote:

 Are you aware of version(ctfe)?

Hum... really? I don't think that exists.

You are right. I must be remembering old proposals about it.

There's if(__ctfe), but unfortunately (IIRC) it doesn't work with static if, because CTFE runs after static ifs have been processed. So it might not be enough if you have code that has parts that can't run under CTFE. Those parts will taint all parent functions up the call tree as non-CTFEable, so once you have them you're out of luck. T

That's not true either. The code will compile, and CTFE will only complain if it *runs* into code that can't be executed CTFE-style. For example: //---- import std.stdio; import core.stdc.string; int foo() { if(__ctfe) { return 1; } else { int i = void; memset(&i, 0, 4); return i; } } void main(string[] args) { enum a = foo(); auto b = foo(); writeln(a, " ", b); } //---- This compiles and runs fine, even though memset is not ctfe-able.
Jul 24 2013
prev sibling parent "JS" <js.mdnq gmail.com> writes:
On Wednesday, 24 July 2013 at 18:12:37 UTC, JS wrote:
T,

Also, I've tried to wrap the template in a function and I can't 
get it to work:

	
	string join(T...)(string delim, T args)
	{
		return tupleStringReduce!(args);
	}

it seems tupleStringReduce just returns the tuple of of the 
optimized strings. I'll still need to write code that will deal 
mixing in the actual compile time and runtime code.

This doesn't seem that hard as tupleReduce essentially outlines 
the approach. The difference is instead of calling func directly 
I sort of need to mix it in as a code string.

e.g., for tupleStringReduce, func is x~y. I need to build a 
string using that "((arg[0]~arg[1])~arg[2])..." which is then 
mixed in at compile time but at runtime evaluates to a string(not 
an array).

Since there is no .codeof, AFAIK, I guess the only way to do this 
is pass func, and the func code string("x~y") in this case.

The idea though is to use the lamba as a ctfe to reduce the tuple 
when possible but *also* use it at runtime to evaluate the tuple.


Any ideas?
Jul 24 2013