digitalmars.D - Another way to do CTFE

Ary Borenszweig <ary esperanto.org.ar> writes:
CTFE is really nice but has its limitations: you can't do anything you 
want, and since it's interpreted it requires an interpreter and it's 
generally slow. Nimrod does the same thing, and now they are 
implementing a VM to run the interpreted code faster. Is this really the 
way to go?

In our language we are thinking about allowing code generation at 
compile time but in a different way. The idea is, at compile time, to 
compile and execute another program that would generate the code that 
would be mixed into the current program. This program could receive the 
execution context as arguments, along with any AST nodes that are passed 
to the program.

So right now in D you can do ctRegex:

auto ctr = ctRegex!(`^.*/([^/]+)/?$`);

I don't know what the syntax could be, but the idea is to have a file 
ct_regex.d. This file would receive the string as an argument and must 
generate the code that would be mixed in the program. Since this program 
is a program (compiled and executed), it has no limits on what it can 
do. Then you would do something like this:

mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));

The compiler could be smart and cache the executable so that anytime it 
has to expand it it just needs to invoke it (skip the compile phase).

What do you think?

I know, I know. The first answer I'll get is: "Oh, no! But that way I 
could download a program, compile it and suddenly all my files are 
gone". My reply is: If you downloaded and compiled that program, weren't 
you going to execute it afterwards? At that point the program could do 
something harmful, so what's the difference? Either way, you must check 
the source code to see that something fishy isn't happening there.

Just as a reference that something like this is possible, in our 
language you can already do this:

build_date = {{ system("date").stringify }}
puts build_date

That generates a program that has the build date embedded in it. We can 
also get the git hash of a repo and stick it into the executable without 
an additional Makefile or some build process. "system" is our first step 
towards doing these compile-time things. The next thing would be to do:

ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}
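
As a rough illustration of the generator idea on the D side, here is a minimal sketch of what such a standalone program could look like. Everything in it is an assumption made for illustration: the file name gen_pattern.d, the helper names, and the emitted constant generatedPattern are hypothetical, and compile_time_execute itself does not exist in D. A real ct_regex.d would emit a specialised matcher rather than a constant.

```d
// gen_pattern.d: hypothetical stand-in for the proposed ct_regex.d generator.
// It is an ordinary program: it takes its input from the command line and
// prints D source code to stdout for the caller to mix in.
import std.stdio : stderr, write;

/// Wraps `s` in double quotes, escaping quotes and backslashes so the result
/// is a valid D string literal. Control characters are not handled here.
string toDStringLiteral(string s)
{
    string result = "\"";
    foreach (char c; s)
    {
        if (c == '"' || c == '\\')
            result ~= '\\';
        result ~= c;
    }
    return result ~ "\"";
}

/// Produces the code to be mixed in. This sketch only emits a constant
/// holding the pattern; the name `generatedPattern` is made up.
string generate(string pattern)
{
    return "enum generatedPattern = " ~ toDStringLiteral(pattern) ~ ";\n";
}

int main(string[] args)
{
    if (args.length != 2)
    {
        stderr.writeln("usage: gen_pattern <pattern>");
        return 1;
    }
    write(generate(args[1]));
    return 0;
}
```

Because it is a normal executable, it can be built once and reused, which is where the caching mentioned above would come in.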
Jun 17 2014
"Dicebot" <public dicebot.lv> writes:
Heh: http://forum.dlang.org/post/lnhtiq$qqn$1@digitalmars.com
Jun 17 2014
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 6/17/2014 3:55 PM, Dicebot wrote:
 Heh: http://forum.dlang.org/post/lnhtiq$qqn$1@digitalmars.com

Yea, Nemerle's approach addresses that, although it comes with other tradeoffs. In Nemerle, you compile your macros to a dll, then you pass that dll to the compiler when compiling any code that uses the macros. It has various pros/cons versus D's approach, but I think it's at least something worth being aware of.
Jun 17 2014
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
17-Jun-2014 23:41, Ary Borenszweig wrote:
 CTFE is really nice but has its limitations: you can't do anything you
 want, and since it's interpreted it requires an interpreter and it's
 generally slow. Nimrod does the same thing, and now they are
 implementing a VM to run the interpreted code faster. Is this really the
 way to go?

 In our language we are thinking about allowing code generation at
 compile time but in a different way. The idea is, at compile time, to
 compile and execute another program that would generate the code that
 would be mixed into the current program. This program could receive the
 execution context as arguments, along with any AST nodes that are passed
 to the program.

 So right now in D you can do ctRegex:

 auto ctr = ctRegex!(`^.*/([^/]+)/?$`);

 I don't know what the syntax could be, but the idea is to have a file
 ct_regex.d. This file would receive the string as an argument and must
 generate the code that would be mixed in the program. Since this program
 is a program (compiled and executed), it has no limits on what it can
 do. Then you would do something like this:

 mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));

 The compiler could be smart and cache the executable so that anytime it
 has to expand it it just needs to invoke it (skip the compile phase).

 What do you think?

Not limiting it to just calling some external tools, but also plugins and services, I agree. Well, see the link by Dicebot. I believe this is the more practical and useful approach for heavy meta-programming.

My reasons pro:
a) Not everything can be done in the CTFE environment, e.g. please go ahead and compile an HLSL shader for me.
b) Performance of standalone optimized code, and definitive boundaries for caching of results.

Some points against:
a) It can't be as deeply integrated into the compiler. Passing arbitrary D types won't work, for instance, or it needs to share type info with the compiler. The same limitations apply to meta-data.
b) We haven't seen a proper interpreter for CTFE at all yet, so we are unable to truly assess its performance.

Overall I think it's a much more practical (yet hackish) way that can easily be done in the near future.

 I know, I know. The first answer I'll get is: "Oh, no! But that way I
 could download a program, compile it and suddenly all my files are
 gone". My reply is: If you downloaded and compiled that program, weren't
 you going to execute it afterwards? At that point the program could do
 something harmful, so what's the difference? Either way, you must check
 the source code to see that something fishy isn't happening there.

Well, there are ways to constrain plugins, even in system languages like D.

 Just as a reference that something like this is possible, in our
 language you can already do this:

 build_date = {{ system("date").stringify }}
 puts build_date

 That generates a program that has the build date embedded in it. We can
 also get the git hash of a repo and stick it into the executable without
 an additional Makefile or some build process. "system" is our first step
 towards doing these compile-time things. The next thing would be to do:

 ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}

-- Dmitry Olshansky
Jun 17 2014
"Tofu Ninja" <joeyemmons yahoo.com> writes:
On Tuesday, 17 June 2014 at 19:41:59 UTC, Ary Borenszweig wrote:
 CTFE is really nice but has its limitations: you can't do 
 anything you want, and since it's interpreted it requires an 
 interpreter and it's generally slow. Nimrod does the same 
 thing, and now they are implementing a VM to run the 
 interpreted code faster. Is this really the way to go?

 In our language we are thinking about allowing code generation 
 at compile time but in a different way. The idea is, at compile 
 time, to compile and execute another program that would 
 generate the code that would be mixed into the current program. 
 This program could receive the execution context as arguments, 
 along with any AST nodes that are passed to the program.

 So right now in D you can do ctRegex:

 auto ctr = ctRegex!(`^.*/([^/]+)/?$`);

 I don't know what the syntax could be, but the idea is to have 
 a file ct_regex.d. This file would receive the string as an 
 argument and must generate the code that would be mixed in the 
 program. Since this program is a program (compiled and 
 executed), it has no limits on what it can do. Then you would 
 do something like this:

 mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));

 The compiler could be smart and cache the executable so that 
 anytime it has to expand it it just needs to invoke it (skip 
 the compile phase).

 What do you think?

 I know, I know. The first answer I'll get is: "Oh, no! But that 
 way I could download a program, compile it and suddenly all my 
 files are gone". My reply is: If you downloaded and compiled 
 that program, weren't you going to execute it afterwards? At 
 that point the program could do something harmful, so what's 
 the difference? Either way, you must check the source code to 
 see that something fishy isn't happening there.

 Just as a reference that something like this is possible, in 
 our language you can already do this:

 build_date = {{ system("date").stringify }}
 puts build_date

 That generates a program that has the build date embedded in 
 it. We can also get the git hash of a repo and stick it into 
 the executable without an additional Makefile or some build 
 process. "system" is our first step towards doing these 
 compile-time things. The next thing would be to do:

 ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}

I had a similar idea a while ago. The only difference was that instead of compiling and running some D file at compile time, mine was to simply run some pre-compiled executable at compile time and return its output as a string (similar to string imports). I described similar use cases to the ones you mentioned, such as replacing extremely slow CTFE, but it didn't seem to catch on. Everyone screamed that it was a security risk and it died there. -tofu
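
Something close to this can be approximated today with a separate build step, at the cost of leaving the language. A minimal sketch, assuming a hypothetical helper program run_gen.d that is invoked before the main compile: it runs an already-built executable with std.process.execute and saves the captured output to a file. The file and function names are made up for illustration.

```d
// run_gen.d: hypothetical build step, run before the main compile.
// Usage: run_gen <output-file> <command> [args...]
import std.exception : enforce;
import std.file : write;
import std.process : execute;
import std.stdio : stderr;

/// Runs an already-built executable and returns its captured output.
/// Throws if the process exits with a non-zero status.
string captureOutput(string[] command)
{
    auto result = execute(command);
    enforce(result.status == 0, "generator failed: " ~ result.output);
    return result.output;
}

int main(string[] args)
{
    if (args.length < 3)
    {
        stderr.writeln("usage: run_gen <output-file> <command> [args...]");
        return 1;
    }
    // args[1] is the file to write; the rest is the command to run.
    write(args[1], captureOutput(args[2 .. $]));
    return 0;
}
```

The main program can then pull the result in with mixin(import("generated.d")), where generated.d is whatever output file was chosen and its directory is passed to the compiler with -J. This keeps the compiler itself from executing anything, but it loses the locality of having the call sit right in the source.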
Jun 17 2014
"Araq" <rumpf_a web.de> writes:
On Tuesday, 17 June 2014 at 19:41:59 UTC, Ary Borenszweig wrote:
 CTFE is really nice but has its limitations: you can't do 
 anything you want, and since it's interpreted it requires an 
 interpreter and it's generally slow. Nimrod does the same 
 thing, and now they are implementing a VM to run the 
 interpreted code faster. Is this really the way to go?

For your information the new VM shipped with 0.9.4 and runs Nimrod code faster at compile-time than Python runs code at run-time in the tests that I did with it. :-) That said, it turned out to be much harder to implement than I thought and I wouldn't do it again.
 ...
 The compiler could be smart and cache the executable so that 
 anytime it has to expand it it just needs to invoke it (skip 
 the compile phase).

 What do you think?

It is a *very* good idea and this is exactly the way I would do it now. However, you usually only trade one set of problems for another. (For instance, giving Nimrod an 'eval' module is now quite easy to do...)
Jun 17 2014
"Dicebot" <public dicebot.lv> writes:
On actual topic.

Do I think it is a practical approach and has benefits over 
the existing situation? Definitely yes.

Do I think it is the right design with a more idealized 
infrastructure? No. As Dmitry has mentioned, it has the huge flaw 
of not being able to use template alias and type arguments, 
effectively taking reflection out of the question.

Do I think including it in the language as opposed to the build 
system is the deal breaker here? Not sure, but unlikely. It 
improves mental context locality, which is not important until 
this becomes a much more casual tool. And by the time this 
happens I'd like another design to be encouraged anyway.
Jun 18 2014