
digitalmars.D - Self-modifying code! The real kind!

Jethro <qyzz gr.ff> writes:
I think it would be pretty novel to have the concept of self 
modifying code.

I have several use cases in D where I have to repeat a process 
over and over: compile, change some line, then recompile to get 
the effect.

The main one has to do with mixins.

//version = compiledMixins;

version(compiledMixins)
{
     import fooMixedInFile;
} else
{
     WriteFile(foo, fooMixedInFile);
     ModFile(this, "//version = compiledMixins" => "version = compiledMixins");
}

This hypothetical code, when compiled, behaves like this:


1. evaluates the code string represented by foo and writes it to 
the file fooMixedInFile(.d).

2. modifies the current file and uncomments the version = 
compiledMixins line.

3. (hit compile again, or automate somehow)

4. imports fooMixedInFile instead.


Instead of exposing the mixin(foo) directly, this writes the 
mixin's output to a file and imports that file on the next build, 
so D can parse it and report errors properly.

This works well in practice, except that WriteFile and ModFile 
actually have to be run at runtime, so step 3 also has to include 
a "dummy" run, e.g.,

version(compiledMixins)
{
     void main()
     {
         WriteFile(foo, fooMixedInFile);
         ModFile(this, "//version = compiledMixins" => "version = compiledMixins");
     }

} else {

}

Essentially, this method allows debugging mixins as ordinary code, 
with the only requirement being one extra build and dummy run.
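
For illustration, the dummy run described above could look roughly like the sketch below. It is only a sketch under stated assumptions: asModule, enableVersion and dummyRun are hypothetical helper names standing in for the WriteFile and ModFile calls in the post, the contents of foo are made up, and the source patching is a plain textual replace rather than anything that understands D syntax.

```d
import std.array : replace;
import std.file : readText, write;

// Hypothetical stand-in for the string that would normally go to mixin().
enum foo = "int answer() { return 42; }";

/// Wraps generated code in a module declaration so it can be imported.
string asModule(string moduleName, string code)
{
    return "module " ~ moduleName ~ ";\n\n" ~ code ~ "\n";
}

/// Returns `source` with the commented-out version line switched on.
/// Plain text replacement: it does not parse D and will also rewrite
/// a matching line that happens to sit inside a string or comment.
string enableVersion(string source, string ident)
{
    return source.replace("//version = " ~ ident ~ ";",
                          "version = " ~ ident ~ ";");
}

/// The "dummy run": write the generated module next to the source,
/// then patch the source file so the next build imports it.
void dummyRun(string sourcePath, string generatedPath)
{
    write(generatedPath, asModule("fooMixedInFile", foo));
    write(sourcePath, enableVersion(readText(sourcePath), "compiledMixins"));
}
```

Calling dummyRun(__FILE__, "fooMixedInFile.d") from main in the non-compiledMixins branch would cover steps 1 and 2; it assumes the program is run from the directory it was compiled in, since __FILE__ is whatever path the compiler was given.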

The main problem is that I create hacks to do it. It would be 
nice to have a general-purpose solution in a nice package. Many 
would benefit from being able to debug mixins as if they were 
code, which the process above allows. Not only that: with a 
little bit of work, one could match the output of a mixin to the 
line of code that generated it, giving an accurate way to find 
bugs in the mixin code versus its output.

I am only talking about string mixins here, of course, and an 
import would have to be a valid way to bring them in (which may 
not work for certain types of string mixins, but works in the 
majority of cases).

The self-modifying code (the ModFile line) is interesting, but 
probably requires a good D parser to be robust.

I have a feeling that the compiler could do the job internally 
much better and completely encapsulate all the work.

Essentially,

1. Evaluate the string mixin (without actually inserting it yet).
2. Compute a hash of the resulting string.
3. Match the hash against the mixin's backing file. If it doesn't 
match, or the file doesn't exist, write the string to the file.
4. Import the file instead of the mixin code (or, if you want, 
mixin(import(file)), but one would need to fix up the debugging 
a little).
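
As a rough user-level approximation of steps 2 and 3 (not something the compiler does today), the sketch below hashes the generated string and rewrites the backing file only when the hash recorded in it differs. The name syncMixinFile and the "// mixin-hash:" header line are assumptions made up for this example, and SHA-256 is just one possible choice of hash.

```d
import std.algorithm.searching : startsWith;
import std.digest : toHexString;
import std.digest.sha : sha256Of;
import std.file : exists, readText, write;

/// Writes `code` to `path` unless the file already records the same hash.
/// Returns true if the file was (re)written, false if it was up to date.
bool syncMixinFile(string path, string code)
{
    // Step 2: compute a hash of the evaluated mixin string.
    auto hex = toHexString(sha256Of(code));
    string header = "// mixin-hash: " ~ hex[].idup ~ "\n";

    // Step 3: compare against the hash stored in the backing file.
    if (exists(path) && readText(path).startsWith(header))
        return false;

    // Missing or stale: write the string, prefixed by its hash.
    write(path, header ~ code);
    return true;
}
```

Storing the hash as a comment on the first line keeps the backing file importable as ordinary D code. Like the manual approach, this still has to run as a separate program before the real build, which is exactly the step the proposal wants the compiler to absorb.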

This has 3 advantages: 1. It can be done completely by the 
compiler when it encounters a mixin statement, and doesn't change 
anything for the user. 2. It allows both the string-generating 
code to be debugged (the compiler will catch those errors first) 
and the output of the mixin (caught on the second compile). 3. No 
manual recompilation step is required.

Thoughts?
Apr 05 2017
Swoorup Joshi <swoorupjoshi gmail.com> writes:
Self-modifying code might be the answer to all sorts of 
performance problems due to branching. The only problem is 
security, I guess. Don't they disable writes to the code segment 
anyway?

On Wednesday, 5 April 2017 at 22:21:23 UTC, Jethro wrote:
 I think it would be pretty novel to have the concept of self 
 modifying code.

 [...]
Apr 05 2017
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Apr 06, 2017 at 05:36:52AM +0000, Swoorup Joshi via Digitalmars-d wrote:
 Self-modifying might be the answer to all sorts of performance
 problems due to branching. Only problem is security I guess. Don't
 they disable writes to code segment anyway?
[...]

I don't think the OP was talking about self-modifying code in that sense. I think he was talking about a program that modifies its own *source code*, which is a different thing than a program that modifies its own machine code while that machine code is running.

T

-- 
Give a man a fish, and he eats once. Teach a man to fish, and he will sit forever.
Apr 07 2017
Jethro <qyzz gr.ff> writes:
On Friday, 7 April 2017 at 18:54:10 UTC, H. S. Teoh wrote:
 On Thu, Apr 06, 2017 at 05:36:52AM +0000, Swoorup Joshi via 
 Digitalmars-d wrote:
 Self-modifying might be the answer to all sorts of performance 
 problems due to branching. Only problem is security I guess. 
 Don't they disable writes to code segment anyway?
[...] I don't think the OP was talking about self-modifying code in that sense. I think he was talking about a program that modifies its own *source code*, which is a different thing than a program that modifies its own machine code while that machine code is running. T
Yeah, that's what I mean. Basically, D's metaprogramming accomplishes the same effect for the most part, but it is somewhat limited, mainly because one can't write to files at compile time for "security" reasons (I'd like to know of any real-world security issues that this has caused!).
Apr 07 2017
Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 6 April 2017 at 05:36:52 UTC, Swoorup Joshi wrote:
 Self-modifying might be the answer to all sorts of performance 
 problems due to branching.
No, it's not! You are throwing away your i-cache AND messing up the branch prediction.
Apr 07 2017
Era Scarecrow <rtcvb32 yahoo.com> writes:
On Friday, 7 April 2017 at 20:43:52 UTC, Stefan Koch wrote:
 On Thursday, 6 April 2017 at 05:36:52 UTC, Swoorup Joshi wrote:
 Self-modifying might be the answer to all sorts of performance 
 problems due to branching.
No it's not! You are throwing away your i-cache AND mess up the branch prediction.
From the opening statement it looks and sounds more like loading and unloading DLL files than self-modifying code.

Self-modifying code isn't really that practical anymore. The best working example is compressed executables (UPX and similar), but those only expand optimized code from a compressed cache and then change the block to executable; they don't really modify the code at all.

Perhaps an actual use case for self-modifying code would be to give you a quick & dirty compile of a function, then work on optimizing it, and then switch the calls over to the new function once it's optimized. That is more useful for, say, JIT circumstances and emulation, and less so for statically known source code.
Apr 07 2017
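
The "switch the calls once it's optimized" idea can be approximated without touching machine code at all by routing calls through a function pointer, as in the sketch below. All names here (squareQuick, squareOptimized, square, promote) are hypothetical, and a real JIT would generate the optimized body at runtime rather than have both versions compiled ahead of time.

```d
// Hypothetical "quick and dirty" version: repeated addition, valid for x >= 0.
int squareQuick(int x)
{
    int sum = 0;
    foreach (_; 0 .. x)
        sum += x;
    return sum;
}

// Hypothetical optimized replacement with the same result for x >= 0.
int squareOptimized(int x)
{
    return x * x;
}

// Callers go through this pointer instead of naming either function.
// Module-level variables in D are thread-local unless marked shared.
int function(int) square = &squareQuick;

/// Redirects later calls through `square` to the optimized version.
void promote()
{
    square = &squareOptimized;
}
```

Callers keep writing square(7) before and after promote() runs. The cost is an indirect call on every use, which is the usual trade-off against patching call sites in place.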