www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Stable functions

reply "Martin M. Pedersen" <martin moeller-pedersen.dk> writes:
In the process of considering another language of my own (not an ambitios 
project like D), I has been thinking of a radical language feature: I like 
to think of it as "stable functions".

In C++ you have templates, that is a functional language on top of the 
original C++. Templates are used at compile-time, and are sometimes 
(mis)used to do some heavy computation while compiling. I would like to be 
able to do this, but the template approach is obscure, and not really 
necessary as I see it.

Instead I suggest the concept of "stable functions". A class of functions 
(and methods), that have no side effects and are guaranteed to generate the 
same result given the same input. Lots of functions of the standard library 
could be classified as stable where math and string functions are obvious 
examples. If the compiler was evaluating stable functions at compile-time, 
incorporating them fully as constant expressions, it would allow us to do 
some interesting things. For example, a function like POSIX's regcomp() 
could be classified as a stable function, allowing compile-time compilation 
of regular expressions - something that was brought up recently. An even 
more radical use would be to feed a grammar as a string into a 
parser-generator, and having it generate parser tables at compile-time; ie. 
a YACC-like facility - but all within the existing language and without use 
of templates. How would that be compared to Boost's Spirit library? Like 
Walter, I have big reservations about the extreme use of templates we 
sometimes see, and this may be an alternative in some cases.

The only syntactical extension to the language would be the addition of the 
"stable" keyword. Something like:

    stable real cos(real num);

What could be done in stable functions would be restricted. For example the 
following could not be allowed:

    - use of non-local variables
    - use of other non-stable functions

On the other hand, the following should be allowed:

    - instantiation of objects
    - use of non-local constants
    - use of other stable functions

In the compiler itself it would requiere a relatively large addition: A 
compile/runtime-time system that allows execution of stable methods. A 
stable method could be provided as extern(C) and not even be available in 
source form. The compiler would then need to dynamically link in and use the 
function. A less feature-rich implementation could do, however. On the other 
hand, I don't think the concept would not interfere with link-compability 
with C, and standard C linkers would still be used.

I'm not aware of any other language having a feature like this, but I know 
only a small fraction of languages out there. I may exist already.

What do you think?

Regards,
Martin
Mar 03 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 3 Mar 2005 09:02:36 +0100, Martin M. Pedersen  
<martin moeller-pedersen.dk> wrote:
 In the process of considering another language of my own (not an ambitios
 project like D), I has been thinking of a radical language feature: I  
 like
 to think of it as "stable functions".

 In C++ you have templates, that is a functional language on top of the
 original C++. Templates are used at compile-time, and are sometimes
 (mis)used to do some heavy computation while compiling. I would like to  
 be
 able to do this, but the template approach is obscure, and not really
 necessary as I see it.

 Instead I suggest the concept of "stable functions". A class of functions
 (and methods), that have no side effects and are guaranteed to generate  
 the
 same result given the same input. Lots of functions of the standard  
 library
 could be classified as stable where math and string functions are obvious
 examples. If the compiler was evaluating stable functions at  
 compile-time,
 incorporating them fully as constant expressions, it would allow us to do
 some interesting things. For example, a function like POSIX's regcomp()
 could be classified as a stable function, allowing compile-time  
 compilation
 of regular expressions - something that was brought up recently. An even
 more radical use would be to feed a grammar as a string into a
 parser-generator, and having it generate parser tables at compile-time;  
 ie.
 a YACC-like facility - but all within the existing language and without  
 use
 of templates. How would that be compared to Boost's Spirit library? Like
 Walter, I have big reservations about the extreme use of templates we
 sometimes see, and this may be an alternative in some cases.

 The only syntactical extension to the language would be the addition of  
 the
 "stable" keyword. Something like:

     stable real cos(real num);

 What could be done in stable functions would be restricted. For example  
 the
 following could not be allowed:

     - use of non-local variables
     - use of other non-stable functions

 On the other hand, the following should be allowed:

     - instantiation of objects
     - use of non-local constants
     - use of other stable functions

 In the compiler itself it would requiere a relatively large addition: A
 compile/runtime-time system that allows execution of stable methods. A
 stable method could be provided as extern(C) and not even be available in
 source form. The compiler would then need to dynamically link in and use  
 the
 function. A less feature-rich implementation could do, however. On the  
 other
 hand, I don't think the concept would not interfere with link-compability
 with C, and standard C linkers would still be used.

 I'm not aware of any other language having a feature like this, but I  
 know
 only a small fraction of languages out there. I may exist already.

 What do you think?

Interesting idea.. I assume the input has to be known at compile time? Meaning you need to ammend your definition a little:
 Instead I suggest the concept of "stable functions". A class of functions
 (and methods), that have no side effects and are guaranteed to generate  
 the same result given the same input

known at compile time. Regan
Mar 03 2005
next sibling parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Regan Heath wrote:

 known at compile time.

No need to limit it to compile-time known arguments. Common example: if we don't know the arguments, but for 2 function calls we can prove that they are the same, the result would be the same, and can thus be cached. Because of such functions we know that they return predictably same results given the same input, the value induces itself further. This can be extremely advantageous in expanding all sorts of templates. I think they are properly called pure functions. Another question is whether they belong in the declaration. One can argue, that, given the source, the compiler can detect these automatically during whole-program compilation. -eye
Mar 03 2005
next sibling parent reply "Nick Sabalausky" <z a.a> writes:
"Ilya Minkov" <minkov cs.tum.edu> wrote in message 
news:d0887o$2m1a$1 digitaldaemon.com...
 Regan Heath wrote:

 known at compile time.

No need to limit it to compile-time known arguments. Common example: if we don't know the arguments, but for 2 function calls we can prove that they are the same, the result would be the same, and can thus be cached. Because of such functions we know that they return predictably same results given the same input, the value induces itself further. This can be extremely advantageous in expanding all sorts of templates. I think they are properly called pure functions. Another question is whether they belong in the declaration. One can argue, that, given the source, the compiler can detect these automatically during whole-program compilation. -eye

Isn't this more of a compiler optimization than a language feature though?
Mar 03 2005
parent Norbert Nemec <Norbert Nemec-online.de> writes:
Nick Sabalausky schrieb:
 Isn't this more of a compiler optimization than a language feature though?

Not really: the question is, whether you can use pure functions in places where a constant is expected: const double pi = asin(1)*2; const int DIM = myfunc(17); MyArrayClass!(DIM,float) someobject;
Mar 05 2005
prev sibling next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Fri, 04 Mar 2005 00:56:42 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:
 Regan Heath wrote:

 known at compile time.

No need to limit it to compile-time known arguments. Common example: if we don't know the arguments, but for 2 function calls we can prove that they are the same, the result would be the same, and can thus be cached.

Sure, but how do you do that at "compile" time? Regan
Mar 03 2005
parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Regan Heath wrote:
 On Fri, 04 Mar 2005 00:56:42 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:

 Sure, but how do you do that at "compile" time?

You make a function call. A few lines later, in the same function, you make the same function call with the same variables as arguments - but then the compiler could have created a temporary variable earlier in the code, and instead of actually calling the function again it simply takes the result from a temporary variable. However, the varaibles have been used in between and you need a proof that they still contain the same values during the second call as during the first one. There are a few things that can help - in particular, that value types cannot be modified by functions without the inout/out specification, and that pure functions (which we wanted to detect at compile-time anyway) don't modify their inputs at all. Naturally you don't usually write the same function call a few lines apart, but the templates you expand might. Another interesting use would be if one could make compile-time checks on whether a function is provably pure or it isn't. If a function is provably pure, takes plenty of time to execute, and you have an algorithm that evaluates a function often with same values, it could conditionally cache already used argument-value pairs in a dictionary. I was first also thinking of the ability to make static assertions on the purity of a function, but this is probably a bad idea since the compiler might be unable to prove that the function is pure when it actually is - it might even lack such a prover completely or it might be disabled to make software build faster. However, if purity is specified within the declaration of a function, then the compiler could reverse the check. In debug builds it could write an assertion into the body of a function that caches last n argument-value pairs in a dictionary, and after executing the function body it could also check that the result matches the one in the dictionary it the argument already was in the dictionary. Then, of course, assertions on the purity of function could make sense from the outside. -eye
Mar 04 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Fri, 04 Mar 2005 20:56:49 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:
 Regan Heath wrote:
 On Fri, 04 Mar 2005 00:56:42 +0100, Ilya Minkov <minkov cs.tum.edu>  
 wrote:

 Sure, but how do you do that at "compile" time?

You make a function call. A few lines later, in the same function, you make the same function call with the same variables as arguments - but then the compiler could have created a temporary variable earlier in the code, and instead of actually calling the function again it simply takes the result from a temporary variable.

If the input to the function is not known till runtime, how is this possible? Example: void uppercase(char[] line) {} BufferedFile f = ... char[] line; while((line = f.readLine() != null) { uppercase(line); //..a few lines.. uppercase(line); } Regan
Mar 04 2005
parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Regan Heath wrote:

 If the input to the function is not known till runtime, how is this  
 possible?
 Example:
 
 void uppercase(char[] line) {}

Uppercase is not a pure function if it modifies its argument, so this optimization is not applicable to it. But if you have such a function return a new string, then you could prove it would return a string with the same contents and reuse the previously returned string instead of another function call. -eye
Mar 05 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sat, 05 Mar 2005 19:09:49 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:
 Regan Heath wrote:

 If the input to the function is not known till runtime, how is this   
 possible?
 Example:
  void uppercase(char[] line) {}

Uppercase is not a pure function if it modifies its argument, so this optimization is not applicable to it. But if you have such a function return a new string, then you could prove it would return a string with the same contents and reuse the previously returned string instead of another function call.

How can you do that at compile time, without the input to test it with? Regan
Mar 06 2005
next sibling parent reply Sebastian Beschke <s.beschke gmx.de> writes:
Regan Heath schrieb:
 On Sat, 05 Mar 2005 19:09:49 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:
 
 Regan Heath wrote:
 Uppercase is not a pure function if it modifies its argument, so this 
 optimization is not applicable to it.

 But if you have such a function return a new string, then you could 
 prove it would return a string with the same contents and reuse the 
 previously returned string instead of another function call.

How can you do that at compile time, without the input to test it with?

There are two possibilities for optimization here: 1. Uppercase is called on constants in the code. In that case, the result can be evaluated at compile time. 2. Uppercase is called multiple times on the same input. In that case, the result can be cached. Example: char[] printUpper(char[] str) { writef("%s", uppercase(str)); return uppercase(str); // This result could be cached! } -Sebastian
Mar 07 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 07 Mar 2005 10:12:29 +0100, Sebastian Beschke <s.beschke gmx.de>  
wrote:
 Regan Heath schrieb:
 On Sat, 05 Mar 2005 19:09:49 +0100, Ilya Minkov <minkov cs.tum.edu>  
 wrote:

 Regan Heath wrote:
 Uppercase is not a pure function if it modifies its argument, so this
 optimization is not applicable to it.

 But if you have such a function return a new string, then you could
 prove it would return a string with the same contents and reuse the
 previously returned string instead of another function call.

How can you do that at compile time, without the input to test it with?

There are two possibilities for optimization here: 1. Uppercase is called on constants in the code. In that case, the result can be evaluated at compile time. 2. Uppercase is called multiple times on the same input. In that case, the result can be cached.

What input!?! It's unknown until runtime. The OP's suggestion was for a compile time feature, and by compile time he did not mean JIT compile. Am I not making myself clear somehow?? Regan
Mar 07 2005
parent reply Sebastian Beschke <s.beschke gmx.de> writes:
Regan Heath schrieb:
 On Mon, 07 Mar 2005 10:12:29 +0100, Sebastian Beschke
 <s.beschke gmx.de>  wrote:
 
 There are two possibilities for optimization here:

 1. Uppercase is called on constants in the code. In that case, the
 result can be evaluated at compile time.

 2. Uppercase is called multiple times on the same input. In that case,
 the result can be cached.

What input!?! It's unknown until runtime. The OP's suggestion was for a compile time feature, and by compile time he did not mean JIT compile. Am I not making myself clear somehow??

Well, optimization 2 could also be performed at compile time. For example, the code I mentioned: char[] printUpper(char[] str) { writef("%s", uppercase(str)); return uppercase(str); // This result could be cached! } could be optimized at compile time to: char[] printUpper(char[] str) { char[] temp = uppercase(str); writef("%s", temp); return temp; // This result could be cached! } Of course, you can't eliminate the first call to the function, except maybe in the case where the printUpper() function is called with a constant argument - dunno how hard this is to do. -Sebastian
Mar 07 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 07 Mar 2005 13:10:17 +0100, Sebastian Beschke <s.beschke gmx.de>  
wrote:
 Regan Heath schrieb:
 On Mon, 07 Mar 2005 10:12:29 +0100, Sebastian Beschke
 <s.beschke gmx.de>  wrote:

 There are two possibilities for optimization here:

 1. Uppercase is called on constants in the code. In that case, the
 result can be evaluated at compile time.

 2. Uppercase is called multiple times on the same input. In that case,
 the result can be cached.

What input!?! It's unknown until runtime. The OP's suggestion was for a compile time feature, and by compile time he did not mean JIT compile. Am I not making myself clear somehow??

Well, optimization 2 could also be performed at compile time. For example, the code I mentioned: char[] printUpper(char[] str) { writef("%s", uppercase(str)); return uppercase(str); // This result could be cached! } could be optimized at compile time to: char[] printUpper(char[] str) { char[] temp = uppercase(str); writef("%s", temp); return temp; // This result could be cached! } Of course, you can't eliminate the first call to the function, except maybe in the case where the printUpper() function is called with a constant argument - dunno how hard this is to do.

So, what you're suggesting is that the program keeps a list of all arguments to every function, and records the result, then, if it gets the same argument to a function it simply returns the result it got last time. So, if you're processing a 1,000,000 line file, with printUpper it will have 2,000,000 lines stored in memory somewhere? Regan
Mar 07 2005
prev sibling parent Andy Friesen <andy ikagames.com> writes:
Regan Heath wrote:
 On Sat, 05 Mar 2005 19:09:49 +0100, Ilya Minkov <minkov cs.tum.edu> wrote:
 
 Regan Heath wrote:

 If the input to the function is not known till runtime, how is this   
 possible?
 Example:
  void uppercase(char[] line) {}

Uppercase is not a pure function if it modifies its argument, so this optimization is not applicable to it. But if you have such a function return a new string, then you could prove it would return a string with the same contents and reuse the previously returned string instead of another function call.

How can you do that at compile time, without the input to test it with?

There are a number of things the compiler could hypothetically do. // suppose uppercase is a pure function char[] uppercase(char[] str) { ... } // uppercase() call is interpreted at compile-time // same as "const char[] blah = "FOOBAR"; const char[] blah = uppercase("foobar"); void foo(char[] input) { char[] a = uppercase(input); // optimize to "char[] b = a.dup;" char[] b = uppercase(input); } int fibo(int i) { if (i < 2) return 1; else return fibo(i-1) + fibo(i-2); } // simpler than the template metaprogramming approach... const int f9 = fibo(9); The big question is how large the performance gains could be. Clean and O'Caml manage to beat out C frequently enough* that I can't help but think it may be worthwhile. :) * see <http://shootout.alioth.debian.org> -- andy
Mar 07 2005
prev sibling parent Norbert Nemec <Norbert Nemec-online.de> writes:
Ilya Minkov schrieb:
 Another question is whether they belong in the declaration. One can 
 argue, that, given the source, the compiler can detect these 
 automatically during whole-program compilation.

Still, the code would be somewhat cleaner if it had to be specified explicitely. You immediately see that a function is intended to be pure and the compiler can check this a the time of definition of the function. Basically, it is the same for all interface declarations: The compiler could very often determine the interface from reading the definition, but it cleans up the design phase a lot and improves the quality of the code if the interface gives all the crucial information and the compiler can then check whether the implementation actually does what the interface promises.
Mar 05 2005
prev sibling parent "Martin M. Pedersen" <martin moeller-pedersen.dk> writes:
"Regan Heath" <regan netwin.co.nz> skrev i en meddelelse 
news:opsm2yjnk423k2f5 ally...
 I assume the input has to be known at compile time?

Yes.
 Instead I suggest the concept of "stable functions". A class of functions
 (and methods), that have no side effects and are guaranteed to generate 
 the same result given the same input


Exactly. Regards, Martin
Mar 04 2005
prev sibling parent reply Andy Friesen <andy ikagames.com> writes:
Martin M. Pedersen wrote:
 In the process of considering another language of my own (not an ambitios 
 project like D), I has been thinking of a radical language feature: I like 
 to think of it as "stable functions".
 
 What do you think?

I believe this is called staged compilation. It's an immensely powerful concept that basically boils down to a simple idea: who says that compilation has to be done at compile time, and interpretation by an interpreter? Constant folding is an example of this: instead of emitting code to perform the calculation, the compiler computes the value on the spot and only emits code for the result value. .NET's new generic facilities are another example: instead of emitting a container class that works on a specific element type, emit the generic skeleton, and recompile that (at runtime!) for specific types. (for example, the generic container class ArrayList<T> is compiled into a library. When a concrete type is specified at runtime, the JIT engine recompiles the class for T) Lisp macros, themselves functions which are run by the compiler, are yet another example. I'm not sure whether the reverse is also true, but there /is/ a rule that suggests that every feature of every language is a special case of Lisp's one feature... -- andy
Mar 04 2005
parent reply "Martin M. Pedersen" <martin moeller-pedersen.dk> writes:
"Andy Friesen" <andy ikagames.com> skrev i en meddelelse 
news:d0969a$i1b$1 digitaldaemon.com...
 Martin M. Pedersen wrote:
 .NET's new generic facilities are another example: instead of emitting a 
 container class that works on a specific element type, emit the generic 
 skeleton, and recompile that (at runtime!) for specific types.

Interesting. Does that mean that you can have run-time compilation errors? Regards, Martin
Mar 04 2005
parent Andy Friesen <andy ikagames.com> writes:
Martin M. Pedersen wrote:
 "Andy Friesen" <andy ikagames.com> skrev i en meddelelse 
 news:d0969a$i1b$1 digitaldaemon.com...
 
Martin M. Pedersen wrote:
.NET's new generic facilities are another example: instead of emitting a 
container class that works on a specific element type, emit the generic 
skeleton, and recompile that (at runtime!) for specific types.

Interesting. Does that mean that you can have run-time compilation errors?

No, the language restricts things in such a way that it can't happen. A side effect of this is that .NET generics aren't as flexible as C++ or D templates, as you have to explicitly constrain a generic type to an interface. (of course, they still kick the crap out of Java 5.0's pathetic generics) -- andy
Mar 07 2005