www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Implementing Pure Functions

reply Walter Bright <newshound2 digitalmars.com> writes:
http://drdobbs.com/blogs/tools/230700070

Someone want to post this on reddit?
Jun 16 2011
next sibling parent reply Byakkun <byakkun myopera.com> writes:
On Thu, 16 Jun 2011 12:27:30 +0300, Walter Bright  
<newshound2 digitalmars.com> wrote:

 http://drdobbs.com/blogs/tools/230700070

 Someone want to post this on reddit?
Done http://www.reddit.com/r/programming/comments/i13qp/implementing_pure_functions_in_d_programming/ -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jun 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/16/11 5:02 AM, Byakkun wrote:
 On Thu, 16 Jun 2011 12:27:30 +0300, Walter Bright
 <newshound2 digitalmars.com> wrote:

 http://drdobbs.com/blogs/tools/230700070

 Someone want to post this on reddit?
Done http://www.reddit.com/r/programming/comments/i13qp/implementing_pure_functions_in_d_programming/
This has sparked an interesting discussion, to which I added my bit. Andrei
Jun 16 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 This has sparked an interesting discussion, to which I added my bit.
In D I suggest version(none){} to disable some code :-) Bye, bearophile
Jun 16 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 This has sparked an interesting discussion, to which I added my bit.
int fun(int a) pure { if (a > 10) writeln("I'm impure); } As I understand, even if some calls to a function have some repeatability properties, this doesn't mean the function is pure. In this example fun is obviously impure. Here one can talk about allowing to call impure functions from pure ones, but that's a way different task.
Jun 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/17/11 1:56 AM, Kagamin wrote:
 Andrei Alexandrescu Wrote:

 This has sparked an interesting discussion, to which I added my
 bit.
int fun(int a) pure { if (a> 10) writeln("I'm impure); } As I understand, even if some calls to a function have some repeatability properties, this doesn't mean the function is pure. In this example fun is obviously impure. Here one can talk about allowing to call impure functions from pure ones, but that's a way different task.
Right. I gave that example within the context of the discussion, which considered purity a path-sensitive property. By that definition, if fun is provably never invoked with a > 10, then it's effectively pure. Andrei
Jun 16 2011
next sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 On 6/17/11 1:56 AM, Kagamin wrote:
 Andrei Alexandrescu Wrote:

 This has sparked an interesting discussion, to which I added my
 bit.
int fun(int a) pure { if (a> 10) writeln("I'm impure); } As I understand, even if some calls to a function have some repeatability properties, this doesn't mean the function is pure. In this example fun is obviously impure. Here one can talk about allowing to call impure functions from pure ones, but that's a way different task.
Right. I gave that example within the context of the discussion, which considered purity a path-sensitive property. By that definition, if fun is provably never invoked with a > 10, then it's effectively pure. Andrei
And that's an interesting thing about CTFE: although it can never do anything impure, it can call impure functions, provided it avoids the impure bits of them: int foo(int n) { static int x = 56; if (n<10) return n; ++x; return x; } static assert(n(5) == 5); Likewise, it can never doing anything unsafe, but can call unsafe functions. bool bar(int n) { if (n < 2) return true; corruptTheStack(); return false; } static assert(bar(1));
Jun 17 2011
parent reply Byakkun <byakkun myopera.com> writes:
On Fri, 17 Jun 2011 10:49:43 +0300, Don <nospam nospam.com> wrote:

 Andrei Alexandrescu wrote:
 On 6/17/11 1:56 AM, Kagamin wrote:
 Andrei Alexandrescu Wrote:

 This has sparked an interesting discussion, to which I added my
 bit.
int fun(int a) pure { if (a > 10) writeln("I'm impure); } As I understand, even if some calls to a function have some repeatability properties, this doesn't mean the function is pure. In this example fun is obviously impure. Here one can talk about allowing to call impure functions from pure ones, but that's a way different task.
Right. I gave that example within the context of the discussion, which considered purity a path-sensitive property. By that definition, if fun is provably never invoked with a > 10, then it's effectively pure. Andrei
And that's an interesting thing about CTFE: although it can never do anything impure, it can call impure functions, provided it avoids the impure bits of them: int foo(int n) { static int x = 56; if (n<10) return n; ++x; return x; } static assert(n(5) == 5); Likewise, it can never doing anything unsafe, but can call unsafe functions. bool bar(int n) { if (n < 2) return true; corruptTheStack(); return false; } static assert(bar(1));
I was wondering if you could not implement [weak] purity in the language so that it requires the programmer to use contracts to insure that side effects newer happen. I would imagine that this should work with the semantic analysis the compiler provides without adding a specialized theorem prover. (Note: my knowledge of compilers and semantic analysis is fairly limited so ignore this if it is plain stupid). -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jun 17 2011
parent reply Don <nospam nospam.com> writes:
Byakkun wrote:
 On Fri, 17 Jun 2011 10:49:43 +0300, Don <nospam nospam.com> wrote:
 
 Andrei Alexandrescu wrote:
 On 6/17/11 1:56 AM, Kagamin wrote:
 Andrei Alexandrescu Wrote:

 This has sparked an interesting discussion, to which I added my
 bit.
int fun(int a) pure { if (a > 10) writeln("I'm impure); } As I understand, even if some calls to a function have some repeatability properties, this doesn't mean the function is pure. In this example fun is obviously impure. Here one can talk about allowing to call impure functions from pure ones, but that's a way different task.
Right. I gave that example within the context of the discussion, which considered purity a path-sensitive property. By that definition, if fun is provably never invoked with a > 10, then it's effectively pure. Andrei
And that's an interesting thing about CTFE: although it can never do anything impure, it can call impure functions, provided it avoids the impure bits of them: int foo(int n) { static int x = 56; if (n<10) return n; ++x; return x; } static assert(n(5) == 5); Likewise, it can never doing anything unsafe, but can call unsafe functions. bool bar(int n) { if (n < 2) return true; corruptTheStack(); return false; } static assert(bar(1));
I was wondering if you could not implement [weak] purity in the language so that it requires the programmer to use contracts to insure that side effects newer happen. I would imagine that this should work with the semantic analysis the compiler provides without adding a specialized theorem prover.
You mean like: int foo(int n) in pure { assert(n < 10); } body { static int x = 56; if (n<10) return n; ++x; return x; } where the 'in pure' contract applies only if the function is called from a pure function?
 (Note: my knowledge of compilers and semantic analysis is fairly limited 
 so ignore this if it is plain stupid).
That'd probably work if the compiler already had a theorem prover -- but it doesn't.
Jun 17 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/17/2011 4:27 AM, Don wrote:
 That'd probably work if the compiler already had a theorem prover -- but it
 doesn't.
Jun 17 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Walter:


You are beating this horse a bit too much. It uses a quite refined theorem prover, it works, but as I have said from the beginning it doesn't work in all cases. There are always cases automatic provers aren't able to handle. The case you have found is hard. Have you sent them a bug report? Even the one of SPARK doesn't work in all cases. When the one of SPARK doesn't work, then the programmer has to write down the proof manually (or half-manually, with an interactive prover). Bye, bearophile
Jun 17 2011
prev sibling next sibling parent Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 Right. I gave that example within the context of the discussion, which 
 considered purity a path-sensitive property. By that definition, if fun 
 is provably never invoked with a > 10, then it's effectively pure.
That's kinda holistic approach, quite strange thing to see in modern programming. The term "pure function" implies decomposition and definition of purity for parts the program was decomposed to. Usually it's a functional decomposition and in D purity is function-level. I would say that holistic approach is inadequate for notion which is reductionist by its nature.
Jun 17 2011
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Right. I gave that example within the context of the discussion,
That Reddit page shows your last thoughts about conditional purity too:
We do plan, however, to add purity inference for all template functions because
the source code is by definition available. We believe that that is possible
with a worklist approach. I haven't thought about it a lot, but it seems a
fairly standard analysis. If you think it isn't, let us know before we look
deeper into it! :o)<
The algorithm would be quite powerful I think. It will be an optimistic
analysis that starts with the assumption that a given template function is
pure. Then for each function called by that function, put it in a worklist and
iterate the worklist transitively replacing other functions with the functions
they call, until either an impure function is found or the list is empty. If
the list is empty, then the function is pure. So essentially the algorithm
automatically annotates with "pure" (in the D sense) all template functions
wherever possible. One neat thing is that for a given template function, some
instantiations may be pure and some others may not.<
I presume this recursive algorithm is run only if the template is marked with "pure". The older idea was to let the programmer control, and just add a condition to attributes, using something like optional_tag or just adding the compile time condition beside the pure() attribute, an example: template isPure(F) { enum bool isPure = functionAttributes!(F) & FunctionAttribute.PURE; } pure(isPure!F) int[] map(F)(F f, int[] data) { int[] res; foreach (x; data) res ~= f(x); return res; } We have to keep into account compilation time too. As more and more powerful features are added to the D type system, like this automatic purity, compilation time gets longer. Scala compilation is about ten times slower than Java code (even with the fast fsc compiler), and this makes programming in Scala a little worse in practice. Compilation time matters. Is inferring purity automatically for templates as fast as a pure(CTcondition)? If the programmer uses something like pure(CTcondition) the compiler has to verify the purity of the template anyway, even if the condition is true, but the compiler doesn't need to verify that the template is pure if the CompileTime condition is false. Assuming most code is written correctly, I think this is able to shave away many tests, and save some time to the compiler, reducing compile time a bit. Bye, bearophile
Jun 17 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/17/11 7:01 AM, bearophile wrote:
 Andrei:

 Right. I gave that example within the context of the discussion,
That Reddit page shows your last thoughts about conditional purity too:
 We do plan, however, to add purity inference for all template functions
because the source code is by definition available. We believe that that is
possible with a worklist approach. I haven't thought about it a lot, but it
seems a fairly standard analysis. If you think it isn't, let us know before we
look deeper into it! :o)<
 The algorithm would be quite powerful I think. It will be an optimistic
analysis that starts with the assumption that a given template function is
pure. Then for each function called by that function, put it in a worklist and
iterate the worklist transitively replacing other functions with the functions
they call, until either an impure function is found or the list is empty. If
the list is empty, then the function is pure. So essentially the algorithm
automatically annotates with "pure" (in the D sense) all template functions
wherever possible. One neat thing is that for a given template function, some
instantiations may be pure and some others may not.<
I presume this recursive algorithm is run only if the template is marked with "pure".
No, it would work for all template functions and ran either opportunistically or only as needed.
 The older idea was to let the programmer control, and just add a condition to
attributes, using something like  optional_tag or just adding the compile time
condition beside the pure() attribute, an example:

 template isPure(F) {
      enum bool isPure = functionAttributes!(F)&  FunctionAttribute.PURE;
 }

 pure(isPure!F) int[] map(F)(F f, int[] data) {
      int[] res;
      foreach (x; data)
          res ~= f(x);
      return res;
 }

 We have to keep into account compilation time too. As more and more powerful
features are added to the D type system, like this automatic purity,
compilation time gets longer. Scala compilation is about ten times slower than
Java code (even with the fast fsc compiler), and this makes programming in
Scala a little worse in practice. Compilation time matters.

 Is inferring purity automatically for templates as fast as a
pure(CTcondition)? If the programmer uses something like pure(CTcondition) the
compiler has to verify the purity of the template anyway, even if the condition
is true, but the compiler doesn't need to verify that the template is pure if
the CompileTime condition is false. Assuming most code is written correctly, I
think this is able to shave away many tests, and save some time to the
compiler, reducing compile time a bit.
Automatic inference should not be slower than verification of conditional purity. The work involved is the same. We should do inference for pure, nothrow, const/immutable for any and all template functions. Andrei
Jun 17 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Automatic inference should not be slower than verification of 
 conditional purity. The work involved is the same.
 No, it would work for all template functions and ran either 
 opportunistically or only as needed.
The first thing you say goes against the second: if you run the verification algorithm even if the template is not tagged with pure, then you are doing more work than running it only on templates tagged as pure. And even if it runs only on templates tagged with pure, I think the compiler is able to do a bit less work if you use pure(CTcond) instead, because the CTcond is chosen by the programmer to be quite discriminating. I guess this can be solved adding some heuristics to the compiler: to test first the most probable places where something impure is (like the functions given as function or template arguments), to prune the impurity search graph faster. In the end I think your idea of automatic purity is better (if it works), because it's more handy for the programmer. But I suggest to add some pruning heuristics, and to run it only on templates tagged with pure. Bye, bearophile
Jun 17 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/17/11 8:39 AM, bearophile wrote:
 Andrei:

 Automatic inference should not be slower than verification of
 conditional purity. The work involved is the same.
 No, it would work for all template functions and ran either
 opportunistically or only as needed.
The first thing you say goes against the second: if you run the verification algorithm even if the template is not tagged with pure, then you are doing more work than running it only on templates tagged as pure.
Well there are two sides of the coin. If you call e.g. a template function from within a pure function, you must do the verification anyway, or refuse to compile if the annotation is missing. Whether or not the annotation is present, the verification work done toward a successful compilation is the same. If anything, the time to a _failed_ compilation may be slower (but not arbitrarily so). On the bright side, inference is great - no more need for the user to bother with annotations, the compiler takes care of it. Furthermore, the compiler may opportunistically infer function attributes (pure, nothrow etc.) if it wants to optimize a specific context.
 And even if it runs only on templates tagged with pure, I think the
 compiler is able to do a bit less work if you use pure(CTcond)
 instead, because the CTcond is chosen by the programmer to be quite
 discriminating.
The downside of that would be not only that the programmer must spend time figuring the right annotation, but also that the result will be suboptimal.
 I guess this can be solved adding some heuristics to
 the compiler: to test first the most probable places where something
 impure is (like the functions given as function or template
 arguments), to prune the impurity search graph faster.

 In the end I think your idea of automatic purity is better (if it
 works), because it's more handy for the programmer. But I suggest to
 add some pruning heuristics, and to run it only on templates tagged
 with pure.
It's Walter's. Templates should not be tagged with pure unless they are pure for all instantiations. Otherwise the annotation would lose any guarantee. Thanks, Andrei
Jun 17 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
bearophile wrote:
 Andrei:

 Automatic inference should not be slower than verification of
 conditional purity. The work involved is the same.
 No, it would work for all template functions and ran either
 opportunistically or only as needed.
The first thing you say goes against the second: if you run the verification
algorithm even if the template is not tagged with pure, then you are doing more work than running > it only on templates tagged as pure.
 And even if it runs only on templates tagged with pure, I think the compiler is
able to do a bit less work if you use pure(CTcond) instead, because the CTcond is chosen by
 the programmer to be quite discriminating. I guess this can be solved adding
some heuristics to the compiler: to test first the most probable places where something impure is > (like the functions given as function or template arguments), to prune the impurity search graph faster.
 In the end I think your idea of automatic purity is better (if it works),
because it's more handy for the programmer. But I suggest to add some pruning heuristics, and to run > it only on templates tagged with pure.
 Bye,
 bearophile
I think a template tagged with pure should be pure no matter what. Furthermore, I think the additional work required is minimal. The compiler just has to store a flag whether or not any function is pure and set it to false during semantic analysis when it encounters an impure action within the function body. Also, I assume that the bottleneck for compile time is parsing all the (Phobos) imports (that is quite a lot even if you only import one function from one module!). But hopefully local imports are going to improve that situation a lot. (I mean, why exactly do I need to import Eg. std.regex when I do import std.stdio : writeln?) A small problem I can see with the purity validation algorithm proposed by Andrei relates to incremental builds (again): module A; int iampure(){ return 5; } module B; int puretemplate()(){ // inferred pure return iampure(); } int main(){ import std.stdio; writeln(puretemplate(), puretemplate()); // compiler caches result } $ dmd -c A $ dmd -c B $ dmd A.o B.o; ./A 55 Some funny guy decides to change A.iampure: module A; int iampure(){ static int x; return ++x; } $ dmd -c A $ dmd A.o B.o; ./A 11 Wrong! Should be 12. Therefore, the compiler has to assume that functions defined in other modules are not pure unless marked as such. (or issue a pure copy of each would-be-pure function to the object file and call the pure version from inferred pure templates. Then the problem would be detected at link time.) I think the best way to handle it would be to make templates pure iff their body would pass the standard purity validation. Less special casing. Timon
Jun 17 2011
parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 17, 11 22:41, Timon Gehr wrote:
[snip]
 A small problem I can see with the purity validation algorithm proposed by
Andrei
 relates to incremental builds (again):

 module A;
 int iampure(){
      return 5;
 }

 module B;
 int puretemplate()(){ // inferred pure
      return iampure();
 }

 int main(){
      import std.stdio;
      writeln(puretemplate(), puretemplate()); // compiler caches result
 }

 $ dmd -c A
 $ dmd -c B
 $ dmd A.o B.o; ./A
 55

 Some funny guy decides to change A.iampure:
 module A;
 int iampure(){
      static int x;
      return ++x;
 }

 $ dmd -c A
 $ dmd A.o B.o; ./A
 11

 Wrong! Should be 12.
That happens too if 'iampure' is a template or the compiler inline a function which should be modified, with or without caching. I don't see this as a problem of attribute inference.

[snip]
 Timon
Jun 17 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
KennyTM~ wrote:
 On Jun 17, 11 22:41, Timon Gehr wrote:
 [snip]
 A small problem I can see with the purity validation algorithm proposed by >
Andrei
 relates to incremental builds (again):

 module A;
 int iampure(){
      return 5;
 }

 module B;
 int puretemplate()(){ // inferred pure
      return iampure();
 }

 int main(){
      import std.stdio;
      writeln(puretemplate(), puretemplate()); // compiler caches result
 }

 $ dmd -c A
 $ dmd -c B
 $ dmd A.o B.o; ./A
 55

 Some funny guy decides to change A.iampure:
 module A;
 int iampure(){
      static int x;
      return ++x;
 }

 $ dmd -c A
 $ dmd A.o B.o; ./A
 11

 Wrong! Should be 12.
That happens too if 'iampure' is a template or the compiler inline a function which should be modified, with or without caching. I don't see this as a problem of attribute inference.
It does not happen if 'iampure' is a template. You get the behavior the template provided before the change (something you understand quickly), not some strange mix based on 'wrong' assumptions of the compiler while performing *implicit* purity inference (which can be hard to track down). There is also no problem with inlining unless you pass -inline when performing incremental builds. Timon
Jun 17 2011
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
Walter Bright Wrote:

 http://drdobbs.com/blogs/tools/230700070
import std.stdio; int square_debug(int x) { writefln("square(%d)", x); return x * x; } pure alias int function(int) fp_t; pure int square(int x) { auto fp = cast(fp_t)&square_debug; return (*fp)(x); // dereference??? }
Jun 16 2011
parent reply Jimmy Cao <jcao219 gmail.com> writes:
On Thu, Jun 16, 2011 at 6:07 AM, Kagamin <spam here.lot> wrote:

 Walter Bright Wrote:

 http://drdobbs.com/blogs/tools/230700070
import std.stdio; int square_debug(int x) { writefln("square(%d)", x); return x * x; } pure alias int function(int) fp_t; pure int square(int x) { auto fp = cast(fp_t)&square_debug; return (*fp)(x); // dereference??? }
Yes, that's correct.
Jun 16 2011
parent reply Kagamin <spam here.lot> writes:
Jimmy Cao Wrote:

 pure int square(int x)
 {
    auto fp = cast(fp_t)&square_debug;
    return (*fp)(x); // dereference???
 }
Yes, that's correct.
You mean, implementation-defined?
Jun 16 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/16/2011 5:27 AM, Kagamin wrote:
 Jimmy Cao Wrote:

 pure int square(int x)
 {
     auto fp = cast(fp_t)&square_debug;
     return (*fp)(x); // dereference???
 }
Yes, that's correct.
You mean, implementation-defined?
It's typed as pointing to a pure function. So, it's callable from a pure function.
Jun 16 2011
parent reply Kagamin <spam here.lot> writes:
Walter Bright Wrote:

 You mean, implementation-defined?
It's typed as pointing to a pure function. So, it's callable from a pure function.
I mean that function pointers are not documented to allow dereference operator, that's why behavior of your code is implementation-dependent. You spend too much time writing in C++.
Jun 16 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/16/2011 11:59 PM, Kagamin wrote:
 I mean that function pointers are not documented to allow dereference
 operator, that's why behavior of your code is implementation-dependent. You
 spend too much time writing in C++.
I don't know what you mean.
Jun 17 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Jun 17, 11 15:32, Walter Bright wrote:
 On 6/16/2011 11:59 PM, Kagamin wrote:
 I mean that function pointers are not documented to allow dereference
 operator, that's why behavior of your code is
 implementation-dependent. You
 spend too much time writing in C++.
I don't know what you mean.
You should call the function pointer as return fp(x); not return (*fp)(x);
Jun 17 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter:

Thanks to Andrei Alexandrescu, Brad Roberts, David Held, and Bartosz Milewski
for their helpful comments and corrections on this post.<
*pulls Walter's ears* You have forgotten who has found this problem :-) Bye, bearophile
Jun 16 2011
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Walter:
 
 Thanks to Andrei Alexandrescu, Brad Roberts, David Held, and Bartosz Milewski
for their helpful comments and corrections on this post.<
*pulls Walter's ears* You have forgotten who has found this problem :-)
So have I. It was mentioned at the start of 2008 with respect to pure, but the same problem was mentioned with regard to CTFE much earlier, in 2006 I think. I think Daniel Keep was the first to mention it.
Jun 16 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/16/2011 6:29 AM, Don wrote:
 bearophile wrote:
 Walter:

 Thanks to Andrei Alexandrescu, Brad Roberts, David Held, and Bartosz Milewski
 for their helpful comments and corrections on this post.<
*pulls Walter's ears* You have forgotten who has found this problem :-)
So have I. It was mentioned at the start of 2008 with respect to pure, but the same problem was mentioned with regard to CTFE much earlier, in 2006 I think. I think Daniel Keep was the first to mention it.
The acknowledgments did not attempt to credit where the idea came from, just the people who made comments on a draft of that post. This thread is particularly relevant: http://www.digitalmars.com/d/archives/digitalmars/D/Temporarily_disable_all_purity_for_debug_prints_134460.html As for the problem, which is doing I/O in a pure function, that goes back decades.
Jun 16 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

On the bright side, inference is great - no more need for the user to bother
with annotations, the compiler takes care of it. Furthermore, the compiler may
opportunistically infer function attributes (pure, nothrow etc.) if it wants to
optimize a specific context.<
Thank you for your answers. This automatic purity algorithm looks able to implement a D "compiler tip" similar to the warning "-Wsuggest-attribute=pure" of GCC 4.6 (compiler tip != compiler warning): http://d.puremagic.com/issues/show_bug.cgi?id=5137 Bye, bearophile
Jun 17 2011