www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Purity

reply bearophile <bearophileHUGS lycos.com> writes:
http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

Bye,
bearophile
Dec 16 2010
next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
== Quote from bearophile (bearophileHUGS lycos.com)'s article
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/
 Bye,
 bearophile
The way I see it, there are 4 ways of describing the purity of a function. Though the comments on them are slightly vague and different to the way I'll describe it, the way I assume them to be treated in D <-> GCC is: Pure if the function has no effects except the return value, which depends only on parameters, and may read (but not write to) global memory. Constly Pure if the function does not read global memory (immutable data being the only notable exception because of the way it's treated) and has no effects except the return value. Weakly Pure if the function does not read global memory, but may have arbitrary side effects, such as a method altering it's local state. Impure if the function may write into global memory/have other side effects that won't appropriately categorize it into the above. Regards Iain
Dec 17 2010
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/
 
 Bye,
 bearophile
Nice. BTW you're incorrectly attributing some things to me. Although I recently championed the 'weakly pure' idea, I think Bruno made the original proposal.
Dec 17 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Don:

 BTW you're incorrectly attributing some things to me. Although I 
 recently championed the 'weakly pure' idea, I think Bruno made the 
 original proposal.
I am sorry, but It's easy to fix that error. I guess you mean Bruno Medeiros. Bye, bearophile
Dec 17 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 17/12/2010 09:41, Don wrote:
 bearophile wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
Nice. BTW you're incorrectly attributing some things to me. Although I recently championed the 'weakly pure' idea, I think Bruno made the original proposal.
Well, kinda... I started the thread which described the original proposal in detail, and described how it was actually sound and potentially useful: http://www.digitalmars.com/d/archives/digitalmars/D/Idea_partially_pure_functions_70762.html But the idea itself didn't originate from me, it was derived from something you said, (or at least inspired by something you said): "I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention." -- Bruno Medeiros - Software Engineer
Jan 27 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 17 Dec 2010 02:42:14 -0500, bearophile <bearophileHUGS lycos.com>  
wrote:

 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
This is a good post, you sort of lost me at the SPARK example, but I want to particularly note some inaccuracies that you have. At one point you have an example that creates memory, and say that two different calls to create memory with the same size could return the same array. This is not valid, you introduce an aliasing issue by optimizing. Only if the return is immutable can you do this. Another inaccuracy (really an omission) is that a weakly pure function is just like a pure function but cannot be memoized. In fact, it cannot be optimized in any way like strongly pure functions can. In fact, a weakly pure function is exactly like a normal function except it can only call pure functions, it can be called by a pure function (strong or weak), and it cannot access mutable globals. In particular, a function like this: pure int foo(ref int x) {return ++x;} cannot be factored or reordered, i.e. this doesn't work: foo(x) + foo(x) => 2 * foo(x) If you think about it, you can see why. This actually ties together nicely with my first point -- a pure function that returns a mutable pointer must be weakly pure. Your example which returns an int[] is returning such a pointer, so it is weakly pure. Therefore, this is why your example that creates memory cannot return the same array twice (that memoization optimization cannot legally be done). -Steve
Dec 17 2010
parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Fri, 17 Dec 2010 02:42:14 -0500, bearophile 
 <bearophileHUGS lycos.com> wrote:
 
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
You've got two statements here which are both sort-of true, but taken together are not correct: <1>
 Another inaccuracy (really an omission) is that a weakly pure function 
 is just like a pure function but cannot be memoized.  In fact, it cannot 
 be optimized in any way like strongly pure functions can. 
<2>
 This actually ties together nicely with my first point -- a pure 
 function that returns a mutable pointer must be weakly pure.  
A function which has immutably pure parameters can undergo *some* optimisation, even if the return value is a mutable pointer. For example, if the parameters are identical for both calls, you can do a deepdup of the first return value instead of calling the function again.
Dec 17 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 17 Dec 2010 09:39:19 -0500, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 17 Dec 2010 02:42:14 -0500, bearophile  
 <bearophileHUGS lycos.com> wrote:

 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
You've got two statements here which are both sort-of true, but taken together are not correct: <1>
 Another inaccuracy (really an omission) is that a weakly pure function  
 is just like a pure function but cannot be memoized.  In fact, it  
 cannot be optimized in any way like strongly pure functions can.
<2>
 This actually ties together nicely with my first point -- a pure  
 function that returns a mutable pointer must be weakly pure.
A function which has immutably pure parameters can undergo *some* optimisation, even if the return value is a mutable pointer. For example, if the parameters are identical for both calls, you can do a deepdup of the first return value instead of calling the function again.
Yes, I agree with this. However, I still believe statement 1 is still correct, because you really have just introduced a third class of pure functions that I was not aware of :) So we have: weak-pure functions : functions which can accept and return any type of values, but can only call pure functions and cannot access global state. middle-pure ? functions : weak-pure functions which accept only immutable or implicitly convertable to immutable values, but returns a mutable reference. strong-pure funtions : pure functions which accept and return only immutable or implicitly convertible to immutable values. There actually could be another class, const-pure functions: pure functions which accept and return only const or implicitly convertible to const values. These 4th class of functions could be classified as strong-pure when called with all immutable or implicitly convertible to immutable parameters. I suspect the compiler will need to classify things as strong or weak pure, and store various attributes on weak-pure functions to see what optimizations can be had. This shouldn't matter too much to the user, he should be oblivious to such optimizations. I think it's going to be very difficult to 'accidentally' create pure functions that could be optimized better. -Steve P.
Dec 17 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 17 Dec 2010 09:53:15 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:


 -Steve

 P.
Damn, accidentally hit shift-enter which sent the message. Meant to write: P.S. is there a better adjective for 'immutable or implicitly convertable to immutable'? That's a lot of verbiage for something that's really needed in a lot of places in the spec.
Dec 17 2010
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 17 Dec 2010 09:55:18 -0500, Steven Schveighoffer wrote:

 On Fri, 17 Dec 2010 09:53:15 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 
 
 -Steve

 P.
Damn, accidentally hit shift-enter which sent the message. Meant to write: P.S. is there a better adjective for 'immutable or implicitly convertable to immutable'? That's a lot of verbiage for something that's really needed in a lot of places in the spec.
Value type? That is, a real value type, not one that is really a reference in disguise. "Pure value type", perhaps? -Lars
Dec 17 2010
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 17 Dec 2010 15:22:25 +0000, Lars T. Kyllingstad wrote:

 On Fri, 17 Dec 2010 09:55:18 -0500, Steven Schveighoffer wrote:
 
 On Fri, 17 Dec 2010 09:53:15 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 
 
 -Steve

 P.
Damn, accidentally hit shift-enter which sent the message. Meant to write: P.S. is there a better adjective for 'immutable or implicitly convertable to immutable'? That's a lot of verbiage for something that's really needed in a lot of places in the spec.
Value type? That is, a real value type, not one that is really a reference in disguise. "Pure value type", perhaps?
Ah, forget it. That doesn't include immutable references. -Lars
Dec 17 2010
prev sibling parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Fri, 17 Dec 2010 09:39:19 -0500, Don <nospam nospam.com> wrote:
 
 A function which has immutably pure parameters can undergo *some* 
 optimisation, even if the return value is a mutable pointer.
 For example, if the parameters are identical for both calls, you can 
 do a deepdup of the first return value instead of calling the function 
 again.
Yes, I agree with this. However, I still believe statement 1 is still correct, because you really have just introduced a third class of pure functions that I was not aware of :) So we have: weak-pure functions : functions which can accept and return any type of values, but can only call pure functions and cannot access global state. middle-pure ? functions : weak-pure functions which accept only immutable or implicitly convertable to immutable values, but returns a mutable reference.
 strong-pure funtions : pure functions which accept and return only 
 immutable or implicitly convertible to immutable values.
 
 There actually could be another class, const-pure functions: pure 
 functions which accept and return only const or implicitly convertible 
 to const values.
The problem with these ones, is there could be aliasing between the input and the output. Do they have any interesting properties? Another interesting one is a function with all-const parameters, that returns a mutable reference that has only mutable or immutable members. (ie, nothing const). Like the one you've called middle-pure, this returns a unique result, because you're guaranteed there is no aliasing.
 I suspect the compiler will need to classify things as strong or weak 
 pure, and store various attributes on weak-pure functions to see what 
 optimizations can be had.
The compiler does that already.
Dec 17 2010
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 18 Dec 2010 01:27:40 -0500, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 17 Dec 2010 09:39:19 -0500, Don <nospam nospam.com> wrote:

 A function which has immutably pure parameters can undergo *some*  
 optimisation, even if the return value is a mutable pointer.
 For example, if the parameters are identical for both calls, you can  
 do a deepdup of the first return value instead of calling the function  
 again.
Yes, I agree with this. However, I still believe statement 1 is still correct, because you really have just introduced a third class of pure functions that I was not aware of :) So we have: weak-pure functions : functions which can accept and return any type of values, but can only call pure functions and cannot access global state. middle-pure ? functions : weak-pure functions which accept only immutable or implicitly convertable to immutable values, but returns a mutable reference.
 strong-pure funtions : pure functions which accept and return only  
 immutable or implicitly convertible to immutable values.
  There actually could be another class, const-pure functions: pure  
 functions which accept and return only const or implicitly convertible  
 to const values.
The problem with these ones, is there could be aliasing between the input and the output. Do they have any interesting properties?
They can act like strong-pure functions when all the arguments being passed in are IOICTI (immutable or implicitly convertable to immutable).
 Another interesting one is a function with all-const parameters, that  
 returns a mutable reference that has only mutable or immutable members.
 (ie, nothing const). Like the one you've called middle-pure, this  
 returns a unique result, because you're guaranteed there is no aliasing.
Yes.
 I suspect the compiler will need to classify things as strong or weak  
 pure, and store various attributes on weak-pure functions to see what  
 optimizations can be had.
The compiler does that already.
OK, good. -Steve
Dec 19 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 17/12/2010 14:39, Don wrote:
 <1>
 Another inaccuracy (really an omission) is that a weakly pure function
 is just like a pure function but cannot be memoized.  In fact, it
 cannot be optimized in any way like strongly pure functions can.
<2>
 This actually ties together nicely with my first point -- a pure
 function that returns a mutable pointer must be weakly pure.
A function which has immutably pure parameters can undergo *some* optimisation, even if the return value is a mutable pointer. For example, if the parameters are identical for both calls, you can do a deepdup of the first return value instead of calling the function again.
Are you sure? If I understood you correctly, that doesn't seem to be the case. Consider for example: string[] func(string arg) pure { string elem2 = "blah".idup; return [ arg, elem2 ]; } The compiler *cannot* know (well, looking at the signature only of course) how to properly deepdup the result from the first return value, so as to give the exact same result as if func was called again. Apologies if this has been discussed in some thread further ahead. -- Bruno Medeiros - Software Engineer
Jan 27 2011
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 string[] func(string arg) pure {
      string elem2 = "blah".idup;
      return [ arg, elem2 ];
 }

 The compiler *cannot* know (well, looking at the signature only of  
 course) how to properly deepdup the result from the first return value,  
 so as to give the exact same result as if func was called again.
Could you please elucidate, as I am unsure of your reasoning for saying the compiler cannot know how to deepdup the result. -- Simen
Jan 27 2011
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 27/01/2011 21:05, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 string[] func(string arg) pure {
 string elem2 = "blah".idup;
 return [ arg, elem2 ];
 }

 The compiler *cannot* know (well, looking at the signature only of
 course) how to properly deepdup the result from the first return
 value, so as to give the exact same result as if func was called again.
Could you please elucidate, as I am unsure of your reasoning for saying the compiler cannot know how to deepdup the result.
string str = "blah"; string[] var1 = func(str); string[] var2 = func(str); How can the compiler optimize the second call to func, the one that is assigned to var2, such that he deepdups var1 instead of calling func again? Which code would be generated? The compiler can't do that because of all the transitive data of var1, the compiler doesn't know which of it was newly allocated by func, and which of it was reused from func's parameters or some other global inputs. -- Bruno Medeiros - Software Engineer
Jan 28 2011
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 On 27/01/2011 21:05, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 string[] func(string arg) pure {
 string elem2 = "blah".idup;
 return [ arg, elem2 ];
 }

 The compiler *cannot* know (well, looking at the signature only of
 course) how to properly deepdup the result from the first return
 value, so as to give the exact same result as if func was called again.
Could you please elucidate, as I am unsure of your reasoning for saying the compiler cannot know how to deepdup the result.
string str = "blah"; string[] var1 = func(str); string[] var2 = func(str); How can the compiler optimize the second call to func, the one that is assigned to var2, such that he deepdups var1 instead of calling func again? Which code would be generated? The compiler can't do that because of all the transitive data of var1, the compiler doesn't know which of it was newly allocated by func, and which of it was reused from func's parameters or some other global inputs.
But for immutable data (like the contents of the elements of a string[]), that doesn't matter, does it? -- Simen
Jan 28 2011
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 28/01/2011 20:25, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 On 27/01/2011 21:05, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 string[] func(string arg) pure {
 string elem2 = "blah".idup;
 return [ arg, elem2 ];
 }

 The compiler *cannot* know (well, looking at the signature only of
 course) how to properly deepdup the result from the first return
 value, so as to give the exact same result as if func was called again.
Could you please elucidate, as I am unsure of your reasoning for saying the compiler cannot know how to deepdup the result.
string str = "blah"; string[] var1 = func(str); string[] var2 = func(str); How can the compiler optimize the second call to func, the one that is assigned to var2, such that he deepdups var1 instead of calling func again? Which code would be generated? The compiler can't do that because of all the transitive data of var1, the compiler doesn't know which of it was newly allocated by func, and which of it was reused from func's parameters or some other global inputs.
But for immutable data (like the contents of the elements of a string[]), that doesn't matter, does it?
Maybe it won't matter for the *contents of the elements* of the string array, but the whole result value has to be /the same/ as if the optimization was not applied. Otherwise the optimization is invalid, even if for most uses of the result value it would not make a difference for the program. -- Bruno Medeiros - Software Engineer
Feb 01 2011
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 But for immutable data (like the contents of the elements of a  
 string[]),
 that doesn't matter, does it?
Maybe it won't matter for the *contents of the elements* of the string array, but the whole result value has to be /the same/ as if the optimization was not applied. Otherwise the optimization is invalid, even if for most uses of the result value it would not make a difference for the program.
I admit to still not understanding this. The data can't be changed, so the contents do not matter. The array structs (prt/length) would not be the same as those fed to the function in any case, so I really cannot see how those would matter. If others do understand, please elucidate. -- Simen
Feb 01 2011
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 01/02/2011 14:15, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 But for immutable data (like the contents of the elements of a
 string[]),
 that doesn't matter, does it?
Maybe it won't matter for the *contents of the elements* of the string array, but the whole result value has to be /the same/ as if the optimization was not applied. Otherwise the optimization is invalid, even if for most uses of the result value it would not make a difference for the program.
I admit to still not understanding this. The data can't be changed, so the contents do not matter. The array structs (prt/length) would not be the same as those fed to the function in any case, so I really cannot see how those would matter. If others do understand, please elucidate.
Here is a synthetic example: string[] func(string arg) pure { string elem2 = "blah".idup; return [ arg, elem2 ]; } string str = "blah"; string[] result1 = func(str); string[] result2 = func(str); if(result2[0] is str) { // then } else { // else } In this code sample if the optimization is applied on the second call to func, it would cause different code with be executed: the else clause instead of the then clause. Obviously this is not acceptable for an optimization, even if such code would be rare in practice. -- Bruno Medeiros - Software Engineer
Feb 11 2011
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 In this code sample if the optimization is applied on the second call to  
 func, it would cause different code with be executed: the else clause  
 instead of the then clause. Obviously this is not acceptable for an  
 optimization, even if such code would be rare in practice.
Ah yes, I see now. I could argue that 'is' is cheating. :p I believe actually, that pure is only said to promise that the returned values should be such that func(args) == func(args), not accounting for overloaded operators. -- Simen
Feb 11 2011
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
result1 ==On 11/02/2011 22:51, Simen kjaeraas wrote:
 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:

 In this code sample if the optimization is applied on the second call
 to func, it would cause different code with be executed: the else
 clause instead of the then clause. Obviously this is not acceptable
 for an optimization, even if such code would be rare in practice.
Ah yes, I see now. I could argue that 'is' is cheating. :p I believe actually, that pure is only said to promise that the returned values should be such that func(args) == func(args), not accounting for overloaded operators.
That still would not be guaranteed. In the example I mentioned before, it already happens that (result1 == result2) == false I think that the concession that pure will be allowed to allocate memory does inescapably remove some of the guarantees that pure functions offer (like that one that the return value depends only on the arguments). One possible fix to this would be to say that the allocated memory must be temporary (used only during the execution of the pure function). Thus you would not be able to return any newly-allocated value. But I don't know if this further restriction is desirable or not. I don't remember if this aspect of memory allocation in pure functions was discussed/thought-out extensively or not. (it probably needs to) -- Bruno Medeiros - Software Engineer
Mar 23 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On Fri, 17 Dec 2010 02:42:14 -0500
bearophile <bearophileHUGS lycos.com> wrote:

 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/
=20
 Bye,
 bearophile
I take the opportunity to question the def of weakly pure. -1- How can this even compile? pure append (ref int[] a, int i) {a ~=3D i;} The _only_ task of this func is to change state. -2- Why is output writing considered an impure task? It has no influence on= the rest/future of the program, lets reasoning easy, does not touch state. I would like weakly pure to include output funcs, and exclude all possibili= ties to modify (non-local) state. Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 18 2010
next sibling parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
spir <denis.spir gmail.com> wrote:

 -1- How can this even compile?
 	pure append (ref int[] a, int i) {a ~= i;}
 The _only_ task of this func is to change state.
It is designed to be usable in strongly pure functions. A strongly pure function may call this weakly pure function without compromising its purity. This because the strongly pure function could only pass the weakly pure function its own local state, and changes would thus be limited to the strongly pure function's scope.
 -2- Why is output writing considered an impure task? It has no influence  
 on the rest/future of the program, lets reasoning easy, does not touch  
 state.
Output would be affected by pure optimizations: pure int writeStuff() { output( "Hi!" ); return 1; } void foo( ) { int a = writeStuff( ) + writeStuff( ); } Normally, the compiler could rewrite to int a = 2 * writeStuff( ); but that would fuck up output. -- Simen
Dec 18 2010
parent spir <denis.spir gmail.com> writes:
On Sat, 18 Dec 2010 12:30:31 +0100
"Simen kjaeraas" <simen.kjaras gmail.com> wrote:

 spir <denis.spir gmail.com> wrote:
=20
 -1- How can this even compile?
 	pure append (ref int[] a, int i) {a ~=3D i;}
 The _only_ task of this func is to change state.
=20 It is designed to be usable in strongly pure functions. A strongly pure function may call this weakly pure function without compromising its purity. This because the strongly pure function could only pass the weakly pure function its own local state, and changes would thus be limited to the strongly pure function's scope.
Right, thank you.
 -2- Why is output writing considered an impure task? It has no influenc=
e =20
 on the rest/future of the program, lets reasoning easy, does not touch =
=20
 state.
=20 Output would be affected by pure optimizations: =20 pure int writeStuff() { output( "Hi!" ); return 1; } =20 void foo( ) { int a =3D writeStuff( ) + writeStuff( ); } =20 Normally, the compiler could rewrite to int a =3D 2 * writeStuff( ); but that would fuck up output.
... which means the optimisation scheme is wrong for "action-funcs". Note: As said in // post, languages maybe should properly make a distinctio= n between action-functions & true functions (returning a result) --but I'm = aware this goes against the whole culture of the C-line languages. I consider funcs like this one, that both perform an effect and return a re= sult, as confusing, and even erroneous, design/style. Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 18 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 18/12/2010 08:52, spir wrote:
 On Fri, 17 Dec 2010 02:42:14 -0500
 bearophile<bearophileHUGS lycos.com>  wrote:

 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
I take the opportunity to question the def of weakly pure. -1- How can this even compile? pure append (ref int[] a, int i) {a ~= i;} The _only_ task of this func is to change state.
It is best not to think of D's pure as saying the function is pure in the traditional sense (side-effect free), but rather to say the side-effects are confined to the function's parameters only (and the state/data that is transitively reachable from them). The key benefit of this is that it allows the *caller* of such pure function to determine which subset of state/data *may* change by a call to such function. -- Bruno Medeiros - Software Engineer
Jan 27 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 18 December 2010 00:52:23 spir wrote:
 On Fri, 17 Dec 2010 02:42:14 -0500
 
 bearophile <bearophileHUGS lycos.com> wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/
 
 Bye,
 bearophile
I take the opportunity to question the def of weakly pure. -1- How can this even compile? pure append (ref int[] a, int i) {a ~= i;} The _only_ task of this func is to change state.
It's _weakly_ pure, not strongly pure. This pretty much just means that the function does not access global variables. No optimizations can be done to it. It's purely so that strongly pure (or other weakly pure) functions can legally call it. If it's a strongly pure function calling it, then not only do you know that global state can't be affected, but because all of the function's parameters are immutable (or full-on copies of mutable data in the case where it's a type implicitly convertible to/from immutable). Weakly pure functions are _not_ pure in the functional sense at all. Only strongly pure functions are.
 -2- Why is output writing considered an impure task? It has no influence on
 the rest/future of the program, lets reasoning easy, does not touch state.
 
 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state.
The problem is that output is accessing global variables - which weakly pure functions _cannot_ do. If you call a weakly pure function from a strongly pure function, it's still guranteed that the result of the strongly pure function will be the same and have _no_ side effects - that is, it effectively has the functional definition of purity. However, if you allowed I/O, you then have a side effect, and two calls to a particular strongly pure function could differ, and yet the optimizer would then only do one call, thereby changing the results of the program. Optimizing out calls pure functions should _never_ change the results of a program. Allowing output in weakly pure functions would allow that. Now, there is arguably some need for the ability to have _debug_ functions for printing in pure functions (be they weakly pure or strongly pure), but they'd likely be used in cases where pure optimizations were not used (since you wouldn't be using -O for debugging), and if they were used, the programmer would effectively be saying that they were okay with the I/O not necessarily occuring every time that a function call occured in the code (since the call could be optimized out). That is _not_ acceptible for general I/O. I/O breaks purity. The way to get around it is to use monads, but that would affect the signatures of every single function in the chain of function calls which included the I/O. So, you wouldn't normally want to do that. - Jonathan M Davis
Dec 18 2010
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On Sat, 18 Dec 2010 01:08:20 -0800
Jonathan M Davis <jmdavisProg gmx.com> wrote:

Thank you for the explanation about strongly pure funcs calling weakly pure=
 ones --this fully makes sense.

 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state. =20
=20 The problem is that output is accessing global variables - which weakly p=
ure=20
 functions _cannot_ do.
Why? What is the rationale for excluding output (I don't mean I/O, only O)?
 If you call a weakly pure function from a strongly pure=20
 function, it's still guranteed that the result of the strongly pure funct=
ion=20
 will be the same and have _no_ side effects - that is, it effectively has=
the=20
 functional definition of purity. However, if you allowed I/O, you then ha=
ve a=20
 side effect, and two calls to a particular strongly pure function could d=
iffer,=20
 and yet the optimizer would then only do one call, thereby changing the r=
esults=20
 of the program.=20
I disagree. Again, I'm not talking of I/O. The FP definition of "pure" excl= uding output is irrational imo. Writing to output cannot change later funct= ion return values, so who cares? And it's not only true for debug --even if I very much agree debug is a sup= er important reason for allowing output. Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 18 2010
next sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 12/18/10 12:52 PM, spir wrote:
 Why? What is the rationale for excluding output (I don't mean I/O, only O)?
Where would be the difference between writing to a global variable and writing to stdout? David
Dec 18 2010
parent spir <denis.spir gmail.com> writes:
On Sat, 18 Dec 2010 13:09:19 +0100
David Nadlinger <see klickverbot.at> wrote:

 On 12/18/10 12:52 PM, spir wrote:
 Why? What is the rationale for excluding output (I don't mean I/O, only=
O)?
=20
 Where would be the difference between writing to a global variable and=20
 writing to stdout?
How can writing to stdout change the result value of a later call to (the s= ame or any other) function? Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 18 2010
prev sibling parent reply Don <nospam nospam.com> writes:
spir wrote:
 On Sat, 18 Dec 2010 01:08:20 -0800
 Jonathan M Davis <jmdavisProg gmx.com> wrote:
 
 Thank you for the explanation about strongly pure funcs calling weakly pure
ones --this fully makes sense.
 
 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state.  
The problem is that output is accessing global variables - which weakly pure functions _cannot_ do.
Why? What is the rationale for excluding output (I don't mean I/O, only O)?
You're correct in saying that it doesn't affect the operation of the program. But in practice, program output is almost always important. For example, suppose we allowed output to be pure. Then consider: writeln("Hello, world!"); Since it returns nothing, and has no influence on the future execution of the program, the writeln can be dropped from the program. Hmmm....
Dec 18 2010
next sibling parent reply spir <denis.spir gmail.com> writes:
On Sat, 18 Dec 2010 13:46:11 +0100
Don <nospam nospam.com> wrote:

 spir wrote:
 On Sat, 18 Dec 2010 01:08:20 -0800
 Jonathan M Davis <jmdavisProg gmx.com> wrote:
=20
 Thank you for the explanation about strongly pure funcs calling weakly =
pure ones --this fully makes sense.
=20
 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state. =20
The problem is that output is accessing global variables - which weakl=
y pure=20
 functions _cannot_ do.
=20 Why? What is the rationale for excluding output (I don't mean I/O, only=
O)?
=20
 You're correct in saying that it doesn't affect the operation of the=20
 program. But in practice, program output is almost always important.
=20
 For example, suppose we allowed output to be pure. Then consider:
=20
 writeln("Hello, world!");
=20
 Since it returns nothing, and has no influence on the future execution=20
 of the program, the writeln can be dropped from the program.
=20
 Hmmm....
For sure, if you specify (in the compiler behaviour) that the only purpose = of a pure function is to return something. Which excudes all "action-foncti= ons" (=3D funcs which purpose is to perform an effect). But why should the = compiler rewrite of such a func be to erase it? Just let it alone, it won't= bother anybody. And can safely be qualified as pure. So that, transitively= , pure funcs can call funcs that perform output. Just what we want. pure Report getReport (Data data) { // compute report from data (only) log.write(report); writefln("report computed:\n\t%s", report); // debug return report; } Anyway, aside practicle & efficiency purposes, I'm even more annoyed by the= "conceptual wrongness" of excluding functions that perform output from the= "world of purity" ;-) I would want them included even if this did not buy = anything. All of this would be easier and clearer if D made a proper distinction betw= een 'true' functions (that compute a result) and 'action-functions' (that p= erform an effect). The actual purity criteria need not be the same: * actions must not write to state * functions must not read from state Just state: output ports and memory do not belong to program state, and we'= re done. There is an asymmetry: runtime/variable data input (from user, file,...) mu= st be considered as state, else a compiler cannot memoize function results. Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 18 2010
parent Don <nospam nospam.com> writes:
spir wrote:
 On Sat, 18 Dec 2010 13:46:11 +0100
 Don <nospam nospam.com> wrote:
 
 spir wrote:
 On Sat, 18 Dec 2010 01:08:20 -0800
 Jonathan M Davis <jmdavisProg gmx.com> wrote:

 Thank you for the explanation about strongly pure funcs calling weakly pure
ones --this fully makes sense.

 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state.  
The problem is that output is accessing global variables - which weakly pure functions _cannot_ do.
Why? What is the rationale for excluding output (I don't mean I/O, only O)?
You're correct in saying that it doesn't affect the operation of the program. But in practice, program output is almost always important. For example, suppose we allowed output to be pure. Then consider: writeln("Hello, world!"); Since it returns nothing, and has no influence on the future execution of the program, the writeln can be dropped from the program. Hmmm....
For sure, if you specify (in the compiler behaviour) that the only purpose of a pure function is to return something.
Yes, that's what's done.
 Which excudes all "action-fonctions" (= funcs which purpose is to perform an
effect). 
 But why should the compiler rewrite of such a func be to erase it? Just let it
alone, it won't bother anybody. 
But if you do that, you lose most of the benefits of pure.
And can safely be qualified as pure. So that, transitively, pure funcs 
can call funcs that perform output. Just what we want. You could do this -- but in order to preserve the usefulness of pure, you'd need to introduce some way of flagging such functions. The signature would need to indicate that it is a "action function".
 	pure Report getReport (Data data) {
 	    // compute report from data (only)
 	    log.write(report);
 	    writefln("report computed:\n\t%s", report);   // debug
 	    return report;
 	}
 
 Anyway, aside practicle & efficiency purposes, 
 I'm even more annoyed by the "conceptual wrongness" of excluding functions
that perform output from the "world of purity" ;-)
 I would want them included even if this did not buy anything.
I think you're assuming that including them costs nothing. But it does. Including such functions destroys most of the benefits of pure. You do need to treat them differently.
 All of this would be easier and clearer if D made a proper distinction between
'true' functions (that compute a result) and 'action-functions' (that perform
an effect). The actual purity criteria need not be the same:
 * actions must not write to state
 * functions must not read from state
 Just state: output ports and memory do not belong to program state, and we're
done.
Definitely you could do this. I suspect that it wouldn't be worth it, though, because action functions are viral. Once a function is an action function, everything which calls it is also an action function. And from an optimisation perspective, there isn't much you can do with action functions.
 There is an asymmetry: runtime/variable data input (from user, file,...) must
be considered as state, else a compiler cannot memoize function results.
It also can't memoize the results of functions which call action functions. But aside from all of this, I'm actually rather sceptical that output can be done without using static variables. I cannot imagine how it could be done. It is an interesting point, that is particularly obvious at a low level. For example, in DOS text mode, writing 'a' to 0xB000 wrote an 'a' in the top left corner of the screen, even if the function never reads that address. But writing 'a' to 0x5000 is not output! Currently, a pure function cannot write to _any_ static variable. This disallows a few obscure functions which could be considered to be logically pure. Ultimately, I think your definition of 'pure function' would be: never reads from a static variable, and never writes to a static variable which is ever read by another function; and 'pure action function' would be: writes to an output port, or to a never-read static variable which is designated as an output. Again, although this could be done, I doubt that it's worth it. The costs seem high and the benefits low.
Dec 18 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-12-18 07:46:11 -0500, Don <nospam nospam.com> said:

 spir wrote:
 On Sat, 18 Dec 2010 01:08:20 -0800
 Jonathan M Davis <jmdavisProg gmx.com> wrote:
 
 Thank you for the explanation about strongly pure funcs calling weakly 
 pure ones --this fully makes sense.
 
 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state.
The problem is that output is accessing global variables - which weakly pure functions _cannot_ do.
Why? What is the rationale for excluding output (I don't mean I/O, only O)?
You're correct in saying that it doesn't affect the operation of the program. But in practice, program output is almost always important. For example, suppose we allowed output to be pure. Then consider: writeln("Hello, world!"); Since it returns nothing, and has no influence on the future execution of the program, the writeln can be dropped from the program.
I agree with that for writeln, the function at global scope, can't be pure. The reason is simple: it refers to the stdout global variable. However, if someone was to pass stdout as a non-const function parameter I think it should work: pure void fun(File output, string msg) { output.writeln(msg); } void main() { fun(stdout, msg); } File.writeln cannot be strongly pure since it gets the File as a reference (and so it can't be optimized away), but it can be weakly pure. One could argue that this is not terribly useful since no strongly pure functions will be able to do IO, but there might be other reasons one would want a weakly pure function other than enabling strongly pure ones. (For one instance of this, see my two posts in the other discussion "Threads and static initialization".) -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Dec 18 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 18/12/2010 12:46, Don wrote:
 spir wrote:
 On Sat, 18 Dec 2010 01:08:20 -0800
 Jonathan M Davis <jmdavisProg gmx.com> wrote:

 Thank you for the explanation about strongly pure funcs calling weakly
 pure ones --this fully makes sense.

 I would like weakly pure to include output funcs, and exclude all
 possibilities to modify (non-local) state.
The problem is that output is accessing global variables - which weakly pure functions _cannot_ do.
Why? What is the rationale for excluding output (I don't mean I/O, only O)?
You're correct in saying that it doesn't affect the operation of the program. But in practice, program output is almost always important. For example, suppose we allowed output to be pure. Then consider: writeln("Hello, world!"); Since it returns nothing, and has no influence on the future execution of the program, the writeln can be dropped from the program. Hmmm....
Hum, it might still be useful to have something like a compiler switch that disables pure altogether, then people could use I/O and other non-pure operations for debugging purposes. One could wrap such code with a version statement: void myPurefunc(Foo foo) pure { version(pure_disabled) { writeln("some debug info: ", foo); } //... -- Bruno Medeiros - Software Engineer
Feb 11 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Bruno Medeiros:

 Hum, it might still be useful to have something like a compiler switch 
 that disables pure altogether, then people could use I/O and other 
 non-pure operations for debugging purposes. One could wrap such code 
 with a version statement:
 
 void myPurefunc(Foo foo) pure {
    version(pure_disabled) {
      writeln("some debug info: ", foo);
    }
    //...
This seems an interesting idea to help debug pure functions. Bye, bearophile
Feb 11 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 you sort of lost me at the SPARK example,
I don't like C++ a lot because it's sometimes too much hard to understand/remember for me, and I don't like C much because it's too much bug prone for me. I think D shares some of the design philosophy of Ada (despite D2 is more bug-prone and less defined/deterministic than Ada).
 but I want to particularly note some inaccuracies that you have.
Going to be fixed. Bye, bearophile
Dec 18 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/17/10 1:42 AM, bearophile wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
This is an excellent article, which is reflected in high reddit vote and a spirited but troll-free discussion. A homerun! We need more of this stuff. Congratulations, Leonardo! Andrei
Dec 18 2010
parent Walter Bright <newshound2 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 12/17/10 1:42 AM, bearophile wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
This is an excellent article, which is reflected in high reddit vote and a spirited but troll-free discussion. A homerun! We need more of this stuff. Congratulations, Leonardo!
I agree. Nice job!
Dec 18 2010
prev sibling next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 17/12/2010 07:42, bearophile wrote:
 http://www.reddit.com/r/programming/comments/enajl/purity_in_d_language/

 Bye,
 bearophile
"pure functions are contravariant (a "subset") with impure functions" Nope, there is a mistake here, it is the other way around: pure functions are covariant with impure functions. This is even mentioned in the beginning of the article: "a pure function: [...] - Is covariant with an impure function;" -- Bruno Medeiros - Software Engineer
Jan 27 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bruno Medeiros:

 I think that the concession that pure will be allowed to allocate memory 
 does inescapably remove some of the guarantees that pure functions offer 
 (like that one that the return value depends only on the arguments).
 One possible fix to this would be to say that the allocated memory must 
 be temporary (used only during the execution of the pure function). Thus 
 you would not be able to return any newly-allocated value. But I don't 
 know if this further restriction is desirable or not. I don't remember 
 if this aspect of memory allocation in pure functions was 
 discussed/thought-out extensively or not. (it probably needs to)
I have discussed this a bit with Steven Schveighoffer, see the transparent attribute, in this "Uh... destructors?" thread: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=130554 Bye, bearophile
Mar 24 2011