www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Do pure functions solve the "return const" problems?

reply Russell Lewis <webmaster villagersonline.com> writes:
I've been pondering the discussions about "return const" and such. 
Basically, the idea is to be able to define a function which is not 
allowed to modify certain data, but which can return subsets of that 
data in a non-const fashion.  Classic examples are strstr (scan, but 
don't alter, the data, but return a usable pointer) and min/max (compare 
the items and return one, but return non-const versions of them).

Are pure functions the way?  Isn't a pure function required to not 
modify the data?  And, for that reason, can't a pure function be utterly 
ignorant of the const-ness of the data?


BEGIN CODE
	pure char[] strstr(char[] haystack, char[] needle)
	{
		// perform the search
		int indx = <whatever>

		return haystack[indx..$];
	}
END CODE


Of course, somebody's going to post an example of a function which is 
inherently non-pure but which needs this sort of functionality.  I can't 
think of exactly what it would be, but let's assume that somebody does. 
  In this case, a "pure delegate" could handle it:


BEGIN CODE
	char[] pure delegate(char[])
	myBadlyBehavedFunction(string input, ....other args...)
	{
		int indx = <whatever>;

		return pure delegate char[](char[] input)
		       {
		         return input[indx..$];
		       };
	}
END CODE


So, the function which makes the decision isn't pure, but the delegate 
that it returns *is*, so a caller can happily and confidently do:


BEGIN CODE
	void foo()
	{
		char[] thing = <my critical data>;
		char[] subset = myBadlyBehavedFunction(thing) (thing);
	}
END CODE


If this was important enough of a pattern, we could even give syntax 
sugar for it.


Thoughts?
	Russ
Apr 01 2008
next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 01/04/2008, Russell Lewis <webmaster villagersonline.com> wrote:
  BEGIN CODE
         pure char[] strstr(char[] haystack, char[] needle)
         {
                 // perform the search
                 int indx = <whatever>

                 return haystack[indx..$];
         }
  END CODE

  Thoughts?
         Russ
Just one... char[] s; string t; auto u = strstr(s,t); What type is u? (The alternative solution proposed by Stephen and myself solves this problem completely, by the way).
Apr 01 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Russell Lewis" wrote
 I've been pondering the discussions about "return const" and such. 
 Basically, the idea is to be able to define a function which is not 
 allowed to modify certain data, but which can return subsets of that data 
 in a non-const fashion.  Classic examples are strstr (scan, but don't 
 alter, the data, but return a usable pointer) and min/max (compare the 
 items and return one, but return non-const versions of them).

 Are pure functions the way?  Isn't a pure function required to not modify 
 the data?  And, for that reason, can't a pure function be utterly ignorant 
 of the const-ness of the data?


 BEGIN CODE
 pure char[] strstr(char[] haystack, char[] needle)
 {
 // perform the search
 int indx = <whatever>

 return haystack[indx..$];
 }
 END CODE
Having never used pure functions, I'm not sure that this is the case for C++'s pure functions, but I thought pure functions for D would require invariant arguments, no? Otherwise, how do you guarantee another thread doesn't come along and munge the data while the pure function is running? -Steve
Apr 01 2008
parent reply Russell Lewis <webmaster villagersonline.com> writes:
Steven Schveighoffer wrote:
 Having never used pure functions, I'm not sure that this is the case for 
 C++'s pure functions, but I thought pure functions for D would require 
 invariant arguments, no?  Otherwise, how do you guarantee another thread 
 doesn't come along and munge the data while the pure function is running?
Interesting question, I don't really know the answer. However, it seems to me that "pure" is a statement that you are making about the function, not necessarily about the caller. That is, "pure" means "this function does not have side-effects, so use it with impunity," whereas "invariant" means "the caller must obey rules that the function is relying on." Put another way, saying that a function is "pure" allows the caller to make certain optimizations. Saying that the arguments are invariant allows the pure function to optimize internally. I think that it might be quite possible to have a pure function which doesn't require invariant arguments...it would just lose out on internal optimization opportunities.
Apr 01 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Russell Lewis wrote:
 Steven Schveighoffer wrote:
 Having never used pure functions, I'm not sure that this is the case 
 for C++'s pure functions, but I thought pure functions for D would 
 require invariant arguments, no?  Otherwise, how do you guarantee 
 another thread doesn't come along and munge the data while the pure 
 function is running?
Interesting question, I don't really know the answer. However, it seems to me that "pure" is a statement that you are making about the function, not necessarily about the caller. That is, "pure" means "this function does not have side-effects, so use it with impunity," whereas "invariant" means "the caller must obey rules that the function is relying on." Put another way, saying that a function is "pure" allows the caller to make certain optimizations. Saying that the arguments are invariant allows the pure function to optimize internally.
Right. If it's pure *it* doesn't modify its arguments (or any other data that isn't local to the function). That enables you to run N copies of that function in parallel without worrying. Or N copies of any combination of pure functions without worrying. Keeping other threads from mucking with the data referred to by the pure functions is the user's responsibility.
 I think that it might be quite possible to have a pure function which 
 doesn't require invariant arguments...it would just lose out on internal 
 optimization opportunities.
Maybe. I'm not sure what those extra optimizations would be, though. I don't believe that code generation usually takes into account the possibility that some other thread might be clobbering your data. If there is another thread clobbering your data, that's your fault for not mutexing properly. Invariant may allow you to eliminate a data-protecting mutex, but I'm not sure how that helps the compiler. Maybe synchronized blocks' mutexes could automatically be made no-ops using knowledge of invariant? --bb
Apr 01 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Bill Baxter" wrote
 Russell Lewis wrote:
 Steven Schveighoffer wrote:
 Having never used pure functions, I'm not sure that this is the case for 
 C++'s pure functions, but I thought pure functions for D would require 
 invariant arguments, no?  Otherwise, how do you guarantee another thread 
 doesn't come along and munge the data while the pure function is 
 running?
Interesting question, I don't really know the answer. However, it seems to me that "pure" is a statement that you are making about the function, not necessarily about the caller. That is, "pure" means "this function does not have side-effects, so use it with impunity," whereas "invariant" means "the caller must obey rules that the function is relying on." Put another way, saying that a function is "pure" allows the caller to make certain optimizations. Saying that the arguments are invariant allows the pure function to optimize internally.
Right. If it's pure *it* doesn't modify its arguments (or any other data that isn't local to the function). That enables you to run N copies of that function in parallel without worrying. Or N copies of any combination of pure functions without worrying. Keeping other threads from mucking with the data referred to by the pure functions is the user's responsibility.
 I think that it might be quite possible to have a pure function which 
 doesn't require invariant arguments...it would just lose out on internal 
 optimization opportunities.
Maybe. I'm not sure what those extra optimizations would be, though. I don't believe that code generation usually takes into account the possibility that some other thread might be clobbering your data. If there is another thread clobbering your data, that's your fault for not mutexing properly. Invariant may allow you to eliminate a data-protecting mutex, but I'm not sure how that helps the compiler. Maybe synchronized blocks' mutexes could automatically be made no-ops using knowledge of invariant?
Like I said, never used pure functions before. So what you are saying is that the purpose of pure functions is to allow parallelization within the context of a SINGLE thread? So multiprogramming has nothing to do with multi-threading? I always thought Walter's grand vision was to use pure functions to make multi-threaded programming easier... -Steve
Apr 02 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Steven Schveighoffer wrote:
 Like I said, never used pure functions before.  So what you are saying is 
 that the purpose of pure functions is to allow parallelization within the 
 context of a SINGLE thread?  So multiprogramming has nothing to do with 
 multi-threading?

 I always thought Walter's grand vision was to use pure functions to make 
 multi-threaded programming easier...

 -Steve 


   
after reading this whole thread I think I get your point and I agree with it. However, I have a few remaining questions: 1) since the purpose of pure functions as I understand it is thread-safe functions, than why should we care if they have side affects as long as they are thread-safe? what I basically want to know is whether the following scenario should be allowed: a "pure" function contains a thread-safe side affect like changing some global variable inside a synchronized() block. 2) why do we need read only view (aka const) at all if const is logical? isn't const equivalent with partial purity (the code the access a const variable is pure)? confused, Yigal
Apr 02 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/04/2008, Yigal Chripun <yigal100 gmail.com> wrote:
  1) since the purpose of pure functions as I understand it is thread-safe
  functions,
It isn't. A pure function is a function with no side effects.
 than why should we care if they have side affects
See above. Knowing that a function f is pure allows the compiler to make optimisations it cannot make with non-pure functions. For example: x = f() + f(); can be optimised to x = 2 * f(); thereby eliminating one entire function call. This makes your code go faster. Pure functions also happen to be thread-safe, but not all thread-safe functions are pure.
Apr 02 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 On 02/04/2008, Yigal Chripun wrote:
  1) since the purpose of pure functions as I understand it is thread-safe
  functions,
It isn't. A pure function is a function with no side effects.
 than why should we care if they have side affects
See above. Knowing that a function f is pure allows the compiler to make optimisations it cannot make with non-pure functions. For example: x = f() + f(); can be optimised to x = 2 * f(); thereby eliminating one entire function call. This makes your code go faster. Pure functions also happen to be thread-safe, but not all thread-safe functions are pure.
That was my impression too, but read Bill's post closely. What he is saying is that pure functions do not need invariant parameters, which means they are subject to thread issues. If pure functions are not required to take invariant parameters, then they cannot be guaranteed to be thread safe. -Steve
Apr 02 2008
next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 02/04/2008, Steven Schveighoffer <schveiguy yahoo.com> wrote:
  If pure functions are not required to take invariant parameters, then they
  cannot be guaranteed to be thread safe.
Well, that's an "if". My suspicion is that pure functions will require invariant parameters. According to Wikipedia, in functional languages, all data is invariant, so that makes sense.
Apr 02 2008
prev sibling next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Steven Schveighoffer wrote:
 "Janice Caron" wrote
 On 02/04/2008, Yigal Chripun wrote:
  1) since the purpose of pure functions as I understand it is thread-safe
  functions,
It isn't. A pure function is a function with no side effects.
 than why should we care if they have side affects
See above. Knowing that a function f is pure allows the compiler to make optimisations it cannot make with non-pure functions. For example: x = f() + f(); can be optimised to x = 2 * f(); thereby eliminating one entire function call. This makes your code go faster. Pure functions also happen to be thread-safe, but not all thread-safe functions are pure.
That was my impression too, but read Bill's post closely. What he is saying is that pure functions do not need invariant parameters, which means they are subject to thread issues. If pure functions are not required to take invariant parameters, then they cannot be guaranteed to be thread safe.
Well I of course am just guessing about what will and will not be allowed in a pure function inside a procedural language. I don't think there's ever been such a beast before, so no one can really say for sure. I guess the question is how much extra checking the compiler will do for you. Currently we think 'pure' will cause the compiler to prevent accesses to globals. If you pass in a const object and only * access it's local data members in a read-only way, or * call methods on it declared pure, then that should be enough to guarantee that the purity of the function is not lost. That would allow 5 copies of the function to be run in parallel or in any order, or to change f(c)+f(c) into 2*f(c). But it would not guarantee that the function would do the right thing if there were other threads mucking with argument in the background. However, the failure there would not be due to lack of purity on the part of the function. You could say it's not the function's fault that memory values are changing out from under it. Or you could say it *is* the function's fault for relying on memory values that may change. --bb
Apr 02 2008
prev sibling parent reply guslay <guslay gmail.com> writes:
Steven Schveighoffer Wrote:

 "Janice Caron" wrote
 Pure functions also happen to be thread-safe, but not all thread-safe
 functions are pure.
That was my impression too, but read Bill's post closely. What he is saying is that pure functions do not need invariant parameters, which means they are subject to thread issues. If pure functions are not required to take invariant parameters, then they cannot be guaranteed to be thread safe. -Steve
Just to complement: If f is pure, and a is invariant. f(a)+f(a) is equivalent to 2 * f(a). If a is not invariant, a can be modified by f(). However, that is not a side effect: a is considered as an output of f(). f can still called be pure. There is a dependency between the two f(a) that prevents reordering or caching or threading first call : f(a) produces a1. second call : f(a1) produces a2. The thing with pure function is that just by looking at the nature of the parameters, you know how coupled they are.
Apr 02 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/04/2008, guslay <guslay gmail.com> wrote:
  If a is not invariant, a can be modified by f().
Did you mean to say "const" there? If a is not /const/, a can be modified by f(). There is no way that f(a) can modify a if a is const. That's what const means.
 However, that is not a side effect: a is considered as an output of f(). f can
still called be pure.
I'd but huge amounts of money on the notion that that's not so.
  There is a dependency between the two f(a) that prevents reordering or
caching or threading
Then f is not pure.
  The thing with pure function is that just by looking at the nature of the
parameters, you know how coupled they are.
No, the thing with pure functions is, they are not coupled at all.
Apr 02 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 02/04/2008, guslay <guslay gmail.com> wrote:
  If a is not invariant, a can be modified by f().
Did you mean to say "const" there? If a is not /const/, a can be modified by f(). There is no way that f(a) can modify a if a is const. That's what const means.
 However, that is not a side effect: a is considered as an output of f(). f can
still called be pure.
I'd but huge amounts of money on the notion that that's not so.
Pure functions cannot have out or ref parameters? Ref might be an issue for reference types, since you're supposed to be able to memoize based on the bits on the stack passed in as arguments. For value types, it's doable. Out parameters should certainly be supported.
Apr 03 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 03/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Pure functions cannot have out or ref parameters?
I'm only guessing, but I would guess no.
  Out parameters should certainly be supported.
If I've understood this correctly, in pure functions, you don't return anything via the parameters. Everything you return, you do through the return statement. That means, instead of void f(in int a, out float b, inout string s) { ... } f(a,b,s); you'd do import std.typecons; // for Tuple pure Tuple!(float,string) f(int a, string s) { } auto t = f(a,s); b = t._0; s = t._1;
Apr 03 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 03/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Pure functions cannot have out or ref parameters?
I'm only guessing, but I would guess no.
  Out parameters should certainly be supported.
If I've understood this correctly, in pure functions, you don't return anything via the parameters. Everything you return, you do through the return statement.
Fine, assuming pure arrives after multiple return values.
Apr 03 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 04/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Fine, assuming pure arrives after multiple return values.
Well that's a done deal, since multiple return values arrived a few releases back. Witness my prior example, but here's another one: import std.typecons; Tuple!(int,int,int,double) f() { return Tuple!(int,int,int,double)(10, 20, 42, 3.14); } auto t = f(); auto x = t._0; // 10 auto y = t._1; // 20 auto z = t._2; // 42 auto g = t._3; // 3.14
Apr 03 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 04/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Fine, assuming pure arrives after multiple return values.
Well that's a done deal, since multiple return values arrived a few releases back. Witness my prior example, but here's another one: import std.typecons; Tuple!(int,int,int,double) f() { return Tuple!(int,int,int,double)(10, 20, 42, 3.14); } auto t = f(); auto x = t._0; // 10 auto y = t._1; // 20 auto z = t._2; // 42 auto g = t._3; // 3.14
Okay, didn't realize that. Still, there shouldn't be any difference between out parameters and return values.
Apr 04 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 Still, there shouldn't be any difference between
 out parameters and return values.
I think there should. Suppose f is declared: pure int f(out int n); Now consider the expression: f(x) + f(x) See the problem? Expressed in purely functional terms, you would have to rewrite the code as: alias Tuple!(int,int) Pair; pure Pair f(); pure Pair g(Pair lhs, Pair rhs); where g() takes the place of +. Now the expression becomes g(f(),f()) which doesn't suffer the same problem.
Apr 04 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 Still, there shouldn't be any difference between
 out parameters and return values.
I think there should. Suppose f is declared: pure int f(out int n); Now consider the expression: f(x) + f(x) See the problem?
That might be an argument for removing out parameters -- it looks like f(x) + f(x) is the same as 2 * f(x). The compiler could tell the difference, so there's no problem on that side. It could basically rewrite the pretty f(x) + f(x) as the mess you recommended. You could require the 'out' qualifier at the call site to make it clear what's going on when you're reading the code. That's probably a better solution.
Apr 05 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  That might be an argument for removing out parameters -- it looks like f(x)
 + f(x) is the same as 2 * f(x). The compiler could tell the difference, so
 there's no problem on that side. It could basically rewrite the pretty f(x)
 + f(x) as the mess you recommended.
I'm afraid it can't because + isn't defined for Tuple!(int,int). Even if the compiler could deduce what the declaration of g would have to be, it still couldn't write the body.
Apr 05 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  That might be an argument for removing out parameters -- it looks like f(x)
 + f(x) is the same as 2 * f(x). The compiler could tell the difference, so
 there's no problem on that side. It could basically rewrite the pretty f(x)
 + f(x) as the mess you recommended.
I'm afraid it can't because + isn't defined for Tuple!(int,int). Even if the compiler could deduce what the declaration of g would have to be, it still couldn't write the body.
Yes, it could. You're explicitly telling it which return value you want to use by which is listed at the function's return value rather than an out parameter.
Apr 05 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Yes, it could. You're explicitly telling it which return value you want to
 use by which is listed at the function's return value rather than an out
 parameter.
You're not getting this. If two values are being returned, the compiler can't just throw one of them away. In the general case, it doesn't have enough information to know what to do.
Apr 05 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 05/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Yes, it could. You're explicitly telling it which return value you want to
 use by which is listed at the function's return value rather than an out
 parameter.
You're not getting this. If two values are being returned, the compiler can't just throw one of them away. In the general case, it doesn't have enough information to know what to do.
You're not getting this. int f (out int i); int x; int y = f(x) + f(x); This already works. Why couldn't pure functions do this? You can't tell me. You aren't arguing. You are just stating that the compiler can't do something that it already does. If the compiler supports memoization of pure functions based on parameters, it could easily exclude out parameters. If the compiler requires that all inputs to a pure function be scope invariant, out parameters could be excluded from that requirement. You come up with good ideas, but it happens every day that you argue about someone else's idea without giving their point of view a reasonable amount of thought. You assume that they are wrong and only accept information that supports that conclusion. It is possible to change your mind, but it would be nice if you tried arguing fairly more often.
Apr 06 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 07/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  This already works. Why couldn't pure functions do this? You can't tell me.
 You aren't arguing. You are just stating that the compiler can't do
 something that it already does.
Allow me to demonstrate: int f(out int x) pure { x = 1; return 0; } int g(out int x) pure { x = 2; return 0; } int h() pure { int n; int t = f(n) + g(n); return n + t; } Question: What does h return? The problem is, it depends on evaluation order. If f(n) is evaluated last, h() will return 1, but if g(n) is evalutated last, h() will return 2. Pure functions are not allowed to depend on evaluation order, therefore f() and g() cannot be pure - and hence, neither can h(). My conclusion is that you can't have out parameters in pure functions.
Apr 09 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 07/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  This already works. Why couldn't pure functions do this? You can't tell me.
 You aren't arguing. You are just stating that the compiler can't do
 something that it already does.
Allow me to demonstrate: int f(out int x) pure { x = 1; return 0; } int g(out int x) pure { x = 2; return 0; } int h() pure { int n; int t = f(n) + g(n); return n + t; } Question: What does h return? The problem is, it depends on evaluation order. If f(n) is evaluated last, h() will return 1, but if g(n) is evalutated last, h() will return 2. Pure functions are not allowed to depend on evaluation order, therefore f() and g() cannot be pure - and hence, neither can h(). My conclusion is that you can't have out parameters in pure functions.
Yay! I got you to argue! But you seem to be saying that, if you remove 'pure' from all those functions, the result will be undefined. Is this the case?
Apr 09 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  But you seem to be saying that, if you remove 'pure' from all those
 functions, the result will be undefined. Is this the case?
The order of evaluation is undefined. The result depends on the order of evaluation. In fact, according to the D docs, it's even possible for the same function to give different results depending on whether it's evaluated at compile-time or run-time, if you rely on "dependency on implementation defined order of evaluation".
Apr 09 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 10/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  But you seem to be saying that, if you remove 'pure' from all those
 functions, the result will be undefined. Is this the case?
The order of evaluation is undefined. The result depends on the order of evaluation. In fact, according to the D docs, it's even possible for the same function to give different results depending on whether it's evaluated at compile-time or run-time, if you rely on "dependency on implementation defined order of evaluation".
So you're saying that undefined behavior won't magically become defined with pure, so the feature that can cause undefined behavior should not be allowed with pure. Why? Why wouldn't that merely be undefined behavior when you add pure?
Apr 10 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Why? Why wouldn't that merely be undefined behavior when you add pure?
Part of the contract for purity is that the result *MUST* be independent of the order of evaluation of pure functions. As a general rule, given h(f(),g()) The compiler is free to implement this as either temp1 = f(); temp2 = g(); h(temp1,temp2) or as temp2 = g(); temp1 = f(); h(temp1,temp2) or even as previouslyCachedValueFromLastTimeAround That meas, if D's "pure" keyword is intended to give us functional programming, then it's not going to let us have "out" parameters. The way to return values is through the return statement.
Apr 10 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 10/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  Why? Why wouldn't that merely be undefined behavior when you add pure?
Part of the contract for purity is that the result *MUST* be independent of the order of evaluation of pure functions. That meas, if D's "pure" keyword is intended to give us functional programming, then it's not going to let us have "out" parameters. The way to return values is through the return statement.
The results of these functions is independent of the order in which they are called. This would not be true with inout parameters. Of course, you can fake inout parameters with repeated arguments. It should, however, suffice to say that such things are undefined behavior, as they currently are.
Apr 10 2008
parent "Janice Caron" <caron800 googlemail.com> writes:
On 11/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 The results of these functions is independent of the order in which they are
called.
How do you define "results"? I ask because, if you mean "that which is returned through the return statement", then maybe so, but if you include the out variables themselves as part of the "results" then that statement is false, as my previous counterexample shows.
  It should, however, suffice to say that such things are undefined behavior,
 as they currently are.
Look at it another way: int f(out int r, int n) { r = something; return somethingElse; } is exactly equivalent to: int f(int* r, int n) { *r = something; return somethingElse; } The latter would not be considered a pure function. Therefore, nor would the former.
Apr 11 2008
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Christopher Wright wrote:
 
 You're not getting this.
 
 int f (out int i);
 int x;
 int y = f(x) + f(x);
 
 This already works. Why couldn't pure functions do this? You can't tell 
 me. You aren't arguing. You are just stating that the compiler can't do 
 something that it already does. If the compiler supports memoization of 
 pure functions based on parameters, it could easily exclude out 
 parameters. If the compiler requires that all inputs to a pure function 
 be scope invariant, out parameters could be excluded from that requirement.
Changing an out parameter is a side effect. Pure functions cannot have side effects (ie, they cannot change state outside of their scope). If they did, it would break the purpose for which they are intended. What more is there to get? -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 26 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Bruno Medeiros wrote:
 Christopher Wright wrote:
 You're not getting this.

 int f (out int i);
 int x;
 int y = f(x) + f(x);

 This already works. Why couldn't pure functions do this? You can't 
 tell me. You aren't arguing. You are just stating that the compiler 
 can't do something that it already does. If the compiler supports 
 memoization of pure functions based on parameters, it could easily 
 exclude out parameters. If the compiler requires that all inputs to a 
 pure function be scope invariant, out parameters could be excluded 
 from that requirement.
Changing an out parameter is a side effect. Pure functions cannot have side effects (ie, they cannot change state outside of their scope). If they did, it would break the purpose for which they are intended. What more is there to get?
An out parameter is a return value. What else is there to get?
Apr 26 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  An out parameter is a return value. What else is there to get?
An out parameter is global state. Sometimes literally: void foo(out int x, int y) { x = y; } foo(globalVariable, 3); Also, as has been repeatedly pointed out, one of the benefits of pure is that given an expression like f(x,y) + g(x,y) the compiler is free to evaluate f first then g, or g first then f, at its discression. It is also free not to call either of these functions at all if it has a previously cached result handy. Out parameters must be forbidden, or those assumptions fail. If you want to return multiple values, use a tuple. import std.typecons; Tuple!(int,double) f(int x, double y) pure { return tuple(x+1, y+1); } It's not rocket science.
Apr 26 2008
next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  An out parameter is a return value. What else is there to get?
An out parameter is global state. Sometimes literally: void foo(out int x, int y) { x = y; } foo(globalVariable, 3);
Then the calling function isn't pure! I could also do: int foo(int y) { return y; } globalVariable = foo(3);
 Also, as has been repeatedly pointed out, one of the benefits of pure
 is that given an expression like
 
     f(x,y) + g(x,y)
 
 the compiler is free to evaluate f first then g, or g first then f, at
 its discression. It is also free not to call either of these functions
 at all if it has a previously cached result handy. Out parameters must
 be forbidden, or those assumptions fail.
Then so must return values!
Apr 27 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 An out parameter is global state. Sometimes literally:

    void foo(out int x, int y) { x = y; }

    foo(globalVariable, 3);
Then the calling function isn't pure!
The function itself isn't pure! foo() isn't pure!
  I could also do:
  int foo(int y) { return y; }
  globalVariable = foo(3);
That's irrelevant. globalVariable is assigned /after/ the function returned (not during function execution, as would be the case with an out parameter). Assigning a global variable *during function execution* violates the contract for purity, but you can do whatever you like with return values, after the function exits.
 Out parameters must
 be forbidden, or those assumptions fail.
Then so must return values!
Watch closely. Given int f(out int x, int y) { x = y+2; return y-1; } int g(out int x, int y) { x = y-1; return y+1; } the expression int n = f(x,3) + g(x,3) will result in n = 6 always, but x = 2 (if f was evaluated first), or 5 (if g was evaluated first). The PURE way to do it is: alias Tuple!(int,int) pair; pure pair f(int y) { return tuple(y+2,y-1); } pure pair g(int y) { return tuple(y-1,y+1); } pure pair add(pair lhs, pair rhs) { return tuple(lhs._0 + rhs._0, lhs._1 + rhs._1); } and now the expression add( f(3), g(3) ); will result in the pair { 7, 6 } always, /regardless/ of which of f or g gets evaluated first.
Apr 27 2008
parent Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 That's irrelevant. globalVariable is assigned /after/ the function
 returned (not during function execution, as would be the case with an
 out parameter).
The compiler is free to change that, should change that, and has the necessary information to change that. Since a function isn't allowed to read its out parameters, the assignments can be delayed until the function exits. The point is, it isn't going to be terribly difficult to implement out parameters in pure functions, it will make them prettier in some cases, and there isn't any strong reason to disallow them. You can pull out maybe two or three more weak reasons, but it'd be reasonably simple to work around them. Of course, since out parameters are syntactic sugar for multiple return values, you might say it's not worthwhile to implement this because it'd be too difficult for the benefit. That's an acceptable argument.
 Assigning a global variable *during function execution* violates the
 contract for purity, but you can do whatever you like with return
 values, after the function exits.
 
 
 Out parameters must
 be forbidden, or those assumptions fail.
Then so must return values!
Watch closely. Given int f(out int x, int y) { x = y+2; return y-1; } int g(out int x, int y) { x = y-1; return y+1; }
Why are you still using this example? It's undefined behavior WITHOUT pure; OF COURSE it's undefined behavior with pure!
Apr 27 2008
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Christopher Wright wrote:
 Janice Caron wrote:
 On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  An out parameter is a return value. What else is there to get?
An out parameter is global state. Sometimes literally: void foo(out int x, int y) { x = y; } foo(globalVariable, 3);
In this example, a side effect occurs because of the evaluation of "foo(globalVariable, 3)".
 Then the calling function isn't pure!
 I could also do:
 int foo(int y) { return y; }
 globalVariable = foo(3);
 
In this example, no side effect occurs because of the evaluation of "foo(3)". The side effect only occurs when globalVariable is assigned, which happens outside of foo. Thus, foo remains pure. _And this little difference allows for several compiler optimizations._ -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 27 2008
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2008-04-27 02:29:19 -0400, "Janice Caron" <caron800 googlemail.com> said:

 On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 An out parameter is a return value. What else is there to get?
An out parameter is global state. Sometimes literally: void foo(out int x, int y) { x = y; } foo(globalVariable, 3);
Well, its the caller that binds it to a global state, not the function. How is the above different from: int bar(int y) pure { return y; } globalVariable = bar(3); ? By the way, the ABI specifies that a function returning a struct recieves the address to write the struct to as a hidden parameter and write the final struct value there. Even returning a tuple (which is a struct) is implemented under the hood as a hidden out parameter; that shouldn't prevent its use as a return value in a pure function.
 Also, as has been repeatedly pointed out, one of the benefits of pure
 is that given an expression like
 
     f(x,y) + g(x,y)
 
 the compiler is free to evaluate f first then g, or g first then f, at
 its discression. It is also free not to call either of these functions
 at all if it has a previously cached result handy. Out parameters must
 be forbidden, or those assumptions fail.
Well the compiler could just decide to not optimize pure function calls with out parameters. It doesn't prevent in any way optimization of pure function calls for functions with only in parameters. In the example above, a clever compiler could even notice how you're always discarting the result of the call to f and just bind it to a different, otherwise unused, placeholder local variable; allowing it to paralellize the code. It shouldn't be much harder to optimize than silly repetitive assignments such as these: int x, y = 1; x = f(y); x = g(y); return x; Pure should just mean "no side effects beyond what its signature says". The compiler should be able decide if it can, and should, paralelize calls by examining the function signature.
 If you want to return multiple values, use a tuple.
 
     import std.typecons;
 
     Tuple!(int,double) f(int x, double y) pure
     {
         return tuple(x+1, y+1);
     }
 
 It's not rocket science.
That'll work. The question is should you be forced to resort to that? I don't think so. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Apr 27 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Michel Fortin wrote:
 On 2008-04-27 02:29:19 -0400, "Janice Caron" <caron800 googlemail.com> 
 said:
 
 On 27/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
 An out parameter is a return value. What else is there to get?
An out parameter is global state. Sometimes literally: void foo(out int x, int y) { x = y; } foo(globalVariable, 3);
Well, its the caller that binds it to a global state, not the function. How is the above different from: int bar(int y) pure { return y; } globalVariable = bar(3); ? By the way, the ABI specifies that a function returning a struct recieves the address to write the struct to as a hidden parameter and write the final struct value there. Even returning a tuple (which is a struct) is implemented under the hood as a hidden out parameter; that shouldn't prevent its use as a return value in a pure function.
 Also, as has been repeatedly pointed out, one of the benefits of pure
 is that given an expression like

     f(x,y) + g(x,y)

 the compiler is free to evaluate f first then g, or g first then f, at
 its discression. It is also free not to call either of these functions
 at all if it has a previously cached result handy. Out parameters must
 be forbidden, or those assumptions fail.
Well the compiler could just decide to not optimize pure function calls with out parameters. It doesn't prevent in any way optimization of pure function calls for functions with only in parameters.
Right. But since you can trivially rewrite every function with out parameters as one that has multiple return values, it's reasonable to expect a compiler to fully support pure functions with out parameters.
Apr 27 2008
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Christopher Wright wrote:
 
 Right. But since you can trivially rewrite every function with out 
 parameters as one that has multiple return values, it's reasonable to 
 expect a compiler to fully support pure functions with out parameters.
 
void func(out int a, out int b) { a = 2; b = 10; } How do you rewrite this function with multiple return values? (You can't, because of the case where there is aliasing of the arguments, which wouldn't have the same result if the function was rewritten with multiple return values) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 29 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Bruno Medeiros wrote:
 Christopher Wright wrote:
 Right. But since you can trivially rewrite every function with out 
 parameters as one that has multiple return values, it's reasonable to 
 expect a compiler to fully support pure functions with out parameters.
void func(out int a, out int b) { a = 2; b = 10; } How do you rewrite this function with multiple return values? (You can't, because of the case where there is aliasing of the arguments, which wouldn't have the same result if the function was rewritten with multiple return values)
That is interesting, and the first reasonable argument I've heard. You can't just record, for a given function, what the precedence of the out parameters is: void func(out int a, out int b, bool c) { if (c) { a = 5; b = 9; } else { b = 12; a = 3; } } You could return a struct containing ordering information, but that's too ugly to consider. Though this brings up a point that out parameters are more expressive than multiple return values. While this is not likely to make a difference in most reasonable usage, it's still the case, and might therefore be an argument in favor of keeping out parameters for pure functions. The other argument would be, out parameters already exist and apparently are used for large structs and multiple return values already, so it'd be trivial to implement -- in point of fact, it'd take more effort to prevent out parameters for pure functions.
Apr 29 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 29/04/2008, Christopher Wright <dhasenan gmail.com> wrote:
  You can't just record, for a given function, what the precedence of the out
 parameters is:

  void func(out int a, out int b, bool c) {
         if (c) {
                 a = 5;
                 b = 9;
         } else {
                 b = 12;
                 a = 3;
         }
  }
The compiler is fully entitled to reorder a = 5; b = 9; into b = 9; a = 5; if it so chooses. On a multi-core architecture, these assignments might even happen simultaneously.
 It'd take more effort to prevent out parameters for pure functions.
That's clearly not true. Rule one is that a pure function "has parameters that are all invariant or are implicitly convertible to invariant". (See http://www.digitalmars.com/d/2.0/function.html#pure-functions ). Out parameters fail immediately on account of not being invariant.
Apr 29 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 The compiler is fully entitled to reorder
 
     a = 5;
     b = 9;
 
 into
 
     b = 9;
     a = 5;
 
 if it so chooses. On a multi-core architecture, these assignments
 might even happen simultaneously.
In that case, passing in the same variable as multiple out parameters results undefined behavior currently. And that means that multiple return values is exactly equivalent to out parameters. (Or if "a, a = someFuncWithMultipleReturnValues();" is well defined, then out parameters can at least be converted to multiple return values.)
 It'd take more effort to prevent out parameters for pure functions.
That's clearly not true. Rule one is that a pure function "has parameters that are all invariant or are implicitly convertible to invariant". (See http://www.digitalmars.com/d/2.0/function.html#pure-functions ). Out parameters fail immediately on account of not being invariant.
Argh, where is my mind today. Or yesterday, rather. You'd have to change something like: if (arg->storageClass & STCinvariant) to: if (arg->storageClass & (STCinvariant | STCout)) And then you'd have to amend the spec to mention the exception.
Apr 29 2008
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Christopher Wright wrote:
 Janice Caron wrote:
 It'd take more effort to prevent out parameters for pure functions.
That's clearly not true. Rule one is that a pure function "has parameters that are all invariant or are implicitly convertible to invariant". (See http://www.digitalmars.com/d/2.0/function.html#pure-functions ). Out parameters fail immediately on account of not being invariant.
Argh, where is my mind today. Or yesterday, rather. You'd have to change something like: if (arg->storageClass & STCinvariant) to: if (arg->storageClass & (STCinvariant | STCout)) And then you'd have to amend the spec to mention the exception.
Under the current pure rules, you simply cannot have out parameters. But if you consider the "partial" pure function rules (see the new thread), you can have out parameters, and more, and all of it safely. So I would say partially pure is the solution for this problem, since you seem to love out parameters so damn much. :) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 29 2008