www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Proposal: Relax rules for 'pure'

reply Don <nospam nospam.com> writes:
The docs currently state that:
---
Pure functions are functions that produce the same result for the same 
arguments. To that end, a pure function:

     * has parameters that are all immutable or are implicitly 
convertible to immutable
     * does not read or write any global mutable state
---

This is extremely restrictive, and not currently enforced.
Two specific limitations:
- a 'pure' member function doesn't really make sense (could only be 
called on an immutable class)
- a pure function cannot have 'out' parameters.

In a discussion on D.learn, Steven Schveighoffer noted that it's only 
shared variables which make things complicated.
The limitations exist because 'pure' is used to mean both 'no hidden 
state' AND 'cachable'. But 'cachable' is really only an implementation 
detail.

PROPOSAL:
Drop the first requirement. Only one requirement is necessary:

A pure function does not read or write any global mutable state.

If a pure function has parameters that are all immutable or are 
implicitly convertible to immutable, then the compiler is permitted to 
cache the results.

This is possible because the first rule (all immutable parameters) is 
trivial, and can be checked from the function declaration (in fact, you 
can check it just from the mangled function name). The second rule (no 
globals) is viral, and requires inspection of the function body, and the 
function bodies of every function called by that function.

Actually, a clever compiler could even cache the results of pure 
functions which have parameters which don't implicitly cast to immutable.

I haven't found anything in TDPL which mentions the "only immutable 
parameters" rule. Relaxing that rule would allow much larger functions 
to be cached. The more important benefit (IMHO) of pure, that it makes 
it easier to reason about code, would be dramatically extended.
Sep 21 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Don (nospam nospam.com)'s article
 The docs currently state that:
 ---
 Pure functions are functions that produce the same result for the same
 arguments. To that end, a pure function:
      * has parameters that are all immutable or are implicitly
 convertible to immutable
      * does not read or write any global mutable state
 ---
 This is extremely restrictive, and not currently enforced.
 Two specific limitations:
 - a 'pure' member function doesn't really make sense (could only be
 called on an immutable class)
 - a pure function cannot have 'out' parameters.
 In a discussion on D.learn, Steven Schveighoffer noted that it's only
 shared variables which make things complicated.
 The limitations exist because 'pure' is used to mean both 'no hidden
 state' AND 'cachable'. But 'cachable' is really only an implementation
 detail.
 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 A pure function does not read or write any global mutable state.
 If a pure function has parameters that are all immutable or are
 implicitly convertible to immutable, then the compiler is permitted to
 cache the results.
 This is possible because the first rule (all immutable parameters) is
 trivial, and can be checked from the function declaration (in fact, you
 can check it just from the mangled function name). The second rule (no
 globals) is viral, and requires inspection of the function body, and the
 function bodies of every function called by that function.
 Actually, a clever compiler could even cache the results of pure
 functions which have parameters which don't implicitly cast to immutable.
 I haven't found anything in TDPL which mentions the "only immutable
 parameters" rule. Relaxing that rule would allow much larger functions
 to be cached. The more important benefit (IMHO) of pure, that it makes
 it easier to reason about code, would be dramatically extended.

Two things: 1. The documentation that says that the parameters need to be convertible to immutable is outdated. This was changed a while ago to only requiring const. 2. If you mean that the parameters don't have to be const either, i.e. you can freely mutate state on the heap, i.e. of a ref parameter, or stuff that's reachable from a pointer or reference, then pure offers almost no useful guarantees.
Sep 21 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 09/21/2010 10:47 PM, dsimcha wrote:
 2.  If you mean that the parameters don't have to be const either, i.e. you can
 freely mutate state on the heap, i.e. of a ref parameter, or stuff that's
 reachable from a pointer or reference, then pure offers almost no useful
guarantees.

Equal effects for equal inputs? Andrei
Sep 22 2010
prev sibling next sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
dsimcha napisał:

 1.  The documentation that says that the parameters need to be convertible
 to immutable is outdated.  This was changed a while ago to only requiring
 const.

Good to know. Yet, I wonder why such changes are not discussed on this NG and at the very least announced in the change log... On topic: this means a pure function can take a reference to data that can be mutated by someone else. So we're giving up on the "can parallelize with no dataraces" guarantee on pure functions? -- Tomek
Sep 23 2010
next sibling parent Don <nospam nospam.com> writes:
retard wrote:
 Thu, 23 Sep 2010 22:35:23 +0200, Tomek Sowiński wrote:
 
 dsimcha napisał:

 1.  The documentation that says that the parameters need to be
 convertible to immutable is outdated.  This was changed a while ago to
 only requiring const.

NG and at the very least announced in the change log...

Just download the documentation of two subsequent versions and run gnu diff. It's that simple.

It's not in the docs. I don't remember ever hearing about this, and I can't find this documented anywhere. And I made the patches for all of the improvements to the compiler's support for pure which have happened in the last year. There have always been bugs where the compiler accepts 'const' instead of immutable. These date back to 2.022, when pure was first implemented. In fact, my motivation in bringing up this topic was after I observed these bugs, and realized that most existing 'pure' code will break when they are fixed. Also 'pure' will become even more useless than it is now.
Sep 24 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment. -- Bruno Medeiros - Software Engineer
Oct 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/25/10 8:44 CDT, Bruno Medeiros wrote:
 I think we need to be more realistic with what kinds of optimizations
 we could expect from a D compiler and pure functions.
 Caching might be done, but only a temporary sense (caching under a
 limited execution scope). I doubt we would ever have something like
 memoization, which would incur memory costs (potentially quite big
 ones), and so the compiler almost certainly would not be able to know
 (without additional metadata/anotations or compile options) if that
 trade-off is acceptable.

Speaking of which - memoization should be implementable safely and efficiently in a library. I'm thinking of something like this: alias memoize!sin msin; ... double x = msin(angle); It does get tricky though. For example, for doubles you'd need to do some interpolation and there are several ways of interpolating. For other types you need to make sure there's no undue aliasing, etc. Andrei
Oct 25 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Speaking of which - memoization should be implementable safely and 
 efficiently in a library. I'm thinking of something like this:
 
 alias memoize!sin msin;
 ...
 double x = msin(angle);

Good.
 It does get tricky though. For example, for doubles you'd need to do 
 some interpolation and there are several ways of interpolating. For 
 other types you need to make sure there's no undue aliasing, etc.

Not good. Please keep things simple. If the programmer wants to use floats, then it's not the job of the memoizing function to define tolerances, etc. If you want to add other features to a memoizing function, here are two useful features: - An optional max number of items stored in the cache, using a LRU strategy (Python 3.2 standard library has this). - An optional max number of bytes used by the cache (this is not present in Python std lib, but I have found this useful). The first feature (max number of items using LRU) may be done wrapping the associative array keys inside a linked list. Bye, bearophile
Oct 25 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bruno Medeiros:

 Robert Jacques:
 Weakly-pure functions, don't provide either of
 these guarantees, but allow a much larger number of functions to be
 strongly-pure.
 ...

we could expect from a D compiler and pure functions.

Partally unrelated: in DMD 2.050 I think it's possible to have both std.algorithm.swap and std.algorithm.sort weakly pure. So I am able to sort data even in a strongly pure function :-) This is an important change. Bye, bearophile
Oct 25 2010
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Bruno Medeiros wrote:
 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

I agree with this, especially with regard to memoization. However, several other very interesting optimisations are possible with pure, especially with regard to memory allocation. At exit from an immutably pure function, all memory allocated by that function and its subfunctions can be collected, except for anything which is reachable through the function return value. Even for a weakly-pure function, the set of roots for gc is limited to the mutable function parameters and the return value. And for all pure functions with no mutable reference parameters, it is guaranteed that no other thread has access to them. The implications for gc are very interesting.
Oct 25 2010
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 26/10/2010 03:07, Don wrote:
 Bruno Medeiros wrote:
 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

I agree with this, especially with regard to memoization. However, several other very interesting optimisations are possible with pure, especially with regard to memory allocation. At exit from an immutably pure function, all memory allocated by that function and its subfunctions can be collected, except for anything which is reachable through the function return value. Even for a weakly-pure function, the set of roots for gc is limited to the mutable function parameters and the return value. And for all pure functions with no mutable reference parameters, it is guaranteed that no other thread has access to them. The implications for gc are very interesting.

Indeed, this opens up several possibilities, it will be interesting to see what else pure can give us in terms of optimization. (I'm not a compiler or low level optimization expert, so my imagination here is somewhat limited :P ) Even for the optimizations that should not be applied completely automatically (the aforementioned parallelization, memoization) it is nice to have a language mechanism to verify their safety. -- Bruno Medeiros - Software Engineer
Oct 28 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 26/10/2010 04:47, Robert Jacques wrote:
 On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead).

What are these tasks you mention? I've never heard of them. -- Bruno Medeiros - Software Engineer
Oct 28 2010
next sibling parent tls <do notha.ev> writes:
Robert Jacques Wrote:

 Getting back to pure, one of the "big"  
 advantages of functional languages are their ability to automatically  
 parallelize themselves; and they use a work stealing runtime (aka tasks)  
 to do this.

What functional has task? I switch in Meantime when D2 not staple to study this.
Oct 29 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 29/10/2010 02:32, Robert Jacques wrote:
 On Thu, 28 Oct 2010 10:48:34 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:
 On 26/10/2010 04:47, Robert Jacques wrote:
 On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me>
 wrote:
 On topic: this means a pure function can take a reference to data
 that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead).

What are these tasks you mention? I've never heard of them.

The programming language Cilk popularized the concept of parallelization through many small tasks combined with a work stealing runtime. Futures are essentially the same concept, but because futures were generally implemented with OS-threads, a thread pool or fibers/coroutines, that term is generally avoided. Like message passing, tasks are often implemented in libraries with Intel's threading building blocks probably being the most famous, though both Microsoft's Task Parallel Library and Apple's Grand Central are gaining mind-share. David Simcha currently has a task library in review for inclusion to phobos. Basically, the point of tasks is to provide parallelization with extremely low overhead (on average a Cilk spawn is less than 4 function calls). That way, instead of having a few coarse grain threads which neither scale nor load balance well, you're encouraged to use tasks everywhere and therefore reap the benefits of a balanced N-way scalable system.

Hum, I see what you mean know, but tasks only help with the *creation overhead* of otherwise spawning lots of OS threads, they don't solve the main problems I mentioned. First, it may be fine to spawn 100 tasks, but there is still the issue of deciding how many OS threads the tasks will run in! Obviously, you won't run them in just one OS thread, otherwise you won't get any parallelism. Ideally for your program only, your program would have as many OS threads as there are cores. But here there is still the same issue of whether its ok for you program to use up all the cores in your machine. The compiler doesn't know that. Could it be enough to have a global compiler option to specify that? I don't think so: What if you want some code of your program to use as much OS-threads as possible, but not some other code? Second, and perhaps more importantly, the very same issue occurs in the scope of your program alone. So, even if you use all OS threads, and don't care about other programs, spawning 100 tasks for some loop might take time away from other more important tasks of your program. The compiler/task-scheduler/whatever would not automatically know what is acceptable and what is not. (the only exception being if your program was logically single-threaded) -- Bruno Medeiros - Software Engineer
Nov 01 2010
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 02/11/2010 03:26, Robert Jacques wrote:
 On Mon, 01 Nov 2010 10:24:43 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:
 On 29/10/2010 02:32, Robert Jacques wrote:

 The programming language Cilk popularized the concept of parallelization
 through many small tasks combined with a work stealing runtime. Futures
 are essentially the same concept, but because futures were generally
 implemented with OS-threads, a thread pool or fibers/coroutines, that
 term is generally avoided. Like message passing, tasks are often
 implemented in libraries with Intel's threading building blocks probably
 being the most famous, though both Microsoft's Task Parallel Library and
 Apple's Grand Central are gaining mind-share. David Simcha currently has
 a task library in review for inclusion to phobos. Basically, the point
 of tasks is to provide parallelization with extremely low overhead (on
 average a Cilk spawn is less than 4 function calls). That way, instead
 of having a few coarse grain threads which neither scale nor load
 balance well, you're encouraged to use tasks everywhere and therefore
 reap the benefits of a balanced N-way scalable system.

Hum, I see what you mean know, but tasks only help with the *creation overhead* of otherwise spawning lots of OS threads, they don't solve the main problems I mentioned. First, it may be fine to spawn 100 tasks, but there is still the issue of deciding how many OS threads the tasks will run in! Obviously, you won't run them in just one OS thread, otherwise you won't get any parallelism. Ideally for your program only, your program would have as many OS threads as there are cores. But here there is still the same issue of whether its ok for you program to use up all the cores in your machine. The compiler doesn't know that. Could it be enough to have a global compiler option to specify that? I don't think so: What if you want some code of your program to use as much OS-threads as possible, but not some other code? Second, and perhaps more importantly, the very same issue occurs in the scope of your program alone. So, even if you use all OS threads, and don't care about other programs, spawning 100 tasks for some loop might take time away from other more important tasks of your program. The compiler/task-scheduler/whatever would not automatically know what is acceptable and what is not. (the only exception being if your program was logically single-threaded)

Controlling the task runtime thread-pool size is trivial. Indeed, you'll often want to reduce the number of daemon threads by the number of active program threads. And if you need fine grain control over pool sizes, you can always create separate pools and assign tasks to them. I think a reasonable default would be (# of cores - 2) daemons with automatic decreases/increases with every spawn/termination. But, all those settings should be controllable at runtime.

I didn't say controlling pools/task-count/threads-count/priorities wasn't easy or trivial. I just claimed it should not done automatically by the compiler, but rather "manually" in the code, and with fine grained control. ∎ -- Bruno Medeiros - Software Engineer
Nov 09 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:

 dsimcha napisał:

 1.  The documentation that says that the parameters need to be  
 convertible
 to immutable is outdated.  This was changed a while ago to only  
 requiring
 const.

Good to know. Yet, I wonder why such changes are not discussed on this NG and at the very least announced in the change log... On topic: this means a pure function can take a reference to data that can be mutated by someone else. So we're giving up on the "can parallelize with no dataraces" guarantee on pure functions?

A const pointer to unshared data cannot be modified while inside a pure function. The key is that it's unshared. What it does disable is caching based on the reference value. -Steve
Sep 23 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Thu, 23 Sep 2010 22:35:23 +0200, Tomek Sowiński wrote:

 dsimcha napisał:
 
 1.  The documentation that says that the parameters need to be
 convertible to immutable is outdated.  This was changed a while ago to
 only requiring const.

Good to know. Yet, I wonder why such changes are not discussed on this NG and at the very least announced in the change log...

Just download the documentation of two subsequent versions and run gnu diff. It's that simple.
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that  
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no  
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me> wrote:
 On topic: this means a pure function can take a reference to data that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead).
Oct 25 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 28 Oct 2010 10:48:34 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:
 On 26/10/2010 04:47, Robert Jacques wrote:
 On Mon, 25 Oct 2010 09:44:14 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 23/09/2010 23:39, Robert Jacques wrote:
 On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just ask.me>  
 wrote:
 On topic: this means a pure function can take a reference to data  
 that
 can be mutated by
 someone else. So we're giving up on the "can parallelize with no
 dataraces" guarantee on
 pure functions?

In short, No. In long; the proposal is for pure functions become broken up into two groups (weak and strong) based on their function signatures. This division is internal to the compiler, and isn't expressed in the language in any way. Strongly-pure functions provide all the guarantees that pure does today and can be automatically parallelized or cached without consequence. Weakly-pure functions, don't provide either of these guarantees, but allow a much larger number of functions to be strongly-pure. In order to guarantee a function is strongly pure, one would have to declare all its inputs immutable or use an appropriate template constraint.

I think we need to be more realistic with what kinds of optimizations we could expect from a D compiler and pure functions. Caching might be done, but only a temporary sense (caching under a limited execution scope). I doubt we would ever have something like memoization, which would incur memory costs (potentially quite big ones), and so the compiler almost certainly would not be able to know (without additional metadata/anotations or compile options) if that trade-off is acceptable. Similarly for parallelism, how would the compiler know that it's ok to spawn 10 or 100 new threads to parallelize the execution of some loop? The consequences for program and whole-machine scheduling would not be trivial and easy to understand. For this to happen, amongst other things the compiler and OS would need to ensure that the spawned threads would not starve the rest of the threads of that program. I suspect all these considerations might be very difficult to guarantee on a non-VM environment.

Ahem, it's trivial for the compiler to know if it's okay to spawn 10 or 100 _tasks_. Tasks, as opposed to threads or even thread pools, are extremely cheap (think on the order of function call overhead).

What are these tasks you mention? I've never heard of them.

The programming language Cilk popularized the concept of parallelization through many small tasks combined with a work stealing runtime. Futures are essentially the same concept, but because futures were generally implemented with OS-threads, a thread pool or fibers/coroutines, that term is generally avoided. Like message passing, tasks are often implemented in libraries with Intel's threading building blocks probably being the most famous, though both Microsoft's Task Parallel Library and Apple's Grand Central are gaining mind-share. David Simcha currently has a task library in review for inclusion to phobos. Basically, the point of tasks is to provide parallelization with extremely low overhead (on average a Cilk spawn is less than 4 function calls). That way, instead of having a few coarse grain threads which neither scale nor load balance well, you're encouraged to use tasks everywhere and therefore reap the benefits of a balanced N-way scalable system. Getting back to pure, one of the "big" advantages of functional languages are their ability to automatically parallelize themselves; and they use a work stealing runtime (aka tasks) to do this.
Oct 28 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 29 Oct 2010 07:13:46 -0400, tls <do notha.ev> wrote:
 Robert Jacques Wrote:

 Getting back to pure, one of the "big"
 advantages of functional languages are their ability to automatically
 parallelize themselves; and they use a work stealing runtime (aka tasks)
 to do this.

What functional has task? I switch in Meantime when D2 not staple to study this.

Every functional language can do this; it's part and parcel of being functional. It's really a more question of how mature their runtimes are, as opposed to language support.
Oct 29 2010
prev sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Mon, 01 Nov 2010 10:24:43 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:
 On 29/10/2010 02:32, Robert Jacques wrote:

 The programming language Cilk popularized the concept of parallelization
 through many small tasks combined with a work stealing runtime. Futures
 are essentially the same concept, but because futures were generally
 implemented with OS-threads, a thread pool or fibers/coroutines, that
 term is generally avoided. Like message passing, tasks are often
 implemented in libraries with Intel's threading building blocks probably
 being the most famous, though both Microsoft's Task Parallel Library and
 Apple's Grand Central are gaining mind-share. David Simcha currently has
 a task library in review for inclusion to phobos. Basically, the point
 of tasks is to provide parallelization with extremely low overhead (on
 average a Cilk spawn is less than 4 function calls). That way, instead
 of having a few coarse grain threads which neither scale nor load
 balance well, you're encouraged to use tasks everywhere and therefore
 reap the benefits of a balanced N-way scalable system.

Hum, I see what you mean know, but tasks only help with the *creation overhead* of otherwise spawning lots of OS threads, they don't solve the main problems I mentioned. First, it may be fine to spawn 100 tasks, but there is still the issue of deciding how many OS threads the tasks will run in! Obviously, you won't run them in just one OS thread, otherwise you won't get any parallelism. Ideally for your program only, your program would have as many OS threads as there are cores. But here there is still the same issue of whether its ok for you program to use up all the cores in your machine. The compiler doesn't know that. Could it be enough to have a global compiler option to specify that? I don't think so: What if you want some code of your program to use as much OS-threads as possible, but not some other code? Second, and perhaps more importantly, the very same issue occurs in the scope of your program alone. So, even if you use all OS threads, and don't care about other programs, spawning 100 tasks for some loop might take time away from other more important tasks of your program. The compiler/task-scheduler/whatever would not automatically know what is acceptable and what is not. (the only exception being if your program was logically single-threaded)

Controlling the task runtime thread-pool size is trivial. Indeed, you'll often want to reduce the number of daemon threads by the number of active program threads. And if you need fine grain control over pool sizes, you can always create separate pools and assign tasks to them. I think a reasonable default would be (# of cores - 2) daemons with automatic decreases/increases with every spawn/termination. But, all those settings should be controllable at runtime.
Nov 01 2010
prev sibling next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Tue, 21 Sep 2010 22:21:39 -0400, Don <nospam nospam.com> wrote:

 The docs currently state that:
 ---
 Pure functions are functions that produce the same result for the same  
 arguments. To that end, a pure function:

      * has parameters that are all immutable or are implicitly  
 convertible to immutable
      * does not read or write any global mutable state
 ---

 This is extremely restrictive, and not currently enforced.
 Two specific limitations:
 - a 'pure' member function doesn't really make sense (could only be  
 called on an immutable class)
 - a pure function cannot have 'out' parameters.

 In a discussion on D.learn, Steven Schveighoffer noted that it's only  
 shared variables which make things complicated.
 The limitations exist because 'pure' is used to mean both 'no hidden  
 state' AND 'cachable'. But 'cachable' is really only an implementation  
 detail.

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

 If a pure function has parameters that are all immutable or are  
 implicitly convertible to immutable, then the compiler is permitted to  
 cache the results.

 This is possible because the first rule (all immutable parameters) is  
 trivial, and can be checked from the function declaration (in fact, you  
 can check it just from the mangled function name). The second rule (no  
 globals) is viral, and requires inspection of the function body, and the  
 function bodies of every function called by that function.

 Actually, a clever compiler could even cache the results of pure  
 functions which have parameters which don't implicitly cast to immutable.

 I haven't found anything in TDPL which mentions the "only immutable  
 parameters" rule. Relaxing that rule would allow much larger functions  
 to be cached. The more important benefit (IMHO) of pure, that it makes  
 it easier to reason about code, would be dramatically extended.

One of the major benefits of functional languages (a.k.a pure functions) is that you can automatically parallelize them (i.e. via a task based runtime) in addition to caching the result. I really don't want to lose that guarantee. However, I do see your point. One of the benefits of D's pure functions is the ability to use procedural code inside them; however the "does not read or write any global mutable state" prevents pure functions from calling impure functions/members on their internal (impure) variables. So removing the concurrency safety from pure would greatly expand the number of pure functions, however, automatic parallelism would be lost. I don't view this as an acceptable reduction in pure's power, particularly given that Dave's std.parallelism module is currently under review. With regard to an alternative, I think two levels of pure have been identified as being useful and synergistic to each other. I think this could either manifest itself as pure + scope(pure) or as a new 'task' type + pure. Anyways, that's my two cents worth.
Sep 21 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu> said:

 So removing the concurrency safety from pure would greatly  expand the 
 number of pure functions, however, automatic parallelism would  be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments. The interesting thing with this change is that you can now call mutators functions on the local variables inside the pure function, because those can be made pure. You can't even iterate over a range inside a pure function without this! pure int test() { int result; auto r = iota(0, 10); while (!r.empty) { result += r; r.popFront(); // can't be pure by current rules! } return result; } -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 22 2010
next sibling parent reply Don <nospam nospam.com> writes:
Robert Jacques wrote:
 On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin 
 <michel.fortin michelf.com> wrote:
 
 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu> said:

 So removing the concurrency safety from pure would greatly  expand 
 the number of pure functions, however, automatic parallelism would  
 be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

What is lost is my ability to declare a function does x in the signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.

No, you haven't lost that at all. Any function marked as pure still has no access to any state, other than what is provided by its parameters. It is still thread-safe.
 The interesting thing with this change is that you can now call 
 mutators functions on the local variables inside the pure function, 
 because those can be made pure. You can't even iterate over a range 
 inside a pure function without this!

     pure int test() {
         int result;
         auto r = iota(0, 10);
         while (!r.empty) {
             result += r;
             r.popFront(); // can't be pure by current rules!
         }
         return result;
     }

I did mention this benefit in my post, but this example really shows just how awesome it really is. Which is why I think the concept of a simplified pure is so powerful and be included in the language. I just want it included in addition to the current pure concept.

The current pure concept is still included. The syntax for strong pure is simply a function declaration with no mutable parameters. So all existing functions marked 'pure' would remain strong pure. The one situation where things get more complicated is if you have a templated function marked as pure, and you want to ensure that it is strong pure. An isStrongPure!(T) template would need to be written, and used in the template constraint.
Sep 22 2010
parent reply Don <nospam nospam.com> writes:
Robert Jacques wrote:
 On Wed, 22 Sep 2010 11:44:00 -0400, Don <nospam nospam.com> wrote:
 
 Robert Jacques wrote:
 On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin 
 <michel.fortin michelf.com> wrote:

 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu> said:

 So removing the concurrency safety from pure would greatly  expand 
 the number of pure functions, however, automatic parallelism would  
 be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.

No, you haven't lost that at all. Any function marked as pure still has no access to any state, other than what is provided by its parameters. It is still thread-safe.

No, it isn't. A strongly-pure function is thread safe, but a weakly-pure function isn't. Since strong/weak is automatically determined by the compiler, a function's strength can switch due to long distance code changes.

This is completely incorrect. It can only change strength if its signature changes. This wouldn't be an issue today, but tomorrow large
 strongly-pure functions will be parallelized automatically in a manner 
 akin to inlining (outlining?). Then it makes a difference. However, 
 these issues feel more like D3 concerns than D2 concerns, particularly 
 when you consider the advantages of making tasks/futures a language 
 level concept.

Sep 22 2010
parent reply klickverbot <see klickverbot.at> writes:
On 9/22/10 9:14 PM, Steven Schveighoffer wrote:
 Hypothetical counter-case

 struct S
 {
 version(stronglypure)
 string s;
 else
 char[] s;
 }

 pure foo(S s); // changes strength depending on S' contents

 -Steve

This is a change to the signature of foo – S with the stronglypure version defined and S without it are two completely distinct types.
Sep 22 2010
parent Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Wed, 22 Sep 2010 15:21:57 -0400, klickverbot <see klickverbot.at> wrote:
 
 On 9/22/10 9:14 PM, Steven Schveighoffer wrote:
 Hypothetical counter-case

 struct S
 {
 version(stronglypure)
 string s;
 else
 char[] s;
 }

 pure foo(S s); // changes strength depending on S' contents

 -Steve

This is a change to the signature of foo – S with the stronglypure version defined and S without it are two completely distinct types.

Wait, I didn't change foo's signature at all. This is what the OP meant by long-range changes. S can be defined far away from foo, and out of the author of foo's control. -Steve

Is that really what he meant? Note that foo() would need to be recompiled, and the silent change in strength would occur only if it still compiles despite a significant change in the definition of one of types. If you change the definition of a type, you're always going to have influences everywhere it's used in a function definition. I'd hardly call that a 'long-range' change. By contrast, in a language where you don't have pure at all, if someone sticks in a global variable somewhere, (even in a separate library), then code could suddenly have severe contention with another thread. That's long range, and nearly impossible to track down.
Sep 22 2010
prev sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Michel Fortin napisał:

 The interesting thing with this change is that you can now call
 mutators functions on the local variables inside the pure function,
 because those can be made pure. You can't even iterate over a range
 inside a pure function without this!
 
 pure int test() {
 int result;
 auto r = iota(0, 10);
 while (!r.empty) {
 result += r;
 r.popFront(); // can't be pure by current rules!
 }
 return result;
 }

It's because popFront() can't be made pure. It could be if there was tail immutable in the language -- pure functions would mark their arguments with tail immutable instead of immutable. That allows popFront() to advance the range. I wrote about it before but the idea didn't catch on: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=11 6475 -- Tomek
Sep 23 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-09-23 16:24:20 -0400, Tomek Sowiński <just ask.me> said:

 Michel Fortin napisał:
 
 The interesting thing with this change is that you can now call
 mutators functions on the local variables inside the pure function,
 because those can be made pure. You can't even iterate over a range
 inside a pure function without this!
 
 pure int test() {
     int result;
     auto r = iota(0, 10);
     while (!r.empty) {
         result += r;
         r.popFront(); // can't be pure by current rules!
     }
     return result;
 }

It's because popFront() can't be made pure. It could be if there was tail immutable in the language -- pure functions would mark their arguments with tail immutable instead of immutable. That allows popFront() to advance the range.

I know popFront() can't be made pure by the current rules. The proposal that started this thread is to relax 'pure' so that popFront() can be made pure. I was simply pointing a use case for that. And to tell the truth, I have no idea where a tail-immutable qualifier would be helpful in this situation... could you elaborate? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 23 2010
parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Michel Fortin napisał:

 It's because popFront() can't be made pure. It could be if there was
  tail immutable in the
 language -- pure functions would mark their arguments with  tail
 immutable instead of
 immutable. That allows popFront() to advance the range.

I know popFront() can't be made pure by the current rules. The proposal that started this thread is to relax 'pure' so that popFront() can be made pure. I was simply pointing a use case for that. And to tell the truth, I have no idea where a tail-immutable qualifier would be helpful in this situation... could you elaborate?

I wrote that before I read Don's re-explanation that pure should just mean no globals (weak- pure) and it's to be able to call weak-pure functions from const-pure and immutable-pure functions that do impose restrictions on their parameters and offer guarantees about parallelizing, enable optimizations, etc. And that's cool. So in the "three levels of purity" nomenclature, tail(const|immutable) would popularize the stronger two. E.g. popFront() could be immutably pure if the range in on an immutable collection and const-pure if it's on any collection. -- Tomek
Sep 23 2010
parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Tomek Sowiński napisał:

 So in the "three levels of purity" nomenclature, tail(const|immutable)
 would popularize the stronger two. E.g. popFront() could be immutably pure
 if the range in on an immutable collection and const-pure if it's on any
 collection.

Ehm, sorry.. popFront() is a bad example because 'this' is a reference. Tail const still would popularize the stronger pure, only it doesn't really buy you much. You can rebind argument pointers inside a strong-pure function, that's all. (must stop posting at 1 a.m.) -- Tomek
Sep 23 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Wed, 22 Sep 2010 04:21:39 +0200, Don wrote:

 The docs currently state that:
 ---
 Pure functions are functions that produce the same result for the same
 arguments. To that end, a pure function:
 
      * has parameters that are all immutable or are implicitly
 convertible to immutable
      * does not read or write any global mutable state
 ---
 
 This is extremely restrictive, and not currently enforced. Two specific
 limitations:
 - a 'pure' member function doesn't really make sense (could only be
 called on an immutable class)
 - a pure function cannot have 'out' parameters.
 
 In a discussion on D.learn, Steven Schveighoffer noted that it's only
 shared variables which make things complicated. The limitations exist
 because 'pure' is used to mean both 'no hidden state' AND 'cachable'.
 But 'cachable' is really only an implementation detail.
 
 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.

A purely functional language provides referential transparency: the same arguments applied to the function always return the same result.
Sep 21 2010
prev sibling next sibling parent Kagamin <spam here.lot> writes:
Technically this is feasible, but I don't think that it's a good idea to
surprise user with impurity of pure function.

 This is extremely restrictive, and not currently enforced.
 Two specific limitations:
 - a 'pure' member function doesn't really make sense (could only be 
 called on an immutable class)
 - a pure function cannot have 'out' parameters.

If you can't use purity, don't use it, impure functions work just fine.
Sep 21 2010
prev sibling next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
Don wrote:

 The docs currently state that:
 ---
 Pure functions are functions that produce the same result for the same
 arguments. To that end, a pure function:
 
      * has parameters that are all immutable or are implicitly
 convertible to immutable
      * does not read or write any global mutable state
 ---
 
 This is extremely restrictive, and not currently enforced.
 Two specific limitations:
 - a 'pure' member function doesn't really make sense (could only be
 called on an immutable class)
 - a pure function cannot have 'out' parameters.

Would it be possible to make an exception for out parameters? After all they can be considered part of the output, and as input are always initialized to the same value. The result of a function can be defined as the tuple of its return value and all out parameters.
 In a discussion on D.learn, Steven Schveighoffer noted that it's only
 shared variables which make things complicated.
 The limitations exist because 'pure' is used to mean both 'no hidden
 state' AND 'cachable'. But 'cachable' is really only an implementation
 detail.
 
 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.
 
 If a pure function has parameters that are all immutable or are
 implicitly convertible to immutable, then the compiler is permitted to
 cache the results.

If pure functions are allowed to mutate their input parameters, do we not lose referential transparency? a = pureFun(input); b = pureFun(input); assert( a == b ); Or am I misunderstanding your proposal?
Sep 22 2010
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.
 

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.
Sep 22 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 09/22/2010 10:13 AM, Don wrote:
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

Will this allow member functions to be weakly pure? As in, only touches its own state. Because that would be useful.
Sep 22 2010
parent Don <nospam nospam.com> writes:
Pelle wrote:
 On 09/22/2010 10:13 AM, Don wrote:
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

Will this allow member functions to be weakly pure? As in, only touches its own state. Because that would be useful.

Sep 22 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

And my axe! That is, I support this. -- Simen
Sep 22 2010
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
I think you need to forbid access to shared state as well. It's possible to
allow it if a strongly pure function calls a weakly pure function in an object
that it created, but that seems unnecessarily complex. Actually, that makes me
wonder: can constructors be marked pure?

Don Wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.
 

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

Sep 22 2010
parent Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Wed, 22 Sep 2010 09:16:37 -0400, Jason House 
 <jason.james.house gmail.com> wrote:
 
 I think you need to forbid access to shared state as well. It's 
 possible to allow it if a strongly pure function calls a weakly pure 
 function in an object that it created, but that seems unnecessarily 
 complex.

Yes, I think Don expected that you shouldn't be allowed to access shared, but I don't think he specifically stated it. I do think it needs to be added as a specific rule.

You're right, it should be explicitly stated.
 
 Actually, that makes me wonder: can constructors be marked pure?

Of course! A constructor just allocates memory and initializes it. A pure constructor would not be allowed to access global or shared variables either. -Steve

Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

Yes, this in particular solves a problem I thought of a long time ago -- modularizing functions. Let's say you have a function like this: pure int foo() { ulong[100] x; x[0] = 1; foreach(uint i; 1..x.length) x[i] = x[i-1] * i; ... } Let's say you have 10 pure functions that initialize x in the same way. If you wanted to abstract the initialization of x, you could do: void initX(ulong[] x) { x[0] = 1; foreach(uint i; 1..x.length) x[i] = x[i-1] * i; } And then you just have: pure int foo() { ulong[100] x; initX(x); ... } But pure functions can only call pure functions, so we'd have to mark initX pure. Your rules would allow this, and I think they are exactly what we need, I think you found the perfect rules to describe it. -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 09:16:37 -0400, Jason House  
<jason.james.house gmail.com> wrote:

 I think you need to forbid access to shared state as well. It's possible  
 to allow it if a strongly pure function calls a weakly pure function in  
 an object that it created, but that seems unnecessarily complex.

Yes, I think Don expected that you shouldn't be allowed to access shared, but I don't think he specifically stated it. I do think it needs to be added as a specific rule.
 Actually, that makes me wonder: can constructors be marked pure?

Of course! A constructor just allocates memory and initializes it. A pure constructor would not be allowed to access global or shared variables either. -Steve
Sep 22 2010
prev sibling next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal:

Funny, your re-iteration appears to coincided to my previous understanding. So either I've mis-understood twice, or I didn't sufficiently demonstrate my understanding when I made my critique. :) That said, I do think this version is much clearer and understandable.
 If you see a function which has mutable parameters, but is marked as  
 'pure', you can only conclude that it doesn't use global variables.  
 That's not much use on it's own. Let's call this a 'weakly-pure'  
 function.

 However, if you see a function maked as 'pure', which also has only  
 immutable parameters, you have the same guarantee which 'pure' gives us  
 as the moment. Let's call this a 'strongly-pure' function.

 The benefit of the relaxed rule is that a strongly-pure function can  
 call a weakly-pure functions, while remaining strongly-pure.
 This allows very many more functions to become strongly pure.

 The point of the proposal is *not* to provide the weak guarantee. It is  
 to provide the strong guarantee in more situations.

The problem from my point of view is that the programmer can not declare that a function should be 'strongly-pure' or 'weakly-pure'. Essentially, the point of the proposal is *to* provide the weak guarantee and leave the strong guarantee up to a sufficiently smart compiler. And I really don't want a function to suddenly run 8x slower, just because Joe-coder changed a type somewhere and made my 'strongly-pure' inner-loop 'weakly-pure'. That said, if we can only have one type of purity in the language, I think 'weakly-pure' is more powerful a concept than 'strongly-pure'.
Sep 22 2010
parent reply Don <nospam nospam.com> writes:
Robert Jacques wrote:
 On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:
 
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal:

Funny, your re-iteration appears to coincided to my previous understanding. So either I've mis-understood twice, or I didn't sufficiently demonstrate my understanding when I made my critique. :) That said, I do think this version is much clearer and understandable.
 If you see a function which has mutable parameters, but is marked as 
 'pure', you can only conclude that it doesn't use global variables. 
 That's not much use on it's own. Let's call this a 'weakly-pure' 
 function.

 However, if you see a function maked as 'pure', which also has only 
 immutable parameters, you have the same guarantee which 'pure' gives 
 us as the moment. Let's call this a 'strongly-pure' function.

 The benefit of the relaxed rule is that a strongly-pure function can 
 call a weakly-pure functions, while remaining strongly-pure.
 This allows very many more functions to become strongly pure.

 The point of the proposal is *not* to provide the weak guarantee. It 
 is to provide the strong guarantee in more situations.

The problem from my point of view is that the programmer can not declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable. Essentially,
 the point of the proposal is *to* provide the weak guarantee and leave 
 the strong guarantee up to a sufficiently smart compiler. 

No it is not. The key point you're not getting is that the difference between strong and weak purity can be determined from the function signature alone. You can do it in a very simple piece of library code. The compiler doesn't need to be involved at all. It's utterly trivial. Establishing weak purity, OTOH, is very complicated, and requires the compiler _and the linker_. And I really
 don't want a function to suddenly run 8x slower, just because Joe-coder 
 changed a type somewhere and made my 'strongly-pure' inner-loop 
 'weakly-pure'. That said, if we can only have one type of purity in the 
 language, I think 'weakly-pure' is more powerful a concept than 
 'strongly-pure'.

Sep 22 2010
next sibling parent reply klickverbot <see klickverbot.at> writes:
On 9/22/10 7:10 PM, Steven Schveighoffer wrote:
 But the compiler will be able to tell. I think adding a
 __traits(isStronglyPure, symbol) will be good for those rare occasions
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

Shouldn't it be possible to implement that in a library template, not requiring any additions to the compiler at all?
Sep 22 2010
parent Don <nospam nospam.com> writes:
klickverbot wrote:
 On 9/22/10 7:10 PM, Steven Schveighoffer wrote:
 But the compiler will be able to tell. I think adding a
 __traits(isStronglyPure, symbol) will be good for those rare occasions
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

Shouldn't it be possible to implement that in a library template, not requiring any additions to the compiler at all?

Yes. In fact, it's easy.
Sep 22 2010
prev sibling parent reply Don <nospam nospam.com> writes:
Robert Jacques wrote:

 The point of the proposal is *not* to provide the weak guarantee. 
 It is to provide the strong guarantee in more situations.

declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable.

What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.
 The problem I think Robert is referring to is, a person cannot always 
 tell just from looking at a signature that a type is strongly-pure 
 qualified or not.

And that you can not manually specify one or the other.

Yes you can. Just add const to the parameter declaration. This works because const is transitive.
 
 For example, is this function weak or strong?

 pure int foo(T t);


pure int foo(const T t); is always strong pure, regardless of what T is.
Sep 22 2010
parent Sclytrack <Sclytrack fake.com> writes:
	???

//weak, strong
int hello (const ref int x)
{

}

int main()
{
	immutable int a;
	int b;
	hello(a); //strong
	hello(b); //weak

}

	???


---

//impure
impure int hello()
{
	//globals
}
---

//impure
impure int hello(shared Myclass a)
{

}
---

//weak
int hello(MyClass a, immutable MyClass b)
{
	//noglobals,weak
}
---

//strong
int hello(const int a)
{
	//noglobals, weak
}
---

//strong
int hello( immutable MyClass a)
{
	//noglobals, weak
}


---
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 11:53:22 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal:

understanding. So either I've mis-understood twice, or I didn't sufficiently demonstrate my understanding when I made my critique. :) That said, I do think this version is much clearer and understandable.
 If you see a function which has mutable parameters, but is marked as  
 'pure', you can only conclude that it doesn't use global variables.  
 That's not much use on it's own. Let's call this a 'weakly-pure'  
 function.

 However, if you see a function maked as 'pure', which also has only  
 immutable parameters, you have the same guarantee which 'pure' gives  
 us as the moment. Let's call this a 'strongly-pure' function.

 The benefit of the relaxed rule is that a strongly-pure function can  
 call a weakly-pure functions, while remaining strongly-pure.
 This allows very many more functions to become strongly pure.

 The point of the proposal is *not* to provide the weak guarantee. It  
 is to provide the strong guarantee in more situations.

declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable.

What about value types?
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Wed, 22 Sep 2010 11:53:22 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal:

understanding. So either I've mis-understood twice, or I didn't sufficiently demonstrate my understanding when I made my critique. :) That said, I do think this version is much clearer and understandable.
 If you see a function which has mutable parameters, but is marked as  
 'pure', you can only conclude that it doesn't use global variables.  
 That's not much use on it's own. Let's call this a 'weakly-pure'  
 function.

 However, if you see a function maked as 'pure', which also has only  
 immutable parameters, you have the same guarantee which 'pure' gives  
 us as the moment. Let's call this a 'strongly-pure' function.

 The benefit of the relaxed rule is that a strongly-pure function can  
 call a weakly-pure functions, while remaining strongly-pure.
 This allows very many more functions to become strongly pure.

 The point of the proposal is *not* to provide the weak guarantee. It  
 is to provide the strong guarantee in more situations.

declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable.

What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure. The problem I think Robert is referring to is, a person cannot always tell just from looking at a signature that a type is strongly-pure qualified or not. For example, is this function weak or strong? pure int foo(T t); But the compiler will be able to tell. I think adding a __traits(isStronglyPure, symbol) will be good for those rare occasions where you really want to ensure purity. static assert(__traits(isStronglyPure, foo)); -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 14:03:05 -0400, klickverbot <see klickverbot.at> wrote:

 On 9/22/10 7:10 PM, Steven Schveighoffer wrote:
 But the compiler will be able to tell. I think adding a
 __traits(isStronglyPure, symbol) will be good for those rare occasions
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

Shouldn't it be possible to implement that in a library template, not requiring any additions to the compiler at all?

Well, if you can convert foo to it's parameter types, maybe. But the compiler will already be doing this, won't it? If it's not, then what's the point of caring? -Steve
Sep 22 2010
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

A pure function also cannot modify any data via its parameters. In other words, its parameters must be transitively const.
Sep 22 2010
next sibling parent reply sclytrack <sclytrack fake.com> writes:
 weaklypure void reverse(int[] x)
 {
     for(int i = 0; i * 2 < x.length; i++)
         swap(x[i], x[$-1-i]);
 }
 pure int foo(const(int)[] x)
 {
      auto x2 = x.dup;
      reverse(x2);
      // do some calculation on x2
      ...
      return calculation;
 }

noglobal void reverse(MyClass x) { x.text = ""; } So weakly-pure should not access global stuff. But that is harder to track by the compiler with mutable parameters. Or weakly-pure is only noglobal when called in pure routines, because of the immutable barrier. Unwritten contract, not checked by the compiler? (Like property getters shouldn't modify the "state" of the object.)
Sep 22 2010
parent reply Don <nospam nospam.com> writes:
sclytrack wrote:
 weaklypure void reverse(int[] x)
 {
     for(int i = 0; i * 2 < x.length; i++)
         swap(x[i], x[$-1-i]);
 }
 pure int foo(const(int)[] x)
 {
      auto x2 = x.dup;
      reverse(x2);
      // do some calculation on x2
      ...
      return calculation;
 }

noglobal void reverse(MyClass x) { x.text = ""; } So weakly-pure should not access global stuff. But that is harder to track by the compiler with mutable parameters.

Not really. You just need to disallow shared, __gshared in pure function declarations.
 Or weakly-pure is only noglobal when called in pure routines, because of the
 immutable barrier.
 Unwritten contract, not checked by the compiler?  (Like property getters
shouldn't
 modify the "state" of the object.)

No. The contract is, that the function only has access to the parameters you have given it. It depends _only_ on its parameters. If you've made all parameters immutable or value types, it's strongly pure. These additional contracts come automatically from the type system.
Sep 22 2010
parent sclytrack <sclytrack fake.com> writes:
 So weakly-pure should not access global stuff. But that is harder
 to track by the compiler with mutable parameters.

declarations.
 Or weakly-pure is only noglobal when called in pure routines, because of the
 immutable barrier.
 Unwritten contract, not checked by the compiler?  (Like property getters
shouldn't
 modify the "state" of the object.)

you have given it. It depends _only_ on its parameters. If you've made all parameters immutable or value types, it's strongly pure. These additional contracts come automatically from the type system.

pure doStuff( [immutable] int a, [unshared] MyClass a) { } 1) If shared it must be immutable. 2) If mutable it must be not shared.
Sep 22 2010
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:
 A pure function also cannot modify any data via its parameters. In 
 other words, its parameters must be transitively const.

Yes, a strongly pure function must have those traits.

No, that would be a weakly pure one.
 But, purity exists to allow for optimization.  A weakly pure function 
 cannot be optimized anymore than a normal function, but a strongly pure 
 can still be optimized even if it calls weakly-pure functions.
 
 I'll give you an example (with a new keyword to help you understand the 
 difference):
 
 weaklypure void reverse(int[] x)
 {
    for(int i = 0; i * 2 < x.length; i++)
        swap(x[i], x[$-1-i]);
 }
 
 pure int foo(const(int)[] x)
 {
     auto x2 = x.dup;
     reverse(x2);
     // do some calculation on x2
     ...
     return calculation;
 }
 
 Hopefully you can see how foo still is pure, while being able to call 
 reverse.  Essentially, weakly-pure functions can be used to build 
 strongly-pure functions.

This isn't what I was talking about - you didn't modify the contents of the array x[].
Sep 22 2010
next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Walter Bright Wrote:

 Steven Schveighoffer wrote:
 A pure function also cannot modify any data via its parameters. In 
 other words, its parameters must be transitively const.

Yes, a strongly pure function must have those traits.

No, that would be a weakly pure one.
 But, purity exists to allow for optimization.  A weakly pure function 
 cannot be optimized anymore than a normal function, but a strongly pure 
 can still be optimized even if it calls weakly-pure functions.
 
 I'll give you an example (with a new keyword to help you understand the 
 difference):
 
 weaklypure void reverse(int[] x)
 {
    for(int i = 0; i * 2 < x.length; i++)
        swap(x[i], x[$-1-i]);
 }
 
 pure int foo(const(int)[] x)
 {
     auto x2 = x.dup;
     reverse(x2);
     // do some calculation on x2
     ...
     return calculation;
 }
 
 Hopefully you can see how foo still is pure, while being able to call 
 reverse.  Essentially, weakly-pure functions can be used to build 
 strongly-pure functions.

This isn't what I was talking about - you didn't modify the contents of the array x[].

The problem is that you can't call reverse or sort unless the function is 'pure,' but the function can not be pure because it modifies its own parameters which means his example function cannot be marked pure even though it is.
Sep 22 2010
parent Walter Bright <newshound2 digitalmars.com> writes:
Jesse Phillips wrote:
 Walter Bright Wrote:
 
 Steven Schveighoffer wrote:
 A pure function also cannot modify any data via its parameters. In 
 other words, its parameters must be transitively const.


 But, purity exists to allow for optimization.  A weakly pure function 
 cannot be optimized anymore than a normal function, but a strongly pure 
 can still be optimized even if it calls weakly-pure functions.
 
 I'll give you an example (with a new keyword to help you understand the 
 difference):
 
 weaklypure void reverse(int[] x) { for(int i = 0; i * 2 < x.length; i++) 
 swap(x[i], x[$-1-i]); }
 
 pure int foo(const(int)[] x) { auto x2 = x.dup; reverse(x2); // do some
 calculation on x2 ... return calculation; }
 
 Hopefully you can see how foo still is pure, while being able to call 
 reverse.  Essentially, weakly-pure functions can be used to build 
 strongly-pure functions.

array x[].

The problem is that you can't call reverse or sort unless the function is 'pure,' but the function can not be pure because it modifies its own parameters which means his example function cannot be marked pure even though it is.

The function in the example does not modify its parameters.
Sep 22 2010
prev sibling parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
 Steven Schveighoffer wrote:
 A pure function also cannot modify any data via its parameters. In 
 other words, its parameters must be transitively const.

Yes, a strongly pure function must have those traits.

No, that would be a weakly pure one.

You're operating with a different definition. There are actually three levels. For extra clarity, let's split strongly-pure into two: immutably pure == all parameters are immutable const pure == all parameters are const weakly pure == no access to globals The interestingly thing is that an immutably pure function can safely call a weakly-pure function. The only things the weakly pure function will be able to change, are variables which are local to the immutably pure functions. The *only* difference between weakly pure and immutably pure is the guarantees which are made about the parameters. And this is already covered by the const modifiers in the function signature. So that concept doesn't actually need to be part of pure. Note that this is possible only because immutable and const are transitive. It wouldn't be possible with head-const, or logical const. Transitive const gives us an absolutely huge win.
Sep 23 2010
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 Walter Bright wrote:
 Steven Schveighoffer wrote:
 A pure function also cannot modify any data via its parameters. In 
 other words, its parameters must be transitively const.

Yes, a strongly pure function must have those traits.

No, that would be a weakly pure one.

You're operating with a different definition. There are actually three levels. For extra clarity, let's split strongly-pure into two: immutably pure == all parameters are immutable const pure == all parameters are const weakly pure == no access to globals The interestingly thing is that an immutably pure function can safely call a weakly-pure function. The only things the weakly pure function will be able to change, are variables which are local to the immutably pure functions.

Ok, I see your point now. It's a good one.
 
 The *only* difference between weakly pure and immutably pure is the 
 guarantees which are made about the parameters. And this is already 
 covered by the const modifiers in the function signature. So that 
 concept doesn't actually need to be part of pure.
 
 Note that this is possible only because immutable and const are 
 transitive. It wouldn't be possible with head-const, or logical const.
 Transitive const gives us an absolutely huge win.

Sep 23 2010
prev sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Don napisał:

 You're operating with a different definition. There are actually three
 levels. For extra clarity, let's split strongly-pure into two:
 
 immutably pure == all parameters are immutable
 const pure == all parameters are const
 weakly pure == no access to globals
 
 The interestingly thing is that an immutably pure function can safely
 call a weakly-pure function. The only things the weakly pure function
 will be able to change, are variables which are local to the immutably
 pure functions.
 
 The only difference between weakly pure and immutably pure is the
 guarantees which are made about the parameters. And this is already
 covered by the const modifiers in the function signature. So that
 concept doesn't actually need to be part of pure.

Good stuff. Separating restriction on globals and on parameters opens the door for compiler-enforced guarantees on a much wider range of cases.
 Note that this is possible only because immutable and const are
 transitive. It wouldn't be possible with head-const, or logical const.

But it would be possible with tail const. Quite a few functions could be made immutably pure if it meant tail immutability of their parameters. One example are manipulations of ranges on immutable structures.
 Transitive const gives us an absolutely huge win.

I can't agree more. -- Tomek
Sep 23 2010
parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Tomek Sowiński napisał:

 Note that this is possible only because immutable and const are
 transitive. It wouldn't be possible with head-const, or logical const.

But it would be possible with tail const. Quite a few functions could be made immutably pure if it meant tail immutability of their parameters. One example are manipulations of ranges on immutable structures.

I take that back. Guarantees on parameters still hold with tail const but it's not a big win in context of purity. It is a huge win elsewhere, though. -- Tomek
Sep 23 2010
prev sibling parent Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:
 A pure function also cannot modify any data via its parameters. In other
words, 
 its parameters must be transitively const.

A weakly pure function should be able to take mutable inputs and modify them. When called from a strongly pure function, the mutable data can only be local variables.
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 15:36:34 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Don wrote:
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

A pure function also cannot modify any data via its parameters. In other words, its parameters must be transitively const.

Yes, a strongly pure function must have those traits. But, purity exists to allow for optimization. A weakly pure function cannot be optimized anymore than a normal function, but a strongly pure can still be optimized even if it calls weakly-pure functions. I'll give you an example (with a new keyword to help you understand the difference): weaklypure void reverse(int[] x) { for(int i = 0; i * 2 < x.length; i++) swap(x[i], x[$-1-i]); } pure int foo(const(int)[] x) { auto x2 = x.dup; reverse(x2); // do some calculation on x2 ... return calculation; } Hopefully you can see how foo still is pure, while being able to call reverse. Essentially, weakly-pure functions can be used to build strongly-pure functions. -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 16:42:09 -0400, sclytrack <sclytrack fake.com> wrote:

 weaklypure void reverse(int[] x)
 {
     for(int i = 0; i * 2 < x.length; i++)
         swap(x[i], x[$-1-i]);
 }
 pure int foo(const(int)[] x)
 {
      auto x2 = x.dup;
      reverse(x2);
      // do some calculation on x2
      ...
      return calculation;
 }

noglobal void reverse(MyClass x) { x.text = ""; } So weakly-pure should not access global stuff. But that is harder to track by the compiler with mutable parameters.

No, it just means they cannot access data except through their arguments. Global stuff is perfectly valid to pass in. However, any references must not be shared, because those could be actively changed by other threads.
 Or weakly-pure is only noglobal when called in pure routines, because of  
 the
 immutable barrier.

pure, weak or strong, means you cannot access global variables. e.g.: int x; shared int y; pure foo(int *n) { x = 5; // illegal *n = 5; // fine } foo(&x); // fine, x is TLS, it is guaranteed not to change from another thread. foo(&y); // nope The difference between weak and strong is that weak pure functions cannot be more optimized than normal functions. Strong pure functions can enjoy the optimizations that functional languages have. -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 16:03:08 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I'll give you an example (with a new keyword to help you understand the  
 difference):

 weaklypure void reverse(int[] x)
 {
     for(int i = 0; i * 2 < x.length; i++)
         swap(x[i], x[$-1-i]);
 }

A better example -- std.algorithm.sort can be made weakly pure (at least for arrays). It doesn't affect anything but the array passed in. Without this, you can't sort arrays within pure functions unless you make a strong-pure functional sort, which will likely not perform as well. I'm kind of excited to think that D might be able to parallelize things in the right places, and allow imperative algorithms to stay imperative within those parallelized functions. -Steve
Sep 22 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

I just thought of something -- we may be defining the common case. For example, D breaks from the default of shared globals in C because most of the time, variables *aren't* shared. When I first heard of shared, I thought surely Walter was losing his mind. Why would shared not be the default, clearly as it's done in C! But after having used the language, I see that it would have been a huge issue if I had to mark everything as unshared. If we can define weakly pure functions this way, they most likely will be way more common than unpure functions. I know I avoid accessing global variables in most of my functions. Think about a range, almost all the methods in a range can be weakly pure. So that means you need to mark every function as pure. Would it not be less tedious to mark unpure functions instead of pure functions? Or am I just going too far with this? OR, maybe we could say, mark strongly pure functions as pure, mark functions that access global data as something else (global?) and weakly pure functions just aren't marked. -Steve
Sep 22 2010
next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Steven Schveighoffer Wrote:

 I just thought of something -- we may be defining the common case.  For  
 example, D breaks from the default of shared globals in C because most of  
 the time, variables *aren't* shared.  When I first heard of shared, I  
 thought surely Walter was losing his mind.  Why would shared not be the  
 default, clearly as it's done in C!  But after having used the language, I  
 see that it would have been a huge issue if I had to mark everything as  
 unshared.
 
 If we can define weakly pure functions this way, they most likely will be  
 way more common than unpure functions.  I know I avoid accessing global  
 variables in most of my functions.  Think about a range, almost all the  
 methods in a range can be weakly pure.  So that means you need to mark  
 every function as pure.
 
 Would it not be less tedious to mark unpure functions instead of pure  
 functions?  Or am I just going too far with this?
 
 OR, maybe we could say, mark strongly pure functions as pure, mark  
 functions that access global data as something else (global?) and weakly  
 pure functions just aren't marked.
 
 -Steve

This is interesting, I'm not sure if either of these approaches could make it into D2, though a conservative approach was originally taken only because it was the most easy to prove. I also find it interesting that D actually takes the 3-stage approach to several things. mutable, const, immutable; safe, trusted, system. And this being unpure, contained, pure. D recognizes the importance of these stages in other areas, so I feel it would be a miss if there is not a good reason the proposal doesn't work. Sadly what you learn from one of the stated areas (mutability, safety, purity) can't be applied to the other. But the pattern is there.
Sep 22 2010
parent reply Don <nospam nospam.com> writes:
Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely will be  
 way more common than unpure functions.  I know I avoid accessing global  
 variables in most of my functions.  Think about a range, almost all the  
 methods in a range can be weakly pure.  So that means you need to mark  
 every function as pure.


I think that's true. I/O is impure, but most other things are not.
 Would it not be less tedious to mark unpure functions instead of pure  
 functions?  Or am I just going too far with this?


You can use pure: at the top of the file. The problem is that there's no syntax for impure, so it can't be used if you have a single impure function (which does logging, for example). You can also wrap a bunch of functions in pure {}.
 OR, maybe we could say, mark strongly pure functions as pure, mark  
 functions that access global data as something else (global?) and weakly  
 pure functions just aren't marked.

 -Steve

This is interesting, I'm not sure if either of these approaches could make it into D2, though a conservative approach was originally taken only because it was the most easy to prove. I also find it interesting that D actually takes the 3-stage approach to several things. mutable, const, immutable; safe, trusted, system. And this being unpure, contained, pure. D recognizes the importance of these stages in other areas, so I feel it would be a miss if there is not a good reason the proposal doesn't work. Sadly what you learn from one of the stated areas (mutability, safety, purity) can't be applied to the other. But the pattern is there.

Interestingly, 'pure' is much simpler. A weakly pure function can modify only the mutable parameters it was given. There's no need for a different language syntax for strongly pure, because it's just the case where the number of mutable parameters it was given are zero. The point is that weakly pure and strongly pure are almost the same. int weakfoo(ref int x) pure; is exactly the same as strongly pure, except that it can also modify one single int, which you specify. Whereas int impurefoo(int x); could be doing _anything_.
Sep 22 2010
parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Thu, 23 Sep 2010 08:47:36 -0400, Robert Jacques <sandford jhu.edu> 
 wrote:
 
 On Thu, 23 Sep 2010 02:51:28 -0400, Don <nospam nospam.com> wrote:

 Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely 
 will be  way more common than unpure functions.  I know I avoid 
 accessing global  variables in most of my functions.  Think about a 
 range, almost all the  methods in a range can be weakly pure.  So 
 that means you need to mark  every function as pure.


I think that's true. I/O is impure, but most other things are not.

The GC also impure :)

The GC must be assumed to be pure even though it's not. Otherwise, pure functions can't do any heap allocation, and that makes them pretty useless in a garbage collected languages. In functional languages, allocating memory is usually considered pure.

In the D spec, it already says that 'new' is considered pure.
Sep 23 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/10 15:25 CDT, Don wrote:
 Steven Schveighoffer wrote:
 On Thu, 23 Sep 2010 08:47:36 -0400, Robert Jacques <sandford jhu.edu>
 wrote:

 On Thu, 23 Sep 2010 02:51:28 -0400, Don <nospam nospam.com> wrote:

 Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely
 will be way more common than unpure functions. I know I avoid
 accessing global variables in most of my functions. Think about a
 range, almost all the methods in a range can be weakly pure. So
 that means you need to mark every function as pure.


I think that's true. I/O is impure, but most other things are not.

The GC also impure :)

The GC must be assumed to be pure even though it's not. Otherwise, pure functions can't do any heap allocation, and that makes them pretty useless in a garbage collected languages. In functional languages, allocating memory is usually considered pure.

In the D spec, it already says that 'new' is considered pure.

Which is wrong :o(. new invokes the constructor, which may do a variety of impure things. Andrei
Sep 23 2010
prev sibling parent reply Gary Whatmore <no spam.spam> writes:
Simen kjaeraas Wrote:

 Steven Schveighoffer <schveiguy yahoo.com> wrote:
 
 Would it not be less tedious to mark unpure functions instead of pure  
 functions?  Or am I just going too far with this?

You're probably going too far for it to be included in D2. D: That said, I believe you are absolutely right, and what you're saying is the right thing to do. UP VOTES!!1

We could use these proposals as a base for D 2.5 or D 3.0. Now that a better purity/constness system seems to solve problems more easily, D 2.0 seems too limited for modern systems programming. Is it finally time to put D 1.0 to rest, D 2.0 in maintenance mode, and concentrate on D 3? The TDPL book was finally published and D 2.0 has been in bugfix mode for a while. Once we have a 64-bit compiler ready, it would be time to move on.
Sep 23 2010
next sibling parent Justin Johansson <no spam.com> writes:
On 23/09/2010 10:36 PM, Gary Whatmore wrote:
 Simen kjaeraas Wrote:

 Steven Schveighoffer<schveiguy yahoo.com>  wrote:

 Would it not be less tedious to mark unpure functions instead of pure
 functions?  Or am I just going too far with this?

You're probably going too far for it to be included in D2. D: That said, I believe you are absolutely right, and what you're saying is the right thing to do. UP VOTES!!1

We could use these proposals as a base for D 2.5 or D 3.0. Now that a better purity/constness system seems to solve problems more easily, D 2.0 seems too limited for modern systems programming. Is it finally time to put D 1.0 to rest, D 2.0 in maintenance mode, and concentrate on D 3? The TDPL book was finally published and D 2.0 has been in bugfix mode for a while. Once we have a 64-bit compiler ready, it would be time to move on.

Agreed. D 1.0 was a stem cell. D 2.0 was part organ. Time is of the essence to evolve to a transplant-able product.
Sep 23 2010
prev sibling next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Gary Whatmore Wrote:

 We could use these proposals as a base for D 2.5 or D 3.0. Now that a better
purity/constness system seems to solve problems more easily, D 2.0 seems too
limited for modern systems programming. Is it finally time to put D 1.0 to
rest, D 2.0 in maintenance mode, and concentrate on D 3? The TDPL book was
finally published and D 2.0 has been in bugfix mode for a while. Once we have a
64-bit compiler ready, it would be time to move on.

I think the question is would this break any code? Don't think so. Would 'pure' ever be used with its current restrictions? To an extent, but I think it may result in multiple functions for the same thing. Don't know. And last if we start work on D3 which has ... such features over D2 who will be using D2? They'll still be waiting for the language to be finished. If it is decided to be the right thing I think we should do it now, not later. Lastly D1 will not be shot. D2 is already placed in maintenance mode, DMD just doesn't have everything implemented. But really, what is this obsession with thinking D1 has to go, or that it is slowing the progress of D2? Walter is running a business, he must support his products or clients won't buy/use them.
Sep 23 2010
prev sibling parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Gary Whatmore napisał:

 We could use these proposals as a base for D 2.5 or D 3.0. Now that a
 better purity/constness system seems to solve problems more easily, D 2.0
 seems too limited for modern systems programming. Is it finally time to
 put D 1.0 to rest, D 2.0 in maintenance mode, and concentrate on D 3? The
 TDPL book was finally published and D 2.0 has been in bugfix mode for a
 while. Once we have a 64-bit compiler ready, it would be time to move on.

As much as I'd like the new proposals, the focus and resources should be dead-on making D2 usable. Make a few stiches if necessary but it must be able to walk on its own, then carry the weight of industrial software development on its back. Most of these proposals come up because people can write increasingly serious code in D. Before that these problems were never stumbled upon. Real production code will surely reveal new weakspots untractible in small grass-root projects and D creators and Phobos programmers will have to allocate time to address them. Surprisingly, it should make D3 better because we'll know what's wrong with D2. We don't really know that now. -- Tomek
Sep 23 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 22, 2010 14:19:42 Steven Schveighoffer wrote:
 I just thought of something -- we may be defining the common case.  For
 example, D breaks from the default of shared globals in C because most of
 the time, variables *aren't* shared.  When I first heard of shared, I
 thought surely Walter was losing his mind.  Why would shared not be the
 default, clearly as it's done in C!  But after having used the language, I
 see that it would have been a huge issue if I had to mark everything as
 unshared.
 
 If we can define weakly pure functions this way, they most likely will be
 way more common than unpure functions.  I know I avoid accessing global
 variables in most of my functions.  Think about a range, almost all the
 methods in a range can be weakly pure.  So that means you need to mark
 every function as pure.
 
 Would it not be less tedious to mark unpure functions instead of pure
 functions?  Or am I just going too far with this?
 
 OR, maybe we could say, mark strongly pure functions as pure, mark
 functions that access global data as something else (global?) and weakly
 pure functions just aren't marked.

That seems like it would clearly mark out the three types of functions, though I'm not quite sure what all the implications are of having to mark in a function's signature that it accesses global data. That might be untenable. However, it would be a *lot* clearer than having to worry about what kind of pure a particular function is. It would also discourage accessing global state, which would generally be good - though shouldn't immutable global state still be accessible even from a pure function? Regardless, requiring "global" or some other keyword would likely be too much of a breaking change at this point - though it might be worth it. In any case, while I'm not sure I entirely understand the discussion in this thread, I like where it's going. As it stands, pure is too restrictive to be generally useful. It's still worth having, but it doesn't do much good generally. Being able to call weakly-pure functions from pure ones would improve things considerably, even if weakly-pure functions got no actual benefit from being weakly-pure. - Jonathan M Davis
Sep 22 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 13:10:36 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Wed, 22 Sep 2010 11:53:22 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 04:13:34 -0400, Don <nospam nospam.com> wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
  A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal:

understanding. So either I've mis-understood twice, or I didn't sufficiently demonstrate my understanding when I made my critique. :) That said, I do think this version is much clearer and understandable.
 If you see a function which has mutable parameters, but is marked as  
 'pure', you can only conclude that it doesn't use global variables.  
 That's not much use on it's own. Let's call this a 'weakly-pure'  
 function.

 However, if you see a function maked as 'pure', which also has only  
 immutable parameters, you have the same guarantee which 'pure' gives  
 us as the moment. Let's call this a 'strongly-pure' function.

 The benefit of the relaxed rule is that a strongly-pure function can  
 call a weakly-pure functions, while remaining strongly-pure.
 This allows very many more functions to become strongly pure.

 The point of the proposal is *not* to provide the weak guarantee. It  
 is to provide the strong guarantee in more situations.

declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable.

What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.
 The problem I think Robert is referring to is, a person cannot always  
 tell just from looking at a signature that a type is strongly-pure  
 qualified or not.

And that you can not manually specify one or the other.
 For example, is this function weak or strong?

 pure int foo(T t);

 But the compiler will be able to tell.  I think adding a  
 __traits(isStronglyPure, symbol) will be good for those rare occasions  
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

This would work, but it wouldn't be self-documenting, etc.
Sep 22 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:

 Would it not be less tedious to mark unpure functions instead of pure  
 functions?  Or am I just going too far with this?

You're probably going too far for it to be included in D2. D: That said, I believe you are absolutely right, and what you're saying is the right thing to do. UP VOTES!!1 -- Simen
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 23 Sep 2010 02:57:28 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:

 The point of the proposal is *not* to provide the weak guarantee.  
 It is to provide the strong guarantee in more situations.

declare that a function should be 'strongly-pure' or 'weakly-pure'.

Yes they can. They just need to declare the parameters to be immutable.

What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.
 The problem I think Robert is referring to is, a person cannot always  
 tell just from looking at a signature that a type is strongly-pure  
 qualified or not.


Yes you can. Just add const to the parameter declaration. This works because const is transitive.
 For example, is this function weak or strong?

 pure int foo(T t);


pure int foo(const T t); is always strong pure, regardless of what T is.

Don, this isn't always strong pure, regardless of what T is. Consider: auto x = foo(t); t.mutate; auto y = foo(t); The value of foo(t) can not be either cached nor parallelized. I think what you meant was to just add immutable to the parameter declaration. i.e. pure int foo(immutable T t); And to answer my own question from earlier in the thread, this works because the automatic conversion of value types is done in a transitive manner. So if someone adds an indirection inside T, foo(t) will then fail to compile, because the implicit conversion of T to immutable(T) will fail. Having to specify all value-types as immutable is a bit annoying inside the function, as shadow variables would have to be used to make inputs mutable again, but this does satisfy the need to force a function signature to be strongly-pure.
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 23 Sep 2010 02:51:28 -0400, Don <nospam nospam.com> wrote:

 Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely will  
 be  way more common than unpure functions.  I know I avoid accessing  
 global  variables in most of my functions.  Think about a range,  
 almost all the  methods in a range can be weakly pure.  So that means  
 you need to mark  every function as pure.


I think that's true. I/O is impure, but most other things are not.

The GC also impure :)
Sep 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 23 Sep 2010 08:47:36 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 23 Sep 2010 02:51:28 -0400, Don <nospam nospam.com> wrote:

 Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely  
 will be  way more common than unpure functions.  I know I avoid  
 accessing global  variables in most of my functions.  Think about a  
 range, almost all the  methods in a range can be weakly pure.  So  
 that means you need to mark  every function as pure.


I think that's true. I/O is impure, but most other things are not.

The GC also impure :)

The GC must be assumed to be pure even though it's not. Otherwise, pure functions can't do any heap allocation, and that makes them pretty useless in a garbage collected languages. In functional languages, allocating memory is usually considered pure. -Steve
Sep 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 21:48:19 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Wed, 22 Sep 2010 13:10:36 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:
 What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.

Arrays are not value types. A value type is one that contains no references, no matter how deep. It is irrelevant if you have to spell out the references or not. another example of something that is not a value type: alias int * iptr; foo(iptr p); iptr is not a value type just because you don't see any * in the signature.
 But the compiler will be able to tell.  I think adding a  
 __traits(isStronglyPure, symbol) will be good for those rare occasions  
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

This would work, but it wouldn't be self-documenting, etc.

Hm... OK, well I disagree, it looks like it's documented to me. I don't see a difference between tagging something as strongly pure and putting in the static assert (except verbosity of course). I will note that I think the above would be a rare situation. -Steve
Sep 23 2010
prev sibling next sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Don wrote:

 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.
 

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

I understand now, this is a great proposal! What confused me was the label of 'pure functions' for functions that actually aren't pure in the usual way that purity is defined.
Sep 23 2010
parent Don <nospam nospam.com> writes:
Lutger wrote:
 Don wrote:
 
 Don wrote:
 The docs currently state that:
 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

I understand now, this is a great proposal! What confused me was the label of 'pure functions' for functions that actually aren't pure in the usual way that purity is defined.

Yes. The fact that you can declare a mutable variable inside a pure function was always a bit unconventional. This pushes that even further. Maybe to the extent that 'pure' becomes a misnomer.
Sep 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 23 Sep 2010 16:43:13 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/10 15:25 CDT, Don wrote:
 Steven Schveighoffer wrote:
 On Thu, 23 Sep 2010 08:47:36 -0400, Robert Jacques <sandford jhu.edu>
 wrote:

 On Thu, 23 Sep 2010 02:51:28 -0400, Don <nospam nospam.com> wrote:

 Jesse Phillips wrote:
 Steven Schveighoffer Wrote:
 If we can define weakly pure functions this way, they most likely
 will be way more common than unpure functions. I know I avoid
 accessing global variables in most of my functions. Think about a
 range, almost all the methods in a range can be weakly pure. So
 that means you need to mark every function as pure.


I think that's true. I/O is impure, but most other things are not.

The GC also impure :)

The GC must be assumed to be pure even though it's not. Otherwise, pure functions can't do any heap allocation, and that makes them pretty useless in a garbage collected languages. In functional languages, allocating memory is usually considered pure.

In the D spec, it already says that 'new' is considered pure.

Which is wrong :o(. new invokes the constructor, which may do a variety of impure things.

Wouldn't they need to be pure constructors to be called in pure functions? -Steve
Sep 23 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 23 Sep 2010 10:05:45 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 21:48:19 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Wed, 22 Sep 2010 13:10:36 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:
 What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.

Arrays are not value types. A value type is one that contains no references, no matter how deep. It is irrelevant if you have to spell out the references or not.

Arrays are implemented as a struct. And as per the language spec: (http://www.digitalmars.com/d/2.0/struct.html) all structs are value types *to the compiler*. This doesn't mean that logically, from the programmer's point of view, they aren't providing reference semantics.
 another example of something that is not a value type:

 alias int * iptr;

 foo(iptr p);

 iptr is not a value type just because you don't see any * in the  
 signature.

*sigh* I explicitly referred to the type system/compiler. And to the type system iptr is a pointer, no matter what you call it.
 But the compiler will be able to tell.  I think adding a  
 __traits(isStronglyPure, symbol) will be good for those rare occasions  
 where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

This would work, but it wouldn't be self-documenting, etc.

Hm... OK, well I disagree, it looks like it's documented to me. I don't see a difference between tagging something as strongly pure and putting in the static assert (except verbosity of course). I will note that I think the above would be a rare situation.

To clarify, self-documenting => be able to automatically shows up in ddoc, etc.
Sep 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 23 Sep 2010 18:26:28 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 23 Sep 2010 10:05:45 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 21:48:19 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Wed, 22 Sep 2010 13:10:36 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:
 What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.

Arrays are not value types. A value type is one that contains no references, no matter how deep. It is irrelevant if you have to spell out the references or not.

Arrays are implemented as a struct. And as per the language spec: (http://www.digitalmars.com/d/2.0/struct.html) all structs are value types *to the compiler*. This doesn't mean that logically, from the programmer's point of view, they aren't providing reference semantics.

structs can have value copy semantics. But for a struct that contains a reference, the references have reference semantics, which makes the struct a reference type. Look, terms can be debated, but here is the bottom line: A struct with a reference type inside it cannot be implicitly converted to immutable, whereas a struct with only non-reference types in it can. The compiler knows this, and makes its decision on what must be immutable or not based on that knowledge. Call it whatever you want, but that's how it works internally, and it's not hidden from the compiler. When you say "what about value types" what do you mean? I thought you meant types which have value copy semantics, which can only have non-references in them. A type that has both value copy semantics and reference semantics, I guess you could call it a hybrid type (an array is such a type), but primarily it is treated like a reference. If that's what you're asking about, then those also would make a function weakly pure.
 another example of something that is not a value type:

 alias int * iptr;

 foo(iptr p);

 iptr is not a value type just because you don't see any * in the  
 signature.

*sigh* I explicitly referred to the type system/compiler. And to the type system iptr is a pointer, no matter what you call it.

This is also purely a reference type. Semantically, it's exactly the same as a pointer, except you have to refer to the member. struct iptr { int *ptr; } You are splitting hairs here, and it's not really furthering the discussion.
 But the compiler will be able to tell.  I think adding a  
 __traits(isStronglyPure, symbol) will be good for those rare  
 occasions where you really want to ensure purity.

 static assert(__traits(isStronglyPure, foo));

 -Steve

This would work, but it wouldn't be self-documenting, etc.

Hm... OK, well I disagree, it looks like it's documented to me. I don't see a difference between tagging something as strongly pure and putting in the static assert (except verbosity of course). I will note that I think the above would be a rare situation.

To clarify, self-documenting => be able to automatically shows up in ddoc, etc.

OK, that's a fair point, I hadn't thought of that aspect. The only counter I have for that is, does it matter in the docs whether the compiler will optimize via parallelizing or not? Or is it more important just to have the compiler refuse to compile if it can't parallelize? -Steve
Sep 24 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 24 Sep 2010 09:32:40 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 23 Sep 2010 18:26:28 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 23 Sep 2010 10:05:45 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 21:48:19 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Wed, 22 Sep 2010 13:10:36 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Wed, 22 Sep 2010 12:00:16 -0400, Robert Jacques  
 <sandford jhu.edu> wrote:
 What about value types?

Value types are implicitly convertable to immutable, so they can be strongly-pure.

No their not. Remember, arrays and other structs are value types in the type system. Logically, they may be reference types, but as far as their type signature goes, they are value types.

Arrays are not value types. A value type is one that contains no references, no matter how deep. It is irrelevant if you have to spell out the references or not.

Arrays are implemented as a struct. And as per the language spec: (http://www.digitalmars.com/d/2.0/struct.html) all structs are value types *to the compiler*. This doesn't mean that logically, from the programmer's point of view, they aren't providing reference semantics.

structs can have value copy semantics. But for a struct that contains a reference, the references have reference semantics, which makes the struct a reference type.

A struct is never a reference type. It may have reference semantics, but this is a high-order feature that is uncheckable by the compiler. At best the compiler can detect that a struct contains references, not how those references are exposed/managed by the API.
 Look, terms can be debated, but here is the bottom line: A struct with a  
 reference type inside it cannot be implicitly converted to immutable,  
 whereas a struct with only non-reference types in it can.  The compiler  
 knows this, and makes its decision on what must be immutable or not  
 based on that knowledge.  Call it whatever you want, but that's how it  
 works internally, and it's not hidden from the compiler.

Right. Immutable transitivity for the win.
 When you say "what about value types" what do you mean?  I thought you  
 meant types which have value copy semantics, which can only have  
 non-references in them.

Given that I explicitly referred to the spec and compiler and contrasted that to logical/semantic value types, I thought I was pretty clear about what I mean.
Sep 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 24 Sep 2010 10:33:10 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Fri, 24 Sep 2010 09:32:40 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 structs can have value copy semantics.  But for a struct that contains  
 a reference, the references have reference semantics, which makes the  
 struct a reference type.

A struct is never a reference type. It may have reference semantics, but this is a high-order feature that is uncheckable by the compiler. At best the compiler can detect that a struct contains references, not how those references are exposed/managed by the API.

Containing references means it cannot be cast to immutable, the only important thing when determining the strength of purity here. This whole discussion is going nowhere, you haven't made any real points. Can you think of a case where the compiler would be wrong in determining purity strength for "value types" as you define them? An example would be most helpful. -Steve
Sep 24 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 24 Sep 2010 10:47:16 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 24 Sep 2010 10:33:10 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Fri, 24 Sep 2010 09:32:40 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 structs can have value copy semantics.  But for a struct that contains  
 a reference, the references have reference semantics, which makes the  
 struct a reference type.

A struct is never a reference type. It may have reference semantics, but this is a high-order feature that is uncheckable by the compiler. At best the compiler can detect that a struct contains references, not how those references are exposed/managed by the API.

Containing references means it cannot be cast to immutable, the only important thing when determining the strength of purity here. This whole discussion is going nowhere, you haven't made any real points. Can you think of a case where the compiler would be wrong in determining purity strength for "value types" as you define them? An example would be most helpful. -Steve

I feel like I've upset you and would like to make amends. First, this discussion was never about the compiler being able to determine purity. That's easy. The point raised was about whether the programmer could enforce strong-purity for any strongly-pure fucntion. The answer is that you can by making all arguments immutable, which is a fairly minor restriction on the callee. Second, this thread (fiber?), has veered a little off-topic due to a misunderstanding/debate about value-types vs value-semantics and reference-types vs reference-semantics. Perhaps it's because I work with large, 'hybrid' structs a lot, but the difference between value-type and value-semantics is very large in my mind. Third, this fiber started, in part, due to a question by regarding value-types being answered with value-semantics. Forth, I raised the question regarding value-types in part due to half-remembered old posts in which value-semantics were discussed but under the label of value-types, thus causing confusion in my brain. And because it was late at night, I decided to ask a simple question instead of testing DMD's current behavior myself. I did end up testing DMD the next day (and then banging my head on a wall), but by then all this had started.
Sep 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 24 Sep 2010 12:16:23 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Fri, 24 Sep 2010 10:47:16 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Fri, 24 Sep 2010 10:33:10 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Fri, 24 Sep 2010 09:32:40 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 structs can have value copy semantics.  But for a struct that  
 contains a reference, the references have reference semantics, which  
 makes the struct a reference type.

A struct is never a reference type. It may have reference semantics, but this is a high-order feature that is uncheckable by the compiler. At best the compiler can detect that a struct contains references, not how those references are exposed/managed by the API.

Containing references means it cannot be cast to immutable, the only important thing when determining the strength of purity here. This whole discussion is going nowhere, you haven't made any real points. Can you think of a case where the compiler would be wrong in determining purity strength for "value types" as you define them? An example would be most helpful. -Steve

I feel like I've upset you and would like to make amends. First, this discussion was never about the compiler being able to determine purity. That's easy. The point raised was about whether the programmer could enforce strong-purity for any strongly-pure fucntion. The answer is that you can by making all arguments immutable, which is a fairly minor restriction on the callee. Second, this thread (fiber?), has veered a little off-topic due to a misunderstanding/debate about value-types vs value-semantics and reference-types vs reference-semantics. Perhaps it's because I work with large, 'hybrid' structs a lot, but the difference between value-type and value-semantics is very large in my mind. Third, this fiber started, in part, due to a question by regarding value-types being answered with value-semantics. Forth, I raised the question regarding value-types in part due to half-remembered old posts in which value-semantics were discussed but under the label of value-types, thus causing confusion in my brain. And because it was late at night, I decided to ask a simple question instead of testing DMD's current behavior myself. I did end up testing DMD the next day (and then banging my head on a wall), but by then all this had started.

You haven't upset me, I just couldn't follow what you were saying :) When you said "what about value types," I thought it was a suggestion that value types would make it hard to determine whether a function was strongly pure or weakly pure, and my interpretation of what "value type" means sent us off on this tangent. I think we both agree that the proposed changes by Don will work with the compiler, and I understand your desire to ensure a function is strongly-pure, and why the proposal does not give an easy solution to that. So no hard feelings, we can move on and forget this whole thread ever happened :) -Steve
Sep 24 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 24 Sep 2010 13:09:05 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Fri, 24 Sep 2010 12:16:23 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Fri, 24 Sep 2010 10:47:16 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Fri, 24 Sep 2010 10:33:10 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Fri, 24 Sep 2010 09:32:40 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 structs can have value copy semantics.  But for a struct that  
 contains a reference, the references have reference semantics, which  
 makes the struct a reference type.

A struct is never a reference type. It may have reference semantics, but this is a high-order feature that is uncheckable by the compiler. At best the compiler can detect that a struct contains references, not how those references are exposed/managed by the API.

Containing references means it cannot be cast to immutable, the only important thing when determining the strength of purity here. This whole discussion is going nowhere, you haven't made any real points. Can you think of a case where the compiler would be wrong in determining purity strength for "value types" as you define them? An example would be most helpful. -Steve

I feel like I've upset you and would like to make amends. First, this discussion was never about the compiler being able to determine purity. That's easy. The point raised was about whether the programmer could enforce strong-purity for any strongly-pure fucntion. The answer is that you can by making all arguments immutable, which is a fairly minor restriction on the callee. Second, this thread (fiber?), has veered a little off-topic due to a misunderstanding/debate about value-types vs value-semantics and reference-types vs reference-semantics. Perhaps it's because I work with large, 'hybrid' structs a lot, but the difference between value-type and value-semantics is very large in my mind. Third, this fiber started, in part, due to a question by regarding value-types being answered with value-semantics. Forth, I raised the question regarding value-types in part due to half-remembered old posts in which value-semantics were discussed but under the label of value-types, thus causing confusion in my brain. And because it was late at night, I decided to ask a simple question instead of testing DMD's current behavior myself. I did end up testing DMD the next day (and then banging my head on a wall), but by then all this had started.

You haven't upset me, I just couldn't follow what you were saying :) When you said "what about value types," I thought it was a suggestion that value types would make it hard to determine whether a function was strongly pure or weakly pure, and my interpretation of what "value type" means sent us off on this tangent. I think we both agree that the proposed changes by Don will work with the compiler, and I understand your desire to ensure a function is strongly-pure, and why the proposal does not give an easy solution to that. So no hard feelings, we can move on and forget this whole thread ever happened :) -Steve

Agreed :)
Sep 24 2010
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 22/09/2010 09:13, Don wrote:
 Don wrote:
 The docs currently state that:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

Wow. It seems that not one person who has responded so far has understood this proposal! I'll try again. Under this proposal: If you see a function which has mutable parameters, but is marked as 'pure', you can only conclude that it doesn't use global variables. That's not much use on it's own. Let's call this a 'weakly-pure' function. However, if you see a function maked as 'pure', which also has only immutable parameters, you have the same guarantee which 'pure' gives us as the moment. Let's call this a 'strongly-pure' function. The benefit of the relaxed rule is that a strongly-pure function can call a weakly-pure functions, while remaining strongly-pure. This allows very many more functions to become strongly pure. The point of the proposal is *not* to provide the weak guarantee. It is to provide the strong guarantee in more situations.

I'm confused: Isn't this essentially the same as the "partially pure" functions idea that was discussed as far back as 2008? And wasn't it agreed already that this would be the way things would work? -- Bruno Medeiros - Software Engineer
Oct 25 2010
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 25/10/2010 13:46, Bruno Medeiros wrote:
 I'm confused: Isn't this essentially the same as the "partially pure"
 functions idea that was discussed as far back as 2008? And wasn't it
 agreed already that this would be the way things would work?

I've only now seen this additional post (that shows up out of it's thread in my ThunderBird) : On 25/09/2010 03:53, Jason House wrote:
 It looks like your proposal was accepted. Walter just checked in 

http://www.dsource.org/projects/dmd/changeset/687 I was getting worried otherwise -- Bruno Medeiros - Software Engineer
Oct 25 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-09-21 22:21:39 -0400, Don <nospam nospam.com> said:

 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:
 
 A pure function does not read or write any global mutable state.
 
 If a pure function has parameters that are all immutable or are 
 implicitly convertible to immutable, then the compiler is permitted to 
 cache the results.

This is what I was advocating a long time ago when pure came out. The current rule prevent you from doing any serious work from inside a pure function. Inside a pure function you can only call a pure function, you can't even call functions that take mutable references to local variables. This makes it impossible to use generic algorithms, such as sort, to manipulate local mutable data, even when those algorithms have no need for a global state. So I certainly support this change. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 22 2010
prev sibling next sibling parent klickverbot <see klickverbot.at> writes:
On 9/22/10 4:21 AM, Don wrote:
 PROPOSAL:
 Drop the first requirement. Only one requirement is necessary:

 A pure function does not read or write any global mutable state.

 If a pure function has parameters that are all immutable or are
 implicitly convertible to immutable, then the compiler is permitted to
 cache the results.

I also support this proposal. In its current state, pure is pretty much useless, since you almost can't do anything serious with pure functions, let alone generic programming – it seems more like an excuse for not adding logic to determine whether a result can be cached to the optimizer than a true language feature…
Sep 22 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu> said:

 So removing the concurrency safety from pure would greatly  expand the  
 number of pure functions, however, automatic parallelism would  be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

What is lost is my ability to declare a function does x in the signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.
 The interesting thing with this change is that you can now call mutators  
 functions on the local variables inside the pure function, because those  
 can be made pure. You can't even iterate over a range inside a pure  
 function without this!

 	pure int test() {
 		int result;
 		auto r = iota(0, 10);
 		while (!r.empty) {
 			result += r;
 			r.popFront(); // can't be pure by current rules!
 		}
 		return result;
 	}

I did mention this benefit in my post, but this example really shows just how awesome it really is. Which is why I think the concept of a simplified pure is so powerful and be included in the language. I just want it included in addition to the current pure concept.
Sep 22 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 11:44:00 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:

 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu> said:

 So removing the concurrency safety from pure would greatly  expand  
 the number of pure functions, however, automatic parallelism would   
 be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.

No, you haven't lost that at all. Any function marked as pure still has no access to any state, other than what is provided by its parameters. It is still thread-safe.

No, it isn't. A strongly-pure function is thread safe, but a weakly-pure function isn't. Since strong/weak is automatically determined by the compiler, a function's strength can switch due to long distance code changes. This wouldn't be an issue today, but tomorrow large strongly-pure functions will be parallelized automatically in a manner akin to inlining (outlining?). Then it makes a difference. However, these issues feel more like D3 concerns than D2 concerns, particularly when you consider the advantages of making tasks/futures a language level concept.
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 11:55:57 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Wed, 22 Sep 2010 11:44:00 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:

 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu>  
 said:

 So removing the concurrency safety from pure would greatly  expand  
 the number of pure functions, however, automatic parallelism would   
 be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.

No, you haven't lost that at all. Any function marked as pure still has no access to any state, other than what is provided by its parameters. It is still thread-safe.

No, it isn't. A strongly-pure function is thread safe, but a weakly-pure function isn't. Since strong/weak is automatically determined by the compiler, a function's strength can switch due to long distance code changes. This wouldn't be an issue today, but tomorrow large strongly-pure functions will be parallelized automatically in a manner akin to inlining (outlining?). Then it makes a difference. However, these issues feel more like D3 concerns than D2 concerns, particularly when you consider the advantages of making tasks/futures a language level concept.

It's thread safe. Calling a strongly-pure function concurrently on two threads is not multi-threading, it's an optimization.
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 15:08:26 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 11:44:00 -0400, Don <nospam nospam.com> wrote:

 Robert Jacques wrote:
 On Wed, 22 Sep 2010 07:54:26 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:

 On 2010-09-22 01:26:01 -0400, "Robert Jacques" <sandford jhu.edu>  
 said:

 So removing the concurrency safety from pure would greatly  expand  
 the number of pure functions, however, automatic parallelism would   
 be lost.

Don clearly mentioned that this is not lost. Basically, for safe parallelism what you need is a function that has a) pure and b) no mutable reference parameter. Both are easily checkable at compile time, you'd just need to change your test for pure for a test that also checks the arguments.

signature and for the compiler to check that. I really want to know if code monkey A changed some type's implementation to be thread unsafe, because detecting and tracking down a loss of performance due to loss of automatic parallelism is devilish.

No, you haven't lost that at all. Any function marked as pure still has no access to any state, other than what is provided by its parameters. It is still thread-safe.

weakly-pure function isn't. Since strong/weak is automatically determined by the compiler, a function's strength can switch due to long distance code changes.

This is completely incorrect. It can only change strength if its signature changes.

Hypothetical counter-case struct S { version(stronglypure) string s; else char[] s; } pure foo(S s); // changes strength depending on S' contents -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 15:21:57 -0400, klickverbot <see klickverbot.at> wrote:

 On 9/22/10 9:14 PM, Steven Schveighoffer wrote:
 Hypothetical counter-case

 struct S
 {
 version(stronglypure)
 string s;
 else
 char[] s;
 }

 pure foo(S s); // changes strength depending on S' contents

 -Steve

This is a change to the signature of foo – S with the stronglypure version defined and S without it are two completely distinct types.

Wait, I didn't change foo's signature at all. This is what the OP meant by long-range changes. S can be defined far away from foo, and out of the author of foo's control. -Steve
Sep 22 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 22 Sep 2010 15:52:03 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Wed, 22 Sep 2010 15:21:57 -0400, klickverbot <see klickverbot.at>  
 wrote:

 On 9/22/10 9:14 PM, Steven Schveighoffer wrote:
 Hypothetical counter-case

 struct S
 {
 version(stronglypure)
 string s;
 else
 char[] s;
 }

 pure foo(S s); // changes strength depending on S' contents

 -Steve

This is a change to the signature of foo – S with the stronglypure version defined and S without it are two completely distinct types.

meant by long-range changes. S can be defined far away from foo, and out of the author of foo's control. -Steve

Is that really what he meant? Note that foo() would need to be recompiled, and the silent change in strength would occur only if it still compiles despite a significant change in the definition of one of types.

This isn't hard to come up with: struct S { string s; version(stronglypure) { } else char[] notused; }
 If you change the definition of a type, you're always going to have  
 influences everywhere it's used in a function definition. I'd hardly  
 call that a 'long-range' change.

I think the OP's request is to be able to tell the compiler "I want you to make *sure* this function is strongly pure." Not that I think it's terribly important, I'm just trying to convey what has been said. Adding another member to a struct is a much better example than my first, and I don't think it's an uncommon thing. Whether its extremely important to have this ability, I'm not sure about. It's sort of akin to having an inline directive. -Steve
Sep 22 2010
prev sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 22 Sep 2010 15:52:03 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Wed, 22 Sep 2010 15:21:57 -0400, klickverbot <see klickverbot.at>  
 wrote:

 On 9/22/10 9:14 PM, Steven Schveighoffer wrote:
 Hypothetical counter-case

 struct S
 {
 version(stronglypure)
 string s;
 else
 char[] s;
 }

 pure foo(S s); // changes strength depending on S' contents

 -Steve

This is a change to the signature of foo – S with the stronglypure version defined and S without it are two completely distinct types.

meant by long-range changes. S can be defined far away from foo, and out of the author of foo's control. -Steve

Is that really what he meant?

Yes.
 Note that foo() would need to be recompiled, and the silent change in  
 strength would occur only if it still compiles despite a significant  
 change in the definition of one of types.

 If you change the definition of a type, you're always going to have  
 influences everywhere it's used in a function definition. I'd hardly  
 call that a 'long-range' change.

Consider the case of adding some new feature Y to S or a change in the internal implementation. The API from foo's perspective doesn't change one iota, but the purity could.
Sep 22 2010