
digitalmars.D - Memory allocation purity

reply "Brian Schott" <briancschott gmail.com> writes:
What is the plan for the "pure"-ity of memory management?

Right now the "new" operator is considered to be pure even though 
it is not, but related functions like malloc, GC.addRange, 
GC.removeRange, and others are not.

This prevents large portions of std.allocator from being pure. 
This then prevents any containers library built on std.allocator 
from being pure, which does the same for any functions and data 
structures written using those containers.

If malloc can never be considered pure, even when hidden behind 
an allocator, why can it be considered pure when hidden behind 
the GC?
May 14 2014
next sibling parent reply "w0rp" <devw0rp gmail.com> writes:
On Wednesday, 14 May 2014 at 22:42:47 UTC, Brian Schott wrote:
 What is the plan for the "pure"-ity of memory management?

 Right now the "new" operator is considered to be pure even 
 though it is not, but related functions like malloc, 
 GC.addRange, GC.removeRange, and others are not.

 This prevents large portions of std.allocator from being pure. 
 This then prevents any containers library built on 
 std.allocator from being pure, which does the same for any 
 functions and data structures written using those containers.

 If malloc can never be considered pure, even when hidden behind 
 an allocator, why can it be considered pure when hidden behind 
 the GC?
I think even C malloc should be considered pure. True, it affects global state by allocating memory, but it never changes existing values, it just allows for new values. free is pure because it isn't side-effecting, it deallocates what you give it. That's just my perspective on it though, others might have other views on it.
May 14 2014
parent reply "Idan Arye" <GenericNPC gmail.com> writes:
On Wednesday, 14 May 2014 at 22:50:10 UTC, w0rp wrote:
 I think even C malloc should be considered pure. True, it 
 affects global state by allocating memory, but it never changes 
 existing values, it just allows for new values. free is pure 
 because it isn't side-effecting, it deallocates what you give 
 it. That's just my perspective on it though, others might have 
 other views on it.
`free` is not pure, because if you have a reference to that memory that reference is no longer valid.
May 14 2014
next sibling parent "Meta" <jared771 gmail.com> writes:
On Thursday, 15 May 2014 at 01:33:36 UTC, Idan Arye wrote:
 On Wednesday, 14 May 2014 at 22:50:10 UTC, w0rp wrote:
 I think even C malloc should be considered pure. True, it 
 affects global state by allocating memory, but it never 
 changes existing values, it just allows for new values. free 
 is pure because it isn't side-effecting, it deallocates what 
 you give it. That's just my perspective on it though, others 
 might have other views on it.
`free` is not pure, because if you have a reference to that memory that reference is no longer valid.
It is weakly pure, i.e., it can only affect the world through the parameters given to it (and the state of the computer's memory... but we should ignore that).
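To make the weak/strong distinction concrete, here is a minimal D sketch (the names fill and square are invented for illustration and are not from this thread): a weakly pure function may still write through its mutable parameters, while a strongly pure one takes only values without mutable indirection, so its calls can be cached or reordered.

// Weakly pure: touches no globals, but may write through its mutable
// slice parameter, so calls cannot safely be cached or reordered.
void fill(int[] buf, int value) pure nothrow @safe
{
    foreach (ref b; buf)
        b = value;
}

// Strongly pure: value parameters only, so two calls with the same
// argument are interchangeable.
int square(int x) pure nothrow @safe
{
    return x * x;
}

void main()
{
    int[4] buf;
    fill(buf[], 7);
    assert(buf[0] == 7 && square(3) == 9);
}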
May 14 2014
prev sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 01:33:34 +0000
Idan Arye via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Wednesday, 14 May 2014 at 22:50:10 UTC, w0rp wrote:
 I think even C malloc should be considered pure. True, it
 affects global state by allocating memory, but it never changes
 existing values, it just allows for new values. free is pure
 because it isn't side-effecting, it deallocates what you give
 it. That's just my perspective on it though, others might have
 other views on it.
`free` is not pure, because if you have a reference to that memory that reference is no longer valid.
But does that really matter from the perspective of pure? That's really more of a safety issue. There might be some way that that violates purity, but I can't think of one at the moment. free can't be strongly pure, because its arguments couldn't be immutable (or even const) without violating the type system, but I _think_ that it's fine for free to be weakly pure. It's quite possible that I'm missing something though. - Jonathan M Davis
May 14 2014
prev sibling next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 14 May 2014 22:42:46 +0000
Brian Schott via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 What is the plan for the "pure"-ity of memory management?

 Right now the "new" operator is considered to be pure even though
 it is not, but related functions like malloc, GC.addRange,
 GC.removeRange, and others are not.

 This prevents large portions of std.allocator from being pure.
 This then prevents any containers library built on std.allocator
 from being pure, which does the same for any functions and data
 structures written using those containers.

 If malloc can never be considered pure, even when hidden behind
 an allocator, why can it be considered pure when hidden behind
 the GC?
I think malloc should definitely be considered pure for the same reasons that new is considered pure. I don't know about the other memory management functions though. I'd really have to think through their side effects to have an opinion on them. If we can make them pure though, that would certainly help with ensuring that allocators can be pure. - Jonathan M Davis
May 14 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/14/2014 3:42 PM, Brian Schott wrote:
 If malloc can never be considered pure, even when hidden behind an allocator,
It cannot be pure as long as it can fail.
 why can it be considered pure when hidden behind the GC?
Because GC failures are not recoverable, so the pure allocation cannot fail.
May 14 2014
next sibling parent "Meta" <jared771 gmail.com> writes:
On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote:
 On 5/14/2014 3:42 PM, Brian Schott wrote:
 If malloc can never be considered pure, even when hidden 
 behind an allocator,
It cannot be pure as long as it can fail.
 why can it be considered pure when hidden behind the GC?
Because GC failures are not recoverable, so the pure allocation cannot fail.
Malloc is a tricky case. The results it returns are theoretically always unique, and it is referentially transparent if we pretend that we have infinite memory.
May 14 2014
prev sibling next sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote:
 On 5/14/2014 3:42 PM, Brian Schott wrote:
 If malloc can never be considered pure, even when hidden 
 behind an allocator,
It cannot be pure as long as it can fail.
 why can it be considered pure when hidden behind the GC?
Because GC failures are not recoverable, so the pure allocation cannot fail.
Can we say that Mallocator failures are not recoverable?
May 14 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/14/2014 5:44 PM, Brian Schott wrote:
 Can we say that Mallocator failures are not recoverable?
malloc itself does not have that property. But you could design a wrapper for it that did.
May 14 2014
parent reply "Brian Schott" <briancschott gmail.com> writes:
On Thursday, 15 May 2014 at 00:48:52 UTC, Walter Bright wrote:
 On 5/14/2014 5:44 PM, Brian Schott wrote:
 Can we say that Mallocator failures are not recoverable?
malloc itself does not have that property. But you could design a wrapper for it that did.
I'm concerned specifically with this wrapper:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d#L773

We need to make these functions pure if they are going to be usable. Removing the stdlib import and adding

private extern (C)
{
    void* malloc(size_t) pure nothrow @trusted;
    void free(void*) pure nothrow @trusted;
    void* realloc(void*, size_t) pure nothrow @trusted;
}

as well as throwing an OutOfMemoryError if they fail should be sufficient, correct?
May 14 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/14/14, 6:11 PM, Brian Schott wrote:
 On Thursday, 15 May 2014 at 00:48:52 UTC, Walter Bright wrote:
 On 5/14/2014 5:44 PM, Brian Schott wrote:
 Can we say that Mallocator failures are not recoverable?
malloc itself does not have that property. But you could design a wrapper for it that did.
 I'm concerned specifically with this wrapper:

 https://github.com/andralex/phobos/blob/allocator/std/allocator.d#L773

 We need to make these functions pure if they are going to be usable. Removing the stdlib import and adding

 private extern (C)
 {
     void* malloc(size_t) pure nothrow @trusted;
     void free(void*) pure nothrow @trusted;
     void* realloc(void*, size_t) pure nothrow @trusted;
 }

 as well as throwing an OutOfMemoryError if they fail should be sufficient, correct?
I think so. A more cautious solution would be to define PureMallocator in addition to Mallocator. Andrei
May 14 2014
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 14 May 2014 21:11:30 -0400, Brian Schott <briancschott gmail.com>  
wrote:

 On Thursday, 15 May 2014 at 00:48:52 UTC, Walter Bright wrote:
 On 5/14/2014 5:44 PM, Brian Schott wrote:
 Can we say that Mallocator failures are not recoverable?
malloc itself does not have that property. But you could design a wrapper for it that did.
 I'm concerned specifically with this wrapper:

 https://github.com/andralex/phobos/blob/allocator/std/allocator.d#L773

 We need to make these functions pure if they are going to be usable. Removing the stdlib import and adding

 private extern (C)
 {
     void* malloc(size_t) pure nothrow @trusted;
     void free(void*) pure nothrow @trusted;
     void* realloc(void*, size_t) pure nothrow @trusted;
 }
Be careful. The above is all correct, but as has been discussed, the minute you start making returns immutable or parameters immutable, they become strong-pure, which has undesirable properties for allocators. If you wrap the above with calls to typed data, then strong-pure might be inferred. The compiler can specially designate things like idup to make sure they always are considered weak-pure. But library code has to be more cautious. -Steve
May 15 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/14/14, 5:44 PM, Brian Schott wrote:
 On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote:
 On 5/14/2014 3:42 PM, Brian Schott wrote:
 If malloc can never be considered pure, even when hidden behind an
 allocator,
It cannot be pure as long as it can fail.
 why can it be considered pure when hidden behind the GC?
Because GC failures are not recoverable, so the pure allocation cannot fail.
Can we say that Mallocator failures are not recoverable?
FWIW std.allocator allows constructing a bunch of small-business allocators for which failure to allocate is entirely legit. The methods of such allocator may be weakly pure. An allocator that ultimately falls back to GCAllocator and "never fails" should be pure - better yet, inferred as such. Andrei
May 14 2014
prev sibling next sibling parent reply "Kapps" <opantm2+spam gmail.com> writes:
On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote:
 Because GC failures are not recoverable, so the pure allocation 
 cannot fail.
Is this intentionally the case? I always thought you could handle it with onOutOfMemoryError if you knew that your program could handle running out of memory and free some memory (i.e., cached data) to recover. I've never actually had a use for it myself, so I'm just basing this off of newsgroup discussions I remember (possibly incorrectly) reading.
May 14 2014
parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 01:25:52 +0000
Kapps via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote:
 Because GC failures are not recoverable, so the pure allocation
 cannot fail.
Is this intentionally the case? I always thought you could handle it with onOutOfMemoryError if you knew that your program could handle running out of memory and free some memory (i.e., cached data) to recover. I've never actually had a use for it myself, so I'm just basing this off of newsgroup discussions I remember (possibly incorrectly) reading.
It's intended that all Errors be considered unrecoverable. You can catch them when necessary to try and do cleanup, but that's fairly risky, and you should probably only do it when you're very sure of what's going on (which generally means that the Error was thrown very close to where you're catching it). There's no guarantee that automatic cleanup occurs when Errors are thrown (unlike with Exceptions), so when an Error is thrown, the program tends to be in a weird state on top of the weird state that caused the Error to be thrown in the first place. In theory, you could catch an OutOfMemoryError very close to where it was thrown and deal with it, but that's really not what was intended. - Jonathan M Davis
May 14 2014
prev sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 14 May 2014 17:00:39 -0700
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 5/14/2014 3:42 PM, Brian Schott wrote:
 If malloc can never be considered pure, even when hidden behind an
 allocator,
It cannot be pure as long as it can fail.
 why can it be considered pure when hidden behind the GC?
Because GC failures are not recoverable, so the pure allocation cannot fail.
Then we should create a wrapper for malloc which throws a MemoryError when malloc fails. Then malloc failures would be the same as GC failures. - Jonathan M Davis
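A minimal sketch of such a wrapper (allocOrDie is an invented name; the attributes assume a druntime recent enough that OutOfMemoryError's constructor is callable from pure nothrow code). The point is only that escalating failure to an Error leaves a successful pointer as the sole observable outcome:

import core.exception : OutOfMemoryError;

// Programmer-asserted prototype: druntime's own malloc declaration is not
// marked pure, so it is redeclared here, as discussed earlier in the thread.
private extern (C) void* malloc(size_t size) pure nothrow @trusted;

// Allocation failure becomes an unrecoverable Error, mirroring GC behavior.
void* allocOrDie(size_t size) pure nothrow @trusted
{
    auto p = malloc(size);
    if (p is null)
        throw new OutOfMemoryError();   // Errors may be thrown from nothrow code
    return p;
}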
May 14 2014
prev sibling next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Wednesday, 14 May 2014 at 22:42:47 UTC, Brian Schott wrote:
 What is the plan for the "pure"-ity of memory management?

 Right now the "new" operator is considered to be pure even 
 though it is not, but related functions like malloc, 
 GC.addRange, GC.removeRange, and others are not.

 This prevents large portions of std.allocator from being pure. 
 This then prevents any containers library built on 
 std.allocator from being pure, which does the same for any 
 functions and data structures written using those containers.

 If malloc can never be considered pure, even when hidden behind 
 an allocator, why can it be considered pure when hidden behind 
 the GC?
Allocating memory through new and malloc should always be pure, I think, because failure either returns null in malloc's case, or throws an error in new's case. As for GC.addRange, GC.removeRange, etc... it's hard to say. These functions definitely alter the state of the GC, but it's all wrapped up in the GC's black box, guaranteed to never escape (I think? I don't know exactly how they're implemented). In the general case, as long as side-effects can be guaranteed to never affect anything outside of a function/class/struct, I think it can probably be pure (D's notion of purity, anyway).
May 14 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I think,
because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
 or throws an error in new's case.
A non-recoverable error. ^^^^^^^^^^^^^^^
May 14 2014
next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Thursday, 15 May 2014 at 00:50:06 UTC, Walter Bright wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be 
 pure, I think, because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
 or throws an error in new's case.
A non-recoverable error. ^^^^^^^^^^^^^^^
If we pretend that there's infinite memory, then malloc will never return null.
May 14 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/14/2014 5:56 PM, Meta wrote:
 On Thursday, 15 May 2014 at 00:50:06 UTC, Walter Bright wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I think,
because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
 or throws an error in new's case.
A non-recoverable error. ^^^^^^^^^^^^^^^
If we pretend that there's infinite memory, then malloc will never return null.
And what happens when malloc returns null?
May 14 2014
prev sibling next sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 15 May 2014 10:50, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I think,
 because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Even if it doesn't fail, malloc called with identical arguments still returns a different result every time. You can't factor malloc outside a loop and cache the result because it's being called repeatedly with the same argument like you're supposed to be able to do with any other pure function. You shouldn't be able to do that with gcalloc either... how can gcalloc be pure?
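A small self-contained illustration of that point (not from the post): every call must yield a distinct block, so treating the call as a cacheable strongly pure expression and hoisting it out of the loop would make every slot alias the same allocation.

import core.stdc.stdlib : malloc, free;

void main()
{
    void*[8] blocks;
    foreach (i; 0 .. blocks.length)
        blocks[i] = malloc(16);   // same argument, but a new block every time

    // If the call were cached like a strongly pure expression, this would fail.
    assert(blocks[0] !is blocks[1]);

    foreach (b; blocks)
        free(b);
}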
May 15 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 07:52:20 -0400, Manu via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On 15 May 2014 10:50, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I  
 think,
 because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Even if it doesn't fail, malloc called with identical arguments still returns a different result every time. You can't factor malloc outside a loop and cache the result because it's being called repeatedly with the same argument like you're supposed to be able to do with any other pure function. You shouldn't be able to do that with gcalloc either... how can gcalloc be pure?
That only applies to strong-pure functions. malloc would be weak-pure, since it returns a mutable pointer. -Steve
May 15 2014
parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 15 May 2014 22:30, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Thu, 15 May 2014 07:52:20 -0400, Manu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 15 May 2014 10:50, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I think,
 because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Even if it doesn't fail, malloc called with identical arguments still returns a different result every time. You can't factor malloc outside a loop and cache the result because it's being called repeatedly with the same argument like you're supposed to be able to do with any other pure function. You shouldn't be able to do that with gcalloc either... how can gcalloc be pure?
That only applies to strong-pure functions. malloc would be weak-pure, since it returns a mutable pointer.
Why should returning a mutable pointer imply weak purity?
May 15 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 09:40:03 -0400, Manu via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On 15 May 2014 22:30, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On Thu, 15 May 2014 07:52:20 -0400, Manu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 15 May 2014 10:50, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I  
 think,
 because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Even if it doesn't fail, malloc called with identical arguments still returns a different result every time. You can't factor malloc outside a loop and cache the result because it's being called repeatedly with the same argument like you're supposed to be able to do with any other pure function. You shouldn't be able to do that with gcalloc either... how can gcalloc be pure?
That only applies to strong-pure functions. malloc would be weak-pure, since it returns a mutable pointer.
Why should returning a mutable pointer imply weak purity?
Because if it were strong-pure, the compiler could optimize two sequential calls as returning the same exact thing. Imagine if a constructor to a mutable object always returned the same object because the constructor's parameters were immutable. It would be useless. -Steve
May 15 2014
parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 15 May 2014 23:47, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Thu, 15 May 2014 09:40:03 -0400, Manu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 15 May 2014 22:30, Steven Schveighoffer via Digitalmars-d

 <digitalmars-d puremagic.com> wrote:
 On Thu, 15 May 2014 07:52:20 -0400, Manu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 15 May 2014 10:50, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I
 think,
 because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Even if it doesn't fail, malloc called with identical arguments still returns a different result every time. You can't factor malloc outside a loop and cache the result because it's being called repeatedly with the same argument like you're supposed to be able to do with any other pure function. You shouldn't be able to do that with gcalloc either... how can gcalloc be pure?
That only applies to strong-pure functions. malloc would be weak-pure, since it returns a mutable pointer.
Why should returning a mutable pointer imply weak purity?
Because if it were strong-pure, the compiler could optimize two sequential calls as returning the same exact thing. Imagine if a constructor to a mutable object always returned the same object because the constructor's parameters were immutable. It would be useless. -Steve
I don't follow. Forget that it's a pointer... it's just a number. A strong pure function simply doesn't modify its inputs (which we can guarantee with const/immutable args), and returns the same output. It shouldn't matter if the output is mutable or not, it just has to be the same. Whether it's the number 10, or a pointer. I guess it logically follows that the output must be const or immutable, because it can only possibly be returning a pointer to, or to a member of, one of its args... and 'turtles all the way down'. But that leads me back to where I started... how can it possibly be that an allocator of any sort can be pure? If you can allocate a new pointer, you can return it as a const or immutable pointer if you like, and then the function looks awfully like a strong pure function even though it returns a new pointer every time. That's not pure at all.
May 15 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 12:01:35 -0400, Manu via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On 15 May 2014 23:47, Steven Schveighoffer via Digitalmars-d
 Because if it were strong-pure, the compiler could optimize two  
 sequential
 calls as returning the same exact thing. Imagine if a constructor to a
 mutable object always returned the same object because the constructor's
 parameters were immutable. It would be useless.
I don't follow. Forget that it's a pointer... it's just a number.
But that doesn't make sense. What you are returning is not the pointer value, but the pointed-at object. When you return an allocated object, you aren't returning the pointer, just a way to get to the object.
 A
 strong pure function simply doesn't modify it's inputs (which we can
 guarantee with const/immutable args), and returns the same output.
 It shouldn't matter if the output is mutable or not, it just has to be
 the same. Whether it's the number 10, or a pointer.
define "the same"? char[] foo(int x) pure {return to!(char[])(x);} would you argue that foo(5) always returns "5"? Or would you argue that it returns {length:1, ptr:0xee532000}? A memoization would mean it's always returning the same exact object, not the same value.
 I guess it logically follows that the output must be const or
 immutable, because it can only possibly be returning a pointer to, or
 to a member of, one of it's args... and 'turtles all the way down'.
You are thinking of uniqueness of the result. That does not require strong purity.
 But that leads me back to where I started... how can it possibly be
 that an allocator of any sort can be pure? If you can allocate a new
 pointer, you can return it as a const or immutable pointer if you
 like, and then the function looks awfully like a strong pure function
 even though it returns a new pointer every time. That's not pure at
 all.
Strong pure is not required for implicit casting, uniqueness is. You can establish uniqueness based on comparing the return types to the parameters. But a "strong pure" (and like Don says, this is really a bad description) as I understand it means pure in the functional purity sense -- all immutable parameters, all immutable returns, no global access. These can be optimized similarly to all functional languages (memoization of results, reordering or factoring of calls, dispatching calls to multiple threads, etc.). -Steve
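As a concrete sketch of the uniqueness point (repeatChar is an invented example, assuming a reasonably recent DMD): the result of this pure function is freshly allocated and cannot alias anything the caller already holds, so it converts implicitly to immutable even though its static type is mutable.

// Pure, and returns newly allocated, mutable data.
char[] repeatChar(char c, size_t n) pure nothrow @safe
{
    auto buf = new char[n];
    buf[] = c;
    return buf;
}

void main()
{
    // Accepted because the result is provably unique, not because the
    // function is pure in the FP sense.
    immutable char[] s = repeatChar('a', 3);
    assert(s == "aaa");
}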
May 15 2014
prev sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 15 May 2014 at 13:40:16 UTC, Manu via Digitalmars-d 
wrote:
 Why should returning a mutable pointer imply weak purity?
The argument here is that a valid mutable pointer returned from a pure function could always point to a new allocation, as new is allowed in pure code. More precisely, the only way to return a valid mutable pointer from a parameterless function is to make it point to a new allocation, so allowing memory allocations does not even preclude the inference of stronger guarantees here. David
May 15 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 14 May 2014 20:50:08 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I  
 think, because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Basically, you are saying that malloc must return the same block whenever it's called with the same parameters. This is simply nonsense. null is not special, it's just another block. I'm not saying malloc should be pure based on this, but the possibility that it returns null does not disqualify it. -Steve
May 15 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 6:02 AM, Steven Schveighoffer wrote:
 On Wed, 14 May 2014 20:50:08 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 5/14/2014 5:03 PM, Meta wrote:
 Allocating memory through new and malloc should always be pure, I
 think, because
 failure either returns null in malloc's case,
malloc cannot be pure if, with the same arguments, it returns null sometimes and not other times.
Basically, you are saying that malloc must return the same block whenever it's called with the same parameters. This is simply nonsense. null is not special, it's just another block. I'm not saying malloc should be pure based on this, but the possibility that it returns null does not disqualify it.
Null is special - it's a singularity. It can't be subsequently used for constructing a proper object. That makes it different even after we discount for comparing pointers. Andrei
May 15 2014
prev sibling parent reply Adam Sakareassen via Digitalmars-d <digitalmars-d puremagic.com> writes:
For my 2 cents,

The more D allows 'pure' functions to diverge from functional purity the 
less relevant the flag is for compiler optimisations.

I think purity should be defined by what the compiler is allowed to do 
to a function.  The results of all pure functions should be cacheable. 
If the compiler can prove that the arguments to a pure function are 
constant then the function can be run at compile time to get the result.

A more advanced compiler should be able to do a lot of optimisation 
based on the fact that a function is marked pure.  A compiler can often 
prove arguments are constant when a human can't show the same easily. 
Other optimisation parses often expose constants that may be arguments 
to pure functions.

For example, a foreach loop over a small array could be unwound to 
remove the overhead of branching.  The iterator then becomes a constant 
in each segment of the former loop.  If the iterator is the argument to 
a pure function, the entire function call could be eliminated and 
replaced with the result.

By the same reasoning cacheability is important.  A pure function might 
be called within a loop with a parameter that is not altered during the 
loop.  If the compiler knows the function is pure, it can perform the 
calculation before the loop and simply reuse the cached result for each 
iteration.

New and malloc are not really pure.  They return a different result each 
call, and the state of the PC is changed.   However, if the programmer 
knows his function behaves in a pure way, it would be nice if a 
programmer could tell the compiler to go ahead and optimise away the 
function call anyway.  Some kind of “trusted pure” tag would be required 
I guess.

A 'trusted pure' function would be free to allocate memory as it wishes, 
providing it is freed (or left for collection), and the pointer does not 
escape the function.  From the point of view of the calling software the 
function is functionally pure.

My point is that if we are going to allow relaxed purity in D, then the 
definition should be angled towards a performance advantage.  Naturally 
there are quite a number of other advantages of creating pure functions 
 besides performance.  If we define what the compiler is allowed to do to a 
 pure function we make reasoning about pure functions simple.  It even 
 opens the idea of 'trusted pure' functions that the compiler can't prove are 
 pure, because they may do something like call a C routine.  From a 
 compiler's point of view they could be considered pure because the 
programmer says the compiler is free to eliminate the function call if 
it knows the result.
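To illustrate the hoisting idea with a concrete sketch (expensive and sumLoop are invented names; whether a given compiler actually performs the transformation is implementation-dependent):

// Strongly pure: value parameter, no global state touched.
int expensive(int x) pure nothrow @safe
{
    int total = 0;
    foreach (i; 0 .. x)
        total += i * i;
    return total;
}

int sumLoop(int n, int k) @safe
{
    int sum = 0;
    foreach (i; 0 .. n)
        sum += expensive(k);   // k is loop-invariant, so this call may legally
                               // be evaluated once before the loop and reused
    return sum;
}

void main()
{
    assert(sumLoop(3, 4) == 3 * expensive(4));
}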


On 15/05/2014 8:42 AM, Brian Schott via Digitalmars-d wrote:
 What is the plan for the "pure"-ity of memory management?

 Right now the "new" operator is considered to be pure even though it is
 not, but related functions like malloc, GC.addRange, GC.removeRange, and
 others are not.

 This prevents large portions of std.allocator from being pure. This then
 prevents any containers library built on std.allocator from being pure,
 which does the same for any functions and data structures written using
 those containers.

 If malloc can never be considered pure, even when hidden behind an
 allocator, why can it be considered pure when hidden behind the GC?
May 14 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 02:49:28 UTC, Adam Sakareassen via 
Digitalmars-d wrote:
 The more D allows 'pure' functions to diverge from functional 
 purity the less relevant the flag is for compiler optimisations.
...
 By the same reasoning cacheability is important.  A pure 
 function might be called within a loop with a parameter that is 
 not altered during the loop.  If the compiler knows the 
 function is pure, it can perform the calculation before the 
 loop and simply reuse the cached result for each iteration.
Yep, purity implies memoing. Malloc and new are not pure because they return objects that can be differentiated by address. Malloc and new are random generators in that sense. So to make them pure you will have to put a ban on taking address, comparing address etc on the objects... However mmap to a fixed address is pure if it throws an exception on failure because the exception bypass the call site of the pure function (pure functions should not catch side effect exceptions). Returning null does not make a function impure. Pure functions may return bottom (like division by zero). Pure in D seems pointless to me.
May 14 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 05:51:16 UTC, Ola Fosheim Grøstad 
wrote:
 However mmap to a fixed address is pure if it throws an 
 exception on failure because the exception bypass the call site 
 of the pure function (pure functions should not catch side 
 effect exceptions).
This of course presumes a transactional view, that the transaction is the context of purity and that flaws in purity unwinds all changes and thus abort the transaction (the try block). If the result of a series of pure function calls can be used to flip a coin within a transaction, then those functions cannot be considered pure in any meaningful sense.
May 14 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ola Fosheim Grøstad:

 If the result of a series of pure function calls can be used to 
 flip a coin within a transaction, then those functions cannot 
 be considered pure in any meaningful sense.
A little example of D purity (this compiles):

bool randomBit() pure nothrow @safe {
    return (new int[1].ptr) > (new int[1].ptr);
}
void main() {}

Bye,
bearophile
May 14 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
     return (new int[1].ptr) > (new int[1].ptr);
 }
Yes, and then you may as well allow a random generator with private globals. Because memoing is no longer sound anyway.
May 14 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 02:50:05 -0400, Ola Fosheim Grøstad

<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
     return (new int[1].ptr) > (new int[1].ptr);
 }
 Yes, and then you may as well allow a random generator with private
 globals. Because memoing is no longer sound anyway.
This has nothing to do with allocators being pure. They must be pure as a matter of practicality. The issue that allows the above anomaly is using the address of something. There is a reason functional languages do not allow these types of things. But in functional languages, allocating is allowed. To be honest, code that would exploit such an anomaly is only ever used in "proof" exercises, and never in real code. I don't think it's an issue. -Steve
May 15 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 12:57:43 UTC, Steven Schveighoffer 
wrote:
 To be honest, code that would exploit such an anomaly is only 
 ever used in "proof" exercises, and never in real code. I don't 
 think it's an issue.
That's the wrong attitude to take when it comes to the compiler and runtime. If it verifies, it should verify. Not mostly, but fully, otherwise "weakly pure" becomes a programmer guarantee. Which is ok too, but either the one or the other. How can you prove that such issues don't arise in real code?
May 15 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 09:24:54 -0400, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 12:57:43 UTC, Steven Schveighoffer wrote:
 To be honest, code that would exploit such an anomaly is only ever used 
 in "proof" exercises, and never in real code. I don't think it's an 
 issue.
 That's the wrong attitude to take when it comes to the compiler and 
 runtime.
There are always ways around the guarantees. This one isn't as detectable, since there is no "red-flag" cast or attribute. But I see no utility in such code.
 If it verifies, it should verify. Not mostly, but fully, otherwise 
 "weakly pure" becomes a programmer guarantee. Which is ok too, but 
 either the one or the other.
Memory allocation is an exception to purity that is widely considered OK, because of the practical need for allocating memory from a heap to do work. In general, it's the block of data that is important, not its address. Where it comes from is of no consequence.
 How can you prove that such issues don't arise in real code?
Of course I can't prove it doesn't arise, but I can't think of any use cases to do something significant based on the address of heap data in relation to another unrelated heap block. Perhaps you can show a use case for it? The other thing to think about is that such code won't serve the purpose for which it was written! randomBit won't be random because it will likely be memoized. In any case, the alternative is to have D pure functions avoid using pointers. It's as drastic as that. -Steve
May 15 2014
next sibling parent reply "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 15 May 2014 at 13:42:58 UTC, Steven Schveighoffer 
wrote:
 On Thu, 15 May 2014 09:24:54 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:
 That's the wrong attitude to take when it comes to the 
 compiler and runtime.
There are always ways around the guarantees. This one isn't as detectable, since there is no "red-flag" cast or attribute. But I see no utility in such code.
I have to agree with Ola here. If you write a piece of pure, safe code, there should be no way for compiler optimizations to make your code behave differently. This is not only because implicitly allowing unsafe code would be against the design philosophy behind D, but also as attribute inference might tag functions as pure without the programmer explicitly specifying so.
 In any case, the alternative is to have D pure functions avoid 
 using pointers. It's as drastic as that.
I'd suspect that it is enough to disallow using the content of pointers explicitly, i.e. as a sequence of bytes instead of just a handle to an object. David
May 15 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 10:42:00 -0400, David Nadlinger <code klickverbot.at>
wrote:

 On Thursday, 15 May 2014 at 13:42:58 UTC, Steven Schveighoffer wrote:
 On Thu, 15 May 2014 09:24:54 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:
 That's the wrong attitude to take when it comes to the compiler and 
 runtime.
 There are always ways around the guarantees. This one isn't as 
 detectable, since there is no "red-flag" cast or attribute. But I see 
 no utility in such code.
 I have to agree with Ola here. If you write a piece of pure, safe code, 
 there should be no way for compiler optimizations to make your code 
 behave differently.
In general, I would say any code that performs differently after optimization is not preferable. But in this case, you have ignored the rules, and the result is not exactly incorrect or unsafe. In fact, you can't really argue that it's invalid (randomBit could legitimately always return true or false, even in a non-optimized program). The question here is, can we come up with a static rule that is effective, but not cripplingly prohibitive?
 This is not only because implicitly allowing unsafe code would be 
 against the design philosophy behind D, but also as attribute inference 
 might tag functions as pure without the programmer explicitly specifying 
 so.
So far, I haven't seen code that's unsafe only if optimized. Have an example?
 In any case, the alternative is to have D pure functions avoid using 
 pointers. It's as drastic as that.
 I'd suspect that it is enough to disallow using the content of pointers 
 explicitly, i.e. as a sequence of bytes instead of just a handle to an 
 object.
This means format("%x", ptr) isn't allowed to be pure? What about calculating index offsets? Note that pointer math between two pointers on the same block of memory is perfectly legitimate. I would expect that someone could be able to write a type equivalent to a slice, and it should be allowed to be pure. -Steve
May 15 2014
parent reply "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 15 May 2014 at 15:09:32 UTC, Steven Schveighoffer 
wrote:
 But in this case, you have ignored the rules, […]
Which rules exactly? My point is mainly that this area of the language is underspecified.
 This means format("%x", ptr) isn't allowed to be pure?
The short answer would be: Yes. The alternatives seem to be:

- Disallowing memory allocations in pure code (not workable)
- Bidding farewell to the idea that pure + no mutable indirections means FP-sense purity.
 What about calculating index offsets? Note that pointer math 
 between two pointers on the same block of memory is perfectly 
 legitimate.
Taking offsets within the same block are fine. Even in the light of GC allocations being pure, this doesn't lead to any non-determinism, as there isn't any mechanism for the relative offset between whatever you are considering to change without any outside input.
 I would expect that someone could be able to write a type 
 equivalent to a slice, and it should be allowed to be pure.
Yes. David
May 15 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 18:30:55 -0400, David Nadlinger <code klickverbot.at>
wrote:

 On Thursday, 15 May 2014 at 15:09:32 UTC, Steven Schveighoffer wrote:
 But in this case, you have ignored the rules, […]
 Which rules exactly? My point is mainly that this area of the language 
 is underspecified.
The rules that we have been discussing: you cannot make significant decisions based on the addresses of 2 unrelated pointers.
 This means format("%x", ptr) isn't allowed to be pure?
 The short answer would be: Yes. The alternatives seem to be:
 - Disallowing memory allocations in pure code (not workable)
 - Bidding farewell to the idea that pure + no mutable indirections 
 means FP-sense purity.
There is a 3rd option:
- FP purity, working fine as long as you don't do stupid things based on addresses. And even possibly working fine if you do stupid things based on addresses.
In other words, still assume FP purity, and if you expect randomness based on heap addresses to work, it just won't. Keep in mind, nothing discussed so far can cause invalid results. It just will be less random than expected. I really don't see a problem with this.
You are changing non-deterministic behavior to different, but still non-deterministic behavior. How do you even measure the impact?
 What about calculating index offsets? Note that pointer math between 
 two pointers on the same block of memory is perfectly legitimate.
 Taking offsets within the same block are fine. Even in the light of GC 
 allocations being pure, this doesn't lead to any non-determinism, as 
 there isn't any mechanism for the relative offset between whatever you 
 are considering to change without any outside input.
I think it is impossible to statically prevent address comparison between 2 unrelated pointers, but statically allow address comparison between 2 related ones. The compiler does not have the information to determine whether 2 pointers are related at compile time. In fact, it may do BOTH!

bool foo(int *x, int *y) pure
{
   return x < y;
}

bool bar() pure
{
   if(foo(new int, new int)) // bad
   {
      int[10] x;
      return foo(x.ptr, x.ptr + 5); // ok
   }
   return false;
}

In the interest of nailing down what the rules SHOULD be (and I agree they need to be in the spec), here is a strawman to discuss:

You cannot do comparisons between two unrelated heap pointers that were allocated from the heap inside the same pure call. This only applies to the pointer values themselves, not the data pointed at (as long as that data is not a similar pointer).

I will define the "same pure call" as the function call into a pure function from a non-pure function.

Note, this is not a rule to implement as a static enforcement, it's one the programmer must not do. Like 'undefined behavior'. I don't think we can enforce it, as I mentioned above.

A lint tool may be able to catch it, and we can also possibly catch it at runtime, as long as the allocator can determine whether you're inside a pure function.

-Steve
May 16 2014
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
David Nadlinger:

 I'd suspect that it is enough to disallow using the content of 
 pointers explicitly, i.e. as a sequence of bytes instead of 
 just a handle to an object.
Yes, if you allow only a referentially pure view of pointers, then you have a strong purity. Bye, bearophile
May 15 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 13:42:58 UTC, Steven Schveighoffer 
wrote:
 In any case, the alternative is to have D pure functions avoid 
 using pointers. It's as drastic as that.
You need:

1. references with value semantics.
2. "non pure exceptions" that cannot be caught within a pure chain, but can be caught outside

Besides that I think programmer provided guarantees for referential transparency are valuable for C/ASM and for non-pure to pure conversion using the exception in point 2.
May 15 2014
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 15 May 2014 at 12:57:43 UTC, Steven Schveighoffer 
wrote:
 On Thu, 15 May 2014 02:50:05 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
    return (new int[1].ptr) > (new int[1].ptr);
 }
Yes, and then you may as well allow a random generator with private globals. Because memoing is no longer sound anyway.
This has nothing to do with allocators being pure. They must be pure as a matter of practicality. The issue that allows the above anomaly is using the address of something. There is a reason functional languages do not allow these types of things. But in functional languages, allocating is allowed.
Which is what makes D pure lint-like helper and not effective optimization enabler. This part of type system was under-designed initially, it should have been much more restrictive. I am ok with current state though, it is just sad that it creates lot of false expectations from those familiar with pure functional languages.
 To be honest, code that would exploit such an anomaly is only 
 ever used in "proof" exercises, and never in real code. I don't 
 think it's an issue.
This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern.
May 15 2014
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 09:28:57 -0400, Dicebot <public dicebot.lv> wrote:

 On Thursday, 15 May 2014 at 12:57:43 UTC, Steven Schveighoffer wrote:
 On Thu, 15 May 2014 02:50:05 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
    return (new int[1].ptr) > (new int[1].ptr);
 }
 Yes, and then you may as well allow a random generator with private 
 globals. Because memoing is no longer sound anyway.
 This has nothing to do with allocators being pure. They must be pure 
 as a matter of practicality.

 The issue that allows the above anomaly is using the address of 
 something. There is a reason functional languages do not allow these 
 types of things. But in functional languages, allocating is allowed.
 Which is what makes D pure lint-like helper and not effective 
 optimization enabler. This part of type system was under-designed 
 initially, it should have been much more restrictive.
It's easy to say this, but the restrictions to apply would be draconian. I think it would be nearly impossible to prevent such abuses in a language with explicit references.
 To be honest, code that would exploit such an anomaly is only ever used 
 in "proof" exercises, and never in real code. I don't think it's an 
 issue.
 This is not true. Because of such code you can't ever automatically 
 memoize strongly pure function results by compiler. A very practical 
 concern.
I think you can still memoize these cases. There is no reason for the language to support a pure RNG. -Steve
May 15 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever automatically
 memoize strongly pure function results by compiler. A very practical
 concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
May 15 2014
next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 08:43:11 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever automatically
 memoize strongly pure function results by compiler. A very practical
 concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should be perfectly memoizable. - Jonathan M Davis
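For value-returning functions, caching can also be done in library code today; here is a small sketch using Phobos' std.functional.memoize (fib is an invented example), which complements whatever the compiler itself chooses to memoize:

import std.functional : memoize;

// Strongly pure and returns a value type, so caching results is safe.
ulong fib(uint n) pure nothrow @safe
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

void main()
{
    alias cachedFib = memoize!fib;
    assert(cachedFib(30) == 832_040);
    assert(cachedFib(30) == 832_040);   // second call is answered from the cache
}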
May 17 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d
wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:
 
 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by compiler.
 A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should but perfectly memoizable.
[...]

bool func(int x) /* pure? */ {
    int[] a, b;
    a = new int[x];
    b = new int[x];
    return (a.ptr < b.ptr);
}

Where do you draw the line?

T

--
Tech-savvy: euphemism for nerdy.
May 18 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via  
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by compiler.
 A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
Memoizing reference returns that are immutable should be fine.
 However, that should have no effect on pure functions that return
 value types - even if the function took pointers or references as
 arguments or allocated memory internally. They should be perfectly
 memoizable.
 [...]

 bool func(int x) /* pure? */ {
     int[] a, b;
     a = new int[x];
     b = new int[x];
     return (a.ptr < b.ptr);
 }

 Where do you draw the line?
Definitely this is fine being pure, and can be memoized. It just won't do what you thought it would. So? :) This is what Andrei meant as being "penalized." -Steve
May 19 2014
parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 09:42:31 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by
 compiler. A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
Memoizing reference returns that are immutable should be fine.
Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider that the behaviors changed due to the fact that the objects are equal but not the same object (which can matter for reference types) is something that matters. It has less of an impact when you're dealing with immutable objects, because changing the value of one won't change the value of another, but it can still change the behavior of the program due to the fact that they're not actually the same object. So, I'd be inclined to argue that no functions which return memory should be memoizable. And given that the compiler can only memoize functions within a single expression (or maybe statement - I can't remember which) - I don't think that that restriction even costs us much. - Jonathan M Davis
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 12:35:26 -0400, Jonathan M Davis via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Mon, 19 May 2014 09:42:31 -0400
 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by
 compiler. A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
Memoizing reference returns that are immutable should be fine.
Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider that the behaviors changed due to the fact that the objects are equal but not the same object (which can matter for reference types)
It shouldn't matter. Something that returns immutable references, can return that same thing again if asked the same way. Nobody should be looking at the address in any meaningful way.
 is something that matters. It has less of an impact when
 you're dealing with immutable objects, because changing the value of one  
 won't
 change the value of another, but it can still change the behavior of the
 program due to the fact that they're not actually the same object.
Such a program is incorrectly written.
 And given that the compiler can only memoize functions within a single
 expression (or maybe statement - I can't remember which) - I don't think  
 that
 that restriction even costs us much.
It can make a huge difference, and it doesn't have to be memoized within the same expression, it could be memoized globally with a hashtable, or within the same function. -Steve
May 19 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer 
wrote:
 It shouldn't matter. Something that returns immutable 
 references, can return that same thing again if asked the same 
 way. Nobody should be looking at the address in any meaningful 
 way.
I think this is at odds with generic programming. What you are saying is that if you plug a pure function into an algorithm then you have to test for "pure" in the algorithm if it is affected by object identity. Otherwise, goodbye plug-n-play.
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer wrote:
 It shouldn't matter. Something that returns immutable references, can
 return that same thing again if asked the same way. Nobody should be
 looking at the address in any meaningful way.
 I think this is at odds with generic programming. What you are saying
 is that if you plug a pure function into an algorithm then you have to
 test for "pure" in the algorithm if it is affected by object identity.
 Otherwise, goodbye plug-n-play.
I think I misstated this, of course, looking at the address for certain
reasons is OK, Object identity being one of them.

But some of the tricks being detailed here as "proof" that we cannot
memoize are not really valid code.

Returning the same immutable object, when called with the same immutable
parameters, should never cause a break in code, pure or not.

-Steve
May 19 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer 
wrote:
 Returning the same immutable object, when called with the same 
 immutable parameters, should never cause a break in code, pure 
 or not.
This isn't at all obvious to me. Also I think the "coin flip trick" represents a class of algorithms that depend on imposing a sort order on objects. I can easily see an algorithm break if you sometimes receive the same object and sometimes receive a new object with the same values as parameters. That's how an optimizer works, sometimes it optimizes, sometimes not. If your algorithm depends on running something twice and then comparing the two results then it could easily break. In order to prevent this you would need to "box it as a value" and let the type system forbid object identity comparison.
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 13:44:59 -0400, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer wrote:
 Returning the same immutable object, when called with the same
 immutable parameters, should never cause a break in code, pure or not.
 This isn't at all obvious to me. Also I think the "coin flip trick"
 represents a class of algorithms that depend on imposing a sort order
 on objects.
Anything that uses the order of unrelated addresses is incorrect outside
of the heap code itself.

 I can easily see an algorithm break if you sometimes receive the same
 object and sometimes receive a new object with the same values as
 parameters.
Then you should have no problem producing an example, right?

 That's how an optimizer works, sometimes it optimizes, sometimes not. If
 your algorithm depends on running something twice and then comparing the
 two results then it could easily break.
The whole POINT of pure functions is that it will return the same thing.
The fact that it lives in a different piece of memory or not is not
important. We have to accept that. Any code that DEPENDS on that being in
a different address is broken.

For instance, I would consider it fully possible and reasonable, and to
only break already-broken code (except for testing implementation, which
would have to change anyway), to make idup just return the argument if the
argument is immutable.

-Steve
May 19 2014
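A minimal sketch of the shortcut being described, with a made-up name (maybeIdup is not a Phobos function): the copy is skipped only when the input is already immutable, so any caller that compares addresses would observe the reuse.

immutable(char)[] maybeIdup(const(char)[] s) pure
{
    return s.idup;   // mutable or const input: a fresh immutable copy is made
}

immutable(char)[] maybeIdup(immutable(char)[] s) pure
{
    return s;        // already immutable: hand the very same array back
}

unittest
{
    string a = "hello";
    assert(maybeIdup(a) is a);              // reused, no allocation

    char[] buf = "hello".dup;
    assert(maybeIdup(buf).ptr !is buf.ptr); // mutable input still gets copied
}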
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 18:33:55 UTC, Steven Schveighoffer 
wrote:
 Anything that uses the order of unrelated addresses is 
 incorrect outside of the heap code itself.
Nah, not on modern architectures without GC compaction.
 Then you should have no problem producing an example, right?
I did. Besides, I think language constructs should be proven sound a priori, not post mortem...
 The whole POINT of pure functions is that it will return the 
 same thing. The fact that it lives in a different piece of 
 memory or not is not important. We have to accept that. Any 
 code that DEPENDS on that being in a different address is 
 broken.
Which means you cannot safely plug a pure function into a generic algorithm unless it tests for purity.
 For instance, I would consider it fully possible and 
 reasonable, and to only break already-broken code (except for 
 testing implementation, which would have to change anyway), to 
 make idup just return the argument if the argument is immutable.
That could easily break sound code that needs guards with different identities.
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 14:49:41 -0400, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 18:33:55 UTC, Steven Schveighoffer wrote:
 Then you should have no problem producing an example, right?
 I did. Besides, I think language constructs should be proven sound a
 priori, not post mortem...
Let me cut to the chase here. I haven't seen it. Let's not go any further
until you can produce it.

-Steve
May 19 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 18:55:57 UTC, Steven Schveighoffer 
wrote:
 On Mon, 19 May 2014 14:49:41 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 18:33:55 UTC, Steven Schveighoffer 
 wrote:
 Then you should have no problem producing an example, right?
I did. Besides, I think language constructs should be proven sound a priori, not post mortem...
Let me cut to the chase here. I haven't seen it. Let's not go any further until you can produce it.
I've given several examples, but I oppose the general attitude. Language constructs should be formally proven correct. Proving a program to be correct is usually not worth it, but for individual language constructs it is indeed possible and necessary; optimization depends on it. This is also why you should strive for a small set of primitives and build high level constructs by lowering them to the minimal language. You can then limit your proof to the primitives.

The basic idea in generic programming is that you can implement the full blown generic algorithm, plug in the parts that can vary and let the optimizing compiler boil it down into an optimized tight machine language construct. The programmer should not be required to "deduce" what will happen when he plugs stuff into the generic algorithm.

If optimization changes semantics you will risk efficient algorithm implementations either going wrong or into an infinite loop. Not at all suitable for a programming language that aims for system level efficiency and generic programming.
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 16:56:58 -0400, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 18:55:57 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 14:49:41 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 18:33:55 UTC, Steven Schveighoffer wrote:
 Then you should have no problem producing an example, right?
 I did. Besides, I think language constructs should be proven sound a
 priori, not post mortem...
 Let me cut to the chase here. I haven't seen it. Let's not go any
 further until you can produce it.
 I've given several examples
Links?

 , but I oppose the general attitude. Language constructs should be
 formally proven correct. Proving a program to be correct is usually not
 worth it, but for individual language constructs it is indeed possible
 and necessary; optimization depends on it.
I think it is correct. How is it not? Proving correct is very difficult,
proving incorrect is not difficult. Just a counter example is needed. So
far, the examples have not disproven the point.

 If optimization changes semantics you will risk efficient algorithm
 implementations either going wrong or into an infinite loop. Not at all
 suitable for a programming language that aims for system level
 efficiency and generic programming.
It's not incorrect, just a bug to depend on an allocator allocating two
different addresses, or allocating in a certain order.

In other words, inside a pure function, an allocator of an immutable
object is to be treated as if it may or may not choose to return the same
object. That's just what you should expect. Expecting something else is a
BUG.

-Steve
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 17:14:44 -0400, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

 On Mon, 19 May 2014 16:56:58 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:
 I've given several examples
 Links?
BTW, you probably shouldn't expect any response from me, I'm about to go
into travel-prep mode, and I likely will not be checking forums. If you
are at dconf, perhaps we can continue the discussion in person :)

-Steve
May 19 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 21:16:19 UTC, Steven Schveighoffer 
wrote:
 BTW, you probably shouldn't expect any response from me, I'm 
 about to go into travel-prep mode, and I likely will not be 
 checking forums. If you are at dconf, perhaps we can continue 
 the discussion in person :)
I am not going to be there, but I guess this topic could easily fit on the back side of a paper napkin (or a dozen). Have a nice trip!
May 19 2014
prev sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 14:33:55 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 The whole POINT of pure functions is that it will return the same
 thing. The fact that it lives in a different piece of memory or not
 is not important. We have to accept that. Any code that DEPENDS on
 that being in a different address is broken.
Except that if you're talking about reference types, and the references point to two different objects, then they're _equal_ rather than being the same object. That's the whole point of the is operator - to check whether two objects are in fact the same object. I agree that relying on things like whether one address in memory is larger or smaller than another address in unrelated memory is just plain broken, but it's perfectly legitimate for code to depend on whether a particular object is the same object as another rather than equal. And memoizing the result of a function - be it pure or otherwise - will therefore have an effect on the semantics of perfectly valid programs.

And honestly, even if equality were all that mattered for memoization, the fact that the compiler only does it on a very, very restrictive basis (since it never saves the result, only optimizes out extraneous calls) makes it so that memoization is kind of useless from the standpoint of the compiler. It's only really useful when the programmer does it, and they can decide whether it's okay to memoize a function based on the requirements of their program (e.g. whether reference equality matters or just object equality).

- Jonathan M Davis
May 19 2014
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer 
wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer 
 wrote:
 It shouldn't matter. Something that returns immutable 
 references, can return that same thing again if asked the 
 same way. Nobody should be looking at the address in any 
 meaningful way.
I think this is at odds with generic programming. What you are saying is that if you plug a pure function into an algorithm then you have to test for "pure" in the algorithm if it is affected by object identity. Otherwise, goodbye plug-n-play.
I think I misstated this, of course, looking at the address for certain reasons is OK, Object identity being one of them.
immutable(Object*) alloc() pure
{
    return new Object();
}

bool oops() pure
{
    auto a = alloc();
    auto b = alloc();
    return a is b;
}

This is a snippet that will always return `true` if memoization is at work and `false` if strongly pure function will get actually called twice. If changing result of your program because of silently enabled compiler optimization does not indicate a broken compiler I don't know what does.
May 19 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 15:03:55 -0400, Dicebot <public dicebot.lv> wrote:

 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer wrote:
 It shouldn't matter. Something that returns immutable references, can
 return that same thing again if asked the same way. Nobody should be
 looking at the address in any meaningful way.
 I think this is at odds with generic programming. What you are saying
 is that if you plug a pure function into an algorithm then you have to
 test for "pure" in the algorithm if it is affected by object identity.
 Otherwise, goodbye plug-n-play.
 I think I misstated this, of course, looking at the address for certain
 reasons is OK, Object identity being one of them.
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is at
 work and `false` if strongly pure function will get actually called
 twice. If changing result of your program because of silently enabled
 compiler optimization does not indicate a broken compiler I don't know
 what does.
The code is incorrectly implemented, let me fix it:

bool oops() pure
{
    return false;
}

-Steve
May 19 2014
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 19 May 2014 at 19:20:46 UTC, Steven Schveighoffer 
wrote:
 On Mon, 19 May 2014 15:03:55 -0400, Dicebot <public dicebot.lv> 
 wrote:

 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer 
 wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer 
 wrote:
 It shouldn't matter. Something that returns immutable 
 references, can return that same thing again if asked the 
 same way. Nobody should be looking at the address in any 
 meaningful way.
I think this is at odds with generic programming. What you are saying is that if you plug a pure function into an algorithm then you have to test for "pure" in the algorithm if it is affected by object identity. Otherwise, goodbye plug-n-play.
I think I misstated this, of course, looking at the address for certain reasons is OK, Object identity being one of them.
immutable(Object*) alloc() pure
{
    return new Object();
}

bool oops() pure
{
    auto a = alloc();
    auto b = alloc();
    return a is b;
}

This is a snippet that will always return `true` if memoization is at work and `false` if strongly pure function will get actually called twice. If changing result of your program because of silently enabled compiler optimization does not indicate a broken compiler I don't know what does.
The code is incorrectly implemented, let me fix it:

bool oops() pure
{
    return false;
}

-Steve
Oh, and are probably eager to show me links to specs which indicate what part of my snippet breaks the type system? Is it allocation that is forbidden in reasonable code? Or object identity?

Please stop this "write proper code" absurdism. This code is safe, doesn't use casts or pointer forging or any other dirty low level tricks. If compiler can't reject it as invalid and goes into funny mode instead, compiler is fucked. There can be no excuse for it.

Your statement is essentially no different from "yeah this language supports addition but please never add 3 and 6, it is incorrect and will always return random results". Language spec is demanded for a reason.
May 19 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 16:23:34 -0400, Dicebot <public dicebot.lv> wrote:

 On Monday, 19 May 2014 at 19:20:46 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 15:03:55 -0400, Dicebot <public dicebot.lv> wrote:
 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer wrote:
 It shouldn't matter. Something that returns immutable references,
 can return that same thing again if asked the same way. Nobody
 should be looking at the address in any meaningful way.
 I think this is at odds with generic programming. What you are
 saying is that if you plug a pure function into an algorithm then
 you have to test for "pure" in the algorithm if it is affected by
 object identity. Otherwise, goodbye plug-n-play.
 I think I misstated this, of course, looking at the address for
 certain reasons is OK, Object identity being one of them.
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is at
 work and `false` if strongly pure function will get actually called
 twice. If changing result of your program because of silently enabled
 compiler optimization does not indicate a broken compiler I don't know
 what does.
 The code is incorrectly implemented, let me fix it:

 bool oops() pure
 {
     return false;
 }

 -Steve
 Oh, and are probably eager to show me links to specs which indicate
 what part of my snippet breaks the type system? Is it allocation that is
 forbidden in reasonable code? Or object identity?
None is forbidden, and the combination above is a BUG. Bugs happen,
compilers actually compile them. pure != bug-free.

 Please stop this "write proper code" absurdism. This code is safe,
 doesn't use casts or pointer forging or any other dirty low level
 tricks. If compiler can't reject it as invalid and goes into funny mode
 instead, compiler is fucked. There can be no excuse for it.
The code above relies on implementation details of the allocator to do
its work. It's invalid to do so.

Please show me the code that goes into "funny land" because of an
incorrect result of oops. I suppose it looks like this:

if(oops())
{
   system("rm -rf /");
}

The example is meaningless, because it uses a ridiculous method to
calculate false. Honestly, the above code COULD be written like this:

immutable(Object) alloc() pure
{
   static immutable o = new immutable(Object);
   return o;
}

And be just as valid (and more efficient).

-Steve
May 19 2014
parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 19 May 2014 at 20:51:22 UTC, Steven Schveighoffer 
wrote:
 On Mon, 19 May 2014 16:23:34 -0400, Dicebot <public dicebot.lv> 
 wrote:

 On Monday, 19 May 2014 at 19:20:46 UTC, Steven Schveighoffer 
 wrote:
 On Mon, 19 May 2014 15:03:55 -0400, Dicebot 
 <public dicebot.lv> wrote:

 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer 
 wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad 
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven 
 Schveighoffer wrote:
 It shouldn't matter. Something that returns immutable 
 references, can return that same thing again if asked the 
 same way. Nobody should be looking at the address in any 
 meaningful way.
I think this is at odds with generic programming. What you are saying is that if you plug a pure function into an algorithm then you have to test for "pure" in the algorithm if it is affected by object identity. Otherwise, goodbye plug-n-play.
I think I misstated this, of course, looking at the address for certain reasons is OK, Object identity being one of them.
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is at
 work and `false` if strongly pure function will get actually called
 twice. If changing result of your program because of silently enabled
 compiler optimization does not indicate a broken compiler I don't know
 what does.
 The code is incorrectly implemented, let me fix it:

 bool oops() pure
 {
     return false;
 }

 -Steve
Oh, and are probably eager to show me links to specs which indicate what part of my snippet breaks the type system? Is it allocation that is forbidden in reasonable code? Or object identity?
None is forbidden, and the combination above is a BUG. Bugs happen, compilers actually compile them. pure != bug-free.
No it is not. It is semantically valid code which does exactly what it was expected to do. Unless compiler optimization happens which will actually introduce a bug silently. It is optimization that is broken, not code itself. And this is not some sort of imaginary code. `alloc` implementation may be located in some other static library and not available to compiler. It is likely to be not a plain `alloc` in real code but some utility function that creates and returns object internally. In the end result is the same. You can't allow even object identity operator if memoization is to happen reliably.
 Please stop this "write proper code" absurdism. This code is 
 safe, doesn't use casts or pointer forging or any other dirty 
 low level tricks. If compiler can't reject it as invalid and 
 goes into funny mode instead, compiler is fucked. There can be 
 no excuse for it.
The code above relies on implementation details of the allocator to do it's work. It's invalid to do so.
Wrong again. It does not rely upon anything. It simply checks if two objects returned by two function calls are identical. With zero assumptions about it.
 Please show me the code that goes into "funny land" because of 
 an incorrect result of oops.
What the hell are you speaking about? Getting two different results for a function depending on -O flag is not weird enough for you?

- Hey, this program produces a wrong output!
- But it doesn't wipe your system. You will be fine.

At this point it is hard to believe you are serious and not trolling.
May 19 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 18:46:22 -0400, Dicebot <public dicebot.lv> wrote:

 On Monday, 19 May 2014 at 20:51:22 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 16:23:34 -0400, Dicebot <public dicebot.lv> wrote:
 Oh, and are probably eager to show me links to specs which indicate  
 what part of my snippet breaks the type system? Is it allocation that  
 is forbidden in reasonable code? Or object identity?
None is forbidden, and the combination above is a BUG. Bugs happen, compilers actually compile them. pure != bug-free.
No it is not. It is semantically valid code which does exactly what it was expected to do. Unless compiler optimization happens which will actually introduce a bug silently. It is optimization that is broken, not code itself.
Again, the bug is to break if a function that allocates an immutable object for some reason re-uses the immutable object (perfectly legitimate optimization, whether done by the compiler or intentionally). I still have not seen the code that somehow can't handle this.
 And this is not some sort of imaginary code. `alloc` implementation may  
 be located in some other static library and not available to compiler.  
 It is likely to be not a plain `alloc` in real code but some utility  
 function that creates and returns object internally.
alloc isn't the problem. oops is incorrectly implemented to depend on implementation details of the allocator.
 Please stop this "write proper code" absurdism. This code is safe,  
 doesn't use casts or pointer forging or any other dirty low level  
 tricks. If compiler can't reject it as invalid and goes into funny  
 mode instead, compiler is fucked. There can be no excuse for it.
The code above relies on implementation details of the allocator to do it's work. It's invalid to do so.
Wrong again. It does not rely upon anything. It simply checks if two objects returned by two functions calls are identical. With zero assumptions about it.
Then why is oops returning true a failure case?
 Please show me the code that goes into "funny land" because of an  
 incorrect result of oops.
What the hell are you speaking about? Getting two different results for a function depending on -O flag is not weird enough for you?
It's not unheard of. When you violate language assumptions, it happens. I've seen code that breaks when -O is passed, vs. when it's not.
 - Hey, this program produces a wrong output!
 - But it doesn't wipe your system. You will be fine.
This is a superficial reading of my statement ;)
 At this point it is hard to believe you are serious and not trolling.
Yeah, it was kind of trolling, but I keep asking questions and have been given the same irrelevant answers. Sorry. -Steve
May 20 2014
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 15:20:46 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On Mon, 19 May 2014 15:03:55 -0400, Dicebot <public dicebot.lv> wrote:

 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer
 wrote:
 It shouldn't matter. Something that returns immutable
 references, can return that same thing again if asked the same
 way. Nobody should be looking at the address in any meaningful
 way.
I think this is at odds with generic programming. What you are saying is that if you plug a pure function into an algorithm then you have to test for "pure" in the algorithm if it is affected by object identity. Otherwise, goodbye plug-n-play.
I think I misstated this, of course, looking at the address for certain reasons is OK, Object identity being one of them.
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is at
 work and `false` if strongly pure function will get actually called
 twice. If changing result of your program because of silently enabled
 compiler optimization does not indicate a broken compiler I don't know
 what does.
 The code is incorrectly implemented, let me fix it:

 bool oops() pure
 {
     return false;
 }
Sure, that particular example may be contrived, but there's plenty of code out there that checks for whether two objects are the same object. A classic example would be for skipping equality checks if the two objects are the same object, but it could be used in far more complex situations that don't involve doing equality checks when two objects aren't the same.

The reality of the matter is that regardless of whether you think that checking two objects to see whether they're the same object is a valid operation or not, it's perfectly legal to do so and _will_ happen in code, so if the compiler makes optimizations which will change whether two objects are the same object (e.g. optimizing out a second call to a pure function within an expression so that the result is reused in spite of the fact that the function returns a new object each time it's called), then the result is that the semantics of the program have changed due to an optimization. And that is most definitely a bad thing.

IMHO, if the compiler cannot guarantee that a pure function returns the same object every time if it returns a reference type, then the compiler should never optimize out multiple calls to that function. To do otherwise would make it so that the semantics of the program change due to an optimization.

- Jonathan M Davis
May 19 2014
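A small sketch of the identity shortcut mentioned above, using a made-up Point class: equality starts with an `is` check, so whether two references denote the same object is observable even when the values compare equal.

class Point
{
    int x, y;

    this(int x, int y) pure
    {
        this.x = x;
        this.y = y;
    }

    override bool opEquals(Object o)
    {
        if (this is o)                // identity shortcut: same object, skip the field checks
            return true;
        auto p = cast(Point) o;
        return p !is null && x == p.x && y == p.y;
    }

    override size_t toHash() @trusted nothrow
    {
        return cast(size_t)(x * 31 + y);
    }
}

unittest
{
    auto a = new Point(1, 2);
    auto b = new Point(1, 2);
    assert(a == b);    // equal by value
    assert(a !is b);   // but not the same object
    assert(a == a);    // identity fast path taken here
}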
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 23:33:30 -0400, Jonathan M Davis via Digitalmars-d
<digitalmars-d puremagic.com> wrote:

 On Mon, 19 May 2014 15:20:46 -0400
 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On Mon, 19 May 2014 15:03:55 -0400, Dicebot <public dicebot.lv> wrote:
 On Monday, 19 May 2014 at 17:35:34 UTC, Steven Schveighoffer wrote:
 On Mon, 19 May 2014 13:31:08 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Monday, 19 May 2014 at 17:11:43 UTC, Steven Schveighoffer
 wrote:
 It shouldn't matter. Something that returns immutable
 references, can return that same thing again if asked the same
 way. Nobody should be looking at the address in any meaningful
 way.
 I think this is at odds with generic programming. What you are
 saying is that if you plug a pure function into an algorithm then
 you have to test for "pure" in the algorithm if it is affected by
 object identity. Otherwise, goodbye plug-n-play.
 I think I misstated this, of course, looking at the address for
 certain reasons is OK, Object identity being one of them.
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is
 at work and `false` if strongly pure function will get actually
 called twice. If changing result of your program because of
 silently enabled compiler optimization does not indicate a broken
 compiler I don't know what does.
 The code is incorrectly implemented, let me fix it:

 bool oops() pure
 {
     return false;
 }
 Sure, that particular example may be contrived, but there's plenty of
 code out there that checks for whether two objects are the same object.
 A classic example would be for skipping equality checks if the two
 objects are the same object
If anything, the optimization will make this more efficient.

 , but it could be used in far more complex situations that don't involve
 doing equality checks when two objects aren't the same.
This is hand-waving. I need a real example to understand.

 The reality of the matter is that regardless of whether you think that
 checking two objects to see whether they're the same object is a valid
 operation or not, it's perfectly legal to do so and _will_ happen in
 code, so if the compiler makes optimizations which will change whether
 two objects are the same object (e.g. optimizing out a second call to a
 pure function within an expression so that the result is reused in spite
 of the fact that the function returns a new object each time it's
 called), then the result is that the semantics of the program have
 changed due to an optimization. And that is most definitely a bad thing.
How so? What exactly happens when two identical, but immutable objects,
are optimized into being the same object?

 IMHO, if the compiler cannot guarantee that a pure function returns the
 same object every time if it returns a reference type, then the compiler
 should never optimize out multiple calls to that function. To do
 otherwise would make it so that the semantics of the program change due
 to an optimization.
General reference type, yes. Immutable reference type, no.

-Steve
May 20 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/19/2014 09:03 PM, Dicebot wrote:
 immutable(Object*) alloc() pure
 {
      return new Object();
 }

 bool oops() pure
 {
      auto a = alloc();
      auto b = alloc();
      return a is b;
 }

 This is a snippet that will always return `true` if memoization is at
 work and `false` if strongly pure function will get actually called
 twice. If changing result of your program because of silently enabled
 compiler optimization does not indicate a broken compiler I don't know
 what does.
Furthermore, it may not at all be obvious that this is happening: After all, purity can be inferred for template-heavy code, and comparing addresses will not prevent purity inference.
May 19 2014
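A hypothetical illustration of that point: the address comparison ends up inside a template whose attributes are inferred, so nothing in the source ever spells out that subtle() is a strongly pure function whose result hinges on object identity.

bool sameInstance(T)(T a, T b)   // pure, nothrow, @safe are all inferred here
{
    return a is b;
}

immutable(Object) make() pure
{
    return new immutable(Object);
}

bool subtle() pure
{
    // Whether this is true or false depends on whether the two calls to
    // make() are folded into one.
    return sameInstance(make(), make());
}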
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 13:11:43 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 On Mon, 19 May 2014 12:35:26 -0400, Jonathan M Davis via
 Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Mon, 19 May 2014 09:42:31 -0400
 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by
 compiler. A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
Memoizing reference returns that are immutable should be fine.
Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider that the behaviors changed due to the fact that the objects are equal but not the same object (which can matter for reference types)
It shouldn't matter. Something that returns immutable references, can return that same thing again if asked the same way.
Except that a pure function _can't_ return the same object by definition - not unless it was passed in via an argument. And unless the compiler can guarantee that the object being returned _isn't_ newly allocated, then it has to assume that it could have been, in which case, it can't assume that two calls to the same pure function return the same object. It may return _equal_ objects - but not the same object. And since we're talking about reference types, the fact that they're not the same object can definitely have an impact on the behavior of the program.
 Nobody should be looking at the address in any meaningful way.
Maybe they shouldn't be, but there's nothing stopping them. It's perfectly legal to write code which depends on the value of the address of an object. So, memoizing the result of a pure function which returns a reference type _will_ have an impact on the behavior of some programs.
 is something that matters. It has less of an impact when
 you're dealing with immutable objects, because changing the value
 of one won't
 change the value of another, but it can still change the behavior
 of the program due to the fact that they're not actually the same
 object.
Such a program is incorrectly written.
Well, then Object.toHash is incorrectly written.
 And given that the compiler can only memoize functions within a
 single expression (or maybe statement - I can't remember which) - I
 don't think that
 that restriction even costs us much.
It can make a huge difference, and it doesn't have to be memoized within the same expression, it could be memoized globally with a hashtable, or within the same function.
Not by the compiler. All it ever does with regards to memoization is optimize out multiple calls to the same function with the same argument within a single expression. It doesn't even make that optimization elsewhere within a function. Sure, a programmer could choose to explicitly memoize the result, but then it's up to them to not memoize it when it shouldn't be memoized. - Jonathan M Davis
May 19 2014
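For what it's worth, a small sketch of how that surfaces through hashtables (the Key class is made up): with the default identity-based opEquals and address-based Object.toHash, two equal-by-value objects are distinct AA keys, so whether a function hands back the same object or a fresh one is observable.

class Key
{
    int v;
    this(int v) pure { this.v = v; }
}

unittest
{
    auto a = new Key(1);
    auto b = new Key(1);

    int[Key] table;
    table[a] = 10;

    // b is a different object, so with the default address-based toHash
    // (and identity-based opEquals) it is a different key.
    assert(a in table);
    assert(b !in table);
}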
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 May 2014 23:14:03 -0400, Jonathan M Davis via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Mon, 19 May 2014 13:11:43 -0400
 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On Mon, 19 May 2014 12:35:26 -0400, Jonathan M Davis via
 Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Mon, 19 May 2014 09:42:31 -0400
 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by
 compiler. A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
Memoizing reference returns that are immutable should be fine.
Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider that the behaviors changed due to the fact that the objects are equal but not the same object (which can matter for reference types)
It shouldn't matter. Something that returns immutable references, can return that same thing again if asked the same way.
Except that a pure function _can't_ return the same object by definition - not unless it was passed in via an argument.
Compiles today (2.065):

immutable(Object) alloc() pure
{
   static immutable o = new immutable(Object);
   return o;
}
 Such a program is incorrectly written.
Well, then Object.toHash is incorrectly written.
It might actually be, yes. toHash really should examine the object contents, not the address. -Steve
May 20 2014
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sun, 18 May 2014 06:58:25 -0700
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via
 Digitalmars-d wrote:
 On Thu, 15 May 2014 08:43:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 On 5/15/14, 6:28 AM, Dicebot wrote:
 This is not true. Because of such code you can't ever
 automatically memoize strongly pure function results by
 compiler. A very practical concern.
I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should be perfectly memoizable.
[...]
bool func(int x) /* pure? */ {
    int[] a, b;
    a = new int[x];
    b = new int[x];
    return (a.ptr < b.ptr);
}
Where do you draw the line?
Hmmmm. I think that it was pointed out somewhere else in this thread that that sort of comparison is already undefined in C (and thus probably D) - certainly it's not something that's really valid to do normally. However, there are other things that we currently do "normally" which have similar problems (e.g. toHash on Object hashing the reference). Maybe if we could come up with a set of operations which weren't valid in a pure function because they'd behave differently depending on which memory block was given to new, then we could make it work. But we may simply need to say that memoization of pure functions just isn't going to work if we allow allocations to take place in pure functions. That wouldn't be ideal, but I'm also not convinced that it matters much.

It's already so rare that memoization of a function call can occur, that I'm pretty much convinced that memoization is useless as an optimization - at least as long as the compiler is doing it. After all, how often does a function get called with same the arguments within a single function let alone a single expression (and as I understand it, memoization only ever occurs at this point within either a single expression or statement - I don't remember which - but regardless, it's not even within a whole function)? And since there's no way that the compiler is going to memoize calls to a function across functions or even across calls to the same function which is calling the function which is being memoized, unless it's very common to call a function with the same result within a single function (and I really don't think that it is), then memoization by the compiler is really of minimal benefit, much as it would be nice to have it where we can.

Regardless, we should error on the side of not memoizing in order to avoid undefined behavior due to memory allocations causing the pure function to act slightly differently across function calls (even when it's given the same arguments).

- Jonathan M Davis
May 18 2014
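For reference, the kind of spot that restriction refers to, with a made-up function: two strongly pure calls with identical arguments inside one expression are the only candidates for this folding.

int square(int x) pure nothrow @safe
{
    return x * x;
}

int example()
{
    // The second call may be elided and the first result reused, since the
    // arguments are identical and square is strongly pure.
    return square(2) * square(2);
}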
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 01:19:29 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 It's already so rare that memoization of a function call can 
 occur, that I'm
 pretty much convinced that memoization is useless as an 
 optimization - at
 least as long as the compiler is doing it. After all, how often 
 does a
 function get called with same the arguments within a single 
 function let alone
 a single expression (and as I understand it, memoization only 
 ever occurs at
 this point within either a single expression or statement - I 
 don't remember
 which - but regardless, it's not even within a whole function)?
Memoization is valid throughout the program. Opportunities occur frequently with generic programming and inlining.
 Regardless, we should error on the side of not memoizing in 
 order to avoid
Then you don't need strict pure functions.
May 18 2014
parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 05:16:13 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Monday, 19 May 2014 at 01:19:29 UTC, Jonathan M Davis via
 Digitalmars-d wrote:
 It's already so rare that memoization of a function call can
 occur, that I'm
 pretty much convinced that memoization is useless as an
 optimization - at
 least as long as the compiler is doing it. After all, how often
 does a
 function get called with same the arguments within a single
 function let alone
 a single expression (and as I understand it, memoization only
 ever occurs at
 this point within either a single expression or statement - I
 don't remember
 which - but regardless, it's not even within a whole function)?
Memoization is valid throughout the program. Opportunities occur frequently with generic programming and inlining.
I seriously question that assertion. How often do you really call the same function with the same arguments? And given that for the memoization to be done by the compiler without saving the result somewhere (which could actually _harm_ efficiency in many cases and certainly isn't something that the compiler does right now), the most that it could possibly memoize would be within a single function, that makes it all the less likely that it could memoize any calls. And given that the most it currently memoizes would be within a single expression (or possibly a single statement - I can't remember which), that makes it so that about the only time that it can memoize function calls is when you do something like foo(2) * foo(2), and how often does _that_ happen? Sure, it happens from time to time, but in my experience, it's very rare to make the same function call with the same arguments within a single expression or statement.
 Regardless, we should error on the side of not memoizing in
 order to avoid
Then you don't need strict pure functions.
As far as I'm concerned the two big gains from pure are

1. it makes it easier to reason about code, because it guarantees that the
function didn't access any global or static variables.

2. it allows us to implicitly convert to different levels of mutability for
the return type of pure functions where the compiler can guarantee that the
return value was allocated within the function.

Any other benefits we get from pure are great, but they're incidental in
comparison. And in particular, in my experience, memoization opportunities are
so rare that there's really no point in worrying about them. They're just a
nice bonus when they do happen.

- Jonathan M Davis
May 18 2014
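A small sketch of that first gain (names made up): a pure function cannot touch module-level state at all; a weakly pure one may still write through its parameters, and a strongly pure one depends only on its arguments.

int counter;   // module-level mutable state

// Not allowed: a pure function cannot read or write counter directly.
// int bad() pure { return ++counter; }

// Weakly pure: no globals are touched, but the caller's data can still be
// modified through the pointer parameter, so everything it affects is
// visible in the signature.
void bump(int* p) pure
{
    ++*p;
}

// Strongly pure: only a value parameter, so the result depends on the
// argument alone.
int addOne(int x) pure nothrow @safe
{
    return x + 1;
}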
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 05:39:49 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 1. it makes it easier to reason about code, because it 
 guarantees that the
 function didn't access any global or static variables.
It can, through the parameters, like an array of pointers. And avoiding IO is not sufficient to mark 90% of my code as weakly pure.
 2. it allows us to implicitly convert to different levels of 
 mutability for
 the return type of pure functions where the compiler can 
 guarantee that the
 return value was allocated within the function.
But if you can have a struct/pointer as a parameter then you can clearly return objects not allocated in the function?
May 18 2014
parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 06:05:26 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Monday, 19 May 2014 at 05:39:49 UTC, Jonathan M Davis via
 Digitalmars-d wrote:
 1. it makes it easier to reason about code, because it
 guarantees that the
 function didn't access any global or static variables.
It can, through the parameters, like an array of pointers. And avoiding IO is not sufficient to mark 90% of my code as weakly pure.
Except that if most of your code is marked pure, then there aren't very many points where it could access a global or static variable. And regardless of that, the fact that the only way to access a global or static variable in a pure function is through one of its arguments still guarantees that the function isn't messing with anything that wasn't passed to it, so that still helps a lot with being able to reason about code. Sure, it could still be passed an argument that points to a global variable (directly or indirectly), but then you only have to worry about globals being accessed if they were passed in (directly or indirectly). Sure, it's not as big a gain as when a function is strongly pure, but it's good enough for most cases.
 2. it allows us to implicitly convert to different levels of
 mutability for
 the return type of pure functions where the compiler can
 guarantee that the
 return value was allocated within the function.
But if you can have a struct/pointer as a parameter then you can clearly return objects not allocated in the function?
The compiler is smart enough in many cases to determine whether the return value could have been passed in or not (though it wouldn't surprise me if it could be made smarter in that regard). With a function like string foo(string bar) pure {...} it can't assume that the return type is unique, because it could have been passed in via the parameter, but with string foo(char[] bar) pure {..} or int* foo(string bar) pure {..} it could, because it's impossible for the parameter to be returned from the function (unless casting that breaks the type system is used anyway - and the compiler is free to assume that that isn't done). So, it varies quite a bit as to whether a pure function is guaranteed to be returning newly allocated memory or not, but the compiler often can determine that, and when it can, it makes dealing with immutable far, far more pleasant. It's particularly useful when you need to allocate an immutable object but also need to mutate it as part of initializing it. If you do it in a pure function where the compiler knows that the object couldn't have been passed in, then the return type can be freely converted to various levels of mutability - including immutable - without having to use immutable within the function. - Jonathan M Davis
May 18 2014
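A sketch of the uniqueness reasoning described above, with made-up names: the parameter is a string, so the returned pointer cannot alias anything the caller passed in, and the freshly built value converts to immutable at the call site without the function ever mentioning immutable.

int* makeCounter(string label) pure
{
    auto p = new int;
    *p = cast(int) label.length;   // mutate freely while building the value
    return p;
}

unittest
{
    // Implicit conversion of the unique result to immutable is accepted.
    immutable(int)* frozen = makeCounter("hello");
    assert(*frozen == 5);
}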
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 06:30:46 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 makes dealing with immutable far, far more pleasant. It's 
 particularly useful
 when you need to allocate an immutable object but also need to 
 mutate it as
 part of initializing it. If you do it in a pure function where 
 the compiler
 knows that the object couldn't have been passed in, then the 
 return type can
 be freely converted to various levels of mutability - including 
 immutable -
 without having to use immutable within the function.
It does not appear as a clean design that functions should have different semantics than a block. What matters is that the object reference is unique. Confusing this with pure seems like a bad idea.
May 19 2014
parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 19 May 2014 07:37:55 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Monday, 19 May 2014 at 06:30:46 UTC, Jonathan M Davis via
 Digitalmars-d wrote:
 makes dealing with immutable far, far more pleasant. It's
 particularly useful
 when you need to allocate an immutable object but also need to
 mutate it as
 part of initializing it. If you do it in a pure function where
 the compiler
 knows that the object couldn't have been passed in, then the
 return type can
 be freely converted to various levels of mutability - including
 immutable -
 without having to use immutable within the function.
It does not appear as a clean design that functions should have different semantics than a block. What matters is that the object reference is unique. Confusing this with pure seems like a bad idea.
I don't follow you. The fact that D's pure helps the compiler determine cases where it knows that the return value of a function is unique is a key feature of pure and has proven to be a great idea. Perhaps you're hung up on the fact that the term "pure" is being used, and you're thinking about functional purity? If so, forget about pure in the functional sense if you want to discuss D purity. You need to think of it as something more like noglobal. That combined with other information in the function signature allows the compiler to determine cases where it knows that the returned value is unique. It also can lead to the compiler determining that a function is functionally pure and thus memoizable, but at this point, that's pretty incidental to what pure is and does. It's part of it, but it's not the primary feature of what D's pure is or is for.

It's unfortunate that the language's evolution led us to using the term pure for what it's currently used for, but we're pretty much stuck with it at this point. Regardless, the fact that D's pure allows us to determine when the return value of a function has to be unique is fantastic and has proven very helpful.

- Jonathan M Davis
May 19 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 19 May 2014 at 08:51:11 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 Perhaps you're hung up on the fact
 that the term "pure" is being used, and you're thinking about 
 functional
 purity?
No, I just don't think it makes much sense the way "pure" is defined in D. Since it doesn't actually mean anything specific unless you also do analysis of the parameters and return type. If you put a restriction on a function then that restriction should be well defined, clear and useful for a specific purpose.
 stuck with it at this point. Regardless, the fact that D's pure 
 allows us to
 determine when the return value of a function has to be unique
But it doesn't declare a return value to be unique… It just states that there are no side effects except through the arguments, and except for object identity. I am also not sure if it makes much sense to make it mandatory to define a function in order to initialize an immutable value in an imperative language. I don't like the orthogonal aspect of blocks and functions. Imperative functions and procedures are essentially named blocks of statements. Pure functions are essentially expressions.
May 19 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 02:57 PM, Steven Schveighoffer wrote:
 On Thu, 15 May 2014 02:50:05 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
     return (new int[1].ptr) > (new int[1].ptr);
 }
Yes, and then you may as well allow a random generator with private globals. Because memoing is no longer sound anyway.
This has nothing to do with allocators being pure. They must be pure as a matter of practicality. The issue that allows the above anomaly is using the address of something. There is a reason functional languages do not allow these types of things. But in functional languages, allocating is allowed. ...
Not really, allocation is just an implementation detail. The computational language is meaningful independent of how you might achieve evaluation of expressions. I can in principle evaluate an expression of such a language on paper without managing a separate store, even though this usually will help efficiency. Functional programming languages are not about taking away features from a procedural core language. They are based on the idea that the fundamental operation is substitution of terms.
 To be honest, code that would exploit such an anomaly is only ever used
 in "proof" exercises, and never in real code.
Hashtables are quite real.
 I don't think it's an issue.

 -Steve
This is the issue: On Thu, 15 May 2014 10:48:07 +0000 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 Yes. 'strong pure' means pure in the way that the functional
 language crowd means 'pure'.
 'weak pure' just means doesn't use globals.
There is no way to make that claim precise in an adequate way such that it is correct.
May 15 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 15 May 2014 11:42:00 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/15/2014 02:57 PM, Steven Schveighoffer wrote:
 On Thu, 15 May 2014 02:50:05 -0400, Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com> wrote:

 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
     return (new int[1].ptr) > (new int[1].ptr);
 }
Yes, and then you may as well allow a random generator with private globals. Because memoing is no longer sound anyway.
This has nothing to do with allocators being pure. They must be pure as a matter of practicality.

 The issue that allows the above anomaly is using the address of
 something. There is a reason functional languages do not allow these
 types of things. But in functional languages, allocating is allowed.
 ...
Not really, allocation is just an implementation detail. The computational language is meaningful independent of how you might achieve evaluation of expressions. I can in principle evaluate an expression of such a language on paper without managing a separate store, even though this usually will help efficiency.

 Functional programming languages are not about taking away features from a procedural core language. They are based on the idea that the fundamental operation is substitution of terms.
But they do not deal in explicit pointers. Otherwise, that's a can of worms that would weaken the guarantees, similar to how D's guarantees are weakened. We have no choice in D, we must accept that explicit pointers are used.
 To be honest, code that would exploit such an anomaly is only ever used in "proof" exercises, and never in real code.
Hashtables are quite real.
Pretend I'm ignorant, what does this imply?
 This is the issue:

 On Thu, 15 May 2014 10:48:07 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 Yes. 'strong pure' means pure in the way that the functional
 language crowd means 'pure'.
 'weak pure' just means doesn't use globals.
 There is no way to make that claim precise in an adequate way such that it is correct.
This doesn't help, I'm lost with this statement. -Steve
May 15 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 07:41 PM, Steven Schveighoffer wrote:
 Not really, allocation is just an implementation detail. The
 computational language is meaningful independent of how you might
 achieve evaluation of expressions. I can in principle evaluate an
 expression of such a language on paper without managing a separate
 store, even though this usually will help efficiency.

 Functional programming languages are not about taking away features
 from a procedural core language. They are based on the idea that the
 fundamental operation is substitution of terms.
But they do not deal in explicit pointers.
(Well, http://hackage.haskell.org/package/base-4.7.0.0/docs/Foreign-Marshal-Alloc.html#v:malloc )
 Otherwise, that's a can of
 worms that would weaken the guarantees, similar to how D's guarantees
 are weakened.
 ...
The concept of a 'pointer' to some primitive value does not have a meaning in such languages. Every value is an "rvalue".
 We have no choice in D, we must accept that explicit pointers are used.
 ...
We could e.g. ban comparing immutable references for identity / in-memory order.
 To be honest, code that would exploit such an anomaly is only ever used
 in "proof" exercises, and never in real code.
Hashtables are quite real.
Pretend I'm ignorant, what does this imply? ...
E.g. order of iteration may be dependent on in-memory order of keys and the 'same' keys might occur multiple times.
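(A rough illustrative sketch of the hashtable point, assuming an associative array keyed on addresses; the names are made up. The observable iteration order falls out of where the allocator happened to place the keys.)

import std.stdio : writeln;

void main()
{
    bool[int*] seen;            // keys are pointers, i.e. addresses
    foreach (i; 0 .. 4)
    {
        auto p = new int;
        *p = i;
        seen[p] = true;
    }
    // Iteration order depends on the key hashes, i.e. on the allocation
    // addresses, so two runs of the "same" computation may observe
    // different orders.
    foreach (key, value; seen)
        writeln(*key);
}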
 This is the issue:

 On Thu, 15 May 2014 10:48:07 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 Yes. 'strong pure' means pure in the way that the functional
 language crowd means 'pure'.
 'weak pure' just means doesn't use globals.
There is no way to make that claim precise in an adequate way such that it is correct.
This doesn't help, I'm lost with this statement. -Steve
The issue is that there are apparently different expectations about what the keyword is supposed to do.
May 15 2014
prev sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 15 May 2014 at 06:50:06 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 15 May 2014 at 06:29:06 UTC, bearophile wrote:
 A little example of D purity (this compiles):
 bool randomBit() pure nothrow @safe {
    return (new int[1].ptr) > (new int[1].ptr);
 }
Yes, and then you may as well allow a random generator with private globals. Because memoing is no longer sound anyway.
No, this particular example appears to be invalid code. Under the C memory model [1], the result of comparing two pointers to distinct objects (allocations) is undefined/unspecified behavior (I think it is undefined in C and unspecified in C++, but I don't remember the details). Of course, you could rescue the example by casting the pointers to size_t or something along those lines, but this is something that could be disallowed in pure code.

David

[1] Which typically applies to D, unless defined otherwise. Our language specification is woefully incomplete in this regard, though.
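(For what it's worth, a sketch of the cast-based "rescue" mentioned above - my variant, not code from the thread. The comparison becomes well defined, but it is still address-dependent, so the memoization problem remains; casting a pointer to an integer also cannot be @safe.)

bool randomBitViaCast() pure nothrow
{
    auto a = cast(size_t) (new int[1].ptr);
    auto b = cast(size_t) (new int[1].ptr);
    return a > b;   // still effectively a coin flip, yet accepted as pure
}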
May 15 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ola Fosheim Grøstad:

 Pure in D seems pointless to me.
If you start using pure in D you see it's like const: it allows you to remove certain kinds of mistakes from the code, and it makes it easier to reason about the code. You can use mutability inside a strongly pure function. This is very good.

Bye,
bearophile
May 14 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 06:24:30 UTC, bearophile wrote:
 If you start using pure in D you see it's like const: it allows 
 you to remove certain kinds of your mistakes from the code, and 
 it makes it more easy to reason about the code.
As lint-like functionality, yes.
 You can use mutability inside a strongly pure function. This is 
 a very good.
Local mutability does not affect purity; it is the pre- and post-conditions at the call site that matter.
May 14 2014
prev sibling next sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 05:51:14 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out of it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
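(A minimal sketch - my example, not from the thread - of the one guarantee the annotation actually enforces: no access to mutable global or static state.)

int counter;                 // mutable, thread-local global
immutable int limit = 100;   // immutable global

int bump() pure
{
    // return ++counter;     // rejected: a pure function cannot access
                             // mutable static data like 'counter'
    return limit;            // fine: immutable globals may be read
}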
May 14 2014
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 15 May 2014 at 06:59:08 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 And it _definitely_ has nothing to do with functional purity.
Which makes it pointless and misleading.
 Now, combined with other information, you _can_ get functional 
 purity out it -
 e.g. if all the parameters to a function are immutable, then it 
 _is_
 functionally pure, and optimizations requiring functional 
 purity can be done
 with that function.
No, you can't say it is functionally pure if you can flip a coin with a pure function. To do that you would need a distinction between "prove pure" and "assume pure" as well as having immutable reference types that ban identity comparison.
 So, no, purity does _not_ imply memoization.
It should, or use a different name.
May 15 2014
next sibling parent "Ola Fosheim Grøstad" writes:
And mutating through parameters does not affect functional purity 
in the theoretical sense if the function holds the only reference 
to it. It is comparable to taking one value and returning a new 
version of it.

Rigor is important in language design. If you cannot say ALWAYS, 
then the compiler will have to assume NEVER.
May 15 2014
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 07:22:02 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Thursday, 15 May 2014 at 06:59:08 UTC, Jonathan M Davis via
 Digitalmars-d wrote:
 And it _definitely_ has nothing to do with functional purity.
Which makes it pointless and misleading.
 Now, combined with other information, you _can_ get functional
 purity out it -
 e.g. if all the parameters to a function are immutable, then it
 _is_
 functionally pure, and optimizations requiring functional
 purity can be done
 with that function.
No, you can't say it is functionally pure if you can flip a coin with a pure function. To do that you would need a distinction between "prove pure" and "assume pure" as well as having immutable reference types that ban identity comparison.
 So, no, purity does _not_ imply memoization.
It should, or use a different name.
Originally, pure required that the function parameters be immutable, in addition to disallowing the function from accessing global or static variables or calling functions that weren't pure. It allowed for mutation within the function, and it allowed for allocation via new, but from the outside, the function _was_ functionally pure. The problem was that it was almost useless. You just couldn't do anything in a pure function that mattered most of the time. You couldn't call any other functions from the pure function unless they were pure, which meant that the arguments to them had to be immutable. That just didn't work, because while the arguments to the first function were immutable, what it had to do internally often involved operating on other variables created within the function which were not immutable and didn't need to be immutable - but you couldn't pass them to any functions unless they were immutable, thanks to the fact that all pure functions had to have immutable parameters, and pure functions could only call pure functions. It just didn't work.

So, Don introduced the idea of "weak" purity. What it comes down to is that it's an extension of the concept that mutation within a pure function is fine just so long as its arguments aren't mutated. We made it so that pure functions _didn't_ have to have immutable parameters. They just couldn't access anything that wasn't passed to them as arguments. This meant that they could only mutate what they were given and thus they didn't violate the "strong" purity of the original pure function which had immutable parameters. e.g.

import std.conv : to;

string strongFunc(immutable string str, int i) pure
{
    auto result = str ~ " hello world ";
    weakFunc(result, i);
    return result;
}

void weakFunc(ref string str, int i) pure
{
    foreach(j; 0 .. i)
        str ~= to!string(j);
}

The strong guarantees that strongFunc has which make it functionally pure are not violated by the fact that weakFunc is _not_ functionally pure. But by marking it pure, it guarantees that it can safely be called from a strongly pure function without violating the guarantees of that strongly pure function. To do that, _all_ we need to guarantee is that the weakly pure function cannot access anything save for what's passed into it (since if it could access global variables, that would violate the guarantees of any other pure functions that called it), but we do need that guarantee.

The result is that the pure attribute doesn't in and of itself mean functional purity anymore, but it _can_ be used to build a function which is functionally pure. You could argue that a different attribute should be used other than pure to mark "weakly" pure functions, but that would just complicate things. The compiler is capable of figuring out the difference between a "weakly" pure and a "strongly" pure function on its own from just the function signature. So, we only need one attribute - one to mark the fact that the function can't access global, mutable state and can't call any functions that can. And we were already marking strongly pure functions with pure, so it made perfect sense to use it on weakly pure functions as well. At that point, it was just up to the compiler to detect whether the function was strongly pure or not and thus was functionally pure and could be used in optimizations.
So, sorry that it offends your sensibilities that pure by itself does not indicate functional purity, but it's a building block for functional purity, and the evolution of things made it make perfect sense to use the pure attribute for this. And even if pure _didn't_ enable functional purity, it would still be highly useful just from the fact that a pure function (be it weak or strong) cannot access global variables, and that makes it _much_ easier to reason about code, because you know that it isn't accessing anything that wasn't passed to it. I recommend that you read this article by David Nadlinger: http://klickverbot.at/blog/2012/05/purity-in-d/ - Jonathan M Davis
May 15 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 09:23:00 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 functions that weren't pure. It allowed for mutation within the 
 function, and
 it allowed for allocation via new, but from the outside, the 
 function _was_
 functionally pure.
If it didn't return the memory allocated with new, and if the call to new couldn't result in an exception, yes.
 It just didn't work.
That I question. A pure function ( http://en.wikipedia.org/wiki/Pure_function ) depends on the values of the parameters, and only that. That is most useful. Those values can be very complex. You could have a pure member function look up values in a cache. Then the configuration of the entire cache is the value. You need to think about this in terms of pre/post conditions in Hoare Logic (which I am not very good at btw).
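(A small sketch of the cache-as-part-of-the-value idea; the type and names are mine. If the cache is just part of the object's value, a pure member function reading it still depends only on its inputs, 'this' included.)

struct Table
{
    immutable(int)[] cache;   // the cache is simply part of the value

    // Depends only on 'this' and 'key'; for a given Table value,
    // repeated calls must give the same answer.
    int lookup(size_t key) const pure nothrow @safe
    {
        return key < cache.length ? cache[key] : -1;
    }
}

void main() @safe
{
    auto t = Table([10, 20, 30]);
    assert(t.lookup(1) == 20);
    assert(t.lookup(9) == -1);
}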
 So, Don introduced the idea of "weak" purity. What it comes 
 down to is that
 it's an extension of the concept that mutation within a pure 
 function is fine
 just so long as its arguments aren't mutated. We made it so 
 that pure
 functions _didn't_ have to have immutable parameters. They just 
 couldn't
 access anything that wasn't passed to them as arguments. This 
 meant that they
 could only mutate what they were given and thus they didn't 
 violate the
 "strong" purity of the original pure function which had 
 immutable parameters.
And that's fine as long as nobody else is holding a reference to those mutable parameters. That means that you are taking version N of the mutable value and returning version N+1. That's similar to

x = 1
a = f(x)
x = x + 1
b = f(x)

which can be rewritten as:

x0 = 1
a = f(x0)
x1 = x0 + 1
b = f(x1)

If you think in terms of a context for purity such as a transaction then you can even allow access to globals as long as they remain constant until the transaction is committed (or you leave the context where purity is desired). Meaning, you can memoize within that context.
 functions that called it), but we do need that guarantee. The 
 result is that
 the pure attribute doesn't in and of itself mean functional 
 purity anymore,
 but it _can_ be used to build a function which is functionally 
 pure.
But, that can be deduced by the compiler, so what is the point of having "pure" for "weakly pure"? Clearly you only need to specify "strongly pure"?
 So, sorry that it offends your sensibilities that pure by 
 itself does not
 indicate functional purity, but it's a building block for 
 functional purity,
It doesn't offend me, but it is a source of confusion and not sticking to definitions makes the language design look random. The same thing goes for lazy parameters. Why pick Call-By-Name (Algol style) over Call-By-Need (memoing)? Call-By-Name is known to be error prone. Actually, lazy parameters should be restricted to pure expressions… if correctness and safety is the goal.
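(To illustrate the call-by-name point with D's lazy parameters - my example, not from the thread. A lazy argument is re-evaluated on every read rather than memoized, which is exactly the Algol-style behaviour being criticised.)

int evaluations;        // counts how often the argument is computed

int expensive()
{
    return ++evaluations;
}

int useTwice(lazy int x)
{
    return x + x;       // the underlying expression runs twice here
}

void main()
{
    auto r = useTwice(expensive());
    assert(evaluations == 2);   // call-by-name: no memoization
    assert(r == 1 + 2);
}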
 attribute for this. And even if pure _didn't_ enable functional 
 purity, it
 would still be highly useful just from the fact that a pure 
 function (be it
 weak or strong) cannot access global variables, and that makes 
 it _much_
 easier to reason about code, because you know that it isn't 
 accessing anything
 that wasn't passed to it.
Ok, but maybe the opposite would be better. Marking functions that access globals with @global or something. After all, most functions don't access globals.
May 15 2014
parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 10:10:57 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Thursday, 15 May 2014 at 09:23:00 UTC, Jonathan M Davis via
 Digitalmars-d wrote:
 functions that weren't pure. It allowed for mutation within the
 function, and
 it allowed for allocation via new, but from the outside, the
 function _was_
 functionally pure.
If it didn't return the memory allocated with new and if the call to new resulted in an exception, yes.
 It just didn't work.
That I question. A pure function ( http://en.wikipedia.org/wiki/Pure_function ) depends on the values of the parameters, and only that. That is most useful. Those value can be very complex. You could have a pure member function look up values in a cache. Then the configuration of entire cache is the value. You need to think about this in terms of pre/post conditions in Hoare Logic (which I am not very good at btw).
 So, Don introduced the idea of "weak" purity. What it comes
 down to is that
 it's an extension of the concept that mutation within a pure
 function is fine
 just so long as its arguments aren't mutated. We made it so
 that pure
 functions _didn't_ have to have immutable parameters. They just
 couldn't
 access anything that wasn't passed to them as arguments. This
 meant that they
 could only mutate what they were given and thus they didn't
 violate the
 "strong" purity of the original pure function which had
 immutable parameters.
And that's fine as long as nobody else is holding a reference to those mutable parameters.
That would only matter if the compiler were trying to optimize based on pure functions with mutable parameters. It doesn't. And it would actually be very difficult for it to do so without doing full program optimization, which really doesn't work with the C linking model that D uses. The fact that we have thread-local by default helps, but it's not enough for more than very simple cases. The compiler doesn't care about optimizing weakly pure functions. The whole purpose of weakly pure functions is to have functions which aren't functionally pure but can still be used in functions that _are_ functionally pure.
 If you think in terms of a context for purity such as a
 transaction then you can even allow access to globals as long as
 they remain constant until the transaction is committed (or you
 leave the context where purity is desired). Meaning, you can
 memoize within that context.
That doesn't work with D's model, because it doesn't have any concept of transactions like that. It also doesn't really have any concept of memoization either. The most it does that is anything like memoization is to optimize away multiple calls to a function within a single expression (or maybe within a statement - I don't remember which). So,

auto y = sqrt(x) * sqrt(x);

might become something more like

auto temp = sqrt(x);
y = temp * temp;

but after that, the result of sqrt(x) is forgotten. So, in reality, the optimization gains from strongly pure functions are pretty minimal (almost non-existent really). If we were willing to do code flow analysis, we could probably make more optimizations (assuming that the exact same function call was made several times within a single function, which isn't all that common anyway), but Walter is generally against doing code flow analysis in the compiler due to the complications that it adds. We have some, but not a lot.

The two main gains for purity are

1. being able to know for sure that a function doesn't access any global variables, which makes it easier to reason about the code.

2. being able to implicitly convert types to and from mutable, const, and immutable based on the knowledge that a particular value has to be unique.

I'd say that functional purity was really the original goal of adding pure to the language, but it's really those two effects which have given us the most benefit, and they make dealing with immutable a lot more pleasant - particularly when it comes to creating immutable objects that require mutation in order to be initialized properly.
 functions that called it), but we do need that guarantee. The
 result is that
 the pure attribute doesn't in and of itself mean functional
 purity anymore,
 but it _can_ be used to build a function which is functionally
 pure.
But, that can be deduced by the compiler, so what is the point of having "pure" for "weakly pure"? Clearly you only need to specify "strongly pure"?
It can't be deduced from the signature, and the compiler has to be able to know based only on the signature, because it doesn't necessarily have the source code for the function available. The only functions for which the compiler ever deduces anything from their bodies are templated functions, because it always has their bodies, and if it didn't do the attribute inference for you, you'd be forced to either restrict the function by marking it (or not marking it) with a particular set of attributes, or you'd have to duplicate the function for each combination of attributes that you wanted. So, when you mark a function as pure, the compiler verifies that it's pure based on its body, but when other functions use it, all they know or care about is the signature of the function. So, it's up to the programmer to tell the compiler that a function is supposed to be pure, and it's up to the compiler to verify that, and then use that knowledge to infer other things based purely on the signature (like functional purity or the ability to implicitly convert a type to the same type but with different mutability).
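(A short sketch of the inference point; the identifiers are mine. Templates are the one place where the compiler looks at the body, so attributes need not be written out there, while a normal function must declare pure in its signature because callers only ever see the signature.)

// No attributes written: because the body is always available, the
// compiler infers pure (and nothrow, @nogc, @safe) per instantiation.
auto twice(T)(T x)
{
    return x + x;
}

// A non-template function has to say 'pure' itself.
int fourTimes(int x) pure nothrow @safe
{
    return twice(x) + twice(x);   // OK: twice!int is inferred pure
}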
 Ok, but maybe the opposite would be better. Marking functions
 that access globals with  global or something. After all, most
 functions don't access globals.
I agree that that would be great if we were starting over - just like it would be great if @safe were the default, but it's too late now. We have to make do with what we have. We can make backwards-compatible changes, but we're very much trying to minimize breaking changes at this point. We'll still make them if we really think that they're needed, but we really want to avoid it. And as nice as changing some of the attributes around would be, it's too big a breaking change for too little gain. - Jonathan M Davis
May 15 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 11:01:38 UTC, Jonathan M Davis via 
Digitalmars-d wrote:
 functions with mutable parameters. It doesn't. And it would 
 actually be very
 difficult for it to do so without doing full program 
 optimization, which
 really doesn't work with the C linking model that D uses.
I think this C linking excuse is used too much :-). Clearly one can specify properties of C functions when writing the bindings (or should be able to). Besides, not all call trees are full of C calls.
 functions. The whole
 purpose of weakly pure functions is to have functions which 
 aren't
 functionally pure but can still be used in functions that _are_ 
 functionally pure.
I take it you mean "expressions" and not necessarily defined functions. So does the compiler memoize values derived from weakly pure functions?
 source code for the function available. The only functions for 
 which the
 compiler ever deduces anything from their bodies are templated 
 functions,
But with the amount of templates in phobos… maybe it is time to move beyond fully separate compilation units?
 about is the signature of the function. So, it's up to the 
 programmer to tell
 the compiler that a function is supposed to be pure, and it's 
 up to the
 compiler to verify that, and then use that knowledge to infer
Ok, but based on bearophile's example it doesn't verify purity properly. So it is essentially part verified ("verify noglob") and part programmer guarantee ("this is pure"). I don't mind programmers being able to set guarantees. For instance, for C functions or machine language routines it might be useful to specify that they are pure and avoid recomputation.
 I agree that that would be great if we were starting over - 
 just like it would
 be great if  safe were the default, but it's too late now. We 
 have to make do with what we have.
Yes, many changes over time are not good. Maybe it would be a good idea to collect all those syntax improvements and apply them all at once, once the semantics are frozen.
 avoid it. And as
 nice as changing some of the attributes around would be nice, 
 it's too big a
 breaking change for too little gain.
It doesn't have to break anything. You could add "@global" and "@pure" (strongly pure) and keep "pure" (weakly pure) as a deprecated feature.
May 15 2014
prev sibling parent reply luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 
 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does.

http://dlang.org/function.html#pure-functions
"functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)"

The fact that it should not be able to affect or be affected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
May 15 2014
next sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 10:14:48 +0200
luka8088 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
Reread the paragraph at the top of the section of the documentation that you linked to:

"Pure functions are functions which cannot access global or static, mutable state save through their arguments. This can enable optimizations based on the fact that a pure function is guaranteed to mutate nothing which isn't passed to it, and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)."

That outright says that pure only _can_ enable functional purity - in particular when the compiler is able to guarantee that the function cannot mutate its arguments. pure itself however means nothing of the sort. The fact that pure functions cannot access global state _is_ the basis for functional purity when combined with parameters that guarantee that the arguments cannot be mutated.

If you get hung up on what the concept of functional purity is or what you thought pure was before using D, then you're going to have a hard time understanding what pure means in D. And yes, it's a bit weird, but it comes from the practical standpoint of how to make functional purity possible without being too restrictive to be useful. So, it really doesn't matter what other sources say about what purity means. That's not what D's pure means. D's pure is just a building block for what purity normally means. It makes it so that the compiler can detect functional purity and then optimize based on it, but it doesn't in and of itself have anything to do with functional purity.

If the documentation isn't getting that across, then I guess that it isn't clear enough. But I would have thought that the part that said "and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity" would have made it clear that D's pure is _not_ functionally pure by itself. The first part of the paragraph says what pure really means: "Pure functions are functions which cannot access global or static, mutable state save through their arguments." Everything else with regards to functional purity is derived from there, but in and of itself, that's _all_ that the pure attribute in D means.

See also: http://klickverbot.at/blog/2012/05/purity-in-d/

- Jonathan M Davis
May 15 2014
parent luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 11:35, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 10:14:48 +0200
 luka8088 via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 
 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
The reread the paragraph at the top of the section of the documentation that you linked to: "Pure functions are functions which cannot access global or static, mutable state save through their arguments. This can enable optimizations based on the fact that a pure function is guaranteed to mutate nothing which isn't passed to it, and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)." That outright says that pure only _can_ enable functional purity - in particular when the compiler is able to guarantee that the function cannot mutate its arguments. pure itself however means nothing of the sort. The fact that pure functions cannot access global state _is_ the basis for functional purity when combined with parameters that arguments cannot be mutated.
I am aware of weak/strong purity. I am only talking about strong purity now. To quote bearophile:

bool randomBit() pure nothrow @safe {
    return (new int[1].ptr) > (new int[1].ptr);
}
void main() {}

"Pure functions are functions which cannot access global or static, mutable state save through their arguments." - no objections here

"This can enable optimizations based on the fact that a pure function is guaranteed to mutate nothing which isn't passed to it, and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)." - no arguments were passed to the function, so it should always return the same result
 
 If you get hung up on what the concept of functional purity is or what
 you thought pure was before using D, then you're going to have a hard
 time understanding what pure means in D. And yes, it's a bit weird, but
 it comes from the practical standpoint of how to make functional purity
 possible without being too restrictive to be useful. So, it really
 doesn't matter what other sources say about what purity means. That's
 not what D's pure means. D's pure is just a building block for what
 purity normally means. It makes it so that the compiler can detect
 functional purity and then optimize based on it, but it doesn't in and
 of itself have anything to do with functional purity.
 
 If the documentation isn't getting that across, then I guess that it
 isn't clear enough. But I would have thought that the part that said
 "and in cases where the compiler can guarantee that a pure function
 cannot alter its arguments, it can enable full, functional purity"
 would have made it clear that D's pure is _not_ functionally pure by
 itself. The first part of the paragraph says what pure really means:
 "Pure functions are functions which cannot access global or static,
 mutable state save through their arguments." Everything else with
 regards to functional purity is derived from there, but in and of
 itself, that's _all_ that the pure attribute in D means.
 
 See also: http://klickverbot.at/blog/2012/05/purity-in-d/
 
 - Jonathan M Davis
 
Yeah, it does not seem to be clear enough. It comes down to two simple questions:

- should you be able to make a strong pure rand() function in D?
- if not, why?

My answer would be: no, because the result should always be the same. Of course I could be wrong, but my understanding of it fits with what the docs say. I think that answering these questions would clarify a lot!
May 15 2014
prev sibling parent reply "Don" <x nospam.com> writes:
On Thursday, 15 May 2014 at 08:14:50 UTC, luka8088 wrote:
 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 
 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
Please note: D's 'pure' annotation does *not* mean that the function is pure. It means that it is statically verified to be OK to call it from a pure function. The compiler determines if a function is pure; the programmer never does. There are two things going on here, and they are quite distinct.

(1) Really the keyword should be something like '@noglobal', rather than 'pure'. It's called pure for historical reasons. To reduce confusion I'll call D's pure '@noglobal' and the functional languages' pure '@memoizable'.

But it turns out that memoizable isn't actually an interesting property, whereas '@noglobal' is. "No global state" is a deep, transitive property of a function. "Memoizable" is a superficial extra property which the compiler can trivially determine from @noglobal.

Suppose you have function f(), which calls function g(). If f does not depend on global state, then g must not depend on global state. BUT f() can be memoizable even if g() is not memoizable. This approach used by D enormously increases the number of functions which can be statically proven to be pure. The nomenclature can create confusion though.

(2) Allowing GC activity inside a @noglobal function does indeed weaken our ability to memoize. The compiler can still perform memoizing operations on most functions that return GC-allocated memory, but it's more difficult. We don't yet have data on how much of a problem this is. An interesting side-effect of the recent addition of @nogc to the language is that we get this ability back.
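(A small sketch of the f()/g() point in D terms; the names are mine. f is strongly pure and therefore memoizable from the outside, even though the helper g it calls is only weakly pure.)

// g is only weakly pure: it mutates state, but only through its argument.
void g(ref int[] buf) pure nothrow @safe
{
    foreach (ref x; buf)
        x *= 2;
}

// f is strongly pure (value parameter, no globals), so its result depends
// only on n - it is memoizable even though g on its own is not.
int f(int n) pure nothrow @safe
{
    auto buf = new int[](n);
    foreach (i, ref x; buf)
        x = cast(int) i;
    g(buf);
    int sum = 0;
    foreach (x; buf)
        sum += x;
    return sum;
}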
May 15 2014
next sibling parent reply luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 11:45, Don wrote:
 On Thursday, 15 May 2014 at 08:14:50 UTC, luka8088 wrote:
 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
Please note: D's 'pure' annotation does *not* mean that the function is pure. It means that it is statically verified to be OK to call it from a pure function. The compiler determines if a function is pure, the programmer never does. There are two things going on here, and they are quite distinct. (1) Really the keyword should be something like ' noglobal', rather than 'pure'. It's called pure for historical reasons. To reduce confusion I'll call D's pure ' noglobal' and the functional languages pure ' memoizable'. But it turns out that memoizable isn't actually an interesting property, whereas ' noglobal' is. "No global state" is a deep, transitive property of a function. "Memoizable" is a superficial supersetextra property which the compiler can trivially determine from noglobal. Suppose you have function f(), which calls function g(). If f does not depend on global state, then g must not depend on global state. BUT if f() can be memoizable even if g() is not memoizable. This approach used by D enormously increases the number of functions which can be statically proven to be pure. The nomenclature can create confusion though. (2) Allowing GC activity inside a noglobal function does indeed weaken our ability to memoize. The compiler can still perform memoizing operations on most functions that return GC-allocated memory, but it's more difficult. We don't yet have data on how much of a problem this is. An interesting side-effect of the recent addition of nogc to the language, is that we get this ability back.
Yeah, I read all about weak/strong purity and I do understand the background. I was talking about strong purity; maybe I should have pointed that out.

So, to correct myself: as I understood it, strong purity implies memoization. Am I correct?
May 15 2014
next sibling parent reply "Don" <x nospam.com> writes:
On Thursday, 15 May 2014 at 10:31:47 UTC, luka8088 wrote:
 On 15.5.2014. 11:45, Don wrote:
 On Thursday, 15 May 2014 at 08:14:50 UTC, luka8088 wrote:
 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
Please note: D's 'pure' annotation does *not* mean that the function is pure. It means that it is statically verified to be OK to call it from a pure function. The compiler determines if a function is pure, the programmer never does. There are two things going on here, and they are quite distinct. (1) Really the keyword should be something like ' noglobal', rather than 'pure'. It's called pure for historical reasons. To reduce confusion I'll call D's pure ' noglobal' and the functional languages pure ' memoizable'. But it turns out that memoizable isn't actually an interesting property, whereas ' noglobal' is. "No global state" is a deep, transitive property of a function. "Memoizable" is a superficial supersetextra property which the compiler can trivially determine from noglobal. Suppose you have function f(), which calls function g(). If f does not depend on global state, then g must not depend on global state. BUT if f() can be memoizable even if g() is not memoizable. This approach used by D enormously increases the number of functions which can be statically proven to be pure. The nomenclature can create confusion though. (2) Allowing GC activity inside a noglobal function does indeed weaken our ability to memoize. The compiler can still perform memoizing operations on most functions that return GC-allocated memory, but it's more difficult. We don't yet have data on how much of a problem this is. An interesting side-effect of the recent addition of nogc to the language, is that we get this ability back.
Yeah, I read all about weak/string purity and I do understand the background. I was talking about strong purity, maybe I should pointed that out. So, to correct myself: As I understood strong purity implies memoization. Am I correct?
Yes. 'strong pure' means pure in the way that the functional language crowd means 'pure'. 'weak pure' just means it doesn't use globals. But note that "strong purity" isn't an official concept, it was just the terminology I used when explaining to Walter what I meant. I don't like the term because it's rather misleading -- in reality you could define a whole range of purity strengths (more than just two). The stronger the purity, the more optimizations you can apply.
May 15 2014
next sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 10:48:07 +0000
Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yes. 'strong pure' means pure in the way that the functional
 language crowd means 'pure'.
 'weak pure' just means doesn't use globals.

 But note that "strong purity" isn't an official concept, it was
 just the terminology I used when explain to Walter what I meant.
 I don't like the term because it's rather misleading
 -- in reality you could define a whole range of purity strengths
 (more than just two).
 The stronger the purity, the more optimizations you can apply.
Yeah, I agree. The problem is that it always seems necessary to use the terms weak and strong pure to describe the distinction - or maybe I just suck at coming up with a better way to describe it than you did initially. Your recent post in this thread talking about @noglobal seems to be a pretty good alternate way to explain it though. Certainly, the term pure throws everyone off at first. - Jonathan M Davis
May 15 2014
parent luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 13:04, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 10:48:07 +0000
 Don via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 
 Yes. 'strong pure' means pure in the way that the functional
 language crowd means 'pure'.
 'weak pure' just means doesn't use globals.

 But note that "strong purity" isn't an official concept, it was
 just the terminology I used when explain to Walter what I meant.
 I don't like the term because it's rather misleading
 -- in reality you could define a whole range of purity strengths
 (more than just two).
 The stronger the purity, the more optimizations you can apply.
Yeah, I agree. The problem is that it always seems necessary to use the terms weak pure to describe the distinction - or maybe I just suck at coming up with a better way to describe it than you did initially. Your recent post in this thread talking about noglobal seems to be a pretty good alternate way to explain it though. Certainly, the term pure throws everyone off at first. - Jonathan M Davis
Yeah, +1. Or isolated, as in "isolated from outer scopes".
May 15 2014
prev sibling parent luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 12:48, Don wrote:
 On Thursday, 15 May 2014 at 10:31:47 UTC, luka8088 wrote:
 On 15.5.2014. 11:45, Don wrote:
 On Thursday, 15 May 2014 at 08:14:50 UTC, luka8088 wrote:
 On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote:
 On Thu, 15 May 2014 05:51:14 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Yep, purity implies memoing.
No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Um. Yes it does. http://dlang.org/function.html#pure-functions "functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)" The fact that it should not be able to effect or be effected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is.
Please note: D's 'pure' annotation does *not* mean that the function is pure. It means that it is statically verified to be OK to call it from a pure function. The compiler determines if a function is pure, the programmer never does. There are two things going on here, and they are quite distinct. (1) Really the keyword should be something like ' noglobal', rather than 'pure'. It's called pure for historical reasons. To reduce confusion I'll call D's pure ' noglobal' and the functional languages pure ' memoizable'. But it turns out that memoizable isn't actually an interesting property, whereas ' noglobal' is. "No global state" is a deep, transitive property of a function. "Memoizable" is a superficial supersetextra property which the compiler can trivially determine from noglobal. Suppose you have function f(), which calls function g(). If f does not depend on global state, then g must not depend on global state. BUT if f() can be memoizable even if g() is not memoizable. This approach used by D enormously increases the number of functions which can be statically proven to be pure. The nomenclature can create confusion though. (2) Allowing GC activity inside a noglobal function does indeed weaken our ability to memoize. The compiler can still perform memoizing operations on most functions that return GC-allocated memory, but it's more difficult. We don't yet have data on how much of a problem this is. An interesting side-effect of the recent addition of nogc to the language, is that we get this ability back.
Yeah, I read all about weak/string purity and I do understand the background. I was talking about strong purity, maybe I should pointed that out. So, to correct myself: As I understood strong purity implies memoization. Am I correct?
Yes. 'strong pure' means pure in the way that the functional language crowd means 'pure'. 'weak pure' just means doesn't use globals. But note that "strong purity" isn't an official concept, it was just the terminology I used when explain to Walter what I meant. I don't like the term because it's rather misleading -- in reality you could define a whole range of purity strengths (more than just two). The stronger the purity, the more optimizations you can apply.
Ok. Now it is much clearer, thanks.
May 15 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 3:31 AM, luka8088 wrote:
 Yeah, I read all about weak/string purity and I do understand the
 background. I was talking about strong purity, maybe I should pointed
 that out.

 So, to correct myself: As I understood strong purity implies
 memoization. Am I correct?
Yes, as long as you don't rely on distinguishing objects by address. Purity of allocation is frequently assumed by functional languages because without it it would be difficult to get much work done. Then, most functional languages make it difficult or impossible to distinguish values by their address. In D that's easy. A D programmer needs to be aware of that, and I think that's fine. Andrei
May 15 2014
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 05:24 PM, Andrei Alexandrescu wrote:
 On 5/15/14, 3:31 AM, luka8088 wrote:
 Yeah, I read all about weak/string purity and I do understand the
 background. I was talking about strong purity, maybe I should pointed
 that out.

 So, to correct myself: As I understood strong purity implies
 memoization. Am I correct?
Yes, as long as you don't rely on distinguishing objects by address. Purity of allocation is frequently assumed by functional languages
Examples?
 because without it it would be difficult to get much work done.
Why?
 Then,
 most functional languages make it difficult or impossible to distinguish
 values by their address.
If it's not impossible because of lack of the concept it's not a pure functional language.
 In D that's easy. A D programmer needs to be
 aware of that, and I think that's fine.


 Andrei
It's not fine: The spec claims that this problem does not exist. http://dlang.org/function.html "... and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity (i.e. the guarantee that the function will always return the same result for the same arguments)"
May 15 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 9:03 AM, Timon Gehr wrote:
 On 05/15/2014 05:24 PM, Andrei Alexandrescu wrote:
 On 5/15/14, 3:31 AM, luka8088 wrote:
 Yeah, I read all about weak/string purity and I do understand the
 background. I was talking about strong purity, maybe I should pointed
 that out.

 So, to correct myself: As I understood strong purity implies
 memoization. Am I correct?
Yes, as long as you don't rely on distinguishing objects by address. Purity of allocation is frequently assumed by functional languages
Examples?
cons 1 2 is equal to cons 1 2
 because without it it would be difficult to get much work done.
Why?
It's rather obvious. You've got to have the ability to create new values in a pure functional programming language.
 Then,
 most functional languages make it difficult or impossible to distinguish
 values by their address.
If it's not impossible because of lack of the concept it's not a pure functional language.
Agreed.
 In D that's easy. A D programmer needs to be
 aware of that, and I think that's fine.


 Andrei
It's not fine: The spec claims that this problem does not exist.
We must fix the spec. Please submit a pull request, thanks. Andrei
May 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 08:03 PM, Andrei Alexandrescu wrote:
 Purity of allocation is frequently assumed by functional languages
Examples?
cons 1 2 is equal to cons 1 2 ...
I don't see anything whose specification would need to mention 'allocation'.
 because without it it would be difficult to get much work done.
Why?
It's rather obvious. You've got to have the ability to create new values in a pure functional programming language.
This kind of operational reasoning is not essential. Of course, in practice you want to evaluate expressions, but the resulting programs are of the same kind as those of a non-pure language, and can do the same kind of operations. There is not really a distinction to be made at that level of abstraction.
May 15 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 2:52 PM, Timon Gehr wrote:
 On 05/15/2014 08:03 PM, Andrei Alexandrescu wrote:
 Purity of allocation is frequently assumed by functional languages
Examples?
cons 1 2 is equal to cons 1 2 ...
I don't see anything whose specification would need to mention 'allocation'.
That's the point. There's no specification of allocation - the evaluator may create two lists or reuse the same, and it's all transparent.
 because without it it would be difficult to get much work done.
Why?
It's rather obvious. You've got to have the ability to create new values in a pure functional programming language.
This kind of operational reasoning is not essential. Of course, in practice you want to evaluate expressions, but the resulting programs are of the same kind as those of a non-pure language, and can do the same kind of operations. There is not really a distinction to be made at that level of abstraction.
I'm afraid I got lost here. Andrei
May 15 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 16 May 2014 at 00:27:34 UTC, Andrei Alexandrescu wrote:
 On 5/15/14, 2:52 PM, Timon Gehr wrote:
 On 05/15/2014 08:03 PM, Andrei Alexandrescu wrote:
 Purity of allocation is frequently assumed by functional 
 languages
Examples?
cons 1 2 is equal to cons 1 2 ...
I don't see anything whose specification would need to mention 'allocation'.
That's the point. There's no specification of allocation - the evaluator may create two lists or reuse the same, and it's all transparent.
 because without it it would be difficult to get much work 
 done.
Why?
It's rather obvious. You've got to have the ability to create new values in a pure functional programming language.
This kind of operational reasoning is not essential. Of course, in practice you want to evaluate expressions, but the resulting programs are of the same kind as those of a non-pure language, and can do the same kind of operations. There is not really a distinction to be made at that level of abstraction.
I'm afraid I got lost here.
I believe Timon's point is that allocation is an implementation detail, just like the existence of registers, memory caches, and the stack. Those things need not exist to perform the computation, and as a result there is no need to assume anything about their purity (as far as the language is concerned, they don't exist).

D dropped the ball (perhaps) by making memory allocation real. If 'new' were just defined to create a new object, without specifying (directly or indirectly) that it lived in memory and had an address that could be compared, then the allocation of memory would be an implementation detail invisible to the language. That would allow "true" purity, and the optimisation opportunities that come with it.
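As a sketch of the optimisation opportunity (hypothetical primesUpTo): folding the two calls below into a single allocation would only be legal if no caller could observe identity.

---------------
pure int[] primesUpTo(immutable int n)
{
    // stand-in for an expensive computation that allocates its result
    return [2, 3, 5, 7];
}

void main()
{
    auto a = primesUpTo(10);
    auto b = primesUpTo(10);
    // Reusing one result for both calls changes what 'a is b' returns,
    // so the compiler may not do it as long as identity is observable.
    assert(a == b);
}
---------------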
May 16 2014
prev sibling parent luka8088 <luka8088 owave.net> writes:
On 15.5.2014. 17:24, Andrei Alexandrescu wrote:
 On 5/15/14, 3:31 AM, luka8088 wrote:
 Yeah, I read all about weak/strong purity and I do understand the
 background. I was talking about strong purity, maybe I should have pointed
 that out.

 So, to correct myself: As I understood strong purity implies
 memoization. Am I correct?
Yes, as long as you don't rely on distinguishing objects by address. Purity of allocation is frequently assumed by functional languages because without it it would be difficult to get much work done. Then, most functional languages make it difficult or impossible to distinguish values by their address. In D that's easy. A D programmer needs to be aware of that, and I think that's fine. Andrei
Hm, this does not seem right. @safe prevents you from taking the address of a value, as stated in http://dlang.org/function.html#safe-functions, so shouldn't pure do the same? Reading again through the @safe docs, it seems to me that purity (both strong and weak) should imply @safe.

I have seen many claims that in D pure means something different from what it means in functional languages, and I think it is too bad if there is not going to be functional-language-like purity in D. I have not seen any example of something that can't be forbidden by the compiler if such support were considered.
May 15 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 09:45:52 UTC, Don wrote:
 But it turns out that @memoizable isn't actually an interesting 
 property, whereas '@noglobal' is.
 
 "No global state" is a deep, transitive property of a function. 
 "Memoizable" is a superficial extra property which the 
 compiler can trivially determine from @noglobal.
Uhm. That is a pretty strong assumption. "memoizable" is a very useful property when you do multithreading, transactions or anything that requires locking. And you can still access globals, you just need a guarantee that globals don't change until you are done.

Considering that > 90% of the functions I write don't do IO or globals, I'd rather specify the opposite: "io", "global", whatever. That is also easy to enforce, i.e. you don't get to access IO/globals if you don't annotate the function.
May 15 2014
next sibling parent reply "Don" <x nospam.com> writes:
On Thursday, 15 May 2014 at 10:46:21 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 15 May 2014 at 09:45:52 UTC, Don wrote:
 But it turns out that @memoizable isn't actually an 
 interesting property, whereas '@noglobal' is.

 "No global state" is a deep, transitive property of a 
 function. "Memoizable" is a superficial supersetextra property 
 which the compiler can trivially determine from  noglobal.
Uhm. That is a pretty strong assumption. "memoizable" is a very useful property when you do multithreading, transactions or anything that requires locking.
It's useful, but it's not a deep property, and importantly, it isn't transitive. The compiler can trivially work it out if it knows the function is @noglobal.
 And you can still access globals, you just need a guarantee 
 that globals don't change until you are done.
Sure, but how can the compiler statically check that? It's a tough problem. (That's not a rhetorical question, BTW. If you have a solution, that would be awesome).
 Considering that > 90% of the functions I write don't do IO or 
 globals I'd rather specify the opposite. "io", "global" 
 whatever. That is also easy to enforce, i.e. you don't get to 
 access IO/globals if you don't annotate the function.
I agree, I'd personally like to have an annotation '@global', and put 'pure:' at the top of every module.
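As a sketch of the 'pure by default' half of that idea (module name hypothetical; the @global annotation does not exist, so it is not shown), a trailing-colon attribute at module scope covers everything that follows:

---------------
module mymath;   // hypothetical

pure:   // every declaration below is checked as pure

int twice(int x) { return 2 * x; }

int square(int x) { return x * x; }

unittest
{
    assert(twice(2) == 4 && square(3) == 9);
}
---------------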
May 15 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Don:

 I'd personally like to have an annotation '@global', and put 
 'pure:' at the top of every module.
I suggested a little more powerful outer: https://d.puremagic.com/issues/show_bug.cgi?id=5007 Bye, bearophile
May 15 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 11:03:35 UTC, Don wrote:
 And you can still access globals, you just need a guarantee 
 that globals don't change until you are done.
Sure, but how can the compiler statically check that? It's a tough problem. (That's not a rhetorical question, BTW. If you have a solution, that would be awesome).
What you need is some way to formally specify what a lock covers, or rather to register which globals can be accessed in a transaction. So you basically need some sort of whole-program analysis.

With transactional memory you only take the lock if the transaction fails, so it is ok if locking is slow or if you take multiple locks. The CPU reverts to regular mutex-based locking and clears the cache if another thread touches the memory that has been used in the transaction. At least that is how I read the Haswell documentation.
 I agree, I'd personally like to have an annotation '@global', 
 and put 'pure:' at the top of every module.
It makes a lot of sense to put the annotation burden on the things you want less of, yes. "Do I really need to make this global? Maybe I should do it some other way…"
May 15 2014
prev sibling parent reply "w0rp" <devw0rp gmail.com> writes:
Ola, you do not understand 'pure.'

To consider the design of pure, you must first consider that you 
cannot add functional purity to an imperative language. It cannot 
be done. What you can do instead is what D offers: a 'weak 
purity' guarantee that global state is not modified and that 
side effects do not occur in a function, except through the 
function's parameters (memory allocation aside). D allows you 
to come pretty close to strong purity when a function is marked 
pure, does not allocate memory inside it or through its 
arguments, and has only const or immutable arguments. The only 
way you can get a better purity guarantee than that is to use a 
functional programming language like Haskell.
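A small sketch of that distinction (names hypothetical): both functions carry the pure keyword, but only the second has a signature the compiler can treat as strongly pure.

---------------
int weaklyPure(int[] data) pure            // may mutate what the caller passed in
{
    data[0] = 42;                           // legal: the effect flows through a parameter
    return data[0];
}

int stronglyPure(immutable(int)[] data) pure   // nothing reachable is mutable
{
    return data.length ? data[0] : 0;       // same arguments, same result
}

void main()
{
    int[] xs = [1, 2, 3];
    assert(weaklyPure(xs) == 42 && xs[0] == 42);   // caller-visible effect
    immutable int[] ys = [1, 2, 3];
    assert(stronglyPure(ys) == 1);
}
---------------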
May 15 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 12:37:22 UTC, w0rp wrote:
 To consider the design of pure, you must first consider that 
 you cannot add functional purity to an imperative language.
Of course you can. Functional languages execute in an "imperative context". That's why you need monads. The term "pure function" is only needed in a non-functional language. Applicative/functional languages only have mathematical functions, no need for the term "pure" there.
May 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 03:06 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 15 May 2014 at 12:37:22 UTC, w0rp wrote:
 To consider the design of pure, you must first consider that you
 cannot add functional purity to an imperative language.
Of course you can. Functional languages execute in an "imperative context". That's why you need monads. ...
Strictly speaking you don't "need" monads; they are sometimes just an adequate way of structuring a program.
 The term "pure function" is only needed in a non-functional language.
 Applicative/functional languages only have mathematical functions, no
 need for the term "pure" there.
In discussions about e.g. Haskell, it is often used to denote an expression of a specific form inside a `stateful' DSL. E.g. if "η" is the unit of some monad, then (η v) is sometimes called a "pure value", while values of other forms are not called pure.
May 15 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 21:48:16 UTC, Timon Gehr wrote:
 The term "pure function" is only needed in a non-functional 
 language.
 Applicative/functional languages only have mathematical 
 functions, no
 need for the term "pure" there.
In discussions about e.g. Haskell, it is often used to denote an expression of a specific form inside a `stateful' DSL. E.g. if "η" is the unit of some monad, then (η v) is sometimes called a "pure value", while values of other forms are not called pure.
Yes, from haskell.org: <<While programs may describe impure effects and actions outside Haskell, they can still be combined and processed ("assembled") purely, inside Haskell, creating a pure Haskell value - a computation action description that describes an impure calculation. That is how Monads in Haskell separate between the pure and the impure.>> So, I think my statement holds.
May 16 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 11:45 AM, Don wrote:
 "No global state" is a deep, transitive property of a function.
 "Memoizable" is a superficial supersetextra
Why? A memoizable function is still memoizable if it is changed internally to memoize values in global memory, for example.
 property which the compiler
 can trivially determine from @noglobal.
May 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 9:07 AM, Timon Gehr wrote:
 Why? A memoizable function is still memoizable if it is changed internally to
 memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
May 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2014 11:33 PM, Walter Bright wrote:
 On 5/15/2014 9:07 AM, Timon Gehr wrote:
 Why? A memoizable function is still memoizable if it is changed
 internally to
 memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
Yes, that was actually my point. Memoizable is actually a non-trivial property. (But note that while a compiler cannot in general discover a proof, it could just _check_ a supplied proof.)
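For instance, a sketch (names hypothetical) of a function that is memoizable in behaviour but keeps its memo table in a global, so it cannot be marked pure and the property stays invisible to the compiler:

---------------
int[int] cache;   // global memo table

int slowSquare(int x)   // not pure: it reads and writes the global cache
{
    if (auto p = x in cache)
        return *p;
    immutable r = x * x;   // the observable result still depends only on x
    cache[x] = r;
    return r;
}

void main()
{
    assert(slowSquare(4) == 16);
    assert(slowSquare(4) == 16);   // second call is served from the cache
}
---------------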
May 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 2:41 PM, Timon Gehr wrote:
 On 05/15/2014 11:33 PM, Walter Bright wrote:
 On 5/15/2014 9:07 AM, Timon Gehr wrote:
 Why? A memoizable function is still memoizable if it is changed
 internally to
 memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
Yes, that was actually my point. Memoizable is actually a non-trivial property. (But note that while a compiler cannot in general discover a proof, it could just _check_ a supplied proof.)
If the compiler cannot mechanically verify purity, the notion of purity is rather useless, since as this thread shows it is incredibly easy for humans to be mistaken about it.
May 15 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, May 15, 2014 at 03:22:25PM -0700, Walter Bright via Digitalmars-d wrote:
 On 5/15/2014 2:41 PM, Timon Gehr wrote:
On 05/15/2014 11:33 PM, Walter Bright wrote:
On 5/15/2014 9:07 AM, Timon Gehr wrote:
Why? A memoizable function is still memoizable if it is changed
internally to memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
Yes, that was actually my point. Memoizable is actually a non-trivial property. (But note that while a compiler cannot in general discover a proof, it could just _check_ a supplied proof.)
If the compiler cannot mechanically verify purity, the notion of purity is rather useless, since as this thread shows it is incredibly easy for humans to be mistaken about it.
What if the language allowed the user to supply a proof of purity, which can be mechanically checked? Just rephrasing what Timon said, though -- I've no idea how such a thing might be implemented. You'd need some kind of metalanguage for writing the proof in, perhaps inside a proof block attached to the function declaration, that can be mechanically verified to be correct. Alternatively, if only one or two statements are causing trouble, the proof can provide mechanically checkable evidence just for those specific statements. The metalanguage must be mechanically checkable, of course. But this may be more plausible than it sounds -- for example, solutions to certain NP-complete problems are verifiable within a far shorter time than it would take to actually solve said problem. This suggests that perhaps, while the purity of an arbitrary piece of code is, in general, infeasible for the compiler to mechanically prove, it may be possible in some cases that the compiler can mechanically verify a user-supplied proof, and thus provide the same guarantees as it would for directly-provable code. T -- When solving a problem, take care that you do not become part of the problem.
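As a loose illustration of that asymmetry (factorisation is only an analogy, not an NP-complete problem, and the name is hypothetical): verifying a supplied certificate is cheap even when producing one is hard.

---------------
bool checkFactorisation(ulong n, ulong p, ulong q) pure nothrow
{
    // Trivial to verify; finding p and q for a large n is the hard part.
    return p > 1 && q > 1 && p * q == n;
}

void main()
{
    assert(checkFactorisation(91, 7, 13));
}
---------------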
May 15 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 it may be possible in some cases that the compiler can 
 mechanically
 verify a user-supplied proof, and thus provide the same 
 guarantees
 as it would for directly-provable code.
Yes, this is common enough. As an example, the Whiley language (http://whiley.org/) is not always able to find the proof for various invariants and contracts, but the programmer can write the proof down and Whiley verifies it to be correct during compilation. Inventing a proof is harder, but verifying it is often much easier.

The first place D could introduce such things (simple proofs) is in the static verification of some contracts. Verifying memoizable or reflexive/symmetric/transitive (for functions with two arguments) is perhaps a bit too ambitious for D compilers. But it surely goes well with the idea of a language that tries to both avoid bugs and generate efficient binaries :-) (And perhaps someday the Axioms of C++ Concepts will be handled by a C++ compiler able to verify a function to be symmetric.)

Bye,
bearophile
May 15 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 4:00 PM, H. S. Teoh via Digitalmars-d wrote:
 What if the language allowed the user to supply a proof of purity, which
 can be mechanically checked?
I think those sorts of things are PhD research topics. It's a bit beyond the scope of what we're trying to do with D.
May 15 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 I think those sorts of things are PhD research topics. It's a 
 bit beyond the scope of what we're trying to do with D.
Perhaps yes. Yet, the Whiley language is not PhD-level, and it's inspiring :) Things like value range propagation show that even rather small amounts of statically known information can give back plenty of value to D! Bye, bearophile
May 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 5:20 PM, bearophile wrote:
 Walter Bright:

 I think those sorts of things are PhD research topics. It's a bit beyond the
 scope of what we're trying to do with D.
Perhaps yes. Yet, the Whiley language is not PhD-level, and it's inspiring :) Things like value range propagation show that even rather small amounts of statically known information can give back plenty of value to D!
Yes, the VRP has been a nice win for D. No other language does it.
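For reference, a minimal sketch of conversions today's VRP already accepts without a cast:

---------------
void main()
{
    uint x = 1000;
    ubyte b = x & 0xFF;   // accepted: the range of (x & 0xFF) fits in ubyte
    int i = 12345;
    short s = i % 100;    // accepted: the range of (i % 100) fits in short
}
---------------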
May 15 2014
next sibling parent reply "Araq" <rumpf_a web.de> writes:
 Yes, the VRP has been a nice win for D. No other language does 
 it.
I don't know why you keep saying things like that; you don't know all the languages out there. Nimrod does it too, fwiw...
May 16 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/16/2014 1:54 AM, Araq wrote:
 Yes, the VRP has been a nice win for D. No other language does it.
I don't know why you keep saying things like that, you don't know all the languages out there. Nimrod does it too fwiw...
I'm having trouble finding it in the spec: http://nimrod-lang.org/manual.html Can you please point me to it?
May 16 2014
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Yes, the VRP has been a nice win for D.
There are ways to improve it in very useful ways, that increase static safety and cut down the number of useless casts required. Some currently refused examples:

---------------
int[100] array;
void main() {
    foreach (immutable idx, const x; array)
        ubyte y = idx;
}
---------------
void foo(immutable uint x)
in {
    assert(x <= ubyte.max);
} body {
    ubyte y = x;
}
void main() {}

https://issues.dlang.org/show_bug.cgi?id=10751

More on contracts:

uint foo()
in { }
out(result) {
    assert(result < 100);
} body {
    return 20;
}
void bar(immutable ubyte x) {}
void main() {
    bar(foo());
}
---------------
D immutability is under-utilized, it avoids the need for flow analysis:

void main(in string[] args) {
    immutable ushort x = args.length % 5;
    immutable ubyte y = x;
}

uint x;
void main() {
    immutable uint y = x;
    if (y <= ubyte.max) {
        ubyte z = y;
    }
}

uint x;
void main() {
    immutable uint y = x;
    assert(y <= ubyte.max);
    ubyte z = y;
}

https://issues.dlang.org/show_bug.cgi?id=10594
---------------
More:
https://issues.dlang.org/show_bug.cgi?id=10615
https://issues.dlang.org/show_bug.cgi?id=12753

Bye,
bearophile
May 16 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2014 01:56 AM, Walter Bright wrote:
 On 5/15/2014 4:00 PM, H. S. Teoh via Digitalmars-d wrote:
 What if the language allowed the user to supply a proof of purity, which
 can be mechanically checked?
I think those sorts of things are PhD research topics.
Well, feasibility has long ago been demonstrated and I hope those ideas will eventually see general adoption.
 It's a bit beyond the scope of what we're trying to do with D.
Sure, but it still makes sense to be aware of and think about what would be possible. (Otherwise it is too tempting to get fully sold on inferior technology, based on the mistaken assumption that there is no way to do significantly better.)
May 16 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2014 01:00 AM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, May 15, 2014 at 03:22:25PM -0700, Walter Bright via Digitalmars-d
wrote:
 On 5/15/2014 2:41 PM, Timon Gehr wrote:
 On 05/15/2014 11:33 PM, Walter Bright wrote:
 On 5/15/2014 9:07 AM, Timon Gehr wrote:
 Why? A memoizable function is still memoizable if it is changed
 internally to memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
Yes, that was actually my point. Memoizable is actually a non-trivial property. (But note that while a compiler cannot in general discover a proof, it could just _check_ a supplied proof.)
If the compiler cannot mechanically verify purity, the notion of purity is rather useless, since as this thread shows it is incredibly easy for humans to be mistaken about it.
What if the language allowed the user to supply a proof of purity, which can be mechanically checked? Just rephrasing what Timon said, though -- I've no idea how such a thing might be implemented. You'd need some kind of metalanguage for writing the proof in, perhaps inside a proof block attached to the function declaration, that can be mechanically verified to be correct.
Yes, either that or one could even just implement it in the existing language by introducing types for evidence, and basic termination checking. eg. http://dpaste.dzfl.pl/33018edab028 (This is a really basic example. Templates or more language features could be used to simplify some of the more tedious steps, but I think it illustrates well the basic ideas. Maybe there are some small mistakes because I didn't find the time to actually implement the checker.)
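In the same spirit, a much smaller sketch of evidence carried in a type (NonZero is hypothetical; this is runtime-checked evidence, far weaker than a real proof checker, but it shows the shape of the idea):

---------------
struct NonZero(T)
{
    private T value;
    @disable this();                    // evidence cannot be default-forged
    private this(T x) pure { value = x; }

    static NonZero check(T x) pure      // the only way in: the check is the obligation
    {
        assert(x != 0);
        return NonZero(x);
    }

    T get() const pure { return value; }
}

int divide(int a, NonZero!int b) pure
{
    return a / b.get();                 // no zero-divisor path, by construction
}

void main()
{
    auto d = NonZero!int.check(3);
    assert(divide(12, d) == 4);
}
---------------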
 Alternatively, if only one or two statements are causing trouble, the
 proof can provide mechanically checkable evidence just for those
 specific statements.

 The metalanguage must be mechanically checkable, of course. But this may
 be more plausible than it sounds -- for example, solutions to certain
 NP-complete problems are verifiable within a far shorter time than it
 would take to actually solve said problem.
Indeed checking whether there is a proof of some fact up to some specific length N for a powerful enough logic is an NP-complete problem. (If N is encoded in unary.)
 This suggests that perhaps,
 while the purity of an arbitrary piece of code is, in general,
 infeasible for the compiler to mechanically prove, it may be possible in
 some cases that the compiler can mechanically verify a user-supplied
 proof, and thus provide the same guarantees as it would for
 directly-provable code.


 T
In fact, this would cover most cases. Usually there is some checkable reason why a piece of code is correct (because otherwise it would not have been invented in the first place.)
May 16 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/14, 4:53 AM, Timon Gehr wrote:
 On 05/16/2014 01:00 AM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, May 15, 2014 at 03:22:25PM -0700, Walter Bright via
 Digitalmars-d wrote:
 On 5/15/2014 2:41 PM, Timon Gehr wrote:
 On 05/15/2014 11:33 PM, Walter Bright wrote:
 On 5/15/2014 9:07 AM, Timon Gehr wrote:
 Why? A memoizable function is still memoizable if it is changed
 internally to memoize values in global memory, for example.
I doubt a compiler could prove it was pure.
Yes, that was actually my point. Memoizable is actually a non-trivial property. (But note that while a compiler cannot in general discover a proof, it could just _check_ a supplied proof.)
If the compiler cannot mechanically verify purity, the notion of purity is rather useless, since as this thread shows it is incredibly easy for humans to be mistaken about it.
What if the language allowed the user to supply a proof of purity, which can be mechanically checked? Just rephrasing what Timon said, though -- I've no idea how such a thing might be implemented. You'd need some kind of metalanguage for writing the proof in, perhaps inside a proof block attached to the function declaration, that can be mechanically verified to be correct.
Yes, either that or one could even just implement it in the existing language by introducing types for evidence, and basic termination checking. eg. http://dpaste.dzfl.pl/33018edab028
Typo: int_leibiz_equality :o). -- Andrei
May 16 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2014 07:41 PM, Andrei Alexandrescu wrote:
 On 5/16/14, 4:53 AM, Timon Gehr wrote:
 ...

 Yes, either that or one could even just implement it in the existing
 language by introducing types for evidence, and basic termination
 checking.

 eg. http://dpaste.dzfl.pl/33018edab028
 On 5/16/14, 4:53 AM, Timon Gehr wrote:
 (This is a really basic example. Templates or more language features
 could be used to simplify some of the more tedious steps, but I think
 it illustrates well the basic ideas. Maybe there are some small mistakes
 because I didn't find the time to actually implement the checker.)
 Typo: int_leibiz_equality :o). -- Andrei
If that is everything, then I am in good shape! :o)
May 17 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/17/2014 09:29 PM, Timon Gehr wrote:

 Typo: int_leibiz_equality :o). -- Andrei
If that is everything, then I am in good shape! :o)
It could be argued though, that this axiom was not too aptly named in the first place, because it describes the indiscernibility of identicals instead of the identity of indiscernibles.
May 17 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 2:45 AM, Don wrote:
 An interesting side-effect of the recent addition of @nogc to the language, is
 that we get this ability back.
I hadn't thought of that. Pretty cool!
May 15 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/15/14, 11:03 AM, Walter Bright wrote:
 On 5/15/2014 2:45 AM, Don wrote:
 An interesting side-effect of the recent addition of @nogc to the
 language, is
 that we get this ability back.
I hadn't thought of that. Pretty cool!
Yah, that's unexpected in a nice way. -- Andrei
May 15 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2014 4:32 PM, Andrei Alexandrescu wrote:
 On 5/15/14, 11:03 AM, Walter Bright wrote:
 On 5/15/2014 2:45 AM, Don wrote:
 An interesting side-effect of the recent addition of @nogc to the
 language, is
 that we get this ability back.
I hadn't thought of that. Pretty cool!
Yah, that's unexpected in a nice way. -- Andrei
Nice to have a positive surprise!
May 15 2014
prev sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 15 May 2014 11:03:13 -0700
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 5/15/2014 2:45 AM, Don wrote:
 An interesting side-effect of the recent addition of @nogc to the
 language, is that we get this ability back.
I hadn't thought of that. Pretty cool!
Definitely, but we also need to be careful with it. If @nogc just restricts allocations by the GC and not allocations in general, and if we make it so that malloc is pure (even if it's only when wrapped by a function which throws an Error when malloc returns null), then I don't think that we quite get it back, because while the GC may not have allocated any objects, malloc could still have been used to allocate them.

We'd need to be able to say that _nothing_ was allocated within the function, which isn't quite what @nogc does as I understand it (though admittedly, I haven't paid much attention to the discussions on it, much as I would have liked to). So, maybe we need to find a way to make it so that a wrapped malloc can be pure but isn't @nogc? Though if we go that route, that implies that @nogc should have been @noalloc.

Regardless, I think that making malloc pure definitely affects the issue of whether a @nogc function can be assumed to not return newly allocated memory.

- Jonathan M Davis
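A sketch of the wrapped-malloc idea mentioned above (the attributes on the redeclared malloc are an assertion the programmer chooses to make, not something the compiler verifies; a sketch only):

---------------
extern (C) pure nothrow void* malloc(size_t size);   // asserted attributes

void* assumedPureMalloc(size_t size) pure nothrow
{
    auto p = malloc(size);
    assert(p !is null, "allocation failed");   // failure becomes an Error, not a result
    // Note: this wrapper never touches the GC, so it could even be marked
    // @nogc, which is exactly the mismatch described above.
    return p;
}

void main()
{
    auto p = assumedPureMalloc(16);
    assert(p !is null);
}
---------------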
May 17 2014
prev sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Thursday, 15 May 2014 at 05:51:16 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 15 May 2014 at 02:49:28 UTC, Adam Sakareassen via 
 Digitalmars-d wrote:
 The more D allows 'pure' functions to diverge from functional 
 purity the less relevant the flag is for compiler 
 optimisations.
...
 By the same reasoning cacheability is important.  A pure 
 function might be called within a loop with a parameter that 
 is not altered during the loop.  If the compiler knows the 
 function is pure, it can perform the calculation before the 
 loop and simply reuse the cached result for each iteration.
Yep, purity implies memoing. Malloc and new are not pure because they return objects that can be differentiated by address.
There's an important difference between malloc and new: malloc returns a pointer, but new returns a typed object. This is crucial IMO, because the returned objects are equal to each other. They aren't identical, but then different int variables with the same value aren't identical either, and a function returning int is still considered pure. So it's not identity (~ address) that matters.
 Pure in D seems pointless to me.
Not at all: Don't think of it in terms of low-level optimization opportunities, but in terms of semantics. For example, you get the concept of uniqueness. And the optimizations can still be done, because strongly pure functions can be recognized by their signatures.
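A minimal sketch of the equality-versus-identity point above (Point and makePoint are hypothetical): the two results of a pure 'new' are equal as values even though they are not the same object.

---------------
struct Point { int x, y; }

pure Point* makePoint(int x, int y)
{
    return new Point(x, y);   // typed allocation, accepted in pure code
}

void main()
{
    auto a = makePoint(1, 2);
    auto b = makePoint(1, 2);
    assert(*a == *b);   // equal by value: what a pure caller should rely on
    assert(a !is b);    // not identical: the part purity cannot promise
}
---------------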
May 15 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Marc Schütz:

 And the optimizations can still be done, because strongly pure 
 functions can be recognized by their signatures.
What optimizations do you think GDC compiler is doing (or will do) on pure functions? Bye, bearophile
May 15 2014
parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Thursday, 15 May 2014 at 11:35:46 UTC, bearophile wrote:
 Marc Schütz:

 And the optimizations can still be done, because strongly pure 
 functions can be recognized by their signatures.
What optimizations do you think GDC compiler is doing (or will do) on pure functions?
I don't know whether it does or will do any. It is a theoretical option ("can be done"). It's the kind of optimizations Ola talked about that apply only to functionally pure functions. But the important point is that `new` can be pure, if you consider pure to be about equality, not identity. This applies only to typed allocators, not malloc.
May 15 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 15 May 2014 at 11:31:34 UTC, Marc Schütz wrote:
 There's an important difference between malloc and new: malloc 
 returns a pointer, but new returns a typed object. This is 
 crucial IMO, because the returned objects are equal to each 
 other.
In most code, but not all. So how does the compiler know if you don't have a reference type that explicitly bans identity comparison? If one requires value semantics it should also cover the reference. (Some programs allocate empty objects as "enums".)
 optimization opportunities, but in terms of semantics. For 
 example, you get the concept of uniqueness. And the
I agree that uniqueness is an important property. I think Rust is onto something when they now want to rename "mut" to "uniq" or "only". But in this case uniqueness is the problem with "weakly pure", a problem that "pure functions" don't have.
May 15 2014