digitalmars.D - purity and memory allocations/pointers

monarch_dodra (64/64) Aug 03 2013 I'm a bit confused about where we draw the line with purity, and

Timon Gehr (28/90) Aug 03 2013 Since a pure function call can overflow the stack, I don't really see

Meta (5/28) Aug 03 2013 It depends on whether you think a pointer dereference is pure or

monarch_dodra (3/32) Aug 03 2013 Thank the 3 of you for your answers. I think I had a wrong

Lars T. Kyllingstad (5/8) Aug 03 2013 You may also find this discussion interesting, if you haven't
John Colvin (7/42) Aug 03 2013 Is there anywhere formal defining D's pure (weak vs strong etc.)?

John Colvin (4/47) Aug 04 2013 Woops, forgot about this:

H. S. Teoh (6/10) Aug 04 2013 AFAICT, yes.

H. S. Teoh (94/102) Aug 03 2013 It's really quite simple once you understand the motivation behind it.

Meta (2/4) Aug 10 2013 This seems like it would be a good topic for another article.

H. S. Teoh (6/11) Aug 10 2013 :)

Meta (3/5) Aug 10 2013 Humm, don't know how I missed that. I think somebody linked this

Meta (2/4) Aug 10 2013 No, it was in this very thread and I just forgot about it.

Jonathan M Davis (13/15) Aug 03 2013 _All_ that pure does by itself is guarantee that the function doesn't ac...

Timon Gehr (13/40) Aug 03 2013 It is pure in D, but I guess you are not referring to that.

Meta (15/32) Aug 03 2013 I'm thinking of when a pointer points to an invalid location

Timon Gehr (14/46) Aug 03 2013 My point was basically that x is a pure expression of type Integer.

Meta (18/42) Aug 03 2013 I'm confused about what your point is now. Are you trying to

Timon Gehr (29/79) Aug 03 2013 Weak purity is actually superficially similar to state transformers in

H. S. Teoh (39/67) Aug 03 2013 I think a pointer dereference is weakly pure. Consider this:
H. S. Teoh (7/20) Aug 03 2013 [...]

Timon Gehr (2/5) Aug 03 2013 s/mutated/accessed/

anonymous (21/85) Aug 03 2013 When `dup` fails, some Exception/Error is thrown and the function

"monarch_dodra" <monarchdodra gmail.com> writes:

I'm a bit confused about where we draw the line with purity, and 
memory.

For example, take this "simple" function:

//----
char[] getMutableHello() pure
{
     return "hello".dup;
}
//----

This seems like it will *always* create the same result, yet at 
the same time, it calls the GC. "dup" could fail, amongst others, 
meaning there is failed purity.

Is this OK?

:::::::::::::::::::::::::::::::::::::::::

Where does purity mean when you are talking about 
slices/pointers? Simply that they *point*/*reference* the same 
payload? I mean:

void main()
{
     auto a = getMutableHello();
     auto b = getMutableHello();
     assert(a == b); //OK
     assert(a is b); //Derp :/
}

Did getMutableHello just violate its purity promise?

:::::::::::::::::::::::::::::::::::::::::

Another thing that bothers me, again, when interfacing with the 
GC, is "capacity". It is currently marked as pure, but I believe 
this is wrong: The GC can in-place extend the "useable" memory of 
a dynamic array. This means all referencing slices will be 
affected, yet un-modified. This means that calling "s.capacity" 
depends on the global state of the GC:

//----
     auto s1 = new ubyte[](12_000);
     auto bck = s1;
     size_t cap = s1.capacity;
     s1.reserve(13_000);
     assert(s1 is bck); //Apparently, s1 is not changed.
     assert(cap == s1.capacity); //This fails, yet capacity is 
pure?
//----

Again, is this OK?

:::::::::::::::::::::::::::::::::::::::::

One last question: Pointers.

int get(int* p) pure
{
     return *p;
}

void main()
{
     int i = 0;
     auto p = &i;
     get(p);
}

Here, get, to me, is obviously not pure, since it depends on the 
state of the global "i". *Where* did "get" go wrong? Did I simply 
"abusively" mark get as pure? Is the "pure" keyword's guarantee 
simply "weak"?

:::::::::::::::::::::::::::::::::::::::::

To sum up the question, regardless of how the *keyword* pure 
currently works, my question is "*How* should it behave"? We're 
currently marking things in Phobos as pure, and I'm worried the 
only reason we are doing it is that the keyword is lenient, and 
said functions *aren't* actually pure...

Aug 03 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/03/2013 05:59 PM, monarch_dodra wrote:
 I'm a bit confused about where we draw the line with purity, and memory.

 For example, take this "simple" function:

 //----
 char[] getMutableHello() pure
 {
      return "hello".dup;
 }
 //----

 This seems like it will *always* create the same result, yet at the same
 time, it calls the GC. "dup" could fail, amongst others, meaning there
 is failed purity.

 Is this OK?

Since a pure function call can overflow the stack, I don't really see 
this particular point as an issue. (D is Turing complete, unlike the 
machines it runs on.)

 :::::::::::::::::::::::::::::::::::::::::

 Where does purity mean when you are talking about slices/pointers?
 Simply that they *point*/*reference* the same payload? I mean:

 void main()
 {
      auto a = getMutableHello();
      auto b = getMutableHello();
      assert(a == b); //OK
      assert(a is b); //Derp :/
 }

 Did getMutableHello just violate its purity promise?
 ...

Well, no. It is by design. All purity really means is that the part of 
the existing store not reachable by following references in the 
arguments will not be mutated.

What is unfortunate in a sense is that 'is' can be used on immutable 
references. This often precludes common subexpression elimination for 
pure function calls even though such an optimization would be valid 
otherwise.

 :::::::::::::::::::::::::::::::::::::::::

 Another thing that bothers me, again, when interfacing with the GC, is
 "capacity". It is currently marked as pure, but I believe this is wrong:
 The GC can in-place extend the "useable" memory of a dynamic array. This
 means all referencing slices will be affected, yet un-modified. This
 means that calling "s.capacity" depends on the global state of the GC:

 //----
      auto s1 = new ubyte[](12_000);
      auto bck = s1;
      size_t cap = s1.capacity;
      s1.reserve(13_000);
      assert(s1 is bck); //Apparently, s1 is not changed.
      assert(cap == s1.capacity); //This fails, yet capacity is pure?
 //----

 Again, is this OK?
 ...

Well, not really.

IMO the issue here is that if D had a formal semantics, it would likely 
be indeterministic due to the GC interface. To make the execution of 
pure functions deterministic (which is not currently part of their 
guarantees), all operations that can witness indeterminism would need to 
be banned in pure functions (eg. ~=, ordering addresses, traversing hash 
tables etc.) It is not really possible to get rid of the non-determinism 
completely while keeping any kind of allocations, since allocations will 
return nondeterministic memory addresses. (Unless the formal semantics 
fully specifies the GC implementation. :o) )

Another stance that can be taken is that it is not really global state, 
since it is referenced by the slice. Then the above would be ok. (But 
semantically ugly.) Under that point of view, an issue is that capacity 
violates the transitivity of immutable.

 :::::::::::::::::::::::::::::::::::::::::

 One last question: Pointers.

 int get(int* p) pure
 {
      return *p;
 }

 void main()
 {
      int i = 0;
      auto p = &i;
      get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on the state
 of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
 ...

Yes, it's weak.

 :::::::::::::::::::::::::::::::::::::::::

 To sum up the question, regardless of how the *keyword* pure currently
 works, my question is "*How* should it behave"? We're currently marking
 things in Phobos as pure, and I'm worried the only reason we are doing
 it is that the keyword is lenient, and said functions *aren't* actually
 pure...

I guess adding the requirement that a pure function cannot witness its 
own indeterminism is required to make the keyword legitimate.

Aug 03 2013

"Meta" <jared771 gmail.com> writes:

On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
 On 08/03/2013 05:59 PM, monarch_dodra wrote:
 One last question: Pointers.

 int get(int* p) pure
 {
     return *p;
 }

 void main()
 {
     int i = 0;
     auto p = &i;
     get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on 
 the state
 of the global "i". *Where* did "get" go wrong? Did I simply 
 "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply 
 "weak"?
 ...

 Yes, it's weak.

It depends on whether you think a pointer dereference is pure or 
not (I don't know the answer). That aside, as long as get doesn't 
modify the value at *p or change what p points to, this is 
strongly pure (i.e., the academic definition of purity).

Aug 03 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Saturday, 3 August 2013 at 19:07:49 UTC, Meta wrote:
 On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
 On 08/03/2013 05:59 PM, monarch_dodra wrote:
 One last question: Pointers.

 int get(int* p) pure
 {
    return *p;
 }

 void main()
 {
    int i = 0;
    auto p = &i;
    get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on 
 the state
 of the global "i". *Where* did "get" go wrong? Did I simply 
 "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply 
 "weak"?
 ...

 Yes, it's weak.

 It depends on whether you think a pointer dereference is pure 
 or not (I don't know the answer). That aside, as long as get 
 doesn't modify the value at *p or change what p points to, this 
 is strongly pure (i.e., the academic definition of purity).

Thank the 3 of you for your answers. I think I had a wrong 
preconception of what pure is. I think this cleared most of it up.

Aug 03 2013

"Lars T. Kyllingstad" <public kyllingen.net> writes:

On Saturday, 3 August 2013 at 21:19:35 UTC, monarch_dodra wrote:

 Thank the 3 of you for your answers. I think I had a wrong 
 preconception of what pure is. I think this cleared most of it 
 up.

You may also find this discussion interesting, if you haven't 
already seen it:

http://forum.dlang.org/thread/i7bp8o$6po$1 digitalmars.com

It basically led to our current definition of "pure".

Aug 03 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Saturday, 3 August 2013 at 21:19:35 UTC, monarch_dodra wrote:
 On Saturday, 3 August 2013 at 19:07:49 UTC, Meta wrote:
 On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
 On 08/03/2013 05:59 PM, monarch_dodra wrote:
 One last question: Pointers.

 int get(int* p) pure
 {
   return *p;
 }

 void main()
 {
   int i = 0;
   auto p = &i;
   get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on 
 the state
 of the global "i". *Where* did "get" go wrong? Did I simply 
 "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply 
 "weak"?
 ...

 Yes, it's weak.

 It depends on whether you think a pointer dereference is pure 
 or not (I don't know the answer). That aside, as long as get 
 doesn't modify the value at *p or change what p points to, 
 this is strongly pure (i.e., the academic definition of 
 purity).

 Thank the 3 of you for your answers. I think I had a wrong 
 preconception of what pure is. I think this cleared most of it 
 up.

Is there anywhere formal defining D's pure (weak vs strong etc.)? 
A page in the wiki perhaps?

Imagine someone new coming to D and being confused by what our 
purity system is. It would suck to only be able to give an ad-hoc 
answer or link them to a previous discussion.

I would offer but I don't really understand it myself.

Aug 03 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Saturday, 3 August 2013 at 23:15:25 UTC, John Colvin wrote:
 On Saturday, 3 August 2013 at 21:19:35 UTC, monarch_dodra wrote:
 On Saturday, 3 August 2013 at 19:07:49 UTC, Meta wrote:
 On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
 On 08/03/2013 05:59 PM, monarch_dodra wrote:
 One last question: Pointers.

 int get(int* p) pure
 {
  return *p;
 }

 void main()
 {
  int i = 0;
  auto p = &i;
  get(p);
 }

 Here, get, to me, is obviously not pure, since it depends 
 on the state
 of the global "i". *Where* did "get" go wrong? Did I simply 
 "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply 
 "weak"?
 ...

 Yes, it's weak.

 It depends on whether you think a pointer dereference is pure 
 or not (I don't know the answer). That aside, as long as get 
 doesn't modify the value at *p or change what p points to, 
 this is strongly pure (i.e., the academic definition of 
 purity).

 Thank the 3 of you for your answers. I think I had a wrong 
 preconception of what pure is. I think this cleared most of it 
 up.

 Is there anywhere formal defining D's pure (weak vs strong 
 etc.)? A page in the wiki perhaps?

 Imagine someone new coming to D and being confused by what our 
 purity system is. It would suck to only be able to give an 
 ad-hoc answer or link them to a previous discussion.

 I would offer but I don't really understand it myself.

Woops, forgot about this: 
http://klickverbot.at/blog/2012/05/purity-in-d/

Is it still up-to-date?

Aug 04 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sun, Aug 04, 2013 at 02:35:38PM +0200, John Colvin wrote:
[...]
 Woops, forgot about this:
 http://klickverbot.at/blog/2012/05/purity-in-d/
 
 Is it still up-to-date?

AFAICT, yes.


T

-- 
The computer is only a tool. Unfortunately, so is the user. -- Armaphine, K5

Aug 04 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sun, Aug 04, 2013 at 01:15:24AM +0200, John Colvin wrote:
[...]
 Is there anywhere formal defining D's pure (weak vs strong etc.)? A
 page in the wiki perhaps?
 
 Imagine someone new coming to D and being confused by what our
 purity system is. It would suck to only be able to give an ad-hoc
 answer or link them to a previous discussion.
 
 I would offer but I don't really understand it myself.

It's really quite simple once you understand the motivation behind it.

Pure (aka "strong" purity) means the return value of the function only
depends on its arguments, not on any hidden state, not on any global
state, and it has no side-effects. So no static variables, no globals,
no side-effects. This is purity as generally understood in functional
languages.

Now, traditionally, (strong) purity is interpreted to mean that the
function cannot use anything with side effects, like assignment
statements and mutable variables, since easiest way to ensure that
function as a whole has no side-effects is to build it out of primitives
that also has no side-effects.

The first insight in D, however, is that as long as the function's
return value only ever depends on its arguments, and it cannot access
any global variables (or cause side-effects like I/O), then it cannot be
externally distinguished from a genuinely pure function (in the Haskell
sense of pure). The fact that the function may be using mutable
variables and assignment statements is invisible to the outside world,
so for all intents and purposes, it might as well be Haskell-pure. This
leads to the convenience in D that it's OK to use mutable variables and
assignment statements, etc., that are traditionally prohibited in pure
languages, inside a D pure function, as long as the outside world can't
tell the difference. So we don't have to go full-blown functional
programming inside the function; we can still use loops and arrays and
stuff. This makes pure functions much more convenient to write, yet
without sacrificing the advantages of true purity.

The second insight in D, is that suppose you have a function, let's call
it W, that (1) has no internal state, (2) doesn't directly access global
variables, and (3) can affect the outside world but only through its
arguments (e.g., a pointer, a reference to mutable variables, etc.).
Now, clearly, being able to modify external state through its arguments
violates the traditional sense of purity. But here's where the insight
comes in: suppose we have a strongly pure function as described in the
previous paragraph, let's call it F. It has no side-effects, no hidden
state, no accessing of global variables, but does have some local
variables it uses to calculate its return value. The question is, will F
remain strongly pure if it calls W?

So let's think about the ramifications of calling W. Since W is able to
modify the outside world through its arguments, it seems that F can no
longer be pure, since any side-effects caused by W will also implicate
F. But if you think about it, the worst that F can do is to pass a
reference to a local variable to W. It can't pass a reference to a
global variable because it doesn't have any access to a global variable!
So whatever side-effect W may have, that side-effect is confined inside
the scope of F, and continues to be invisible to the world outside of F.
We can therefore conclude that F is *still* strongly pure!

Now, the story would be different if W could access global variables --
then its side-effects would be visible outside of F, and so F can no
longer be pure. So W still has some restrictions on it, it's just that
it's a little more lax than the restrictions on F.

So here is where the idea of weak purity comes in. We say that W is
"weakly pure", because it violates the traditional definition of purity
(can't mutate outside state through reference arguments), but it is
still restricted enough that when it's called from a strongly pure
function, all of its side-effects are confined inside the local scope of
the strongly pure function, and thereby are invisible to the world
outside of the strongly pure function. IOW, a weakly pure function is a
function that can be called by a strongly pure function without
violating the strong purity of the latter.

Finally, we make the observation that the distinction between a
strongly-pure and weakly-pure function is completely determined by its
argument types. A weakly-pure function is still not allowed to access
global variables, for example; the only way it can mutate the outside
world is if at least one of its arguments is mutable, or contains a
reference to some outside state. If all of its arguments are immutable
and contain no references, then the function can't actually have any
side-effects on the outside world at all, and so it is, in fact,
strongly pure. Thus, this allows us to use the "pure" keyword on *both*
strongly-pure and weakly-pure functions: the weakly pure functions can
be identified by the fact that one or more of their arguments may
contain mutable references; the strongly pure functions are the ones
whose arguments are immutable (or implicitly castable to immutable, like
ints -- since modifying an int argument has no effect on the outside
world).

Thus, we arrive at our current definition of purity in D.

To end, I'd like to mention the reason we want to go through all this
tedium to define strongly pure vs. weakly pure: the key insight is that
even though weakly pure functions are *not* pure in the traditional
sense, they *can* be used as building blocks to build strongly pure
functions! This greatly increases the implementation possibilities of
strongly pure functions. Not only can we use local variables and
assignments in imperative code inside a strongly pure function, we can
also be assured that calling weakly-pure functions, which have
side-effects, will still maintain our purity! The notion of weak purity
assures us that none of the side-effects we use inside the strongly-pure
function will ever "leak out" to the outside world. Therefore, we
continue to enjoy the benefits of purity-based optimizations, like
caching of return values, merging identical subexpressions, and many of
the other optimizations that are traditionally restricted only to
functional languages.


T

-- 
"640K ought to be enough" -- Bill G., 1984. "The Internet is not a primary goal
for PC usage" -- Bill G., 1995. "Linux has no impact on Microsoft's strategy"
-- Bill G., 1999.

Aug 03 2013

"Meta" <jared771 gmail.com> writes:

On Saturday, 10 August 2013 at 18:28:07 UTC, H. S. Teoh wrote:
 It's really quite simple once you understand the motivation 
 behind it...

This seems like it would be a good topic for another article.

Aug 10 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Aug 10, 2013 at 08:44:06PM +0200, Meta wrote:
 On Saturday, 10 August 2013 at 18:28:07 UTC, H. S. Teoh wrote:
It's really quite simple once you understand the motivation behind
it...

 
 This seems like it would be a good topic for another article.

:)

But there's already one: http://klickverbot.at/blog/2012/05/purity-in-d/


T

-- 
People say I'm arrogant, and so I am!!

Aug 10 2013

"Meta" <jared771 gmail.com> writes:

On Saturday, 10 August 2013 at 20:23:38 UTC, H. S. Teoh wrote:
 But there's already one: 
 http://klickverbot.at/blog/2012/05/purity-in-d/

Humm, don't know how I missed that. I think somebody linked this 
a few days ago in a different thread, too.

Aug 10 2013

"Meta" <jared771 gmail.com> writes:

On Sunday, 11 August 2013 at 00:31:30 UTC, Meta wrote:
 Humm, don't know how I missed that. I think somebody linked 
 this a few days ago in a different thread, too.

No, it was in this very thread and I just forgot about it.

Aug 10 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, August 03, 2013 23:19:33 monarch_dodra wrote:
 Thank the 3 of you for your answers. I think I had a wrong
 preconception of what pure is. I think this cleared most of it up.

_All_ that pure does by itself is guarantee that the function doesn't access 
any static or global variables which can be mutated after they're initialized. 
Where it becomes more entertaining is when the function becomes strongly pure 
(which at this point requires that all of the functions parameters be 
immutable or implicitly convertible to immutable - though it really should be 
when the _arguments_ are immutable or implicitly convertible to immutable and 
all of the parameters are either const or immutable or implicitly convertible 
to immutable). At that point, certain optimizations become possible, but what 
exactly those optimizations are is still something that's up for debate (e.g. 
when you can get away with converting the return value to a different level of 
mutability and when you can't).

- Jonathan M Davis

Aug 03 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/03/2013 09:07 PM, Meta wrote:
 On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
 On 08/03/2013 05:59 PM, monarch_dodra wrote:
 One last question: Pointers.

 int get(int* p) pure
 {
     return *p;
 }

 void main()
 {
     int i = 0;
     auto p = &i;
     get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on the state
 of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
 mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
 ...

 Yes, it's weak.

 It depends on whether you think a pointer dereference is pure or not  (I
 don't know the answer).

It is pure in D, but I guess you are not referring to that.
What's your understanding of purity in this context?

 That aside, as long as get doesn't modify the
 value at *p or change what p points to, this is strongly pure

Modification and dereference within a Haskell expression:

import Data.STRef
import Control.Monad.ST

x = runST $ do
   x <- newSTRef 0
   writeSTRef x 1
   v <- readSTRef x
   return v

main = print x

 (i.e., the academic definition of purity).

I wouldn't go that far.

Aug 03 2013

"Meta" <jared771 gmail.com> writes:

On Saturday, 3 August 2013 at 23:04:14 UTC, Timon Gehr wrote:
 It depends on whether you think a pointer dereference is pure 
 or not  (I don't know the answer).

 It is pure in D, but I guess you are not referring to that.
 What's your understanding of purity in this context?

I'm thinking of when a pointer points to an invalid location 
(0x00000000 or similar) and is dereferenced. I'm not sure if that 
would be considered impure or not, i.e., causes side effects.

 That aside, as long as get doesn't modify the
 value at *p or change what p points to, this is strongly pure

 Modification and dereference within a Haskell expression:

 import Data.STRef
 import Control.Monad.ST

 x = runST $ do
   x <- newSTRef 0
   writeSTRef x 1
   v <- readSTRef x
   return v

 main = print x

I apologize, as I don't know how familiar you are with Haskell, 
so forgive me if I'm telling you something you already know. That 
code is 100% pure and side-effect free; no variables are being 
modified. Haskell only simulates side-effects with monads, and 
do-notation (syntwhat you use in this example) which is syntactic 
sugar.

 (i.e., the academic definition of purity).

 I wouldn't go that far.

Perhaps that may go too far, as academics love to obfuscate 
topics with a bunch of extraneous cruft, but the fact remains 
that purity means:

1. No modification of local or global state (side-effects)
2. No dependence on global mutable state.

Aug 03 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/04/2013 02:50 AM, Meta wrote:
 On Saturday, 3 August 2013 at 23:04:14 UTC, Timon Gehr wrote:
 ...

 Modification and dereference within a Haskell expression:

 import Data.STRef
 import Control.Monad.ST

 x = runST $ do
   x <- newSTRef 0
   writeSTRef x 1
   v <- readSTRef x
   return v

 main = print x

 I apologize, as I don't know how familiar you are with Haskell,

With the language, quite intimately.

 so forgive me if I'm telling you something you already know. That code is
 100% pure and side-effect free;

My point was basically that x is a pure expression of type Integer.

 no variables are being modified.

Which I didn't claim. A reference is dereferenced and the contents of 
the referenced slot are replaced. Then the reference is dereferenced 
again to read the modified value out.

 Haskell only simulates side-effects

What tells you that D does not 'simulate' mutable state?

 with monads,

What is done to main behind the scenes in order to execute the program 
is most definitely side-effecting.

 and do-notation

do notation is not central to my point.

 (syntwhat you
 use in this example) which is syntactic sugar.

 (i.e., the academic definition of purity).

 I wouldn't go that far.

 Perhaps that may go too far, as academics love to obfuscate topics with
 a bunch of extraneous cruft,

I take issue with this statement. I meant, I wouldn't assume 'the 
academic definition of purity' is a thing.

 but the fact remains that purity means:

 1. No modification of local or global state (side-effects)
 2. No dependence on global mutable state.

What does this statement quantify over? Eg, where is the abstraction 
boundary?

Aug 03 2013

"Meta" <jared771 gmail.com> writes:

On Sunday, 4 August 2013 at 02:05:55 UTC, Timon Gehr wrote:
 no variables are being modified.

 Which I didn't claim. A reference is dereferenced and the 
 contents of the referenced slot are replaced. Then the 
 reference is dereferenced again to read the modified value out.

I'm confused about what your point is now. Are you trying to 
convince me that dereferencing p and modifying the value it 
points to are side-effect free? Of course you can do it in 
Haskell with some wizardry, but D doesn't do any of that.

 Haskell only simulates side-effects

 What tells you that D does not 'simulate' mutable state?

I see what you're getting at. I thought I had a good answer 
before I really thought about your question. I can't think of any 
way to prove that a language is not simply simulating mutable 
state off the top of my head. Do you actually have an answer to 
this question, or were you just making a point?

 with monads,

 What is done to main behind the scenes in order to execute the 
 program is most definitely side-effecting.

I agree with you, but that is done by the runtime in a safe way 
and is mostly out of the user's hands.

 (i.e., the academic definition of purity).

 I wouldn't go that far.

 Perhaps that may go too far, as academics love to obfuscate 
 topics with
 a bunch of extraneous cruft,

 I take issue with this statement. I meant, I wouldn't assume 
 'the academic definition of purity' is a thing.

I don't know what you mean by this. By academic definition, I 
meant the definition of functional purity that is generally 
agreed-upon by the CS academia.

 but the fact remains that purity means:

 1. No modification of local or global state (side-effects)
 2. No dependence on global mutable state.

 What does this statement quantify over? Eg, where is the 
 abstraction boundary?

Again, can you be more clear? My meaning was that for a function 
to be pure, in a functional programming sense, it must adhere to 
these rules.

Aug 03 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/04/2013 06:37 AM, Meta wrote:
 On Sunday, 4 August 2013 at 02:05:55 UTC, Timon Gehr wrote:
 no variables are being modified.

 Which I didn't claim. A reference is dereferenced and the contents of
 the referenced slot are replaced. Then the reference is dereferenced
 again to read the modified value out.

 I'm confused about what your point is now. Are you trying to convince me
 that dereferencing p and modifying the value it points to are
 side-effect free? Of course you can do it in Haskell with some wizardry,
 but D doesn't do any of that.
 ...

Weak purity is actually superficially similar to state transformers in 
Haskell.

 Haskell only simulates side-effects

 What tells you that D does not 'simulate' mutable state?

 I see what you're getting at. I thought I had a good answer before I
 really thought about your question. I can't think of any way to prove
 that a language is not simply simulating mutable state off the top of my
 head. Do you actually have an answer to this question, or were you just
 making a point?
 ...

I think you cannot tell, and hence it is questionable whether simulated 
side effects are different from real side effects. Basically it is an 
implementation detail. The side effects occurring in that Haskell 
expression are as real as those occurring in D. (GHC actually implements 
this as updating a mutable memory location under the hood, but it does 
not really matter.)

 with monads,

 What is done to main behind the scenes in order to execute the program
 is most definitely side-effecting.

 I agree with you, but that is done by the runtime in a safe way and is
 mostly out of the user's hands.
 ...

I'd agree with 'mostly safe'. We don't/can't know how the real world 
works, and it has inherent limitations not shared by idealized 
programming languages. (and vice versa!)

 (i.e., the academic definition of purity).

 I wouldn't go that far.

 Perhaps that may go too far, as academics love to obfuscate topics with
 a bunch of extraneous cruft,

 I take issue with this statement. I meant, I wouldn't assume 'the
 academic definition of purity' is a thing.

 I don't know what you mean by this. By academic definition, I meant the
 definition of functional purity that is generally agreed-upon by the CS
 academia.
 ...

Me too. I just don't think there is a notion of functional purity 
applying equally well to all programming languages.


languages can most definitely allocate heap storage when evaluating an 
expression.

 but the fact remains that purity means:

 1. No modification of local or global state (side-effects)
 2. No dependence on global mutable state.

 What does this statement quantify over? Eg, where is the abstraction
 boundary?

 Again, can you be more clear?
 My meaning was that for a function to be
 pure, in a functional programming sense, it must adhere to these rules.

At the lowest level of abstraction, no function call will adhere to 
these rules, since the machine changes its state.

Then there is the memory allocator that carries around global state. The 
perception of whether or not this allocator is pure will likely depend 
on how much access to this global state is given to the user program.

etc.

How pure or impure a given operation is in a given language will 
sometimes depend on the entire language design.

In order to resolve the topic, I guess it is useful to state which 
properties of 'pure' functions, we want to rely on. (Such properties are 
also easier to translate between programming languages, but this is 
still moot.)

Aug 03 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Aug 03, 2013 at 09:07:47PM +0200, Meta wrote:
 On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
On 08/03/2013 05:59 PM, monarch_dodra wrote:
One last question: Pointers.

int get(int* p) pure
{
    return *p;
}

void main()
{
    int i = 0;
    auto p = &i;
    get(p);
}

Here, get, to me, is obviously not pure, since it depends on the
state of the global "i". *Where* did "get" go wrong? Did I simply
"abusively" mark get as pure? Is the "pure" keyword's guarantee
simply "weak"?
...

Yes, it's weak.

 
 It depends on whether you think a pointer dereference is pure or not
 (I don't know the answer). That aside, as long as get doesn't modify
 the value at *p or change what p points to, this is strongly pure
 (i.e., the academic definition of purity).

I think a pointer dereference is weakly pure. Consider this:

	int func1(int x) pure {
		int scratch = x+1;
		func2(&scratch);
		return scratch;
	}

	void func2(int* x) pure {
		*x = 1;
	}

Clearly, func1 can be strongly pure, because its input will never change
based on any global state, in spite of the fact that func2 is
dereferencing pointers. Therefore, func2 is (should be) weakly pure.

Now consider this:

	void func2(int* x) pure {
		*x = 1;
	}

	int y = 2;

	void main() {
		func2(&y);
	}

Can func2 still be said to be weakly pure? I say yes, because it cannot
access global that that main() isn't already passing in! It is main
that's impure, because it's the one that handed a reference to a global
to func2. If we move the definition of y inside main, then main could be
strongly pure.

Which leads us to conclude that the reason main is impure is because
it's referencing a non-local variable when it took the address of y. The
act of taking addresses and dereferencing pointers does not make code
impure; it merely makes them weakly pure as opposed to strongly pure.
And weakly pure functions are callable from strongly pure functions,
because any changes effected through pointers / references / whatever
will never escape the scope of the strongly pure function (the strongly
pure function isn't allowed to take an address of a non-local variable,
so the weakly pure function will never be able to change anything
outside). The outside world will never notice the difference.


T

-- 
Designer clothes: how to cover less by paying more.

Aug 03 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Aug 03, 2013 at 12:24:08PM -0700, H. S. Teoh wrote:
[...]
 I think a pointer dereference is weakly pure. Consider this:
 
 	int func1(int x) pure {
 		int scratch = x+1;
 		func2(&scratch);
 		return scratch;
 	}
 
 	void func2(int* x) pure {
 		*x = 1;
 	}
 
 Clearly, func1 can be strongly pure, because its input will never change

[...]

Gah, I meant its *output* will never change.


T

-- 
This sentence is false.

Aug 03 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/03/2013 06:47 PM, Timon Gehr wrote:
 Well, no. It is by design. All purity really means is that the part of
 the existing store not reachable by following references in the
 arguments will not be mutated.

s/mutated/accessed/

Aug 03 2013

"anonymous" <anonymous example.com> writes:

On Saturday, 3 August 2013 at 15:59:22 UTC, monarch_dodra wrote:
 I'm a bit confused about where we draw the line with purity, 
 and memory.

 For example, take this "simple" function:

 //----
 char[] getMutableHello() pure
 {
     return "hello".dup;
 }
 //----

 This seems like it will *always* create the same result, yet at 
 the same time, it calls the GC. "dup" could fail, amongst 
 others, meaning there is failed purity.

 Is this OK?

When `dup` fails, some Exception/Error is thrown and the function 
won't return. So, purity is not affected, I think. There's also 
`nothrow` when you want to rule out exceptions.

 :::::::::::::::::::::::::::::::::::::::::

 Where does purity mean when you are talking about 
 slices/pointers? Simply that they *point*/*reference* the same 
 payload? I mean:

 void main()
 {
     auto a = getMutableHello();
     auto b = getMutableHello();
     assert(a == b); //OK
     assert(a is b); //Derp :/
 }

 Did getMutableHello just violate its purity promise?

I'd say `pure` doesn't promise equality of memory addresses. I 
imagine it would be rather annoying if you couldn't `new` in pure 
functions.

 :::::::::::::::::::::::::::::::::::::::::

 Another thing that bothers me, again, when interfacing with the 
 GC, is "capacity". It is currently marked as pure, but I 
 believe this is wrong: The GC can in-place extend the "useable" 
 memory of a dynamic array. This means all referencing slices 
 will be affected, yet un-modified. This means that calling 
 "s.capacity" depends on the global state of the GC:

 //----
     auto s1 = new ubyte[](12_000);
     auto bck = s1;
     size_t cap = s1.capacity;
     s1.reserve(13_000);
     assert(s1 is bck); //Apparently, s1 is not changed.
     assert(cap == s1.capacity); //This fails, yet capacity is 
 pure?
 //----

 Again, is this OK?

When you understand `capacity` as being referenced by the arrays, 
it's ok. Because then, of course `s1 is bck` even though its 
capacity changed. And `cap == s1.capacity` fails, because `cap` 
is a copy of an earlier state.

 :::::::::::::::::::::::::::::::::::::::::

 One last question: Pointers.

 int get(int* p) pure
 {
     return *p;
 }

 void main()
 {
     int i = 0;
     auto p = &i;
     get(p);
 }

 Here, get, to me, is obviously not pure, since it depends on 
 the state of the global "i". *Where* did "get" go wrong? Did I 
 simply "abusively" mark get as pure? Is the "pure" keyword's 
 guarantee simply "weak"?

By passing a pointer to `i`, you explicitly add it to the list of 
stuff, `get` may access (and alter).

Indeed, `pure` by itself only guarantees "weak purity". That 
means, the function may alter its arguments (and everything 
reachable from them). To get "strong purity", mark all arguments 
as const.

 :::::::::::::::::::::::::::::::::::::::::

 To sum up the question, regardless of how the *keyword* pure 
 currently works, my question is "*How* should it behave"? We're 
 currently marking things in Phobos as pure, and I'm worried the 
 only reason we are doing it is that the keyword is lenient, and 
 said functions *aren't* actually pure...

I think all cases you showed work as expected. For `pure` to 
really shine (strong purity), you need to add some `const` into 
the mix.

See also: http://dlang.org/function.html#pure-functions

Aug 03 2013

D Programming

C/C++ Programming

Other

digitalmars.D - purity and memory allocations/pointers