digitalmars.D - Is this function pure?

Janice Caron (15/15) Sep 18 2007 Is this function pure?

Vladimir Panteleev (13/28) Sep 18 2007 Forbidding operations that may cause an exception would imply forbidding...
Lutger (4/21) Sep 18 2007 Putting what the compiler can check aside, isn't it the return statement...

Janice Caron (16/21) Sep 18 2007 I don't think so, since I could easily rewrite it as

Lutger (33/56) Sep 18 2007 What I meant is that the return value is depending on whether new throws...

Steven Schveighoffer (7/22) Sep 18 2007 I would guess that it is not pure, as it is having side effects. The ne...

Lutger (5/34) Sep 18 2007 That's true, using 'new' does make a function impure. That doesn't mean

Don Clugston (16/33) Sep 18 2007 Interesting question.
OF (5/22) Sep 18 2007 As long as new char[x] can fail without consistency this function is not...

Janice Caron (13/15) Sep 18 2007 I think the "address of" operator would have to banned completely from

Bruno Medeiros (6/19) Sep 18 2007 is invalid even for non-pure functions.

Janice Caron (7/16) Sep 18 2007 If you insist...

Brad Roberts (20/42) Sep 18 2007 Purity typically means that the result of the function is exclusively

Steven Schveighoffer (3/6) Sep 18 2007 You mean 'logically' pure? (cackles manically)

Nathan Reed (19/36) Sep 18 2007 Pure is intended to mimic some of the capabilities of functional

Nathan Reed (9/13) Sep 18 2007 I just realized that (as someone else pointed out) taking the address of...

BCS (8/23) Sep 18 2007 It would be a better approach (IMHO) for pure function to just be preven...

BCS (2/27) Sep 18 2007 (BTW: all that is with regards to pure running at compile time)

Steven Schveighoffer (26/42) Sep 18 2007 I tend to agree with you. But Walter is touting pure functions as being...

Steven Schveighoffer (6/16) Sep 18 2007 Totally misread your challenge. I thought you said impossible to write ...
Nathan Reed (7/12) Sep 18 2007 The allocator has to deal with allocation requests coming from multiple

0ffh (5/9) Sep 18 2007 We might even want go all the way, and *require* a pure function to thro...

Bruce Adams (3/14) Sep 18 2007 My understanding is that pure is a compile time concept. If your functio...

Nathan Reed (8/9) Sep 18 2007 Pure functions aren't necessarily /evaluated/ at compile time. They can...

Janice Caron (22/24) Sep 19 2007 I presume that stands for Compile Time Function Execution and not

Robert Fraser (3/33) Sep 19 2007 Someone probably already mentioned this, but you can't (easily, i.e. may...

Brad Roberts (3/38) Sep 19 2007 Yes you can. The malloc subsystem out of practical necessity is

Robert Fraser (2/42) Sep 19 2007

Bill Baxter (7/24) Sep 19 2007 On the other hand the support for concurrency may consist of little more...

Nathan Reed (12/21) Sep 19 2007 On a multicore system the allocator might conceivably have a heap for

Bill Baxter (22/39) Sep 19 2007 I've heard this claim a couple of times recently (allocation is fast in

Frits van Bommel (15/17) Sep 19 2007 Yes, AFAIK the 'fast allocation' is mostly true of garbage collectors
Nathan Reed (9/11) Sep 19 2007 I didn't know that; I assumed D was using at least a copying collector.

Frits van Bommel (43/53) Sep 19 2007 The big blocker to any sort of copying/moving collector is that it needs...

0ffh (11/25) Sep 18 2007 Actually, in spite of all the "Not quite sure, probably not." replies

Janice Caron (4/23) Sep 18 2007 You don't? The way I see it, if the input is 5, then output might

0ffh (11/14) Sep 18 2007 That's surely a point which can be discussed:

Janice Caron (12/14) Sep 18 2007 I don't think it's that simple. Exceptions should be allowed as flow

Ingo Oeser (12/27) Sep 18 2007 A valid optimisation is to assume that x -> a.length, since a.length

Janice Caron (16/20) Sep 18 2007 OK, so change it to

Ingo Oeser (7/25) Sep 18 2007 Ok, now that's impure, since now you are using the memory by returning i...

0ffh (6/18) Sep 18 2007 I am afraid I must disagree.

Ingo Oeser (8/12) Sep 18 2007 Sorry, mixed that up with const functions.

Nathan Reed (16/20) Sep 18 2007 I think if you take this point of view, you have to conclude that even a...

"Janice Caron" <caron800 googlemail.com> writes:

Is this function pure?

int probablyNotPure(int x)
{
	try
	{
		char[] a = new char[x];
		return a.length;
	}
	catch
	{
		return 0;
	}
}

I must assume not, since it's not deterministic. But which part of it
wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

Sep 18 2007

"Vladimir Panteleev" <thecybershadow gmail.com> writes:

On Tue, 18 Sep 2007 15:36:50 +0300, Janice Caron <caron800 googlemail.co=
m> wrote:

 Is this function pure?

 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a =3D new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }

 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

Forbidding operations that may cause an exception would imply forbidding=
 division, since that may raise a division by zero statement. I don't se=
e what's wrong with handling your own exceptions. And, "determinism" is =
a stretchable notion - if you can determine the number of concurrent thr=
eads which will call this function, the points where switches occur and =
the amount of available memory... in D's context, though, I don't think =
determinism should be a necessity for pure functions - just the limitati=
on of not accessing external data or "non-pure" code.

-- =

Best regards,
 Vladimir                          mailto:thecybershadow gmail.com

Sep 18 2007

Lutger <lutger.blijdestijn gmail.com> writes:

Janice Caron wrote:
 Is this function pure?
 
 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a = new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }
 
 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

Putting what the compiler can check aside, isn't it the return statement 
in the catch block what prevents this function from being pure, or 
referentially transparent?

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/18/07, Lutger <lutger.blijdestijn gmail.com> wrote:
 Janice Caron wrote:
 Is this function pure?

 Putting what the compiler can check aside, isn't it the return statement
 in the catch block what prevents this function from being pure, or
 referentially transparent?

I don't think so, since I could easily rewrite it as

int probablyNotPure(int x)
{
    int n;
    try
    {
        char[] a = new char[x];
        n = a.length;
    }
    catch
    {
        n = 0;
    }
    return n;
}

Sep 18 2007

Lutger <lutger.blijdestijn gmail.com> writes:

Janice Caron wrote:
 On 9/18/07, Lutger <lutger.blijdestijn gmail.com> wrote:
 Janice Caron wrote:
 Is this function pure?

 Putting what the compiler can check aside, isn't it the return statement
 in the catch block what prevents this function from being pure, or
 referentially transparent?

 
 I don't think so, since I could easily rewrite it as
 
 int probablyNotPure(int x)
 {
     int n;
     try
     {
         char[] a = new char[x];
         n = a.length;
     }
     catch
     {
         n = 0;
     }
     return n;
 }

What I meant is that the return value is depending on whether new throws 
or not. This is the case in the above function as well as the one in the 
original post, they amount to the same thing.
 From the standpoint of referential transparency I don't think this 
function is really different from:

int surelyNotPure(int x)
{
     int n = rand();
     if (n > 100)
     {
         return x;
     }
     else
     {
         return 0;
     }
}

But if you write it as this I don't see why it would not be pure:

int probablyPure(int x)
{
     int n;
     try
     {
         char[] a = new char[x];
         n = a.length;
     }
     catch
     {
         throw new FooException;
     }
     return n;
}

Sep 18 2007

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Janice Caron" wrote
 Is this function pure?

 int probablyNotPure(int x)
 {
 try
 {
 char[] a = new char[x];
 return a.length;
 }
 catch
 {
 return 0;
 }
 }

 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

I would guess that it is not pure, as it is having side effects.  The new 
call is modifying global state (that of the garbage collector), so I don't 
see how that 'new' call can be pure.  From what I understand, a pure 
function can only call other pure functions.

So I think to answer your question, new should be forbidden.

-Steve

Sep 18 2007

Lutger <lutger.blijdestijn gmail.com> writes:

Steven Schveighoffer wrote:
 "Janice Caron" wrote
 Is this function pure?

 int probablyNotPure(int x)
 {
 try
 {
 char[] a = new char[x];
 return a.length;
 }
 catch
 {
 return 0;
 }
 }

 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

 
 I would guess that it is not pure, as it is having side effects.  The new 
 call is modifying global state (that of the garbage collector), so I don't 
 see how that 'new' call can be pure.  From what I understand, a pure 
 function can only call other pure functions.
 
 So I think to answer your question, new should be forbidden.
 
 -Steve 
 
 

That's true, using 'new' does make a function impure. That doesn't mean 
it cannot be referentially transparent though.

Even if some 'impure' operations were to be allowed, in the case of new 
you would also have to take custom allocators into account.

Sep 18 2007

Don Clugston <dac nospam.com.au> writes:

Janice Caron wrote:
 Is this function pure?
 
 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a = new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }
 
 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

Interesting question.
This is exactly the same:

int probablyToo(char [] x)
{
       try {
          return (x ~ "a").length;
       }
       catch {
          return 0;
       }
}

If 'new' was forbidden for reasons of non-determinism, then string
concatenation 
would have to be, as well. Which I think would make pure functions pretty 
limited. But, if you assume that anything with is CTFE-able is also pure, then 
string concatenation is OK.

Sep 18 2007

OF <nospam nospammington.com> writes:

Janice Caron Wrote:

 Is this function pure?
 
 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a = new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }
 
 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

As long as new char[x] can fail without consistency this function is not pure.
That is, the return value depends on something else than x.

However, allocating memory is in itself not necessarily 'unpure', as long as
it's predictible (say: I always get memory, else make this computation fail,
which could be handled by a caller on a lower and not pure level). 

Problems arise as long as you allow new to fail like this or if you use the
address you get (which of course is not deterministic to the degree we want). 

Probably it's best to leave memory allocations to implicit allocations in pure
functions, but this is tricky if you want to return a new object as a modified
version of an input object. I don't know how it's intended for D to solve this
nicely, but I'm very interested in hearing it.

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/18/07, OF <nospam nospammington.com> wrote:
 Problems arise as long as you allow new to fail like this or if you use the
address you get (which of course is not deterministic to the degree we want).

I think the "address of" operator would have to banned completely from
pure functions, otherwise you could do

int definitelyNotPure()
{
    int x;
    return cast(int)&x;
}


 Probably it's best to leave memory allocations to implicit allocations in pure
functions, but this is tricky if you want to return a new object as a modified
version of an input object. I don't know how it's intended for D to solve this
nicely, but I'm very interested in hearing it.

The only way I can think of is the old-fashioned C way, of passing a
buffer and a length (or in D, an array) into the function, and using
that and that alone as your memory.

...Ooooh, but wait - don't all inputs have to be (transitively) const!?

Yikes.

Sep 18 2007

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

Janice Caron wrote:
 On 9/18/07, OF <nospam nospammington.com> wrote:
 Problems arise as long as you allow new to fail like this or if you use the
address you get (which of course is not deterministic to the degree we want).

 
 I think the "address of" operator would have to banned completely from
 pure functions, otherwise you could do
 

Can you give another example? The one you gave:

 int definitelyNotPure()
 {
     int x;
     return cast(int)&x;
 }
 
 

is invalid even for non-pure functions.

-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/19/07, Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
 Can you give another example? The one you gave:

 int definitelyNotPure()
 {
     int x;
     return cast(int)&x;
 }

 is invalid even for non-pure functions.

If you insist...

int definitelyNotPure()
{
    auto x = new int[1];
    return cast(int)&x;
}

Sep 18 2007

Brad Roberts <braddr puremagic.com> writes:

Janice Caron wrote:
 On 9/18/07, OF <nospam nospammington.com> wrote:
 Problems arise as long as you allow new to fail like this or if you use the
address you get (which of course is not deterministic to the degree we want).

 
 I think the "address of" operator would have to banned completely from
 pure functions, otherwise you could do
 
 int definitelyNotPure()
 {
     int x;
     return cast(int)&x;
 }
 
 
 Probably it's best to leave memory allocations to implicit allocations in pure
functions, but this is tricky if you want to return a new object as a modified
version of an input object. I don't know how it's intended for D to solve this
nicely, but I'm very interested in hearing it.

 
 The only way I can think of is the old-fashioned C way, of passing a
 buffer and a length (or in D, an array) into the function, and using
 that and that alone as your memory.
 
 ...Ooooh, but wait - don't all inputs have to be (transitively) const!?
 
 Yikes.

Purity typically means that the result of the function is exclusively 
determined by it's inputs.  IE, that the results are entirely 
repeatable.  So, this would be pure:

void foo(in int x, out int y)
{
    y = x+1;
}

No matter how often you call it, for any given value of x, what it sets 
y to is exactly the same.  Obviously, this is a simple example and it's 
essentially isomorphic with the simpler:

int foo(in int x)
{
     return x+1;
}

Whether use of the memory subsystem or other subtle side effects affects 
purity likely becomes a matter of definition.  I personally hope that 
it's included such that your original example is pure.

Later,
Brad

Sep 18 2007

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Brad Roberts" wrote
 Whether use of the memory subsystem or other subtle side effects affects 
 purity likely becomes a matter of definition.  I personally hope that it's 
 included such that your original example is pure.

You mean 'logically' pure?  (cackles manically)

-Steve

Sep 18 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Janice Caron wrote:
 Is this function pure?
 
 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a = new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }
 
 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

Pure is intended to mimic some of the capabilities of functional 
languages, yes?  Well, at the implementation level functional languages 
have to allocate memory just like anyone else, and those allocations can 
potentially fail just like always.  That doesn't keep functional 
language functions from being pure, so it shouldn't keep D functions 
from being pure either.

The function you've written is nondeterministic, but only due to the 
nondeterminism inherent in memory allocation.  It would be interesting 
to see if it's possible to write a function that acheives nondeterminism 
  *without* either memory allocation or reading global state (e.g. 
rand()).  I believe it's not.

If that is the case, then the only potential source of nondeterminism in 
pure functions is memory allocation, and I don't think we need to really 
worry about that too much.  After all, if we've run out of memory, we've 
likely got bigger problems than our pure functions becoming 
nondeterministic. :)

Thanks,
Nathan Reed

Sep 18 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Nathan Reed wrote:
 It would be interesting 
 to see if it's possible to write a function that acheives nondeterminism 
  *without* either memory allocation or reading global state (e.g. 
 rand()).  I believe it's not.

I just realized that (as someone else pointed out) taking the address of 
a local variable would also be nondeterministic.  Pointer shenanigans in 
general can lead to nondeterminism, so maybe pointers should be 
forbidden entirely in pure functions.  Given that pointers are rarely 
used in D except for interfacing with C functions and suchlike, this 
seems like it would be not entirely unreasonable.

Thanks,
Nathan Reed

Sep 18 2007

BCS <ao pathlink.com> writes:

Reply to Nathan,

 Nathan Reed wrote:
 
 It would be interesting to see if it's possible to write a function
 that acheives nondeterminism *without* either memory allocation or
 reading global state (e.g. rand()).  I believe it's not.
 

 I just realized that (as someone else pointed out) taking the address
 of a local variable would also be nondeterministic.  Pointer
 shenanigans in general can lead to nondeterminism, so maybe pointers
 should be forbidden entirely in pure functions.  Given that pointers
 are rarely used in D except for interfacing with C functions and
 suchlike, this seems like it would be not entirely unreasonable.
 
 Thanks,
 Nathan Reed

It would be a better approach (IMHO) for pure function to just be prevented 
from accessing "other things". This could be done with some sort of VM. It 
is interesting to note that given such a VM, pure function can call some 
non pure function. The requirement would be that a pure function must not 
call anything that accesses something that the pure function can't. In a 
VM this is easy to enforce, just don't put anything into it that the function 
shouldn't access.

Sep 18 2007

BCS <ao pathlink.com> writes:

Reply to Benjamin,

 Reply to Nathan,
 
 Nathan Reed wrote:
 
 It would be interesting to see if it's possible to write a function
 that acheives nondeterminism *without* either memory allocation or
 reading global state (e.g. rand()).  I believe it's not.
 

 I just realized that (as someone else pointed out) taking the address
 of a local variable would also be nondeterministic.  Pointer
 shenanigans in general can lead to nondeterminism, so maybe pointers
 should be forbidden entirely in pure functions.  Given that pointers
 are rarely used in D except for interfacing with C functions and
 suchlike, this seems like it would be not entirely unreasonable.
 
 Thanks,
 Nathan Reed

 It would be a better approach (IMHO) for pure function to just be
 prevented from accessing "other things". This could be done with some
 sort of VM. It is interesting to note that given such a VM, pure
 function can call some non pure function. The requirement would be
 that a pure function must not call anything that accesses something
 that the pure function can't. In a VM this is easy to enforce, just
 don't put anything into it that the function shouldn't access.
 

(BTW: all that is with regards to pure running at compile time)

Sep 18 2007

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Nathan Reed" wrote
 Pure is intended to mimic some of the capabilities of functional 
 languages, yes?  Well, at the implementation level functional languages 
 have to allocate memory just like anyone else, and those allocations can 
 potentially fail just like always.  That doesn't keep functional language 
 functions from being pure, so it shouldn't keep D functions from being 
 pure either.

I tend to agree with you.  But Walter is touting pure functions as being 
able to run simultaneously on two cores.  What happens when both functions 
allocate memory without normal thread locking?  i.e.:

char(invariant)[] x = y() ~ z();  // both y and z are pure

Since y and z are pure, they should be able to run simultaneously on two 
cores.  What happens when they both try to allocate memory?

 The function you've written is nondeterministic, but only due to the 
 nondeterminism inherent in memory allocation.  It would be interesting to 
 see if it's possible to write a function that acheives nondeterminism 
 *without* either memory allocation or reading global state (e.g. rand()). 
 I believe it's not.

I don't follow... Isn't this a pure function?  no global state read/memory 
allocation:

int add(int x, int y)
{
   return x + y;
}

Basically, you can do pure functions only on the stack, but the issue 
becomes returning a variable length data structure, such as a string.  I 
think you can't do it unless memory allocation is given a pass.  However, 
then all new operator overrides must be allowed to be pure.

Perhaps you give a single memory allocation routine a free pass, and code 
that deep in the bowels of D, making sure it is thread-safe.  Then all other 
memory allocation routines can declare themselves pure if they use that 
memory allocation routine as a base.

I'm sure Walter has a solution in mind here...

 If that is the case, then the only potential source of nondeterminism in 
 pure functions is memory allocation, and I don't think we need to really 
 worry about that too much.  After all, if we've run out of memory, we've 
 likely got bigger problems than our pure functions becoming 
 nondeterministic. :)

I think exception catching should be outlawed in pure functions.  This makes 
the pure function deterministic iff it does not throw an exception.  You can 
handle cases like divide by zero just by checking if the divisor is 0.

-Steve

Sep 18 2007

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Steven Schveighoffer" wrote
 "Nathan Reed" wrote
 It would be interesting to see if it's possible to write a function that 
 acheives nondeterminism *without* either memory allocation or reading 
 global state (e.g. rand()). I believe it's not.

 I don't follow... Isn't this a pure function?  no global state read/memory 
 allocation:

 int add(int x, int y)
 {
   return x + y;
 }

Totally misread your challenge.  I thought you said impossible to write a 
function that achieves *determinism*.  Sorry :)

I think you are right.  I think the challenge can be reduced to just reading 
global state, as allocating memory is a subset of that.

-Steve

Sep 18 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Steven Schveighoffer wrote:
 Since y and z are pure, they should be able to run simultaneously on two 
 cores.  What happens when they both try to allocate memory?

The allocator has to deal with allocation requests coming from multiple 
threads/cores anyway.  This case presents no additional problem.

 I think exception catching should be outlawed in pure functions.  This makes 
 the pure function deterministic iff it does not throw an exception.  You can 
 handle cases like divide by zero just by checking if the divisor is 0.

Exception throwing and catching is deterministic too as long as the 
condition that generates the exception is (which dividing by zero would 
be).  I think it is perfectly reasonable to allow pure functions to 
throw and catch exceptions.

Sep 18 2007

0ffh <spam frankhirsch.net> writes:

Nathan Reed wrote:
 Exception throwing and catching is deterministic too as long as the 
 condition that generates the exception is (which dividing by zero would 
 be).  I think it is perfectly reasonable to allow pure functions to 
 throw and catch exceptions.

We might even want go all the way, and *require* a pure function to throw
an exception if it detects that functional purity has been compromised
like in Janice's out-of-memory example.

Regards, frank

Sep 18 2007

Bruce Adams <tortoise_74 yeah.who.co.uk> writes:

0ffh Wrote:

 Nathan Reed wrote:
 Exception throwing and catching is deterministic too as long as the 
 condition that generates the exception is (which dividing by zero would 
 be).  I think it is perfectly reasonable to allow pure functions to 
 throw and catch exceptions.

 
 We might even want go all the way, and *require* a pure function to throw
 an exception if it detects that functional purity has been compromised
 like in Janice's out-of-memory example.
 
 Regards, frank

My understanding is that pure is a compile time concept. If your function can
violate it its not pure, unless you have a scheme for compile time exception
handling - yeauch.

Bruce.

Sep 18 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Bruce Adams wrote:
 My understanding is that pure is a compile time concept. If your function can
violate it its not pure, unless you have a scheme for compile time exception
handling - yeauch.

Pure functions aren't necessarily /evaluated/ at compile time.  They can 
throw exceptions when evaluated at run-time, just like any function.

However, even for functions evaluated at compile time I can't see that 
throwing and catching exceptions would be that big a deal.  CTFE is 
basically like running an interpreter, right?

Thanks,
Nathan Reed

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:
 CTFE is
 basically like running an interpreter, right?

I presume that stands for Compile Time Function Execution and not
chlorotrifluoroethylene, right? :-)

My feeling is that Ingo is basically right, in that Walter needs to
assume that new can never fail - that it can successfully allocate
infinite amounts of memory without causing trouble. If that assumption
holds (...obviously it doesn't, but bear with me...) then new is
allowed. Otherwise the simplest things would be banned: a function
which does toString (e.g. that turns 42 into "42") would be banned; a
function which adds a passed-in element to a passed-in associative
array would be banned, and so on. I guess it just has to be allowed.

But I still have a problem with it, because such a program is only
"logically pure", not "physically pure". And I have to wonder, will be
be allowed to write our own "logically pure" functions and declare
them pure?

How will custom allocators for "new" fit into the picture? A custom
allocator must surely access global state? A custom allocator could be
written to be (logically) pure, or it could be decidedly impure (for
example by initialising the created variables with rand()).

Will purity be transitive and as strictly enforced as const? Will we
be able to cast away purity when we need to?

I guess we'll wait and see, but I would surely like to know.

Sep 19 2007

Robert Fraser <fraserofthenight gmail.com> writes:

Someone probably already mentioned this, but you can't (easily, i.e. maybe some
sort of heap partitioning could do it but you'd need OS support for that)
automatically parallelize functions that allocate memory (since the heap is a
shared resource), which was one of the big selling points of "pure". So I'm
assuming that allocations are out of the equation. And since exceptions are
most often paired up with an allocation (although not necessarily), those might
be out, too. Not totally sure about that one.

There's no reason stack-allocated/scope classes couldn't be used, though.

Janice Caron Wrote:

 On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:
 CTFE is
 basically like running an interpreter, right?

 
 I presume that stands for Compile Time Function Execution and not
 chlorotrifluoroethylene, right? :-)
 
 My feeling is that Ingo is basically right, in that Walter needs to
 assume that new can never fail - that it can successfully allocate
 infinite amounts of memory without causing trouble. If that assumption
 holds (...obviously it doesn't, but bear with me...) then new is
 allowed. Otherwise the simplest things would be banned: a function
 which does toString (e.g. that turns 42 into "42") would be banned; a
 function which adds a passed-in element to a passed-in associative
 array would be banned, and so on. I guess it just has to be allowed.
 
 But I still have a problem with it, because such a program is only
 "logically pure", not "physically pure". And I have to wonder, will be
 be allowed to write our own "logically pure" functions and declare
 them pure?
 
 How will custom allocators for "new" fit into the picture? A custom
 allocator must surely access global state? A custom allocator could be
 written to be (logically) pure, or it could be decidedly impure (for
 example by initialising the created variables with rand()).
 
 Will purity be transitive and as strictly enforced as const? Will we
 be able to cast away purity when we need to?
 
 I guess we'll wait and see, but I would surely like to know.

Sep 19 2007

Brad Roberts <braddr puremagic.com> writes:

Yes you can.  The malloc subsystem out of practical necessity is 
internally thread safe and can be used concurrently without concern.

Robert Fraser wrote:
 Someone probably already mentioned this, but you can't (easily, i.e. maybe
some sort of heap partitioning could do it but you'd need OS support for that)
automatically parallelize functions that allocate memory (since the heap is a
shared resource), which was one of the big selling points of "pure". So I'm
assuming that allocations are out of the equation. And since exceptions are
most often paired up with an allocation (although not necessarily), those might
be out, too. Not totally sure about that one.
 
 There's no reason stack-allocated/scope classes couldn't be used, though.
 
 Janice Caron Wrote:
 
 On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:
 CTFE is
 basically like running an interpreter, right?

 I presume that stands for Compile Time Function Execution and not
 chlorotrifluoroethylene, right? :-)

 My feeling is that Ingo is basically right, in that Walter needs to
 assume that new can never fail - that it can successfully allocate
 infinite amounts of memory without causing trouble. If that assumption
 holds (...obviously it doesn't, but bear with me...) then new is
 allowed. Otherwise the simplest things would be banned: a function
 which does toString (e.g. that turns 42 into "42") would be banned; a
 function which adds a passed-in element to a passed-in associative
 array would be banned, and so on. I guess it just has to be allowed.

 But I still have a problem with it, because such a program is only
 "logically pure", not "physically pure". And I have to wonder, will be
 be allowed to write our own "logically pure" functions and declare
 them pure?

 How will custom allocators for "new" fit into the picture? A custom
 allocator must surely access global state? A custom allocator could be
 written to be (logically) pure, or it could be decidedly impure (for
 example by initialising the created variables with rand()).

 Will purity be transitive and as strictly enforced as const? Will we
 be able to cast away purity when we need to?

 I guess we'll wait and see, but I would surely like to know.

Sep 19 2007

Robert Fraser <fraserofthenight gmail.com> writes:

Heh, never knew that; that's good news. Thanks for brightening up my day!

Brad Roberts Wrote:

 Yes you can.  The malloc subsystem out of practical necessity is 
 internally thread safe and can be used concurrently without concern.
 
 Robert Fraser wrote:
 Someone probably already mentioned this, but you can't (easily, i.e. maybe
some sort of heap partitioning could do it but you'd need OS support for that)
automatically parallelize functions that allocate memory (since the heap is a
shared resource), which was one of the big selling points of "pure". So I'm
assuming that allocations are out of the equation. And since exceptions are
most often paired up with an allocation (although not necessarily), those might
be out, too. Not totally sure about that one.
 
 There's no reason stack-allocated/scope classes couldn't be used, though.
 
 Janice Caron Wrote:
 
 On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:
 CTFE is
 basically like running an interpreter, right?

 I presume that stands for Compile Time Function Execution and not
 chlorotrifluoroethylene, right? :-)

 My feeling is that Ingo is basically right, in that Walter needs to
 assume that new can never fail - that it can successfully allocate
 infinite amounts of memory without causing trouble. If that assumption
 holds (...obviously it doesn't, but bear with me...) then new is
 allowed. Otherwise the simplest things would be banned: a function
 which does toString (e.g. that turns 42 into "42") would be banned; a
 function which adds a passed-in element to a passed-in associative
 array would be banned, and so on. I guess it just has to be allowed.

 But I still have a problem with it, because such a program is only
 "logically pure", not "physically pure". And I have to wonder, will be
 be allowed to write our own "logically pure" functions and declare
 them pure?

 How will custom allocators for "new" fit into the picture? A custom
 allocator must surely access global state? A custom allocator could be
 written to be (logically) pure, or it could be decidedly impure (for
 example by initialising the created variables with rand()).

 Will purity be transitive and as strictly enforced as const? Will we
 be able to cast away purity when we need to?

 I guess we'll wait and see, but I would surely like to know.

Sep 19 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

 Robert Fraser wrote:

 Someone probably already mentioned this, but you can't (easily, i.e. 
 maybe some sort of heap partitioning could do it but you'd need OS 
 support for that) automatically parallelize functions that allocate 
 memory (since the heap is a shared resource), which was one of the
 big selling points of "pure". So I'm assuming that allocations are
 out of the equation. And since exceptions are most often paired up
 with an allocation (although not necessarily), those might be out,
 too. Not totally sure about that one.

 There's no reason stack-allocated/scope classes couldn't be used,
 though.

 Brad Roberts Wrote:
 
 Yes you can.  The malloc subsystem out of practical necessity is 
 internally thread safe and can be used concurrently without concern.


 Robert Fraser wrote:
 Heh, never knew that; that's good news. Thanks for brightening up my day!


On the other hand the support for concurrency may consist of little more 
than a global mutex.  That means that the performance benefits of 
launching multiple threads might not be so great since they'll all just 
end up in a queued waiting for their turn to malloc.

I have no idea what they do in practice though.  Do typical malloc 
systems actually allow simultaneous allocations in multiple threads?

--bb

Sep 19 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Bill Baxter wrote:
 On the other hand the support for concurrency may consist of little more 
 than a global mutex.  That means that the performance benefits of 
 launching multiple threads might not be so great since they'll all just 
 end up in a queued waiting for their turn to malloc.
 
 I have no idea what they do in practice though.  Do typical malloc 
 systems actually allow simultaneous allocations in multiple threads?
 
 --bb

On a multicore system the allocator might conceivably have a heap for 
each core or something like that.

However, even if this is not the case, allocation in a garbage-collected 
language is very quick and the mutex would only need to be held for the 
duration of the allocation.

With a pure function that does some allocation and then a bunch of 
computation, all the computations could proceed in parallel once each 
thread's memory had been allocated.  Synchronized allocations certainly 
need not negate the benefits of parallelism.

Thanks,
Nathan Reed

Sep 19 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Nathan Reed wrote:
 Bill Baxter wrote:
 On the other hand the support for concurrency may consist of little 
 more than a global mutex.  That means that the performance benefits of 
 launching multiple threads might not be so great since they'll all 
 just end up in a queued waiting for their turn to malloc.

 I have no idea what they do in practice though.  Do typical malloc 
 systems actually allow simultaneous allocations in multiple threads?

 --bb

 
 On a multicore system the allocator might conceivably have a heap for 
 each core or something like that.
 
 However, even if this is not the case, allocation in a garbage-collected 
 language is very quick and the mutex would only need to be held for the 
 duration of the allocation.

I've heard this claim a couple of times recently (allocation is fast in 
GC languages).  I don't think it's true, though, at least not of D. 
Maybe it's a distortion of the the claim that *overall*, memory 
management in a GC language can be faster than explicit management.  But 
that's not because allocations are cheaper.  It's because you don't have 
to keep deleting things one by one as they go out of scope.  Instead, a 
bunch of out-of-scope objects all get reclaimed together during a GC 
sweep.  But the GC sweep takes place when you do an allocation, so in 
addition to the normal work necessary to do an allocation (finding a 
contiguous block of free memory that's big enough) you also have to do a 
full mark-sweep of all reachable memory.  I don't see how 
mark+sweep+allocate can be faster than just allocate.

[Rereading that now, though, it doesn't make much sense to me -- if 
there's a big gain to be had just by not explicitly freeing memory until 
the next allocation attempt, then regular malloc implementations could 
do that too, queuing up the free() requests until the next malloc call.]

The 'fast allocation' claim may be true of generational GCs or other 
fancy GCs, but D currently has a mark&sweep type.

Sean and other folks more in the know about GC -- please feel free to 
correct me on any of this.

--bb

Sep 19 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Bill Baxter wrote:
 The 'fast allocation' claim may be true of generational GCs or other 
 fancy GCs, but D currently has a mark&sweep type.

Yes, AFAIK the 'fast allocation' is mostly true of garbage collectors 
that move objects that aren't collected, since they have all free memory 
in a contiguous block and can just increment a pointer.

However, I can see /deallocation/ being faster with a GC since the cost 
of explicit deallocation isn't just limited to the free/delete call; it 
also often incurs the cost of tracking *when* to call them (which often 
means using things like reference counts, which in a multithreaded app 
need a mutex...).
Also, a GC gets to postpone memory deallocation until it's actually 
needed, which can sometimes mean they don't have to do it at all (if 
your program doesn't allocate all that much).

Note: I'm not saying using a GC is necessarily faster (since I don't 
have the numbers), just that I can see how it *could* be.

Of course, there's also the programmer convenience factor :).

Sep 19 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Bill Baxter wrote:
 The 'fast allocation' claim may be true of generational GCs or other 
 fancy GCs, but D currently has a mark&sweep type.

I didn't know that; I assumed D was using at least a copying collector. 
  You're right that allocation wouldn't be any faster with a 
mark-and-sweep collector than in a language with explicit memory 
management.  D /could/ theoretically use a copying collector or 
generational collector though; as far as I know there's no barrier 
except writing the code for it.

Thanks,
Nathan Reed

Sep 19 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Nathan Reed wrote:
 Bill Baxter wrote:
 The 'fast allocation' claim may be true of generational GCs or other 
 fancy GCs, but D currently has a mark&sweep type.

 
 I didn't know that; I assumed D was using at least a copying collector. 
  You're right that allocation wouldn't be any faster with a 
 mark-and-sweep collector than in a language with explicit memory 
 management.  D /could/ theoretically use a copying collector or 
 generational collector though; as far as I know there's no barrier 
 except writing the code for it.

The big blocker to any sort of copying/moving collector is that it needs 
to know which bytes represent pointers and which don't.

This would require the compiler to emit extra type information it 
currently doesn't. In particular:
* ClassInfo.offTi[] would need to actually be filled (IIRC it's always 
an empty array currently)
* Similar data is needed for stack frames. AFAIK there's currently no 
way to communicate this.
* Pointer data for structs, delegates, array types, AAs etc. would also 
need to be provided (but could be "inlined" into the containing 
class/stackframe information)
* Heap objects would need some kind of tag to indicate how to find 
pointers in them. For classes it would be easy, just get 
.classinfo.offTi through the vtable ptr, but for example structs and 
array data can also be heap-allocated and don't have a vtable ptr. In 
fact, for arrays it isn't even constant (though a 'repeat x times to 
visit all allocated data' marker could be used).

The problem that requires all that is the fact that when you move a heap 
cell, you need to update all (reachable) pointers/references to it (but 
not non-pointer data that happens to *look* like a pointer). So the GC 
can only do this when it is sure it can update all pointers to it, and 
*only* pointers to it.
A GC could of course refrain from moving objects if it's not sure 
everything that looks like a pointer to it actually is one, but that 
would negate at least some of the benefit of a moving collector.

(An interesting idea might be to try and use debugging information as 
the source of type information. However, even though that might work for 
stack data it could be harder for heap data since all you've got to go 
on is the declared types of the pointers to it; and void*s could point 
to anything. Besides, requiring debug info for the GC to work well is a 
bit of a hack)

Another issue is that sometimes the compiler simply doesn't *know* 
whether something is a pointer or not. Consider:
* You can allocate void[]s, which explicitly *may* contain pointers but 
don't necessarily consist of only pointers.
* Unions of pointer and non-pointer types
* Code that's linked in could be compiled by another compiler entirely 
(for instance, a library written in C). Currently using these just 
requires that you keep a pointer where the GC can find it, but if you 
need to find all pointers to heap objects you'd need non-D code to 
declare where exactly they keep their pointers... (Assuming it can't be 
inferred from sources such as debug info)

Sep 19 2007

0ffh <spam frankhirsch.net> writes:

Janice Caron wrote:
 Is this function pure?
 
 int probablyNotPure(int x)
 {
 	try
 	{
 		char[] a = new char[x];
 		return a.length;
 	}
 	catch
 	{
 		return 0;
 	}
 }

Actually, in spite of all the "Not quite sure, probably not." replies
you had so far, I want to state boldly: Yes, sure, why not? :-)

Scared by memory allocation changing the state of the GC?
You may give a damn: When the function leaves, the dynamic array is
dead, and thus not part of the state any more.

Functional purity has only these conditions:

* The result is a function of the parameters, and only of the parameters.
* Variables can be created and duplicated, but not changed.

I don't see your example violating these conditions.

Regards, frank

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/18/07, 0ffh <spam frankhirsch.net> wrote:
 Janice Caron wrote:
 Is this function pure?

 int probablyNotPure(int x)
 {
       try
       {
               char[] a = new char[x];
               return a.length;
       }
       catch
       {
               return 0;
       }
 }

 Functional purity has only these conditions:

 * The result is a function of the parameters, and only of the parameters.
 * Variables can be created and duplicated, but not changed.

 I don't see your example violating these conditions.

You don't? The way I see it, if the input is 5, then output might
sometimes be 5 and sometimes be 0. That would seem to violate your
first bullet-point, would it not?

Sep 18 2007

0ffh <spam frankhirsch.net> writes:

Janice Caron wrote:
 You don't? The way I see it, if the input is 5, then output might
 sometimes be 5 and sometimes be 0. That would seem to violate your
 first bullet-point, would it not?

That's surely a point which can be discussed:

I could, for example, say: A result of 0 happens only in case of an
error condition. Sure, you loose functional purity then, but there's
just nothing you can do about that. (Actually, in *this* special case
you could do something about it, but not in the general case.)

Your function could return 23, because a lump of radiation hit the
stack memory and flipped two bits. That's just to bad, but that's it.

Therefore: In case of errors, there's an exception!
(And, in this case, a double entendre. :-)

Regards, frank

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/18/07, 0ffh <spam frankhirsch.net> wrote:
 Therefore: In case of errors, there's an exception!
 (And, in this case, a double entendre. :-)

I don't think it's that simple. Exceptions should be allowed as flow
control structures even in pure code, and don't necessarily represent
an error. For example, a (pure) divide function might legitimately
throw a divide-by-zero exception, and then a (also pure) fixed-point
saturating divide function might legitimately call the first divide
function, catch the exception, and return the saturation value. None
of that violates purity.

I don't think you can ban try/catch in pure code, nor assume that
exception==error.

Of course, only Walter can tell us the /real/ (as in,
will-be-implemented) answer!

Sep 18 2007

Ingo Oeser <ioe-news rameria.de> writes:

Janice Caron wrote:

 Is this function pure?

Yes.

 int probablyNotPure(int x)
 {
       try
       {
               char[] a = new char[x];
               return a.length;
       }
       catch
       {
               return 0;
       }
 }

A valid optimisation is to assume that x -> a.length, since a.length
is directly derived from x. If you don't use (e.g. reference) the memory
you are allocating, the compiler should be free to optimise the allocation
away. The catch will be dead code, but the function is also valid,
with the catch being dead code, since that would happen with infinite
storage or a GC deleting unreachable reference (like D has :-) ).

 I must assume not, since it's not deterministic. But which part of it
 wrong? Should "new" be forbidden? Or is "try/catch"? Or both?

It can be optimised into a deterministic function without any algorithmic
side effect. So that function actually IS deterministic and thus pure :-)


Best Regards

Ingo Oeser

Sep 18 2007

"Janice Caron" <caron800 googlemail.com> writes:

On 9/18/07, Ingo Oeser <ioe-news rameria.de> wrote:
 A valid optimisation is to assume that x -> a.length, since a.length
 is directly derived from x. If you don't use (e.g. reference) the memory
 you are allocating, the compiler should be free to optimise the allocation
 away.

OK, so change it to

char[] probablyNotPure(int x)
{
      try
      {
              char[] a = new char[x];
              return a;
      }
      catch
      {
              return null;
      }
}

and consider that the input might be 0x7FFFFFFF.

I think it looks less easy to optimise now.

Sep 18 2007

Ingo Oeser <ioe-news rameria.de> writes:

Janice Caron wrote:

 OK, so change it to
 
 char[] probablyNotPure(int x)
 {
       try
       {
               char[] a = new char[x];
               return a;
       }
       catch
       {
               return null;
       }
 }
 
 and consider that the input might be 0x7FFFFFFF.
 
 I think it looks less easy to optimise now.

Ok, now that's impure, since now you are using the memory by returning it.
This is really not side effect free. You are draining a global resource,
which is a no-no for pure functions. All the discussions and points
raised by the other people now apply.


Best Regards

Ingo Oeser

Sep 18 2007

0ffh <spam frankhirsch.net> writes:

Ingo Oeser wrote:
 Janice Caron wrote:
 
 OK, so change it to

 char[] probablyNotPure(int x)
 [...]
 I think it looks less easy to optimise now.

 
 Ok, now that's impure, since now you are using the memory by returning it.
 This is really not side effect free. You are draining a global resource,
 which is a no-no for pure functions. All the discussions and points
 raised by the other people now apply.

I am afraid I must disagree.
We're talking functional purity in a programming context, not maths.
It is perfectly allowable in a purely functional language to return
a pointer to a heap allocated data structure.

Regards, frank

Sep 18 2007

Ingo Oeser <ioe-news rameria.de> writes:

0ffh wrote:
 I am afraid I must disagree.
 We're talking functional purity in a programming context, not maths.
 It is perfectly allowable in a purely functional language to return
 a pointer to a heap allocated data structure.

Sorry, mixed that up with const functions.

Those (const functions) only depend on their parameters 
and have no effect besides their return value.

At least GCC defines them this way.

Sorry again.

Best Regards

Ingo Oeser

Sep 18 2007

Nathan Reed <nathaniel.reed gmail.com> writes:

Ingo Oeser wrote:
 Ok, now that's impure, since now you are using the memory by returning it.
 This is really not side effect free. You are draining a global resource,
 which is a no-no for pure functions. All the discussions and points
 raised by the other people now apply.

I think if you take this point of view, you have to conclude that even a 
function that allocates memory that becomes garbage by the end of the 
function is strictly not side-effect-free, since now some other code 
that is executing simultaneously with our "pure" function could run out 
of memory, where it wouldn't have otherwise.  In other words, memory 
allocation in a pure function would have to be totally disallowed, 
drastically reducing the usefulness of pure functions.

In fact, following the same line of reasoning, the space consumed by the 
*stack frame* of a pure function would make it strictly not 
side-effect-free.

Memory allocation shouldn't be an impediment to making a function pure. 
  If we actually run out of memory, we've got bigger problems than 
breaking the purity of the functions.

Thanks,
Nathan Reed

Sep 18 2007

D Programming

C/C++ Programming

Other

digitalmars.D - Is this function pure?