digitalmars.D - assumeSafeAppend and purity

Jonathan M Davis <jmdavisProg gmx.com> writes:

At present, assumeSafeAppend isn't pure - nor is capacity or reserve. AFAIK, 
none of them access any global variables aside from GC-related stuff (and new 
is already allowed in pure functions). All it would take to make them pure is 
to mark the declarations for the C functions that they call pure (and those 
functions aren't part of the public API) and then mark them as pure. Is there 
any reason why this would be a _bad_ idea?
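
To illustrate the point about new, here is a minimal sketch showing that GC allocation and appending already compile inside a pure function. The name firstSquares is made up for this example and is not from druntime or Phobos.

```d
// Sketch: GC allocation via `new` and `~=` is accepted in pure code.
// `firstSquares` is a hypothetical name used only for illustration.
pure int[] firstSquares(size_t n)
{
    auto result = new int[](0);         // `new` is allowed in a pure function
    foreach (i; 0 .. n)
        result ~= cast(int)(i * i);     // appending may allocate through the GC
    return result;
}

void main()
{
    assert(firstSquares(4) == [0, 1, 4, 9]);
}
```

The question in this post is whether capacity, reserve and assumeSafeAppend, which touch the same GC bookkeeping, deserve the same treatment.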

Appender runs into similar difficulties. Would it make sense to just mark the 
various memory-related functions in core.memory as pure (or at least some 
subset of them)? The fact that they aren't pure, even though they deal with 
memory just like new does, makes it really hard to both use pure and 
optimize code in a number of cases - especially when dealing with arrays. We 
might also want to just mark malloc as pure for the same reason.

What are the downsides to doing this? It probably wouldn't be a good idea to 
mark functions like free pure, but the ones that involve allocating or 
reallocating memory seem like good candidates for it.

- Jonathan M Davis

Feb 06 2012

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tuesday, 7 February 2012 at 01:47:12 UTC, Jonathan M Davis 
wrote:
> At present, assumeSafeAppend isn't pure - nor is capacity or 
> reserve. AFAIK, none of them access any global variables aside 
> from GC-related stuff (and new is already allowed in pure 
> functions). All it would take to make them pure is to mark the 
> declarations for the C functions that they call pure (and those 
> functions aren't part of the public API) and then mark them as 
> pure. Is there any reason why this would be a _bad_ idea?

pure void f(const(int)[] arr)
{
	debug // bypass the purity check to pretend assumeSafeAppend is pure
	{
		assumeSafeAppend(arr);
	}
	arr ~= 42;
}

void main()
{
	int[] arr = [0, 1, 2, 3, 4];
	f(arr[1..$-1]);
	assert(arr[4] == 4, "f has a side effect");
}
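
For contrast, here is a minimal sketch of the same append without assumeSafeAppend. It assumes the usual druntime behaviour that a slice which does not end at the end of the block's used data reports a capacity of 0 and is therefore reallocated on append. The function name is invented for illustration.

```d
// Sketch: without assumeSafeAppend, appending to a middle slice reallocates,
// so the element just past the slice in the original array is left alone.
// `appendLeavesOriginalIntact` is a hypothetical name.
bool appendLeavesOriginalIntact()
{
    int[] arr = [0, 1, 2, 3, 4];
    int[] slice = arr[1 .. $ - 1];      // elements 1, 2, 3
    assert(slice.capacity == 0);        // cannot be appended to in place
    slice ~= 42;                        // copies into a fresh block
    return arr[4] == 4 && slice == [1, 2, 3, 42];
}

void main()
{
    assert(appendLeavesOriginalIntact());
}
```

Calling assumeSafeAppend on the slice first tells the runtime that nothing valid lives past it, which is what lets the append in f overwrite arr[4].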

Feb 06 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, February 07, 2012 02:54:40 Vladimir Panteleev wrote:
> pure void f(const(int)[] arr)
> {
> 	debug // bypass the purity check to pretend assumeSafeAppend is pure
> 	{
> 		assumeSafeAppend(arr);
> 	}
> 	arr ~= 42;
> }
> 
> void main()
> {
> 	int[] arr = [0, 1, 2, 3, 4];
> 	f(arr[1..$-1]);
> 	assert(arr[4] == 4, "f has a side effect");
> }

Except that assumeSafeAppend was misused. It's dangerous when you don't 
use it properly, regardless of purity. By its very nature, it can screw stuff 
up. The question is what to do when you use it _correctly_ and want to use it 
in a pure function. If used properly, assumeSafeAppend has no effect aside 
from avoiding potential reallocations. Should it be made pure, because 
as long as you're using it properly it's not a problem (and it's always a 
problem if you misuse it - regardless of purity)? Or should the caller be 
forced to cast it to pure to use it in a pure function?

Given how ugly having to deal with the casting is and the fact that misusing 
assumeSafeAppend results in very broken code anyway, I'd be inclined to just 
mark it as pure.
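
For readers who haven't seen the casting in question, this is a rough sketch of what it looks like, assuming assumeSafeAppend is not marked pure. The helper names shrinkImpure and shrinkAsPure are hypothetical, and the capacity checks in main assume the usual GC block sizes.

```d
// Sketch of calling an impure function from pure code via a function-pointer
// cast. The compiler takes the cast on trust, so correctness is on the caller.
// `shrinkImpure` and `shrinkAsPure` are hypothetical names.
void shrinkImpure(ref int[] arr)
{
    arr.assumeSafeAppend();
}

void shrinkAsPure(ref int[] arr) pure
{
    alias PureFn = void function(ref int[]) pure;
    auto fn = cast(PureFn) &shrinkImpure;   // pretend the function is pure
    fn(arr);
}

void main()
{
    auto arr = new int[](8);
    arr = arr[0 .. 4];
    assert(arr.capacity == 0);      // valid data still follows the slice
    shrinkAsPure(arr);
    assert(arr.capacity >= 8);      // the block is appendable in place again
}
```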

- Jonathan M Davis

Feb 06 2012

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tuesday, 7 February 2012 at 02:02:22 UTC, Jonathan M Davis 
wrote:
> Except that assumeSafeAppend was misused. It's dangerous to use 
> when you don't use it properly regardless of purity. By its 
> very nature, it can screw stuff up.

When reviewing @safe or pure code, there is inevitably a list of 
language features that reviewers need to be aware of because they 
bypass the guarantees those attributes provide: for example, 
assumeUnique, calling @trusted functions, or faux-pure functions 
which may lead to side effects. The question is how big we 
want to let this list grow.

It's easy to imagine assumeSafeAppend being misused due to a bug, 
with the source of the bug "hidden out of sight" because it 
happens inside a pure function. Personally, I don't use 
assumeSafeAppend often enough to justify a potential 
headache later on.

Feb 06 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 Feb 2012 21:18:21 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

> When reviewing @safe or pure code, there is inevitably a list of  
> language features that reviewers need to be aware of as bypassing the  
> guarantees that said language features provide, for example  
> assumeUnique, calling @trusted functions, or faux-pure functions which  
> may lead to side effects. It's a question of how big do we want to let  
> this list grow.
>
> The situation where assumeSafeAppend may be misused due to a bug, but  
> the source of the bug is "hidden out of sight" because it happens inside  
> a pure function, is imaginable. Personally, I never use assumeSafeAppend  
> often enough to justify a potential headache later on.

By the definition of assumeSafeAppend, using it and then using data in  
the now 'unallocated' space results in undefined behavior.  It should  
definitely not be marked @safe or @trusted, but pure should be ok.

You can also do this in a pure function without issue:

pure void crap(int *data) {*--data = 5;}

Which might or might not be valid, depending on the context.
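
As a sketch of a context where that write is valid, here is the same one-liner under a made-up name (setPrevious), called with a pointer that has a valid element in front of it. The names setPrevious and demo are invented for illustration.

```d
// Sketch: a pure function may write through a mutable pointer parameter.
// Whether the write is valid depends entirely on what the caller passes.
// `setPrevious` and `demo` are hypothetical names.
pure void setPrevious(int* data) { *--data = 5; }

int[3] demo()
{
    int[3] buf = [1, 2, 3];
    setPrevious(&buf[1]);       // valid here: the previous element is buf[0]
    return buf;
}

void main()
{
    auto r = demo();
    assert(r == [5, 2, 3]);
}
```

Passing &buf[0] instead would compile just as well and write out of bounds, which is the point: pure says nothing about that.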

@safe != pure, and at some point, even compiler guarantees cannot  
guarantee validity.

At the very least, however, reserve and capacity should be pure.

-Steve

Feb 06 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 Feb 2012 21:28:49 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

> by the definition of assumeSafeAppend, using it, and then using data in  
> the now 'unallocated' space results in undefined behavior.  It should  
> definitely not be marked @safe or @trusted, but pure should be ok.

I thought of a better solution:

pure T[] pureSafeShrink(T)(ref T[] arr, size_t maxLength)
{
    if(maxLength < arr.length)
    {
        bool safeToShrink = (arr.capacity == arr.length);
        arr = arr[0..maxLength];
        if(safeToShrink) arr.assumeSafeAppend(); // must work around purity here
    }
    return arr;
}

This guarantees that you only affect data you were passed.

-Steve

Feb 06 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, February 06, 2012 21:40:55 Steven Schveighoffer wrote:
> I thought of a better solution:
> 
> pure T[] pureSafeShrink(T)(ref T[] arr, size_t maxLength)
> {
>     if(maxLength < arr.length)
>     {
>         bool safeToShrink = (arr.capacity == arr.length);
>         arr = arr[0..maxLength];
>         if(safeToShrink) arr.assumeSafeAppend(); // must work around purity here
>     }
>     return arr;
> }
> 
> This guarantees that you only affect data you were passed.

Does it really? What if I did this:

auto arr = new int[](63);
auto saved = arr;
assert(arr.capacity == 63);
assert(saved.capacity == 63);
pureSafeShrink(arr, 0);

This happens to pass on my computer, though the exact value required for the 
length will probably vary. So, a slice of the data which is now supposed to be 
no longer part of any array still exists.

Also, if you allocate a new array and then immediately try to shrink 
it with pureSafeShrink, it will only call assumeSafeAppend if you just so 
happen to have picked a length that lines up with the allocated block size, 
which makes it pretty much useless IMHO. I'm only going to use 
assumeSafeAppend if I _know_ that it's safe. pureSafeShrink is therefore 
trying to protect me when I don't need it and is ruining the guarantees that 
assumeSafeAppend gives me, since it's only better than 
arr = arr[0 .. maxLength]; if the array just so happens to have the same 
length as its capacity.

So, I don't think that this function really buys us anything. I'm inclined to 
just make assumeSafeAppend pure.

- Jonathan M Davis

Feb 06 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 07 Feb 2012 00:35:14 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

> Does it really? What if I did this:
>
> auto arr = new int[](63);
> auto saved = arr;
> assert(arr.capacity == 63);
> assert(saved.capacity == 63);
> pureSafeShrink(arr, 0);
>
> This happens to pass on my computer, though the exact value required for  
> the length will probably vary. So, a slice of the data which is now  
> supposed to be no longer part of any array still exists.

There is a difference between this and the example given by Vladimir.  In  
Vladimir's example, you are passed an array slice of elements 1-3, but the  
assumeSafeAppend affects element 4.  This violates the spirit of pure  
having no side effects, even if it is technically sound.  I am still  
undecided as to whether assumeSafeAppend should be pure or not.

In this case, the function will only affect array elements that it is  
passed.  The fact that you changed something in data you were passed does  
not violate pure rules.

However, I think my test is too strict, it actually should be arr.capacity  
!= 0.  This means that the array ends at the end of valid data (no valid  
data exists beyond the array).
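
A minimal sketch of what the capacity != 0 test distinguishes, assuming the usual druntime rule that capacity is 0 whenever an append would have to reallocate. The helper name endsAtUsedData is invented for illustration.

```d
// Sketch: capacity != 0 holds only for slices that end where the block's
// used data ends, so no valid data lies beyond them.
// `endsAtUsedData` is a hypothetical name.
bool endsAtUsedData(int[] s)
{
    return s.capacity != 0;
}

void main()
{
    auto arr = new int[](5);
    assert(endsAtUsedData(arr));            // the whole array
    assert(endsAtUsedData(arr[2 .. $]));    // a tail slice
    assert(!endsAtUsedData(arr[0 .. 3]));   // elements 3 and 4 lie beyond it
}
```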

-Steve

Feb 08 2012

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tuesday, 7 February 2012 at 01:47:12 UTC, Jonathan M Davis 
wrote:
> At present, assumeSafeAppend isn't pure - nor is capacity or 
> reserve. AFAIK, none of them access any global variables aside 
> from GC-related stuff (and new is already allowed in pure 
> functions). All it would take to make them pure is to mark the 
> declarations for the C functions that they call pure (and those 
> functions aren't part of the public API) and then mark them as 
> pure. Is there any reason why this would be a _bad_ idea?

If precedent means anything, assumeUnique is pure.
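
A minimal sketch of that precedent: std.exception.assumeUnique called from a pure function. The function name makeGreeting is made up for the example.

```d
import std.exception : assumeUnique;

// Sketch: assumeUnique is callable from pure code. It trusts the caller that
// no other mutable reference to the buffer exists.
// `makeGreeting` is a hypothetical name.
pure string makeGreeting(string name)
{
    char[] buf = "hello, ".dup;     // fresh mutable buffer
    buf ~= name;
    return assumeUnique(buf);       // reinterpret as immutable without copying
}

void main()
{
    assert(makeGreeting("D") == "hello, D");
}
```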

Feb 06 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 Feb 2012 21:32:05 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

> If precedent means anything, assumeUnique is pure.

I think there is a difference -- assumeSafeAppend can invalidate data  
that you did not pass to the pure function.

I'm still not sure if it's pure's job to protect data that you can get to  
via pointer arithmetic, but I think the two cases are different.

-Steve

Feb 06 2012

Timon Gehr <timon.gehr gmx.ch> writes:

On 02/07/2012 03:48 AM, Steven Schveighoffer wrote:
> I think there is a difference -- assumeSafeAppend can make invalid data
> that you did not pass to the pure function.
>
> I'm still not sure if it's pure's job to protect data that you can get
> to via pointer arithmetic, but I think the two cases are different.
>
> -Steve

I think both cases are kinda equivalent, but you are right: pure does 
not protect from undefined behaviour nor is it supposed to guarantee 
memory safety.

Feb 07 2012
