digitalmars.D - Forcing weak-pure

Steven Schveighoffer (51/51) Mar 25 2014 There has been an increased effort as of late to make as much as possibl...

monarch_dodra (13/17) Mar 25 2014 Maybe instead of using null, we could use global junk?

Steven Schveighoffer (7/23) Mar 25 2014 I think again, a sufficiently intelligent compiler could see that
Joseph Rushton Wakeling (5/6) Mar 25 2014 That really seems like a nasty workaround which could break at any momen...

Artur Skawina (10/22) Mar 25 2014 Functions returning 'void' and w/o mutable args cannot be logically pure...

Steven Schveighoffer (17/46) Mar 25 2014 This could not be elided, because you are using the return value. It wou...

Artur Skawina (23/55) Mar 25 2014 It's ok to treat allocator and factory functions as pure, because those ...

Steven Schveighoffer (9/74) Mar 25 2014 You what I wasn't thinking of? parallelization of pure function calls!
Kagamin (3/8) Mar 26 2014 Allocators don't return unique results, GC being the primary

Artur Skawina (11/17) Mar 26 2014 I'm not really sure what your point is. I was saying that "allocator fun...
Steven Schveighoffer (6/13) Mar 26 2014 I think you misunderstand. The result is uniquely referenced, meaning no...

Artur Skawina (11/15) Apr 05 2014 That was wrong; treating these funcs as pure won't work. While allocator...

Steven Schveighoffer (5/22) Apr 07 2014 I think this still works, even if makeS is pure, since it would be

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

There has been an increased effort as of late to make as much as possible  
inside the runtime  safe, pure, and nothrow. The goal is to make it so  
it's not as nasty writing such code, because the library will allow it.  
Everything that we mark properly down below will allow higher-level  
constructs to be marked as well.

Of course naturally, with all things this low-level, we have to make  
decisions as to what is *provably*  safe/pure/nothrow, and what is  
*logically*  safe/pure/nothrow. The most obvious example is memory  
allocation -- It is technically not pure as it accesses global state (the  
heap), but we can rationalize making it pure since we have more logical  
knowledge of the situation than the compiler (i.e. locking the global  
heap, and realizing that if a block is unallocated, it is uniquely  
accessed with the allocation of it).

One issue, however, is modification of heap metadata for immutable data.  
The D compiler has the (correct) rationalization that for a pure function  
that has only immutable or implicitly convertible to immutable parameters,  
it is strong-pure. This means, it can optimize away calls to such  
functions. But in the case of modifying heap metadata, the call is  
actually weak-pure, since it is making modifications that you *don't* want  
to optimize away.

For this reason, functions like GC.setAttr and assumeSafeAppend cannot be  
marked pure. For example:

auto str = "hello".idup;
str = str[0..1];
str.assumeSafeAppend();
str ~= "iya";

The compiler could rationally elide the call to assumeSafeAppend if it is  
pure. We are not using the return value, and the only parameter is  
immutable. Since pure functions technically have no side effects, this  
call can be eliminated. A recent compiler change made such calls warnings  
(not using result of strong-pure function). But assumeSafeAppend really  
should be weak-pure, because it does have an effect. In essence, you are  
technically passing to assumeSafeAppend a pointer to the block that  
contains the slice, not the slice itself. And that block is mutable.

GC.setAttr has similar issues.

How can we force these to be weak-pure? One option suggested is to add a  
hidden void * parameter that defaults to null, to force the issue. But I  
think future compilers may be smart enough to realize that can also be a  
strong-pure call.

Another possibility is to pass the array/pointer by reference. This causes  
its own problems with rvalues, and does not help when the array/pointer  
itself is immutable.

Do we need a way (maybe a pragma) to make a function inside druntime  
"forced weak" pure? Is there an option any of you wonderful geniuses out  
there can figure out without changes to the compiler? :)

Some reference material:

https://github.com/D-Programming-Language/druntime/pull/553
https://github.com/D-Programming-Language/druntime/pull/747
https://github.com/D-Programming-Language/phobos/pull/2025
https://github.com/D-Programming-Language/phobos/pull/2046

-Steve

Mar 25 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer 
wrote:
 How can we force these to be weak-pure? One option suggested is 
 to add a hidden void * parameter that defaults to null, to 
 force the issue. But I think future compilers may be smart 
 enough to realize that can also be a strong-pure call.

Maybe instead of using null, we could use global junk?

//Global junk
struct WeakPure{int a;}
__gshared WeakPure dummyWeakPure;


//Signature
void weakPureFun(WeakPure* p = &dummyWeakPure)
{}


//Useage
weakPureFun();



AFAIK, the compiler should NOT be able to infer strong purity 
here. I don't know about performance effects though.

Mar 25 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 25 Mar 2014 10:08:20 -0400, monarch_dodra <monarchdodra gmail.com>  
wrote:

 On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer wrote:
 How can we force these to be weak-pure? One option suggested is to add  
 a hidden void * parameter that defaults to null, to force the issue.  
 But I think future compilers may be smart enough to realize that can  
 also be a strong-pure call.

 Maybe instead of using null, we could use global junk?

 //Global junk
 struct WeakPure{int a;}
 __gshared WeakPure dummyWeakPure;


 //Signature
 void weakPureFun(WeakPure* p = &dummyWeakPure)
 {}


 //Useage
 weakPureFun();



 AFAIK, the compiler should NOT be able to infer strong purity here. I  
 don't know about performance effects though.

I think again, a sufficiently intelligent compiler could see that  
dummyWeakPure would not be used in the template, we likely would have to  
pass it to the underlying opaque function. I'd really like to avoid this  
option, if it's not going to be optimized down to the minimal case.

-Steve

Mar 25 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 25/03/14 15:08, monarch_dodra wrote:
 Maybe instead of using null, we could use global junk?

That really seems like a nasty workaround which could break at any moment 
depending on how the compiler is tweaked. :-(

Better surely to have an explicit indication that a function may not be assumed 
to be strongly pure?

Mar 25 2014

Artur Skawina <art.08.09 gmail.com> writes:

On 03/25/14 14:30, Steven Schveighoffer wrote:
 [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure.
For example:
 
 auto str = "hello".idup;
 str = str[0..1];
 str.assumeSafeAppend();
 str ~= "iya";
 
 The compiler could rationally elide the call to assumeSafeAppend if it is
pure. We are not using the return value, and the only parameter is immutable.
Since pure functions technically have no side effects, this call can be
eliminated. A recent compiler change made such calls warnings (not using result
of strong-pure function). But assumeSafeAppend really should be weak-pure,
because it does have an effect. In essence, you are technically passing to
assumeSafeAppend a pointer to the block that contains the slice, not the slice
itself. And that block is mutable.
 
 GC.setAttr has similar issues.
 
 How can we force these to be weak-pure?

Functions returning 'void' and w/o mutable args cannot be logically pure,
as long as they aren't no-ops, obviously. While this property could be
used to render them "weak-pure" in d-speak, this (or any other approach
to marking them as such) would not be enough...

   // assuming 'assumeSafeAppend()' is "weak-pure":

   string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }   
   
   string a = "abc".idup, b = f(a[0..2]);

artur

Mar 25 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com>  
wrote:

 On 03/25/14 14:30, Steven Schveighoffer wrote:
 [...] functions like GC.setAttr and assumeSafeAppend cannot be marked  
 pure. For example:

 auto str = "hello".idup;
 str = str[0..1];
 str.assumeSafeAppend();
 str ~= "iya";

 The compiler could rationally elide the call to assumeSafeAppend if it  
 is pure. We are not using the return value, and the only parameter is  
 immutable. Since pure functions technically have no side effects, this  
 call can be eliminated. A recent compiler change made such calls  
 warnings (not using result of strong-pure function). But  
 assumeSafeAppend really should be weak-pure, because it does have an  
 effect. In essence, you are technically passing to assumeSafeAppend a  
 pointer to the block that contains the slice, not the slice itself. And  
 that block is mutable.

 GC.setAttr has similar issues.

 How can we force these to be weak-pure?

 Functions returning 'void' and w/o mutable args cannot be logically pure,
 as long as they aren't no-ops, obviously. While this property could be
 used to render them "weak-pure" in d-speak, this (or any other approach
 to marking them as such) would not be enough...

    // assuming 'assumeSafeAppend()' is "weak-pure":

    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }
   string a = "abc".idup, b = f(a[0..2]);

This could not be elided, because you are using the return value. It would  
have to be called at least once.

However, I am OK with it being elided for a second call with the same  
input. In other words, this strong pure function does not have the same  
issues.

But I get what you are saying -- just wrap it again, and we're back to the  
same issue. It is a systematic problem, because the concept is that you  
are not passing mutable references, even though you are then using those  
references to mark global data.

There is no really good answer I suppose. The huge downside of not marking  
assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or  
set attributes inside pure functions that deal with arrays with immutable  
data. For example, the Appender type cannot be marked pure for immutable  
data.

-Steve

Mar 25 2014

Artur Skawina <art.08.09 gmail.com> writes:

On 03/25/14 21:51, Steven Schveighoffer wrote:
 On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com> wrote:
 
 On 03/25/14 14:30, Steven Schveighoffer wrote:
 [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure.
For example:

 auto str = "hello".idup;
 str = str[0..1];
 str.assumeSafeAppend();
 str ~= "iya";

 The compiler could rationally elide the call to assumeSafeAppend if it is
pure. We are not using the return value, and the only parameter is immutable.
Since pure functions technically have no side effects, this call can be
eliminated. A recent compiler change made such calls warnings (not using result
of strong-pure function). But assumeSafeAppend really should be weak-pure,
because it does have an effect. In essence, you are technically passing to
assumeSafeAppend a pointer to the block that contains the slice, not the slice
itself. And that block is mutable.

 GC.setAttr has similar issues.

 How can we force these to be weak-pure?

 Functions returning 'void' and w/o mutable args cannot be logically pure,
 as long as they aren't no-ops, obviously. While this property could be
 used to render them "weak-pure" in d-speak, this (or any other approach
 to marking them as such) would not be enough...

    // assuming 'assumeSafeAppend()' is "weak-pure":

    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }
    string a = "abc".idup, b = f(a[0..2]);

 
 This could not be elided, because you are using the return value. It would
have to be called at least once.
 
 However, I am OK with it being elided for a second call with the same input.
In other words, this strong pure function does not have the same issues.

It's ok to treat allocator and factory functions as pure, because those really
are logically pure, ie can't affect any /visible/ state and return results that
are unique.
'assumeSafeAppend()' does have side effects - it can lead to corruption of
other,
not directly related, data. The only thing that prevents this is a declaration
by the programmer -- "Trust me, I own this data and there are no other live
aliases".
The problem is that this declaration would now happen /implicitly when calling
a (strongly) pure function/.
Note that the 'a' string in the above example could no longer be "abc" after
'f()'
runs.
The "pure" concept (re-)definition in D is bad enough even without having to
worry
about supposedly pure functions stomping on unrelated data.


 But I get what you are saying -- just wrap it again, and we're back to the
same issue. It is a systematic problem, because the concept is that you are not
passing mutable references, even though you are then using those references to
mark global data.
 
 There is no really good answer I suppose. The huge downside of not marking
assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or set
attributes inside pure functions that deal with arrays with immutable data. For
example, the Appender type cannot be marked pure for immutable data.

If there was a version of assumeSafeAppend() marked as pure, then that one could
be used for such cases, where no aliases are allowed to escape before an
ownership
transfer happens. It's much easier to audit a contained implementation (eg
Appender)
than to check the whole program and every caller of every function which relies
on
a potentially unsafe assumption. If 'assumeSafeAppend()' isn't pure /by default/
then D's purity guarantees are at least a little bit stronger.

Or maybe such a change (marking unsafe code as pure and hoping that the
programmer
knows exactly what (s)he's doing) wouldn't really make the situation
significantly
worse than it already is - I'm not sure...

artur

Mar 25 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 25 Mar 2014 19:30:04 -0400, Artur Skawina <art.08.09 gmail.com>  
wrote:

 On 03/25/14 21:51, Steven Schveighoffer wrote:
 On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com>  
 wrote:

 On 03/25/14 14:30, Steven Schveighoffer wrote:
 [...] functions like GC.setAttr and assumeSafeAppend cannot be marked  
 pure. For example:

 auto str = "hello".idup;
 str = str[0..1];
 str.assumeSafeAppend();
 str ~= "iya";

 The compiler could rationally elide the call to assumeSafeAppend if  
 it is pure. We are not using the return value, and the only parameter  
 is immutable. Since pure functions technically have no side effects,  
 this call can be eliminated. A recent compiler change made such calls  
 warnings (not using result of strong-pure function). But  
 assumeSafeAppend really should be weak-pure, because it does have an  
 effect. In essence, you are technically passing to assumeSafeAppend a  
 pointer to the block that contains the slice, not the slice itself.  
 And that block is mutable.

 GC.setAttr has similar issues.

 How can we force these to be weak-pure?

 Functions returning 'void' and w/o mutable args cannot be logically  
 pure,
 as long as they aren't no-ops, obviously. While this property could be
 used to render them "weak-pure" in d-speak, this (or any other approach
 to marking them as such) would not be enough...

    // assuming 'assumeSafeAppend()' is "weak-pure":

    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s;  
 }
    string a = "abc".idup, b = f(a[0..2]);

 This could not be elided, because you are using the return value. It  
 would have to be called at least once.

 However, I am OK with it being elided for a second call with the same  
 input. In other words, this strong pure function does not have the same  
 issues.

 It's ok to treat allocator and factory functions as pure, because those  
 really
 are logically pure, ie can't affect any /visible/ state and return  
 results that
 are unique.
 'assumeSafeAppend()' does have side effects - it can lead to corruption  
 of other,
 not directly related, data. The only thing that prevents this is a  
 declaration
 by the programmer -- "Trust me, I own this data and there are no other  
 live aliases".
 The problem is that this declaration would now happen /implicitly when  
 calling
 a (strongly) pure function/.
 Note that the 'a' string in the above example could no longer be "abc"  
 after 'f()'
 runs.
 The "pure" concept (re-)definition in D is bad enough even without  
 having to worry
 about supposedly pure functions stomping on unrelated data.

You what I wasn't thinking of? parallelization of pure function calls!  
Right there, that kills this whole idea.

In fact, I think at this point, any operation to immutable arrays that  
affects the metadata for that block, including setting length and  
appending, must be by definition impure.

This is not a good turn of events...

-Steve

Mar 25 2014

"Kagamin" <spam here.lot> writes:

On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:
 It's ok to treat allocator and factory functions as pure, 
 because those really
 are logically pure, ie can't affect any /visible/ state and 
 return results that
 are unique.

Allocators don't return unique results, GC being the primary 
example.

Mar 26 2014

Artur Skawina <art.08.09 gmail.com> writes:

On 03/26/14 15:36, Kagamin wrote:
 On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:
 It's ok to treat allocator and factory functions as pure, because those really
 are logically pure, ie can't affect any /visible/ state and return results that
 are unique.

 
 Allocators don't return unique results, GC being the primary example.

I'm not really sure what your point is. I was saying that "allocator functions",
ie functions that return newly allocated objects, can be treated as pure as long
they do not have any other visible side effects and the result is "unique", ie
does
not have any aliases. This will be true in many cases. The /implementation/
won't
likely be pure, but from the caller's POV this is irrelevant. A way to make
those
functions appear pure would allow for more pure /callers/; that's why Steven was
asking for a way to "force" purity.
Even C compilers are treating 'malloc' etc specially, exactly because they can
assume that nothing aliases the returned object.

artur

Mar 26 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 26 Mar 2014 10:36:45 -0400, Kagamin <spam here.lot> wrote:

 On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:
 It's ok to treat allocator and factory functions as pure, because those  
 really
 are logically pure, ie can't affect any /visible/ state and return  
 results that
 are unique.

 Allocators don't return unique results, GC being the primary example.

I think you misunderstand. The result is uniquely referenced, meaning no  
other pointers to the data are present (except the all-seeing heap).

Multiple calls to the allocator may return different results, and each  
result is uniquely referenced.

-Steve

Mar 26 2014

Artur Skawina <art.08.09 gmail.com> writes:

On 03/25/14 14:30, Steven Schveighoffer wrote:
 Of course naturally, with all things this low-level, we have to make decisions
as to what is *provably*  safe/pure/nothrow, and what is *logically*
 safe/pure/nothrow. The most obvious example is memory allocation -- It is
technically not pure as it accesses global state (the heap), but we can
rationalize making it pure since we have more logical knowledge of the
situation than the compiler (i.e. locking the global heap, and realizing that
if a block is unallocated, it is uniquely accessed with the allocation of it).

On 03/26/14 00:30, Artur Skawina wrote:
 It's ok to treat allocator and factory functions as pure, because those really
 are logically pure, ie can't affect any /visible/ state and return results that
 are unique.

That was wrong; treating these funcs as pure won't work. While allocator
functions
have some traits of pure ones -- no side-effects, their result only depends on
the
args, and the call can be optimized out when the result in unused -- there is
one
difference, which isn't expressible in D: the uniqueness of the result. Letting
them
be pure would mean that important purity based optimization could not be allowed
(so that "S* s1 = makeS(32), s2 = makeS(32);" does not break).

But this case is easy to deal with -- it's enough to (1) mark the alloc funcs
with
a 'malloc' attribute, (2) allow calling  malloc functions inside pure ones.

artur

Apr 05 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 05 Apr 2014 08:26:01 -0400, Artur Skawina <art.08.09 gmail.com>  
wrote:
 On 03/26/14 00:30, Artur Skawina wrote:
 It's ok to treat allocator and factory functions as pure, because those  
 really
 are logically pure, ie can't affect any /visible/ state and return  
 results that
 are unique.

 That was wrong; treating these funcs as pure won't work. While allocator  
 functions
 have some traits of pure ones -- no side-effects, their result only  
 depends on the
 args, and the call can be optimized out when the result in unused --  
 there is one
 difference, which isn't expressible in D: the uniqueness of the result.  
 Letting them
 be pure would mean that important purity based optimization could not be  
 allowed
 (so that "S* s1 = makeS(32), s2 = makeS(32);" does not break).

I think this still works, even if makeS is pure, since it would be  
weak-pure.

-Steve

Apr 07 2014

D Programming

C/C++ Programming

Other

digitalmars.D - Forcing weak-pure