digitalmars.D - RFC, ensureHeaped

Steven Schveighoffer (26/26) Nov 12 2010 I just recently helped someone with an issue of saving an array to stack...

Andrei Alexandrescu (4/30) Nov 12 2010 Sounds good, but if we offer it we should also define the primitive
bearophile (10/25) Nov 12 2010 Another possible solution, this turns some cases of stack assignment int...

Pillsy (7/15) Nov 12 2010

bearophile (5/21) Nov 12 2010 That scope syntax is already supported for closures, and it's partially ...

Pillsy (6/11) Nov 12 2010

Jonathan M Davis (7/19) Nov 12 2010 in is actually const scope. So

bearophile (4/5) Nov 12 2010 This scope will not go away.

Jonathan M Davis (3/7) Nov 12 2010 What's the difference between this scope and using scope on a local vari...

Steven Schveighoffer (15/22) Nov 15 2010 All that is going away is scope classes. All other uses of scope are

Jonathan M Davis (5/32) Nov 15 2010 Thanks. I knew about scope classes and scope statements (e.g. scope(fail...

Steven Schveighoffer (4/39) Nov 15 2010 Forgot completely about scope(failure|success|exit)...

Jonathan M Davis (6/7) Nov 15 2010

bearophile (7/9) Nov 13 2010 I have created a bug report to avoid this whole pair of threads to be lo...

spir (24/38) Nov 13 2010

Steven Schveighoffer (14/48) Nov 15 2010 I don't really agree. The ... version is optimized so you can pass

bearophile (19/22) Nov 15 2010 The experienced programmers may write "scope int[] a...", and have no he...

Steven Schveighoffer (43/68) Nov 16 2010 This is a good idea. This isn't what I thought spir was saying, I thoug...

bearophile (10/33) Nov 16 2010 If the variadics is in the signature of a free function then I agree wit...

Johann MacDonagh (15/48) Nov 21 2010 I'm for the "safe by default, you have to work to be unsafe". In this

Steven Schveighoffer (10/103) Nov 22 2010 Let's say you give the compiler this .di file:

Andrei Alexandrescu (4/16) Nov 16 2010 Hm, interestingly a data qualifier @noheap would not need to be

Steven Schveighoffer (8/25) Nov 16 2010 I think he means transitive the same way pure is transitive. Not sure

Jonathan M Davis (5/34) Nov 16 2010 Pure is hard enough to deal with (especially since it we probably have m...

bearophile (4/6) Nov 16 2010 Weakly pure on default isn't good for a language that is supposed to b e...

Jonathan M Davis (9/15) Nov 16 2010 Well, like I said, it's too late at this point, and really, it would be ...

Steven Schveighoffer (17/37) Nov 16 2010 everything you are saying seems to be backwards, stop it! ;)

Rainer Deyke (8/14) Nov 16 2010 Making functions weakly pure by default means that temporarily adding a

Jonathan M Davis (10/24) Nov 16 2010 It has already been argued that I/O should be exempt (at least for debug...
spir (24/27) Nov 17 2010 Output in general, programmer feedback in particuliar, should simply not...

bearophile (26/40) Nov 17 2010 I agree, it's a (small) problem. Originally 'pure' in D was closer to th...
Rainer Deyke (7/10) Nov 17 2010 My debug output actually goes through my logging library which, among

spir (38/46) Nov 17 2010
Steven Schveighoffer (9/21) Nov 17 2010 As would I. But I think in the case of debugging, we can have "trusted ...

Jonathan M Davis (34/76) Nov 16 2010 II was not trying to separate out weakly pure and strongly pure. pure is...

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

I just recently helped someone with an issue of saving an array to stack  
data beyond the existence of that stack frame.  However, the error was one  
level deep, something like this:

int[] globalargs;

void foo(int[] args...)
{
    globalargs = args;
}

void bar()
{
    foo(1,2,3); // passes stack data to foo.
}

One thing I suggested is, you have to dup args.  But what if you call it  
like this?

void bar()
{
    foo([1,2,3]);
}

Then you just wasted time duping that argument.  Instead of a defensive  
dup, what if we had a function ensureHeaped (better name suggestions?)  
that ensured the data was on the heap?  If it wasn't, it dups the original  
onto the heap.  It would be less expensive than a dup when the data is  
already on the heap, but probably only slightly more expensive than a  
straight dup when the data isn't on the heap.

Would such a function make sense or be useful?
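
A minimal sketch of what such a function might look like, assuming the
druntime call core.memory.GC.addrOf (which returns null for pointers the GC
does not own) is an acceptable test for "already on the heap". The name
ensureHeaped comes from the proposal above; the body is only an illustration
for mutable element types such as int, not an existing library function:

```d
import core.memory : GC;

/// Hypothetical helper: returns `arr` unchanged when its data already lives
/// in GC-managed memory, otherwise returns a GC-allocated copy.
T[] ensureHeaped(T)(T[] arr)
{
    // An empty slice carries no data that could dangle.
    if (arr.length == 0)
        return arr;
    // GC.addrOf yields the base address of the owning GC block, or null
    // when the pointer is outside the GC heap (stack, static data, malloc).
    if (GC.addrOf(arr.ptr) !is null)
        return arr;
    return arr.dup;
}
```

Under this test, data in static storage would also be copied, which is
harmless but slightly wasteful.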

-Steve

Nov 12 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/12/10 5:10 AM, Steven Schveighoffer wrote:
 I just recently helped someone with an issue of saving an array to stack
 data beyond the existence of that stack frame. However, the error was
 one level deep, something like this:

 int[] globalargs;

 void foo(int[] args...)
 {
 globalargs = args;
 }

 void bar()
 {
 foo(1,2,3); // passes stack data to foo.
 }

 One thing I suggested is, you have to dup args. But what if you call it
 like this?

 void bar()
 {
 foo([1,2,3]);
 }

 Then you just wasted time duping that argument. Instead of a defensive
 dup, what if we had a function ensureHeaped (better name suggestions?)
 that ensured the data was on the heap? If it wasn't, it dups the
 original onto the heap. It would be less expensive than a dup when the
 data is already on the heap, but probably only slightly more expensive
 than a straight dup when the data isn't on the heap.

 Would such a function make sense or be useful?

 -Steve

Sounds good, but if we offer it we should also define the primitive 
isOnStack() or something.
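
A rough sketch of such a primitive, under several assumptions: the stack grows
downward (true on the common x86 targets), only the calling thread's stack is
checked, and the druntime function thread_stackBottom from core.thread is used
to find the base of that stack. The name isOnStack is taken from the
suggestion above; nothing like it is claimed to exist in the library:

```d
import core.thread : thread_stackBottom;

/// Hypothetical check: does `p` point into the calling thread's stack?
/// Assumes a downward-growing stack.
pragma(inline, false)
bool isOnStack(const(void)* p)
{
    int marker;                                   // lives in this function's frame
    auto top    = cast(size_t) &marker;           // lowest address in use (approx.)
    auto bottom = cast(size_t) thread_stackBottom(); // highest stack address
    auto addr   = cast(size_t) p;
    return addr >= top && addr < bottom;
}
```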

Andrei

Nov 12 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

 int[] globalargs;
 
 void foo(int[] args...)
 {
     globalargs = args;
 }
 
 void bar()
 {
     foo(1,2,3); // passes stack data to foo.
 }
 ...
 Then you just wasted time duping that argument.  Instead of a defensive  
 dup, what if we had a function ensureHeaped (better name suggestions?)  
 that ensured the data was on the heap?

Another possible solution: this turns some cases of stack assignment into a
syntax error (wrong memory zone assignment error), and turns the undeterminable
ones into a runtime test + optional allocation:

 @onheap int[] globalargs;

Another possible solution is to modify the semantics of this kind of argument
passing, so the code inside the function foo always sees an args array allocated
on the heap:

void foo(int[] args...) { // code

You may then add "scope" to restore the original lighter semantics:
void foo(scope int[] args...) { // code

This is safer than the current semantics because the safe design is the
built-in one and the faster is on request.
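
For comparison, here is how the two behaviours can be written by hand with the
existing semantics, as a hedged sketch: store makes the defensive copy before
letting the data escape, while sum marks its parameter scope to document that
the data never escapes. Both function names are made up for illustration, and
how strictly the compiler checks scope depends on the compiler version and
flags:

```d
int[] globalArgs;

// Escapes its arguments, so it copies them to the heap first.
void store(int[] args...)
{
    globalArgs = args.dup;
}

// Only reads its arguments; `scope` states that they are not kept.
int sum(scope int[] args...)
{
    int total = 0;
    foreach (a; args)
        total += a;
    return total;
}
```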

Bye,
bearophile

Nov 12 2010

Pillsy <pillsbury gmail.com> writes:

bearophile wrote:
[...]
 Another possible solution is to modify the semantics of this kind of 
 argument passing, so the code inside the function foo always sees an 
 args array allocated on the heap:

 void foo(int[] args...) { // code

 
 You may then add "scope" to restore the original lighter semantics:
 void foo(scope int[] args...) { // code

 
 This is safer than the current semantics because the safe design is 
 the built-in one and the faster is on request.

I don't know how easy it would be, but I *really* like this proposal. It has
one other advantage, in that you can use the `scope` keyword for things other
than varargs, like closures.

Cheers,
Pillsy

Nov 12 2010

bearophile <bearophileHUGS lycos.com> writes:

Pillsy:

 bearophile wrote:
 [...]
 Another possible solution is to modify the semantics of this kind of 
 argument passing, so the code inside the function foo always sees an 
 args array allocated on the heap:

 
 void foo(int[] args...) { // code

  
 You may then add "scope" to restore the original lighter semantics:
 void foo(scope int[] args...) { // code

  
 This is safer than the current semantics because the safe design is 
 the built-in one and the faster is on request.

 
 I don't know how easy it would be,

That looks easy to implement, the compiler doesn't need to be smart to do that.
The other idea of @noheap is harder to implement.


 It has one other advantage, in that you can use the `scope` keyword for things
other than varargs, like closures.

That scope syntax is already supported for closures, and it's partially
implemented (or fully implemented, I am not sure).

Bye,
bearophile

Nov 12 2010

Pillsy <pillsbury gmail.com> writes:

bearophile wrote:

 Pillsy:

[...]
 It has one other advantage, in that you can use the `scope`
 keyword for things other than varargs, like closures.


 
 That scope syntax is already supported for closures, and it's partially 
 implemented (or fully implemented, I am not sure).

Oh, cool. I had no idea that `scope` was supported for function arguments at
all.

Cheers,
Pillsy

Nov 12 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, November 12, 2010 13:34:03 Pillsy wrote:
 bearophile wrote:
 Pillsy:

 [...]
 
 It has one other advantage, in that you can use the `scope`
 keyword for things other than varargs, like closures.

 
 That scope syntax is already supported for closures, and it's partially
 implemented (or fully implemented, I am not sure).

 
 Oh, cool. I had no idea that `scope` was supported for function arguments
 at all.

in is actually const scope. So

void func(in Foo foo)


would be

void func(const scope Foo foo)


I'm not quite sure how that will work with scope going away though.

- Jonathan M Davis

Nov 12 2010

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

 I'm not quite sure how that will work with scope going away though.

This scope will not go away.

Bye,
bearophile

Nov 12 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, November 12, 2010 17:25:31 bearophile wrote:
 Jonathan M Davis:
 I'm not quite sure how that will work with scope going away though.

 
 This scope will not go away.

What's the difference between this scope and using scope on a local variable?

- Jonathan M Davis

Nov 12 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 12 Nov 2010 20:33:37 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Friday, November 12, 2010 17:25:31 bearophile wrote:
 Jonathan M Davis:
 I'm not quite sure how that will work with scope going away though.

 This scope will not go away.

 What's the difference between this scope and using scope on a local  
 variable?

All that is going away is scope classes.  All other uses of scope are  
staying.

And even then, I think a scope class will still be supported, it just  
won't allocate on the stack (it will probably be a noop like it is for  
other variable types).

scope means different things in different places.  Currently, in a  
parameter it means that references in the parameter cannot be escaped  
(i.e. assigned to a global variable).  When the compiler sees this on  
delegates, it will avoid allocating a closure when taking the address of a  
local function.  This is essential in opApply loops.

And you know about the scope for classes.  AFAIK, those are really the  
only two behavior-altering uses.  Other than that, I think it's a noop.
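
To illustrate the opApply case, here is a small sketch (the Countdown type is
invented for this example). The delegate parameter is marked scope, which
tells the compiler that opApply only calls the delegate and never stores it,
so the foreach body does not need a heap-allocated closure:

```d
struct Countdown
{
    int start;

    // `scope` promises that dg is only called here, never kept.
    int opApply(scope int delegate(ref int) dg)
    {
        for (int i = start; i > 0; --i)
        {
            if (auto result = dg(i))
                return result;   // propagate break/return from the loop body
        }
        return 0;
    }
}
```

A loop such as foreach (n; Countdown(3)) then visits 3, 2, 1.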

-Steve

Nov 15 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, November 15, 2010 07:28:33 Steven Schveighoffer wrote:
 On Fri, 12 Nov 2010 20:33:37 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On Friday, November 12, 2010 17:25:31 bearophile wrote:
 Jonathan M Davis:
 I'm not quite sure how that will work with scope going away though.

 
 This scope will not go away.

 
 What's the difference between this scope and using scope on a local
 variable?

 
 All that is going away is scope classes.  All other uses of scope are
 staying.
 
 And even then, I think a scope class will still be supported, it just
 won't allocate on the stack (it will probably be a noop like it is for
 other variable types).
 
 scope means different things in different places.  Currently, in a
 parameter it means that references in the parameter cannot be escaped
 (i.e. assigned to a global variable).  When the compiler sees this on
 delegates, it will avoid allocating a closure when taking the address of a
 local function.  This is essential in opApply loops.
 
 And you know about the scope for classes.  AFAIK, those are really the
 only two behavior-altering uses.  Other than that, I think it's a noop.

Thanks. I knew about scope classes and scope statements (e.g. scope(failure) 
...), but I didn't know that scope on a parameter was different from scope 
classes. scope is definitely an over-used keyword IMHO.

- Jonathan M Davis

Nov 15 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 15 Nov 2010 13:36:42 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Monday, November 15, 2010 07:28:33 Steven Schveighoffer wrote:
 On Fri, 12 Nov 2010 20:33:37 -0500, Jonathan M Davis  
 <jmdavisProg gmx.com>

 wrote:
 On Friday, November 12, 2010 17:25:31 bearophile wrote:
 Jonathan M Davis:
 I'm not quite sure how that will work with scope going away though.

 This scope will not go away.

 What's the difference between this scope and using scope on a local
 variable?

 All that is going away is scope classes.  All other uses of scope are
 staying.

 And even then, I think a scope class will still be supported, it just
 won't allocate on the stack (it will probably be a noop like it is for
 other variable types).

 scope means different things in different places.  Currently, in a
 parameter it means that references in the parameter cannot be escaped
 (i.e. assigned to a global variable).  When the compiler sees this on
 delegates, it will avoid allocating a closure when taking the address  
 of a
 local function.  This is essential in opApply loops.

 And you know about the scope for classes.  AFAIK, those are really the
 only two behavior-altering uses.  Other than that, I think it's a noop.

 Thanks. I knew about scope classes and scope statements (e.g.  
 scope(failure)
 ...), but I didn't know that scope on a parameter was different from  
 scope
 classes. scope is definitely an over-used keyword IMHO.

Forgot completely about scope(failure|success|exit)...

-Steve

Nov 15 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, November 15, 2010 10:45:06 Steven Schveighoffer wrote:
 Forgot completely about scope(failure|success|exit)...

 
LOL. Whereas that's almost the only reason that I use the scope keyword -
though the fact that it doesn't actually give you the exception means that I
don't end up using it anywhere near as much as I'd like.
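
For readers who have not used them, a short sketch of the three scope guards
(the events log and the work function are made up for the example). Guards run
in reverse order of declaration when the scope is left, and scope(failure)
runs during unwinding without exposing the exception object, so a try/catch
is still needed to inspect it:

```d
string[] events;

void work(bool fail)
{
    scope(exit)    events ~= "exit";     // always runs
    scope(success) events ~= "success";  // runs only on normal exit
    scope(failure) events ~= "failure";  // runs only when an exception escapes
    if (fail)
        throw new Exception("boom");
}
```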

- Jonathan M Davis

Nov 15 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

 Then you just wasted time duping that argument.  Instead of a defensive  
 dup, what if we had a function ensureHeaped (better name suggestions?)  

I have created a bug report to keep this whole pair of threads from being lost
in the dust of time:
http://d.puremagic.com/issues/show_bug.cgi?id=5212

Feel free to add a note about your ensureHeaped() idea at the end of that
enhancement request :-)

(To that enhancement request I have not added my idea of the @onheap attribute
because I think it's too complex to implement according to the design
style of the D compiler).

Bye,
bearophile

Nov 13 2010

spir <denis.spir gmail.com> writes:

On Sat, 13 Nov 2010 13:19:25 -0500
bearophile <bearophileHUGS lycos.com> wrote:

 Steven Schveighoffer:

 Then you just wasted time duping that argument.  Instead of a defensive
 dup, what if we had a function ensureHeaped (better name suggestions?)

 I have created a bug report to keep this whole pair of threads from being
 lost in the dust of time:
 http://d.puremagic.com/issues/show_bug.cgi?id=5212

 Feel free to add a note about your ensureHeaped() idea at the end of that
 enhancement request :-)

 (To that enhancement request I have not added my idea of the @onheap
 attribute because I think it's too complex to implement according to the
 design style of the D compiler).

 Bye,
 bearophile

I was the one bitten by the bug. I think it's really a naughty feature, and I
was about to create a bug entry when I saw bearophile's post. In my opinion, if
	void f(int[] ints) {doWhateverWith(ints);}
works, then
	void f(int[] ints...) {doWhateverWith(ints);}
must just work as well.

I consider variadic args as just syntactic honey for clients of a func, type,
lib. There should be no visible semantic difference, let alone bugs (and
certainly not a segfault when the code does not manually play with memory!).
But I may have wrong expectations about this feature in D, due to how
variadics work in other languages I have used.

It was hard to debug even with the help of 3 experienced D programmers.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Nov 13 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 13 Nov 2010 16:09:32 -0500, spir <denis.spir gmail.com> wrote:

 On Sat, 13 Nov 2010 13:19:25 -0500
 bearophile <bearophileHUGS lycos.com> wrote:

 Steven Schveighoffer:

 Then you just wasted time duping that argument.  Instead of a  

 defensive
 dup, what if we had a function ensureHeaped (better name suggestions?)

 I have created a bug report to keep this whole pair of threads from being
 lost in the dust of time:
 http://d.puremagic.com/issues/show_bug.cgi?id=5212

 Feel free to add a note about your ensureHeaped() idea at the end of
 that enhancement request :-)

 (To that enhancement request I have not added my idea of the @onheap
 attribute because I think it's too complex to implement according
 to the design style of the D compiler).

 Bye,
 bearophile

 I was the one bitten by the bug. I think it's really a naughty feature,
 and I was about to create a bug entry when I saw bearophile's post. In my
 opinion, if
 	void f(int[] ints) {doWhateverWith(ints);}
 works, then
 	void f(int[] ints...) {doWhateverWith(ints);}
 must just work as well.

I don't really agree.  The ... version is optimized so you can pass  
typesafe variadic args.  If the compiler would generate a heap allocation  
for that, then you may be wasting a lot of heap allocations.  One thing  
you may learn in D is that heap allocation == crappy performance.  The  
less you allocate the faster your code gets.  It's one of the main reasons  
Tango is so damned fast.  To have the language continually working against  
that goal is going to be great for inexperienced programmers but hell for  
people trying to squeeze performance out of it.

I think what we need however, is a way to specify intentions inside the  
function.  If you intend to escape this data, then the runtime/compiler  
should make it easy to avoid re-duping something.

 I consider variadic args as just syntactic honey for clients of a func,
 type, lib. There should be no visible semantic difference, let alone
 bugs (and certainly not a segfault when the code does not manually play
 with memory!). But I may have wrong expectations about this feature in
 D, due to how variadics work in other languages I have used.

 It was hard to debug even with the help of 3 experienced D programmers.

To be fair, it was easy to spot when you gave us the pertinent code :)

-Steve

Nov 15 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

To have the language continually working against that goal is going to be great
for inexperienced programmers but hell for people trying to squeeze performance
out of it.<

The experienced programmers may write "scope int[] a...", and have no heap
allocations.

All the other people need first of all a program that doesn't contain hard to
spot bugs, and only then a fast program. Such people don't stick the "scope"
there, so in this case the compiler performs the test you were talking about: if
the data is on the heap it doesn't copy it and takes a slice of it, otherwise
(the data is on the stack) it dups it.


I think what we need however, is a way to specify intentions inside the
function.  If you intend to escape this data, then the runtime/compiler should
make it easy to avoid re-duping something.<

This is like for the "automatic" closures. The right design in a modern
language is to use the safer strategy on default, and the less safe on request.

If you want, a new compiler switch may be added that lists all the spots in the
code where a closure or hidden heap allocation occurs, useful for performance
tuning (an idea by Denis Koroskin):
http://d.puremagic.com/issues/show_bug.cgi?id=5070

I have even suggested a transitive @noheap annotation, similar to nothrow,
that makes sure a function contains no heap allocations and doesn't call other
things that perform heap allocations:
http://d.puremagic.com/issues/show_bug.cgi?id=5219
The proliferation of function attributes produces "interesting" results:
@noheap @safe nothrow pure real sin(in real x) { ... }


To be fair, it was easy to spot when you gave us the pertinent code :)

I didn't even know/remember that that array data is on the stack. That error
will give bad surprises to some D newbies that are not fluent in C.

It's a problem of perception: typesafe variadic arguments don't look like
normal function arguments that you know are usually on the stack, they look
like dynamic arrays, and in D most dynamic arrays are allocated on the heap
(it's easy and useful to take a dynamic-array-slice of a stack allocated array,
but in this case the code shows that the slice doesn't contain heap data).

If your function has a signature similar to this one:

void foo(int[3] arr...) {

It's not too hard to see that 'arr' is on the stack. But dynamic arrays
don't give that image:

void foo(int[] arr...) {

This is why I think it's better for the compiler to test if the arr data is on
the stack, and dup it in that case (unless a 'scope' is present, in which case
both the test and the allocation are omitted).

Bye,
bearophile

Nov 15 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 15 Nov 2010 17:02:27 -0500, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 To have the language continually working against that goal is going to
 be great for inexperienced programmers but hell for people trying to
 squeeze performance out of it.<

 The experienced programmers may write "scope int[] a...", and have no  
 heap allocations.

This is a good idea.  This isn't what I thought spir was saying, I thought  
he wanted the function to always allocate.

At first glance, I thought your idea might be bad, because duping an array  
decouples it from the original, but then I realized -- there *is* no  
original.  This is the only reference to that data, so you can't change  
any expectations.

The only issue I see here is that scope should really be the default,  
because that is what you want most of the time.  However, the compiler  
cannot prove that the data doesn't escape so it can't really enforce that  
as the default.  I have the same issue with closures (the compiler is too  
eager to allocate closures because it is too conservative).  But I don't  
know how this can be fixed without redesigning the compilation model.

 I think what we need however, is a way to specify intentions inside the  
 function.  If you intend to escape this data, then the runtime/compiler  
 should make it easy to avoid re-duping something.<

 This is like for the "automatic" closures. The right design in a modern  
 language is to use the safer strategy on default, and the less safe on  
 request.

This is not always possible, I still see a good need for ensuring heaped  
data.  For example:

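int[] y;

void foo(int[] x...)
{
    y = ensureHeaped(x);   // copies only if x is not already on the heap
}

void bar(int[] x)
{
    foo(x);                // forwards the slice unchanged
}

void baz()
{
    int[3] x;
    bar(x);                // passes a slice of baz's stack frame
}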

I believe the compiler cannot really be made to enforce that all passed-in  
data will be heap-allocated when passed to foo.  A runtime check would be  
a very good safety net.

 If you want, a new compiler switch may be added that lists all the spots  
 in the code where a closure or hidden heap allocation occurs, useful for  
 performance tuning (an idea by Denis Koroskin):
 http://d.puremagic.com/issues/show_bug.cgi?id=5070

Also a good idea.

 I have even suggested a transitive @noheap annotation, similar to
 nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting" results:
 @noheap @safe nothrow pure real sin(in real x) { ... }

This is a bit much.  Introducing these attributes is viral -- once you go  
@noheap, anything you call must be @noheap, and the majority of functions  
will need to be marked @noheap.  The gain is marginal at best anyways.

 To be fair, it was easy to spot when you gave us the pertinent code :)

 I didn't even know/remember that that array data is on the stack. That  
 error will give bad surprises to some D newbies that are not fluent in C.

I didn't know until about a month and a half ago (in dcollections, this  
bug was prominent in all the array-based classes).  Only after inspecting  
the disassembly did I realize.

I agree we need some sort of protection or alert for this -- it's too  
simple to make this mistake.

-Steve

Nov 16 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

 The experienced programmers may write "scope int[] a...", and have no  
 heap allocations.

 
 This is a good idea.  This isn't what I thought spir was saying, I thought  
 he wanted the function to always allocate.

I have also suggested that when "scope" is not present, DMD may automatically
add a runtime test similar to the one done by ensureHeaped and dup the array
data only if it's on the stack. So even when you don't use "scope" it doesn't
always copy.


 The only issue I see here is that scope should really be the default,  
 because that is what you want most of the time.

If the variadic parameter is in the signature of a free function then I agree
with you. But if it's inside the this() of a class, then often you want that
data on the heap.


 However, the compiler  
 cannot prove that the data doesn't escape so it can't really enforce that  
 as the default.

If you look at my original answer I have suggested something like @heaped,
that's attached to an array and makes sure its data is on the heap. This is
done with a mix (when possible) of static analysis and runtime tests (in the
other cases). But I have not added this idea to the enhancement request of
scoped variadics because it looks too hard to implement in D/DMD.


 I believe the compiler cannot really be made to enforce that all passed-in  
 data will be heap-allocated when passed to foo.  A runtime check would be  
 a very good safety net.

Static analysis is able to do this and more, but it requires some logic added
to the compiler (and such logic probably doesn't work in all cases).



 I have even suggested a transitive @noheap annotation, similar to
 nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting" results:
 @noheap @safe nothrow pure real sin(in real x) { ... }

 This is a bit much.  Introducing these attributes is viral -- once you go
 @noheap, anything you call must be @noheap, and the majority of functions
 will need to be marked @noheap.  The gain is marginal at best anyways.

Indeed, it's a lot, and I am not sure it's a good idea.

I have had this idea reading one or two articles written by people that write
high performance games in C++. They need to keep the frame rate constantly
higher than a minimum, like 30/s or 60/s. To do this they have to avoid C heap
allocations inside certain large loops (D GC heap allocations may be even
worse). Using @noheap is a burden, but it may help you write code with a more
deterministic performance.

Maybe someday it will be possible to implement @noheap with user-defined
attributes plus static reflection, in D. But then the standard library will not
use that user-defined @noheap attribute, making it not so useful. So if you
want it to be transitive, Phobos needs to be aware of it.

Bye,
bearophile

Nov 16 2010

Johann MacDonagh <johann.macdonagh..no spam..gmail.com> writes:

On 11/16/2010 12:58 PM, bearophile wrote:
 Steven Schveighoffer:

 The experienced programmers may write "scope int[] a...", and have no
 heap allocations.

 This is a good idea.  This isn't what I thought spir was saying, I thought
 he wanted the function to always allocate.

 I have also suggested that when "scope" is not present, DMD may automatically
add a runtime test similar to the one done by ensureHeaped and dup the array
data only if it's on the stack. So even when you don't use "scope" it doesn't
always copy.


 The only issue I see here is that scope should really be the default,
 because that is what you want most of the time.

 If the variadics is in the signature of a free function then I agree with you.
But if it's inside the this() of a class, then often you want that data on the
heap.


 However, the compiler
 cannot prove that the data doesn't escape so it can't really enforce that
 as the default.

 If you look at my original answer I have suggested something like @heaped,
that's attached to an array and makes sure its data is on the heap. This is
done with a mix (when possible) of static analysis and runtime tests (in the
other cases). But I have not added this idea to the enhancement request of
scoped variadics because it looks too hard to implement in D/DMD.


 I believe the compiler cannot really be made to enforce that all passed-in
 data will be heap-allocated when passed to foo.  A runtime check would be
 a very good safety net.

 Static analysis is able to do this and more, but it requires some logic added
to the compiler (and such logic probably doesn't work in all cases).



 I have even suggested a transitive @noheap annotation, similar to
 nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting" results:
 @noheap @safe nothrow pure real sin(in real x) { ... }

 This is a bit much.  Introducing these attributes is viral -- once you go
 @noheap, anything you call must be @noheap, and the majority of functions
 will need to be marked @noheap.  The gain is marginal at best anyways.

 Indeed, it's a lot, and I am not sure it's a good idea.

 I have had this idea reading one or two articles written by people that write
high performance games in C++. They need to keep the frame rate constantly
higher than a minimum, like 30/s or 60/s. To do this they have to avoid C heap
allocations inside certain large loops (D GC heap allocations may be even
worse). Using @noheap is a burden, but it may help you write code with a more
deterministic performance.

 Maybe someday it will be possible to implement @noheap with user-defined
attributes plus static reflection, in D. But then the standard library will not
use that user-defined @noheap attribute, making it not so useful. So if you
want it to be transitive, Phobos needs to be aware of it.

 Bye,
 bearophile

I'm for the "safe by default, you have to work to be unsafe" approach. In this 
case, the compiler should have noticed the data was being escaped and 
passed 1,2,3 on the heap (or perhaps auto-duped at the point it was 
assigned). It does this kind of thing when you have a closure (nested 
delegate).

The variadic syntax is confusing (although arguably probably the best 
way to deal with it). An average developer expects assignments to always 
be safe. Assigning a function scope static array to a global static or 
dynamic array results in either copying or a heap allocation / copying. 
There's no need to "think about it" as you would in C. You simply assign 
and it works.

Although I do agree some kind of compiler switch that tells you when 
there are hidden heap allocations would be nice for performance tuning.

- Johann

Nov 21 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 21 Nov 2010 17:54:25 -0500, Johann MacDonagh  
<johann.macdonagh..no spam..gmail.com> wrote:

 On 11/16/2010 12:58 PM, bearophile wrote:
 Steven Schveighoffer:

 The experienced programmers may write "scope int[] a...", and have no
 heap allocations.

 This is a good idea.  This isn't what I thought spir was saying, I  
 thought
 he wanted the function to always allocate.

 I have also suggested that when "scope" is not present, DMD may  
 automatically add a runtime test similar to the one done by  
 ensureHeaped and dup the array data only if it's on the stack. So even  
 when you don't use "scope" it doesn't always copy.


 The only issue I see here is that scope should really be the default,
 because that is what you want most of the time.

 If the variadics is in the signature of a free function then I agree  
 with you. But if it's inside the this() of a class, then often you want  
 that data on the heap.


 However, the compiler
 cannot prove that the data doesn't escape so it can't really enforce  
 that
 as the default.

 If you look at my original answer I have suggested something like  
  heaped, that's attached to an array and makes sure its data is on the  
 heap. This is done with a mix (when possible) of static analysis and  
 runtime tests (in the other cases). But I have not added this idea to  
 the enhancement request of scoped variadics because it looks too  
 hard to implement in D/DMD.


 I believe the compiler cannot really be made to enforce that all  
 passed-in
 data will be heap-allocated when passed to foo.  A runtime check would  
 be
 a very good safety net.

 Static analysis is able to do this and more, but it requires some logic  
 added to the compiler (and such logic probably doesn't work in all  
 cases).



 I have even suggested a transitive  noheap annotation, similar to
  nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting"  
 results:
  noheap  safe nothrow pure real sin(in real x) { ... }

 This is a bit much.  Introducing these attributes is viral -- once you  
 go
  noheap, anything you call must be  noheap, and the majority of  
 functions
 will need to be marked  noheap.  The gain is marginal at best anyways.

 Indeed, it's a lot, and I am not sure it's a good idea.

 I have had this idea reading one or two articles written by people that  
 write high performance games in C++. They need to keep the frame rate  
 constantly higher than a minimum, like 30/s or 60/s. To do this they  
 have to avoid C heap allocations inside certain large loops (D GC heap  
 allocations may be even worse). Using  noheap is a burden, but it may  
 help you write code with a more deterministic performance.

 Maybe someday it will be possible to implement  noheap with  
 user-defined attributes plus static reflection, in D. But then the  
 standard library will not use that user-defined  noheap attribute,  
 making it not so useful. So if you want it to be transitive, Phobos  
 needs to be aware of it.

 Bye,
 bearophile

 I'm for the "safe by default, you have to work to be unsafe". In this  
 case, the compiler should have noticed the data was being escaped and  
 passed 1,2,3 as a heap (or perhaps auto-duped at the point it was  
 assigned). It does this kind of thing when you have a closure (nested  
 delegate).

 The variadic syntax is confusing (although arguably probably the best  
 way to deal with it). An average developer expects assignments to always  
 be safe. Assigning a function scope static array to a global static or  
 dynamic array results in either copying or a heap allocation / copying.  
 There's no need to "think about it" as you would in C. You simply assign  
 and it works.

 Although I do agree some kind of compiler switch that tells you when  
 there are hidden heap allocations would be nice for performance tuning.

Let's say you give the compiler this .di file:

void foo(int[] nums...);

And an object file, how does the compiler know whether nums escapes or not?

The answer is, it cannot.  That is the problem with D's compilation model  
which makes analysis impossible.  One must declare the intentions first  
(via the function signature), and then adhere to the intentions.

Bearophile's idea of using scope is good, we can probably make that work.
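
For illustration, a minimal sketch of the escape problem behind this thread, assuming a module-level slice named saved and a function named keep (both hypothetical names, not taken from the posts above). Since the signature alone says nothing about escaping, and the caller may build the variadic array on its own stack, the callee here copies the data before storing it:

```d
// Module-level slice that outlives any single call (hypothetical name).
int[] saved;

// Typesafe variadic: the caller may place the ints on its stack,
// so storing nums directly could leave saved pointing at dead memory.
void keep(int[] nums...)
{
    saved = nums.dup;   // copy onto the GC heap before the data escapes
}

void main()
{
    keep(1, 2, 3);
    assert(saved == [1, 2, 3]);
}
```

With a signature such as scope int[] nums..., as suggested above, the callee would instead promise not to store nums at all, and the copy would be unnecessary.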

-Steve

Nov 22 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/16/10 4:40 AM, Steven Schveighoffer wrote:
 On Mon, 15 Nov 2010 17:02:27 -0500, bearophile
 <bearophileHUGS lycos.com> wrote:
 I have even suggested a transitive  noheap annotation, similar to
  nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting" results:
  noheap  safe nothrow pure real sin(in real x) { ... }

 This is a bit much. Introducing these attributes is viral -- once you go
  noheap, anything you call must be  noheap, and the majority of
 functions will need to be marked  noheap. The gain is marginal at best
 anyways.

Hm, interestingly a data qualifier  noheap would not need to be 
transitive as data on the stack may refer to data on the heap.

Andrei

Nov 16 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 16 Nov 2010 13:04:32 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/16/10 4:40 AM, Steven Schveighoffer wrote:
 On Mon, 15 Nov 2010 17:02:27 -0500, bearophile
 <bearophileHUGS lycos.com> wrote:
 I have even suggested a transitive  noheap annotation, similar to
  nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting"  
 results:
  noheap  safe nothrow pure real sin(in real x) { ... }

 This is a bit much. Introducing these attributes is viral -- once you go
  noheap, anything you call must be  noheap, and the majority of
 functions will need to be marked  noheap. The gain is marginal at best
 anyways.

 Hm, interestingly a data qualifier  noheap would not need to be  
 transitive as data on the stack may refer to data on the heap.

I think he means transitive the same way pure is transitive.  Not sure  
what the term would be, functionally transitive?

in other words, if your function is marked  noheap, it cannot allocate any  
memory, which means it cannot call any *other* functions that allocate  
memory.
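
As a point of comparison, later D releases added a @nogc attribute that is functionally transitive in this sense, though only for GC allocations (it does not restrict malloc, so it is narrower than the noheap idea discussed here). A minimal sketch, with sumSquares as a hypothetical example function:

```d
// @nogc code may only call other @nogc code, the same way pure and
// nothrow propagate through the call graph.
@nogc nothrow pure int sumSquares(scope const(int)[] xs)
{
    int total = 0;
    foreach (x; xs)
        total += x * x;
    return total;
}

// Uncommenting the next line is a compile error: 'new' uses the GC heap.
// @nogc int[] makeBuffer(size_t n) { return new int[n]; }

void main() @nogc
{
    int[3] buf;              // fixed-size array on the stack, no allocation
    buf[0] = 1;
    buf[1] = 2;
    buf[2] = 3;
    assert(sumSquares(buf[]) == 14);
}
```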

-Steve

Nov 16 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, November 16, 2010 10:31:43 Steven Schveighoffer wrote:
 On Tue, 16 Nov 2010 13:04:32 -0500, Andrei Alexandrescu
 
 <SeeWebsiteForEmail erdani.org> wrote:
 On 11/16/10 4:40 AM, Steven Schveighoffer wrote:
 On Mon, 15 Nov 2010 17:02:27 -0500, bearophile
 
 <bearophileHUGS lycos.com> wrote:
 I have even suggested a transitive  noheap annotation, similar to
  nothrow, that makes sure a function contains no heap allocations and
 doesn't call other things that perform heap allocations:
 http://d.puremagic.com/issues/show_bug.cgi?id=5219
 The proliferation of function attributes produces "interesting"
 results:
  noheap  safe nothrow pure real sin(in real x) { ... }

 
 This is a bit much. Introducing these attributes is viral -- once you go
  noheap, anything you call must be  noheap, and the majority of
 functions will need to be marked  noheap. The gain is marginal at best
 anyways.

 
 Hm, interestingly a data qualifier  noheap would not need to be
 transitive as data on the stack may refer to data on the heap.

 
 I think he means transitive the same way pure is transitive.  Not sure
 what the term would be, functionally transitive?
 
 in other words, if your function is marked  noheap, it cannot allocate any
 memory, which means it cannot call any *other* functions that allocate
 memory.

Pure is hard enough to deal with (especially since we probably should have made it 
the default, but it's too late for that now). We shouldn't even consider adding 
anything more like that without a _really_ good reason.

- Jonathan M Davis

Nov 16 2010

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

 Pure is hard enough to deal with (especially since we probably should have made it 
 the default, but it's too late for that now).

Weakly pure on default isn't good for a language that is supposed to be
somewhat compatible with C syntax, I think it breaks too many C functions.

Bye,
bearophile

Nov 16 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 16 November 2010 12:37:10 bearophile wrote:
 Jonathan M Davis:
 Pure is hard enough to deal with (especially since we probably should have
 made it the default, but it's too late for that now).

 
 Weakly pure on default isn't good for a language that is supposed to be
 somewhat compatible with C syntax, I think it breaks too many C functions.

Well, like I said, it's too late at this point, and really, it would be good to 
have a nice way to deal with C functions and purity (particularly since most of 
them are pure anyway), but the result at present is that most functions should 
be marked with pure. And if you're marking more functions with pure than not, 
that would imply that the default should be (at least ideally) impure. 
Regardless, however, it's not reasonable for D to go for impure rather than 
pure at this point.

- Jonathan M Davis

Nov 16 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 16 Nov 2010 16:04:18 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday 16 November 2010 12:37:10 bearophile wrote:
 Jonathan M Davis:
 Pure is hard enough to deal with (especially since we probably should have
 made it the default, but it's too late for that now).

 Weakly pure on default isn't good for a language that is supposed to be
 somewhat compatible with C syntax, I think it breaks too many C  
 functions.

 Well, like I said, it's too late at this point, and really, it would be  
 good to
 have a nice way to deal with C functions and purity (particularly since  
 most of
 them are pure anyway), but the result at present is that most functions  
 should
 be marked with pure. And if you're marking more functions with pure than  
 not,
 that would imply that the default should be (at least ideally) impure.
 Regardless, however, it's not reasonable for D to go for impure rather  
 than pure
 at this point.

everything you are saying seems to be backwards, stop it! ;)

1. currently, the default is impure.
2. Most functions will naturally be weakly pure, so making *pure* the  
default would seem more useful.

It seems backwards to me to think pure functions should be the default, I  
mean, this isn't a functional language!  But you also have to forget  
everything you know about pure, because a weakly pure function is a very  
useful idiom, and it is most certainly not compatible with functional  
languages.  It's both imperative and can accept and return mutable data.

It makes me think that this is going to be extremely confusing for a  
while, because people are so used to pure being equated with a functional  
language, so when they see a function is pure but takes mutable data, they  
will be scratching their heads.  It would be awesome to make weakly pure  
the default, and it would also make it so we have to change much less code.
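
To make the weak/strong distinction concrete, here is a small sketch with hypothetical function names. The first function is weakly pure (it mutates its argument but touches no global state); the second takes only immutable and value parameters, so it is strongly pure even though it calls the weakly pure helper on its own local buffer:

```d
// Weakly pure: writes through a mutable parameter, reads no globals.
pure void scaleInPlace(int[] data, int factor)
{
    foreach (ref x; data)
        x *= factor;
}

// Strongly pure: the result depends only on immutable/value arguments.
// Calling the weakly pure helper on a local copy is allowed.
pure int scaledSum(immutable(int)[] data, int factor)
{
    auto tmp = new int[data.length];   // allocation is permitted in pure code
    tmp[] = data[];
    scaleInPlace(tmp, factor);

    int total = 0;
    foreach (x; tmp)
        total += x;
    return total;
}

void main()
{
    immutable(int)[] v = [1, 2, 3];
    assert(scaledSum(v, 2) == 12);
}
```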

-Steve

Nov 16 2010

Rainer Deyke <rainerd eldwood.com> writes:

On 11/16/2010 21:53, Steven Schveighoffer wrote:
 It makes me think that this is going to be extremely confusing for a
 while, because people are so used to pure being equated with a
 functional language, so when they see a function is pure but takes
 mutable data, they will be scratching their heads.  It would be awesome
 to make weakly pure the default, and it would also make it so we have to
 change much less code.

Making functions weakly pure by default means that temporarily adding a
tiny debug printf to any function will require a shitload of cascading
'impure' annotations.  I would consider that completely unacceptable.

(Unless, of course, purity is detected automatically without the use of
annotations at all.)


-- 
Rainer Deyke - rainerd eldwood.com

Nov 16 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 16 November 2010 23:03:05 Rainer Deyke wrote:
 On 11/16/2010 21:53, Steven Schveighoffer wrote:
 It makes me think that this is going to be extremely confusing for a
 while, because people are so used to pure being equated with a
 functional language, so when they see a function is pure but takes
 mutable data, they will be scratching their heads.  It would be awesome
 to make weakly pure the default, and it would also make it so we have to
 change much less code.

 
 Making functions weakly pure by default means that temporarily adding a
 tiny debug printf to any function will require a shitload of cascading
 'impure' annotations.  I would consider that completely unacceptable.
 
 (Unless, of course, purity is detected automatically without the use of
 annotations at all.)

It has already been argued that I/O should be exempt (at least for debugging 
purposes), and I think that that could be acceptable for weakly pure 
functions. But it's certainly true that as it stands, dealing with I/O and 
purity doesn't work very well. And since you have to try and mark as much as 
possible pure (to make it weakly pure at least) if you want much hope of being 
able to have much of anything be strongly pure, it doesn't take long before you 
can't actually have I/O much of anywhere - even for debugging. It's definitely 
a problem.

- Jonathan M Davis

Nov 16 2010

spir <denis.spir gmail.com> writes:

On Wed, 17 Nov 2010 00:03:05 -0700
Rainer Deyke <rainerd eldwood.com> wrote:

 Making functions weakly pure by default means that temporarily adding a
 tiny debug printf to any function will require a shitload of cascading
 'impure' annotations.  I would consider that completely unacceptable.

Output in general, programmer feedback in particular, should simply not be
considered effect. It is transitory change to dedicated areas of memory --
not state. Isn't this the sense of "output", after all? (One cannot read it
back, thus it has no consequence on future process.) The following is imo
purely referentially transparent and effect-free (where effect means changing
state); it always executes the same way, produces the same result, and never
influences later processes other than via said result:

uint square(uint n) {
    uint sq = n*n;
    writefln("%s^2 = %s", n, sq);
    return sq;
}

Sure, the physical machine's state has changed, but it's not the same machine
(state) as the one the program runs on (as the one the program can play
with). There is some bizarre confusion.
[IMO, FP's notion of purity is at best improper for imperative programming
(& at worst requires complicated hacks for using FP itself). We need to find
our own way to make programs easier to understand and reason about.]


Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Nov 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

It makes me think that this is going to be extremely confusing for a while,
because people are so used to pure being equated with a functional language, so
when they see a function is pure but takes mutable data, they will be
scratching their heads.<

I agree, it's a (small) problem. Originally 'pure' in D was closer to the
correct definition of purity. Then its semantics was changed and it was not
replaced by  strongpure/ weakpure annotations, so there is now a bit of
semantic mismatch.

------------------------

Rainer Deyke:

 Making functions weakly pure by default means that temporarily adding a
 tiny debug printf to any function will require a shitload of cascading
 'impure' annotations.  I would consider that completely unacceptable.

To face this problem I have proposed a pureprintf() function (or purewriteln),
that's a kind of alias of printf (or writeln), the only differences between
pureprintf() and printf() are the name and D seeing the first one as strongly
pure.

The pureprintf() is meant only for *unreliable* debug prints, not for the
normal program console output.

------------------------

spir:

Output in general, programmer feedback in particular, should simply not be
considered effect.

You are very wrong.


 The following is imo purely referentially transparent and effect-free (where
 effect means changing state); it always executes the same way, produces the
 same result, and never influences later processes other than via said result:
 
 uint square(uint n) {
     uint sq = n*n;
     writefln("%s^2 = %s", n, sq);
     return sq;
 }

If we replace that function signature with this (assuming writefln is
considered pure):

pure uint square(uint n) { ...


Then the following code will print one or two times according to how much
optimizations the compiler is performing:

void main() {
    uint x = square(10) + square(10);
}

Generally in DMD if you compile with -O you will see only one print. If you
replace the signature with this one:

pure double square(double n) { ...

You will see two prints. In general the compiler is able to replace two calls
with same arguments to a strongly pure function with a single call. DMD doesn't
do it on floating point numbers to respect its not-optimization FP rules, but
LDC doesn't respect them if you use the -enable-unsafe-fp-math compiler
switch, so if you use -enable-unsafe-fp-math you will probably see only one
print.

Generally if the compiler sees code like:

uint y = foo(x) + bar(x);

And both foo and bar are strongly pure, the compiler must be free to call them
in any order it likes, because they are side-effects-free.

So normal printing functions can't be allowed inside pure functions, because
printing is a big side effect (even memory allocation is a side effect, because
I may cast the dynamic array pointer to size_t and then use this number. Even
exceptions are a side effect, but probably they give less troubles than
printing).

I have suggested the pureprintf() that allows the user to remember its printing
will be unreliable (printing may appear or disappear according to compiler used,
optimization levels, day of the week).
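
D's debug statement is close in spirit to this proposal: the language allows impure calls inside a debug statement in a pure function, and the statement is compiled in only when the -debug switch is given. A sketch reusing spir's square function (whether the line prints once or twice still depends on how the compiler treats the repeated call, so the output stays unreliable in exactly the sense described above):

```d
import std.stdio : writefln;

pure uint square(uint n)
{
    uint sq = n * n;
    // Impure call, accepted only because it sits in a debug statement;
    // without -debug this line is not compiled at all.
    debug writefln("%s^2 = %s", n, sq);
    return sq;
}

void main()
{
    uint x = square(10) + square(10);
    assert(x == 200);
}
```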

Bye,
bearophile

Nov 17 2010

Rainer Deyke <rainerd eldwood.com> writes:

On 11/17/2010 05:10, spir wrote:
 Output in general, programmer feedback in particular, should simply
 not be considered effect. It is transitory change to dedicated areas
 of memory -- not state. Isn't this the sense of "output", after all?

My debug output actually goes through my logging library which, among
other things, maintains a list of log messages in memory.  If this is
considered "pure", then we might as well strip "pure" from the language,
because it has lost all meaning.


-- 
Rainer Deyke - rainerd eldwood.com

Nov 17 2010

spir <denis.spir gmail.com> writes:

On Tue, 16 Nov 2010 23:28:37 -0800
Jonathan M Davis <jmdavisProg gmx.com> wrote:

 It has already been argued that I/O should be exempt (at least for debugging
 purposes), and I think that that could be acceptable for weakly pure
 functions. But it's certainly true that as it stands, dealing with I/O and
 purity doesn't work very well. And since you have to try and mark as much as
 possible pure (to make it weakly pure at least) if you want much hope of being
 able to have much of anything be strongly pure, it doesn't take long before you
 can't actually have I/O much of anywhere - even for debugging. It's definitely
 a problem.

(See also my previous post on this thread).
What we are missing is a clear notion of program state, distinct from physical
machine. A non-referentially transparent function is one that reads from this
state; between 2 runs of the function, this state may have been changed by the
program itself, so that execution is influenced. Conversely, an effect-ive
function is one that changes state; such a change may influence parts of the
program that read it, including possibly itself.

This true program state is not the physical machine's one. Ideally, there
would be in the core language's organisation a clear definition of what state
is -- it could be called "state", or "world". An approximation in super simple
imperative languages is the set of global variables. (Output does not write
onto globals -- considering writing onto video port or memory state change is
close to nonsense ;-) In pure OO, this is more or less the set of objects /
object fields. (A func that does not affect any object field is effect-free.)
State is something the program can read (back); all the rest, such as writing
to unreachable parts of memory like for output, cannot have any consequence on
future process (*). I'm still far from clear on this topic; as of now, I think
only assignments to state, as so defined, should be considered effects.
This would lead to a far more practical notion of "purity", I guess, esp for
imperative and/or OO programming.


Denis

(*) Except possibly when using low level direct access to (pseudo) memory
addresses. Even then, one cannot read plain output ports, or write to plain
input ports, for instance.
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Nov 17 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 17 Nov 2010 02:03:05 -0500, Rainer Deyke <rainerd eldwood.com>  
wrote:

 On 11/16/2010 21:53, Steven Schveighoffer wrote:
 It makes me think that this is going to be extremely confusing for a
 while, because people are so used to pure being equated with a
 functional language, so when they see a function is pure but takes
 mutable data, they will be scratching their heads.  It would be awesome
 to make weakly pure the default, and it would also make it so we have to
 change much less code.

 Making functions weakly pure by default means that temporarily adding a
 tiny debug printf to any function will require a shitload of cascading
 'impure' annotations.  I would consider that completely unacceptable.

As would I.  But I think in the case of debugging, we can have "trusted  
pure."  This can be achieved by using extern(C) pure runtime functions.
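
A rough sketch of what such a "trusted pure" hook could look like, assuming we redeclare the C library's puts ourselves with a pure signature (this local declaration and the function name twice are illustrative assumptions, not an existing runtime API). The compiler takes the declaration on faith; nothing verifies that the C function really is pure:

```d
// Assumption: a hand-written prototype for libc's puts, labelled pure.
// The label is a promise made by the programmer, not checked by anyone.
extern (C) pure int puts(const(char)* s);

pure int twice(int n)
{
    // Debug-only trace; an optimizer may legally drop or merge such calls,
    // so this output must not be relied upon.
    cast(void) puts("twice called");
    return 2 * n;
}

void main()
{
    assert(twice(21) == 42);
}
```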

 (Unless, of course, purity is detected automatically without the use of
 annotations at all.)

That would be ideal, but the issue is that the compiler may only have the  
signature and not the implementation.  D would need to change its  
compilation model for this to work (and escape analysis, and link-time  
optimizations, etc.)

-Steve

Nov 17 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 16 November 2010 20:53:04 Steven Schveighoffer wrote:
 On Tue, 16 Nov 2010 16:04:18 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On Tuesday 16 November 2010 12:37:10 bearophile wrote:
 Jonathan M Davis:
 Pure is hard enough to deal with (especially since we probably should have
 made it the default, but it's too late for that now).

 
 Weakly pure on default isn't good for a language that is supposed to be
 somewhat compatible with C syntax, I think it breaks too many C
 functions.

 
 Well, like I said, it's too late at this point, and really, it would be
 good to
 have a nice way to deal with C functions and purity (particularly since
 most of
 them are pure anyway), but the result at present is that most functions
 should
 be marked with pure. And if you're marking more functions with pure than
 not,
 that would imply that the default should be (at least ideally) impure.
 Regardless, however, it's not reasonable for D to go for impure rather
 than pure
 at this point.

 
 everything you are saying seems to be backwards, stop it! ;)
 
 1. currently, the default is impure.
 2. Most functions will naturally be weakly pure, so making *pure* the
 default would seem more useful.
 
 It seems backwards to me to think pure functions should be the default, I
 mean, this isn't a functional language!  But you also have to forget
 everything you know about pure, because a weakly pure function is a very
 useful idiom, and it is most certainly not compatible with functional
 languages.  It's both imperative and can accept and return mutable data.
 
 It makes me think that this is going to be extremely confusing for a
 while, because people are so used to pure being equated with a functional
 language, so when they see a function is pure but takes mutable data, they
 will be scratching their heads.  It would be awesome to make weakly pure
 the default, and it would also make it so we have to change much less code.

I was not trying to separate out weakly pure and strongly pure. pure is pure as 
far as marking the functions go. Whether that purity is strong or weak depends on 
the parameters. And since most functions should at least be weakly pure, you end 
up marking most functions with pure. Ideally, you'd be marking functions for the 
uncommon case rather than the common one.

I do think that a serious downside to using pure to mark weak purity is that 
it's pretty much going to bury the difference. You're not using global variables, 
so you mark the function as pure. Whether it's actually strongly pure and thus 
the compiler can optimize it is then an optimization detail (though you can of 
course figure it out if you want to). I expect that that's pretty much what the 
situation is going to end up being.

Of course, the fact that C functions aren't marked as pure (even though in most 
cases they are) tends to put a damper on things, and the fact that you have to 
create multiple versions of the same function in different static if blocks when 
the purity depends on a templated function or type that the function is using 
also puts a major damper on things. However, the overall trend will likely be to 
mark next to everything as pure.

It would certainly be cool to have weakly pure be the default, but that would 
require adding impure or something similar for cases where a function can't even 
be weakly pure.

I would think that ideally, you'd make the default weakly pure, have impure (or 
something similar) to mark functions which can't even be weakly pure, and have 
full-on pure just be detected by the compiler (since it should be able to do 
that if weakly pure is the default). pure could be dropped entirely, or you 
could keep it to enforce that a function actually be strongly pure, forcing you 
to change the function if something changes to make it only weakly pure or 
outright impure. But I don't see that change having any chance of being made. 
Even if you kept pure and made it strongly pure only, adding impure (or whatever 
you wanted to call it) would mean adding a keyword, which always seems to go 
over badly around here. It would also mean changing a fair bit of code (mostly 
due to stdin, stdout, and C functions I expect). I think that it would be 
ultimately worth it though, as long as we were willing to pay the pain up front.

- Jonathan M Davis

Nov 16 2010
