
digitalmars.D - forcing weak purity

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
I have come across a dilemma.

Alex Rønne Petersen has a pull request changing some things in the GC to
pure. I think gc_collect() should be weak-pure, because it could
technically run on any memory allocation (which is already allowed in
pure functions), and it runs in a context that doesn't really affect
execution of the pure function.

So I think it should be able to be run inside a strong pure function.
But because it has no parameters and no return, marking it as pure makes
it strong pure, and an optimizing compiler can effectively remove the
call completely!

So how do we force something to be weak-pure? What I want is:

1. it can be called from a pure function
2. it will not be optimized out in any way.

This solution looks crappy to me:

void gc_collect(void *unused = null);

any other ideas?

-Steve
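
For readers unfamiliar with the distinction, here is a minimal D sketch of weak versus strong purity as the terms are used in this thread. The function names (`bump`, `square`, `useBoth`) are made up for illustration and are not part of druntime.

```d
// Weakly pure: marked pure, but it can mutate data reachable through its
// mutable pointer parameter, so a call to it cannot simply be dropped.
void bump(int* p) pure nothrow @nogc
{
    ++*p;
}

// Strongly pure: takes its argument by value and touches nothing else,
// so the result depends only on the argument.
int square(int x) pure nothrow @nogc
{
    return x * x;
}

// A strongly pure function may still call a weakly pure one, as long as
// the mutation stays confined to its own local state.
int useBoth(int x) pure nothrow @nogc
{
    int local = x;
    bump(&local);         // weakly pure call inside a pure function
    return square(local);
}

void main()
{
    assert(useBoth(2) == 9); // (2 + 1) squared
}
```

A `void` function with no parameters, such as gc_collect(), has no mutable parameter of this kind, which is why marking it pure makes it strongly pure and lets an optimizer treat the call as removable.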
May 22 2012
next sibling parent reply Alex Rønne Petersen <alex lycus.org> writes:
I'm in favor of what you suggested on GitHub: A @weak attribute to enforce weak purity for functions marked pure.

BTW, any compiler with alias analysis and LTO might even decide to remove the call even with the unused parameter, since it, well, isn't used. I think we need a language-level solution here.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 22 2012
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 May 2012 23:31:59 -0400, Alex Rønne Petersen <alex lycus.org> wrote:

> On 23-05-2012 05:22, Steven Schveighoffer wrote:
>> This solution looks crappy to me:
>>
>> void gc_collect(void *unused = null);
>
> BTW, any compiler with alias analysis and LTO might even decide to
> remove the call even with the unused parameter, since it, well, isn't
> used. I think we need a language-level solution here.

The LTO would have to be able to make decisions at link time based on purity, because the call *does* do things, it just doesn't use the parameter.

I have no idea, maybe you are right, it would be a hard problem to fix if this happened.

-Steve
May 23 2012
prev sibling parent Artur Skawina <art.08.09 gmail.com> writes:
A compiler can, once it notices the unused argument(s), (more or less) easily rewrite (clone) the function and modify the callers to use the new version. And when the LTO pass (re)compiles the program (from the intermediate representation in the object files) it could already see the cloned version, and optimize based on that. So it is possible.

artur
May 23 2012
prev sibling next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
We should make 'pure' mean strongly pure.

For weakly pure, we could introduce the 'doped' keyword :-D
May 22 2012
parent Don Clugston <dac nospam.com> writes:
On 23/05/12 07:05, Mehrdad wrote:
> We should make 'pure' mean strongly pure.
>
> For weakly pure, we could introduce the 'doped' keyword :-D

No, the keyword should be more like noglobal

I wish people would stop using this "weak purity" / "strong purity" terminology, it's very unhelpful. (And it's my fault. I've created a monster!)

There is absolutely no need for a keyword to mark (strong) purity, and "weak purity" isn't actually pure.

The real question being asked is, do we need something for logical purity? Note that we need the same thing for caching.

Or are the cases like this rare enough that we can just fake it with a cast?
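
One way to "fake it with a cast" is sketched below, assuming a current Phobos that provides `std.traits.SetFunctionAttributes`. The helper name `assumePure`, the counter `idCounter`, and the function `nextId` are illustrative only, not something proposed in this thread, and the cast is unchecked: the compiler simply takes the purity claim on faith.

```d
import std.traits : FunctionAttribute, SetFunctionAttributes,
    functionAttributes, functionLinkage, isDelegate, isFunctionPointer;

// Hypothetical helper: reinterpret a function pointer or delegate as pure.
// Nothing verifies the claim, so this is only as safe as the caller's reasoning.
auto assumePure(T)(T t)
if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.pure_;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) t;
}

int idCounter; // mutable module-level state, so nextId() is impure

int nextId()
{
    return ++idCounter;
}

// Compiles as pure only because of the cast above.
int pureCaller() pure
{
    return assumePure(&nextId)();
}

void main()
{
    immutable first = pureCaller();
    assert(first == 1);
    assert(idCounter == 1);
}
```

The result of the call is used here, so there is little reason for an optimizer to drop it. A result-less function cast this way would still be exposed to the elision problem described at the top of the thread, which is why a cast alone may not satisfy both requirements.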
May 23 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 06:00:11 -0400, Don Clugston <dac nospam.com> wrote:

> On 23/05/12 07:05, Mehrdad wrote:
>> We should make 'pure' mean strongly pure.
>>
>> For weakly pure, we could introduce the 'doped' keyword :-D
>
> No, the keyword should be more like noglobal

Well, it's actually noglobalorshared

> I wish people would stop using this "weak purity" / "strong purity"
> terminology, it's very unhelpful. (And it's my fault. I've created a
> monster!)

I think the feature itself is essential and useful. I just think it's a victim of historical naming.

> There is absolutely no need for a keyword to mark (strong) purity, and
> "weak purity" isn't actually pure.

Now that we have the proverbial foot in the door, and we see that weak pure... er.. noglobal functions are very useful, it might be worth investigating whether it makes sense to rename the keywords.

I agree that 'pure' is somewhat misleading on noglobal functions, simply because of the history of the pure keyword in other languages. But I really like how you don't have to care whether a function is actually pure or just noglobal, the compiler has a better perspective on what is fully pure and what is noglobal.

So are you suggesting that we replace pure wholesale with another keyword that better describes what D purity is? Or are you suggesting that we have to specifically mark fully-pure functions as pure to get the benefits?

> The real question being asked is, do we need something for logical
> purity? Note that we need the same thing for caching.

Yes. Memory allocation and deallocation is *not pure*, since it affects global state. But it should never participate in optimization.

> Or are the cases like this rare enough that we can just fake it with a
> cast?

As long as we can have a way to say "this can be called from pure/noglobal functions, but should *not* participate in pure optimizations", I think it's a valid solution.

-Steve
May 23 2012
prev sibling next sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 05/23/12 05:22, Steven Schveighoffer wrote:
> it has no parameters and no return, marking it as pure makes it strong pure,
> and an optimizing compiler can effectively remove the call completely!

Arguably a pure function not returning a value doesn't make sense...

D's definition of "pure" makes things a bit more complicated, and the fact that it is so vaguely defined doesn't help. Eg what does "a pure function can terminate the program" mean? A literal interpretation forbids eliminating any calls, or even moving them in a way that could affect control flow (by terminating early/late)...

Anyway, result-less "pure" functions obviously can have side effects, so removing calls to them shouldn't be allowed.

On 05/23/12 05:31, Alex Rønne Petersen wrote:
> I'm in favor of what you suggested on GitHub: A @weak attribute to enforce
> weak purity for functions marked pure.

No. "@weak" should be for defining weak symbols, reusing it for anything else would just cause confusion.

artur
May 23 2012
parent Alex Rønne Petersen <alex lycus.org> writes:
I had no idea what weak symbols were until I read your post. Just due to that fact, I am not convinced that weak symbols are interesting enough to occupy @weak.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 23 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Why can a pure function call a collection cycle? This is an impure operation by nature.

I think what is needed here is a way to break the type system to allow calling an impure function from a pure one.
May 23 2012
next sibling parent reply Alex Rønne Petersen <alex lycus.org> writes:
I think you're missing an amusing point:

class C
{
    this() pure {}
}

C foo() pure
{
    return new C(); // can trigger a collection!
}

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 23 2012
parent reply deadalnix <deadalnix gmail.com> writes:
OK, but no direct call to GC collect will be made, so the function doesn't need to be pure; it needs to be somehow hacked into the allocation mechanism, probably using compiler magic.

So basically the allocation mechanism has to break the type system to act like it is pure, even if it isn't.
May 23 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 23-05-2012 15:46, deadalnix wrote:
> OK, but no direct call to GC collect will be made, so the function
> doesn't need to be pure; it needs to be somehow hacked into the
> allocation mechanism, probably using compiler magic.

Sure there'll be a direct call to that. It's effectively what the GC does when collecting.

> So basically the allocation mechanism has to break the type system to
> act like it is pure, even if it isn't.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 23 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 23/05/2012 15:47, Alex Rønne Petersen wrote:
> I think you're missing an amusing point:
>
> class C
> {
>     this() pure {}
> }
>
> C foo() pure
> {
>     return new C(); // can trigger a collection!
> }

Thinking about this again, it shows something interesting. To make sense, allocation in a pure function should be either scoped or immutable. Otherwise we can't ensure any « strong purity » for a function that returns anything that can reference something else.

>> OK, but no direct call to GC collect will be made, so the function
>> doesn't need to be pure; it needs to be somehow hacked into the
>> allocation mechanism, probably using compiler magic.
>
> Sure there'll be a direct call to that. It's effectively what the GC
> does when collecting.

At this point, you are in the allocation mechanism in druntime. You are not in pure code anymore. If it is the case, then the type system is broken at the wrong place.
May 23 2012
parent Alex Rønne Petersen <alex lycus.org> writes:
The runtime is a bit of a special case in the first place. It's an extremely low-level component of the language.

Mind you, I'm not saying there isn't room for improvement in our type system!

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 23 2012
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 08:21:42 -0400, deadalnix <deadalnix gmail.com> wrote:

> Why can a pure function call a collection cycle? This is an impure
> operation by nature.

Yes. Memory allocation and deallocation from a global heap is by definition an impure operation (it affects global state). However, we must make exceptions because without being able to allocate memory, pure functions become quite trivial and useless.

In functional languages, if such exceptions were not granted, a program would not be able to do much of anything.

> I think what is needed here is a way to break the type system to allow
> calling an impure function from a pure one.

We already are breaking the type system by marking an impure function (a function whose implementation is not marked pure) pure. This only works for extern(C) functions, which are not mangled with the pure attribute.

But we need to be more specific about how the compiler should treat these functions. They need to undergo no optimization based on purity. They simply need to be able to be called from a pure function.

-Steve
May 23 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Yes, you are missing the point. collect is not something you should be able to call in a pure function.

It can be triggered by allocating, but at this point you already are in an impure context called from a pure context.

In the end, you need an unsafe way to call impure code in pure functions.
May 23 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 23/05/2012 15:57, Steven Schveighoffer wrote:
> I'm failing to see an argument in this response. If I can call an impure
> function for allocating memory, why is it illegal to call an impure
> function for collecting unused memory?

Allocating is a much simpler operation than collecting, and its impact is much smaller.

Plus, allocating memory is mandatory to do anything non-trivial. GC collect isn't that important, as it can be triggered by the allocation mechanism itself (and this mechanism is already impure).

If you allow everything impure to be done in a pure function based on the fact that allocating is impure, what is the point of having pure functions at all?
May 23 2012
next sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 23-05-2012 16:03, deadalnix wrote:
> Allocating is a much simpler operation than collecting, and its impact
> is much smaller.

Yes, but allocation implies possible collection!

> Plus, allocating memory is mandatory to do anything non-trivial. GC
> collect isn't that important, as it can be triggered by the allocation
> mechanism itself (and this mechanism is already impure).
>
> If you allow everything impure to be done in a pure function based on
> the fact that allocating is impure, what is the point of having pure
> functions at all?

The point is *real world practicality*. You need to be able to use the core.memory API in pure functions to do anything memory-heavy.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 23 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
You'll find a difference between:

pure void foo()
{
    new Stuff(); // May collect
}

and

pure void foo()
{
    gc.collect();
}

The second one is obviously not pure. And the first one needs a hook into an allocating function in druntime, a function which isn't pure but is made to look like a pure one, and that is able to call the impure gc.collect.

gc.collect is a system-wide procedure that involves all threads running in your application and every single piece of memory in it. This is probably the farthest thing from a pure function I can think of.
May 23 2012
parent Alex Rønne Petersen <alex lycus.org> writes:
On 23-05-2012 16:42, deadalnix wrote:
 Le 23/05/2012 16:03, deadalnix a écrit :
 Le 23/05/2012 15:57, Steven Schveighoffer a écrit :
 On Wed, 23 May 2012 09:52:31 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 Le 23/05/2012 14:35, Steven Schveighoffer a écrit :
 Yes. Memory allocation and deallocation from a global heap is by
 definition an impure operation (it affects global state). However, we
 must make exceptions because without being able to allocate memory,
 pure
 functions become quite trivial and useless.

 In functional languages, if such exceptions were not granted, a
 program
 would not be able to do much of anything.

Yes, you are missing the point. collect is not something you should be able to call in a pure function. It can be triggered by allocating, but at this point you already are in an impure context called from a pure context. At the end, you need an unsafe way to call impure code in pure functions.

I'm failing to see an argument in this response. If I can call an impure function for allocating memory, why is it illegal to call an impure function for collecting unused memory? -Steve

Allocating is a much more simpler operation than collect, and its impact is way more reduced. Plus, allocating memory is something mandatory to do anything non trivial. GC collect isn't that important, as it can be triggered by allocating mecanism itself (and this mecanism is already impure). If you do allow everything impure to be done on pure function based on the fact that allocating is impure, what is the point to have pure function at all ?

You'll find a difference between:

pure void foo() {
    new Stuff(); // May collect
}

and

pure void foo() {
    gc.collect();
}

The second one is obviously not pure. The first one needs a hook to an allocating function in druntime, a function which isn't pure but is made to look like a pure one, and which is able to call the impure gc.collect.

gc.collect is a system-wide procedure that involves all threads running in your application and every single piece of memory in it. This is probably the furthest thing from a pure function I can think of.

You're still just lying to yourself by thinking that a GC allocation is strictly pure. I understand that there is a difference between implicit and explicit collection, but it boils down to the same thing. D is a systems language, and a pragmatic language. We're not Haskell. Let's be practical. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 09:52:31 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 23/05/2012 14:35, Steven Schveighoffer a écrit :
 Yes. Memory allocation and deallocation from a global heap is by
 definition an impure operation (it affects global state). However, we
 must make exceptions because without being able to allocate memory, pure
 functions become quite trivial and useless.

 In functional languages, if such exceptions were not granted, a program
 would not be able to do much of anything.

 Yes, you are missing the point. collect is not something you should be
 able to call in a pure function.

 It can be triggered by allocating, but at this point you already are in
 an impure context called from a pure context.

 At the end, you need an unsafe way to call impure code in pure functions.

I'm failing to see an argument in this response. If I can call an impure function for allocating memory, why is it illegal to call an impure function for collecting unused memory?

-Steve
May 23 2012
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 10:03:01 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 23/05/2012 15:57, Steven Schveighoffer a écrit :
 On Wed, 23 May 2012 09:52:31 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 Le 23/05/2012 14:35, Steven Schveighoffer a écrit :
 Yes. Memory allocation and deallocation from a global heap is by
 definition an impure operation (it affects global state). However, we
 must make exceptions because without being able to allocate memory,
 pure
 functions become quite trivial and useless.

 In functional languages, if such exceptions were not granted, a
 program
 would not be able to do much of anything.

 Yes, you are missing the point. collect is not something you should be
 able to call in a pure function. It can be triggered by allocating, but
 at this point you already are in an impure context called from a pure
 context. At the end, you need an unsafe way to call impure code in pure
 functions.

 I'm failing to see an argument in this response. If I can call an impure
 function for allocating memory, why is it illegal to call an impure
 function for collecting unused memory?

 -Steve

 Allocating is a much more simpler operation than collect, and its impact
 is way more reduced.

Huh? Allocation *does* a collect! How can it be "simpler"?

 If you do allow everything impure to be done on pure function based on
 the fact that allocating is impure, what is the point to have pure
 function at all ?

Memory collection is hardly "everything." We are looking at the few exceptions needed to make the system usable.

-Steve
May 23 2012
prev sibling next sibling parent reply Don Clugston <dac nospam.com> writes:
On 23/05/12 05:22, Steven Schveighoffer wrote:
 I have come across a dilemma.

 Alex Rønne Petersen has a pull request changing some things in the GC to
 pure. I think gc_collect() should be weak-pure, because it could
 technically run on any memory allocation (which is already allowed in
 pure functions), and it runs in a context that doesn't really affect
 execution of the pure function.

 So I think it should be able to be run inside a strong pure function.

I am almost certain it should not. And I think this is quite important. A strongly pure function should be considered to have its own gc, and should not be able to collect any memory it did not allocate itself. Memory allocation from a pure function might trigger a gc cycle, but it would ONLY look at the memory allocated inside that pure function.
May 23 2012
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 09:17:43 -0400, Don Clugston <dac nospam.com> wrote:

 On 23/05/12 05:22, Steven Schveighoffer wrote:
 I have come across a dilemma.

 Alex Rønne Petersen has a pull request changing some things in the GC to
 pure. I think gc_collect() should be weak-pure, because it could
 technically run on any memory allocation (which is already allowed in
 pure functions), and it runs in a context that doesn't really affect
 execution of the pure function.

 So I think it should be able to be run inside a strong pure function.

 I am almost certain it should not.

 And I think this is quite important. A strongly pure function should be
 considered to have its own gc, and should not be able to collect any
 memory it did not allocate itself.

Well, given that the above is not implemented, what do you propose for the meantime?

 Memory allocation from a pure function might trigger a gc cycle, but it
 would ONLY look at the memory allocated inside that pure function.

What if memory is tight, and the only way to get memory for this new allocation is to collect from the main heap? This seems an odd limitation, since strong-pure functions would not be affected by collecting in the main heap *at all*.

-Steve
May 23 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 23/05/2012 15:52, Steven Schveighoffer a écrit :
 What if memory is tight, and the only way to get memory for this new
 allocation is to collect from the main heap? This seems an odd
 limitation, since strong-pure functions would not be affected by
 collecting in the main heap *at all*.

 -Steve

It is up to the GC to collect memory when it is tight. This operation isn't and shouldn't be pure. It can, if needed, be called from within the allocation mechanism.
May 23 2012
prev sibling next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 23-05-2012 15:17, Don Clugston wrote:
 On 23/05/12 05:22, Steven Schveighoffer wrote:
 I have come across a dilemma.

 Alex Rønne Petersen has a pull request changing some things in the GC to
 pure. I think gc_collect() should be weak-pure, because it could
 technically run on any memory allocation (which is already allowed in
 pure functions), and it runs in a context that doesn't really affect
 execution of the pure function.

 So I think it should be able to be run inside a strong pure function.

I am almost certain it should not. And I think this is quite important. A strongly pure function should be considered to have its own gc, and should not be able to collect any memory it did not allocate itself. Memory allocation from a pure function might trigger a gc cycle, but it would ONLY look at the memory allocated inside that pure function.

Implementing this on a per-function basis is not very realistic. Some programs have hundreds (if not thousands) of pure functions. Not to mention, we'd need some mechanism akin to critical regions to figure out when a thread is in a pure function during stop-the-world. Further, data allocated in a pure function f() in thread A must not be touched by a collection triggered by an allocation inside f() in thread B. It'd be a huge mess. And, frankly, if my program dies from an OOME due to pure functions being unable to do full collection cycles, I'd just stop using pure permanently. It's not a very realistic approach to automatic memory management; at that point, manual memory management would work better. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
parent reply Don Clugston <dac nospam.com> writes:
On 23/05/12 15:56, Alex Rønne Petersen wrote:
 On 23-05-2012 15:17, Don Clugston wrote:
 On 23/05/12 05:22, Steven Schveighoffer wrote:
 I have come across a dilemma.

 Alex Rønne Petersen has a pull request changing some things in the GC to
 pure. I think gc_collect() should be weak-pure, because it could
 technically run on any memory allocation (which is already allowed in
 pure functions), and it runs in a context that doesn't really affect
 execution of the pure function.

 So I think it should be able to be run inside a strong pure function.

I am almost certain it should not. And I think this is quite important. A strongly pure function should be considered to have its own gc, and should not be able to collect any memory it did not allocate itself. Memory allocation from a pure function might trigger a gc cycle, but it would ONLY look at the memory allocated inside that pure function.

Implementing this on a per-function basis is not very realistic. Some programs have hundreds (if not thousands) of pure functions.

No, it's not realistic for every function. But it's extremely easy for others. In particular, if you have a pure function which has no reference parameters, you just need a pointer to the last point a strongly pure function was entered. This partitions the heap into two parts. Each can be gc'd independently. And, in the non-pure part, nothing is happening. Once you've done a GC there, you NEVER need to do it again.
 Not to mention, we'd need some mechanism akin to critical regions to
 figure out when a thread is in a pure function during stop-the-world.
 Further, data allocated in a pure function f() in thread A must not be
 touched by a collection triggered by an allocation inside f() in thread
 B. It'd be a huge mess.

Not so. It's impossible for anything outside of a strongly pure function to hold a pointer to memory allocated by the pure function. In my view, this is the single most interesting feature of purity.
 And, frankly, if my program dies from an OOME due to pure functions
 being unable to do full collection cycles, I'd just stop using pure
 permanently. It's not a very realistic approach to automatic memory
 management; at that point, manual memory management would work better.

Of course. But I don't see how that's relevant. How the pure function actually obtains its memory is an implementation detail. There's a huge difference between "a global collection *may* be performed from a pure function" vs "it *must* be possible to force a global collection from a pure function". The difficulty in expressing the latter is a simple consequence of the fact that it is intrinsically impure.
May 23 2012
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 23-05-2012 17:29, Don Clugston wrote:
 On 23/05/12 15:56, Alex Rønne Petersen wrote:
 On 23-05-2012 15:17, Don Clugston wrote:
 On 23/05/12 05:22, Steven Schveighoffer wrote:
 I have come across a dilemma.

 Alex Rønne Petersen has a pull request changing some things in the
 GC to
 pure. I think gc_collect() should be weak-pure, because it could
 technically run on any memory allocation (which is already allowed in
 pure functions), and it runs in a context that doesn't really affect
 execution of the pure function.

 So I think it should be able to be run inside a strong pure function.

I am almost certain it should not. And I think this is quite important. A strongly pure function should be considered to have its own gc, and should not be able to collect any memory it did not allocate itself. Memory allocation from a pure function might trigger a gc cycle, but it would ONLY look at the memory allocated inside that pure function.

Implementing this on a per-function basis is not very realistic. Some programs have hundreds (if not thousands) of pure functions.

No, it's not realistic for every function. But it's extremely easy for others. In particular, if you have a pure function which has no reference parameters, you just need a pointer to the last point a strongly pure function was entered. This partitions the heap into two parts. Each can be gc'd independently. And, in the non-pure part, nothing is happening. Once you've done a GC there, you NEVER need to do it again.
 Not to mention, we'd need some mechanism akin to critical regions to
 figure out when a thread is in a pure function during stop-the-world.
 Further, data allocated in a pure function f() in thread A must not be
 touched by a collection triggered by an allocation inside f() in thread
 B. It'd be a huge mess.

Not so. It's impossible for anything outside of a strongly pure function to hold a pointer to memory allocated by the pure function.

Not sure I follow:

immutable(int)* foo() pure
{
    return new int;
}

void main()
{
    auto ptr = foo();
    // we now have a pointer to memory allocated by a pure function?
}

Unless, of course, you consider this weakly pure. But at that point, strongly pure functions are starting to get very, very useless.
 In my view, this is the single most interesting feature of purity.

 And, frankly, if my program dies from an OOME due to pure functions
 being unable to do full collection cycles, I'd just stop using pure
 permanently. It's not a very realistic approach to automatic memory
 management; at that point, manual memory management would work better.

Of course. But I don't see how that's relevant. How the pure function actually obtains its memory is an implementation detail. There's a huge difference between "a global collection *may* be performed from a pure function" vs "it *must* be possible to force a global collection from a pure function". The difficulty in expressing the latter is a simple consequence of the fact that it is intrinsically impure.

-- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 23-05-2012 17:56, Steven Schveighoffer wrote:
 On Wed, 23 May 2012 11:41:00 -0400, Alex Rønne Petersen
 <alex lycus.org> wrote:

 On 23-05-2012 17:29, Don Clugston wrote:

 Not so. It's impossible for anything outside of a strongly pure function
 to hold a pointer to memory allocated by the pure function.

 Not sure I follow:

 immutable(int)* foo() pure
 {
     return new int;
 }

 void main()
 {
     auto ptr = foo();
     // we now have a pointer to memory allocated by a pure function?
 }

 I think what Don means is this:

 1. upon entry into a strong-pure function, record a GC context that remembers what point in the stack it entered (no need to search above that stack), and uses its parameters as "context roots".
 2. Any collection performed while *in* the strong-pure function explicitly will simply deal with the contexted GC data. It does not need to look at the main heap, except for those original roots.
 3. upon exiting, you can remove the original roots, and add the return value as a root, and run one final GC collection within the context. This should deterministically clean up any memory that was temporary while inside the pure function.
 4. Anything that is left in the contexted GC is assimilated into the main GC.

 Everything can be done without using a GC lock *except* the final assimilation (which may not need to lock because there is nothing to add).

 Don, why can't gc_collect do the right thing based on whether it's in a contexted GC or not? The compiler is going to have to initialize the context when first entering a strong-pure function, so we should be able to have a hook recording that the thread is using a pure-function context, no?

 Also, let's assume it's a separate call to do pure function gc_collect, i.e. we have a pure gc_pureCollect function.

 What if a weak-pure function calls this? What happens when it is not called from within a strong-pure function?

 I still think gc_collect can be marked pure and do the right thing.

 -Steve

I still don't think my concern about pure functions allocating tons of memory (and thus requiring global GC to happen) has been addressed, though. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 23/05/2012 18:22, Alex Rønne Petersen a écrit :
 On 23-05-2012 17:56, Steven Schveighoffer wrote:
 On Wed, 23 May 2012 11:41:00 -0400, Alex Rønne Petersen
 <alex lycus.org> wrote:

 On 23-05-2012 17:29, Don Clugston wrote:

 Not so. It's impossible for anything outside of a strongly pure
 function
 to hold a pointer to memory allocated by the pure function.

 Not sure I follow:

 immutable(int)* foo() pure
 {
     return new int;
 }

 void main()
 {
     auto ptr = foo();
     // we now have a pointer to memory allocated by a pure function?
 }

 I think what Don means is this:

 1. upon entry into a strong-pure function, record a GC context that remembers what point in the stack it entered (no need to search above that stack), and uses its parameters as "context roots".
 2. Any collection performed while *in* the strong-pure function explicitly will simply deal with the contexted GC data. It does not need to look at the main heap, except for those original roots.
 3. upon exiting, you can remove the original roots, and add the return value as a root, and run one final GC collection within the context. This should deterministically clean up any memory that was temporary while inside the pure function.
 4. Anything that is left in the contexted GC is assimilated into the main GC.

 Everything can be done without using a GC lock *except* the final assimilation (which may not need to lock because there is nothing to add).

 Don, why can't gc_collect do the right thing based on whether it's in a contexted GC or not? The compiler is going to have to initialize the context when first entering a strong-pure function, so we should be able to have a hook recording that the thread is using a pure-function context, no?

 Also, let's assume it's a separate call to do pure function gc_collect, i.e. we have a pure gc_pureCollect function.

 What if a weak-pure function calls this? What happens when it is not called from within a strong-pure function?

 I still think gc_collect can be marked pure and do the right thing.

 -Steve

I still don't think my concern about pure functions allocating tons of memory (and thus requiring global GC to happen) has been addressed, though.

If the pure function allocates a ton of memory, one of these allocations will trigger the collection. Why is that an issue?
May 23 2012
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 23-05-2012 19:19, deadalnix wrote:
 Le 23/05/2012 18:22, Alex Rønne Petersen a écrit :
 On 23-05-2012 17:56, Steven Schveighoffer wrote:
 On Wed, 23 May 2012 11:41:00 -0400, Alex Rønne Petersen
 <alex lycus.org> wrote:

 On 23-05-2012 17:29, Don Clugston wrote:

 Not so. It's impossible for anything outside of a strongly pure
 function
 to hold a pointer to memory allocated by the pure function.

 Not sure I follow:

 immutable(int)* foo() pure
 {
     return new int;
 }

 void main()
 {
     auto ptr = foo();
     // we now have a pointer to memory allocated by a pure function?
 }

 I think what Don means is this:

 1. upon entry into a strong-pure function, record a GC context that remembers what point in the stack it entered (no need to search above that stack), and uses its parameters as "context roots".
 2. Any collection performed while *in* the strong-pure function explicitly will simply deal with the contexted GC data. It does not need to look at the main heap, except for those original roots.
 3. upon exiting, you can remove the original roots, and add the return value as a root, and run one final GC collection within the context. This should deterministically clean up any memory that was temporary while inside the pure function.
 4. Anything that is left in the contexted GC is assimilated into the main GC.

 Everything can be done without using a GC lock *except* the final assimilation (which may not need to lock because there is nothing to add).

 Don, why can't gc_collect do the right thing based on whether it's in a contexted GC or not? The compiler is going to have to initialize the context when first entering a strong-pure function, so we should be able to have a hook recording that the thread is using a pure-function context, no?

 Also, let's assume it's a separate call to do pure function gc_collect, i.e. we have a pure gc_pureCollect function.

 What if a weak-pure function calls this? What happens when it is not called from within a strong-pure function?

 I still think gc_collect can be marked pure and do the right thing.

 -Steve

I still don't think my concern about pure functions allocating tons of memory (and thus requiring global GC to happen) has been addressed, though.

If the pure function allocates a ton of memory, one of these allocations will trigger the collection. Why is that an issue?

Not with the scheme Steven first proposed, but it will with his clarification, so all is good. That said, I'm really not convinced it's worth the effort, but as long as a "pure GC" doesn't start breaking my programs, I won't complain. ;) -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 23/05/2012 17:29, Don Clugston a écrit :
 There's a huge difference between "a global collection *may* be
 performed from a pure function" vs "it *must* be possible to force a
 global collection from a pure function".

Thank you !
May 23 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 23-05-2012 19:16, deadalnix wrote:
 Le 23/05/2012 17:29, Don Clugston a écrit :
 There's a huge difference between "a global collection *may* be
 performed from a pure function" vs "it *must* be possible to force a
 global collection from a pure function".

Thank you !

I personally disagree that this should be a rationale to not allow the latter. D is a systems language and we really should stop trying to pretend that it isn't. There's a reason we have a core.memory module that lets us control the GC. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 23 2012
parent Don Clugston <dac nospam.com> writes:
On 24/05/12 02:26, Alex Rønne Petersen wrote:
 On 23-05-2012 19:16, deadalnix wrote:
 Le 23/05/2012 17:29, Don Clugston a écrit :
 There's a huge difference between "a global collection *may* be
 performed from a pure function" vs "it *must* be possible to force a
 global collection from a pure function".

Thank you !

I personally disagree that this should be a rationale to not allow the latter. D is a systems language and we really should stop trying to pretend that it isn't. There's a reason we have a core.memory module that lets us control the GC.

This is all about not exposing quirks of the current implementation. The way it currently is, would get you to perform a gc before you enter the first pure function. After that, the only possible garbage to collect would have been generated from inside the pure function. And that should be very cheap to collect.
May 24 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 09:56:58 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 23/05/2012 15:52, Steven Schveighoffer a écrit :
 What if memory is tight, and the only way to get memory for this new
 allocation is to collect from the main heap? This seems an odd
 limitation, since strong-pure functions would not be affected by
 collecting in the main heap *at all*.

 -Steve

 It is up to the GC to collect memory when it is tight. This operation
 isn't and shouldn't be pure.

 It can eventually be called within the allocation mecanism.

Don has said that any collection cycle triggered by a pure function should not touch the main heap, only memory allocated from within this function (or 'pure function' stack I guess).

This means that a strong-pure function could throw an OutOfMemoryError, whereas a non-pure one would run fine.

I'm with Alex on this, this situation is not tenable.

-Steve
May 23 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 11:41:00 -0400, Alex Rønne Petersen <alex lycus.org> wrote:

 On 23-05-2012 17:29, Don Clugston wrote:

 Not so. It's impossible for anything outside of a strongly pure function
 to hold a pointer to memory allocated by the pure function.

 Not sure I follow:

 immutable(int)* foo() pure
 {
     return new int;
 }

 void main()
 {
     auto ptr = foo();
     // we now have a pointer to memory allocated by a pure function?
 }

I think what Don means is this:

1. upon entry into a strong-pure function, record a GC context that remembers what point in the stack it entered (no need to search above that stack), and uses its parameters as "context roots".
2. Any collection performed while *in* the strong-pure function explicitly will simply deal with the contexted GC data. It does not need to look at the main heap, except for those original roots.
3. upon exiting, you can remove the original roots, and add the return value as a root, and run one final GC collection within the context. This should deterministically clean up any memory that was temporary while inside the pure function.
4. Anything that is left in the contexted GC is assimilated into the main GC.

Everything can be done without using a GC lock *except* the final assimilation (which may not need to lock because there is nothing to add).

Don, why can't gc_collect do the right thing based on whether it's in a contexted GC or not? The compiler is going to have to initialize the context when first entering a strong-pure function, so we should be able to have a hook recording that the thread is using a pure-function context, no?

Also, let's assume it's a separate call to do pure function gc_collect, i.e. we have a pure gc_pureCollect function.

What if a weak-pure function calls this? What happens when it is not called from within a strong-pure function?

I still think gc_collect can be marked pure and do the right thing.

-Steve
May 23 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 23 May 2012 12:22:39 -0400, Alex Rønne Petersen <alex lycus.org> wrote:

 On 23-05-2012 17:56, Steven Schveighoffer wrote:
 On Wed, 23 May 2012 11:41:00 -0400, Alex Rønne Petersen
 <alex lycus.org> wrote:

 On 23-05-2012 17:29, Don Clugston wrote:

 Not so. It's impossible for anything outside of a strongly pure
 function
 to hold a pointer to memory allocated by the pure function.

 Not sure I follow:

 immutable(int)* foo() pure
 {
     return new int;
 }

 void main()
 {
     auto ptr = foo();
     // we now have a pointer to memory allocated by a pure function?
 }

 I think what Don means is this:

 1. upon entry into a strong-pure function, record a GC context that
 remembers what point in the stack it entered (no need to search above
 that stack), and uses its parameters as "context roots".
 2. Any collection performed while *in* the strong-pure function
 explicitly will simply deal with the contexted GC data. It does not need
 to look at the main heap, except for those original roots.
 3. upon exiting, you can remove the original roots, and add the return
 value as a root, and run one final GC collection within the context.
 This should deterministically clean up any memory that was temporary
 while inside the pure function.
 4. Anything that is left in the contexted GC is assimilated into the
 main GC.

 Everything can be done without using a GC lock *except* the final
 assimilation (which may not need to lock because there is nothing to
 add).

 Don, why can't gc_collect do the right thing based on whether it's in a
 contexted GC or not? The compiler is going to have to initialize the
 context when first entering a strong-pure function, so we should be able
 to have a hook recording that the thread is using a pure-function
 context, no?

 Also, let's assume it's a separate call to do pure function gc_collect,
 i.e. we have a pure gc_pureCollect function.

 What if a weak-pure function calls this? What happens when it is not
 called from within a strong-pure function?

 I still think gc_collect can be marked pure and do the right thing.

 -Steve

 I still don't think my concern about pure functions allocating tons of
 memory (and thus requiring global GC to happen) has been addressed,
 though.

What I forgot is:

2a. Any collection performed *implicitly* during strong-pure function may run a full collection cycle.

-Steve
May 23 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 24 May 2012 04:58:56 -0400, Don Clugston <dac nospam.com> wrote:

 On 24/05/12 02:26, Alex Rønne Petersen wrote:
 On 23-05-2012 19:16, deadalnix wrote:
 Le 23/05/2012 17:29, Don Clugston a écrit :
 There's a huge difference between "a global collection *may* be
 performed from a pure function" vs "it *must* be possible to force a
 global collection from a pure function".

 Thank you !

 I personally disagree that this should be a rationale to not allow the
 latter. D is a systems language and we really should stop trying to
 pretend that it isn't. There's a reason we have a core.memory module
 that lets us control the GC.

 This is all about not exposing quirks of the current implementation.
 The way it currently is, would get you to perform a gc before you enter
 the first pure function.
 After that, the only possible garbage to collect would have been
 generated from inside the pure function. And that should be very cheap
 to collect.

The more I think about it, the more I believe that what gc_collect does (i.e. run a full collect, or run a pure-function specific collect) is an implementation detail.

I don't think exposing gc_collect is a quirk of the current implementation, and it should be marked pure (weak purity).

This whole thread has kind of flown way off topic. Regardless of whether gc_collect should be callable from a pure function, there are other use cases for considering logically pure functions as pure (for the same reasons you can cast away const). Should we be able to force weak purity or not? If so, how to do it?

-Steve
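For reference, one way "forcing" purity can be sketched in library code is to cast a function pointer to a pure type, much as one casts away const. The sketch below assumes std.traits.SetFunctionAttributes is available (it is in later Phobos releases); assumePure, impureCollect, collections, logicallyPureCollect and work are illustrative names, and impureCollect is only a stand-in for a real collection call.

```d
import std.traits : FunctionAttribute, SetFunctionAttributes,
    functionAttributes, functionLinkage, isDelegate, isFunctionPointer;

// Hypothetical helper: reinterpret a function pointer or delegate as pure.
// This is an unchecked promise to the compiler, like casting away const.
auto assumePure(T)(T fn) @system
if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.pure_;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) fn;
}

// Stand-in for an impure runtime operation such as a forced collection.
int collections;
void impureCollect() { ++collections; }

// The unused mutable pointer parameter keeps this wrapper weakly pure,
// so a call to it is not a candidate for removal as a strongly pure no-op.
void logicallyPureCollect(void* unused = null) pure
{
    assumePure(&impureCollect)();
}

// A strongly pure function that can now request the "collection".
int work(int x) pure
{
    logicallyPureCollect();
    return x * 2;
}
```

This combines the dummy-parameter trick from the first post with an explicit cast. Whether the language should offer something cleaner than either is exactly the open question.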
May 24 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 24, 2012 at 08:56:25AM -0400, Steven Schveighoffer wrote:
[...]
 The more I think about it, the more I believe that what gc_collect
 does (i.e. run a full collect, or run a pure-function specific
 collect) is an implementation detail.

I agree. Purity only exists at a certain level of abstraction. Even "completely pure" languages like Haskell are ultimately compiled into machine code that modifies pointer memory, does pointer arithmetic, and performs all sorts of other "impure" operations. Yet at a certain level of abstraction, that is, at the language-level abstraction, all functions are still pure. Purity is not violated because the "impure operations" are part of the Haskell runtime, which exposes a pure interface to the programmer. IOW, the impure operations are implementation details.

So it's unreasonable to require that all low-level operations be pure. At a certain level, the D runtime has to do "impure", or low-level, operations. That doesn't mean stuff built on top of the runtime can't be pure. I would argue that since the GC is part of the runtime, it should be regarded as part of the "implementation details" that are allowed to do "impure" operations, as long as the runtime itself provides an API that abstractly exposes a pure system.

In that sense, I think that things like triggering a GC collection, etc., are just an implementation detail. We explicitly trigger collection in order to optimize performance. In an ideal world, the compiler would divine how to do this automatically without needing our code to call "impure" GC functions. But we're not in an ideal world, and sometimes we do need to allow pure functions to optimize by hinting to the GC when is a good time to run a collection. I don't think this should violate purity. Otherwise, the notion of purity would be greatly limited in its usefulness.

T

-- 
Bomb technician: If I'm running, try to keep up.
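As a hedged illustration of a runtime exposing a pure interface over impure machinery: later druntime versions provide core.memory.pureMalloc and core.memory.pureFree, which wrap the C heap behind pure signatures. The sketch below assumes a druntime recent enough to have them; sumOfSquares is a made-up example function.

```d
import core.memory : pureFree, pureMalloc;

// Sums 0^2 + 1^2 + ... + (n-1)^2 using a temporary C-heap buffer.
// Callers see only a pure function; the malloc/free pair underneath
// is an implementation detail hidden behind druntime's pure wrappers.
int sumOfSquares(int n) pure nothrow @nogc @trusted
{
    if (n <= 0)
        return 0;

    auto buf = cast(int*) pureMalloc(cast(size_t) n * int.sizeof);
    assert(buf !is null, "allocation failed");
    scope (exit) pureFree(buf); // released on every exit path

    foreach (i; 0 .. n)
        buf[i] = i * i;

    int total = 0;
    foreach (i; 0 .. n)
        total += buf[i];
    return total;
}
```

The buffer never escapes the function, so the global heap activity is not observable by the caller. That is the sense in which the impurity stays an implementation detail.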
May 24 2012
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Don Clugston:

 The real question being asked is, do we need something for 
 logical purity?

You mean something like trusted_pure?
 Note that we need the same thing for caching.

Regarding caching, memoization, weak references, and related things, I prefer a more principled approach, like the ones I've shown in some Haskell-related papers. Bye, bearophile
May 23 2012