
digitalmars.D - Disadvantages of ARC

reply Max Klyga <max.klyga gmail.com> writes:
The anti-GC crowd tries to promote ARC as a deterministic alternative for 
memory management.
I noticed that people promoting ARC do not list any disadvantages 
of the proposed approach.

The thing is, in gamedev and other soft real-time software, 
only a handful of resource types are really managed by RC, and memory 
usage patterns are VERY specific to the domain (mostly linear 
allocation/deallocation, and objects with non-deterministic lifetimes are 
preallocated in pools).

Trying to use RC as a general method of memory management leads to some 
problems.
A pretty detailed view by Jon Harrop (he is somewhat known for 
trolling in the PL community, but nonetheless knows what he is talking 
about) - 
http://www.quora.com/Computer-Programming/How-do-reference-counting-and-garbage-collection-compare/answer/Jon-Harrop-1?srid=3Gvg&share=1#


So RC could also introduce unpredictable pause times at undesired places.
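
To make the pause argument concrete, here is a minimal sketch in D. It assumes a hand-rolled, non-atomic reference count on a malloc-allocated linked list; Node, makeNode, release, freedNodes and buildAndRelease are hypothetical names for illustration, not Phobos or druntime APIs. Dropping the last reference to the head frees every node in a single call, so the cost of one release grows with the size of the structure:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical reference-counted list node (non-atomic count).
struct Node
{
    size_t refs; // number of owners
    Node* next;  // owning reference to the next node
}

size_t freedNodes; // counts how many nodes have been freed so far

// Creates a node holding one reference; takes over the caller's
// reference to `next`.
Node* makeNode(Node* next)
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null);
    n.refs = 1;
    n.next = next;
    return n;
}

// Drops one reference. If it was the last one, the whole tail is
// released too: one call can do work proportional to the list length.
void release(Node* n)
{
    while (n !is null && --n.refs == 0)
    {
        auto next = n.next;
        free(n);
        ++freedNodes;
        n = next;
    }
}

// Builds a list of `length` nodes and reports how many nodes a single
// release of the head ends up freeing.
size_t buildAndRelease(size_t length)
{
    Node* head = null;
    foreach (i; 0 .. length)
        head = makeNode(head);
    immutable before = freedNodes;
    release(head);
    return freedNodes - before;
}

unittest
{
    // A single release call frees all 1000 nodes at once.
    assert(buildAndRelease(1000) == 1000);
}
```

Whether such a cascade matters depends on where the last reference happens to die, which is the "undesired places" part of the argument.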

This is also confirmed by research from HP - 
http://www.hpl.hp.com/personal/Hans_Boehm/popl04/refcnt.pdf

My point is that we should not ruin the language's ease of use. We do 
need to deal with Phobos's internal allocations, but we should not switch 
to ARC as the default memory management scheme. In practice, people 
promoting ARC will probably not use Phobos anyway. Currently it's just 
an excuse not to use D.

Look at C++ and the STL, etc. People will roll their own solutions no 
matter what you try.
Feb 06 2014
next sibling parent "Dejan Lekic" <dejan.lekic gmail.com> writes:
On Thursday, 6 February 2014 at 11:37:59 UTC, Max Klyga wrote:

Nicely said. I believe both approaches should be available. Those who must work without the GC should be able to do that easily, but GC should be the default simply because it is better for general programming tasks. If D turns out to be used for something else, and the majority of use cases require the GC to be off, perhaps D should switch to no-GC by default. I do not see this happening, to be honest. This is somewhat similar to the final-by-default crowd, who want class methods to be final rather than virtual by default. Again, D should provide the means to satisfy both crowds, because I think it is possible.
Feb 06 2014
prev sibling next sibling parent reply Sönke Ludwig <sludwig+dforum outerproduct.org> writes:
On 06.02.2014 12:37, Max Klyga wrote:

Full ACK! Reference counting should be well supported, but it shouldn't be the default scheme or built-in at a low level. From my personal experience it would be ideal to be able to customize certain types to be reference counted (allowing the user full flexibility implementing the actual reference counting and without ruling out weak references!), but have them accessible using the same syntax and type conversion semantics as normal references.
Feb 06 2014
next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Thursday, 6 February 2014 at 13:23:14 UTC, Sönke Ludwig wrote:

Full ACK! Reference counting should be well supported, but it shouldn't be the default scheme or built-in at a low level. From my personal experience it would be ideal to be able to customize certain types to be reference counted (allowing the user full flexibility implementing the actual reference counting and without ruling out weak references!), but have them accessible using the same syntax and type conversion semantics as normal references.

I think the best way forward would be to look at the places in D where allocations happen, and then figure out how we can optionally allow reference counting in these situations. Andrei just made a thread on this yesterday in regard to slices, which I think are the most promising for a RC solution.
Feb 06 2014
parent reply Sönke Ludwig <sludwig+dforum outerproduct.org> writes:
On 06.02.2014 14:35, Meta wrote:
I think the best way forward would be to look at the places in D where allocations happen, and then figure out how we can optionally allow reference counting in these situations. Andrei just made a thread on this yesterday in regard to slices, which I think are the most promising for a RC solution.

I'm just not convinced (far from it) that Phobos should be built on top of such an RCSlice type. I rather strongly agree with Dicebot that the API should be extended to work with ranges or pre-allocated buffers where possible + support for custom allocators where it makes sense. How the memory is managed is then totally up to the user and no Phobos function needs to be aware of that (e.g. just pass in a pre-allocated, reference counted slice).
Feb 06 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 7:22 AM, Sönke Ludwig wrote:
 I'm just not convinced (far from it) that Phobos should be built on top
 of such an RCSlice type. I rather strongly agree with Dicebot that the
 API should be extended to work with ranges or pre-allocated buffers
 where possible + support for custom allocators where it makes sense. How
 the memory is managed is then totally up to the user and no Phobos
 function needs to be aware of that (e.g. just pass in a pre-allocated,
 reference counted slice).

That makes sense. One possibility I was thinking about was to make Phobos largely transparent wrt types trafficked and simply return the type received. Consider:

// lib code
struct RCSlice(T) { ... }
alias rcstring = RCSlice!(immutable char);
rcstring rc!(string s) { ... }

// user code
auto s1 = buildPath!("hello", "world");
auto s2 = buildPath!(rc!"hello", rc!"world");

In this example s1 will have type string and s2 will have type rcstring. There are of course functions that would need to be given hints as to the output type.

Andrei
Feb 06 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 8:25 AM, Andrei Alexandrescu wrote:
 rcstring rc!(string s) { ... }

I meant

rcstring rc(string s)() { ... }

Andrei
Feb 06 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 9:19 AM, H. S. Teoh wrote:
 On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
 On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu
 wrote:
 // lib code
 struct RCSlice(T) { ... }
 alias rcstring = RCSlice!(immutable char);
 rcstring rc!(string s) { ... }

 // user code
 auto s1 = buildPath!("hello", "world");
 auto s2 = buildPath!(rc!"hello", rc!"world");

 In this example s1 will have type string and s2 will have type
 rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.

Agree. Phobos algorithms that populate a data sink should migrate toward using output ranges instead of returning a predetermined type. This will not only address ARC needs, but a bunch of other things as well (output range support/use in Phobos is still rather scanty at the moment).

I will mention again that output ranges lead to quite a bit more code on the caller site. They do give great control, but I'm hoping for something more convenient.

Andrei
Feb 06 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 10:15 AM, H. S. Teoh wrote:
 That's only because the current output range API consists of only a
 single .put method. Please see the other thread started by Adam Ruppe:
 we should spend some time to think about how we can streamline output
 ranges so that they can be used just as easily as input ranges --
 y'know, with UFCS chaining and such, that doesn't require a ton of
 boilerplate like the current process of: declare output range, pass to
 function, get data from result, pass to next function, etc. This is
 primarily a syntactical problem, not a logical one, and since we're so
 good at syntactic bikeshedding, we should be able to solve this
 relatively easily, right? ;-)

I don't think it's that easy. For example, the output range must be passed as a ref parameter into the function, which already introduces friction. FWIW, things we can add to output ranges are:

* ~= for convenience
* .flush() or .done() to mark the end of several writes
* .clear() to clear the range (useful if e.g. it's implemented as a slice with appending)

Andrei
Feb 06 2014
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
06-Feb-2014 22:29, Andrei Alexandrescu wrote:
 I don't think it's that easy. For example, the output range must be passed as a ref parameter into the function, which already introduces friction. FWIW, things we can add to output ranges are:

 * ~= for convenience
 * .flush() or .done() to mark the end of several writes
 * .clear() to clear the range (useful if e.g. it's implemented as a slice with appending)

* .reserve(n) to notify the underlying sink that n items are coming (it should preallocate etc.)

--
Dmitry Olshansky
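
As a rough illustration of how these additions might fit together, here is a sketch of a sink offering put, ~=, reserve, clear and done. ArraySink and writeSquares are hypothetical names made up for this example; this is not an existing Phobos type, and the semantics chosen for each method (for instance that clear keeps the storage) are assumptions:

```d
// Hypothetical growable sink; not a Phobos type.
struct ArraySink(T)
{
    private T[] buf;    // backing storage
    private size_t len; // number of items written so far

    // Classic output range primitive.
    void put(T item)
    {
        if (len == buf.length)
            buf.length = buf.length ? buf.length * 2 : 4;
        buf[len++] = item;
    }

    // `sink ~= item` as a convenience spelling of put.
    void opOpAssign(string op)(T item) if (op == "~")
    {
        put(item);
    }

    // Hint that n more items are coming, so storage can be preallocated.
    void reserve(size_t n)
    {
        if (buf.length < len + n)
            buf.length = len + n;
    }

    // Forget the written items but keep the storage for reuse.
    void clear() { len = 0; }

    // Mark the end of a series of writes and hand back the result.
    T[] done() { return buf[0 .. len]; }
}

// Example producer: writes the first n squares into any sink with this API.
// Note the ref parameter, which is the friction mentioned above.
void writeSquares(Sink)(ref Sink sink, int n)
{
    sink.reserve(n);
    foreach (i; 0 .. n)
        sink ~= i * i;
}

unittest
{
    ArraySink!int sink;
    writeSquares(sink, 4);
    assert(sink.done() == [0, 1, 4, 9]);
    sink.clear();
    assert(sink.done().length == 0);
}
```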
Feb 06 2014
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 06.02.2014 16:22, Sönke Ludwig wrote:
I'm just not convinced (far from it) that Phobos should be built on top of such an RCSlice type. I rather strongly agree with Dicebot that the API should be extended to work with ranges or pre-allocated buffers where possible + support for custom allocators where it makes sense. How the memory is managed is then totally up to the user and no Phobos function needs to be aware of that (e.g. just pass in a pre-allocated, reference counted slice).

Although I seldom use D, I would like to say +1, if I may.

--
Paulo
Feb 06 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu 
wrote:
 // lib code
 struct RCSlice(T) { ... }
 alias rcstring = RCSlice!(immutable char);
 rcstring rc!(string s) { ... }

 // user code
 auto s1 = buildPath!("hello", "world");
 auto s2 = buildPath!(rc!"hello", rc!"world");

 In this example s1 will have type string and s2 will have type 
 rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.
Feb 06 2014
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
 On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu
 wrote:
// lib code
struct RCSlice(T) { ... }
alias rcstring = RCSlice!(immutable char);
rcstring rc!(string s) { ... }

// user code
auto s1 = buildPath!("hello", "world");
auto s2 = buildPath!(rc!"hello", rc!"world");

In this example s1 will have type string and s2 will have type
rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.

Agree. Phobos algorithms that populate a data sink should migrate toward using output ranges instead of returning a predetermined type. This will not only address ARC needs, but a bunch of other things as well (output range support/use in Phobos is still rather scanty at the moment).

T

--
MSDOS = MicroSoft's Denial Of Service
Feb 06 2014
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 06, 2014 at 09:56:14AM -0800, Andrei Alexandrescu wrote:
 On 2/6/14, 9:19 AM, H. S. Teoh wrote:
On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu
wrote:
// lib code
struct RCSlice(T) { ... }
alias rcstring = RCSlice!(immutable char);
rcstring rc!(string s) { ... }

// user code
auto s1 = buildPath!("hello", "world");
auto s2 = buildPath!(rc!"hello", rc!"world");

In this example s1 will have type string and s2 will have type
rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.

Agree. Phobos algorithms that populate a data sink should migrate toward using output ranges instead of returning a predetermined type. This will not only address ARC needs, but a bunch of other things as well (output range support/use in Phobos is still rather scanty at the moment).

I will mention again that output ranges lead to quite a bit more code on the caller site. They do give great control, but I'm hoping for something more convenient.

That's only because the current output range API consists of only a single .put method. Please see the other thread started by Adam Ruppe: we should spend some time to think about how we can streamline output ranges so that they can be used just as easily as input ranges -- y'know, with UFCS chaining and such, that doesn't require a ton of boilerplate like the current process of: declare output range, pass to function, get data from result, pass to next function, etc. This is primarily a syntactical problem, not a logical one, and since we're so good at syntactic bikeshedding, we should be able to solve this relatively easily, right? ;-)

T

--
Life is unfair. Ask too much from it, and it may decide you don't deserve what you have now either.
Feb 06 2014
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 17:56:15 UTC, Andrei Alexandrescu 
wrote:
 I will mention again that output ranges lead to quite a bit 
 more code on the caller site.

People are asking for control over memory management. You can't then complain that you get control over memory management!

I'd furthermore like to note that there's no reason why we can't have the best of both worlds through default parameters and/or different names. Suppose our thing is defined as this:

T[] toUpper(T, OR = GCSink!T)(in T[] data, OR output = OR()) {
    output.start();
    foreach(d; data)
        output.put(d & ~0x20);
    return output.finish();
}

struct GCSink(T) {
    // so this is a reference type
    private struct Impl {
        T[] data;
        void put(T t) { data ~= t; }
        T[] finish() { return data; }
    }
    Impl* impl;
    alias impl this;
    void start() {
        impl = new Impl;
    }
}

// an output range into an existing array container
struct StaticSink(T) {
    T[] container;
    this(T[] c) { container = c; }
    size_t size;
    void start() { size = 0; }
    void put(T t) { container[size++] = t; }
    T[] finish() { return container[0 .. size]; }
}

StaticSink!T staticSink(T)(T[] t) {
    return StaticSink!T(t);
}

void main() {
    import std.stdio;
    writeln(toUpper("cool")); // default: GC
    char[10] buffer;
    auto received = toUpper("cool", staticSink(buffer[])); // custom static sink
    assert(buffer.ptr is received.ptr);
    assert(received == "COOL");
}
Feb 06 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
I still wonder where the idea of replacing the GC with something else as a 
silver bullet came from. There is no problem with the GC itself, as 
you can remove it easily. The problem is the state of the language after it 
has been removed, and that is something completely different and 
unrelated.

All this ARC fuss came from a few speculative discussions and 
suddenly attracted a great deal of attention for reasons I fail to 
understand. And doing something like going for ARC by default is 
just crazy - it will make life more difficult for the majority of 
users, who don't care, and won't fix many real issues for the vocal 
minority.

In my opinion, these are the real problems to address instead:
1) providing an RC-based gc_stub
2) -vgc and/or better control over hidden allocations
3) removing as many internal allocations as possible from Phobos and 
moving to output ranges instead
4) providing examples of containers that work with std.allocator

Not related to memory management but demanded in the same domain - 
fixing symbol bloat, __forceinline

As you may notice, reference counting is just one tiny part 
here, desired only as a way to get a non-leaking basic 
language in the absence of the GC. And it is probably the least important.

All the recent threads just make me frustrated, despite my being one of 
the people pushing for better low-level memory management options.
Feb 06 2014
prev sibling next sibling parent "qznc" <qznc web.de> writes:
On Thursday, 6 February 2014 at 11:37:59 UTC, Max Klyga wrote:
 The anti-GC crowd tries to promote ARC as a deterministic 
 alternative for memory management.
 I noticed that people promoting ARC do not list any 
 disadvantages of the proposed approach.

Feel free to extend this wiki page: http://wiki.dlang.org/Versus_the_garbage_collector
Feb 06 2014
prev sibling next sibling parent reply "ponce" <contact gam3sfrommars.fr> writes:
On Thursday, 6 February 2014 at 11:37:59 UTC, Max Klyga wrote:

I think of RC as a greater evil than GC. From what I've seen, it creates leaking cycles in tree structures AND pauses.

From a low-level point of view, RC pointers are ugly (a separate counter that will trash your cache) and do atomics all over the place (i.e. memory barriers). Whether they can avoid the atomics thanks to shared is TBD.

It is comforting to me to know that a GC pointer is still just a pointer. If we go the RC route, we will have to constantly think about the higher cost of an RC pointer vs. a weak ref, instead of thinking about other things. Right now a GC pointer is the same size as a non-GC one; it's liberating.

It looks like we want to solve a PR problem more than a real one.
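
The leaking-cycle point can be sketched in a few lines of D. This assumes a naive, hand-rolled, non-atomic count with strong references only; Obj, create, retain, release, liveObjects and the two helper functions are hypothetical names for illustration, and the first helper leaks two small objects on purpose:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical reference-counted object with one strong reference.
struct Obj
{
    size_t refs; // number of owners
    Obj* other;  // strong (counted) reference to another object
}

size_t liveObjects; // objects created and not yet freed

Obj* create()
{
    auto o = cast(Obj*) malloc(Obj.sizeof);
    assert(o !is null);
    o.refs = 1;
    o.other = null;
    ++liveObjects;
    return o;
}

// Adds one owner and returns the same pointer for convenience.
Obj* retain(Obj* o)
{
    if (o !is null)
        ++o.refs;
    return o;
}

// Drops one owner; frees the object (and releases what it owns)
// when the count reaches zero.
void release(Obj* o)
{
    if (o is null || --o.refs != 0)
        return;
    release(o.other);
    free(o);
    --liveObjects;
}

// Two objects that own each other: after the external references are
// dropped, each count is still 1, so neither is ever freed.
size_t leakedAfterCycle()
{
    immutable before = liveObjects;
    auto a = create();
    auto b = create();
    a.other = retain(b); // a owns b
    b.other = retain(a); // b owns a: a cycle
    release(a);
    release(b);
    return liveObjects - before;
}

// Same objects without the back reference: everything is freed.
size_t leakedWithoutCycle()
{
    immutable before = liveObjects;
    auto a = create();
    auto b = create();
    a.other = retain(b); // a owns b
    release(b);
    release(a);
    return liveObjects - before;
}

unittest
{
    assert(leakedAfterCycle() == 2);
    assert(leakedWithoutCycle() == 0);
}
```

Breaking the cycle needs a non-owning (weak) reference in one direction, which is the extra bookkeeping this post is complaining about.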
Feb 06 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 7:11 AM, ponce wrote:
I think of RC as a greater evil than GC. From what I've seen, it creates leaking cycles in tree structures AND pauses. From a low-level point of view, RC pointers are ugly (a separate counter that will trash your cache) and do atomics all over the place (i.e. memory barriers). Whether they can avoid the atomics thanks to shared is TBD. It is comforting to me to know that a GC pointer is still just a pointer. If we go the RC route, we will have to constantly think about the higher cost of an RC pointer vs. a weak ref, instead of thinking about other things. Right now a GC pointer is the same size as a non-GC one; it's liberating. It looks like we want to solve a PR problem more than a real one.

Though I partly agree with your considerations, let me note that PR problems are real.

Andrei
Feb 06 2014
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Thu, 6 Feb 2014 14:37:59 +0300,
Max Klyga <max.klyga gmail.com> wrote:

 
 My point is that we should not ruin the language ease of use. We do 
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme. 

What's with all this finger pointing and drawing battle lines in the last few days? GC-crowd vs ARC-crowd? Can we please all calm down? Who even proposed to replace the GC completely with ARC? Can somebody point me to a clear statement demanding that?

90% of phobos isn't actually affected by the ARC/GC issue. Most functions just allocate memory, then return some result and are finished. They do not keep data references. As they do not keep references there's no need for ARC or GC; we just need a way to tell every function how it should allocate.

Some people seem to want some implicit way to set a 'default' allocator, but I haven't heard of any solution that works. (E.g. having a thread-local default allocator, per library default allocator, how would that even work?)

I don't think there's anything wrong with the obvious solution: All phobos functions which allocate take an optional Allocator parameter, defaulting to GC. The little extra typing won't harm anyone and if you want to use things like stack-based buffers you'll have to write extra code and think about memory allocation anyway.

    auto gcString = toUpper("test");
    auto mallocString = toUpper!Malloc("test");
    ubyte[64] sbuf;
    auto stackString = toUpper(sbuf[], "test");

What's so bad about this? It works for most of phobos, doesn't require language changes and it's easy to realize what's going on when reading the code. Having an 'application default allocator' or 'thread local default allocator' or 'per function default allocator' will actually hide the allocation strategy and I bet it would cause issues.

So the question then is: what about language features which allocate using the GC? Wouldn't we want these to work with any kind of allocator? Answer: no, because:

This is the list of language features which allocate:

* .sort, .idup, .dup, setting .length
  Sort is deprecated; we should provide duplicate!Allocator functions as a replacement for dup/idup (or dup/idup could support an allocator argument); just don't set .length. If you need some way to grow an array just use Appender or a library array, ...

* closures
  Who needs these anyway? If the callees only use scoped delegates, closures do not allocate. Otherwise just implement the closure yourself and allocate wherever you want:

    int localA, localB;
    struct Frame
    {
        int a, b;
        int callback() { return a++; }
    }
    auto f = allocate!(Frame, Malloc)(localA, localB);
    functionWithDelegate(&f.callback);

* ~, ~= on slices (not on user types)
  Just avoid these. When string processing, a call to format!Allocator would be better anyway. For other stuff use Appender!Allocator or some other library type which could actually overload ~, ~= and then you can use these again...

* delete
  deprecated

* new
  we need a generic allocation function anyway, allocate!Allocator

* Array literals
* Associative array literals (not in all cases)
  That should be fixed. I hope at some point these types will be implemented in druntime/phobos, support allocators and don't need TypeInfo. Until then, just use user defined Array, AssociativeArray types.

I think with relatively little effort we could solve most problems. The remaining cases must be decided on a case-by-case basis. Should containers use RC or GC, some stuff like that. Exceptions currently can't work on a system without GC because we always use 'throw new' and nobody ever explicitly frees Exceptions. ARC could be a solution to that issue (if we enforce that exceptions may not have circular references, but they shouldn't anyway). And then slices which are actually stored by functions are another issue.

But it's not like we just change the whole language to ARC. We already have forced RC in phobos: std.stdio.File for example. And nobody complained, except that RefCounted has bugs and an ARC implementation for File could avoid these bugs and be faster than the current implementation...

We just have to provide everyone with a way to choose their favorite implementation. Which means we provide public APIs which allow any kind of memory allocation and internally do not rely on automatic memory management (internal allocation in phobos should be done on the stack / with malloc / made configurable, but not with a GC).
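A minimal sketch of the optional-allocator idea from this post, under stated assumptions: `GCAlloc`, `Malloc`, `toUpperAscii` and `toUpperAsciiInto` are hypothetical names made up for illustration (they are not Phobos APIs), only ASCII letters are handled, and the caller-supplied buffer variant gets its own name here to keep overloading out of the picture.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical allocator backed by the GC (the default).
struct GCAlloc
{
    static void[] allocate(size_t n) { return new ubyte[n]; }
}

// Hypothetical allocator backed by C malloc/free.
struct Malloc
{
    static void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    static void deallocate(void[] block) { free(block.ptr); }
}

// ASCII-only upper-casing of a single character.
private char upperAscii(char c)
{
    return (c >= 'a' && c <= 'z') ? cast(char)(c - 32) : c;
}

// Writes the result into a caller-supplied buffer; allocates nothing.
char[] toUpperAsciiInto(char[] buf, const(char)[] s)
{
    assert(buf.length >= s.length, "buffer too small");
    foreach (i, c; s)
        buf[i] = upperAscii(c);
    return buf[0 .. s.length];
}

// Allocates the result with Alloc, which defaults to the GC.
char[] toUpperAscii(Alloc = GCAlloc)(const(char)[] s)
{
    auto buf = cast(char[]) Alloc.allocate(s.length);
    return toUpperAsciiInto(buf, s);
}

unittest
{
    auto gcString = toUpperAscii("test");            // GC-owned
    auto mallocString = toUpperAscii!Malloc("test"); // caller must free
    scope (exit) Malloc.deallocate(mallocString);

    char[64] sbuf;
    auto stackString = toUpperAsciiInto(sbuf[], "test"); // slice of sbuf

    assert(gcString == "TEST" && mallocString == "TEST" && stackString == "TEST");
}
```

The allocation strategy is visible at each call site, which is the property the post argues for. What the sketch does not solve is the reclamation question raised later in the thread: nothing in the `char[]` return type says who has to free it.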
Feb 06 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 7:47 AM, Johannes Pfau wrote:
 On Thu, 6 Feb 2014 14:37:59 +0300,
 Max Klyga <max.klyga gmail.com> wrote:

 My point is that we should not ruin the language ease of use. We do
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme.

 What's with all this finger pointing and drawing battle lines in the last few days? GC-crowd vs ARC-crowd? Can we please all calm down?

Nice. An interspersed point:
 I don't think there's anything wrong with the obvious solution: All
 phobos functions which allocate take an optional Allocator parameter,
 defaulting to GC. The little extra typing won't harm anyone and if you
 want to use things like stack-based buffers you'll have to write extra
 code and think about memory allocation anyway.

 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");

 What's so bad about this?

The issue here is that Phobos functions need to document whether e.g. they return memory that can be deallocated or not. Counterexamples would be returning static strings or subslices of allocations. I'm not saying it's not solvable, but it'll take some thinking and some work.
 It works for most of phobos, doesn't require
 language changes and it's easy to realize what's going on when reading
 the code. Having an 'application default allocator' or 'thread local
 default allocator' or 'per function default allocator' will actually
 hide the allocation strategy and I bet it would cause issues.

I think a crack should be given to the user to install their own allocator (per thread and/or shared). Perhaps we can limit that to the startup stage, i.e. before any allocation takes place.
 So the question then is: what about language feature which allocate
 using the GC? Wouldn't we want these to work with any kind of
 allocator? Answer: no, because:

 This is the list of language features which allocate:

I think you forgot AAs.
 We just have to provide everyone with a way to choose their favorite
 implementation. Which means we provide public APIs which allow any kind
 of memory allocation and internally do not rely on automatic memory
 management (internal allocation in phobos should be done on the stack/
 with malloc / made configurable, but not with a GC).

I agree that's a nice goal. But I don't think it's easily attainable. The "choose the allocator" part is easy. The harder part is choosing the reclamation method. There are differences between GC and RC that are very difficult to unify under a common API.

Andrei
Feb 06 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 9:18 AM, Adam D. Ruppe wrote:
 Something that mallocs should return Malloced!T which calls the
 appropriate free (specified by the allocator) in the destructor. GC
 should return GC!T. Refcounted should return RefCounted!T, and so on.

That ain't going to work. Malloced!T and GC!T suggests parameterization by the type of the allocator. So there would need to be a type per allocator, which is a losing proposition from std.allocator's viewpoint, since there can be so many of them via template combinatorics.

RefCounted!T is a whole different thing, because it doesn't encode allocation strategy but instead memory reclamation tactics. There's no "and so on" and RefCounted!T cannot occur in an enumeration that includes Malloced!T and GC!T.

Andrei
Feb 06 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/14, 11:14 AM, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu wrote:
 Malloced!T and GC!T suggests parameterization by the type of the
 allocator.

 Not necessarily, different allocators with the same free could return the same type. The key point is the knowledge of how to free it is encapsulated there in some way.
 RefCounted!T is a whole different thing, because it doesn't encode
 allocation strategy but instead memory reclamation tactics.

 Malloced!T also encodes reclamation tactics: ~this() { free(ptr); }

So if T is int[] and you have taken a slice into it...?
 You could also call it Unique!T(&free): the malloced pointer is unique
 and must be released with free. That covers the same ground in more
 generic way. (Surely refcounted!T needs to know what happens when
 count==0 too.)

I'm not sure I understand what you are talking about.

Andrei
Feb 06 2014
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 16:40:32 UTC, Andrei Alexandrescu 
wrote:
 The issue here is that Phobos functions need to document 
 whether e.g. they return memory that can be deallocated or not. 
 Counterexamples would be returning static strings or subslices 
 of allocations.

This is why specifying ownership by type is important. It documents the need, it makes sure the information doesn't get dropped, and it can automatically manage the details (via RAII).

Something that mallocs should return Malloced!T which calls the appropriate free (specified by the allocator) in the destructor. GC should return GC!T. Refcounted should return RefCounted!T, and so on. alias this can easily allow interoperability... though, of course, not escaping things incorrectly would have to be taken care of, either manually or automatically.

I keep coming back to this because it cannot be avoided, except by GC through and through. If the language does not help with this, it doesn't mean the complexity goes away. It just means it is moved onto the (fallible) programmer.
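One way the ownership-by-type idea might look, as a rough sketch only: `Malloced` and `mallocDup` below are hypothetical names, the wrapper is restricted to dynamic arrays, and it hard-codes `free` where the post suggests taking the matching free from the allocator.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical owner type: the return type itself says "this came from
// malloc", and the destructor releases it.
struct Malloced(T)
    if (is(T == E[], E))
{
    T data;
    alias data this;      // index, slice and take .length as with a plain T

    @disable this(this);  // one owner only: copying would double-free

    ~this()
    {
        free(data.ptr);   // free(null) is a no-op for an empty wrapper
        data = null;
    }
}

// Hypothetical function whose signature documents who owns the result.
Malloced!(char[]) mallocDup(const(char)[] s)
{
    auto p = cast(char*) malloc(s.length);
    assert(p !is null || s.length == 0, "out of memory");
    p[0 .. s.length] = s[];
    return Malloced!(char[])(p[0 .. s.length]);
}

unittest
{
    auto s = mallocDup("test");
    assert(s.length == 4 && s[0] == 't'); // forwarded through alias this
    // auto copy = s;  // would not compile: the postblit is disabled
}   // the buffer is freed here, when s goes out of scope
```

As the post warns, `alias this` also makes it easy to leak the raw slice: `char[] raw = s;` compiles and outlives nothing, so escaping still has to be policed manually or by the compiler.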
 I think a crack should be given to the user to install their 
 own allocator (per thread and/or shared). Perhaps we can limit 
 that to the startup stage, i.e. before any allocation takes 
 place.

You could always link in your own _d_allocmemory, etc. I wouldn't do this, it will make things hard to get right, but it is very easy - just add the functions to your main project. The linker will prefer your functions to the druntime functions.
Feb 06 2014
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu 
wrote:
 Malloced!T and GC!T suggests parameterization by the type of 
 the allocator.

Not necessarily, different allocators with the same free could return the same type. The key point is the knowledge of how to free it is encapsulated there in some way.
 RefCounted!T is a whole different thing, because it doesn't 
 encode allocation strategy but instead memory reclamation 
 tactics.

Malloced!T also encodes reclamation tactics:

    ~this() { free(ptr); }

You could also call it Unique!T(&free): the malloced pointer is unique and must be released with free. That covers the same ground in a more generic way. (Surely RefCounted!T needs to know what happens when count==0 too.)
Feb 06 2014
Max Klyga <email domain.com> writes:
On 2014-02-06 15:47:05 +0000, Johannes Pfau said:

 On Thu, 6 Feb 2014 14:37:59 +0300,
 Max Klyga <max.klyga gmail.com> wrote:
 
 
 My point is that we should not ruin the language ease of use. We do
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme.

snip

I wholeheartedly agree that we should define methods in phobos taking output buffers/ranges. One of the reasons the Tango XML parser was the fastest in the world was that almost every method/function in Tango took an output buffer as an argument and never allocated unless asked specifically. This would allow everyone to choose the method of memory management most suited to their domain.
Feb 06 2014
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 19:33:39 UTC, Andrei Alexandrescu 
wrote:
 So if T is int[] and you have taken a slice into it...?

If you escape it, congratulations, you have a memory safety bug. Have fun tracking it down. You could also offer refcounted slicing, of course (wrapping the Unique thing in a refcounter would work), or you could be converted to the church of scope where the compiler will help you catch these bugs without run time cost.
 I'm not sure I understand what you are talking about.

When the reference count reaches zero, what happens? This changes based on the allocation method: you might call GC.free, you might call free(), you might do nothing. The destructor needs to know, otherwise the refcounting achieves exactly nothing! We can encode this in the type or use a function pointer for it.

    struct RefCounted(T) {
        private struct Impl {
            // potential double indirection lol
            private T payload;
            private size_t count;
            private void function(T) free;
            T getPayload() { return payload; } // so it is an lvalue
            alias getPayload this;
        }
        Impl* impl;
        alias impl this;

        this(T t, void function(T) free) {
            impl = new Impl; // some kind allocation at startup lol
                // naturally, this could also be malloc
                // or take a generic allocator from the user
                // but refcounted definitely needs some pointer love
            impl.payload = t;
            impl.count = 1;
            impl.free = free; // gotta store this so we can free later
        }

        this(this) { if(impl) impl.count++; }

        ~this() {
            if(impl is null) return; // default-initialized, nothing to release
            impl.count--;
            if(impl.count == 0) {
                // how do we know how to free it?
                impl.free(impl.payload);
                // delete impl; GC.free(impl) core.stdc.stdlib.free(impl); // whatever
                impl = null;
            }
        }
    }

In this example, we take the reference we're counting in the constructor... which means it is already allocated. So logically, the user code should tell it how to deallocate it too. We can't just call a global free, we take a pointer instead.

So this would work kinda like this:

    import core.stdc.stdlib;
    // malloc returns void*, so cast before slicing to get 5 ints
    int[] stuff = (cast(int*) malloc(int.sizeof * 5))[0 .. 5];
    auto counted = RefCounted!(int[])(stuff, (int[] stuff) { free(stuff.ptr); });

The allocator is not encoded in the type, but ref counted does need to know what happens when the final reference is gone. It takes a function pointer from the user for that.

This is a generically refcounting type. It isn't maximally efficient but it also works with arbitrary inputs allocated by any means.

Unique!T could do something similar, but unique would disable its postblit instead of incrementing a refcount.
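The Unique!T variant mentioned in the last sentence could be sketched along the same lines. This is an illustration under assumptions, not an existing library type: `Unique`, `get`, `freeInts` and the `released` counter are hypothetical names, and the counter exists only to make the single release observable.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical single-owner wrapper: like the refcounted type above it
// stores how to release the payload, but it forbids copies instead of
// counting them.
struct Unique(T)
{
    private T payload;
    private void function(T) release;

    this(T t, void function(T) release)
    {
        payload = t;
        this.release = release;
    }

    @disable this(this); // no postblit: there is never a second owner

    ~this()
    {
        if (release !is null) // null for a default-initialized Unique
        {
            release(payload);
            release = null;
        }
    }

    T get() { return payload; }
}

int released; // demo only: counts how often the release function ran

void freeInts(int[] a)
{
    ++released;
    free(a.ptr);
}

unittest
{
    {
        auto stuff = (cast(int*) malloc(int.sizeof * 5))[0 .. 5];
        auto u = Unique!(int[])(stuff, &freeInts);
        u.get()[0] = 42;
        assert(released == 0); // still owned
    }
    assert(released == 1);     // released exactly once at scope exit
}
```

Because the postblit is disabled, passing a `Unique` on requires an explicit move, so there is no count to maintain; the cost is that the type cannot be shared the way the refcounted version can.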
Feb 06 2014
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 06, 2014 at 04:47:05PM +0100, Johannes Pfau wrote:
[...]
 Some people seem to want some implicit way to set a 'default'
 allocator, but I haven't heard of any solution that works. (E.g. having
 a thread-local default allocator, per library default allocator, how
 would that even work?)
 
 I don't think there's anything wrong with the obvious solution: All
 phobos functions which allocate take an optional Allocator parameter,
 defaulting to GC. The little extra typing won't harm anyone and if you
 want to use things like stack-based buffers you'll have to write extra
 code and think about memory allocation anyway.
 
 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");
 
 What's so bad about this? It works for most of phobos, doesn't require
 language changes and it's easy to realize what's going on when reading
 the code. Having an 'application default allocator' or 'thread local
 default allocator' or 'per function default allocator' will actually
 hide the allocation strategy and I bet it would cause issues.

I think a superior solution is to pass in an output range to toUpper, that does whatever form of allocation you prefer. There's nothing about toUpper that *fundamentally* depends on an allocator, therefore it shouldn't even *care* what an allocator is. Reduced to its absolute fundamentals, it just takes data from some input string, and produces some output data. Where this output data goes is none of its concern -- it can be a GC string, an ARC string, stdout, an interprocess pipe, a network socket; toUpper shouldn't have to care which one it is. Just take an output range.

Then on the complementary side, have Phobos provide a bunch of premade output ranges that allocate a GC string, or an ARC string, or whatever, and then the user can just pick one of those to pass to toUpper.

T

--
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
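The output-range approach can be sketched roughly as follows. `toUpperTo` and `FixedSink` are hypothetical names for illustration and the conversion is ASCII-only; `isOutputRange`, `put` and `appender` are the existing Phobos primitives.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;

// Hypothetical allocator-agnostic toUpper: it only knows how to push
// characters into whatever sink the caller hands over.
void toUpperTo(Sink)(const(char)[] input, ref Sink sink)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; input)
        put(sink, (c >= 'a' && c <= 'z') ? cast(char)(c - 32) : c);
}

// Hypothetical sink that writes into a caller-owned buffer, e.g. on the stack.
struct FixedSink
{
    char[] buf;
    size_t used;

    void put(char c)
    {
        assert(used < buf.length, "sink full");
        buf[used++] = c;
    }

    const(char)[] data() const { return buf[0 .. used]; }
}

unittest
{
    // GC-backed output via the standard Appender
    auto app = appender!(char[])();
    toUpperTo("test", app);
    assert(app.data == "TEST");

    // No heap allocation at all: the output lands in a stack buffer
    char[16] storage;
    auto sink = FixedSink(storage[]);
    toUpperTo("test", sink);
    assert(sink.data == "TEST");
}
```

The function body never mentions memory management; swapping the sink changes where the result lives, and a malloc-backed or refcounted sink would slot in the same way.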
Feb 06 2014
Johannes Pfau <nospam example.com> writes:
On Thu, 06 Feb 2014 08:40:28 -0800,
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");

 What's so bad about this?

 The issue here is that Phobos functions need to document whether e.g. they return memory that can be deallocated or not. Counterexamples would be returning static strings or subslices of allocations. I'm not saying it's not solvable, but it'll take some thinking and some work.

That's true. I wonder how common these cases are but slices are probably the bigger problem here. (OTOH if a function just slices the input, we'd have to document it but there's no bigger issue)
 [...] Having an 'application default allocator' or
 'thread local default allocator' or 'per function default
 allocator' will actually hide the allocation strategy and I bet it
 would cause issues.

 I think a crack should be given to the user to install their own allocator (per thread and/or shared). Perhaps we can limit that to the startup stage, i.e. before any allocation takes place.

If we can make that work then I won't complain. As long as the default allocator can't be changed at a random point in time, most problems should be solved for a global default allocator. For per-thread allocators this is difficult: if you allocate in one thread and free in another, how do you make sure you use the correct free function? There are some interesting possibilities though: for example we could add a delegate to object which points to the correct 'free' function. But then things get complicated if we have to manage the lifetime of the allocator as well...
 This is the list of language features which allocate:

 I think you forgot AAs.

I had AA literals in the list, but you're right some other AA features allocate as well. Good you mentioned that, I'll have to detect these cases in -nogc/-vgc code as well... However, from a user point of view dcollections (and I hope at some point std.container as well) provides a nice replacement for all these operations, except for literals.
 
 We just have to provide everyone with a way to choose their favorite
 implementation. Which means we provide public APIs which allow any
 kind of memory allocation and internally do not rely on automatic
 memory management (internal allocation in phobos should be done on
 the stack/ with malloc / made configurable, but not with a GC).

 I agree that's a nice goal. But I don't think it's easily attainable. The "choose the allocator" part is easy. The harder part is choosing the reclamation method. There are differences between GC and RC that are very difficult to unify under a common API.

I'd guess that allocation is actually a bigger issue for those who are unhappy with the GC right now, but I have no way to prove that ;-) (Explicit manual freeing is annoying, but possible. But if a function internally allocates with the GC it can't be used at all). But you're of course right, getting reclamation right is probably more difficult and also important.
Feb 06 2014