digitalmars.D - Disadvantages of ARC

Max Klyga <max.klyga gmail.com> writes:

The anti-GC crowd tries to promote ARC as a deterministic alternative for
memory management.
I noticed that people promoting ARC do not mention any disadvantages
of the proposed approach.

The thing is, in gamedev and other soft-realtime software only a
handful of resource types are really managed by RC, and memory usage
patterns are VERY specific to the domain (mostly linear
allocation/deallocation, with objects of non-deterministic lifetime
preallocated in pools).

Trying to use RC as a general method of memory management leads to some 
problems.
A pretty detailed view by Jon Harrop (he is somewhat known for
trolling in the PL community, but nonetheless knows what he is talking
about) -
http://www.quora.com/Computer-Programming/How-do-reference-counting-and-garbage-collection-compare/answer/Jon-Harrop-



So RC could also introduce unpredictable pause times at undesired places.
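
To make the pause-time point concrete, here is a minimal sketch. It uses hand-written, single-threaded counting rather than real ARC, and the names Node, buildChain and release are made up for illustration. Dropping the last reference to the head of a long chain frees every node in one go, so the cost of a single release grows with the size of the structure.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

// Hypothetical manually reference-counted list node (not a Phobos type).
struct Node
{
    size_t refs;   // reference count
    Node* next;    // owning reference to the next node
}

// Allocate a chain of n nodes; the caller owns one reference to the head.
Node* buildChain(size_t n)
{
    Node* head = null;
    foreach (i; 0 .. n)
    {
        auto node = cast(Node*) malloc(Node.sizeof);
        assert(node !is null, "malloc failed");
        node.refs = 1;
        node.next = head;
        head = node;
    }
    return head;
}

// Drop one reference; returns how many nodes were freed as a consequence.
size_t release(Node* node)
{
    size_t freed = 0;
    // Iterative on purpose: a recursive version could overflow the stack.
    while (node !is null && --node.refs == 0)
    {
        auto next = node.next;
        free(node);
        ++freed;
        node = next;
    }
    return freed;
}

void main()
{
    auto head = buildChain(100_000);
    // One innocent-looking release ends up freeing the whole chain.
    writeln("nodes freed by one release: ", release(head));
}
```

Deferring the frees to a queue would bound the pause, but the release is then no longer fully deterministic.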

This is also confirmed by research from HP - 
http://www.hpl.hp.com/personal/Hans_Boehm/popl04/refcnt.pdf

My point is that we should not ruin the language's ease of use. We do
need to deal with Phobos internal allocations, but we should not switch
to ARC as the default memory management scheme. In practice, people
promoting ARC will probably not use Phobos anyway. Currently it's just
an excuse not to use D.

Look at C++ and the STL, etc. People will roll their own solutions no
matter what you try.

Feb 06 2014

"Dejan Lekic" <dejan.lekic gmail.com> writes:

Nicely said.

I believe both approaches should be available. Those who must
work without GC should be able to easily do that, but GC should
be the default simply because it is better for general
programming tasks. If D turns out to be used for something else,
and the majority of use cases require the GC to be off, perhaps D
should switch to no-GC by default. I do not see this happening, to
be honest.

This is somewhat similar to the final-by-default crowd, who want
class methods to be final rather than virtual by default. Again, D
should provide the means to satisfy both crowds, because I think it
is possible.

Feb 06 2014

Sönke Ludwig <sludwig+dforum outerproduct.org> writes:

Full ACK! Reference counting should be well supported, but it shouldn't 
be the default scheme or built-in at a low level. From my personal 
experience it would be ideal to be able to customize certain types to be 
reference counted (allowing the user full flexibility implementing the 
actual reference counting and without ruling out weak references!), but 
have them accessible using the same syntax and type conversion semantics 
as normal references.
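
A rough illustration of per-type opt-in, using the std.typecons.RefCounted wrapper that Phobos already ships. The Texture type and the observeCounts helper are invented for the example. RefCounted does not provide the weak references or the reference-like conversion semantics asked for above, so this only shows the opt-in part.

```d
import std.typecons : RefCounted;
import std.stdio : writeln;

// Hypothetical resource type used only for illustration.
struct Texture
{
    int id;
}

// Returns the reference count before, during and after a scoped copy.
size_t[3] observeCounts()
{
    size_t[3] counts;
    auto a = RefCounted!Texture(42);
    counts[0] = a.refCountedStore.refCount;
    {
        auto b = a;                             // copying bumps the count
        counts[1] = a.refCountedStore.refCount;
        assert(b.id == 42);                     // payload members are reachable through the wrapper
    }
    counts[2] = a.refCountedStore.refCount;     // b was destroyed at scope exit
    return counts;
}

void main()
{
    writeln(observeCounts());
}
```

Only values wrapped this way pay for counting; everything else in the program keeps using the GC as before.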

Feb 06 2014

"Meta" <jared771 gmail.com> writes:

I think the best way forward would be to look at the places in D 
where allocations happen, and then figure out how we can 
optionally allow reference counting in these situations. Andrei
just started a thread on this yesterday with regard to slices, which I
think are the most promising candidate for an RC solution.

Feb 06 2014

Sönke Ludwig <sludwig+dforum outerproduct.org> writes:

I'm just not convinced (far from it) that Phobos should be built on top 
of such an RCSlice type. I rather strongly agree with Dicebot that the 
API should be extended to work with ranges or pre-allocated buffers 
where possible + support for custom allocators where it makes sense. How 
the memory is managed is then totally up to the user and no Phobos 
function needs to be aware of that (e.g. just pass in a pre-allocated, 
reference counted slice).
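
Parts of Phobos already work this way. As a small sketch, std.format can write either into a caller-owned buffer (sformat) or into any output range (formattedWrite), so the caller decides where the memory comes from. The format string and values are arbitrary; note that sformat throws if the buffer is too small.

```d
import std.array : appender;
import std.format : formattedWrite, sformat;
import std.stdio : writeln;

void main()
{
    // Caller-owned stack buffer: sformat fills it and returns the used slice.
    char[32] buf;
    char[] onStack = sformat(buf[], "%s-%s", "hello", 42);
    assert(onStack == "hello-42");
    assert(onStack.ptr is buf.ptr);

    // Same formatting logic, but the caller picks a growable GC-backed sink.
    auto sink = appender!string();
    sink.formattedWrite("%s-%s", "hello", 42);
    assert(sink.data == "hello-42");

    writeln(onStack, " ", sink.data);
}
```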

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 7:22 AM, Sönke Ludwig wrote:
 I'm just not convinced (far from it) that Phobos should be built on top
 of such an RCSlice type. I rather strongly agree with Dicebot that the
 API should be extended to work with ranges or pre-allocated buffers
 where possible + support for custom allocators where it makes sense. How
 the memory is managed is then totally up to the user and no Phobos
 function needs to be aware of that (e.g. just pass in a pre-allocated,
 reference counted slice).

That makes sense. One possibility I was thinking about was to make 
Phobos largely transparent wrt types trafficked and simply return the 
type received. Consider:

// lib code
struct RCSlice(T) { ... }
alias rcstring = RCSlice!(immutable char);
rcstring rc!(string s) { ... }

// user code
auto s1 = buildPath!("hello", "world");
auto s2 = buildPath!(rc!"hello", rc!"world");

In this example s1 will have type string and s2 will have type rcstring.

There are of course functions that would need to be given hints as to 
the output type.
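
A minimal sketch of "return the type received", under loud assumptions: RcString is a stand-in that does no counting, joinPath is a made-up function rather than std.path.buildPath, and rc is written with a template value parameter so that it can be called as rc!"...". Only the way the result type follows the argument type is of interest.

```d
import std.stdio : writeln;

// Stand-in for an RCSlice-based rcstring; no reference counting is done here.
struct RcString
{
    string payload;
}

// Compile-time string to RcString, spelled rc!"..." at the call site.
RcString rc(string s)()
{
    return RcString(s);
}

// Hypothetical path joiner: string in, string out.
string joinPath(string a, string b)
{
    return a ~ "/" ~ b;
}

// Same operation: RcString in, RcString out.
RcString joinPath(RcString a, RcString b)
{
    return RcString(a.payload ~ "/" ~ b.payload);
}

void main()
{
    auto s1 = joinPath("hello", "world");
    auto s2 = joinPath(rc!"hello"(), rc!"world"());
    static assert(is(typeof(s1) == string));
    static assert(is(typeof(s2) == RcString));
    writeln(s1, " ", s2.payload);
}
```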


Andrei

Feb 06 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu 
wrote:
 // lib code
 struct RCSlice(T) { ... }
 alias rcstring = RCSlice!(immutable char);
 rcstring rc!(string s) { ... }

 // user code
 auto s1 = buildPath!("hello", "world");
 auto s2 = buildPath!(rc!"hello", rc!"world");

 In this example s1 will have type string and s2 will have type 
 rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from
stack buffers, or an Array!char from rc-strings? The type of the output
buffer does not have to have anything to do with the input.

Feb 06 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
 Looks unnecessarily restrictive. Why can't one build an rc-string from
 stack buffers, or an Array!char from rc-strings? The type of the output
 buffer does not have to have anything to do with the input.

Agree. Phobos algorithms that populate a data sink should migrate toward
using output ranges instead of returning a predetermined type. This will
not only address ARC needs, but a bunch of other things as well (output
range support/use in Phobos is still rather scanty at the moment).
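
As a sketch of what such an API might look like (asciiUpperInto and Counter are invented names, not Phobos functions), the algorithm below writes into whatever sink the caller hands it, constrained only by isOutputRange:

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;
import std.stdio : writeln;

// Hypothetical helper: writes the upper-cased ASCII form of `input`
// into any sink that accepts chars.
void asciiUpperInto(Sink)(const(char)[] input, ref Sink sink)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; input)
        put(sink, (c >= 'a' && c <= 'z') ? cast(char)(c - 32) : c);
}

// A sink that only counts, to show the callee does not care what the sink does.
struct Counter
{
    size_t n;
    void put(char) { ++n; }
}

void main()
{
    auto gcSink = appender!string();
    asciiUpperInto("cool", gcSink);
    writeln(gcSink.data);

    Counter counter;
    asciiUpperInto("cool", counter);
    writeln(counter.n);
}
```

The function never names a result type, so the same code serves GC strings, fixed buffers, or a reference-counted container.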


T

-- 
MSDOS = MicroSoft's Denial Of Service

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 9:19 AM, H. S. Teoh wrote:
 Agree. Phobos algorithms that populate a data sink should migrate toward
 using output ranges instead of returning a predetermined type. This will
 not only address ARC needs, but a bunch of other things as well (output
 range support/use in Phobos is still rather scanty at the moment).

I will mention again that output ranges lead to quite a bit more code on 
the caller site. They do give great control, but I'm hoping for 
something more convenient.

Andrei

Feb 06 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Feb 06, 2014 at 09:56:14AM -0800, Andrei Alexandrescu wrote:
 I will mention again that output ranges lead to quite a bit more
 code on the caller site. They do give great control, but I'm hoping
 for something more convenient.

[...]

That's only because the current output range API consists of only a
single .put method. Please see the other thread started by Adam Ruppe:
we should spend some time to think about how we can streamline output
ranges so that they can be used just as easily as input ranges --
y'know, with UFCS chaining and such, that doesn't require a ton of
boilerplate like the current process of: declare output range, pass to
function, get data from result, pass to next function, etc. This is
primarily a syntactical problem, not a logical one, and since we're so
good at syntactic bikeshedding, we should be able to solve this
relatively easily, right? ;-)


T

-- 
Life is unfair. Ask too much from it, and it may decide you don't deserve what
you have now either.

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 10:15 AM, H. S. Teoh wrote:
 That's only because the current output range API consists of only a
 single .put method. Please see the other thread started by Adam Ruppe:
 we should spend some time to think about how we can streamline output
 ranges so that they can be used just as easily as input ranges --
 y'know, with UFCS chaining and such, that doesn't require a ton of
 boilerplate like the current process of: declare output range, pass to
 function, get data from result, pass to next function, etc. This is
 primarily a syntactical problem, not a logical one, and since we're so
 good at syntactic bikeshedding, we should be able to solve this
 relatively easily, right? ;-)

I don't think it's that easy. For example the output range must be 
passed as a ref parameter into the function, which is already 
introducing friction.

FWIW, things we can add to output ranges are:

~= for convenience
.flush() or .done() to mark the end of several writes
.clear() to clear the range (useful if e.g. it's implemented as a slice 
with appending)


Andrei

Feb 06 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

06-Feb-2014 22:29, Andrei Alexandrescu wrote:
 I don't think it's that easy. For example the output range must be
 passed as a ref parameter into the function, which is already
 introducing friction.

 FWIW, things we can add to output ranges are:

 ~= for convenience
 .flush() or .done() to mark the end of several writes
 .clear() to clear the range (useful if e.g. it's implemented as a slice
 with appending)

.reserve(n) to notify the underlying sink that n items are coming (so it
can preallocate, etc.)
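
For reference, std.array.Appender already offers several of these operations, so it can serve as a rough model of what a richer sink interface looks like. It has no flush() or done(), so those remain proposals. A short sketch:

```d
import std.array : appender;
import std.stdio : writeln;

void main()
{
    auto sink = appender!(char[])();

    sink.reserve(16);               // capacity hint before a burst of writes
    assert(sink.capacity >= 16);

    sink ~= "hello";                // ~= as shorthand for put
    sink ~= ' ';
    sink.put("world");
    writeln(sink.data);

    sink.clear();                   // empty the sink so its storage can be reused
    assert(sink.data.length == 0);
}
```

clear() is only available when the element type is mutable, which is why the sketch uses appender!(char[]) rather than appender!string.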


-- 
Dmitry Olshansky

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 17:56:15 UTC, Andrei Alexandrescu 
wrote:
 I will mention again that output ranges lead to quite a bit 
 more code on the caller site.

People are asking for control over memory management. You can't 
then complain that you get control over memory management!

I'd furthermore like to note that there's no reason why we can't 
have the best of both worlds through default parameters and/or 
different names.

Suppose our thing is defined as this:

T[] toUpper(T, OR = GCSink!T)(in T[] data, OR output = OR()) {
     output.start();
     foreach(d; data)
        output.put(d & ~0x20);
     return output.finish();
}

struct GCSink(T) {
     // so this is a reference type
     private struct Impl {
         T[] data;
         void put(T t) { data ~= t; }
         T[] finish() { return data; }
     }
     Impl* impl;
     alias impl this;
     void start() {
         impl = new Impl;
     }
}

// an output range into an existing array container
struct StaticSink(T) {
     T[] container;
     this(T[] c) { container = c; }
     size_t size;
     void start() { size = 0; }
     void put(T t) { container[size++] = t; }
     T[] finish() { return container[0 .. size]; }
}
StaticSink!T staticSink(T)(T[] t) {
     return StaticSink!T(t);
}

void main() {
     import std.stdio;
     writeln(toUpper("cool")); // default: GC
     char[10] buffer;
     // custom static sink
     auto received = toUpper("cool", staticSink(buffer[]));
     assert(buffer.ptr is received.ptr);
     assert(received == "COOL");
}

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 8:25 AM, Andrei Alexandrescu wrote:
 rcstring rc!(string s) { ... }

I meant

rcstring rc(string s)() { ... }


Andrei

Feb 06 2014

Paulo Pinto <pjmlp progtools.org> writes:

On 06.02.2014 16:22, Sönke Ludwig wrote:
 I'm just not convinced (far from it) that Phobos should be built on top
 of such an RCSlice type. I rather strongly agree with Dicebot that the
 API should be extended to work with ranges or pre-allocated buffers
 where possible + support for custom allocators where it makes sense. How
 the memory is managed is then totally up to the user and no Phobos
 function needs to be aware of that (e.g. just pass in a pre-allocated,
 reference counted slice).

Although I seldom use D, I would like to say +1, if I may.

--
Paulo

Feb 06 2014

"Dicebot" <public dicebot.lv> writes:

I still wonder where the idea of replacing the GC with something else
as a silver bullet came from. There is no problem with the GC itself, as
you can remove it easily. The problem is the state of the language after
it has been removed, and that is something completely different and
unrelated.

All this ARC fuss came from a few speculative discussions and
suddenly attracted great attention for reasons I fail to
understand. And doing something like going for ARC by default is
just crazy - it will make life more difficult for the majority of
users, who don't care, and won't fix many real issues for the vocal
minority.

In my opinion, these are the real problems to address instead:
1) providing an RC-based gc_stub
2) -vgc and/or better control over hidden allocations
3) removing as many internal allocations as possible from Phobos,
moving to output ranges instead
4) providing examples of containers befriended with std.allocator
Not related to memory management but demanded in the same domain -
fixing symbol bloat, __forceinline

As you may notice, reference counting is just one tiny part
here, and only desired as a way to get a non-leaking basic
language in the absence of the GC. And probably the least important.

All the recent threads just make me frustrated, despite being one of
the pushers for better low-level memory management options.

Feb 06 2014

"qznc" <qznc web.de> writes:

On Thursday, 6 February 2014 at 11:37:59 UTC, Max Klyga wrote:
 Anti-GC crowd tries to promote ARC as an deterministic 
 alternative for memory management.
 I noticed that people promoting ARC do not provide any 
 disadvantages for proposed approach.


Feel free to extend this wiki page:

http://wiki.dlang.org/Versus_the_garbage_collector

Feb 06 2014

"ponce" <contact gam3sfrommars.fr> writes:

I think of RC as a greater evil than GC. From what I've seen, it
creates leaking cycles in tree structures AND pauses. From a
low-level point of view, RC pointers are ugly (a separate counter
that will trash your cache) and do atomics all over the place
(i.e. memory barriers). Whether they can avoid that thanks to shared
is TBD.

It is comforting to me to know that a GC pointer is still just a
pointer. If we go the RC route, we will have to constantly think
about the higher cost of an RC pointer vs. a weak ref, instead of
thinking about other things.
Right now a GC pointer is the same size as a non-GC one; it's
liberating.
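
The size and atomics points can be sketched as follows. The layout is an assumption, loosely modelled on shared_ptr-style handles with a separate control block, and ControlBlock and RcHandle are invented names. It does not demonstrate the cycle problem.

```d
import core.atomic : atomicLoad, atomicOp;
import std.stdio : writeln;

// Hypothetical control block holding the counter apart from the payload.
struct ControlBlock
{
    shared size_t strong;
}

// Hypothetical RC handle: payload pointer plus pointer to the counter.
struct RcHandle(T)
{
    T* payload;
    ControlBlock* ctrl;   // separate counter: another memory location to touch

    // Atomic read-modify-write on every copy; returns the new count.
    size_t retain()
    {
        return atomicOp!"+="(ctrl.strong, 1);
    }
}

// A plain (or GC) pointer is one machine word; this handle is two.
static assert((int*).sizeof == size_t.sizeof);
static assert(RcHandle!int.sizeof == 2 * size_t.sizeof);

void main()
{
    int value = 7;
    ControlBlock cb;
    auto h = RcHandle!int(&value, &cb);
    h.retain();
    writeln("count after one retain: ", atomicLoad(cb.strong));
}
```

An intrusive counter stored next to the payload would bring the handle back to one word, at the price of changing the layout of every counted type.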

It looks like we want to solve a PR problem more than a real one.

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 7:11 AM, ponce wrote:
 It looks like we want to solve a PR problem more than a real one.

Though I partly agree with your considerations, let me note that PR 
problems are real.

Andrei

Feb 06 2014

Johannes Pfau <nospam example.com> writes:

Am Thu, 6 Feb 2014 14:37:59 +0300
schrieb Max Klyga <max.klyga gmail.com>:

 
 My point is that we should not ruin the language ease of use. We do 
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme. 

What's with all this finger pointing and drawing battle lines in the
last few days? GC-crowd vs ARC-crowd? Can we please all calm down?

Who even proposed to replace the GC completely with ARC? Can somebody
point me to a clear statement demanding that?

90% of phobos isn't actually affected by the ARC/GC issue. Most
functions just allocate memory, then return some result and are
finished. They do not keep data references. As they do not keep
references there's no need for ARC or GC, we just need a way to tell
every function how it should allocate.

Some people seem to want some implicit way to set a 'default'
allocator, but I haven't heard of any solution that works. (E.g. having
a thread-local default allocator, per library default allocator, how
would that even work?)

I don't think there's anything wrong with the obvious solution: All
phobos functions which allocate take an optional Allocator parameter,
defaulting to GC. The little extra typing won't harm anyone and if you
want to use things like stack-based buffers you'll have to write extra
code and think about memory allocation anyway.

auto gcString = toUpper("test");
auto mallocString = toUpper!Malloc("test");
ubyte[64] sbuf;
auto stackString = toUpper(sbuf[], "test");

What's so bad about this? It works for most of phobos, doesn't require
language changes and it's easy to realize what's going on when reading
the code. Having an 'application default allocator' or 'thread local
default allocator' or 'per function default allocator' will actually
hide the allocation strategy and I bet it would cause issues.
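
To make the three call shapes above concrete, here is a minimal sketch. The names GCAlloc, Malloc, toUpperAlloc and toUpperInto are made up for illustration (nothing like them is in Phobos today), the conversion is ASCII-only, and the stack buffer is a char[64] rather than ubyte[64] to keep the types simple:

```d
import core.stdc.stdlib : free, malloc;
import std.ascii : toUpper;

// Hypothetical allocator types; neither exists in Phobos under these names.
struct GCAlloc
{
    static char[] allocate(size_t n) { return new char[n]; }
}

struct Malloc
{
    static char[] allocate(size_t n)
    {
        auto p = cast(char*) malloc(n);
        assert(p !is null || n == 0, "malloc failed");
        return p[0 .. n]; // the caller must free(result.ptr)
    }
}

// Upper-cases s into a caller-supplied buffer and returns the written slice.
char[] toUpperInto(char[] buf, const(char)[] s)
{
    assert(buf.length >= s.length, "buffer too small");
    foreach (i, c; s)
        buf[i] = toUpper(c);
    return buf[0 .. s.length];
}

// The allocator is picked at compile time and defaults to the GC.
char[] toUpperAlloc(Alloc = GCAlloc)(const(char)[] s)
{
    return toUpperInto(Alloc.allocate(s.length), s);
}
```

With these definitions, toUpperAlloc("test") allocates with the GC, toUpperAlloc!Malloc("test") returns memory the caller has to release with free, and toUpperInto(sbuf[], "test") does not allocate at all. The allocation strategy is visible at every call site, which is the point of the proposal.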


So the question then is: what about language features which allocate
using the GC? Wouldn't we want these to work with any kind of
allocator? Answer: no, because:

This is the list of language features which allocate:

* .sort, .idup, .dup, setting .length

Sort is deprecated; we should provide duplicate!Allocator functions as
a replacement for dup/idup (Or dup/idup could support an allocator
argument); just don't set .length. If you need some way to grow an
Array just use Appender or a library array, ...

* closures

Who needs these anyway? If the callees only use scoped delegates
closures do not allocate. Otherwise just implement the closure
yourself and allocate wherever you want:

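int localA, localB;
struct Frame
{
    int a, b;
    // explicit replacement for the closure body
    int callback() { return a++; }
}
// allocate!(T, Allocator) is the proposed generic allocation function
auto f = allocate!(Frame, Malloc)(localA, localB);
functionWithDelegate(&f.callback);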

* ~, ~= on slices (not on user types)

Just avoid these. When string processing a call to format!Allocator
would be better anyway. For other stuff use Appender!Allocator or some
other library type which could actually overload ~,~= and then you can
use these again...

* delete

deprecated

* new

we need a generic allocation function anyway, allocate!Allocator

* Array literals
* Associative array literals (not in all cases)

That should be fixed. I hope at some point these types will be
implemented in druntime/phobos, support allocators and don't need
TypeInfo. Until then, just use user defined Array, AssociativeArray
types.


I think with relatively little effort we could solve most problems. The
remaining cases must be decided on a case-by case basis. Should
containers use RC or GC, some stuff like that.

Exceptions currently can't work on a system without GC cause we always
use 'throw new' and nobody ever explicitly frees Exceptions. ARC could
be a solution to that issue (if we enforce that exceptions may not have
circular references, but they shouldn't anyway)

And then slices which are actually stored by functions are another
issue.

But it's not like we just change the whole language to ARC. We already
have forced RC in phobos: std.stdio.File for example. And nobody
complained, except that RefCounted has bugs and an ARC implementation
for File could avoid these bugs and be faster than the current
implementation...

We just have to provide everyone with a way to choose their favorite
implementation. Which means we provide public APIs which allow any kind
of memory allocation and internally do not rely on automatic memory
management (internal allocation in phobos should be done on the stack/
with malloc / made configurable, but not with a GC).

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 7:47 AM, Johannes Pfau wrote:
 Am Thu, 6 Feb 2014 14:37:59 +0300
 schrieb Max Klyga <max.klyga gmail.com>:

 My point is that we should not ruin the language ease of use. We do
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme.

 What's with all this finger pointing and drawing battle lines in the
 last few days? GC-crowd vs ARC-crowd? Can we please all calm down?

[snip]

Nice. An interspersed point:

 I don't think there's anything wrong with the obvious solution: All
 phobos functions which allocate take an optional Allocator parameter,
 defaulting to GC. The little extra typing won't harm anyone and if you
 want to use things like stack-based buffers you'll have to write extra
 code and think about memory allocation anyway.

 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");

 What's so bad about this?

The issue here is that Phobos functions need to document whether e.g. 
they return memory that can be deallocated or not. Counterexamples would 
be returning static strings or subslices of allocations.

I'm not saying it's not solvable, but it'll take some thinking and some 
work.

 It works for most of phobos, doesn't require
 language changes and it's easy to realize what's going on when reading
 the code. Having an 'application default allocator' or 'thread local
 default allocator' or 'per function default allocator' will actually
 hide the allocation strategy and I bet it would cause issues.

I think a crack should be given to the user to install their own 
allocator (per thread and/or shared). Perhaps we can limit that to the 
startup stage, i.e. before any allocation takes place.

 So the question then is: what about language feature which allocate
 using the GC? Wouldn't we want these to work with any kind of
 allocator? Answer: no, because:

 This is the list of language features which allocate:

[snip]

I think you forgot AAs.

 We just have to provide everyone with a way to choose their favorite
 implementation. Which means we provide public APIs which allow any kind
 of memory allocation and internally do not rely on automatic memory
 management (internal allocation in phobos should be done on the stack/
 with malloc / made configurable, but not with a GC).

I agree that's a nice goal. But I don't think it's easily attainable. 
The "choose the allocator" part is easy. The harder part is choosing the 
reclamation method. There are differences between GC and RC that are 
very difficult to unify under a common API.


Andrei

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 16:40:32 UTC, Andrei Alexandrescu 
wrote:
 The issue here is that Phobos functions need to document 
 whether e.g. they return memory that can be deallocated or not. 
 Counterexamples would be returning static strings or subslices 
 of allocations.

This is why specifying ownership by type is important. It 
documents the need, it makes sure the information doesn't get 
dropped, and it can automatically manage the details (via RAII).

Something that mallocs should return Malloced!T which calls the 
appropriate free (specified by the allocator) in the destructor. 
GC should return GC!T. Refcounted should return RefCounted!T, and 
so on.

alias this can easily allow interoperability... though, of 
course, not escaping things incorrectly would have to be taken 
care of, either manually or automatically. I keep coming back to 
this because it cannot be avoided, except by GC through and 
through. If the language does not help with this, it doesn't mean 
the complexity goes away. It just means it is moved onto the 
(fallible) programmer.
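
A rough sketch of what such an ownership type could look like, assuming a hypothetical Malloced!T that owns a malloc'ed array (the name and the make/borrow members are made up here, not an existing Phobos API):

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical owning wrapper around a malloc'ed array of T.
struct Malloced(T)
{
    private T[] data;

    @disable this(this); // single owner: a copy would lead to a double free

    static Malloced make(size_t n)
    {
        auto p = cast(T*) malloc(T.sizeof * n);
        assert(p !is null || n == 0, "malloc failed");
        Malloced m;
        m.data = p[0 .. n];
        m.data[] = T.init; // malloc memory starts out uninitialized
        return m;          // moved out, so the destructor runs only once
    }

    ~this()
    {
        free(data.ptr); // free(null) is a no-op
        data = null;
    }

    // Interoperability with code that expects a plain slice. Nothing
    // here stops the borrowed slice from outliving the owner.
    inout(T)[] borrow() inout { return data; }
    alias borrow this;
}
```

The type documents who frees the memory and does it automatically via RAII, and alias this lets a Malloced!int be passed wherever an int[] is expected. The escape problem is still there: a slice obtained through borrow dangles once the owner goes out of scope.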

 I think a crack should be given to the user to install their 
 own allocator (per thread and/or shared). Perhaps we can limit 
 that to the startup stage, i.e. before any allocation takes 
 place.

You could always link in your own _d_allocmemory, etc. I wouldn't 
do this, since it makes things hard to get right, but it is very 
easy: just add the functions to your main project. The linker 
will prefer your functions to the druntime functions.

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 9:18 AM, Adam D. Ruppe wrote:
 Something that mallocs should return Malloced!T which calls the
 appropriate free (specified by the allocator) in the destructor. GC
 should return GC!T. Refcounted should return RefCounted!T, and so on.

That ain't going to work.

Malloced!T and GC!T suggests parameterization by the type of the 
allocator. So there would need to be a type per allocator, which is a 
losing proposition from std.allocator's viewpoint, since there can be so 
many of them via template combinatorics.

RefCounted!T is a whole different thing, because it doesn't encode 
allocation strategy but instead memory reclamation tactics. There's no 
"and so on" and RefCounted!T cannot occur in an enumeration that 
includes Malloced!T and GC!T.


Andrei

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu 
wrote:
 Malloced!T and GC!T suggests parameterization by the type of 
 the allocator.

Not necessarily, different allocators with the same free could 
return the same type. The key point is the knowledge of how to 
free it is encapsulated there in some way.

 RefCounted!T is a whole different thing, because it doesn't 
 encode allocation strategy but instead memory reclamation 
 tactics.

Malloced!T also encodes reclamation tactics: ~this() { free(ptr); 
}

You could also call it Unique!T(&free): the malloced pointer is 
unique and must be released with free. That covers the same ground 
in a more generic way. (Surely refcounted!T needs to know what 
happens when count==0 too.)

Feb 06 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/6/14, 11:14 AM, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu wrote:
 Malloced!T and GC!T suggests parameterization by the type of the
 allocator.

 Not necessarily, different allocators with the same free could return
 the same type. The key point is the knowledge of how to free it is
 encapsulated there in some way.

 RefCounted!T is a whole different thing, because it doesn't encode
 allocation strategy but instead memory reclamation tactics.

 Malloced!T also encodes reclamation tactics: ~this() { free(ptr); }

So if T is int[] and you have taken a slice into it...?

 You could also call it Unique!T(&free): the malloced pointer is unique
 and must be released with free. That covers the same ground in a more
 generic way. (Surely refcounted!T needs to know what happens when
 count==0 too.)

I'm not sure I understand what you are talking about.


Andrei

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 19:33:39 UTC, Andrei Alexandrescu 
wrote:
 So if T is int[] and you have taken a slice into it...?

If you escape it, congratulations, you have a memory safety bug. 
Have fun tracking it down.

You could also offer refcounted slicing, of course (wrapping the 
Unique thing in a refcounter would work), or you could be 
converted to the church of scope where the compiler will help you 
catch these bugs without run time cost.

 I'm not sure I understand what you are talking about.

When the reference count reaches zero, what happens? This changes 
based on the allocation method: you might call GC.free, you might 
call free(), you might do nothing. The destructor needs to know, 
otherwise the refcounting achieves exactly nothing! We can encode 
this in the type or use a function pointer for it.

struct RefCounted(T) {
     private struct Impl {
         // potential double indirection lol
         private T payload;
         private size_t count;
         private void function(T) free;
         T getPayload() { return payload; } // by value, so callers cannot rebind payload
         alias getPayload this;
     }
     Impl* impl;
     alias impl this;
     this(T t, void function(T) free) {
        impl = new Impl; // some kind of allocation at startup lol
                         // naturally, this could also be malloc
                         // or take a generic allocator from the user
                         // but refcounted definitely needs some pointer love
        impl.payload = t;
        impl.count = 1;
        impl.free = free; // gotta store this so we can free later
     }
     this(this) {
        if(impl !is null) impl.count++;
     }
     ~this() {
        if(impl is null) return; // default-initialized: nothing to release
        impl.count--;
        if(impl.count == 0) {
            // how do we know how to free it?
            impl.free(impl.payload);

            // delete impl; GC.free(impl); core.stdc.stdlib.free(impl);
            // whatever
            impl = null;
        }
     }
}


In this example, we take the reference we're counting in the 
constructor... which means it is already allocated. So logically, 
the user code should tell it how to deallocate it too. We can't 
just call a global free, we take a pointer instead.


So this would work kinda like this:

import core.stdc.stdlib;
// malloc returns void*, so cast before slicing to get an int[] of length 5
int[] stuff = (cast(int*) malloc(int.sizeof * 5))[0 .. 5];
auto counted = RefCounted!(int[])(stuff, (int[] s) { free(s.ptr); });



The allocator is not encoded in the type, but ref counted does 
need to know what happens when the final reference is gone. It 
takes a function pointer from the user for that.


This is a generic refcounting type. It isn't maximally 
efficient but it also works with arbitrary inputs allocated by 
any means.

Unique!T could do something similar, but unique would disable its 
postblit instead of incrementing a refcount.
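
A minimal sketch of that Unique variant, following the same shape as the RefCounted code above. The get accessor and the null check on the release function are additions made for this illustration:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical single-owner wrapper; the user says how to release the payload.
struct Unique(T)
{
    private T payload;
    private void function(T) release;

    this(T t, void function(T) release)
    {
        this.payload = t;
        this.release = release;
    }

    @disable this(this); // no copies, so there is no count to maintain

    ~this()
    {
        if (release !is null) // default-initialized instances own nothing
        {
            release(payload);
            release = null;
        }
    }

    ref inout(T) get() inout { return payload; }
}
```

Usage mirrors the refcounted example: Unique!(int[])(stuff, (int[] s) { free(s.ptr); }) takes ownership of a malloc'ed slice and frees it when the owner goes out of scope. Trying to copy the owner is a compile error instead of a refcount increment.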

Feb 06 2014

Johannes Pfau <nospam example.com> writes:

Am Thu, 06 Feb 2014 08:40:28 -0800
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");

 What's so bad about this?

 
 The issue here is that Phobos functions need to document whether e.g. 
 they return memory that can be deallocated or not. Counterexamples
 would be returning static strings or subslices of allocations.
 
 I'm not saying it's not solvable, but it'll take some thinking and
 some work.

That's true. I wonder how common these cases are but slices are
probably the bigger problem here. (OTOH if a function just slices the
input, we'd have to document it but there's no bigger issue)

 [...] Having an 'application default allocator' or
 'thread local default allocator' or 'per function default
 allocator' will actually hide the allocation strategy and I bet it
 would cause issues.

 
 I think a crack should be given to the user to install their own 
 allocator (per thread and/or shared). Perhaps we can limit that to
 the startup stage, i.e. before any allocation takes place.

If we can make that work then I won't complain. As long as the default
allocator can't be changed at a random point in time, most problems
should be solved for a global default allocator.
For per-thread allocators this is difficult: if you allocate in one
thread and free in another, how do you make sure you use the correct free
function?

There are some interesting possibilities though: For example we could
add a delegate to object which points to the correct 'free' function.
But then things get complicated if we have to manage the lifetime of the
allocator as well....

 This is the list of language features which allocate:

 [snip]
 
 I think you forgot AAs.

I had AA literals in the list, but you're right some other AA features
allocate as well. Good you mentioned that, I'll have to detect these
cases in -nogc/-vgc code as well...

However, from a user point of view dcollections (and I hope at some
point std.container as well) provides a nice replacement for all these
operations, except for literals.

 
 We just have to provide everyone with a way to choose their favorite
 implementation. Which means we provide public APIs which allow any
 kind of memory allocation and internally do not rely on automatic
 memory management (internal allocation in phobos should be done on
 the stack/ with malloc / made configurable, but not with a GC).

 
 I agree that's a nice goal. But I don't think it's easily attainable. 
 The "choose the allocator" part is easy. The harder part is choosing the 
 reclamation method. There are differences between GC and RC that are 
 very difficult to unify under a common API.
 

I'd guess that allocation is actually a bigger issue for those who
are unhappy with the GC right now, but I have no way to prove that ;-)
(Explicit manual freeing is annoying, but possible. But if a function
internally allocates with the GC it can't be used at all).

But you're of course right, getting reclamation right is probably more
difficult and also important.

Feb 06 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Feb 06, 2014 at 04:47:05PM +0100, Johannes Pfau wrote:
[...]
 Some people seem to want some implicit way to set a 'default'
 allocator, but I haven't heard of any solution that works. (E.g. having
 a thread-local default allocator, per library default allocator, how
 would that even work?)
 
 I don't think there's anything wrong with the obvious solution: All
 phobos functions which allocate take an optional Allocator parameter,
 defaulting to GC. The little extra typing won't harm anyone and if you
 want to use things like stack-based buffers you'll have to write extra
 code and think about memory allocation anyway.
 
 auto gcString = toUpper("test");
 auto mallocString = toUpper!Malloc("test");
 ubyte[64] sbuf;
 auto stackString = toUpper(sbuf[], "test");
 
 What's so bad about this? It works for most of phobos, doesn't require
 language changes and it's easy to realize what's going on when reading
 the code. Having an 'application default allocator' or 'thread local
 default allocator' or 'per function default allocator' will actually
 hide the allocation strategy and I bet it would cause issues.

[...]

I think a superior solution is to pass in an output range to toUpper,
that does whatever form of allocation you prefer. There's nothing about
toUpper that *fundamentally* depends on an allocator, therefore it
shouldn't even *care* what an allocator is. Reduced to its absolute
fundamentals, it just takes data from some input string, and produces
some output data. Where this output data goes is none of its concern --
it can be a GC string, an ARC string, stdout, an interprocess pipe, a
network socket, toUpper shouldn't have to care which one it is. Just
take an output range.

Then on the complementary side, have Phobos provide a bunch of premade
output ranges that allocate a GC string, or an ARC string, or whatever,
and then the user can just pick one of those to pass to toUpper.
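
As a rough illustration of the idea, assuming a hypothetical toUpperTo (no such function exists in Phobos) and a hand-rolled fixed-capacity sink, both ASCII-only to keep it short:

```d
import std.array : appender;
import std.ascii : toUpper;
import std.range.primitives : isOutputRange, put;

// Hypothetical name. The function knows nothing about allocation;
// it only pushes characters into whatever sink it is given.
void toUpperTo(Sink)(const(char)[] input, ref Sink sink)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; input)
        put(sink, toUpper(c));
}

// A sink over caller-owned storage: no allocation at all.
struct FixedSink
{
    char[] buf;
    size_t used;

    void put(char c)
    {
        assert(used < buf.length, "sink is full");
        buf[used++] = c;
    }

    const(char)[] data() const { return buf[0 .. used]; }
}
```

Passing an appender!string() gives a GC-allocated result, while a FixedSink over a stack array gives one with no allocation. A refcounted or malloc-backed sink would plug in the same way without toUpperTo changing.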


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen

Feb 06 2014

Max Klyga <email domain.com> writes:

On 2014-02-06 15:47:05 +0000, Johannes Pfau said:

 Am Thu, 6 Feb 2014 14:37:59 +0300
 schrieb Max Klyga <max.klyga gmail.com>:
 
 
 My point is that we should not ruin the language ease of use. We do
 need to deal with Phobos internal allocations, but we should not
 switch to ARC as a default memory management scheme.

 
 snip

I wholeheartedly agree that we should define methods in phobos taking 
output buffers/ranges.
One of the reasons the Tango XML parser was the fastest in the world was 
that almost every method/function in Tango took an output buffer 
as an argument and never allocated unless asked specifically.

This would allow everyone to choose the method of memory management most 
suited to their domain.

Feb 06 2014
