digitalmars.D - Componentizing D's garbage collector

Andrei Alexandrescu (41/41) Jan 10 2014 The GC comes up repeatedly in discussions around here. I've been

Joseph Rushton Wakeling (4/14) Jan 10 2014 Can you recommend some good background reading for those of us who would...

Jacob Carlborg (6/9) Jan 10 2014 There's the classic The Garbage Collection Handbook. Here in a later
Paulo Pinto (5/29) Jan 10 2014 Two good books
Bienlein (8/11) Jan 10 2014 You mean about how GCs work? I have the book "Garbage Collection:

Bienlein (3/14) Jan 10 2014 Seems like we were all in parallel with our book recommendations;

Jacob Carlborg (4/6) Jan 10 2014 That would hopefully indicate they're good recommendations :)

Regan Heath (12/15) Jan 10 2014 IMO, the Microsoft documentation for the C# GC is a good place to start.
Joseph Cassman (8/12) Jan 10 2014 I like the thesis "Scheduling Garbage Collection in Embedded

Joseph Rushton Wakeling (4/8) Jan 10 2014 Interesting -- I'll have to look that up.
Joseph Rushton Wakeling (3/7) Jan 10 2014 The link, for anyone who's interested:

Dmitry Olshansky (20/47) Jan 10 2014 O.T.: Please use full URLs, it's not that long.

Andrei Alexandrescu (8/37) Jan 10 2014 You know I wouldn't write that unless I had an ace up my sleeve. Watch

H. S. Teoh (17/48) Jan 10 2014 I don't see why having a templated mark() is a problem. Nothing says you
Joseph Rushton Wakeling (2/3) Jan 10 2014 Ahh, we're getting an allocation-based Dance of the Seven Veils here? :-...

bioinfornatics (5/5) Jan 10 2014 One big generic problem for GC that is to manage efficiently a
Sean Kelly (19/42) Jan 10 2014 This only exists because the runtime is statically linked and
Sean Kelly (3/3) Jan 10 2014 Oh, I assume this will eventually include your ideas about having

Andrei Alexandrescu (4/7) Jan 10 2014 Yah, such integration would make a lot of sense. Thanks for the

Benjamin Thaut (2/4) Jan 10 2014 Just one question. Did you read "the garbage collection handbook"?

Andrei Alexandrescu (4/10) Jan 10 2014 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or

Benjamin Thaut (13/24) Jan 11 2014 Very good,

Andrei Alexandrescu (10/34) Jan 11 2014 I'm not considering done as much as separable as a concern. All I'm

Benjamin Thaut (15/52) Jan 11 2014 Well as far as my understanding of GCs goes, you have two options:

Andrei Alexandrescu (5/10) Jan 11 2014 The way I see it/hope it turns out, precision is a separate concern from...
=?ISO-8859-1?Q?S=F6nke_Ludwig?= (5/10) Jan 11 2014 The language can help a lot there (pure... and scope, if "fixed"), so
Rainer Schuetze (8/65) Jan 12 2014 I think a moving collector is currently not feasible without restricting...

Benjamin Thaut (8/11) Jan 12 2014 Could you give an example which part of the language would not be doable...

Jakob Ovrum (7/10) Jan 12 2014 We don't necessarily need that, though. In D we have a plethora

Benjamin Thaut (3/13) Jan 12 2014 You should really try to write non-GC D code some time. You would be

Joseph Rushton Wakeling (3/5) Jan 12 2014 Is that down to the language, or parts of druntime and phobos

Benjamin Thaut (2/7) Jan 12 2014 Both. Although it kept getting better lately.

Jakob Ovrum (3/5) Jan 12 2014 I was, years ago.

Andrei Alexandrescu (4/17) Jan 12 2014 Yah, moving would be real nice. I hope to at least clarify the issues
Brian Rogoff (8/23) Jan 12 2014 I'd rather have more restrictions and a working precise GC, and

Benjamin Thaut (8/9) Jan 12 2014 I don't see any problem with pointer arithmetic? Either the pointer is

Sean Kelly (3/10) Jan 12 2014 Or you don't know if something is a pointer or not and so
Brad Anderson (5/15) Jan 12 2014 The garbage collection page of the D spec actually talks a lot

Rainer Schuetze (29/45) Jan 13 2014 Maybe I'm too pessimistic ;-) I guess moving in general could be

Jacob Carlborg (6/19) Jan 13 2014 Could we have a segregated heap for C pointers? Would that help?

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (15/18) Jan 13 2014 Is it possible to declare whether the C-function retains a

Dmitry Olshansky (6/26) Jan 13 2014 How would it be expensive? I don't see a need to do anything to "pin" a

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (3/7) Jan 13 2014 How do you know that it is a pointer?

Dmitry Olshansky (6/11) Jan 13 2014 Precisely. Since C space has no type information you have to

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (6/8) Jan 13 2014 I think we are misunderstanding each other? I don't think a Boehm

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (2/2) Jan 13 2014 …and scanning of C data segments is insufficient anyway, since

Dmitry Olshansky (7/9) Jan 13 2014 A stack is a stack is a stack.

Dmitry Olshansky (6/13) Jan 13 2014 So what? It can't move it and as such there is nothing special about it,...

Timon Gehr (5/17) Jan 13 2014 Yes, there is a cost. The requirement to pin arbitrary objects at

Jacob Carlborg (4/8) Jan 14 2014 Can't these object be pinned somewhere else?

Benjamin Thaut (4/10) Jan 14 2014 Yes, thats the usual approach. But that means that you have to copy them...

Jacob Carlborg (4/7) Jan 14 2014 Is there a problem to allocate them in the correct space from the beginn...

Benjamin Thaut (6/12) Jan 14 2014 Well you have to know it at allocation time. If you have a function that...

Timon Gehr (3/16) Jan 14 2014 Once it becomes known that an object should be pinned, it is too late

Benjamin Thaut (4/21) Jan 14 2014 Well no, usually you have to pin objects _before_ passing them to

Timon Gehr (8/30) Jan 14 2014 Well, there is no such manual mechanism in place, so only strategies

Benjamin Thaut (3/36) Jan 14 2014 Yes, but pinning will still be necessary for passing data to C/C++

Timon Gehr (2/4) Jan 14 2014 (I agree, it is the spec that does not.)

Timon Gehr (10/15) Jan 14 2014 BTW, our misunderstandings are rooted here:
Dmitry Olshansky (7/29) Jan 14 2014 If one accepts that any heap can have object that is pinned I don't see

Tofu Ninja (3/3) Jan 14 2014 Why are unions with pointers even allowed in the first place? It

Benjamin Thaut (5/8) Jan 14 2014 Yes, low level programming. D is supposed to be a systems programmin
deadalnix (7/10) Jan 14 2014 Tagged unions are very useful. For instance, to take a

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (11/13) Jan 13 2014 Well, the advantage of compaction is that you (in theory) can

Dmitry Olshansky (13/24) Jan 14 2014 Bleh. If anything you may as well lose it - manual memory management is

woh (7/7) Jan 14 2014 If Java with all it that time and money can not create a

Paulo Pinto (2/9) Jan 14 2014 Are you aware of Azul's GC or some real time VMs like Aonix PERC?

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (8/15) Jan 14 2014 Under what assumption? There is not optimal garbage collector,

Benjamin Thaut (6/20) Jan 13 2014 I also don't see how that would be expenive. Pinning is a common pratice...
Rainer Schuetze (32/42) Jan 15 2014 The stack reference can be moved outside the memory visible to

deadalnix (4/12) Jan 13 2014 I had some similar reflection. I'm not sure moving is worthwhile

Walter Bright (7/19) Jan 13 2014 A moving GC is already supported by D's semantics.

Brad Roberts (9/29) Jan 13 2014 The key advantage that moving gives to a GC is to allow it to have a blo...

Walter Bright (2/12) Jan 13 2014 The reason pinning doesn't particularly impede this is because pinning i...

Timon Gehr (2/11) Jan 13 2014 In Java or in D? Eg. there are no unions in Java.

Paulo Pinto (3/16) Jan 13 2014 C# has unions.

Timon Gehr (3/20) Jan 13 2014 "The common language runtime controls the physical layout of the data

Paulo Pinto (9/37) Jan 14 2014 Did you also read the remaining part of the page, or just looked

Timon Gehr (13/48) Jan 14 2014 Look, you hadn't done anything besides pasting for backing up your point...

Paulo Pinto (8/66) Jan 15 2014 Excuse me for being stupid.

Benjamin Thaut (7/14) Jan 13 2014 I would prefer a option where the user can specifiy a function to scan

Walter Bright (3/20) Jan 13 2014 I agree, but I was trying to correct the misperception that current D do...

Benjamin Thaut (9/11) Jan 14 2014 Current D does not allow a moving collector because of the lack of

Walter Bright (2/9) Jan 14 2014 This is not true, I've implemented one for Java.

Benjamin Thaut (2/16) Jan 14 2014 So how do you implement a semi-space GC with pinned objects?

renoX (6/29) Jan 14 2014 You manage the pinned object in a different list that the

Benjamin Thaut (8/32) Jan 14 2014 I meant a pure semi-space collector. A pure semi-space collector can not...

Walter Bright (2/5) Jan 14 2014 Be more flexible in what the semi-spaces are.

Benjamin Thaut (4/10) Jan 14 2014 So you just don't. Effectivley you work around the problem. I would

Walter Bright (6/16) Jan 14 2014 My experience with book algorithms is they usually require some adjustme...

Benjamin Thaut (2/3) Jan 14 2014 Then this was a big misunderstanding.

Frustrated (35/40) Jan 14 2014 So, the issue seems to be that everyone is treating the GC as the

deadalnix (3/45) Jan 14 2014 Try to do the pull request that goes with that, and we will be

bearophile (6/14) Jan 13 2014 Time ago I suggested an optional standard method for unions of

Dmitry Olshansky (8/31) Jan 12 2014 I might be ignorant but why can't we make "mostly-moving" collector?

Benjamin Thaut (17/47) Jan 12 2014 A semi-space garbage collector is best fitted for small short lived

sunspyre (27/28) Jan 12 2014 I just want to point out that from an outer perspective, this is

qznc (8/15) Jan 13 2014 That is not technically possible. A truly concurrent GC has heavy

Marco Leise (11/68) Jan 10 2014 Nothing to destroy yet. But it does feel good that someone is
Rainer Schuetze (17/59) Jan 10 2014 You mean the proxy functions? Yes, please axe 'em, that only works in

Andrei Alexandrescu (13/51) Jan 10 2014 I need to write some code to explain it all. An figure it all :o).

Rainer Schuetze (8/53) Jan 11 2014 I was thinking about using the proposed module info extension to

Jacob Carlborg (5/8) Jan 11 2014 You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ?...

Rainer Schuetze (4/10) Jan 12 2014 Yes. I'm not sure if there is enough compiler support yet to figure out

evansl (9/17) Jan 13 2014 [snip]

Rainer Schuetze (8/35) Jan 15 2014 std.emplace can be used on a partial memory block (e.g. as part

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (16/20) Jan 15 2014 This is a good point. It is a mistake to mix manual memory layout
evansl (23/55) Feb 28 2014 I'm afraid I'm not familiar with much of what you're talking about.

Walter Bright (3/8) Jan 10 2014 Rainer, I want to apologize to you for not getting your efforts the atte...
deadalnix (18/21) Jan 10 2014 One of the key to a fast GC is segregating the heap in smaller

H. S. Teoh (19/43) Jan 10 2014 Which unfortunately is not an option in D if write barriers are
Rainer Schuetze (8/30) Jan 11 2014 This might work with safe D. I haven't yet figured the consequences of

Timon Gehr (3/7) Jan 10 2014 I assume you are aware that there is implicit casting to immutable upon

deadalnix (2/14) Jan 10 2014 Is it a bird ? Is it a plane ? No it is *ISOLATED* !
Andrei Alexandrescu (4/12) Jan 10 2014 I don't know. Need to think about it. Maybe it's a wrong design decision...

H. S. Teoh (10/23) Jan 10 2014 [...]
Nicholas Londey (10/10) Jan 13 2014 In theory you could use a region allocator for a pure function

=?ISO-8859-1?Q?S=F6nke_Ludwig?= (4/12) Jan 12 2014 I'd say the easiest way to deal with it is to let the compiler emit a

Chris Williams (29/34) Jan 14 2014 An additional feature that I would like to suggest is the concept

Rikki Cattermole (4/42) Jan 14 2014 I really love this idea. However I'm not sure how kernels would

Chris Williams (7/10) Jan 15 2014 Sleep allows a thread to give way to another thread within the

Rikki Cattermole (4/15) Jan 15 2014 I'm referring to cpu prioritization. For the process. It'll

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

The GC comes up repeatedly in discussions around here. I've been 
thinking for a while about breaking it down into components, and now it 
seems the time is ripe.

The design at http://goo.gl/ZVCJeB seems to be a win. It works well, 
comprehends all major allocation tropes, someone implemented a subset of 
it in C++ and measured good results, and a coworker is considering 
adopting the design for a C++ project as well.

I've started with the next logical step - higher-level allocation that 
is aware of the type of the object being allocated, and realized that 
integrating a notion of tracing is appropriate at that level, and 
actually quite easy. So I'm thinking of just doing it.

A few highlights:

* The design will foster the small, composable components with 
required/optional primitives that worked for std.allocator.

* I plan to review and probably discard all of the pointers-to-functions 
nonsense in the current GC.

* At this point I won't worry about discovering roots; I assume druntime 
has the appropriate mechanisms in place. Instead I'll focus on 
primitives such as "given this root, mark all that transitively follows 
from it" (conservatively or not).

* I plan to rely on static introspection for the mark function, i.e:

void mark(T)(ref T obj);

would mark obj and everything transitively referred to by it. The 
function would probably take a second parameter that's the heap that obj 
is sitting in.

* I plan to segregate all objects that don't include references and 
pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely 
different heap than the "interesting" objects. For the simpler objects 
there's no need to save detailed type information so they can be stored 
in a more compact, efficient manner.

* There will be no possibility to change the type of certain objects 
once allocated. An allocation for an immutable object cannot e.g. make 
it later a mutable one. (This is an essential issue with the current 
allocator, I think.)

* At this point I'm unclear on how generations can be componentized, but 
am cautiously optimistic. Will see once I get to it.

One thing that would be great now would be to make an effort to review 
and merge the current precise GC work. I'm sure it will be of great help 
with breaking into components.


Destroy.

Andrei

Jan 10 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 10/01/14 09:03, Andrei Alexandrescu wrote:
 The GC comes up repeatedly in discussions around here. I've been thinking for a
 while about breaking it down into components, and now it seems the time is
ripe.

 The design at http://goo.gl/ZVCJeB seems to be a win. It works well,
comprehends
 all major allocation tropes, someone implemented a subset of it in C++ and
 measured good results, and a coworker is considering adopting the design for a
 C++ project as well.

 I've started with the next logical step - higher-level allocation that is aware
 of the type of the object being allocated, and realized that integrating a
 notion of tracing is appropriate at that level, and actually quite easy. So I'm
 thinking of just doing it.

Can you recommend some good background reading for those of us who would love
to 
have some input (or at least insight) to this, but don't yet have the 
theoretical understanding?

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 10:56, Joseph Rushton Wakeling wrote:

 Can you recommend some good background reading for those of us who would
 love to have some input (or at least insight) to this, but don't yet
 have the theoretical understanding?

There's the classic The Garbage Collection Handbook. Here in a later 
edition:

http://www.amazon.com/The-Garbage-Collection-Handbook-Management/dp/1420082795

-- 
/Jacob Carlborg

Jan 10 2014

"Paulo Pinto" <pjmlp progtools.org> writes:

On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton 
Wakeling wrote:
 On 10/01/14 09:03, Andrei Alexandrescu wrote:
 The GC comes up repeatedly in discussions around here. I've 
 been thinking for a
 while about breaking it down into components, and now it seems 
 the time is ripe.

 The design at http://goo.gl/ZVCJeB seems to be a win. It works 
 well, comprehends
 all major allocation tropes, someone implemented a subset of 
 it in C++ and
 measured good results, and a coworker is considering adopting 
 the design for a
 C++ project as well.

 I've started with the next logical step - higher-level 
 allocation that is aware
 of the type of the object being allocated, and realized that 
 integrating a
 notion of tracing is appropriate at that level, and actually 
 quite easy. So I'm
 thinking of just doing it.

 Can you recommend some good background reading for those of us 
 who would love to have some input (or at least insight) to 
 this, but don't yet have the theoretical understanding?

Two good books

http://www.amazon.de/The-Garbage-Collection-Handbook-Management/dp/1420082795

http://www.amazon.de/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484

Jan 10 2014

"Bienlein" <jeti789 web.de> writes:

On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton 
Wakeling wrote:

 Can you recommend some good background reading for those of us 
 who would love to have some input (or at least insight) to 
 this, but don't yet have the theoretical understanding?

You mean about how GCs work? I have the book "Garbage Collection: 
Algorithms for Automatic Dynamic Memory Management" by Richard 
Jones, Rafael D Lins. I can recommend it. It is not too technical 
that it gets too hard. But it doesn't tell the reader how to 
write a GC. It's more an overview of various approaches in GC 
construction.

Jan 10 2014

"Bienlein" <jeti789 web.de> writes:

On Friday, 10 January 2014 at 10:05:45 UTC, Bienlein wrote:
 On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton 
 Wakeling wrote:

 Can you recommend some good background reading for those of us 
 who would love to have some input (or at least insight) to 
 this, but don't yet have the theoretical understanding?

 You mean about how GCs work? I have the book "Garbage 
 Collection: Algorithms for Automatic Dynamic Memory Management" 
 by Richard Jones, Rafael D Lins. I can recommend it. It is not 
 too technical that it gets too hard. But it doesn't tell the 
 reader how to write a GC. It's more an overview of various 
 approaches in GC construction.

Seems like we were all in parallel with our book recommendations; 
only some minutes apart ...

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 11:30, Bienlein wrote:

 Seems like we were all in parallel with our book recommendations; only
 some minutes apart ...

That would hopefully indicate they're good recommendations :)

-- 
/Jacob Carlborg

Jan 10 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 10 Jan 2014 09:56:45 -0000, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:
 Can you recommend some good background reading for those of us who would  
 love to have some input (or at least insight) to this, but don't yet  
 have the theoretical understanding?


http://msdn.microsoft.com/en-us/library/ms973837.aspx

It's an overview of a specific collector so you get a concrete understand  
of one way to do it, which you can then use as a foundation to build  
understanding on etc.  It's the only thing I've read on GC and I  
understood (at least at the surface level) Andrei's post completely.  Plus  
it's not too long, and available online for free :)

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 10 2014

"Joseph Cassman" <jc7919 outlook.com> writes:

On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton 
Wakeling wrote:
 On 10/01/14 09:03, Andrei Alexandrescu wrote:

 Can you recommend some good background reading for those of us 
 who would love to have some input (or at least insight) to 
 this, but don't yet have the theoretical understanding?

I like the thesis "Scheduling Garbage Collection in Embedded 
Systems" by Roger Henriksson. It shows how GC can work with 
various types of systems, including real-time embedded control 
systems. It includes a comparison of GC styles. And disects an 
actual algorithm.

Joseph

Jan 10 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 10/01/14 23:25, Joseph Cassman wrote:
 I like the thesis "Scheduling Garbage Collection in Embedded Systems" by Roger
 Henriksson. It shows how GC can work with various types of systems, including
 real-time embedded control systems. It includes a comparison of GC styles. And
 disects an actual algorithm.

Interesting -- I'll have to look that up.

Thanks to everyone for your suggestions -- some good new books to add to the 
reading list! :-)

Jan 10 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 10/01/14 23:25, Joseph Cassman wrote:
 I like the thesis "Scheduling Garbage Collection in Embedded Systems" by Roger
 Henriksson. It shows how GC can work with various types of systems, including
 real-time embedded control systems. It includes a comparison of GC styles. And
 disects an actual algorithm.

The link, for anyone who's interested:
http://www.cs.lth.se/home/Roger_Henriksson/thesis.pdf

Jan 10 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

10-Jan-2014 12:03, Andrei Alexandrescu пишет:
 The GC comes up repeatedly in discussions around here. I've been
 thinking for a while about breaking it down into components, and now it
 seems the time is ripe.

 The design at http://goo.gl/ZVCJeB seems to be a win.

O.T.: Please use full URLs, it's not that long.

Also how about actually starting formal process to include it as 
core.alllocator (package preferably).  This activity can go in parallel 
with with getting a through reveiw of currrent GC and taking steps 
towards precise GC into the mainstream.

 I've started with the next logical step - higher-level allocation that
 is aware of the type of the object being allocated, and realized that
 integrating a notion of tracing is appropriate at that level, and
 actually quite easy. So I'm thinking of just doing it.

 A few highlights:

 * The design will foster the small, composable components with
 required/optional primitives that worked for std.allocator.

 * I plan to review and probably discard all of the pointers-to-functions
 nonsense in the current GC.

Cool so far.

 * I plan to rely on static introspection for the mark function, i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. The
 function would probably take a second parameter that's the heap that obj
 is sitting in.

This will not work AFAICT. Since nobody can tell what kind of type a
user-defined program may have and GC is certainly not templated on all 
types.
GC would have to rely on TypeInfos or rather RTInfo-s that describe the 
layout of Ts in a uniform way (bitmaps to indicate pointers, etc.).
I think Walter had some ideas on how to represent these to reduce space.

 * I plan to segregate all objects that don't include references and
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely
 different heap than the "interesting" objects. For the simpler objects
 there's no need to save detailed type information so they can be stored
 in a more compact, efficient manner.

Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.

 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

Would that mean 3 separate heaps, so that addresses given for immutable 
stuff would never be (sometime later) assigned to (thread-local) 
mutable? I might have got this point of yours wrong, but I think there 
is a merit in having separate heap at least for shared stuff.

-- 
Dmitry Olshansky

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 3:29 AM, Dmitry Olshansky wrote:
 10-Jan-2014 12:03, Andrei Alexandrescu пишет:
 * I plan to rely on static introspection for the mark function, i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. The
 function would probably take a second parameter that's the heap that obj
 is sitting in.

 This will not work AFAICT. Since nobody can tell what kind of type a
 user-defined program may have and GC is certainly not templated on all
 types.
 GC would have to rely on TypeInfos or rather RTInfo-s that describe the
 layout of Ts in a uniform way (bitmaps to indicate pointers, etc.).
 I think Walter had some ideas on how to represent these to reduce space.

You know I wouldn't write that unless I had an ace up my sleeve. Watch 
me :o).

 * I plan to segregate all objects that don't include references and
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely
 different heap than the "interesting" objects. For the simpler objects
 there's no need to save detailed type information so they can be stored
 in a more compact, efficient manner.

 Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.

That design will disappear. The attribute won't be settable once 
initialized, and will never be conservative ("scan all words in this 
block").

 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

 Would that mean 3 separate heaps, so that addresses given for immutable
 stuff would never be (sometime later) assigned to (thread-local)
 mutable? I might have got this point of yours wrong, but I think there
 is a merit in having separate heap at least for shared stuff.

Yah, separation of shared data allocation is crucial. We'll see.


Andrei

Jan 10 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 08:22:56AM -0800, Andrei Alexandrescu wrote:
 On 1/10/14 3:29 AM, Dmitry Olshansky wrote:
10-Jan-2014 12:03, Andrei Alexandrescu пишет:
* I plan to rely on static introspection for the mark function, i.e:

void mark(T)(ref T obj);

would mark obj and everything transitively referred to by it. The
function would probably take a second parameter that's the heap that
obj is sitting in.

This will not work AFAICT. Since nobody can tell what kind of type a
user-defined program may have and GC is certainly not templated on
all types.  GC would have to rely on TypeInfos or rather RTInfo-s
that describe the layout of Ts in a uniform way (bitmaps to indicate
pointers, etc.).  I think Walter had some ideas on how to represent
these to reduce space.

 
 You know I wouldn't write that unless I had an ace up my sleeve.
 Watch me :o).

I don't see why having a templated mark() is a problem. Nothing says you
can't do this:

	module core.gc;
	void mark(T)(ref T obj) {
		size_t objSize = T.sizeof;
		// ... and whatever other type-specific info you need
		// here.

		// N.B.: non-template function call
		__mark_impl(cast(void*)&obj, objSize, /* ... whatever else is needed */);
	}


* I plan to segregate all objects that don't include references and
pointers (e.g. int, int[], Tuple!(int, double) etc) into a
completely different heap than the "interesting" objects. For the
simpler objects there's no need to save detailed type information so
they can be stored in a more compact, efficient manner.

Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.

 
 That design will disappear. The attribute won't be settable once
 initialized, and will never be conservative ("scan all words in this
 block").

[...]

So this will be the beginning of the precise GC that we've been talking
about for who knows how long now? I like that!


T

-- 
Любишь кататься - люби и саночки возить.

Jan 10 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 10/01/14 17:22, Andrei Alexandrescu wrote:
 You know I wouldn't write that unless I had an ace up my sleeve. Watch me :o).

Ahh, we're getting an allocation-based Dance of the Seven Veils here? :-)

Jan 10 2014

"bioinfornatics" <bioinfornatics fedoraproject.org> writes:

One big generic problem for GC that is to manage efficiently a 
huge tree.
More you program use memory for GC will go slower.
Maybe to have a GC profile for these applications could to be a 
good thing.

Jan 10 2014

"Sean Kelly" <sean invisibleduck.org> writes:

On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu
wrote:
 The GC comes up repeatedly in discussions around here. I've 
 been thinking for a while about breaking it down into 
 components, and now it seems the time is ripe.

 * The design will foster the small, composable components with 
 required/optional primitives that worked for std.allocator.

 * I plan to review and probably discard all of the 
 pointers-to-functions nonsense in the current GC.

This only exists because the runtime is statically linked and
supporting D in a DLL was a requirement.  Grafting together GCs
from multiple runtimes was the easiest way to do this at the
time, but the correct solution is really just to put the runtime
in a DLL and jettison the whole idea of a GC proxy.  In short,
I'm all for it.


 * I plan to rely on static introspection for the mark function, 
 i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. 
 The function would probably take a second parameter that's the 
 heap that obj is sitting in.

This will take some work.  Maybe thread_scanAll can take an alias
to such a function instead of a function pointer as it does
today.  Something along those lines anyway.  In short, the
template part is what's tricky about this approach.


 * I plan to segregate all objects that don't include references 
 and pointers (e.g. int, int[], Tuple!(int, double) etc) into a 
 completely different heap than the "interesting" objects. For 
 the simpler objects there's no need to save detailed type 
 information so they can be stored in a more compact, efficient 
 manner.

Yay!  I don't like how the current GC mixes these and indicates
the presence of pointers via a bit flag.


 * There will be no possibility to change the type of certain 
 objects once allocated. An allocation for an immutable object 
 cannot e.g. make it later a mutable one. (This is an essential 
 issue with the current allocator, I think.)

I think this is reasonable, since it also may affect which pool
the data is allocated in.  I really would love to aim for a
concurrent allocator scheme.  Kinda like HOARD but with a GC
attached.  But this is really only feasible if you make some hard
assertions about type conversion.

Jan 10 2014

"Sean Kelly" <sean invisibleduck.org> writes:

Oh, I assume this will eventually include your ideas about having
the GC manage mapped files?  That one has lingered on the
"someday" pile for too long.

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 10:57 AM, Sean Kelly wrote:
 Oh, I assume this will eventually include your ideas about having
 the GC manage mapped files?  That one has lingered on the
 "someday" pile for too long.

Yah, such integration would make a lot of sense. Thanks for the 
reminder, I'd forgotten about it.

Andrei

Jan 10 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

Just one question. Did you read "the garbage collection handbook"?

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 11:34 AM, Benjamin Thaut wrote:
 Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

 Just one question. Did you read "the garbage collection handbook"?

Yah, the new edition. Unfortunately I didn't find a lot of new stuff or 
things that would help in a practical implementation.

Andrei

Jan 10 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:
 On 1/10/14 11:34 AM, Benjamin Thaut wrote:
 Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

 Just one question. Did you read "the garbage collection handbook"?

 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
 things that would help in a practical implementation.

 Andrei

Very good,

In my opinion it should be required for everyone who wants to 
participate in D's GC to read that book. So everyone has at least a 
basic understanding of the problem at Hand.

But if you read the book I don't understand, why you simply declare the 
hardest problem, percise pointer discovery, as done. To be able to 
actually implement and test a GC that is not a reimplementation of what 
we already have one needs percise pointer discovery of _all_ pointers, 
write barriers and GC halting points. So please enlighten me why you 
simply decalre this done?

Kind Regards
Benjamin Thaut

Jan 11 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/11/14 3:52 AM, Benjamin Thaut wrote:
 Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:
 On 1/10/14 11:34 AM, Benjamin Thaut wrote:
 Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

 Just one question. Did you read "the garbage collection handbook"?

 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
 things that would help in a practical implementation.

 Andrei

 Very good,

 In my opinion it should be required for everyone who wants to
 participate in D's GC to read that book. So everyone has at least a
 basic understanding of the problem at Hand.

 But if you read the book I don't understand, why you simply declare the
 hardest problem, percise pointer discovery, as done. To be able to
 actually implement and test a GC that is not a reimplementation of what
 we already have one needs percise pointer discovery of _all_ pointers,
 write barriers and GC halting points. So please enlighten me why you
 simply decalre this done?

I'm not considering done as much as separable as a concern. All I'm 
saying is I hope to be able to separate the low-level part of 
discovering roots from the high-level part of marking used memory. BTW 
the way I see this done is "mostly precise", i.e. there will be at least 
for a while some words that will be handled conservatively (stack, 
registers, certain union members).

If there's anything in the GC book that suggest that would be 
impossible, please do let me know!


Andrei

Jan 11 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:
 On 1/11/14 3:52 AM, Benjamin Thaut wrote:
 Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:
 On 1/10/14 11:34 AM, Benjamin Thaut wrote:
 Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

 Just one question. Did you read "the garbage collection handbook"?

 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
 things that would help in a practical implementation.

 Andrei

 Very good,

 In my opinion it should be required for everyone who wants to
 participate in D's GC to read that book. So everyone has at least a
 basic understanding of the problem at Hand.

 But if you read the book I don't understand, why you simply declare the
 hardest problem, percise pointer discovery, as done. To be able to
 actually implement and test a GC that is not a reimplementation of what
 we already have one needs percise pointer discovery of _all_ pointers,
 write barriers and GC halting points. So please enlighten me why you
 simply decalre this done?

 I'm not considering done as much as separable as a concern. All I'm
 saying is I hope to be able to separate the low-level part of
 discovering roots from the high-level part of marking used memory. BTW
 the way I see this done is "mostly precise", i.e. there will be at least
 for a while some words that will be handled conservatively (stack,
 registers, certain union members).

 If there's anything in the GC book that suggest that would be
 impossible, please do let me know!


 Andrei

Well as far as my understanding of GCs goes, you have two options:

1) Impercise pointer discovery => limiting yourself to a mark & sweep
2) Percise pointer disccovery => option to use a semi space GC in 
combination with a mark & sweep and generations, which allows for 
superior handling of short lived allocations (which is the biggest 
problem of D's current GC).

Also without percise pointer discovery you will never be able to move 
objects from one heap into another. This would be especially a problem 
for the immutable heap, because you might want to allocate all strings 
on the thread local heap until you discover that you actually need them 
to be shared between threads. You might also need to move objects into 
antoher heap whenever a casts happens or global variable is assigned.

Kind Regards
Benjamin Thaut

Jan 11 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/11/14 1:20 PM, Benjamin Thaut wrote:
 1) Impercise pointer discovery => limiting yourself to a mark & sweep
 2) Percise pointer disccovery => option to use a semi space GC in
 combination with a mark & sweep and generations, which allows for
 superior handling of short lived allocations (which is the biggest
 problem of D's current GC).

The way I see it/hope it turns out, precision is a separate concern from 
what I'm working on now, which is tracing.

Andrei

P.S. s/percise/precise/g

Jan 11 2014

=?ISO-8859-1?Q?S=F6nke_Ludwig?= <sludwig+dforum outerproduct.org> writes:

Am 11.01.2014 22:20, schrieb Benjamin Thaut:
 Also without percise pointer discovery you will never be able to move
 objects from one heap into another. This would be especially a problem
 for the immutable heap, because you might want to allocate all strings
 on the thread local heap until you discover that you actually need them
 to be shared between threads.

The language can help a lot there (pure... and scope, if "fixed"), so 
that the compiler can emit the necessary calls to make this explicit, 
because it already knows that it's safe without performing a garbage 
collection cycle.

Jan 11 2014

Rainer Schuetze <r.sagitario gmx.de> writes:

On 11.01.2014 22:20, Benjamin Thaut wrote:
 Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:
 On 1/11/14 3:52 AM, Benjamin Thaut wrote:
 Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:
 On 1/10/14 11:34 AM, Benjamin Thaut wrote:
 Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
 Destroy.

 Andrei

 Just one question. Did you read "the garbage collection handbook"?

 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
 things that would help in a practical implementation.

 Andrei

 Very good,

 In my opinion it should be required for everyone who wants to
 participate in D's GC to read that book. So everyone has at least a
 basic understanding of the problem at Hand.

 But if you read the book I don't understand, why you simply declare the
 hardest problem, percise pointer discovery, as done. To be able to
 actually implement and test a GC that is not a reimplementation of what
 we already have one needs percise pointer discovery of _all_ pointers,
 write barriers and GC halting points. So please enlighten me why you
 simply decalre this done?

 I'm not considering done as much as separable as a concern. All I'm
 saying is I hope to be able to separate the low-level part of
 discovering roots from the high-level part of marking used memory. BTW
 the way I see this done is "mostly precise", i.e. there will be at least
 for a while some words that will be handled conservatively (stack,
 registers, certain union members).

 If there's anything in the GC book that suggest that would be
 impossible, please do let me know!


 Andrei

 Well as far as my understanding of GCs goes, you have two options:

 1) Impercise pointer discovery => limiting yourself to a mark & sweep
 2) Percise pointer disccovery => option to use a semi space GC in
 combination with a mark & sweep and generations, which allows for
 superior handling of short lived allocations (which is the biggest
 problem of D's current GC).

 Also without percise pointer discovery you will never be able to move
 objects from one heap into another. This would be especially a problem
 for the immutable heap, because you might want to allocate all strings
 on the thread local heap until you discover that you actually need them
 to be shared between threads. You might also need to move objects into
 antoher heap whenever a casts happens or global variable is assigned.

 Kind Regards
 Benjamin Thaut

I think a moving collector is currently not feasible without restricting 
the language a lot, probably similar to safe D and more. I'm not sure we 
want that in general.

We should aim at something in between mark & sweep and compacting 
generational collection, e.g. a non-moving collector that keeps track of 
the youngest generation (I just made that up, not sure if it is 
realistic). Adding concurrency would also be nice...

Jan 12 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

Could you give an example which part of the language would not be doable 
with a moving collector? The only thing that comes to my mind is unions 
and that problem can be solved by allowing the user to specify manual 
scanning functions for structs or classes containing unions.

Also I don't think that we can create a GC which performs as good as the 

for a moving gc.

Jan 12 2014

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
 Also I don't think that we can create a GC which performs as 

 neccessary changes for a moving gc.

We don't necessarily need that, though. In D we have a plethora 
of non-GC options, so it might be a better idea to tailor our GC 
to typical D programs rather than trying to reproduce the Java 

perspective, I don't think the young generation is significant in 
idiomatic D code, unlike Java code which relies on it heavily.

Jan 12 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 12.01.2014 11:54, schrieb Jakob Ovrum:
 On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
 Also I don't think that we can create a GC which performs as good as

 changes for a moving gc.

 We don't necessarily need that, though. In D we have a plethora of
 non-GC options, so it might be a better idea to tailor our GC to typical

 For example, from a generational GC perspective, I don't think the young
 generation is significant in idiomatic D code, unlike Java code which
 relies on it heavily.

You should really try to write non-GC D code some time. You would be 
astonished how many hidden allocations there are.

Jan 12 2014

"Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:

On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
 You should really try to write non-GC D code some time. You 
 would be astonished how many hidden allocations there are.

Is that down to the language, or parts of druntime and phobos 
that assume GC when they shouldn't?

Jan 12 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 12.01.2014 12:12, schrieb Joseph Rushton Wakeling:
 On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
 You should really try to write non-GC D code some time. You would be
 astonished how many hidden allocations there are.

 Is that down to the language, or parts of druntime and phobos that
 assume GC when they shouldn't?

Both. Although it kept getting better lately.

Jan 12 2014

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
 You should really try to write non-GC D code some time.

I do.

 You would be astonished how many hidden allocations there are.

I was, years ago.

Jan 12 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/12/14 2:40 AM, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

 Could you give an example which part of the language would not be doable
 with a moving collector? The only thing that comes to my mind is unions
 and that problem can be solved by allowing the user to specify manual
 scanning functions for structs or classes containing unions.

 Also I don't think that we can create a GC which performs as good as the

 for a moving gc.

Yah, moving would be real nice. I hope to at least clarify the issues 
related to moving with my work.

Andrei

Jan 12 2014

"Brian Rogoff" <brogoff gmail.com> writes:

On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without 
 restricting
 the language a lot, probably similar to safe D and more. I'm 
 not sure we
 want that in general.


I'd rather have more restrictions and a working precise GC, and 
let those who wish to do without the
safety ask for it explicitly.

 Could you give an example which part of the language would not 
 be doable with a moving collector? The only thing that comes to 
 my mind is unions and that problem can be solved by allowing 
 the user to specify manual scanning functions for structs or 
 classes containing unions.

How would the moving GC deal with pointer arithmetic?

 Also I don't think that we can create a GC which performs as 

 neccessary changes for a moving gc.


is to be a 'fully garbage collected language' then it will have 
to support a state of the art GC.

Jan 12 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 12.01.2014 18:12, schrieb Brian Rogoff:
 How would the moving GC deal with pointer arithmetic?

I don't see any problem with pointer arithmetic? Either the pointer is 
pointing to a gc managed memory block and will be patched accordingly, 
or it is not pointing to a gc managed memory block, and nothing will 
happen.

Obviously if you do unsafe things, you have to know what you are doing 
and might give additional information to the gc so it can propperly 
support unsafe stuff.

Jan 12 2014

"Sean Kelly" <sean invisibleduck.org> writes:

On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:
 Am 12.01.2014 18:12, schrieb Brian Rogoff:
 How would the moving GC deal with pointer arithmetic?

 I don't see any problem with pointer arithmetic? Either the 
 pointer is pointing to a gc managed memory block and will be 
 patched accordingly, or it is not pointing to a gc managed 
 memory block, and nothing will happen.

Or you don't know if something is a pointer or not and so 
whatever it references is pinned.

Jan 12 2014

"Brad Anderson" <eco gnuk.net> writes:

On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:
 Am 12.01.2014 18:12, schrieb Brian Rogoff:
 How would the moving GC deal with pointer arithmetic?

 I don't see any problem with pointer arithmetic? Either the 
 pointer is pointing to a gc managed memory block and will be 
 patched accordingly, or it is not pointing to a gc managed 
 memory block, and nothing will happen.

 Obviously if you do unsafe things, you have to know what you 
 are doing and might give additional information to the gc so it 
 can propperly support unsafe stuff.

The garbage collection page of the D spec actually talks a lot
about what is safe and unsafe (even though D doesn't have a
moving collector).

http://dlang.org/garbage.html

Jan 12 2014

"Rainer Schuetze" <r.sagitario gmx.de> writes:

On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without 
 restricting
 the language a lot, probably similar to safe D and more. I'm 
 not sure we
 want that in general.

 Could you give an example which part of the language would not 
 be doable with a moving collector? The only thing that comes to 
 my mind is unions and that problem can be solved by allowing 
 the user to specify manual scanning functions for structs or 
 classes containing unions.

Maybe I'm too pessimistic ;-) I guess moving in general could be 
ok, I was thinking about segregating heaps by type 
(shared/immutable/mutable) and moving data between them adds 
restrictions. I'd like to be proven wrong.

Some thoughts regarding a moving collector:

- interfacing with C/C++ is problematic: even if a pointer is 
passed on the stack, it is not guaranteed that this stack entry 
is not modified by the called function. While this might cause 
problems when collecting memory due to this being the last 
reference to the data, it is much more likely that there are 
still references, but moving pointers will definitely fail.
Having to explicitely pin every pointer passed to C functions 
would be very expensive.

- if we have many pinned objects, compaction loses some of its 
advantages (like fast allocation). If we keep allocating from 
free lists of equal sized bins, fragmantation and memory overhead 
is limited (though not small) without moving. Also, moving lot's 
of data can also be expensive.

- interior pointers do not allow "threading" for pointer updates, 
so it might get pretty expensive to update references to moved 
objects


 Also I don't think that we can create a GC which performs as 

 neccessary changes for a moving gc.

To compete with other GCs we'd probably need write barriers to 
keep track of changed references (for concurrent operation or 
generations). There just needs to be a way to avoid having to 
rescan the full heap every time, it does not scale.

PS: my SSD with all the D stuff just died yesterday (was it a 
failure of the disks GC?). I'll probably need some time to 
recover from that...

Jan 13 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-13 10:20, Rainer Schuetze wrote:

 Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I
 was thinking about segregating heaps by type (shared/immutable/mutable)
 and moving data between them adds restrictions. I'd like to be proven
 wrong.

 Some thoughts regarding a moving collector:

 - interfacing with C/C++ is problematic: even if a pointer is passed on
 the stack, it is not guaranteed that this stack entry is not modified by
 the called function. While this might cause problems when collecting
 memory due to this being the last reference to the data, it is much more
 likely that there are still references, but moving pointers will
 definitely fail.
 Having to explicitely pin every pointer passed to C functions would be
 very expensive.

Could we have a segregated heap for C pointers? Would that help? 
Basically having a special function allocating everything that should 
interface with C.

-- 
/Jacob Carlborg

Jan 13 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Monday, 13 January 2014 at 10:59:44 UTC, Jacob Carlborg wrote:
 Could we have a segregated heap for C pointers? Would that 
 help? Basically having a special function allocating everything 
 that should interface with C.

Is it possible to declare whether the C-function retains a 
pointer to the memory area or not, and to what extent? In general 
you will have to assume that the C code retains pointers not only 
to the object, but the transitive closure of anything that can be 
reached from it. That is quite extensive…

I also hope that the GC isn't fully modularized, because compiler 
support for specific GC strategies is likely to give better 
performance. Especially with whole program analysis.

It would be very nice to localize GC to a few threads. This  
would be useful in games where you only want to GC the AI/game 
mechanics portion of the simulated world, but use less demanding 
memory management for graphics, physics etc, which keeps running 
uninterrupted (one can interpolate/predict for a few frames 
giving the GC some more room to complete).

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Jan-2014 13:20, Rainer Schuetze пишет:
 On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

 Could you give an example which part of the language would not be
 doable with a moving collector? The only thing that comes to my mind
 is unions and that problem can be solved by allowing the user to
 specify manual scanning functions for structs or classes containing
 unions.

 Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I
 was thinking about segregating heaps by type (shared/immutable/mutable)
 and moving data between them adds restrictions. I'd like to be proven
 wrong.

 Some thoughts regarding a moving collector:

 Having to explicitely pin every pointer passed to C functions would be
 very expensive.

How would it be expensive? I don't see a need to do anything to "pin" a 
memory block, at the time of scanning there will be a potential pointer 
to it (in the stack space of C function or registers).

-- 
Dmitry Olshansky

Jan 13 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky 
wrote:
 How would it be expensive? I don't see a need to do anything to 
 "pin" a memory block, at the time of scanning there will be a 
 potential pointer to it (in the stack space of C function or 
 registers).

How do you know that it is a pointer?

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Jan-2014 21:49, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" пишет:
 On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky wrote:
 How would it be expensive? I don't see a need to do anything to "pin"
 a memory block, at the time of scanning there will be a potential
 pointer to it (in the stack space of C function or registers).

 How do you know that it is a pointer?

Precisely. Since C space has no type information you have to 
conservatively assume that everything can be a pointer.

-- 
Dmitry Olshansky

Jan 13 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky 
wrote:
 Precisely. Since C space has no type information you have to 
 conservatively assume that everything can be a pointer.

I think we are misunderstanding each other? I don't think a Boehm 
style GC can do compaction, since you cannot modify the 
"pointer", hence you have to "pin down" the object it possibly 
points to (to prevent it from moving)?

Jan 13 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

…and scanning of C data segments is insufficient anyway, since 
protected drivers can retain pointers that are out of reach...

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Jan-2014 22:56, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" пишет:
 …and scanning of C data segments is insufficient anyway, since protected
 drivers can retain pointers that are out of reach...

A stack is a stack is a stack.
Not that D haven't had problem with hidden pointers such as these in C 
heap (and not being marked explicitly).

-- 
Dmitry Olshansky

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Jan-2014 22:44, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" пишет:
 On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky wrote:
 Precisely. Since C space has no type information you have to
 conservatively assume that everything can be a pointer.

 I think we are misunderstanding each other? I don't think a Boehm style
 GC can do compaction, since you cannot modify the "pointer", hence you
 have to "pin down" the object it possibly points to (to prevent it from
 moving)?

So what? It can't move it and as such there is nothing special about it, 
it doesn't _cost_ anything to pin object.

-- 
Dmitry Olshansky

Jan 13 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/13/2014 10:08 PM, Dmitry Olshansky wrote:
 13-Jan-2014 22:44, "Ola Fosheim Grøstad"
 <ola.fosheim.grostad+dlang gmail.com>" пишет:
 On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky wrote:
 Precisely. Since C space has no type information you have to
 conservatively assume that everything can be a pointer.

 I think we are misunderstanding each other? I don't think a Boehm style
 GC can do compaction, since you cannot modify the "pointer", hence you
 have to "pin down" the object it possibly points to (to prevent it from
 moving)?

 So what? It can't move it and as such there is nothing special about it,
 it doesn't _cost_ anything to pin object.

Yes, there is a cost. The requirement to pin arbitrary objects at 
arbitrary times, without the possibility to move at pinning time, 
invalidates GC algorithms that depend on being able to move _all_ 
objects within some pool.

Jan 13 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

Can't these object be pinned somewhere else?

-- 
/Jacob Carlborg

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

Yes, thats the usual approach. But that means that you have to copy them 
to a other space before pinning, or allocate them in a space that allows 
pinning in the first place.

Jan 14 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-14 10:20, Benjamin Thaut wrote:

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

Is there a problem to allocate them in the correct space from the beginning?

-- 
/Jacob Carlborg

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 10:52, schrieb Jacob Carlborg:
 On 2014-01-14 10:20, Benjamin Thaut wrote:

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

 Is there a problem to allocate them in the correct space from the
 beginning?

Well you have to know it at allocation time. If you have a function that 
takes a blob of data as argument and passes it to C, the caller does not 
know of the requirement and you (maybe) have to copy inside the callee.

Kind Regards
Benjamin Thaut

Jan 14 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/14/2014 10:20 AM, Benjamin Thaut wrote:
 Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

Once it becomes known that an object should be pinned, it is too late 
for moving. (There will then exist a potential pointer to the object.)

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 14:42, schrieb Timon Gehr:
 On 01/14/2014 10:20 AM, Benjamin Thaut wrote:
 Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

 Once it becomes known that an object should be pinned, it is too late
 for moving. (There will then exist a potential pointer to the object.)

Well no, usually you have to pin objects _before_ passing them to 
C-land. Pinning is not a automatic process its done manually (or by the 
compiler).

Jan 14 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/14/2014 06:56 PM, Benjamin Thaut wrote:
 Am 14.01.2014 14:42, schrieb Timon Gehr:
 On 01/14/2014 10:20 AM, Benjamin Thaut wrote:
 Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

 Once it becomes known that an object should be pinned, it is too late
 for moving. (There will then exist a potential pointer to the object.)

 Well no, usually you have to pin objects _before_ passing them to
 C-land. Pinning is not a automatic process its done manually (or by the
 compiler).

Well, there is no such manual mechanism in place, so only strategies 
implemented fully within the compiler work.

I think it is not very sensible to let the compiler insert call-backs to 
eg. pin objects pointed to from unions. It might as well track occupied 
pointer fields explicitly then.

In any case, I think we should just require manual marking of unions and 
call it a day.

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 19:26, schrieb Timon Gehr:
 On 01/14/2014 06:56 PM, Benjamin Thaut wrote:
 Am 14.01.2014 14:42, schrieb Timon Gehr:
 On 01/14/2014 10:20 AM, Benjamin Thaut wrote:
 Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

 Yes, thats the usual approach. But that means that you have to copy
 them
 to a other space before pinning, or allocate them in a space that
 allows
 pinning in the first place.

 Once it becomes known that an object should be pinned, it is too late
 for moving. (There will then exist a potential pointer to the object.)

 Well no, usually you have to pin objects _before_ passing them to
 C-land. Pinning is not a automatic process its done manually (or by the
 compiler).

 Well, there is no such manual mechanism in place, so only strategies
 implemented fully within the compiler work.

 I think it is not very sensible to let the compiler insert call-backs to
 eg. pin objects pointed to from unions. It might as well track occupied
 pointer fields explicitly then.

 In any case, I think we should just require manual marking of unions and
 call it a day.

Yes, but pinning will still be necessary for passing data to C/C++ 
functions.

Jan 14 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/14/2014 07:29 PM, Benjamin Thaut wrote:
 Yes, but pinning will still be necessary for passing data to C/C++
 functions.

(I agree, it is the spec that does not.)

Jan 14 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/14/2014 06:56 PM, Benjamin Thaut wrote:
 Once it becomes known that an object should be pinned, it is too late
 for moving. (There will then exist a potential pointer to the object.)

 Well no, usually you have to pin objects _before_ passing them to
 C-land. Pinning is not a automatic process its done manually (or by the
 compiler).

BTW, our misunderstandings are rooted here:
http://dlang.org/garbage.html

"Object Pinning and a Moving Garbage Collector

Although D does not currently use a moving garbage collector, by 
following the rules listed above one can be implemented. No special 
action is required to pin objects. A moving collector will only move 
objects for which there are no ambiguous references, and for which it 
can update those references. All other objects will be automatically 
pinned."

Jan 14 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

14-Jan-2014 21:56, Benjamin Thaut пишет:
 Am 14.01.2014 14:42, schrieb Timon Gehr:
 On 01/14/2014 10:20 AM, Benjamin Thaut wrote:
 Am 14.01.2014 09:15, schrieb Jacob Carlborg:
 On 2014-01-13 22:23, Timon Gehr wrote:

 Yes, there is a cost. The requirement to pin arbitrary objects at
 arbitrary times, without the possibility to move at pinning time,
 invalidates GC algorithms that depend on being able to move _all_
 objects within some pool.

 Can't these object be pinned somewhere else?

 Yes, thats the usual approach. But that means that you have to copy them
 to a other space before pinning, or allocate them in a space that allows
 pinning in the first place.

 Once it becomes known that an object should be pinned, it is too late
 for moving. (There will then exist a potential pointer to the object.)

 Well no, usually you have to pin objects _before_ passing them to
 C-land. Pinning is not a automatic process its done manually (or by the
 compiler).

If one accepts that any heap can have object that is pinned I don't see 
why you need explicit pinning. It just becomes (a complication to) a 
part of normal marking stage, the discovery of which objects are pinned.
In semi-precise setting it IMHO makes perfect sense.

-- 
Dmitry Olshansky

Jan 14 2014

"Tofu Ninja" <emmons0 purdue.edu> writes:

Why are unions with pointers even allowed in the first place? It 
seems like a really really really bad thing to do. Am I missing 
some really important use case?

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 20:08, schrieb Tofu Ninja:
 Why are unions with pointers even allowed in the first place? It seems
 like a really really really bad thing to do. Am I missing some really
 important use case?

Yes, low level programming. D is supposed to be a systems programmin 
glanguage after all. Tagged unions are a often used concept, and 
pointers within these tagged unions are very common. Also its needed for 
compatibility with C.

Jan 14 2014

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 14 January 2014 at 19:09:00 UTC, Tofu Ninja wrote:
 Why are unions with pointers even allowed in the first place? 
 It seems like a really really really bad thing to do. Am I 
 missing some really important use case?

Tagged unions are very useful. For instance, to take a
simplified, but real life example, I have a function to resolve
identifier in a compiler. It can resolve as a type, as an
expression, as a template, as a package or module, or whatnot.
This is a case where you need to return an union of types (or
rely on complete type erasure).

Jan 14 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Monday, 13 January 2014 at 21:08:50 UTC, Dmitry Olshansky 
wrote:
 So what? It can't move it and as such there is nothing special 
 about it, it doesn't _cost_ anything to pin object.

Well, the advantage of compaction is that you (in theory) can 
relocate objects so that you:

1. get temporal and spatial coherency (sitting on the same page, 
to avoid paging)
2. get rid of fragmentation (less paging, requires smaller 
address space)
3. can have a faster and simpler allocator

If you start using the GC heap for "malloc" with pinning, you 
loose a lot of that? That's the cost.

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

14-Jan-2014 01:24, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" пишет:
 On Monday, 13 January 2014 at 21:08:50 UTC, Dmitry Olshansky wrote:
 So what? It can't move it and as such there is nothing special about
 it, it doesn't _cost_ anything to pin object.

 Well, the advantage of compaction is that you (in theory) can relocate
 objects so that you:

 1. get temporal and spatial coherency (sitting on the same page, to
 avoid paging)

Bleh. If anything you may as well lose it - manual memory management is 
the only option to guarantee you something to the extent of having 
related object together in the same page. That is if you can make it so 
by hand.

Segregated heaps also get fine locality for similarly-sized objects.

 2. get rid of fragmentation (less paging, requires smaller address space)
 3. can have a faster and simpler allocator

 If you start using the GC heap for "malloc" with pinning, you loose a
 lot of that? That's the cost.

This is a missed potential not an extra cost.
On the contrary write barriers to track precisely every pointer write do 
*cost* something. This cost is then regained (with interest) by applying 
simpler and efficient algorithms such as semi-space compaction.


-- 
Dmitry Olshansky

Jan 14 2014

"woh" <wojw yahoo.com> writes:

  If Java with all it that time and money can not create a 
performant GC, what hope has D?

  Full program tracing GC is a dead horse, it does not scale, all 
of the better Java GC's require 2x or memory of the working set 
to actually get any speed


  isolated task or allocator based memory that can use GC as a 
last resort is what you need, but D lacks in this regard

Jan 14 2014

Paulo Pinto <pjmlp progtools.org> writes:

Am 14.01.2014 18:53, schrieb woh:
   If Java with all it that time and money can not create a performant
 GC, what hope has D?

   Full program tracing GC is a dead horse, it does not scale, all of the
 better Java GC's require 2x or memory of the working set to actually get
 any speed


   isolated task or allocator based memory that can use GC as a last
 resort is what you need, but D lacks in this regard

Are you aware of Azul's GC or some real time VMs like Aonix PERC?

Jan 14 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Tuesday, 14 January 2014 at 17:38:56 UTC, Dmitry Olshansky 
wrote:
 Bleh. If anything you may as well lose it - manual memory 
 management is the only option to guarantee you something to the 
 extent of having related object together in the same page. That 
 is if you can make it so by hand.

Under what assumption? There is not optimal garbage collector, 
you have to select or adapt one that fits the application.

You can tag objects by a group id or type id and use that during 
compaction.

 write do *cost* something. This cost is then regained (with 
 interest) by applying simpler and efficient algorithms such as 
 semi-space compaction.

I wouldn't call semi-space compaction efficient, it can lead to 
heavy paging.

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 13.01.2014 18:45, schrieb Dmitry Olshansky:
 13-Jan-2014 13:20, Rainer Schuetze пишет:
 Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I
 was thinking about segregating heaps by type (shared/immutable/mutable)
 and moving data between them adds restrictions. I'd like to be proven
 wrong.

 Some thoughts regarding a moving collector:

 Having to explicitely pin every pointer passed to C functions would be
 very expensive.

 How would it be expensive? I don't see a need to do anything to "pin" a
 memory block, at the time of scanning there will be a potential pointer
 to it (in the stack space of C function or registers).

I also don't see how that would be expenive. Pinning is a common pratice 
in other garbage collected languages. But your argument is invalid. As 
the D garbage collector does not have any information about the stack 
frames of the C functions it simply has to ignore those (assuming its a 
truly percise GC).

Jan 13 2014

"Rainer Schuetze" <r.sagitario gmx.de> writes:

On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky
wrote:
 13-Jan-2014 13:20, Rainer Schuetze пишет:
 On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut 
 wrote:
 Having to explicitely pin every pointer passed to C functions 
 would be
 very expensive.

 How would it be expensive? I don't see a need to do anything to 
 "pin" a memory block, at the time of scanning there will be a 
 potential pointer to it (in the stack space of C function or 
 registers).

The stack reference can be moved outside the memory visible to
the GC:

// D
extern(C) void funC(const char* s);
void main()
{
       funC("Hello".dup.ptr);
}

// C
void funC(const char* s)
{
       SomeStruct* struc = malloc(sizeof(SomeStruct));
       struc->ptr = s;
       s = 0;
       struc->doSomething();
       free(struc);
}

The "s = 0;" will remove the last reference to the passed string
visible to the GC. The duped string might get collected during
execution of "doSomething". This is a problem with the current GC
already, though it is pretty unlikely that both this kind of
operation is used in the C function (or similar) and there are no
other references in the D program.

With a moving GC, references in the D code no longer pin the
object, so the problem appears if the C function just works as
above.

Another bad use case in the C function: the passed stack
parameter is used to iterate to the end of an array and then back
later. The pointer to the end might not be within the GC
allocated memory block anymore, but pointing to the next.

Jan 15 2014

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 13 January 2014 at 09:20:16 UTC, Rainer Schuetze wrote:
 Maybe I'm too pessimistic ;-) I guess moving in general could 
 be ok, I was thinking about segregating heaps by type 
 (shared/immutable/mutable) and moving data between them adds 
 restrictions. I'd like to be proven wrong.

I had some similar reflection. I'm not sure moving is worthwhile
for D. Segeregation is.

 To compete with other GCs we'd probably need write barriers to 
 keep track of changed references (for concurrent operation or 
 generations). There just needs to be a way to avoid having to 
 rescan the full heap every time, it does not scale.

Barrier are only necessary for mutable shared reference write.

Jan 13 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/12/2014 2:40 AM, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

 Could you give an example which part of the language would not be doable with a
 moving collector? The only thing that comes to my mind is unions and that
 problem can be solved by allowing the user to specify manual scanning functions
 for structs or classes containing unions.

 Also I don't think that we can create a GC which performs as good as the one of

gc.


A moving GC is already supported by D's semantics.

Unions are dealt with by 'pinning' those objects, i.e. simply don't move them.
I 
know this can work because I implemented a mostly copying generational
collector 
years ago for Java. (After I invented this, someone else came out with a paper 
about a "mostly copying" collector, so I didn't get any credit. Oh well! But
the 
idea is sound and it works in the real world.)

Jan 13 2014

Brad Roberts <braddr puremagic.com> writes:

On 1/13/14 12:44 PM, Walter Bright wrote:
 On 1/12/2014 2:40 AM, Benjamin Thaut wrote:
 Am 12.01.2014 11:27, schrieb Rainer Schuetze:
 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

 Could you give an example which part of the language would not be doable with a
 moving collector? The only thing that comes to my mind is unions and that
 problem can be solved by allowing the user to specify manual scanning functions
 for structs or classes containing unions.

 Also I don't think that we can create a GC which performs as good as the one of

gc.


 A moving GC is already supported by D's semantics.

 Unions are dealt with by 'pinning' those objects, i.e. simply don't move them.
I know this can work
 because I implemented a mostly copying generational collector years ago for
Java. (After I invented
 this, someone else came out with a paper about a "mostly copying" collector,
so I didn't get any
 credit. Oh well! But the idea is sound and it works in the real world.)

The key advantage that moving gives to a GC is to allow it to have a block of
memory that it can do 
allocations out of by simply bumping the pointer.  When that region is full, a
minor collection 
kicks in, moving anything still alive out to a different block of memory, then
reset the region for 
re-use.  Pinned objects == region can't be emptied for reuse.  That leads to
fragmentation and free 
list maintenance and you're right back with a more typical allocator.

Not to say there aren't other ways of doing things, but with random objects
becoming pinnable puts a 
big damper on things unless you can identify all the objects that might be
pinned and isolate them. 
  But I doubt that's really knowable up front.

Jan 13 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/13/2014 12:57 PM, Brad Roberts wrote:
 The key advantage that moving gives to a GC is to allow it to have a block of
 memory that it can do allocations out of by simply bumping the pointer.  When
 that region is full, a minor collection kicks in, moving anything still alive
 out to a different block of memory, then reset the region for re-use.  Pinned
 objects == region can't be emptied for reuse.  That leads to fragmentation and
 free list maintenance and you're right back with a more typical allocator.

 Not to say there aren't other ways of doing things, but with random objects
 becoming pinnable puts a big damper on things unless you can identify all the
 objects that might be pinned and isolate them.  But I doubt that's really
 knowable up front.

The reason pinning doesn't particularly impede this is because pinning is rare.

Jan 13 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/13/2014 10:11 PM, Walter Bright wrote:
 Not to say there aren't other ways of doing things, but with random
 objects
 becoming pinnable puts a big damper on things unless you can identify
 all the
 objects that might be pinned and isolate them.  But I doubt that's really
 knowable up front.

 The reason pinning doesn't particularly impede this is because pinning
 is rare.

In Java or in D? Eg. there are no unions in Java.

Jan 13 2014

Paulo Pinto <pjmlp progtools.org> writes:

Am 13.01.2014 22:31, schrieb Timon Gehr:
 On 01/13/2014 10:11 PM, Walter Bright wrote:
 Not to say there aren't other ways of doing things, but with random
 objects
 becoming pinnable puts a big damper on things unless you can identify
 all the
 objects that might be pinned and isolate them.  But I doubt that's
 really
 knowable up front.

 The reason pinning doesn't particularly impede this is because pinning
 is rare.

 In Java or in D? Eg. there are no unions in Java.



http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspx

Jan 13 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/13/2014 11:05 PM, Paulo Pinto wrote:
 Am 13.01.2014 22:31, schrieb Timon Gehr:
 On 01/13/2014 10:11 PM, Walter Bright wrote:
 Not to say there aren't other ways of doing things, but with random
 objects
 becoming pinnable puts a big damper on things unless you can identify
 all the
 objects that might be pinned and isolate them.  But I doubt that's
 really
 knowable up front.

 The reason pinning doesn't particularly impede this is because pinning
 is rare.

 In Java or in D? Eg. there are no unions in Java.



 http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspx

"The common language runtime controls the physical layout of the data 
fields of a class or structure in managed memory. However, ..."

Jan 13 2014

"Paulo Pinto" <pjmlp progtools.org> writes:

On Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:
 On 01/13/2014 11:05 PM, Paulo Pinto wrote:
 Am 13.01.2014 22:31, schrieb Timon Gehr:
 On 01/13/2014 10:11 PM, Walter Bright wrote:
 Not to say there aren't other ways of doing things, but 
 with random
 objects
 becoming pinnable puts a big damper on things unless you 
 can identify
 all the
 objects that might be pinned and isolate them.  But I doubt 
 that's
 really
 knowable up front.

 The reason pinning doesn't particularly impede this is 
 because pinning
 is rare.

 In Java or in D? Eg. there are no unions in Java.



 http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspx

 "The common language runtime controls the physical layout of 
 the data fields of a class or structure in managed memory. 
 However, ..."


Did you also read the remaining part of the page, or just looked 
for something to paste?

You can control the layout, that is what matters. Not at what 
level you are expressing it.

Ignoring what such runtimes offer, only puts D at disadvantage 
when comparing feature lists, which many in the industry do.

--
Paulo

Jan 14 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/14/2014 10:11 AM, Paulo Pinto wrote:
On Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:
On 01/13/2014 11:05 PM, Paulo Pinto wrote:
Am 13.01.2014 22:31, schrieb Timon Gehr:
On 01/13/2014 10:11 PM, Walter Bright wrote:
Not to say there aren't other ways of doing things, but with random
objects
becoming pinnable puts a big damper on things unless you can identify
all the
objects that might be pinned and isolate them. But I doubt that's
really
knowable up front.

The reason pinning doesn't particularly impede this is because pinning
is rare.

In Java or in D? Eg. there are no unions in Java.

http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspx

"The common language runtime controls the physical layout of the data
fields of a class or structure in managed memory. However, ..."

Did you also read the remaining part of the page,

Obviously.

or just looked for something to paste?
...

Look, you hadn't done anything besides pasting for backing up your point
or making it precise. I'll be proven wrong if you are able to show us

union U{
int x;
Object y;
}

You can control the layout, that is what matters. Not at what level you
are expressing it.
...

Obviously.

Ignoring what such runtimes offer, only puts D at disadvantage when
comparing feature lists, which many in the industry do.
...

functionality because it is detrimental to GC, eg:

http://stackoverflow.com/questions/17771902/struct-memory-hack-to-overlap-object-reference-is-it-possible

Jan 14 2014

"Paulo Pinto" <pjmlp progtools.org> writes:

On Tuesday, 14 January 2014 at 14:22:27 UTC, Timon Gehr wrote:
On 01/14/2014 10:11 AM, Paulo Pinto wrote:
On Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:
On 01/13/2014 11:05 PM, Paulo Pinto wrote:
Am 13.01.2014 22:31, schrieb Timon Gehr:
On 01/13/2014 10:11 PM, Walter Bright wrote:
Not to say there aren't other ways of doing things, but
with random
objects
becoming pinnable puts a big damper on things unless you
can identify
all the
objects that might be pinned and isolate them. But I
doubt that's
really
knowable up front.

The reason pinning doesn't particularly impede this is
because pinning
is rare.

In Java or in D? Eg. there are no unions in Java.

http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspx

"The common language runtime controls the physical layout of
the data
fields of a class or structure in managed memory. However,
..."

Did you also read the remaining part of the page,

Obviously.

or just looked for something to paste?
...

Look, you hadn't done anything besides pasting for backing up
your point or making it precise. I'll be proven wrong if you

union U{
int x;
Object y;
}

You can control the layout, that is what matters. Not at what
level you
are expressing it.
...

Obviously.

Ignoring what such runtimes offer, only puts D at disadvantage
when
comparing feature lists, which many in the industry do.
...

offer this functionality because it is detrimental to GC, eg:

http://stackoverflow.com/questions/17771902/struct-memory-hack-to-overlap-object-reference-is-it-possible

Excuse me for being stupid.

I just read MSDN without trying it out, so I wasn't aware that
object references cannot be made to overllap with other data,
even in unsafe structs.

Me and my big mouth. Better test it properly next time.

--
Paulo

Jan 15 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 13.01.2014 21:44, schrieb Walter Bright:
 A moving GC is already supported by D's semantics.

 Unions are dealt with by 'pinning' those objects, i.e. simply don't move
 them. I know this can work because I implemented a mostly copying
 generational collector years ago for Java. (After I invented this,
 someone else came out with a paper about a "mostly copying" collector,
 so I didn't get any credit. Oh well! But the idea is sound and it works
 in the real world.)

I would prefer a option where the user can specifiy a function to scan 
classes / structs that contain unions percisely instead of pinning it.
In generall I would want a option to specify a manual scanning function 
for any block of GC managed memory, D is a systems programming language 
after all and specifing a custom scanning functionn can be a lot more 
effective then assuming the worst possible case for that memory block.

Jan 13 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/13/2014 1:28 PM, Benjamin Thaut wrote:
 Am 13.01.2014 21:44, schrieb Walter Bright:
 A moving GC is already supported by D's semantics.

 Unions are dealt with by 'pinning' those objects, i.e. simply don't move
 them. I know this can work because I implemented a mostly copying
 generational collector years ago for Java. (After I invented this,
 someone else came out with a paper about a "mostly copying" collector,
 so I didn't get any credit. Oh well! But the idea is sound and it works
 in the real world.)

 I would prefer a option where the user can specifiy a function to scan classes
/
 structs that contain unions percisely instead of pinning it.
 In generall I would want a option to specify a manual scanning function for any
 block of GC managed memory, D is a systems programming language after all and
 specifing a custom scanning functionn can be a lot more effective then assuming
 the worst possible case for that memory block.

I agree, but I was trying to correct the misperception that current D does not 
allow a moving collector.

Jan 13 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 13.01.2014 23:55, schrieb Walter Bright:
 I agree, but I was trying to correct the misperception that current D
 does not allow a moving collector.

Current D does not allow a moving collector because of the lack of 
compiler support. It is still not possible to identify all pointers 
percicesly, especially those on the stack. Also when you want to 
implement a semi-space GC everything _must_ be moveable. Pinning is not 
an option for a semi-space GC. There for current D does not allow the 
implementation of semi-space GC without some changes to the spec. (E.g. 
structs / classes containing unions _must_ provide a custom scanning 
function).

Jan 14 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/14/2014 1:18 AM, Benjamin Thaut wrote:
 Current D does not allow a moving collector because of the lack of compiler
 support. It is still not possible to identify all pointers percicesly,
 especially those on the stack. Also when you want to implement a semi-space GC
 everything _must_ be moveable. Pinning is not an option for a semi-space GC.
 There for current D does not allow the implementation of semi-space GC without
 some changes to the spec. (E.g. structs / classes containing unions _must_
 provide a custom scanning function).


This is not true, I've implemented one for Java.

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 11:05, schrieb Walter Bright:
 On 1/14/2014 1:18 AM, Benjamin Thaut wrote:
 Current D does not allow a moving collector because of the lack of
 compiler
 support. It is still not possible to identify all pointers percicesly,
 especially those on the stack. Also when you want to implement a
 semi-space GC
 everything _must_ be moveable. Pinning is not an option for a
 semi-space GC.
 There for current D does not allow the implementation of semi-space GC
 without
 some changes to the spec. (E.g. structs / classes containing unions
 _must_
 provide a custom scanning function).


 This is not true, I've implemented one for Java.

So how do you implement a semi-space GC with pinned objects?

Jan 14 2014

"renoX" <renozyx gmail.com> writes:

On Tuesday, 14 January 2014 at 11:02:38 UTC, Benjamin Thaut wrote:
 Am 14.01.2014 11:05, schrieb Walter Bright:
 On 1/14/2014 1:18 AM, Benjamin Thaut wrote:
 Current D does not allow a moving collector because of the 
 lack of
 compiler
 support. It is still not possible to identify all pointers 
 percicesly,
 especially those on the stack. Also when you want to 
 implement a
 semi-space GC
 everything _must_ be moveable. Pinning is not an option for a
 semi-space GC.
 There for current D does not allow the implementation of 
 semi-space GC
 without
 some changes to the spec. (E.g. structs / classes containing 
 unions
 _must_
 provide a custom scanning function).


 This is not true, I've implemented one for Java.

 So how do you implement a semi-space GC with pinned objects?

You manage the pinned object in a different list that the 
semi-space list?

It's quite often that GCs maintain object inventory with several 
methods, for example using semi-space for young object but not 
for old objects..

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 13:53, schrieb renoX:
 On Tuesday, 14 January 2014 at 11:02:38 UTC, Benjamin Thaut wrote:
 Am 14.01.2014 11:05, schrieb Walter Bright:
 On 1/14/2014 1:18 AM, Benjamin Thaut wrote:
 Current D does not allow a moving collector because of the lack of
 compiler
 support. It is still not possible to identify all pointers percicesly,
 especially those on the stack. Also when you want to implement a
 semi-space GC
 everything _must_ be moveable. Pinning is not an option for a
 semi-space GC.
 There for current D does not allow the implementation of semi-space GC
 without
 some changes to the spec. (E.g. structs / classes containing unions
 _must_
 provide a custom scanning function).


 This is not true, I've implemented one for Java.

 So how do you implement a semi-space GC with pinned objects?

 You manage the pinned object in a different list that the semi-space list?

 It's quite often that GCs maintain object inventory with several
 methods, for example using semi-space for young object but not for old
 objects..

I meant a pure semi-space collector. A pure semi-space collector can not 
pin objects. And I didn't make that up, it even says so in "The Garbage 
Collection Handbook".

Pinning Unions is a bad approach anyway in my eyes. Because you will 
have to allocate them in seperate space (which supports pinning) you 
loose the advantge of fast collections for short living objects which a 
semi-space collector provides.

Jan 14 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/14/2014 3:02 AM, Benjamin Thaut wrote:
 Am 14.01.2014 11:05, schrieb Walter Bright:
 This is not true, I've implemented one for Java.

 So how do you implement a semi-space GC with pinned objects?

Be more flexible in what the semi-spaces are.

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 21:28, schrieb Walter Bright:
 On 1/14/2014 3:02 AM, Benjamin Thaut wrote:
 Am 14.01.2014 11:05, schrieb Walter Bright:
 This is not true, I've implemented one for Java.

 So how do you implement a semi-space GC with pinned objects?

 Be more flexible in what the semi-spaces are.

So you just don't. Effectivley you work around the problem. I would 
still prefer that specifing a scanning function for unions becomes 
mandatory.

Jan 14 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/14/2014 1:10 PM, Benjamin Thaut wrote:
 Am 14.01.2014 21:28, schrieb Walter Bright:
 On 1/14/2014 3:02 AM, Benjamin Thaut wrote:
 Am 14.01.2014 11:05, schrieb Walter Bright:
 This is not true, I've implemented one for Java.

 So how do you implement a semi-space GC with pinned objects?

 Be more flexible in what the semi-spaces are.

 So you just don't. Effectivley you work around the problem.

My experience with book algorithms is they usually require some adjustments
when 
working with real problems in the real world. I was well aware of how moving
GCs 
worked when I set the semantics of D objects, since I'd implemented one, and
was 
careful to make them possible with D.


 I would still prefer that specifing a scanning function for unions becomes
mandatory.

I already agreed that there are better ways.

Jan 14 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 14.01.2014 23:10, schrieb Walter Bright:
 I already agreed that there are better ways.

Then this was a big misunderstanding.

Jan 14 2014

"Frustrated" <c1514843 drdrb.com> writes:

On Tuesday, 14 January 2014 at 22:36:12 UTC, Benjamin Thaut wrote:
 Am 14.01.2014 23:10, schrieb Walter Bright:
 I already agreed that there are better ways.

 Then this was a big misunderstanding.

So, the issue seems to be that everyone is treating the GC as the 
Thors hammer and seeing everything as a nails?

The GC does and can not solve all problems. Why force people to 
use it when it doesn't work well?

I would think having a multi-layered approach would be best. 
Allow the GC to do everything it can but also allow it to be 
configured from 0 to 60. For noobs, non-realtime apps, etc the GC 
can be set to do it all. One can trade speed for memory(less 
memory, more scanning etc...). One can disable to in specific 
cases(such as on unions), have separate heaps for GC memory and 
non-GC allocated memory, etc.

It seems that a more robust GC with many features(with the 
feature of not using it and either manually allocating or using 
ARC).

The the only real issue becomes how to keep the GC in sync with 
the non-GC allocated memory... and this only happens if one 
chooses to use non-GC features(which must be intentional so 
hopefully the programming knows what he's doing if he got that 
far).

I doubt this will ever be possible to solve efficiently so why 
bother? Leave it up to the programmer to deal with it. Make some 
resources available to solve these problems but don't try to 
dictate what must be done. If the programmer is going to bypass 
the GC he already must have good reasons so why put up 
roadblocks? Cause *you* think you know better than him what he 
needs to do?


For example, suppose we have a non-GC based pointer. Allow the 
programmer to mark it as a pointer with certain contextual 
parameters so that either the GC ca n "deal with it" better or at 
least for debugging purposes help find memory leaks.

I'm way more interested in having a GC that doesn't stop the 
world and a compiler that doesn't require the GC for simple stuff 
as slices(could it not use a buffer and/or ARC 
instead/alternatively?).

Jan 14 2014

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 15 January 2014 at 00:24:48 UTC, Frustrated wrote:
 On Tuesday, 14 January 2014 at 22:36:12 UTC, Benjamin Thaut 
 wrote:
 Am 14.01.2014 23:10, schrieb Walter Bright:
 I already agreed that there are better ways.

 Then this was a big misunderstanding.

 So, the issue seems to be that everyone is treating the GC as 
 the Thors hammer and seeing everything as a nails?

 The GC does and can not solve all problems. Why force people to 
 use it when it doesn't work well?

 I would think having a multi-layered approach would be best. 
 Allow the GC to do everything it can but also allow it to be 
 configured from 0 to 60. For noobs, non-realtime apps, etc the 
 GC can be set to do it all. One can trade speed for memory(less 
 memory, more scanning etc...). One can disable to in specific 
 cases(such as on unions), have separate heaps for GC memory and 
 non-GC allocated memory, etc.

 It seems that a more robust GC with many features(with the 
 feature of not using it and either manually allocating or using 
 ARC).

 The the only real issue becomes how to keep the GC in sync with 
 the non-GC allocated memory... and this only happens if one 
 chooses to use non-GC features(which must be intentional so 
 hopefully the programming knows what he's doing if he got that 
 far).

 I doubt this will ever be possible to solve efficiently so why 
 bother? Leave it up to the programmer to deal with it. Make 
 some resources available to solve these problems but don't try 
 to dictate what must be done. If the programmer is going to 
 bypass the GC he already must have good reasons so why put up 
 roadblocks? Cause *you* think you know better than him what he 
 needs to do?


 For example, suppose we have a non-GC based pointer. Allow the 
 programmer to mark it as a pointer with certain contextual 
 parameters so that either the GC ca n "deal with it" better or 
 at least for debugging purposes help find memory leaks.

 I'm way more interested in having a GC that doesn't stop the 
 world and a compiler that doesn't require the GC for simple 
 stuff as slices(could it not use a buffer and/or ARC 
 instead/alternatively?).

Try to do the pull request that goes with that, and we will be
able to discuss the real problems.

Jan 14 2014

"bearophile" <bearophileHUGS lycos.com> writes:

Benjamin Thaut:

 I would prefer a option where the user can specifiy a function 
 to scan classes / structs that contain unions percisely instead 
 of pinning it.
 In generall I would want a option to specify a manual scanning 
 function for any block of GC managed memory, D is a systems 
 programming language after all and specifing a custom scanning 
 functionn can be a lot more effective then assuming the worst 
 possible case for that memory block.

Time ago I suggested an optional standard method for unions of 
structs named as "onGC", that if present is used by the GC at 
run-time to know what to do with the contents of the unions.

Bye,
bearophile

Jan 13 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

12-Jan-2014 14:27, Rainer Schuetze пишет:
 On 11.01.2014 22:20, Benjamin Thaut wrote:
 Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:

 Well as far as my understanding of GCs goes, you have two options:

 1) Impercise pointer discovery => limiting yourself to a mark & sweep
 2) Percise pointer disccovery => option to use a semi space GC in
 combination with a mark & sweep and generations, which allows for
 superior handling of short lived allocations (which is the biggest
 problem of D's current GC).

 Also without percise pointer discovery you will never be able to move
 objects from one heap into another. This would be especially a problem
 for the immutable heap, because you might want to allocate all strings
 on the thread local heap until you discover that you actually need them
 to be shared between threads. You might also need to move objects into
 antoher heap whenever a casts happens or global variable is assigned.

 Kind Regards
 Benjamin Thaut

 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

I might be ignorant but why can't we make "mostly-moving" collector?

For that we discern precise pointers (with typeinfo) and conservative 
(such as potential pointers in registers/stack). Then any block pointed 
to by a least one conservative pointer is pinned, others though can be 
moved freely since all of the pointers are known.


-- 
Dmitry Olshansky

Jan 12 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 12.01.2014 12:03, schrieb Dmitry Olshansky:
 12-Jan-2014 14:27, Rainer Schuetze пишет:
 On 11.01.2014 22:20, Benjamin Thaut wrote:
 Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:

 Well as far as my understanding of GCs goes, you have two options:

 1) Impercise pointer discovery => limiting yourself to a mark & sweep
 2) Percise pointer disccovery => option to use a semi space GC in
 combination with a mark & sweep and generations, which allows for
 superior handling of short lived allocations (which is the biggest
 problem of D's current GC).

 Also without percise pointer discovery you will never be able to move
 objects from one heap into another. This would be especially a problem
 for the immutable heap, because you might want to allocate all strings
 on the thread local heap until you discover that you actually need them
 to be shared between threads. You might also need to move objects into
 antoher heap whenever a casts happens or global variable is assigned.

 Kind Regards
 Benjamin Thaut

 I think a moving collector is currently not feasible without restricting
 the language a lot, probably similar to safe D and more. I'm not sure we
 want that in general.

 I might be ignorant but why can't we make "mostly-moving" collector?

 For that we discern precise pointers (with typeinfo) and conservative
 (such as potential pointers in registers/stack). Then any block pointed
 to by a least one conservative pointer is pinned, others though can be
 moved freely since all of the pointers are known.

A semi-space garbage collector is best fitted for small short lived 
allocations. Which are mostly those that are referenced by the stack 
because they happend as functions are called. And for a semi-space 
garbage collector there is not mostly moving, it has to copy _all_ 
objects from one heap onto another so that those left on the original 
heap are known to be all garbage, can be destroyed and then the heap can 
be reused for the next collection. Mostly-moving doens't work here. 
Either you know all pointers percisely, or you don't do it. What you 
mean is pinning, you can pin objects for which you know they might be 
referenced by a impercise pointer and thus prevent them from beeing 
moved around. But this type of moving only works with mark & sweep 
compacting collectors and thats not really better than what we already have.

If you are really interrested in the topic I higly recommend reading 
"The garbage collector handbook".

Kind Regards
Benjamin Thaut

Jan 12 2014

"sunspyre" <sunspyre gmail.com> writes:

On Sunday, 12 January 2014 at 10:27:42 UTC, Rainer Schuetze wrote:
 [...] Adding concurrency would also be nice...

I just want to point out that from an outer perspective, this is 
a really really *really* big deal.

By far the most common arguments I've heard against D are:

1. library availability (derelict, deimos now)
2. community size and impetus (getting there!)
3. shared druntime/phobos so Hello World isn't 800kb (getting 
there?)
4. garbage collector which is only possible to opt out of by 
writing C

The very reliance of the garbage collector, regardless of how far 
between the stop-the-world sweeps are, is a showstopper for many 
people. They hear GC and think Java pauses. Being able to 
honestly claim "well it runs concurrently in a separate thread 
and doesn't[*] incur any performance penalty" would be the single 
biggest leap to greater adoption D could take at this point. 
Maybe barring "it prints money", but only maybe.

It may be less of a *technical* problem than, say, this or that 
bug in the type system, or the identity crisis of shared. But 
fixing those would not make for a *twentieth* of the marketing 
that a concurrent GC would. Fixing those would make people stay 
-- introducing that GC would make people join.

Not saying that such bugs shouldn't get attention, I'm just 
saying that Bob would scroll past that link on /r/programming. In 
comparison, Lucarella's dconf slides were... *compelling*.



(P.S. many awkwardly long hugs to those culling allocations from 
druntime/phobos, you rock)

Jan 12 2014

"qznc" <qznc web.de> writes:

On Sunday, 12 January 2014 at 19:29:08 UTC, sunspyre wrote:
 The very reliance of the garbage collector, regardless of how 
 far between the stop-the-world sweeps are, is a showstopper for 
 many people. They hear GC and think Java pauses. Being able to 
 honestly claim "well it runs concurrently in a separate thread 
 and doesn't[*] incur any performance penalty" would be the 
 single biggest leap to greater adoption D could take at this 
 point.

That is not technically possible. A truly concurrent GC has heavy 
penalty. People probably think of "Concurrent GC" has Microsoft 
calls one of the .NET GCs, which is "mostly concurrent".

The first step is get a precise GC. That should give a 
significant performance boost already. Everything else should 
probably build on this.

http://stackoverflow.com/questions/2583644/difference-between-background-and-concurrent-garbage-collection

Jan 13 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 00:03:03 -0800
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 The GC comes up repeatedly in discussions around here. I've been 
 thinking for a while about breaking it down into components, and now it 
 seems the time is ripe.
 
 The design at http://goo.gl/ZVCJeB seems to be a win. It works well, 
 comprehends all major allocation tropes, someone implemented a subset of 
 it in C++ and measured good results, and a coworker is considering 
 adopting the design for a C++ project as well.
 
 I've started with the next logical step - higher-level allocation that 
 is aware of the type of the object being allocated, and realized that 
 integrating a notion of tracing is appropriate at that level, and 
 actually quite easy. So I'm thinking of just doing it.
 
 A few highlights:
 
 * The design will foster the small, composable components with 
 required/optional primitives that worked for std.allocator.
 
 * I plan to review and probably discard all of the pointers-to-functions 
 nonsense in the current GC.
 
 * At this point I won't worry about discovering roots; I assume druntime 
 has the appropriate mechanisms in place. Instead I'll focus on 
 primitives such as "given this root, mark all that transitively follows 
 from it" (conservatively or not).
 
 * I plan to rely on static introspection for the mark function, i.e:
 
 void mark(T)(ref T obj);
 
 would mark obj and everything transitively referred to by it. The 
 function would probably take a second parameter that's the heap that obj 
 is sitting in.
 
 * I plan to segregate all objects that don't include references and 
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely 
 different heap than the "interesting" objects. For the simpler objects 
 there's no need to save detailed type information so they can be stored 
 in a more compact, efficient manner.
 
 * There will be no possibility to change the type of certain objects 
 once allocated. An allocation for an immutable object cannot e.g. make 
 it later a mutable one. (This is an essential issue with the current 
 allocator, I think.)
 
 * At this point I'm unclear on how generations can be componentized, but 
 am cautiously optimistic. Will see once I get to it.
 
 One thing that would be great now would be to make an effort to review 
 and merge the current precise GC work. I'm sure it will be of great help 
 with breaking into components.
 
 
 Destroy.
 
 Andrei

Nothing to destroy yet. But it does feel good that someone is
putting the pieces together to come up with the second
generation of D garbage collection. With pieces I mean the
discussions about making it optional or allowing user defined
allocators, how immutable can be used to an advantage, precise
scanning and generational GC. Basically there are enough use
cases now that one can shape a new GC model around it.

-- 
Marco

Jan 10 2014

Rainer Schuetze <r.sagitario gmx.de> writes:

On 10.01.2014 09:03, Andrei Alexandrescu wrote:
 The GC comes up repeatedly in discussions around here. I've been
 thinking for a while about breaking it down into components, and now it
 seems the time is ripe.

 The design at http://goo.gl/ZVCJeB seems to be a win. It works well,
 comprehends all major allocation tropes, someone implemented a subset of
 it in C++ and measured good results, and a coworker is considering
 adopting the design for a C++ project as well.

 I've started with the next logical step - higher-level allocation that
 is aware of the type of the object being allocated, and realized that
 integrating a notion of tracing is appropriate at that level, and
 actually quite easy. So I'm thinking of just doing it.

 A few highlights:

 * The design will foster the small, composable components with
 required/optional primitives that worked for std.allocator.

 * I plan to review and probably discard all of the pointers-to-functions
 nonsense in the current GC.

You mean the proxy functions? Yes, please axe 'em, that only works in 
very simple single-threaded programs.

 * At this point I won't worry about discovering roots; I assume druntime
 has the appropriate mechanisms in place.

It currently does not have precise information, but it is dearly needed, 
too.

 Instead I'll focus on
 primitives such as "given this root, mark all that transitively follows
 from it" (conservatively or not).

 * I plan to rely on static introspection for the mark function, i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. The
 function would probably take a second parameter that's the heap that obj
 is sitting in.

I guess the mark function will be stored somewhere in the TypeInfo. How 
is this going to work with dynamic and associative array operations if 
these are not also templated?

 * I plan to segregate all objects that don't include references and
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely
 different heap than the "interesting" objects. For the simpler objects
 there's no need to save detailed type information so they can be stored
 in a more compact, efficient manner.

So no more std.emplace?

 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

I think this might help in having different heaps (especially thread 
local heaps if you include "shared"), but I'm not sure this works in 
unsafe D where casting is allowed.

 * At this point I'm unclear on how generations can be componentized, but
 am cautiously optimistic. Will see once I get to it.

 One thing that would be great now would be to make an effort to review
 and merge the current precise GC work. I'm sure it will be of great help
 with breaking into components.

As written in the other thread ("how to contribute to GC"), I have just 
made an attempt to make it more reviewable: 
https://github.com/rainers/druntime/commits/gcx_precise2

The necessary compiler fixes are here: 
https://github.com/D-Programming-Language/dmd/pull/2480

 Destroy.

 Andrei

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 12:37 PM, Rainer Schuetze wrote:
 * At this point I won't worry about discovering roots; I assume druntime
 has the appropriate mechanisms in place.

 It currently does not have precise information, but it is dearly needed,
 too.

Yah, that's where I'm counting on your work :o).

 * I plan to rely on static introspection for the mark function, i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. The
 function would probably take a second parameter that's the heap that obj
 is sitting in.

 I guess the mark function will be stored somewhere in the TypeInfo. How
 is this going to work with dynamic and associative array operations if
 these are not also templated?

I need to write some code to explain it all. An figure it all :o).

 * I plan to segregate all objects that don't include references and
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely
 different heap than the "interesting" objects. For the simpler objects
 there's no need to save detailed type information so they can be stored
 in a more compact, efficient manner.

 So no more std.emplace?

std.emplace will continue to work as a way to build an object at a 
specified address. I suspect that allocating and manipulating objects on 
the GC heap in particular may have certain restrictions. One possibility 
to avoid such restrictions is to have a function typify(T)(void* p) 
which ascribes type T to heap location p.

 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

 I think this might help in having different heaps (especially thread
 local heaps if you include "shared"), but I'm not sure this works in
 unsafe D where casting is allowed.

Type-forging casts have always been somewhat implementation-defined. The 
way I see it they'll continue to only work for people who really know 
what they're doing. I'm not too worried about that.

 * At this point I'm unclear on how generations can be componentized, but
 am cautiously optimistic. Will see once I get to it.

 One thing that would be great now would be to make an effort to review
 and merge the current precise GC work. I'm sure it will be of great help
 with breaking into components.

 As written in the other thread ("how to contribute to GC"), I have just
 made an attempt to make it more reviewable:
 https://github.com/rainers/druntime/commits/gcx_precise2

 The necessary compiler fixes are here:
 https://github.com/D-Programming-Language/dmd/pull/2480

Yah, time for you to get some destruction first :o).


Andrei

Jan 10 2014

Rainer Schuetze <r.sagitario gmx.de> writes:

On 10.01.2014 22:42, Andrei Alexandrescu wrote:
 On 1/10/14 12:37 PM, Rainer Schuetze wrote:
 * At this point I won't worry about discovering roots; I assume druntime
 has the appropriate mechanisms in place.

 It currently does not have precise information, but it is dearly needed,
 too.

 Yah, that's where I'm counting on your work :o).

I was thinking about using the proposed module info extension to 
generate data similar to RTInfo by interpreting global/tls data of a 
module as a struct with type info. I don't know if this is feasable.

 * I plan to rely on static introspection for the mark function, i.e:

 void mark(T)(ref T obj);

 would mark obj and everything transitively referred to by it. The
 function would probably take a second parameter that's the heap that obj
 is sitting in.

 I guess the mark function will be stored somewhere in the TypeInfo. How
 is this going to work with dynamic and associative array operations if
 these are not also templated?

 I need to write some code to explain it all. An figure it all :o).

Waiting eagerly for the code :-)

 * I plan to segregate all objects that don't include references and
 pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely
 different heap than the "interesting" objects. For the simpler objects
 there's no need to save detailed type information so they can be stored
 in a more compact, efficient manner.

 So no more std.emplace?

 std.emplace will continue to work as a way to build an object at a
 specified address. I suspect that allocating and manipulating objects on
 the GC heap in particular may have certain restrictions. One possibility
 to avoid such restrictions is to have a function typify(T)(void* p)
 which ascribes type T to heap location p.

That sounds similar to my gc_emplace function. The problematic part is 
how to save that information in the GC.

 * At this point I'm unclear on how generations can be componentized, but
 am cautiously optimistic. Will see once I get to it.

 One thing that would be great now would be to make an effort to review
 and merge the current precise GC work. I'm sure it will be of great help
 with breaking into components.

 As written in the other thread ("how to contribute to GC"), I have just
 made an attempt to make it more reviewable:
 https://github.com/rainers/druntime/commits/gcx_precise2

 The necessary compiler fixes are here:
 https://github.com/D-Programming-Language/dmd/pull/2480

 Yah, time for you to get some destruction first :o).

Ready to get some blows...

Jan 11 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-11 10:30, Rainer Schuetze wrote:

 I was thinking about using the proposed module info extension to
 generate data similar to RTInfo by interpreting global/tls data of a
 module as a struct with type info. I don't know if this is feasable.

You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ? 
I really hope this can be merged sometime.

-- 
/Jacob Carlborg

Jan 11 2014

Rainer Schuetze <r.sagitario gmx.de> writes:

On 11.01.2014 21:30, Jacob Carlborg wrote:
 On 2014-01-11 10:30, Rainer Schuetze wrote:

 I was thinking about using the proposed module info extension to
 generate data similar to RTInfo by interpreting global/tls data of a
 module as a struct with type info. I don't know if this is feasable.

 You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ?
 I really hope this can be merged sometime.

Yes. I'm not sure if there is enough compiler support yet to figure out 
the global and TLS data layout of a module, especially for a static 
library where the module is split into a lot of object files.

Jan 12 2014

evansl <cppljevans suddenlink.net> writes:

On 01/11/14 03:30, Rainer Schuetze wrote:
 On 10.01.2014 22:42, Andrei Alexandrescu wrote:

[snip]
 std.emplace will continue to work as a way to build an object at a
 specified address. I suspect that allocating and manipulating objects on
 the GC heap in particular may have certain restrictions. One possibility
 to avoid such restrictions is to have a function typify(T)(void* p)
 which ascribes type T to heap location p.

 That sounds similar to my gc_emplace function. The problematic part is
 how to save that information in the GC.

[snip]
Couldn't you store a pointer to the typeinfo within the
allocated memory? IOW, the first part of the allocated memory
would be the typeinfo* followed by the actual memory used to store the
value of the allocated T object.

-regards,
Larry

Jan 13 2014

"Rainer Schuetze" <r.sagitario gmx.de> writes:

On Monday, 13 January 2014 at 22:48:37 UTC, evansl wrote:
 On 01/11/14 03:30, Rainer Schuetze wrote:
 On 10.01.2014 22:42, Andrei Alexandrescu wrote:

 [snip]
 std.emplace will continue to work as a way to build an object 
 at a
 specified address. I suspect that allocating and manipulating 
 objects on
 the GC heap in particular may have certain restrictions. One 
 possibility
 to avoid such restrictions is to have a function 
 typify(T)(void* p)
 which ascribes type T to heap location p.

 That sounds similar to my gc_emplace function. The problematic 
 part is
 how to save that information in the GC.

 [snip]
 Couldn't you store a pointer to the typeinfo within the
 allocated memory? IOW, the first part of the allocated memory
 would be the typeinfo* followed by the actual memory used to 
 store the
 value of the allocated T object.

 -regards,
 Larry

std.emplace can be used on a partial memory block (e.g. as part
of a struct), so you will have to add the emplaced type info in
addition to the outer struct's type info. There can be multiple
areas with emplaced dta within the same memory allocation, too.
So you'll need to store a list of type infos paired with the
offsets within the meory block. How do you do this efficiently
without extra cost for the usual scanning?

Jan 15 2014

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Wednesday, 15 January 2014 at 08:54:51 UTC, Rainer Schuetze 
wrote:
 std.emplace can be used on a partial memory block (e.g. as part
 of a struct), so you will have to add the emplaced type info in
 addition to the outer struct's type info. There can be multiple
 areas with emplaced dta within the same memory allocation, too.

This is a good point. It is a mistake to mix manual memory layout 
and byte-level control  with an obligatory GC regime. Those 
optimizations should either be done automatically by a high level 
optimizer, triggered by compiler hits, or be non-GC. Basically, 
one has to decide whether to focus on high level or low level for 
heap allocations.

D is by design a high-level effort with low-level access bolted 
on everywhere. That undermines the premises for efficient 
high-level implementation. I think you either have to restrict 
the low level to the non-GC-heap-allocations and inner loops at 
the leaves of the call-tree, or start with a low level language 
design and very carefully add optional high level constructs. 
High level as the default with low level control interspersed 
everywhere makes high level optimization intractable.

Jan 15 2014

evansl <cppljevans suddenlink.net> writes:

On 01/15/14 02:54, Rainer Schuetze wrote:
 On Monday, 13 January 2014 at 22:48:37 UTC, evansl wrote:
 On 01/11/14 03:30, Rainer Schuetze wrote:
 On 10.01.2014 22:42, Andrei Alexandrescu wrote:

 [snip]
 std.emplace will continue to work as a way to build an object at a
 specified address. I suspect that allocating and manipulating
 objects on
 the GC heap in particular may have certain restrictions. One
 possibility
 to avoid such restrictions is to have a function typify(T)(void* p)
 which ascribes type T to heap location p.

 That sounds similar to my gc_emplace function. The problematic part is
 how to save that information in the GC.

 [snip]
 Couldn't you store a pointer to the typeinfo within the
 allocated memory? IOW, the first part of the allocated memory
 would be the typeinfo* followed by the actual memory used to store the
 value of the allocated T object.

 -regards,
 Larry

 std.emplace can be used on a partial memory block (e.g. as part
 of a struct), so you will have to add the emplaced type info in
 addition to the outer struct's type info. There can be multiple
 areas with emplaced dta within the same memory allocation, too.
 So you'll need to store a list of type infos paired with the
 offsets within the meory block. How do you do this efficiently
 without extra cost for the usual scanning?

I'm afraid I'm not familiar with much of what you're talking about.
However, in another thread, containing this post:

http://forum.dlang.org/post/ldhquc$11ec$1 digitalmars.com

Benjamin mentioned RTInfo*.  Which made me think that instead of
pointer to the typeinfo, what about a RTInfo pointer which, according
to comments at:

https://github.com/D-Programming-Language/druntime/blob/e47a00bff935c3f079bb567a6ec97663ba384487/src/object_.d#L1101

is the data for precise GC.  Surely if RTInfo contains the data for
precise GC, then that would work.  However, I'm guessing that RTInfo*
is only available for class types.  OTOH, couldn't the compiler figure
out the similar information for struct types?  Then, if a class type
were being scanned, use the existing RTInfo* in the object supertype.
OTOH, if a struct type type is being scanned, then use the compiler
generated RTInfo for the struct type, a pointer to which could be
placed in the heap memory just before the memory for the struct.  I
guess the scanner would then have to be able to figure out if a void*
was pointing to a struct or a class and behave accordingly.

Of course I'm probably missing something, coming from a c++
background; hence, I'd appreciate someone pointing out what I'm
missing.

-regards,
Larry

Feb 28 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/10/2014 12:37 PM, Rainer Schuetze wrote:
 As written in the other thread ("how to contribute to GC"), I have just made an
 attempt to make it more reviewable:
 https://github.com/rainers/druntime/commits/gcx_precise2

 The necessary compiler fixes are here:
 https://github.com/D-Programming-Language/dmd/pull/2480

Rainer, I want to apologize to you for not getting your efforts the attention 
they deserve. I hope we can make progress on this shortly.

Jan 10 2014

"deadalnix" <deadalnix gmail.com> writes:

On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:
 I think this might help in having different heaps (especially 
 thread local heaps if you include "shared"), but I'm not sure 
 this works in unsafe D where casting is allowed.

One of the key to a fast GC is segregating the heap in smaller
parts. This is why generational collectors are so popular in java
for instance.

Segregating the heap imply so complications. For instance, it is
necessary to track reference from one heap to another, typically
from young to older objects.

In D, the type system provide a natural segregation. It would be
a great missed opportunity to not take advantage of it. This
segregation is really nice as pointers go from one to another in
a single direction, avoiding the need to track references
changes, and all element of the heap part share some common
characteristics. These characteristics allow for specialized GC
algorithms (for instance, a GC can run on the immutable heap
without the program even knowing about it).

Casting is by essence bypassing the type system. In this case,
you must be help responsible for what happen. The language cannot
provide guarantees.

Jan 10 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 11:58:21PM +0000, deadalnix wrote:
 On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:
I think this might help in having different heaps (especially
thread local heaps if you include "shared"), but I'm not sure this
works in unsafe D where casting is allowed.

 
 One of the key to a fast GC is segregating the heap in smaller
 parts. This is why generational collectors are so popular in java
 for instance.
 
 Segregating the heap imply so complications. For instance, it is
 necessary to track reference from one heap to another, typically
 from young to older objects.

Which unfortunately is not an option in D if write barriers are
required.


 In D, the type system provide a natural segregation. It would be
 a great missed opportunity to not take advantage of it. This
 segregation is really nice as pointers go from one to another in
 a single direction, avoiding the need to track references
 changes, and all element of the heap part share some common
 characteristics. These characteristics allow for specialized GC
 algorithms (for instance, a GC can run on the immutable heap
 without the program even knowing about it).

Yeah, the transitivity of immutable really makes this a big opportunity.
You know immutable can never point to mutable, and also that immutable
never ever changes, so that makes for a perfect partitioning of GC
memory. You never have to scan the immutable heap when collecting the
mutable heap, and there's never a problem with mutating references in
the immutable heap. So you don't need write barriers, and probably(?)
don't need locks when collecting the immutable heap, too.

Const is an interesting grey area here, though. It's also transitive,
and doesn't allow mutation through it, but something else may mutate the
data via another reference. So I'm not sure how exactly const will come
into play here, though it does seem like another promising area for GC
optimization.


 Casting is by essence bypassing the type system. In this case, you
 must be help responsible for what happen. The language cannot provide
 guarantees.

Agreed.


T

-- 
Don't drink and derive. Alcohol and algebra don't mix.

Jan 10 2014

Rainer Schuetze <r.sagitario gmx.de> writes:

On 11.01.2014 00:58, deadalnix wrote:
 On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:
 I think this might help in having different heaps (especially thread
 local heaps if you include "shared"), but I'm not sure this works in
 unsafe D where casting is allowed.

 One of the key to a fast GC is segregating the heap in smaller
 parts. This is why generational collectors are so popular in java
 for instance.

 Segregating the heap imply so complications. For instance, it is
 necessary to track reference from one heap to another, typically
 from young to older objects.

 In D, the type system provide a natural segregation. It would be
 a great missed opportunity to not take advantage of it. This
 segregation is really nice as pointers go from one to another in
 a single direction, avoiding the need to track references
 changes, and all element of the heap part share some common
 characteristics. These characteristics allow for specialized GC
 algorithms (for instance, a GC can run on the immutable heap
 without the program even knowing about it).

 Casting is by essence bypassing the type system. In this case,
 you must be help responsible for what happen. The language cannot
 provide guarantees.

This might work with safe D. I haven't yet figured the consequences of 
implicit heap segregation by type. What restrictions will this impose on 
system programming?

As compacting data is also pretty problematic as long as part of the 
scanning is conservative, we might as well consider storing generation 
and type information per allocation. This will destroy some optimization 
possibilities, though.

Jan 11 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:
 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

I assume you are aware that there is implicit casting to immutable upon 
return from a strongly pure function. What about it?

Jan 10 2014

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 11 January 2014 at 04:26:51 UTC, Timon Gehr wrote:
 On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:
 * There will be no possibility to change the type of certain 
 objects
 once allocated. An allocation for an immutable object cannot 
 e.g. make
 it later a mutable one. (This is an essential issue with the 
 current
 allocator, I think.)

 I assume you are aware that there is implicit casting to 
 immutable upon return from a strongly pure function. What about 
 it?

Is it a bird ? Is it a plane ? No it is *ISOLATED* !

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 8:26 PM, Timon Gehr wrote:
 On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:
 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

 I assume you are aware that there is implicit casting to immutable upon
 return from a strongly pure function. What about it?

I don't know. Need to think about it. Maybe it's a wrong design decision 
(or maybe separate heaps are wrong).

Andrei

Jan 10 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 08:53:44PM -0800, Andrei Alexandrescu wrote:
 On 1/10/14 8:26 PM, Timon Gehr wrote:
On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:
* There will be no possibility to change the type of certain objects
once allocated. An allocation for an immutable object cannot e.g.
make it later a mutable one. (This is an essential issue with the
current allocator, I think.)

I assume you are aware that there is implicit casting to immutable
upon return from a strongly pure function. What about it?

 
 I don't know. Need to think about it. Maybe it's a wrong design
 decision (or maybe separate heaps are wrong).

[...]

I dunno, separate heaps, esp. for immutable vs. mutable, seems like a
very powerful GC optimization opportunity. Missing it seems like such a
pity, esp. given the price we pay for immutability (transitivity, actual
immutability guarantees unlike C/C++ const, etc.).


T

-- 
If you want to solve a problem, you need to address its root cause, not
just its symptoms. Otherwise it's like treating cancer with Tylenol...

Jan 10 2014

"Nicholas Londey" <londey gmail.com> writes:

In theory you could use a region allocator for a pure function 
and copy its result out.
This wouldn't be worth it in the general case but imagine worker 
threads in a task pool. Pure functions executing message 
in/message(s) out that did not significantly contribute to the 
stop the world GC workload.

My D experience is only limited (mainly a C++ programmer) but 
something along these lines is how I have always imagined GC will 
sidestep the 'stop the world' problem as we move toward 
async/await programming models.

Jan 13 2014

=?ISO-8859-1?Q?S=F6nke_Ludwig?= <sludwig+dforum outerproduct.org> writes:

Am 11.01.2014 05:26, schrieb Timon Gehr:
 On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:
 * There will be no possibility to change the type of certain objects
 once allocated. An allocation for an immutable object cannot e.g. make
 it later a mutable one. (This is an essential issue with the current
 allocator, I think.)

 I assume you are aware that there is implicit casting to immutable upon
 return from a strongly pure function. What about it?

I'd say the easiest way to deal with it is to let the compiler emit a 
"gc_isolatedToImmutable"-like call in this case which moves the object 
from one heap to another, assuming that the language guarantees safety.

Jan 12 2014

"Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:

On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu
wrote:
 The design at http://goo.gl/ZVCJeB seems to be a win. It works 
 well, comprehends all major allocation tropes, someone 
 implemented a subset of it in C++ and measured good results, 
 and a coworker is considering adopting the design for a C++ 
 project as well.

An additional feature that I would like to suggest is the concept
of time-limited collection.

The goal would be to be able to write a main loop in my
application where I say:

const auto FRAME_DUR_MS = 25;
while (keepPlaying) {
    auto start = Clock.currSystemTick()
    update();
    render();
    GC.collectMsecs( FRAME_DUR_MS - (Clock.currSystemTick() -
start).msecs() );
}

The idea is that the collector, conceptually, has a routine that
it runs. I.e.

1. Point to bottom of stack, scan through.
2. Pop registers from each thread, check values.
3. Determine unreachable values amongst everything allocated.
4. Call destructors on objects.
5. Free memory.

Since "unreachability" is a one-way street, I can pause anywhere
in this process and resume from that location at a later time. I
don't have to go through all of the steps every time I am called.
So while there's no guarantee that any one call to collectMsecs()
actually releases any memory, if the caller continues to call it
reliably, they should be able to keep on top of their memory
usage (or know that they need to allocate less frequently if they
see their memory continue to grow, even under this scheme).

Jan 14 2014

"Rikki Cattermole" <alphaglosined gmail.com> writes:

On Tuesday, 14 January 2014 at 23:48:33 UTC, Chris Williams wrote:
 On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu
 wrote:
 The design at http://goo.gl/ZVCJeB seems to be a win. It works 
 well, comprehends all major allocation tropes, someone 
 implemented a subset of it in C++ and measured good results, 
 and a coworker is considering adopting the design for a C++ 
 project as well.

 An additional feature that I would like to suggest is the 
 concept
 of time-limited collection.

 The goal would be to be able to write a main loop in my
 application where I say:

 const auto FRAME_DUR_MS = 25;
 while (keepPlaying) {
    auto start = Clock.currSystemTick()
    update();
    render();
    GC.collectMsecs( FRAME_DUR_MS - (Clock.currSystemTick() -
 start).msecs() );
 }

 The idea is that the collector, conceptually, has a routine that
 it runs. I.e.

 1. Point to bottom of stack, scan through.
 2. Pop registers from each thread, check values.
 3. Determine unreachable values amongst everything allocated.
 4. Call destructors on objects.
 5. Free memory.

 Since "unreachability" is a one-way street, I can pause anywhere
 in this process and resume from that location at a later time. I
 don't have to go through all of the steps every time I am 
 called.
 So while there's no guarantee that any one call to 
 collectMsecs()
 actually releases any memory, if the caller continues to call it
 reliably, they should be able to keep on top of their memory
 usage (or know that they need to allocate less frequently if 
 they
 see their memory continue to grow, even under this scheme).

I really love this idea. However I'm not sure how kernels would 
like no sleeping to occur at least by my understanding.
It would definitely be a big plus to anything related to gui's.

Jan 14 2014

"Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:

On Wednesday, 15 January 2014 at 01:55:13 UTC, Rikki Cattermole
wrote:
 I really love this idea. However I'm not sure how kernels would 
 like no sleeping to occur at least by my understanding.
 It would definitely be a big plus to anything related to gui's.

Sleep allows a thread to give way to another thread within the
same process. If an app only has one thread, then that should end
up ceding time to other processes, but the OS is still able to
perform process switching regardless of whether an application
ever has all of its threads asleep at once or not.

Jan 15 2014

"Rikki Cattermole" <alphaglosined gmail.com> writes:

On Wednesday, 15 January 2014 at 21:54:18 UTC, Chris Williams 
wrote:
 On Wednesday, 15 January 2014 at 01:55:13 UTC, Rikki Cattermole
 wrote:
 I really love this idea. However I'm not sure how kernels 
 would like no sleeping to occur at least by my understanding.
 It would definitely be a big plus to anything related to gui's.

 Sleep allows a thread to give way to another thread within the
 same process. If an app only has one thread, then that should 
 end
 up ceding time to other processes, but the OS is still able to
 perform process switching regardless of whether an application
 ever has all of its threads asleep at once or not.

I'm referring to cpu prioritization. For the process. It'll 
depend heavily on the kernel/config of it however.

Jan 15 2014

D Programming

C/C++ Programming

Other

digitalmars.D - Componentizing D's garbage collector