digitalmars.D - Possible quick win in GC?

Abdulhaq (48/48) Sep 28 2014 Perhaps I've too had much caffeine today but I've had an idea

Abdulhaq (40/40) Sep 28 2014 Here's a code snippet which mopefully makes things a bit clearer:

Freddy (3/44) Sep 28 2014 You mean this?

Abdulhaq (15/17) Sep 28 2014 Of course my proposal uses the techique of escape analysis as

David Nadlinger (8/13) Sep 28 2014 LDC does the "fairly simple" part of this already in a custom

Abdulhaq (28/41) Sep 29 2014 That's interesting, yes I guessed that the escape analysis would

Mike (3/47) Sep 29 2014 Sounds a little like http://wiki.dlang.org/DIP46

Abdulhaq (8/60) Sep 29 2014 Ah thanks for the link yes there are definite similarities,

thedeemon (4/4) Sep 29 2014 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote...

Abdulhaq (5/9) Sep 29 2014 Ah yes - it's identical isn't it - I don't know whether to be

thedeemon (8/10) Sep 29 2014 From the literature on this topic I remember attempts for

Abdulhaq (2/12) Sep 30 2014 Mmm yes I can imagine that being the case... oh well!

"Abdulhaq" <alynch4047 gmail.com> writes:

Perhaps I've too had much caffeine today but I've had an idea 
which might give a fairly quick win on the GC speed situation. 
It's a simple idea at heart so it's very possible/likely that 
this is a well known idea that has already been discarded but 
anyway here goes.

I got the idea after thinking that it should be fairly simple for 
the compiler to detect straightforward cases of when a variable 
can be declared as going on the stack - i.e. no references to it 
are retained after its enclosing function returns. At the moment 
AIUI it is necessary for a class instance to be declared by the 
programmer as 'scoped' for this to take place.

Further, I was considering the type of ownership and boundary 
considerations that could be used to improve memory management - 
e.g. using the notion of an owner instance which, upon 
destruction, destroys all owned objects.

Afer some consideration it seems to me that by using only static 
analysis a tree of references could be constructed of references 
from a root 'scoped' object to all referred to objects that are 
allocated after the allocation of the root object. When the root 
object goes out of scope it is destroyed and all the descendent 
objects from the root object (as identified by the static 
analysis) could also be destroyed in one simple shot. The static 
analysis of course constructs the tree by analysing the capturing 
of references from one object to another. It could be the case 
that even a simple static analysis at first (e.g. discard the 
technique in difficult situations) could cover a lot of use cases 
(statistically).

Of course, if one of the descendent objects is referred to by an 
object which is not in the object tree, then this technique 
cannot be used. However, I envisage that there are many 
situations where upon the destruction of a root object all 
related post-allocated objects can also be destroyed.

In terms of implementation I see this being done by what I am 
calling 'bands' within the GC. With the allocation of any 
identified root object, a new band (heap) is created in the GC. 
Child objects of the root object (i.e. only referred to by the 
root object and other child objects in its tree) are placed in 
the same band. When the root object goes out of scope the entire 
band is freed. This by definition is safe because the static 
analysis has ensured that there are no 'out-of-tree' references 
between child objects in the tree and out-of-tree (out-of-band) 
objects. This property also means that normal GC runs do not need 
to add the scoped root object as a GC root object - this memory 
will normally only be freed when the scoped root object at the 
top of the tree goes out of scope. If memory becomes constrained 
then the bands can be added as root objects to the GC and memory 
incrementally freed just as with regularly allocated objects.


Sorry if this idea is daft and I've wasted your time!

Sep 28 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

Here's a code snippet which mopefully makes things a bit clearer:


/**
* In this example the variable foo can be statically analysed as 
safe to go on the stack.
* The new instance of Bar allocated in funcLevelB is only 
referred to by foo. foo can
* be considered a root 'scoped' variable and the GC can delete 
both foo and the new Bar()
* when foo goes out of scope. There is no need (except when under 
memory pressure) for
* the GC to scan the band created for foo and it's related child 
allocations.
*/

import std.stdio;

class Bar {
	public:
	int x;
	
	this(int x) {
		this.x = x;
	}
}

class Foo {
	public:
	Bar bar;
}

void funcLevelA() {
	Foo foo = new Foo(); // static analysis could detect this as 
able to go on the stack
	funcLevelB(foo);
	writeln(foo.bar.x);
}

void funcLevelB(Foo foo) {
	foo.bar = new Bar(12); // this allocated memory is only referred 
to by foo, which
						   // static analysis has established can go on the stack	
}

void main() {
	funcLevelA();
}

Sep 28 2014

"Freddy" <Hexagonalstar64 gmail.com> writes:

On Sunday, 28 September 2014 at 17:47:42 UTC, Abdulhaq wrote:
 Here's a code snippet which mopefully makes things a bit 
 clearer:


 /**
 * In this example the variable foo can be statically analysed 
 as safe to go on the stack.
 * The new instance of Bar allocated in funcLevelB is only 
 referred to by foo. foo can
 * be considered a root 'scoped' variable and the GC can delete 
 both foo and the new Bar()
 * when foo goes out of scope. There is no need (except when 
 under memory pressure) for
 * the GC to scan the band created for foo and it's related 
 child allocations.
 */

 import std.stdio;

 class Bar {
 	public:
 	int x;
 	
 	this(int x) {
 		this.x = x;
 	}
 }

 class Foo {
 	public:
 	Bar bar;
 }

 void funcLevelA() {
 	Foo foo = new Foo(); // static analysis could detect this as 
 able to go on the stack
 	funcLevelB(foo);
 	writeln(foo.bar.x);
 }

 void funcLevelB(Foo foo) {
 	foo.bar = new Bar(12); // this allocated memory is only 
 referred to by foo, which
 						   // static analysis has established can go on the stack	
 }

 void main() {
 	funcLevelA();
 }

You mean this?
https://en.wikipedia.org/wiki/Escape_analysis

Sep 28 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

 You mean this?
 https://en.wikipedia.org/wiki/Escape_analysis

Of course my proposal uses the techique of escape analysis as 
part of its methodology, but the essence of the idea is to 
greatly cut down on the work that the GC has to do on each sweep 
when dealing with objects that have been found to belong to a 
particular set. The objects in each set are in an object graph 
that has no incoming references from objects external to the set 
and which can therefore be allocated in their own heap that is 
destroyed when the root object goes out of scope.

The saving takes place because the GC does not need to scan the 
default heap for pointers found in the new heaps (bands). For 
certain type of programs such as compilers / lexers / parsers 
where many temporary objects are allocated and shortly after 
deallocated this can result in a substantial time saving in 
execution. In terms of memory usage we would see multiple 
potentially large but short-lived spikes.

Sep 28 2014

"David Nadlinger" <code klickverbot.at> writes:

On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:
 I got the idea after thinking that it should be fairly simple 
 for the compiler to detect straightforward cases of when a 
 variable can be declared as going on the stack - i.e. no 
 references to it are retained after its enclosing function 
 returns.

LDC does the "fairly simple" part of this already in a custom 
LLVM optimizer pass. The issue is that escape analysis is fairly 
hard in general, and currently even more limited because we only 
do it on the LLVM IR level (i.e. don't leverage any additional 
attributes like scope, pure, … that might be present in the D 
source code).

David

Sep 28 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger 
wrote:
 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:
 I got the idea after thinking that it should be fairly simple 
 for the compiler to detect straightforward cases of when a 
 variable can be declared as going on the stack - i.e. no 
 references to it are retained after its enclosing function 
 returns.

 LDC does the "fairly simple" part of this already in a custom 
 LLVM optimizer pass. The issue is that escape analysis is 
 fairly hard in general, and currently even more limited because 
 we only do it on the LLVM IR level (i.e. don't leverage any 
 additional attributes like scope, pure, … that might be present 
 in the D source code).

 David

That's interesting, yes I guessed that the escape analysis would 
present the harder part, but I'm hoping that the algorithm can be 
built up incrementally, identifying the easy wins first and then 
over time extending it to cover harder cases.

One way that I see it working it is to conduct a form of lowering 
where the new operator has some information added to it to 
indicate the 'band' that the GC should place the non-root objects 
into (root objects go on the stack). Using the syntax of C++'s 
placement new (but totally different semantics) code could be 
lowered to e.g.

External externalObj = new(0) External(); // 0 means use the 
default heap
Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for 
this set of objects
...
Bar bar = new (0x1234) Bar();

When the GC allocates memory it does so in the indicated 
band/heap, and then when foo (the root object of the object 
graph) goes out of scope the relevant band/heap is destroyed en 
bloc. The benefit of the idea is that when scanning for objects  
that can be deleted the GC does not need to consider those 
objects in the non default bands/heaps. For some classes of 
programs such as compilers (it was Higgs that gave me the 
stimulus), and with good static analysis (aye there's the rub 
cap'n) this could represent a very substantial time saving on ech 
GC sweep.

Sep 29 2014

"Mike" <none none.com> writes:

On Monday, 29 September 2014 at 07:03:29 UTC, Abdulhaq wrote:
 On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger 
 wrote:
 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:
 I got the idea after thinking that it should be fairly simple 
 for the compiler to detect straightforward cases of when a 
 variable can be declared as going on the stack - i.e. no 
 references to it are retained after its enclosing function 
 returns.

 LDC does the "fairly simple" part of this already in a custom 
 LLVM optimizer pass. The issue is that escape analysis is 
 fairly hard in general, and currently even more limited 
 because we only do it on the LLVM IR level (i.e. don't 
 leverage any additional attributes like scope, pure, … that 
 might be present in the D source code).

 David

 That's interesting, yes I guessed that the escape analysis 
 would present the harder part, but I'm hoping that the 
 algorithm can be built up incrementally, identifying the easy 
 wins first and then over time extending it to cover harder 
 cases.

 One way that I see it working it is to conduct a form of 
 lowering where the new operator has some information added to 
 it to indicate the 'band' that the GC should place the non-root 
 objects into (root objects go on the stack). Using the syntax 
 of C++'s placement new (but totally different semantics) code 
 could be lowered to e.g.

 External externalObj = new(0) External(); // 0 means use the 
 default heap
 Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for 
 this set of objects
 ...
 Bar bar = new (0x1234) Bar();

 When the GC allocates memory it does so in the indicated 
 band/heap, and then when foo (the root object of the object 
 graph) goes out of scope the relevant band/heap is destroyed en 
 bloc. The benefit of the idea is that when scanning for objects
  that can be deleted the GC does not need to consider those 
 objects in the non default bands/heaps. For some classes of 
 programs such as compilers (it was Higgs that gave me the 
 stimulus), and with good static analysis (aye there's the rub 
 cap'n) this could represent a very substantial time saving on 
 ech GC sweep.

Sounds a little like http://wiki.dlang.org/DIP46

Mike

Sep 29 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

On Monday, 29 September 2014 at 09:45:38 UTC, Mike wrote:
 On Monday, 29 September 2014 at 07:03:29 UTC, Abdulhaq wrote:
 On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger 
 wrote:
 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:
 I got the idea after thinking that it should be fairly 
 simple for the compiler to detect straightforward cases of 
 when a variable can be declared as going on the stack - i.e. 
 no references to it are retained after its enclosing 
 function returns.

 LDC does the "fairly simple" part of this already in a custom 
 LLVM optimizer pass. The issue is that escape analysis is 
 fairly hard in general, and currently even more limited 
 because we only do it on the LLVM IR level (i.e. don't 
 leverage any additional attributes like scope, pure, … that 
 might be present in the D source code).

 David

 That's interesting, yes I guessed that the escape analysis 
 would present the harder part, but I'm hoping that the 
 algorithm can be built up incrementally, identifying the easy 
 wins first and then over time extending it to cover harder 
 cases.

 One way that I see it working it is to conduct a form of 
 lowering where the new operator has some information added to 
 it to indicate the 'band' that the GC should place the 
 non-root objects into (root objects go on the stack). Using 
 the syntax of C++'s placement new (but totally different 
 semantics) code could be lowered to e.g.

 External externalObj = new(0) External(); // 0 means use the 
 default heap
 Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for 
 this set of objects
 ...
 Bar bar = new (0x1234) Bar();

 When the GC allocates memory it does so in the indicated 
 band/heap, and then when foo (the root object of the object 
 graph) goes out of scope the relevant band/heap is destroyed 
 en bloc. The benefit of the idea is that when scanning for 
 objects
 that can be deleted the GC does not need to consider those 
 objects in the non default bands/heaps. For some classes of 
 programs such as compilers (it was Higgs that gave me the 
 stimulus), and with good static analysis (aye there's the rub 
 cap'n) this could represent a very substantial time saving on 
 ech GC sweep.

 Sounds a little like http://wiki.dlang.org/DIP46

 Mike

Ah thanks for the link yes there are definite similarities, 
Walter identifies sets of objects to go in his region through 
having a boundary on the pure function. My set is determined by 
static analysis and does not require the function to be pure. 
However, thedeemon has pointed out that this technique is in fact 
well known - I'm left to wonder if I should have a go at an 
implementation, but my wife would not be too chuffed about that.

Sep 29 2014

"thedeemon" <dlang thedeemon.com> writes:

On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote...

Congratulations on inventing
http://en.wikipedia.org/wiki/Region-based_memory_management
and "Region inference" in particular.

Sep 29 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

On Monday, 29 September 2014 at 11:02:12 UTC, thedeemon wrote:
 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote...

 Congratulations on inventing
 http://en.wikipedia.org/wiki/Region-based_memory_management
 and "Region inference" in particular.

Ah yes - it's identical isn't it - I don't know whether to be 
happy or sad about that :). They explain it much better too...

So, (does anyone know) has this technique been discarded for D or 
is it 'just' a matter of the resources to do it?

Sep 29 2014

"thedeemon" <dlang thedeemon.com> writes:

On Monday, 29 September 2014 at 11:36:51 UTC, Abdulhaq wrote:

 So, (does anyone know) has this technique been discarded for D 
 or is it 'just' a matter of the resources to do it?

 From the literature on this topic I remember attempts for 
automatic region inference mostly failed: it lead to some small 
regions here and there that didn't affect anything much and then 
one huge region where most of the data landed, requiring a 
full-blown GC inside that big region. Making it work in D, where 
everything can be mutated by anything and any type system wall is 
a cast() away from breaking, seems virtually impossible.

Sep 29 2014

"Abdulhaq" <alynch4047 gmail.com> writes:

On Monday, 29 September 2014 at 12:02:40 UTC, thedeemon wrote:
 On Monday, 29 September 2014 at 11:36:51 UTC, Abdulhaq wrote:

 So, (does anyone know) has this technique been discarded for D 
 or is it 'just' a matter of the resources to do it?

 From the literature on this topic I remember attempts for 
 automatic region inference mostly failed: it lead to some small 
 regions here and there that didn't affect anything much and 
 then one huge region where most of the data landed, requiring a 
 full-blown GC inside that big region. Making it work in D, 
 where everything can be mutated by anything and any type system 
 wall is a cast() away from breaking, seems virtually impossible.

Mmm yes I can imagine that being the case... oh well!

Sep 30 2014

D Programming

C/C++ Programming

Other

digitalmars.D - Possible quick win in GC?