
digitalmars.D - Idea for allocators

So, I've been thinking about a few of the current problems with D:
- No allocators on containers
- Standard library functions doing too much GC allocation
- Escaping pointers to memory not allocated using the GC
- Implicit allocation with "~", "~=" and array literals

And I came up with something that might be able to solve a few of 
them:

string Test(Alloc = allocator(return))(string a, string b) {
     return a ~ b;
}

Escape analysis would be done by the compiler for every 
allocation, whether that's implicit via "~", explicit with "new" 
or whatever.

This will result in a list of ways that a reference to the 
allocated memory can escape, which can contain:
- By assignment to a global
- By return value
- By a particular parameter (if parameter is ref or contains a 
- By the "this" parameter

If there are multiple ways it could escape then a partial 
ordering can help the compiler choose the most general, or if 
there is no reasonable ordering then it could error.

In each case the allocator can be specified using a template 
parameter:
- string Test(Alloc = allocator(global))(string a, string b);
- string Test(Alloc = allocator(return))(string a, string b);
- string Test(Alloc = allocator("a"))(string a, string b);
- string Test(Alloc = allocator(this))(string a, string b);

Multiple values could also be specified:
- string Test(Alloc = allocator("b", return))(string a, string b);

This does two things - it tells the caller what the allocator 
will be used for, and it helps the compiler decide which 
allocator to use for each allocation. If an allocation can't 
escape at all then it should be allocated on the stack, or at 
least using a stack/region allocator for best performance.

Anyway, going back to this case:

string Test(Alloc = allocator(return))(string a, string b) {
     return a ~ b;
}

The compiler can see that the allocation caused by "a ~ b" can 
only escape via the return value, so it will automatically 
use the type "Alloc" as the allocator for that allocation.

If "Test" is called like so:

void Test2() {
     auto result = Test("Hello ", "world!");
     if (result.length > 5)

The "Alloc" parameter has a default value of "allocator(...)" 
which means that the caller should try to figure out what to pass 
in. "allocator(return)" means it will be used to allocate the 
return value, so the compiler performs escape analysis on the 
return value and finds out that it never escapes, and so provides 
a simple stack/region allocator.
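For comparison, this is roughly what has to be written by hand in current D to get the same effect. It is only a sketch: "Region" and "concat" are hypothetical names, not Phobos types, and the allocator has to be created and passed explicitly instead of being chosen by the compiler.

```d
// Hypothetical fixed-size region allocator that lives wherever it is
// declared (on the stack, if declared as a local variable).
struct Region(size_t capacity)
{
    private ubyte[capacity] buffer;
    private size_t used;

    // Hands out the next `size` bytes, or null when the region is full.
    void[] allocate(size_t size)
    {
        if (size > capacity - used)
            return null;
        auto block = buffer[used .. used + size];
        used += size;
        return block;
    }
}

// Concatenates a and b into memory obtained from `alloc` instead of the GC.
// Returns null if the allocator cannot satisfy the request.
char[] concat(Alloc)(ref Alloc alloc, const(char)[] a, const(char)[] b)
{
    auto mem = cast(char[]) alloc.allocate(a.length + b.length);
    if (mem.length != a.length + b.length)
        return null;
    mem[0 .. a.length] = a[];
    mem[a.length .. $] = b[];
    return mem;
}
```

A caller would declare something like "Region!64 region;" as a local and call "concat(region, "Hello ", "world!")". The result is only valid while "region" is alive, which is exactly the property the proposed escape analysis would have to prove.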

The Alloc parameter is a normal template parameter aside from its 
default value so you can always explicitly specify a different 
allocator to use (saves having GC and no-GC versions of each 
Phobos function). It can also be used directly as an allocator 
from within the function.
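The "one function, any allocator" part can already be approximated with an ordinary defaulted template parameter. In the sketch below "GCAlloc", "MallocAlloc" and "toLowerAscii" are hypothetical names chosen for illustration, and the default is simply the GC rather than something the compiler works out:

```d
// Hypothetical allocator that forwards to the GC.
struct GCAlloc
{
    static void[] allocate(size_t size)
    {
        return new ubyte[size];
    }
}

// Hypothetical allocator that forwards to the C heap.
struct MallocAlloc
{
    static void[] allocate(size_t size)
    {
        import core.stdc.stdlib : malloc;
        auto p = malloc(size);
        return p is null ? null : p[0 .. size];
    }

    static void deallocate(void[] block)
    {
        import core.stdc.stdlib : free;
        free(block.ptr);
    }
}

// ASCII-only lower-casing; the result is allocated with Alloc.
// Returns null if the allocation fails.
char[] toLowerAscii(Alloc = GCAlloc)(const(char)[] s)
{
    auto mem = cast(char[]) Alloc.allocate(s.length);
    if (mem.length != s.length)
        return null;
    foreach (i, c; s)
        mem[i] = (c >= 'A' && c <= 'Z') ? cast(char)(c + 32) : c;
    return mem;
}
```

Calling "toLowerAscii("HeLLo")" uses the GC, while "toLowerAscii!MallocAlloc("HeLLo")" uses malloc and leaves the caller responsible for calling "MallocAlloc.deallocate" on the result.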

It could also be used with non-function templates such as 
containers, although the only useful defaults would be 
"allocator(this)" and "allocator(global)". The allocator would 
still be filled in automatically by the compiler if not specified 
so that it could potentially allocate an entire container in a 
stack/region allocator, all transparently to the caller.
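As a rough picture of the container case, here is a minimal stack that takes its allocator as a template parameter in current D. All names ("BumpAlloc", "GCAlloc", "IntStack") are hypothetical, and the allocator still has to be picked by the user rather than by the compiler:

```d
// Hypothetical bump allocator over a caller-supplied buffer.
struct BumpAlloc
{
    ubyte[] buffer;
    size_t used;

    // Returns the next `size` bytes of the buffer, or null if exhausted.
    void[] allocate(size_t size)
    {
        if (size > buffer.length - used)
            return null;
        auto block = buffer[used .. used + size];
        used += size;
        return block;
    }
}

// Hypothetical allocator that forwards to the GC.
struct GCAlloc
{
    void[] allocate(size_t size)
    {
        return new ubyte[size];
    }
}

// Minimal stack of ints; all storage comes from `alloc`.
struct IntStack(Alloc = GCAlloc)
{
    Alloc alloc;
    int[] items;
    size_t count;

    // Returns false if the allocator could not provide more room.
    bool push(int value)
    {
        if (count == items.length)
        {
            auto wanted = items.length ? items.length * 2 : 4;
            auto grown = cast(int[]) alloc.allocate(wanted * int.sizeof);
            if (grown.length != wanted)
                return false;
            grown[0 .. count] = items[0 .. count];
            items = grown; // old block is simply abandoned to the allocator
        }
        items[count++] = value;
        return true;
    }

    // Caller must ensure the stack is not empty.
    int pop()
    {
        return items[--count];
    }
}
```

With a local "int[64] backing;" (an int array so the memory is suitably aligned), "IntStack!BumpAlloc(BumpAlloc(cast(ubyte[]) backing[]))" keeps the whole container on the stack, while "IntStack!()" falls back to the GC.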

In cases where there is no allocator, such as in a non-template, it 
will fall back to using the GC and so be completely backward 
compatible.

This could all be quite difficult to implement but it does 
provide some nice benefits:
- In most cases the only thing needed to take advantage is to add 
"Alloc = allocator(return)" to the template parameters
- Should massively reduce GC usage and cost of allocations (think 
toLower, etc.)
- No new syntax apart from the keyword "allocator"
- Can still use "~" and all the other nice features of D even in 
performance-critical/no-GC code
- Compiler analysis required is confined to a single function at 
a time
- The biggest problem with allocators in C++ is that nobody 
actually bothers to use them. Since in this case the best 
allocator is chosen automatically that's not a problem.
May 31 2013