digitalmars.D - Proposal for dual memory management

reply F. Almeida <francisco.m.almeida gmail.com> writes:
In the spirit of accepting the decision of removing delete, I would like to
present a proposal for semantics/library features that would give any
programmer the choice between managing dynamically allocated memory by himself
and trusting the garbage collector to efficiently handle everything. This
would be similar to what happens in C# (minus the extra unsafe and fixed
keywords), and can be considered safe.

1. As decided by the D designers, the "delete" keyword is to be removed. If the
"new" keyword is kept, it is reserved to allocating memory managed by the
garbage collector for objects.

2. Two distinct heaps are available by default: the managed (garbage collected)
heap and the unmanaged heap. Objects declared on one heap cannot be moved
to the other heap (Except by virtue of copying an object from the unmanaged
heap to the garbage collected heap and then destroying the original, which is
the closest to a safe operation.) The garbage collector monitors the state of
both, but only reads/writes the managed heap.

3. Manual memory (de)allocation, and construction/destruction of objects are
separated, each is given its own function/operator. If a function is used, it
should be a function template that infers the required memory amount needed by
an initialized object according to the contained data (i.e., calling
cmalloc()/cfree() without needing anything other than the object type).

The functions to take care of memory allocation/deallocation could simply be
called alloc()/dealloc(). Example:

class Foo
{
  this() {}
  // No object may be declared on the unmanaged heap unless there
  // are explicit constructors/destructors.
  ~this() {}
}

Foo f1;    // Declared, not initialized.
alloc!Foo(f1); // Memory is allocated for the object. Constructor not yet
               // called. Reference to f1 is returned.
f1 = Foo(); // Proposed syntax for construction on unmanaged heap. Only legal
            // *after* allocating with alloc() (i.e., the .init state is tested.)

destroy!Foo(f1); // Possible syntax for a function template that calls the
                 // destructor of Foo and reverts to .init. No deallocation occurs yet.
dealloc!Foo(f1); // Memory may only be deallocated *after* destroying the
                 // object (i.e., reverting it to initial state.)

Foo f1 = new Foo();  // The familiar syntax would be kept for creating on the
                     // managed heap, without any further intervention from the programmer.

4. Needless to say, in many cases the correct type would be inferred without
any explicit template parameters (i.e., one could just write alloc(f1) instead
of alloc!Foo(f1).) Attempts to call either alloc(null), destroy(null) or
dealloc(null) would result in exceptions.

5. Additional functions could be available to simplify these tasks. For example,

class Boo : Foo
{
   this() {}
   ~this() {}
}

Foo f2 = create!Foo(f2);  // Calls alloc!Foo(f2) and then calls the
                          // constructor. Equivalent to new in C++.
Foo b = create!Boo(b);  // Casts the pointer to the correct type before
                        // allocating, then calls the constructor. Again, same behavior as new.

delete(f2);  // Calls the destructor, and then deallocates. Would replace
             // delete, except it wouldn't touch the managed heap.

I think this could be a reasonable way to keep manual memory management, at
least on a superficial analysis.
Jul 27 2010
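
A note for readers following the thread: the four-step lifecycle proposed above (allocate, construct, destroy, deallocate) can be approximated with library calls in present-day D. The sketch below is not the syntax proposed in the post. It assumes std.conv.emplace for in-place construction and object.destroy for running the destructor; alloc and dealloc are hypothetical helpers named after the proposal and backed by the C heap, and the int constructor argument is only there to make the result observable.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

class Foo
{
    int value;
    this(int v) { value = v; }
    ~this() { }
}

// Hypothetical helper mirroring the proposed alloc!Foo: returns raw,
// unconstructed C-heap memory sized for one instance of the class.
void[] alloc(T)() if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* p = malloc(size);
    if (p is null) assert(0, "out of memory");
    return p[0 .. size];
}

// Hypothetical helper mirroring the proposed dealloc.
void dealloc(void[] mem)
{
    free(mem.ptr);
}

void main()
{
    void[] mem = alloc!Foo();      // 1. allocate; no constructor has run
    Foo f = emplace!Foo(mem, 42);  // 2. construct in place
    assert(f.value == 42);
    destroy(f);                    // 3. run the destructor; memory still owned
    dealloc(mem);                  // 4. release the memory
}
```

Note that this version keeps both the raw memory and the object reference around, which is the bookkeeping cost raised later in the thread.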
parent reply Petr Janda <janda.petr gmail.com.au> writes:
You said:
Foo f2 = create!Foo(f2)

But I don't like the fact that f2 is mentioned twice.

I would prefer the exact same behaviour as C++:

Class* c = new Class(); // allocate on the unmanaged heap and run the default
                        // constructor
delete c; // run destructor and deallocate

objects to be allocated on the garbage collected heap would be constructed this
way:
Class c = Class.new() // delete c would be illegal

Simple and logical to me, unfortunately TDPL has been written and everyone
would be using "new Class()" instead of "Class.new()" :(

Petr
Jul 27 2010
parent reply Petr Janda <janda.petr gmail.com> writes:
Looking at TDPL, placement new can actually take a memory address.

So I'm thinking why not do this:

void* mp = alloc!Foo(); // allocate raw memory off the C heap to hold Foo.
Foo* f = new(mp) Foo(); // construct an object in the memory pointed to by mp

and then

clear(f); // call destructor on f, obliterate with Buffer.init and then call
          // Buffer's default constructor as per TDPL
dealloc(mp); //deallocate memory

We reached the desired removal of delete without breaking TDPL.

Any comments about how possible or not possible it is to do?
Jul 27 2010
parent reply F. Almeida <francisco.m.almeida gmail.com> writes:
== Quote from Petr Janda (janda.petr gmail.com)'s article
 Looking at TDPL, placement new can actually take a memory address.
 So Im thinking why not do this:
 void* mp = alloc!Foo(); // allocate raw memory off the C heap to hold Foo.
 Foo* f = new(mp) Foo(); // construct an object in the memory pointed to by mp

It sounds like a fine alternative for construction, except that it can lead to confusion regarding the role of new, which changes drastically depending on whether or not we pass it an address. Also, I would prefer if we could avoid explicitly declaring a separate void* pointer. At any rate, it is already an improvement over my clumsy f = Foo(); syntax. Keep in mind that this would probably also be legal code:

Foo f = new(alloc!Foo()) Foo();

 and then
 clear(f); // call destructor on f, obliterate with Buffer.init and then call
 Buffer's default constructor as per TDPL

It should be possible with ctor/dtor and alloc/dealloc separation, but in some instances, we should be able to skip the last step of calling the constructor again. Buffer.init is a better state for debugging purposes too.
 dealloc(mp); //deallocate memory

Again, this would require anybody who wants to manage memory to keep track of two variables per object. Is there a simpler alternative?
 We reached the desired removal of delete without breaking TDPL.
 Any comments about how possible or not possible it is to do?

It all comes together by protecting managed from unmanaged memory in a final analysis. We just need to be sure about how to bring alloc()/dealloc() to the table and how to link construction/destruction to them, without reproducing the C++ new/delete in the process. IMHO, if we are to replace them, make the alternatives even more powerful.
Jul 28 2010
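
On the question of tracking two variables per object: a minimal sketch, assuming the memory address can be recovered from the class reference itself by casting it to void*. The names create and dispose are hypothetical (create echoes the first post), the live counter exists only so the example can check itself, and std.conv.emplace stands in for the placement new discussed above.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

class Foo
{
    static int live;   // constructed-but-not-destroyed instances (demo only)
    this() { ++live; }
    ~this() { --live; }
}

class Boo : Foo
{
    this() { super(); }
}

// Hypothetical create: allocate on the C heap and construct in one step.
T create(T, Args...)(auto ref Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* p = malloc(size);
    if (p is null) assert(0, "out of memory");
    return emplace!T(p[0 .. size], args);
}

// Hypothetical dispose: destroy and deallocate in one step. The address is
// taken from the reference, so the caller never holds a separate void*.
void dispose(T)(ref T obj) if (is(T == class))
{
    if (obj is null) return;
    void* p = cast(void*) obj;
    destroy(obj);   // runs the destructor of the dynamic type
    free(p);
    obj = null;
}

void main()
{
    Foo f2 = create!Foo();
    Foo b = create!Boo();   // derived object behind a base reference
    assert(Foo.live == 2);
    dispose(f2);
    dispose(b);
    assert(Foo.live == 0 && f2 is null && b is null);
}
```

The trade-off is that allocation and construction are fused again, which is the combination the thread was trying to avoid reproducing from C++ new/delete.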
parent reply Petr Janda <janda.petr gmail.com> writes:
 Keep in mind that this would probably also be legal code:
 Foo f = new(alloc!Foo()) Foo();

Well I think the goal, according to Andrei, is to split allocation/construction and deallocation/destruction.
 dealloc(mp); //deallocate memory
 Again, this would require anybody who wants to manage memory to keep
 track of two variables per object. Is there a simpler alternative?

Yes, but how to do this effectively without combining them back into one command like delete? I mentioned in my first post that I'd prefer new and delete to behave like they do in C++ (heap allocation) and Class.new for GC allocation; seeing as that's not possible anymore due to TDPL, I don't know. I really know bugger all about D. I know a little bit about C++ though, and if D2 wants to remain a systems programming language, it must have the same "power" as C++, in particular manual memory management. I think what has to be done is to clearly indicate which features of D can't be used when managing your own memory. Best if the compiler could warn us.
Jul 28 2010
next sibling parent F. Almeida <francisco.m.almeida gmail.com> writes:
== Quote from Petr Janda (janda.petr gmail.com)'s article
 I think what has to be done is clearly indicate which features of D can't be
 used when managing your own memory. Best if compiler could warn us.

Aside from manual allocation and deallocation, there should be very little difference between what you can do to managed and to unmanaged objects. We could eventually adopt an unsafe attribute that tells the compiler to reject unsafe code whenever the SafeD subset is in use. This would make manually managed code refuse to compile. But I don't think this step should be taken before the GC is updated and optimized.
Jul 28 2010
prev sibling parent reply Kagamin <spam here.lot> writes:
Petr Janda Wrote:

 Keep in mind that this would probably also be legal code:
 Foo f = new(alloc!Foo()) Foo();

Well I think the goal, according to Andrei, is to split allocation/construction and deallocation/destruction.

Allocation and construction is also split? Then you can use unused new() argument list:
---
shared stream = File.new().ctor("/foo.d").open(O_READ); //gc heap
shared stream = File.new(heap2).ctor("/foo.d").open(O_READ); //oops, leak
shared stream = File.new(scope).ctor("/foo.d").open(O_READ); //now ok
stream.dtor().free(); //illegal, free(Heap storage) requires heap
stream.dtor().free(heap2); //throw if not found in heap2
---
Jul 28 2010
parent reply Petr Janda <janda.petr gmail.com> writes:
I don't think the intention is to split GC allocation/construction into two
expressions.
Only heap allocation/construction and heap deallocation/destruction.

It should be duly noted which features can't be used without garbage
collection, e.g. D arrays?
There should be an alternative that maybe doesn't have all the features but
doesn't require GC.

I'm personally quite fond of STL containers. Rewritten in D would be cool.

vector!int myvec;

Yay!
Jul 30 2010
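
For illustration, a GC-free container along the lines of vector!int might look roughly like the sketch below. Everything here is hypothetical: the Vector name, pushBack, and the growth policy are assumptions, not an existing library type, and the sketch only handles plain-data element types (no destructors, no GC pointers inside the elements).

```d
import core.stdc.stdlib : realloc, free;

struct Vector(T)
{
    private T* data;
    private size_t len, cap;

    @disable this(this);   // no implicit copies, so there is exactly one owner

    ~this() @nogc nothrow { free(data); }   // free(null) is a no-op

    void pushBack(T value) @nogc nothrow
    {
        if (len == cap)
        {
            // Grow geometrically so appends are amortized constant time.
            size_t newCap = cap ? cap * 2 : 4;
            auto p = cast(T*) realloc(data, newCap * T.sizeof);
            if (p is null) assert(0, "out of memory");
            data = p;
            cap = newCap;
        }
        data[len++] = value;
    }

    ref T opIndex(size_t i) @nogc nothrow
    {
        assert(i < len);
        return data[i];
    }

    size_t length() const @nogc nothrow { return len; }
}

void main()
{
    Vector!int myvec;
    foreach (i; 0 .. 10) myvec.pushBack(i * 2);
    assert(myvec.length == 10);
    assert(myvec[9] == 18);
}   // myvec's destructor releases the buffer here
```

Disabling the postblit makes the struct a uniquely owned value; a reference-counted or deep-copying variant would be a different design choice.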
parent Kagamin <spam here.lot> writes:
Petr Janda Wrote:

 It should be duly noted that features that can't be used without garbage
 collection ie. D Arrays?

Array concat, array literals, closures - what I can think of.
 I'm personally quite fond of STL containers. Rewritten in D would be cool.

Are they reference counted? As I remember, std::string is a value type.
Jul 30 2010
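
The wish for a compiler warning and the list of GC-dependent features above can be illustrated with the @nogc function attribute, which was added to D some years after this thread and is not part of any proposal in it. A minimal sketch, assuming a compiler that supports @nogc; the sumOfSquares function is made up for the example.

```d
import core.stdc.stdlib : malloc, free;

@nogc nothrow
int sumOfSquares(size_t n)
{
    // A C-heap buffer stands in for a GC-allocated D array.
    int* p = cast(int*) malloc(n * int.sizeof);
    if (p is null) return -1;
    scope(exit) free(p);
    int[] buf = p[0 .. n];   // slicing existing memory does not allocate

    foreach (i, ref x; buf) x = cast(int) (i * i);

    int total = 0;
    foreach (x; buf) total += x;

    // The compiler rejects GC-allocating operations inside a @nogc function:
    // buf ~= 1;                 // array append
    // int[] lit = [1, 2, 3];    // array literal stored in a slice
    // The same applies to a delegate that captures a local and escapes
    // the function (a closure).
    return total;
}

void main()
{
    assert(sumOfSquares(4) == 14);   // 0 + 1 + 4 + 9
}
```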