digitalmars.D - Built-in RAII in D

reply Nerve <nervecenter7 gmail.com> writes:
Thanks to Walter Bright's recent comments at Dconf about memory 
safety, and my own lamentations about the continued use of C in 
all contexts where memory safety is crucial by overconfident 
programmers who believe they can do no wrong, I've decided to 
propose a baked-in RAII implementation for D. I would like to 
submit a DIP, but first I'd like to run it by the forum community 
and see what improvements need to be made, possibly due to my own 
naivety on the subject.

To qualify that, I am nowhere near an expert on memory 
management. But, I've spent enough time absorbing discussion on 
the topic through osmosis and reading on it due to the D 
community's concerns about it that I may as well make myself 
heard and be shot down if I'm wrong.

I understand if many people are resistant to building it into the 
language. Phobos already has it, and there's the automem library 
by Atila Neves. However, I think the perception shift gained by 
baking in these features will benefit D enormously. D users can 
run around all day trying to convince people that different 
library implementations of RAII exist so one need not use the GC, 
but the only thing that is going to convince the programming 
world at large is a giant announcement on Reddit, HackerNews, and 
other places that "Hey, D has RAII now!" This will also drive 
many new eyes to the language that never would have looked 
otherwise.

There are also the obvious syntactical benefits. Referencing an 
RAII object and its members would be literally no different than 
referencing a GC heap object. No need to fiddle with library 
constructs to extract one's reference.

Without further ado, let's get started.

--- refcounted ---

Keyword "refcounted" allocates an object on the heap, and 
therefore uses an allocator. By default, a built-in malloc() 
implementation is used.

@nogc void main()
{
    auto refCountedObject = refcounted Object();
} // References go to 0, object is destroyed
The allocation method used by refcounted can be overloaded. The overload is a function which expects a type and a set of parameters that were passed to the constructor. One can allocate the object however they choose, call the passed constructor, and then the function expects a return of a reference to the allocated object. Forgive my ignorance, I'm unsure how to handle a collection of parameters. Haven't had to do it yet.
ref T opRefCounted(T)(params)
{
    T* object = malloc(sizeof(T));
    object.this(params);
    return ref object;
}
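On the question of handling a collection of parameters: in today's D this could be sketched with a variadic template that forwards the constructor arguments. The following is only an illustration under assumptions, not the proposed `opRefCounted` itself: it uses a struct payload, does no reference counting, and the names `makeWithMalloc`, `destroyAndFree` and `Point` are made up for the example.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical payload type, used only for this example.
struct Point
{
    int x, y;
}

// Hypothetical helper: allocate raw memory with malloc, then construct
// a T in place, forwarding whatever arguments the caller passed.
T* makeWithMalloc(T, Args...)(auto ref Args args)
    if (is(T == struct))
{
    void* raw = malloc(T.sizeof);
    if (raw is null)
        assert(0, "out of memory");
    // emplace constructs the object inside the given chunk of memory.
    return emplace!T(raw[0 .. T.sizeof], args);
}

// Hypothetical counterpart: run the destructor, then release the memory.
void destroyAndFree(T)(T* p)
{
    if (p is null)
        return;
    destroy(*p);
    free(p);
}

void main()
{
    auto p = makeWithMalloc!Point(3, 4);
    scope (exit) destroyAndFree(p);
    assert(p.x == 3 && p.y == 4);
}
```

Note that any custom allocation hook of this kind needs a matching deallocation hook, which is why the sketch comes as a pair of functions.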
opRefCounted() is ALWAYS UNSAFE SYSTEM CODE! This could manifest as a compiler warning whenever it is present that must be suppressed by a flag so the developer must acknowledge they have used a custom allocation scheme. There are, of course, other options for handling this, I'm just stating the most obvious.

--- unique ---

Keyword "unique" allocates an object on the stack. It is only accessible to the given scope, child scopes, or functions it is explicitly passed to. Therefore, it does not use an allocator.
@nogc void main()
{
    auto scopedObject = unique Object();
} // Fall out of scope, object destroyed

--- How new and GC fit in ---

Keyword "new", which allocates to the heap for the D garbage collector, may not be used with the @nogc attribute. Only refcounted and unique. No objects, functions, methods, or any other code within the scope of, or called from the scope of, a @nogc context, may allocate using new.
@nogc void main()
{
    auto refCountedObject = refcounted Object(); // Okay
    auto scopedObject = unique Object();         // Okay
    auto tracedObject = new Object();        // Error!

    {
        auto refCountedObject = refcounted Object(); // Okay
        auto scopedObject = unique Object();         // Okay
        auto tracedObject = new Object();        // Error!
    }
}
More examples using called functions.
void refCountedUsed()
{
    auto refCountedObject = refcounted Object();
}

void uniqueUsed()
{
    auto scopedObject = unique Object();
}

void newUsed()
{
    auto tracedObject = new Object();
}
@nogc void main()
{
    refCountedUsed(); // Okay
    uniqueUsed();     // Okay
    newUsed();        // Error!
}
void main()
{
    refCountedUsed(); // Okay
    uniqueUsed();     // Okay
    newUsed();        // Okay
}
All of these methods are legal when the GC is allowed.
void main()
{
    auto refCountedObject = refcounted Object(); // Okay
    auto scopedObject = unique Object();         // Okay
    auto tracedObject = new Object();        // Okay

    {
        auto refCountedObject = refcounted Object(); // Okay
        auto scopedObject = unique Object();         // Okay
        auto tracedObject = new Object();        // Okay
    }
}
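For comparison, the `new` half of these rules can already be checked in current D, where `@nogc` rejects both direct GC allocation and calls into functions that are not `@nogc`. A minimal sketch, assuming a reasonably recent compiler; the `refcounted` and `unique` keywords are not part of it since they do not exist:

```d
// Not marked @nogc, and allocates with the GC.
void newUsed()
{
    auto tracedObject = new Object();
}

// A @nogc function literal may not allocate with `new` directly...
static assert(!__traits(compiles, () @nogc { auto o = new Object(); }));
// ...nor call a function that is not @nogc itself.
static assert(!__traits(compiles, () @nogc { newUsed(); }));
// Without @nogc the same call is accepted.
static assert( __traits(compiles, () { newUsed(); }));

void main()
{
    newUsed(); // fine: main is not @nogc here
}
```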
I may be missing some things, but I've left out some exhaustive details since I'm sure many of you are already experienced in the subject and aren't looking for complete documentation in a proposal like this.

Feel free to level criticism, and let me know if I should submit a DIP.
May 28
next sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Sunday, 28 May 2017 at 17:34:30 UTC, Nerve wrote:
 Thanks to Walter Bright's recent comments at Dconf about memory 
 safety, and my own lamentations about the continued use of C in 
 all contexts where memory safety is crucial by overconfident 
 programmers who believe they can do no wrong, I've decided to 
 propose a baked-in RAII implementation for D. I would like to 
 submit a DIP, but first I'd like to run it by the forum 
 community and see what improvements need to be made, possibly 
 due to my own naivety on the subject.
D already has baked-in RAII; acquire the resource in a constructor, release it in a destructor. I assume you are referring to object lifetime management?
 Phobos already has it, and there's the automem library by Atila 
 Neves. However, I think the perception shift gained by baking 
 in these features will benefit D enormously.
Phobos and automem provide mechanisms for object lifetime management and some objects in Phobos use RAII (such as std.stdio.File).
 D users can run around all day trying to convince people that 
 different library implementations of RAII exist so one need not 
 use the GC, but the only thing that is going to convince the 
 programming world at large is a giant announcement on Reddit, 
 HackerNews, and other places that "Hey, D has RAII now!" This 
 will also drive many new eyes to the language that never would 
 have looked otherwise.
To be clear:
- RAII is binding a resource's lifetime to an object's lifetime, so that a resource leak can only occur when an object leak occurs
- D offers memory and object lifetime management via the GC in druntime
- Phobos offers an advanced memory management framework via std.experimental.allocator
- Phobos offers object lifetime management via
  + std.typecons.RefCounted: Reference counting
  + std.typecons.Unique: Single owner
  However, since these are older than std.experimental.allocator, they do not use the latter for managing the memory of their objects
- automem offers object lifetime management with the same RefCounted / Unique model as std.typecons, but with its memory backed by the std.experimental.allocator framework

With regards to your popularity argument: IMHO the only people we should concern ourselves with are those that evaluate which are the right tools for their current task as objectively as feasible given their respective circumstances, not those that are swayed by whatever is the current hype on social site XYZ. But that's just me.
 There are also the obvious syntactical benefits. Referencing an 
 RAII object and its members would be literally no different 
 than referencing a GC heap object. No need to fiddle with 
 library constructs to extract one's reference.
This is what `alias this` exists for.
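A rough sketch of how `alias this` forwards member access through a wrapper, so the user never has to extract the reference by hand. `Owner` and `Payload` are hypothetical names, and the ownership logic a real smart pointer would need (allocation, destruction, copy semantics) is left out on purpose:

```d
// Hypothetical payload type.
struct Payload
{
    int value;
    int doubled() const { return value * 2; }
}

// Hypothetical wrapper; lifetime management is omitted for brevity.
struct Owner
{
    private Payload* ptr;

    // Return the payload by reference; `alias this` makes the compiler
    // try this accessor whenever a member is not found on Owner itself.
    ref inout(Payload) get() inout { return *ptr; }
    alias get this;
}

void main()
{
    auto storage = Payload(21);
    auto owner = Owner(&storage);

    assert(owner.value == 21);     // forwarded field access
    assert(owner.doubled() == 42); // forwarded method call
    owner.value = 5;               // writes go through to the payload
    assert(storage.value == 5);
}
```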
 [...]

 @nogc void main()
{
    auto refCountedObject = refcounted Object();
} // References go to 0, object is destroyed
To be honest I think this is no different to read than
---
@nogc void main()
{
    auto refCountedObject = RefCounted!Object();
} // References go to 0, object is destroyed
---
which we already have for free thanks to templates.
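A runnable variant of the library approach, as a sketch: std.typecons.RefCounted is constrained to non-class payloads, so this uses a struct instead of `Object`, and it leaves off `@nogc` because whether that attribute can be inferred depends on the Phobos version. `Resource` is a made-up type for the example.

```d
import std.typecons : RefCounted;

// Hypothetical payload; RefCounted does not accept classes.
struct Resource
{
    int id;
}

void main()
{
    auto a = RefCounted!Resource(7);
    assert(a.refCountedStore.refCount == 1);

    {
        auto b = a; // copying the handle increments the count
        assert(a.refCountedStore.refCount == 2);
        assert(b.id == 7); // member access is forwarded to the payload
    } // b goes out of scope, the count drops again

    assert(a.refCountedStore.refCount == 1);
} // count reaches 0, the payload is destroyed and its memory freed
```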
 [...]

 I may be missing some things, but I've left out some exhaustive 
 details since I'm sure many of you are already experienced in 
 the subject and aren't looking for complete documentation in a 
 proposal like this.

 Feel free to level criticism, and let me know if I should 
 submit a DIP.
AFAICT all of your examples look (and work) pretty much the same if you used templates. Only differences would be that each `unique X(...)` becomes `Unique!X(...)`, analogous for refcounted. All in all, I see little to no benefit to what you propose, while requiring significant work on the language spec.
May 28
next sibling parent reply Nerve <nervecenter7 gmail.com> writes:
On Sunday, 28 May 2017 at 18:38:21 UTC, Moritz Maxeiner wrote:
 All in all, I see little to no benefit to what you propose, 
 while requiring significant work on the language spec.
Point taken. My only remaining reservation then is the communication problem D has with the wider prospective programming world in conveying that the GC has alternatives that work. Otherwise, this thread can die.
May 28
next sibling parent Moritz Maxeiner <moritz ucworks.org> writes:
On Sunday, 28 May 2017 at 18:50:02 UTC, Nerve wrote:
 On Sunday, 28 May 2017 at 18:38:21 UTC, Moritz Maxeiner wrote:
 All in all, I see little to no benefit to what you propose, 
 while requiring significant work on the language spec.
Point taken. My only remaining reservation then is the communication problem D has with the wider prospective programming world in conveying that the GC has alternatives that work.
Well, we need people who either
- care enough about this to do this work for free,
- or are paid to do this

You could, e.g., self elect as the champion of Dlang propaganda *cough* PR.
May 28
prev sibling next sibling parent reply Mike Parker <aldacron gmail.com> writes:
On Sunday, 28 May 2017 at 18:50:02 UTC, Nerve wrote:

 Point taken. My only remaining reservation then is the 
 communication problem D has with the wider prospective 
 programming world in conveying that the GC has alternatives 
 that work.
More broadly, I think what we need to be doing is teaching people that D's GC is not their grandfather's GC and that, unless they are doing something highly specialized, they probably don't need alternatives. The GC is fine by itself for a number of apps and, where it isn't, mixing in a bit of @nogc is probably better than cutting it out altogether.

The problem is one of scale. We've had blog posts on this (for example, [1] and [2], and more to come), comments in threads at reddit and hacker news, tweets, Facebook posts... Any one of those is only going to reach a small number of people. Among those reached who aren't already aware, it's going to stick with an even smaller number.

However, each blog post on the topic is one more link that can be brought up in social media threads (in the appropriate contexts!) to bring it to a few more people and perhaps enlighten one or two more. Any D user who cares about this issue can do that easily. And, if you have a blog, write about it yourself. Submit a post to the D Blog (through me). Or host/submit a tutorial somewhere on memory management and RAII in D. Submit a talk for a programming conference or meetup.

This is something Walter and Andrei are aware of and they, too, would like to see the perception change. But the only way for that to happen is to make a long-term and sustained effort, with participation from the D community at large (unless someone wants to fork over the cash for a targeted marketing blitz). It isn't going to happen overnight.

[1] https://dlang.org/blog/2017/03/20/dont-fear-the-reaper/
[2] https://dlang.org/blog/2017/04/28/automem-hands-free-raii-for-d/
May 28
parent reply Ola Fosheim Grøstad writes:
On Monday, 29 May 2017 at 01:36:31 UTC, Mike Parker wrote:
 More broadly, I think what we need to be doing is teaching 
 people that D's GC is not their grandfather's GC and that, 
 unless they are doing something highly specialized, they 
 probably don't need alternatives. The GC is fine by itself for 
 a number of apps and, where it isn't, mixing in a bit of @nogc 
 is probably better than cutting it out altogether.
One fun tutorial would be to integrate with a tedious C++ framework and let the GC take care of allocations in C++ code where speed doesn't matter. Then write a C++ integration tutorial around it. That could be a selling point.
May 29
parent reply evilrat <evilrat666 gmail.com> writes:
On Monday, 29 May 2017 at 11:06:20 UTC, Ola Fosheim Grøstad wrote:
 One fun tutorial would be to integrate with a tedious C++ 
 framework and let the GC take care of allocations in C++ code 
 where speed doesn't matter.

 Then write a C++ integration tutorial around it.

 That could be a selling point.
Can't sell to those who don't buy. I can't say for all, but I have noticed that those who generally use C++ will tell that speed matters "everywhere" (oh right, they do use C++ "for a reason").

And there is another kind of people: even if something happens to be faster than their favorite language or framework of choice, they start complaining about how unfair the competition is.

And don't forget the anti-GC hype wave, which tries to convince everyone how bad, slow, memory hungry, clunky and ugly (literally all the galactic evil in one face!) your GC is, and so it should be banished!

Not that I'm trying to talk anyone out of doing tutorials, but I'm skeptical on that matter. Still, I myself am interested in seeing more such material too.
May 29
parent Ola Fosheim Grøstad writes:
On Monday, 29 May 2017 at 11:58:56 UTC, evilrat wrote:
 Can't sell to those who don't buy.
 I can't say for all, but I have noticed that those who 
 generally use C++ will tell that speed matters "everywhere" (oh 
 right, they do use C++ "for a reason")
Well, I am in that C++ group... So yes, you cannot sell a GC language to them on its own terms for _new_ projects. But if you can show that an existing legacy C++ framework benefits in terms of maintenance and further development then it could be a selling point to managers for such projects.

Not all C++ code bases need max performance throughout, many are running on much better machines today than they were originally limited to. So if you can show a better maintenance scenario, with a gradual move to D, then it could be a way in.

IIRC Inkscape is using a Boehm collector. Integrating Inkscape with D could be one such "legacy" project that would be worth noticing.
May 29
prev sibling parent reply qznc <qznc web.de> writes:
On Sunday, 28 May 2017 at 18:50:02 UTC, Nerve wrote:
 On Sunday, 28 May 2017 at 18:38:21 UTC, Moritz Maxeiner wrote:
 All in all, I see little to no benefit to what you propose, 
 while requiring significant work on the language spec.
Point taken. My only remaining reservation then is the communication problem D has with the wider prospective programming world in conveying that the GC has alternatives that work.
I agree. Currently, a good answer is to direct people to the "Don't fear the reaper" [0] article, but I feel it does not really address all concerns of people. Concerns like:

* How much of Phobos does not work with @nogc? A good answer would probably be case studies of larger programs/companies. Does Weka use @nogc a lot?
* How to work around the GC? The reaper article does not mention RefCounted.
* Limitations of @nogc? It does not prevent *another* thread to call the GC, which might then stop the world. We have to mention the trick to create threads which are not tracked by GC.
* How good is the D GC? Will it improve in the foreseeable future? Information about the performance of the current GC is quite dated, although I guess not much has changed.

Also, p0nce has some more GC tricks: https://p0nce.github.io/d-idioms/

[0] https://dlang.org/blog/2017/03/20/dont-fear-the-reaper/
May 30
parent Mike Parker <aldacron gmail.com> writes:
On Tuesday, 30 May 2017 at 10:30:13 UTC, qznc wrote:

 Currently, a good answer is to direct people to the "Don't fear 
 the reaper" [0] article, but I feel it does not really address 
 all concerns of people. Concerns like:
That was the introductory post in what I hope will be a long series where many of those concerns *will* be addressed, though I don't have the breadth of knowledge to write them all. My next one is going to cover the basics of @nogc (looking like next week). I hope to have a few guest posters contributing to the series on topics I'm not well-versed in.
 * How much of Phobos does not work with @nogc? A good answer 
 would probably be case studies of larger programs/companies. 
 Does Weka use @nogc a lot?
Been reading my TODO list? I have a shortlist of people I plan to ask about a Phobos @nogc post. And I'm planning a series of company profiles where that sort of thing will probably be discussed. I had one company lined up to start before DConf, but it never panned out. But during DConf and since I've gotten some tentative commitments. I intend to kick that off in July.
 * How to work around the GC? The reaper article does not 
 mention RefCounted.
A future post will. I expect to have a few posts related to more specialized strategies (Atila's post on automem is in that vein). Currently, I'm planning a post about something I call "Terminator" (which is essentially RefCounted, but calling a `terminate` method rather than the destructor) to introduce the concept, and will talk about RefCounted there. But that's going to come after a post about how the GC interacts with destructors, and that will come after my @nogc post.
 * Limitations of @nogc? It does not prevent *another* thread to 
 call the GC, which might then stop the world. We have to 
 mention the trick to create threads which are not tracked by GC.
 * How good is the D GC? Will it improve in the foreseeable 
 future? Information about the performance of the current GC is 
 quite dated, although I guess not much has changed.
The GC implementation is another thing I hope to have a post about, but it will again require a guest author to write it.
 [0] https://dlang.org/blog/2017/03/20/dont-fear-the-reaper/
May 30
prev sibling parent reply aberba <karabutaworld gmail.com> writes:
On Sunday, 28 May 2017 at 18:38:21 UTC, Moritz Maxeiner wrote:
 On Sunday, 28 May 2017 at 17:34:30 UTC, Nerve wrote:
 With regards to your popularity argument: IMHO the only people 
 we should concern ourselves with are those that evaluate which 
 are the right tools for their current task as objectively as 
 feasible given their respective circumstances, not those that 
 are swayed by whatever is the current hype on social site XYZ. 
 But that's just me.
True. It's the current pollution in the software dev community. Baffles me. I'm glad the OP brought this forward.
May 29
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/29/2017 06:25 AM, aberba wrote:
 On Sunday, 28 May 2017 at 18:38:21 UTC, Moritz Maxeiner wrote:
 On Sunday, 28 May 2017 at 17:34:30 UTC, Nerve wrote:
 With regards to your popularity argument: IMHO the only people we 
 should concern ourselves with are those that evaluate which are the 
 right tools for their current task as objectively as feasible given 
 their respective circumstances, not those that are swayed by whatever 
 is the current hype on social site XYZ. But that's just me.
True. It's the current pollution in the software dev community. Baffles me. I'm glad the OP brought this forward.
+1(billion)
May 29
prev sibling parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 28 May 2017 at 17:34:30 UTC, Nerve wrote:

 --- refcounted ---
 The allocation method used by refcounted can be overloaded...
1. Then the deallocation has to also be overloaded.
2. This introduces tight coupling between a type and its allocator, preventing the programmer from reusing the same types with different allocators, which is crucial when we're talking about manual memory management.
 --- unique ---

 Keyword "unique" allocates an object on the stack. It is only 
 accessible to the given scope, child scopes, or functions it is 
 explicitly passed to. Therefore, it does not use an allocator.
We already have that via DIP1000 and "scope".
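As a small sketch of the existing `scope` storage class on a class variable: the instance is placed on the stack and its destructor runs when the enclosing block ends. The `Widget` class and its `alive` counter are made up for the illustration; the additional escape checking that DIP1000 adds for `@safe` code is not exercised here.

```d
// Hypothetical class that counts its live instances.
class Widget
{
    static int alive;
    this() { ++alive; }
    ~this() { --alive; }
}

void main()
{
    {
        // `scope` with `new` requests stack allocation of the instance.
        scope w = new Widget();
        assert(Widget.alive == 1);
    } // end of block: the destructor runs here, no GC collection needed

    assert(Widget.alive == 0);
}
```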
 I may be missing some things, but I've left out some exhaustive 
 details since I'm sure many of you are already experienced in 
 the subject and aren't looking for complete documentation in a 
 proposal like this.

 Feel free to level criticism, and let me know if I should 
 submit a DIP.
IMHO, since we're theorizing here, the more feasible direction to take would be to allow expressing ownership, and explicitly qualifying types and variables as @nogc. The rest should be left up to the programmer, as only they are responsible for imposing concrete requirements on allocation/deallocation.
May 28